% -----------------------------------------------------------------------
% intro.tex: Introduction, and guide to what Duchamp is doing.
% -----------------------------------------------------------------------
% Copyright (C) 2006, Matthew Whiting, ATNF
%
% This program is free software; you can redistribute it and/or modify it
% under the terms of the GNU General Public License as published by the
% Free Software Foundation; either version 2 of the License, or (at your
% option) any later version.
%
% Duchamp is distributed in the hope that it will be useful, but WITHOUT
% ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
% FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
% for more details.
%
% You should have received a copy of the GNU General Public License
% along with Duchamp; if not, write to the Free Software Foundation,
% Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA
%
% Correspondence concerning Duchamp may be directed to:
% Internet email: Matthew.Whiting [at] atnf.csiro.au
% Postal address: Dr. Matthew Whiting
% Australia Telescope National Facility, CSIRO
% PO Box 76
% Epping NSW 1710
% AUSTRALIA
% -----------------------------------------------------------------------
\secA{Introduction and getting going quickly}

\secB{About \duchamp}

This document provides a user's guide to \duchamp, an object-finder
for use on spectral-line data cubes. In its basic mode of operation,
\duchamp reads in a FITS data cube, finds sources in the cube, and
produces a text file of positions, velocities and fluxes of the
detections, as well as a PostScript file of the spectrum of each
detection.

\duchamp has been designed to search for objects in particular sorts
of data: those with relatively small, isolated objects in a large
amount of background or noise. Examples of such data are extragalactic
\hi surveys, or maser surveys. \duchamp searches for groups of
connected voxels (or pixels) that are all above some flux
threshold. No assumption is made as to the shape of detections, and
the only size constraints applied are those specified by the user.

\duchamp has been written as a three-dimensional finder, but it is
possible to run it on a two-dimensional image (\ie one with no
frequency or velocity information), or indeed a one-dimensional array,
and many of the features of the program will still work. The focus,
however, is on object detection in three dimensions, one of which is a
spectral dimension. Note, in particular, that it does not do any
fitting of source profiles, a feature common (and desirable) for many
two-dimensional source finders. This is beyond the current scope of
\duchamp, whose aim is reliable detection of spectral-line objects.

\duchamp provides the ability to pre-process the data prior to
searching. Spectral baselines can be removed, and either smoothing or
multi-resolution wavelet reconstruction can be performed to enhance
the completeness and reliability of the resulting catalogue.

\secB{Acknowledging the use of \duchamp}

\duchamp is provided in the hope that it will be useful for your
research. If you find that it is, I would ask that you acknowledge it
in your publication by using the following:
``This research made use of the Duchamp source finder, produced at
the Australia Telescope National Facility, CSIRO, by M. Whiting.''

Additionally, \duchamp has been described in a journal paper
\citep{whiting12}. This paper covers the key algorithms implemented in
the software, and provides some simple completeness and reliability
comparisons of different modes of operation. Users of \duchamp are
encouraged to read the paper in conjunction with this user guide:
some of its material is repeated here, but not all of
it. \citet{whiting12} should be cited when describing the use of
\duchamp in your research.


\secB{What to do}

So, you have a FITS cube, and you want to find the sources in it. What
do you do? First, you need to get \duchamp: there are instructions in
Appendix~\ref{app-install} for obtaining and installing it. Once you
have it running, the first step is to make an input file that contains
the list of parameters. Brief and detailed examples are shown in
Appendix~\ref{app-input}. This file provides the name of the input
FITS file and of the various output files, and sets the parameters
that control the execution.

The standard way to run \duchamp is by the command
\begin{quote}
{\footnotesize
\texttt{> Duchamp -p [parameter file]}
}
\end{quote}
replacing \texttt{[parameter file]} with the name of the file listing
the parameters.
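
As a rough illustration only, a very short parameter file might look
like the following. The layout (one parameter name and its value per
line) and the parameter name \texttt{imageFile} are assumptions made
for this sketch, and the file name \texttt{mycube.fits} is
hypothetical; Appendix~\ref{app-input} and Appendix~\ref{app-param}
remain the authoritative sources for the file format and the
parameter names.
\begin{quote}
{\footnotesize
\begin{verbatim}
imageFile    mycube.fits
flagXOutput  false
\end{verbatim}
}
\end{quote}
Any parameter not listed in such a file takes its default value.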

An even easier way is to use the default values for all parameters
(these are given in Appendix~\ref{app-param} and in the file
\texttt{InputComplete} included in the distribution directory) and use
the syntax
\begin{quote}
{\footnotesize
\texttt{> Duchamp -f [FITS file]}
}
\end{quote}
where \texttt{[FITS file]} is the file you wish to search.

The default action includes displaying a map of detected objects in a
PGPLOT X-window. This can be disabled by setting the parameter
\texttt{flagXOutput = false} or using the \texttt{-x} command-line
option, as in
\begin{quote}
{\footnotesize
\texttt{> Duchamp -x -p [parameter file]}
}
\end{quote}
and similarly for the \texttt{-f} case.
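
Spelled out, the \texttt{-f} case with the X-window display turned off
would presumably look like the first line below. The second line is a
sketch of setting the flux threshold with the \texttt{-t} option
(mentioned in the summary of execution steps); that \texttt{-t} takes
the threshold value as its argument in this form is an assumption of
this example, and \texttt{[threshold]} is a placeholder.
\begin{quote}
{\footnotesize
\texttt{> Duchamp -x -f [FITS file]}\\
\texttt{> Duchamp -f [FITS file] -t [threshold]}
}
\end{quote}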

Once a FITS file and parameters have been set, the program will then
work away and give you the list of detections and their spectra. The
program execution is summarised below, and detailed in
\S\ref{sec-flow}. Information on inputs is in \S\ref{sec-param} and
Appendix~\ref{app-param}, and a description of the output is in
\S\ref{sec-output}.

\secB{Guide to terminology and conventions}

First, a brief note on the use of terminology in this guide. \duchamp
is designed to work on FITS ``cubes''. These are FITS\footnote{FITS is
the Flexible Image Transport System -- see \citet{hanisch01} or
websites such as
\href{http://fits.cv.nrao.edu/FITS.html}{http://fits.cv.nrao.edu/FITS.html}
for details.} image arrays with (at least) three dimensions. They
are assumed to have the following form: the first two dimensions
(referred to as $x$ and $y$) are spatial directions (that is, relating
to the position on the sky -- often, but not necessarily,
corresponding to Equatorial or Galactic coordinates), while the third
dimension, $z$, is the spectral direction, which can correspond to
frequency, wavelength, or velocity. The three-dimensional analogue of
a pixel is a ``voxel'', or volume cell -- a voxel is defined by a
unique $(x,y,z)$ location and has a single value of flux, intensity
or brightness (or something equivalent) associated with it.

Sometimes, some pixels in a FITS file are labelled as BLANK -- that
is, they are given a nominal value, defined by the FITS header keyword
\textsc{blank} (and potentially \textsc{bscale} and \textsc{bzero}),
that marks them as not having a flux value. These are often used to
pad a cube out so that it has a rectangular spatial shape. \duchamp
has the ability to avoid these: see \S\ref{sec-blank}.
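
For reference, the role of the scaling keywords can be written out as
follows. This is a note on the FITS convention rather than a
description of \duchamp's internals (for which see
\S\ref{sec-blank}); the symbols $v_{\mathrm{stored}}$ and
$v_{\mathrm{phys}}$ are notation introduced only for this note, and
the numbers are purely illustrative. The value stored in an
integer-valued FITS array is related to the physical pixel value by
\[
v_{\mathrm{phys}} = \mathrm{BZERO} + \mathrm{BSCALE} \times v_{\mathrm{stored}} ,
\]
so a pixel whose stored value equals BLANK corresponds to the nominal
physical value $\mathrm{BZERO} + \mathrm{BSCALE} \times
\mathrm{BLANK}$. For example, with $\mathrm{BLANK} = -32768$,
$\mathrm{BSCALE} = 1$ and $\mathrm{BZERO} = 0$, blank pixels carry the
nominal value $-32768$.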

Note that it is possible for the FITS file to have more than three
dimensions (for instance, there could be a fourth dimension
representing a Stokes parameter). Only the two spatial dimensions and
the spectral dimension are read into the array of pixel values that is
searched for objects. All other dimensions are ignored\footnote{This
actually means that only the first pixel of each such axis is used;
the array is read by the \texttt{fits\_read\_subsetnull} command from the
\textsc{cfitsio} library.}. Herein, we discuss the data in terms of
the three basic dimensions only. Note that the order of the
dimensions in the FITS file does not matter.

With this setup, each spatial pixel (a given $(x,y)$ coordinate) can
be said to be a single spectrum, while a slice through the cube
perpendicular to the spectral direction at a given $z$-value is a
single channel, with the 2-D image in that channel called a channel
map.
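
In symbols (the notation here is introduced purely for illustration
and is not used elsewhere in this guide), if $F(x,y,z)$ denotes the
value of the voxel at $(x,y,z)$, then the spectrum at spatial pixel
$(x_0,y_0)$ and the channel map at channel $z_0$ are
\[
S_{x_0,y_0}(z) = F(x_0,y_0,z), \qquad C_{z_0}(x,y) = F(x,y,z_0).
\]
A cube with $N_x \times N_y$ spatial pixels and $N_z$ channels
therefore contains $N_x N_y$ spectra, each of length $N_z$, and $N_z$
channel maps, each of $N_x \times N_y$ pixels.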

Detection involves locating contiguous groups of voxels with fluxes
above a certain threshold. \duchamp makes no assumptions as to the
size or shape of the detected features, other than allowing
user-selected minimum or maximum size criteria. Features that are
detected are assumed to be positive. The user can choose to search for
negative features by setting an input parameter, which will invert
the cube prior to the search (see \S\ref{sec-searchTechnique} for
details).

\secB{A summary of the execution steps}

The basic flow of the program is summarised here -- all steps are
discussed in more detail in the following sections.
\begin{enumerate}
\item The necessary parameters are recorded.

How this is done depends on the way the program is run from the
command line. If the \texttt{-p} option is used, the parameter file
given on the command line is read, and the parameters listed in it
are set accordingly. All other parameters are given their default
values (listed in Appendix~\ref{app-param}).

If the \texttt{-f} option is used, all parameters are assigned their
default values, although the flux threshold can be set with the
\texttt{-t} option.

\item The FITS image is located and read in to memory.

The file given is assumed to be a valid FITS file. As discussed
above, it can have any number of dimensions, but \duchamp only
reads in the two spatial dimensions and the one spectral
dimension. A subset of the FITS array can be given (see
\S\ref{sec-input} for details).

\item \label{step-reuse} If requested, a FITS file containing a
previously reconstructed or smoothed array is read in.

When a cube is either smoothed or reconstructed with the \atrous
wavelet method, the result can be saved to a FITS file, so that
subsequent runs of \duchamp can read it in rather than repeating
the calculations.

\item \label{step-blank} If requested, BLANK pixels are trimmed from
the edges, and the baseline of each spectrum is removed.

BLANK pixels, while they are ignored by all calculations in
\duchamp, do increase the size in memory of the array above that
absolutely needed. This step trims them from the spatial edges,
keeping a record of the amount trimmed so that they can be added
back in later.

A spectral baseline (or bandpass) may optionally be removed at this
point as well. This may be necessary if there is a ripple or other
large-scale feature present that will hinder detection of faint
sources.

\item If the reconstruction method is requested, and the reconstructed
array has not been read in at step~\#\ref{step-reuse} above, the cube is
reconstructed using the \atrous wavelet method.

This step uses the multi-resolution \atrous method to determine the
amount of structure present at various scales. A simple thresholding
technique then removes random noise from the cube, leaving the
significant signal. This process can greatly reduce the noise level
in the cube, enhancing the reliability of the resulting catalogue.

\item Alternatively (and if requested), the cube is smoothed, either
spectrally or spatially.

This step presents two options. The first considers each spectrum
individually, and convolves it with a Hanning filter (with width
chosen by the user). The second considers each channel map
separately, and smooths it with a Gaussian kernel of size and shape
chosen by the user. This step can help to reduce the amount of noise
visible in the cube and enhance fainter sources, increasing the
completeness and reliability of the output catalogue.

\item A threshold for the cube is then calculated, based on the pixel
statistics (unless a threshold is manually specified by the user).

The threshold can either be chosen as a simple $n\sigma$ threshold
(\ie a certain number of standard deviations above the mean), or
calculated via the ``False Discovery Rate'' method. Alternatively,
the threshold can be specified as a simple flux value, without regard
to its statistical significance (\eg ``I want every source
brighter than 10\,mJy'').

By default, the full cube is used for the statistics calculation,
although the user can nominate a subsection of the cube to be used
instead.

\item Searching for objects then takes place, using the requested
thresholding method.

The cube is searched either one channel map at a time (``spatial''
search) or one spectrum at a time (``spectral'' search). Detections
are compared to already detected objects and either combined with a
neighbouring one or added to the end of the list.

\item The list of objects is condensed by merging neighbouring objects
and removing those deemed unacceptable.

While some merging has been done in the previous step, this process
is a much more rigorous comparison of each object with every other
one. If a pair of objects lie within the requested limits of each
other, they are combined.

After the merging is done, the list is culled (although see the
comment under the next step). There are certain criteria the user can
specify that objects must meet: minimum (or maximum) numbers of spatial
pixels and spectral channels, and minimum separations between
neighbouring objects. Those that do not meet these criteria are
deleted from the list.

\item If requested, the objects are ``grown'' down to a lower
threshold, and then the merging step is done a second time.

In this case, each object has pixels in its neighbourhood examined,
and if they are above a secondary threshold, they are added to the
object. The merging process is done a second time in case two
objects have grown over the top of one another. Note that the
rejection part of the previous step is not done until the end of the
second merging process.

\item The baselines and trimmed pixels are replaced prior to output.

This is just the inverse of step~\#\ref{step-blank}.

\item The details of the detections are written to screen and to the
requested output file.

Crucial properties of each detection are provided, showing its
location, extent, and flux. These are presented in both pixel
coordinates and world coordinates (\eg sky position and
velocity). Any warning flags are also printed, indicating detections
to be wary of. Alternative output options are available, such as a
VOTable or annotation files for image viewers such as kvis, ds9 or
casaviewer.

\item Maps showing the spatial location of the detections are written.

These are two-dimensional maps, showing where each detection lies on
the spatial coverage of the cube. This is provided as an aid to the
user so that a quick idea of the distribution of object positions
can be gained (\eg are all the detections on the edge?).

Two maps are provided: one is a 0th moment map, showing the 0th
moment (\ie a map of the integrated flux) of each detection in its
appropriate position, while the second is a ``detection map'',
showing the number of times each spatial pixel was detected in the
searching routines (including those pixels rejected at step 9 and so
not in any of the final detections).

These maps are written to PostScript files, and the 0th moment map
can also be displayed in a PGPLOT X-window.

\item A pixel mask is written to a FITS file.

A FITS file of the same size as the input file can be written. Here,
each pixel has a value indicating whether or not it was detected
and falls in one of the catalogue sources. Different objects can be
traced by different non-zero pixel values.

\item The integrated or peak spectra of each detection are written to a
PostScript file.

This is the spectral equivalent of the maps: what is the spectral
profile of each detection? Also provided here is basic information for
each object (a summary of the information in the results file), as
well as a 0th moment map of the detection.

\item If requested, a text file containing all spectra is written.

This file will contain the peak or integrated spectrum for each
source, as a function of the appropriate spectral coordinate. The
file is a multi-column ASCII text file, suitable for import into
other software packages.

\item If requested, FITS files are written containing the
reconstructed, smoothed, baseline or mask arrays.

If one of the preprocessing methods was used, the resulting array
can be saved as a FITS file for later examination or use (for
instance, reading in as described at step~\#\ref{step-reuse}). The
FITS header will be the same as that of the input file, with a few
additional keywords to identify the file.

\end{enumerate}
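
To make the $n\sigma$ threshold of the threshold-calculation step
concrete, here is a small worked example. The symbols $T$, $\mu$,
$\sigma$ and $F$, and the numbers used, are illustrative assumptions
rather than values taken from \duchamp. If $\mu$ and $\sigma$ are the
mean and standard deviation of the pixel values used for the
statistics, the threshold is
\[
T = \mu + n\sigma ,
\]
and a voxel with value $F$ counts as detected when $F > T$. For
example, with $\mu = 0.5$\,mJy, $\sigma = 2$\,mJy and $n = 5$, the
threshold is $T = 0.5 + 5 \times 2 = 10.5$\,mJy, so a voxel of
12\,mJy would be detected while one of 9\,mJy would not.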
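
The two maps described in the map-writing step can be thought of as
sums along the spectral axis. The following is a sketch of one reading
of those descriptions, in notation assumed for illustration only:
$F(x,y,z)$ is the voxel value, $\mathcal{O}$ is the set of voxels
belonging to one detection, $\Delta v$ is the channel width in the
spectral units of the cube, and $d(x,y,z)$ is 1 if the voxel was
flagged as detected by the searching routines and 0 otherwise.
\[
M_0(x,y) = \sum_{z\,:\,(x,y,z)\in\mathcal{O}} F(x,y,z)\,\Delta v ,
\qquad
D(x,y) = \sum_{z} d(x,y,z) .
\]
Here $M_0$ is the 0th moment (integrated flux) of the detection at
spatial pixel $(x,y)$, and $D$ is the corresponding value of the
detection map.
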
\secB{Why ``\duchamp''?}

Well, it's important for a program to have a name, and the initial
working title of \emph{cubefind} was somewhat uninspiring. I wanted to
avoid the classic astronomical approach of designing a cute acronym,
and since it is designed to work on cubes, I looked at naming it after
a cubist. \emph{Picasso}, sadly, was already taken \citep{minchin99},
so I settled on naming it after Marcel Duchamp, another cubist, but
also one of the first artists to work with ``found objects''.


%%% Local Variables:
%%% mode: latex
%%% TeX-master: "Guide"
%%% End: