% -----------------------------------------------------------------------
% intro.tex: Introduction, and guide to what Duchamp is doing.
% -----------------------------------------------------------------------
% Copyright (C) 2006, Matthew Whiting, ATNF
%
% This program is free software; you can redistribute it and/or modify it
% under the terms of the GNU General Public License as published by the
% Free Software Foundation; either version 2 of the License, or (at your
% option) any later version.
%
% Duchamp is distributed in the hope that it will be useful, but WITHOUT
% ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
% FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
% for more details.
%
% You should have received a copy of the GNU General Public License
% along with Duchamp; if not, write to the Free Software Foundation,
% Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA
%
% Correspondence concerning Duchamp may be directed to:
%    Internet email: Matthew.Whiting [at] atnf.csiro.au
%    Postal address: Dr. Matthew Whiting
%                    Australia Telescope National Facility, CSIRO
%                    PO Box 76
%                    Epping NSW 1710
%                    AUSTRALIA
% -----------------------------------------------------------------------
\secA{Introduction and getting going quickly}

\secB{About \duchamp}

This document provides a user's guide to \duchamp, an object-finder
for use on spectral-line data cubes. The basic execution of \duchamp
is to read in a FITS data cube, find sources in the cube, and produce
a text file of positions, velocities and fluxes of the detections, as
well as a postscript file of the spectra of each detection.

\duchamp has been designed to search for objects in particular sorts
of data: those with relatively small, isolated objects in a large
amount of background or noise. Examples of such data are extragalactic
\hi surveys, or maser surveys. \duchamp searches for groups of
connected voxels (or pixels) that are all above some flux
threshold. No assumption is made as to the shape of detections, and
the only size constraints applied are those specified by the user.

\duchamp has been written as a three-dimensional finder, but it is
possible to run it on a two-dimensional image (\ie one with no
frequency or velocity information), or indeed a one-dimensional array,
and many of the features of the program will work fine. The focus,
however, is on object detection in three dimensions, one of which is a
spectral dimension. Note, in particular, that it does not do any
fitting of source profiles, a feature common (and desirable) for many
two-dimensional source finders. This is beyond the current scope of
\duchamp, whose aim is reliable detection of spectral-line objects.

\duchamp provides the ability to pre-process the data prior to
searching. Spectral baselines can be removed, and either smoothing or
multi-resolution wavelet reconstruction can be performed to enhance
the completeness and reliability of the resulting catalogue.

\secB{Acknowledging the use of \duchamp}

\duchamp is provided in the hope that it will be useful for your
research. If you find that it is, I would ask that you acknowledge it
in your publication by using the following:
``This research made use of the Duchamp source finder, produced at
the Australia Telescope National Facility, CSIRO, by M. Whiting.''

Additionally, \duchamp has been described in a journal paper
\citep{whiting12}. This paper covers the key algorithms implemented in
the software, and provides some simple completeness and reliability
comparisons of different modes of operation. Users of \duchamp are
encouraged to read the paper in conjunction with this user guide,
since the guide repeats some, but not all, of its
content. \citet{whiting12} should be cited when describing the use of
\duchamp in your research.

\secB{What to do}

So, you have a FITS cube, and you want to find the sources in it. What
do you do? First, you need to get \duchamp: there are instructions in
Appendix~\ref{app-install} for obtaining and installing it. Once you
have it running, the first step is to make an input file that contains
the list of parameters. Brief and detailed examples are shown in
Appendix~\ref{app-input}. This file provides the input file name, the
various output files, and defines various parameters that control the
execution.

The standard way to run \duchamp is by the command
\begin{quote}
{\footnotesize
\texttt{> Duchamp -p [parameter file]}
}
\end{quote}
replacing \texttt{[parameter file]} with the name of the file listing
the parameters.

An even easier way is to use the default values for all parameters
(these are given in Appendix~\ref{app-param} and in the file
\texttt{InputComplete} included in the distribution directory) and use
the syntax
\begin{quote}
{\footnotesize
\texttt{> Duchamp -f [FITS file]}
}
\end{quote}
where \texttt{[FITS file]} is the file you wish to search.

The default action includes displaying a map of detected objects in a
PGPLOT X-window. This can be disabled by setting the parameter
\texttt{flagXOutput = false} or using the \texttt{-x} command-line
option, as in
\begin{quote}
{\footnotesize
\texttt{> Duchamp -x -p [parameter file]}
}
\end{quote}
and similarly for the \texttt{-f} case.

Once a FITS file and parameters have been set, the program will then
work away and give you the list of detections and their spectra. The
program execution is summarised below, and detailed in
\S\ref{sec-flow}. Information on inputs is in \S\ref{sec-param} and
Appendix~\ref{app-param}, and descriptions of the output are in
\S\ref{sec-output}.
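
For users who drive \duchamp from a script, the command-line forms
described in this section can be assembled programmatically. The
following is a minimal Python sketch, not part of the \duchamp
distribution: the function name \texttt{build\_duchamp\_command} and
the file name \texttt{myparams.txt} are hypothetical, and only the
\texttt{-p}, \texttt{-f} and \texttt{-x} options mentioned above are
used.

```python
from typing import List, Optional


def build_duchamp_command(param_file: Optional[str] = None,
                          fits_file: Optional[str] = None,
                          disable_xwindow: bool = False) -> List[str]:
    """Assemble a Duchamp command line as an argument list.

    Exactly one of param_file (-p) or fits_file (-f) must be given.
    This helper is illustrative only; it is not shipped with Duchamp.
    """
    if (param_file is None) == (fits_file is None):
        raise ValueError("give exactly one of param_file or fits_file")
    command = ["Duchamp"]
    if disable_xwindow:
        command.append("-x")  # suppress the PGPLOT X-window map
    if param_file is not None:
        command += ["-p", param_file]
    else:
        command += ["-f", fits_file]
    return command


# Hypothetical parameter file name, for illustration only.
print(build_duchamp_command(param_file="myparams.txt", disable_xwindow=True))
```

The returned list can be passed directly to \texttt{subprocess.run},
which avoids shell quoting problems with file names that contain
spaces.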

\secB{Guide to terminology and conventions}

First, a brief note on the use of terminology in this guide. \duchamp
is designed to work on FITS ``cubes''. These are FITS\footnote{FITS is
the Flexible Image Transport System -- see \citet{hanisch01} or
websites such as
\href{http://fits.cv.nrao.edu/FITS.html}{http://fits.cv.nrao.edu/FITS.html}
for details.} image arrays with (at least) three dimensions. They
are assumed to have the following form: the first two dimensions
(referred to as $x$ and $y$) are spatial directions (that is, relating
to the position on the sky -- often, but not necessarily,
corresponding to Equatorial or Galactic coordinates), while the third
dimension, $z$, is the spectral direction, which can correspond to
frequency, wavelength, or velocity. The three-dimensional analogues of
pixels are ``voxels'', or volume cells -- a voxel is defined by a
unique $(x,y,z)$ location and has a single value of flux, intensity
or brightness (or something equivalent) associated with it.

Sometimes, some pixels in a FITS file are labelled as BLANK -- that
is, they are given a nominal value, defined by FITS header keywords
\textsc{blank} (and potentially \textsc{bscale} and \textsc{bzero}),
that marks them as not having a flux value. These are often used to
pad a cube out so that it has a rectangular spatial shape. \duchamp
has the ability to avoid these: see \S\ref{sec-blank}.
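
To illustrate how the three keywords interact, the following Python
sketch applies the usual FITS convention for integer arrays: the
physical value is \textsc{bzero} $+$ \textsc{bscale} $\times$ the
stored value, and a pixel is BLANK when its stored value equals
\textsc{blank}. The function names are hypothetical, and the sketch
makes no claim about how \duchamp handles BLANK pixels internally.

```python
from typing import List, Optional


def to_physical(stored: List[int], blank: int,
                bscale: float = 1.0,
                bzero: float = 0.0) -> List[Optional[float]]:
    """Convert stored integers to physical values.

    BLANK pixels (stored value equal to `blank`) are returned as None,
    so that later calculations can skip them.
    """
    physical: List[Optional[float]] = []
    for value in stored:
        if value == blank:
            physical.append(None)
        else:
            physical.append(bzero + bscale * value)
    return physical


def valid_values(physical: List[Optional[float]]) -> List[float]:
    """Return only the pixels that carry a flux value."""
    return [v for v in physical if v is not None]


pixels = to_physical([-999, 2, 4], blank=-999, bscale=0.5, bzero=1.0)
print(pixels)
print(valid_values(pixels))
```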

Note that it is possible for the FITS file to have more than three
dimensions (for instance, there could be a fourth dimension
representing a Stokes parameter). Only the two spatial dimensions and
the spectral dimension are read into the array of pixel values that is
searched for objects. All other dimensions are ignored\footnote{This
actually means that the first pixel only of that axis is used, and the
array is read by the \texttt{fits\_read\_subsetnull} command from the
\textsc{cfitsio} library.}. Herein, we discuss the data in terms of
the three basic dimensions, but you should be aware it is possible for
the FITS file to have more than three. Note that the order of the
dimensions in the FITS file does not matter.

With this setup, each spatial pixel (a given $(x,y)$ coordinate) can
be said to be a single spectrum, while a slice through the cube
perpendicular to the spectral direction at a given $z$-value is a
single channel, with the 2-D image in that channel called a channel
map.
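
The distinction between a spectrum and a channel map can be made
concrete with a toy cube. The Python sketch below stores the cube as
nested lists indexed \texttt{cube[z][y][x]}; this storage order and
the helper names are choices made for the illustration and are not
taken from the \duchamp source.

```python
def make_cube(nx, ny, nz):
    """Build a small cube indexed as cube[z][y][x].

    Each voxel holds 100*z + 10*y + x so its location can be read
    back from its value.
    """
    return [[[100 * z + 10 * y + x for x in range(nx)]
             for y in range(ny)]
            for z in range(nz)]


def spectrum(cube, x, y):
    """Values along the spectral axis at one spatial pixel (x, y)."""
    return [channel[y][x] for channel in cube]


def channel_map(cube, z):
    """The 2-D image at one channel z."""
    return cube[z]


cube = make_cube(nx=4, ny=3, nz=5)
print(spectrum(cube, x=1, y=2))   # five values, one per channel
print(channel_map(cube, z=0))     # three rows of four pixels
```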

Detection involves locating contiguous groups of voxels with fluxes
above a certain threshold. \duchamp makes no assumptions as to the
size or shape of the detected features, other than allowing
user-selected minimum or maximum size criteria. Features that are
detected are assumed to be positive. The user can choose to search for
negative features by setting an input parameter -- which will invert
the cube prior to the search (see \S\ref{sec-searchTechnique} for
details).
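
The idea of grouping contiguous voxels above a threshold can be
sketched in a few lines of Python. This is a generic flood-fill
illustration, not the search algorithm used by \duchamp (which is
described in \S\ref{sec-searchTechnique}). It assumes that voxels are
neighbours when they share a face, and that ``inverting'' the cube
means negating its values; both are assumptions of the sketch, as is
the function name \texttt{find\_objects}.

```python
from collections import deque


def find_objects(cube, threshold, negative=False):
    """Return a list of objects, each a sorted list of (x, y, z) voxels.

    cube is indexed as cube[z][y][x]. Voxels are joined when they share
    a face (6-connectivity); this adjacency rule is an assumption.
    With negative=True the cube is negated before thresholding.
    """
    sign = -1.0 if negative else 1.0
    nz, ny, nx = len(cube), len(cube[0]), len(cube[0][0])
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
             (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    seen = set()
    objects = []

    def above(x, y, z):
        return sign * cube[z][y][x] > threshold

    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if (x, y, z) in seen or not above(x, y, z):
                    continue
                # Breadth-first flood fill from this seed voxel.
                queue = deque([(x, y, z)])
                seen.add((x, y, z))
                voxels = []
                while queue:
                    cx, cy, cz = queue.popleft()
                    voxels.append((cx, cy, cz))
                    for dx, dy, dz in steps:
                        px, py, pz = cx + dx, cy + dy, cz + dz
                        if (0 <= px < nx and 0 <= py < ny and 0 <= pz < nz
                                and (px, py, pz) not in seen
                                and above(px, py, pz)):
                            seen.add((px, py, pz))
                            queue.append((px, py, pz))
                objects.append(sorted(voxels))
    return objects
```

No shape is imposed on the result: an object is simply whatever set of
connected voxels exceeds the threshold, which mirrors the statement
above that only user-selected size criteria constrain the detections.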

\secB{A summary of the execution steps}

The basic flow of the program is summarised here -- all steps are
discussed in more detail in the following sections.
\begin{enumerate}
\item The necessary parameters are recorded.

  How this is done depends on the way the program is run from the
  command line. If the \texttt{-p} option is used, the parameter file
  given on the command line is read in, and the parameters it defines
  are set. All other parameters are given their default values (listed
  in Appendix~\ref{app-param}).

  If the \texttt{-f} option is used, all parameters are assigned their
  default values, with the flux threshold able to be set with the
  \texttt{-t} option.

\item The FITS image is located and read into memory.

  The file given is assumed to be a valid FITS file. As discussed
  above, it can have any number of dimensions, but \duchamp only
  reads in the two spatial dimensions and the one spectral
  dimension. A subset of the FITS array can be given (see
  \S\ref{sec-input} for details).

\item \label{step-reuse} If requested, a FITS file containing a
  previously reconstructed or smoothed array is read in.

  When a cube is either smoothed or reconstructed with the \atrous
  wavelet method, the result can be saved to a FITS file, so that
  subsequent runs of \duchamp can read it in to save having to re-do
  the calculations.

\item \label{step-blank} If requested, BLANK pixels are trimmed from
  the edges, and the baseline of each spectrum is removed.

  BLANK pixels, while they are ignored by all calculations in
  \duchamp, do increase the size in memory of the array above that
  absolutely needed. This step trims them from the spatial edges,
  keeping a record of the amount trimmed so that they can be added
  back in later.

  A spectral baseline (or bandpass) may optionally be removed at this
  point as well. This may be necessary if there is a ripple or other
  large-scale feature present that will hinder detection of faint
  sources.

\item If the reconstruction method is requested, and the reconstructed
  array has not been read in at step~\#\ref{step-reuse} above, the
  cube is reconstructed using the \atrous wavelet method.

  This step uses the multi-resolution \atrous method to determine the
  amount of structure present at various scales. A simple thresholding
  technique then removes random noise from the cube, leaving the
  significant signal. This process can greatly reduce the noise level
  in the cube, enhancing the reliability of the resulting catalogue.

\item Alternatively (and if requested), the cube is smoothed, either
  spectrally or spatially.

  This step presents two options. The first considers each spectrum
  individually, and convolves it with a Hanning filter (with width
  chosen by the user). The second considers each channel map
  separately, and smooths it with a Gaussian kernel of size and shape
  chosen by the user. This step can help to reduce the amount of noise
  visible in the cube and enhance fainter sources, increasing the
  completeness and reliability of the output catalogue.

\item A threshold for the cube is then calculated, based on the pixel
  statistics (unless a threshold is manually specified by the user).

  The threshold can either be chosen as a simple $n\sigma$ threshold
  (\ie a certain number of standard deviations above the mean), or
  calculated via the ``False Discovery Rate'' method. Alternatively,
  the threshold can be specified as a simple flux value, without care
  as to the statistical significance (\eg ``I want every source
  brighter than 10\,mJy'').

  By default, the full cube is used for the statistics calculation,
  although the user can nominate a subsection of the cube to be used
  instead.

\item Searching for objects then takes place, using the requested
  thresholding method.

  The cube is searched either one channel-map at a time (``spatial''
  search) or one spectrum at a time (``spectral'' search). Detections
  are compared to already detected objects and either combined with a
  neighbouring one or added to the end of the list.

\item The list of objects is condensed by merging neighbouring objects
  and removing those deemed unacceptable.

  While some merging has been done in the previous step, this process
  is a much more rigorous comparison of each object with every other
  one. If a pair of objects lie within requested limits, they are
  combined.

  After the merging is done, the list is culled (although see the
  comment for the next step). There are certain criteria the user can
  specify that objects must meet: minimum (or maximum) numbers of
  spatial pixels and spectral channels, and minimum separations between
  neighbouring objects. Those that do not meet these criteria are
  deleted from the list.

\item If requested, the objects are ``grown'' down to a lower
  threshold, and then the merging step is done a second time.

  In this case, each object has pixels in its neighbourhood examined,
  and if they are above a secondary threshold, they are added to the
  object. The merging process is done a second time in case two
  objects have grown over the top of one another. Note that the
  rejection part of the previous step is not done until the end of the
  second merging process.

\item The baselines and trimmed pixels are replaced prior to output.

  This is just the inverse of step~\#\ref{step-blank}.

\item The details of the detections are written to screen and to the
  requested output file.

  Crucial properties of each detection are provided, showing its
  location, extent, and flux. These are presented in both pixel
  coordinates and world coordinates (\eg sky position and
  velocity). Any warning flags are also printed, showing detections to
  be wary of. Alternative output options are available, such as a
  VOTable or annotation files for image viewers such as kvis, ds9 or
  casaviewer.

\item Maps showing the spatial location of the detections are written.

  These are 2-dimensional maps, showing where each detection lies on
  the spatial coverage of the cube. This is provided as an aid to the
  user so that a quick idea of the distribution of object positions
  can be gained (\eg are all the detections on the edge?).

  Two maps are provided: one is a 0th moment map, showing the 0th
  moment (\ie a map of the integrated flux) of each detection in its
  appropriate position, while the second is a ``detection map'',
  showing the number of times each spatial pixel was detected in the
  searching routines (including those pixels rejected at step 9 and so
  not in any of the final detections).

  These maps are written to postscript files, and the 0th moment map
  can also be displayed in a PGPLOT X-window.

\item A pixel mask is written to a FITS file.

  A FITS file of the same size as the input file can be written. Here,
  each pixel has a value indicating whether or not it was detected
  and falls in one of the catalogue sources. Different objects can be
  traced by different non-zero pixel values.

\item The integrated or peak spectra of each detection are written to a
  postscript file.

  This is the spectral equivalent of the maps, showing the spectral
  profile of each detection. Also provided here is basic information
  for each object (a summary of the information in the results file),
  as well as a 0th moment map of the detection.

\item If requested, a text file containing all spectra is written.

  This file will contain the peak or integrated spectra for each
  source, as a function of the appropriate spectral coordinate. The
  file is a multi-column ASCII text file, suitable for import into
  other software packages.

\item If requested, FITS files are written containing the
  reconstructed, smoothed, baseline or mask arrays.

  If one of the preprocessing methods was used, the resulting array
  can be saved as a FITS file for later examination or use (for
  instance, reading in as described at step~\#\ref{step-reuse}). The
  FITS header will be the same as the input file, with a few
  additional keywords to identify the file.

\end{enumerate}
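
Two of the steps above, spectral smoothing and the $n\sigma$
threshold, are simple enough to sketch in Python. The code below is an
illustration under stated assumptions rather than a transcription of
the \duchamp source: the window is a generic normalised Hann window of
odd width, samples beyond the ends of the spectrum are treated as
zero, and the threshold uses the plain mean and population standard
deviation. The function names \texttt{hanning\_smooth} and
\texttt{nsigma\_threshold} are hypothetical.

```python
import math
import statistics


def hanning_smooth(spectrum, width):
    """Convolve a spectrum with a normalised Hann window of odd width."""
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    half = width // 2
    offsets = range(-half, half + 1)
    # Hann weights; the zero-valued end points of the window are omitted.
    weights = [0.5 * (1.0 + math.cos(math.pi * k / (half + 1)))
               for k in offsets]
    total = sum(weights)
    weights = [w / total for w in weights]
    smoothed = []
    for i in range(len(spectrum)):
        acc = 0.0
        for k, w in zip(offsets, weights):
            j = i + k
            if 0 <= j < len(spectrum):  # samples past the ends count as zero
                acc += w * spectrum[j]
        smoothed.append(acc)
    return smoothed


def nsigma_threshold(values, snr_cut):
    """Return mean + snr_cut * (population) standard deviation."""
    return statistics.mean(values) + snr_cut * statistics.pstdev(values)


print(hanning_smooth([0.0, 0.0, 4.0, 0.0, 0.0], width=3))
print(nsigma_threshold([1.0, 1.0, 3.0, 3.0], snr_cut=2.0))
```

With a width of 3 the weights are approximately $(0.25, 0.5, 0.25)$,
so a single-channel spike is spread over three channels while its
summed flux is preserved away from the ends of the spectrum.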

\secB{Why ``\duchamp''?}

Well, it's important for a program to have a name, and the initial
working title of \emph{cubefind} was somewhat uninspiring. I wanted to
avoid the classic astronomical approach of designing a cute acronym,
and since it is designed to work on cubes, I looked at naming it after
a cubist. \emph{Picasso}, sadly, was already taken \citep{minchin99},
so I settled on naming it after Marcel Duchamp, another cubist, but
also one of the first artists to work with ``found objects''.


%%% Local Variables:
%%% mode: latex
%%% TeX-master: "Guide"
%%% End: