% -----------------------------------------------------------------------
% intro.tex: Introduction, and guide to what Duchamp is doing.
% -----------------------------------------------------------------------
% Copyright (C) 2006, Matthew Whiting, ATNF
%
% This program is free software; you can redistribute it and/or modify it
% under the terms of the GNU General Public License as published by the
% Free Software Foundation; either version 2 of the License, or (at your
% option) any later version.
%
% Duchamp is distributed in the hope that it will be useful, but WITHOUT
% ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
% FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
% for more details.
%
% You should have received a copy of the GNU General Public License
% along with Duchamp; if not, write to the Free Software Foundation,
% Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA
%
% Correspondence concerning Duchamp may be directed to:
%    Internet email: Matthew.Whiting [at] atnf.csiro.au
%    Postal address: Dr. Matthew Whiting
%                    Australia Telescope National Facility, CSIRO
%                    PO Box 76
%                    Epping NSW 1710
%                    AUSTRALIA
% -----------------------------------------------------------------------
\secA{Introduction and getting going quickly}

\secB{About Duchamp}

This document provides a user's guide to \duchamp, an object-finder
for use on spectral-line data cubes. The basic execution of \duchamp
is to read in a FITS data cube, find sources in the cube, and produce
a text file of positions, velocities and fluxes of the detections, as
well as a postscript file of the spectra of each detection.

\duchamp has been designed to search for objects in particular sorts
of data: those with relatively small, isolated objects in a large
amount of background or noise. Examples of such data are extragalactic
\hi surveys, or maser surveys. \duchamp searches for groups of
connected voxels (or pixels) that are all above some flux
threshold. No assumption is made as to the shape of detections, and
the only size constraints applied are those specified by the user.

\duchamp has been written as a three-dimensional finder, but it is
possible to run it on a two-dimensional image (\ie one with no
frequency or velocity information), or indeed a one-dimensional array,
and many of the features of the program will work fine. The focus,
however, is on object detection in three dimensions, one of which is a
spectral dimension. Note, in particular, that it does not do any
fitting of source profiles, a feature common to (and desirable in) many
two-dimensional source finders. This is beyond the current scope of
\duchamp, whose aim is reliable detection of spectral-line objects.

\secB{What to do}

So, you have a FITS cube, and you want to find the sources in it. What
do you do? First, you need to get \duchamp: there are instructions in
Appendix~\ref{app-install} for obtaining and installing it. Once you
have it running, the first step is to make an input file that contains
the list of parameters. Brief and detailed examples are shown in
Appendix~\ref{app-input}. This file gives the names of the input file
and the various output files, and sets the parameters that control the
execution.

The standard way to run \duchamp is by the command
\begin{quote}
{\footnotesize
\texttt{> Duchamp -p [parameter file]}
}
\end{quote}
replacing \texttt{[parameter file]} with the name of the file listing
the parameters. 

An even easier way is to use the default values for all parameters
(these are given in Appendix~\ref{app-param} and in the file
\texttt{InputComplete} included in the distribution directory) and use
the syntax
\begin{quote}
{\footnotesize
\texttt{> Duchamp -f [FITS file]}
}
\end{quote}
where \texttt{[FITS file]} is the file you wish to search. 

The default action includes displaying a map of detected objects in a
PGPLOT X-window. This can be disabled by setting the parameter
\texttt{flagXOutput = false} or using the \texttt{-x} command-line
option, as in
\begin{quote}
{\footnotesize
\texttt{> Duchamp -x -p [parameter file]}
}
\end{quote}
and similarly for the \texttt{-f} case.

Once the FITS file and parameters have been set, the program will
work away and give you the list of detections and their spectra. The
program execution is summarised below, and detailed in
\S\ref{sec-flow}. Information on inputs is in \S\ref{sec-param} and
Appendix~\ref{app-param}, and descriptions of the output are in
\S\ref{sec-output}.

\secB{Guide to terminology and conventions}

First, a brief note on the use of terminology in this guide. \duchamp
is designed to work on FITS ``cubes''. These are FITS\footnote{FITS is
the Flexible Image Transport System -- see \citet{hanisch01} or
websites such as
\href{http://fits.cv.nrao.edu/FITS.html}{http://fits.cv.nrao.edu/FITS.html}
for details.} image arrays with (at least) three dimensions. They
are assumed to have the following form: the first two dimensions
(referred to as $x$ and $y$) are spatial directions (that is, relating
to the position on the sky -- often, but not necessarily,
corresponding to Equatorial or Galactic coordinates), while the third
dimension, $z$, is the spectral direction, which can correspond to
frequency, wavelength, or velocity. The three-dimensional analogue of
a pixel is a ``voxel'', or volume cell -- a voxel is defined by a
unique $(x,y,z)$ location and has a single value of flux, intensity
or brightness (or something equivalent) associated with it.

Sometimes, some pixels in a FITS file are labelled as BLANK -- that
is, they are given a nominal value, defined by FITS header keywords
\textsc{blank, bscale, \& bzero}, that marks them as not having a flux
value. These are often used to pad a cube out so that it has a
rectangular spatial shape. \duchamp has the ability to avoid these:
see \S\ref{sec-blank}.
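
The way these keywords relate stored values to physical ones can be
sketched in a few lines of Python. The FITS convention is that the
physical value is \textsc{bzero} $+$ \textsc{bscale} $\times$ (stored
value), with the \textsc{blank} keyword giving the stored integer that
marks a pixel as undefined. The function name \texttt{to\_physical} and
the use of \texttt{None} to represent a blank pixel are choices made for
this example only, and do not describe how \duchamp handles such pixels
internally.
{\footnotesize
\begin{verbatim}
# Illustrative only: relate stored integer values to physical ones using
#     physical = BZERO + BSCALE * stored
# and treat any pixel whose stored value equals BLANK as having no flux.
# The function name and the use of None are assumptions for this example.

def to_physical(stored_values, blank, bscale=1.0, bzero=0.0):
    """Scale stored integers to physical values, mapping BLANK to None."""
    result = []
    for value in stored_values:
        if value == blank:
            result.append(None)      # no flux value for this pixel
        else:
            result.append(bzero + bscale * value)
    return result

row = [-32768, 120, 135, -32768]     # one row of stored pixel values
fluxes = to_physical(row, blank=-32768, bscale=0.5, bzero=10.0)
print(fluxes)

# Statistics should be computed from the non-blank pixels only.
good = [f for f in fluxes if f is not None]
print(sum(good) / len(good))
\end{verbatim}
}
Here the two padding pixels at either end of the row are excluded from
the mean, which is the behaviour one wants from any calculation on a
cube containing BLANK pixels.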

Note that it is possible for the FITS file to have more than three
dimensions (for instance, there could be a fourth dimension
representing a Stokes parameter). Only the two spatial dimensions and
the spectral dimension are read into the array of pixel values that is
searched for objects. All other dimensions are ignored\footnote{This
actually means that the first pixel only of that axis is used, and the
array is read by the \texttt{fits\_read\_subsetnull} command from the
\textsc{cfitsio} library.}. Herein, we discuss the data in terms of
the three basic dimensions only. Note that the order of the
dimensions in the FITS file does not matter.

With this setup, each spatial pixel (a given $(x,y)$ coordinate) can
be said to be a single spectrum, while a slice through the cube
perpendicular to the spectral direction at a given $z$-value is a
single channel, with the 2-D image in that channel called a channel
map.
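
These definitions can be made concrete with a small Python sketch. The
cube here is a nested list indexed as \texttt{cube[z][y][x]}; that
storage order, and the helper names \texttt{make\_cube},
\texttt{spectrum} and \texttt{channel\_map}, are assumptions made for
the illustration and say nothing about how \duchamp stores the array.
{\footnotesize
\begin{verbatim}
# Illustrative only: a tiny cube stored as nested lists, cube[z][y][x].
# The storage order and helper names are assumptions for this example.

def make_cube(nx, ny, nz):
    """Build a cube whose voxel value encodes its own (x, y, z)."""
    return [[[100 * z + 10 * y + x for x in range(nx)]
             for y in range(ny)]
            for z in range(nz)]

def spectrum(cube, x, y):
    """All voxels at one spatial pixel (x, y): one value per channel."""
    return [channel[y][x] for channel in cube]

def channel_map(cube, z):
    """The 2-D image at spectral channel z, as a list of rows."""
    return cube[z]

cube = make_cube(nx=4, ny=3, nz=5)
print(spectrum(cube, x=2, y=1))    # 5 values, one per channel
print(channel_map(cube, z=0))      # 3 rows of 4 pixels
\end{verbatim}
}
A spectrum therefore has as many elements as the cube has channels,
while a channel map has the same spatial dimensions as the cube.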

Detection involves locating a contiguous group of voxels with fluxes
above a certain threshold. \duchamp makes no assumptions as to the
size or shape of the detected features, other than requiring them to
meet user-selected minimum size criteria. Features that are detected are
assumed to be positive. The user can choose to search for negative
features by setting an input parameter -- this inverts the cube prior
to the search (see \S\ref{sec-detection} for details).
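
As an illustration of the idea only, the following Python sketch groups
voxels above a threshold into connected objects with a simple flood
fill. It is not \duchamp's own search algorithm: the function name
\texttt{find\_objects}, the nested-list storage indexed as
\texttt{cube[z][y][x]}, the adjacency rule (voxels differing by at most
one in each of $x$, $y$ and $z$) and the single \texttt{min\_voxels}
size cut are all assumptions made for the example.
{\footnotesize
\begin{verbatim}
# Illustrative only: group voxels above a threshold into connected
# objects. Assumptions: cube is a nested list indexed cube[z][y][x];
# two voxels are connected if they differ by at most 1 in each of
# x, y and z; objects smaller than min_voxels are rejected.
from collections import deque
from itertools import product

def find_objects(cube, threshold, min_voxels=1, negative=False):
    """Return a list of objects, each a sorted list of (x, y, z)."""
    sign = -1.0 if negative else 1.0   # invert for negative features
    nz, ny, nx = len(cube), len(cube[0]), len(cube[0][0])
    seen = set()
    objects = []
    for z, y, x in product(range(nz), range(ny), range(nx)):
        if (x, y, z) in seen or sign * cube[z][y][x] <= threshold:
            continue
        # Breadth-first flood fill from this seed voxel.
        seen.add((x, y, z))
        queue = deque([(x, y, z)])
        voxels = []
        while queue:
            cx, cy, cz = queue.popleft()
            voxels.append((cx, cy, cz))
            for dx, dy, dz in product((-1, 0, 1), repeat=3):
                px, py, pz = cx + dx, cy + dy, cz + dz
                inside = (0 <= px < nx and 0 <= py < ny
                          and 0 <= pz < nz)
                if (inside and (px, py, pz) not in seen
                        and sign * cube[pz][py][px] > threshold):
                    seen.add((px, py, pz))
                    queue.append((px, py, pz))
        if len(voxels) >= min_voxels:
            objects.append(sorted(voxels))
    return objects

# Two channels of 3x5 pixels: one three-voxel source, one lone spike.
cube = [
    [[0, 0, 0, 0, 9], [0, 5, 6, 0, 0], [0, 0, 0, 0, 0]],
    [[0, 0, 0, 0, 0], [0, 0, 7, 0, 0], [0, 0, 0, 0, 0]],
]
objects = find_objects(cube, threshold=3, min_voxels=2)
print(objects)
\end{verbatim}
}
With \texttt{min\_voxels=2} the isolated bright voxel is rejected and
only the three-voxel object, which spans both channels, remains.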

\secB{A summary of the execution steps}

The basic flow of the program is summarised here -- all steps are
discussed in more detail in the following sections.
\begin{enumerate}
\item The necessary parameters are recorded.

  How this is done depends on the way the program is run from the
  command line. If the \texttt{-p} option is used, the parameter file
  given on the command line is read, and the parameters it lists are
  set accordingly. All other parameters are given their default values
  (listed in Appendix~\ref{app-param}).

  If the \texttt{-f} option is used, all parameters are assigned their
  default values.

\item The FITS image is located and read into memory.

  The file given is assumed to be a valid FITS file. As discussed
  above, it can have any number of dimensions, but \duchamp reads in
  only the two spatial dimensions and the one spectral dimension. A
  subset of the FITS array can be given (see \S\ref{sec-input} for
  details).

\item If requested, a FITS file containing a previously reconstructed
  or smoothed array is read in.

  When a cube is either smoothed or reconstructed with the \atrous
  wavelet method, the result can be saved to a FITS file, so that
  subsequent runs of \duchamp can read it in to save having to re-do
  the calculations (as they can be relatively time-intensive).

\item \label{step-blank} If requested, BLANK pixels are trimmed from
  the edges, and the baseline of each spectrum is removed.

  BLANK pixels, while they are ignored by all calculations in
  \duchamp, do increase the size in memory of the array above that
  absolutely needed. This step trims them from the spatial edges,
  recording the amount trimmed so that they can be added back in
  later.

  A spectral baseline (or bandpass) can also be removed at this
  point. This may be necessary if there is a ripple or other
  large-scale feature present that will hinder detection of faint
  sources.

\item If the reconstruction method is requested, and the reconstructed
  array has not been read in at Step 3 above, the cube is
  reconstructed using the \atrous wavelet method.

  This step uses the \atrous method to determine the amount of
  structure present at various scales. A simple thresholding technique
  then removes random noise from the cube, leaving the significant
  signal. This process can greatly reduce the noise level in the cube,
  enhancing the detectability of sources.

\item Alternatively (and if requested), the cube is smoothed, either
  spectrally or spatially.

  This step presents two options. The first considers each spectrum
  individually, and convolves it with a Hanning filter (with width
  chosen by the user). The second considers each channel map
  separately, and smooths it with a Gaussian kernel of size and shape
  chosen by the user. This step can help to reduce the amount of noise
  visible in the cube and enhance fainter sources.

\item A threshold for the cube is then calculated, based on the pixel
  statistics (unless a threshold is manually specified by the user).

  The threshold can either be chosen as a simple $n\sigma$ threshold
  (\ie a certain number of standard deviations above the mean), or
  calculated via the ``False Discovery Rate'' method. Alternatively,
  the threshold can be specified as a simple flux value, without care
  as to the statistical significance (\eg ``I want every source
  brighter than 10\,mJy'').

  By default, the full cube is used for the statistics calculation,
  although the user can nominate a subsection of the cube to be used
  instead. 

\item Searching for objects then takes place, using the requested
  thresholding method.

  The cube is searched one channel-map at a time. Detections are
  compared to already detected objects and either combined with a
  neighbouring one or added to the end of the list.

\item The list of objects is condensed by merging neighbouring objects
  and removing those deemed unacceptable.

  While some merging has been done in the previous step, this process
  is a much more rigorous comparison of each object with every other
  one. If a pair of objects lie within requested limits, they are
  combined. 

  After the merging is done, the list is culled (although see the
  comment in the next step). There are certain criteria the user can
  specify that objects must meet: minimum numbers of spatial pixels
  and spectral channels, and minimum separations between neighbouring
  objects. Those that do not meet these criteria are deleted from the
  list.

\item If requested, the objects are ``grown'' down to a lower
  threshold, and then the merging step is done a second time.

  In this case, each object has pixels in its neighbourhood examined,
  and if they are above a secondary threshold, they are added to the
  object. The merging process is done a second time in case two
  objects have grown over the top of one another. Note that the
  rejection part of the previous step is not done until the end of the
  second merging process.

\item The baselines and trimmed pixels are replaced prior to output.

  This is just the inverse of step~\#\ref{step-blank}.

\item The details of the detections are written to screen and to the
  requested output file.

  Crucial properties of each detection are provided, showing its
  location, extent, and flux. These are presented in both pixel
  coordinates and world coordinates (\eg sky position and
  velocity). Any warning flags are also printed, showing detections to
  be wary of. Alternative output options are available, such as a
  VOTable or a Karma annotation file.

\item Maps showing the spatial location of the detections are written.

  These are 2-dimensional maps, showing where each detection lies on
  the spatial coverage of the cube. This is provided as an aid to the
  user so that a quick idea of the distribution of object positions
  can be gained (\eg are all the detections on the edge?).

  Two maps are provided: one is a 0th moment map, showing the 0th
  moment (\ie a map of the integrated flux) of each detection in its
  appropriate position, while the second is a ``detection map'',
  showing the number of times each spatial pixel was detected in the
  searching routines (including those pixels rejected at step 9 and so
  not in any of the final detections).

  These maps are written to postscript files, and the 0th moment map
  can also be displayed in a PGPLOT X-window.

\item The integrated or peak spectra of each detection are written to a
  postscript file. 

  This is the spectral equivalent of the maps, showing the spectral
  profile of each detection. Also provided here is basic information
  for each object (a summary of the information in the results file),
  as well as a 0th moment map of the detection.

\item If requested, a text file containing all spectra is written.

  This file will contain the peak or integrated spectra for each
  source, as a function of the appropriate spectral coordinate. The
  file is a multi-column ASCII text file, suitable for import into
  other software packages. 

\item If requested, the reconstructed or smoothed array can be written
  to a new FITS file.

  If either of these procedures has been carried out, the resulting array can be
  saved as a FITS file for later use. The FITS header will be the same
  as the input file, with a few additional keywords to identify the
  file. 

\end{enumerate}
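
The spectral smoothing, $n\sigma$ threshold and searching steps above
can be illustrated for a single spectrum with the following Python
sketch. It is a simplified picture under stated assumptions rather than
a description of \duchamp's internals: the raised-cosine form of the
Hanning kernel, the renormalisation at the ends of the spectrum, the use
of a plain mean and standard deviation, and the function names
\texttt{hanning\_kernel}, \texttt{smooth} and
\texttt{n\_sigma\_threshold} are all choices made for the example.
{\footnotesize
\begin{verbatim}
# Illustrative only: Hanning-smooth one spectrum, then flag channels
# lying more than n standard deviations above the mean. The kernel
# form, edge handling and statistics used are assumptions.
import math
from statistics import mean, pstdev

def hanning_kernel(width):
    """Raised-cosine weights for an odd width, normalised to sum 1."""
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    a = (width + 1) // 2
    weights = [0.5 + 0.5 * math.cos(math.pi * k / a)
               for k in range(-(a - 1), a)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(spectrum, width):
    """Convolve with the kernel; edges use the channels available."""
    kernel = hanning_kernel(width)
    half = width // 2
    out = []
    for i in range(len(spectrum)):
        total = norm = 0.0
        for j, w in enumerate(kernel):
            k = i + j - half
            if 0 <= k < len(spectrum):
                total += w * spectrum[k]
                norm += w
        out.append(total / norm)
    return out

def n_sigma_threshold(values, n):
    """Threshold lying n standard deviations above the mean."""
    return mean(values) + n * pstdev(values)

spectrum = [0.1, -0.2, 0.0, 0.3, -0.1, 4.0,
            6.0, 4.5, 0.2, -0.3, 0.1, 0.0]
smoothed = smooth(spectrum, width=3)
threshold = n_sigma_threshold(smoothed, n=1)
detected = [z for z, flux in enumerate(smoothed) if flux > threshold]
print(threshold)
print(detected)
\end{verbatim}
}
In such a short spectrum the source itself inflates the mean and
standard deviation, which is why a low value of $n$ is used here. In a
real cube most voxels are noise, so the statistics are far less affected
by the sources.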