source: trunk/docs/Guide.tex @ 87

Last change on this file since 87 was 87, checked in by Matthew Whiting, 18 years ago

Added code to produce warning flags for detections: for either edge location
or negative enclosed flux. Summary of changes:
Cubes/cubes.cc -- three new routines
Cubes/cubes.hh -- prototypes for new routines. New isBlank functions.
Detection/outputDetection.cc -- output of warning flags.
mainDuchamp.cc -- calling of new routines. Other minor changes.
docs/Guide.tex -- explanation of new warning flags. Other minor changes.
docs/example_spectrum.pdf -- shows the new formatting.
docs/example_spectrum.ps -- ditto
InputComplete? -- all values are now the same as the default param values

File size: 75.6 KB
Line 
1\documentclass[12pt,a4paper]{article}
2
3%Define a test for doing PDF format -- use different code below
4\newif\ifPDF
5\ifx\pdfoutput\undefined\PDFfalse
6\else\ifnum\pdfoutput > 0\PDFtrue
7     \else\PDFfalse
8     \fi
9\fi
10
11\textwidth=161 mm
12\textheight=245 mm
13\topmargin=-15 mm
14\oddsidemargin=0 mm
15\parindent=6 mm
16
17\usepackage[sort]{natbib}
18\usepackage{lscape}
19\bibpunct[,]{(}{)}{;}{a}{}{,}
20
21\newcommand{\eg}{e.g.\ }
22\newcommand{\ie}{i.e.\ }
23\newcommand{\hi}{H{\sc i}}
24\newcommand{\hipass}{{\sc hipass}}
25\newcommand{\progname}{{\tt Duchamp}}
26\newcommand{\diff}{{\rm d}}
27\newcommand{\entrylabel}[1]{\mbox{\textsf{\bf{#1:}}}\hfil}
28\newenvironment{entry}
29        {\begin{list}{}%
30                {\renewcommand{\makelabel}{\entrylabel}%
31                        \setlength{\labelwidth}{30mm}%
32                        \setlength{\labelsep}{5pt}%
33                        \setlength{\itemsep}{2pt}%
34                        \setlength{\parsep}{2pt}%
35                        \setlength{\leftmargin}{35mm}%
36                }%
37        }%
38{\end{list}}
39
40
41\title{A Guide to the {\it Duchamp} Source Finding Software}
42\author{Matthew Whiting\\
43%{\small \href{mailto:Matthew.Whiting@csiro.au}{Matthew.Whiting@csiro.au}}\\
44Australia Telescope National Facility\\CSIRO}
45%\date{January 2006}
46\date{}
47
48% If we are creating a PDF, use different options for graphicx, hyperref.
49\ifPDF
50  \usepackage[pdftex]{graphicx,color}
51  \usepackage[pdftex]{hyperref}
52  \hypersetup{colorlinks=true,%             
53              citecolor=red,%
54              filecolor=red,%
55              linkcolor=red,%
56              urlcolor=red,%
57              }
58\else
59  \usepackage[dvips]{graphicx}
60  \usepackage[dvips]{hyperref}
61\fi
62
63\pagestyle{headings}
64\begin{document}
65
66\maketitle
67\thispagestyle{empty}
68\begin{figure}[!h]
69\begin{center}
70\includegraphics[width=\textwidth]{cover_image}
71\end{center}
72\end{figure}
73
74\newpage
75\tableofcontents
76
77\newpage
78\section{Introduction and getting going quickly}
79
80This document gives details on the use of the program Duchamp. This
81has been designed to provide a source-detection facility for
82spectral-line data cubes. The basic execution of Duchamp is to read
83in a FITS data cube, find sources in the cube, and produce a text
84file of positions, velocities and fluxes of the detections, as well as
85a postscript file of the spectra of each detection.
86
87So, you have a FITS cube, and you want to find the sources in it. What
88do you do? The first step is to make an input file that contains the
89list of parameters. Brief and detailed examples are shown in
90Appendix~\ref{app-input}. This provides the input file name, the various
91output files, and defines various parameters that control the
92execution.
93
94The standard way to run Duchamp is by the command
95\begin{quote}
96{\tt Duchamp -p [parameter file]}
97\end{quote}
98replacing {\tt [parameter file]} with the name of the file you have
99just created/copied. Alternatively, you can use the syntax
100\begin{quote}
101{\tt Duchamp -f [FITS file]}
102\end{quote}
103where {\tt [FITS file]} is the file you wish to search. In the latter
104case, the rest of the parameters will take their default values
105detailed in Appendix~\ref{app-param}. In either case, the program will
106then work away and give you the list of detections and their
107spectra. The program execution is summarised below, and detailed in
108\S\ref{sec-flow}. Information on inputs is in \S\ref{sec-param} and
109Appendix~\ref{app-param}, and descriptions of the output is in
110\S\ref{sec-output}.
111
112\subsection{A summary of the execution steps}
113
114The basic flow of the program is summarised here. All these steps are
115discussed in more detail in the following sections, so read on if
116you have questions!
117\begin{enumerate}
118\item The parameter file given on the command line is read in, and the
119  parameters absorbed.
120\item From the parameter file, the FITS image is located and read in
121  to memory.
122\item If requested, a FITS image with a previously reconstructed array
123  is read in.
124\item If requested, blank pixels are trimmed from the edges, and
125  channels corresponding to bright (\eg Galactic) emission are
126  excised.
127\item If requested, the baseline of each spectrum is removed.
128\item If the reconstruction method is requested, and the reconstructed
129  array has not been read in at Step 3 above, the cube is
130  reconstructed using the {\it {\' a} trous} wavelet method.
131\item Searching for objects then takes place, using the requested
132  thresholding method.
133\item The list of objects is trimmed by merging neighbouring objects
134  and removing those deemed unacceptable.
135\item The baselines and trimmed pixels are replaced prior to output.
136\item The details on the detections are written to screen and to the
137  requested output file.
138\item Maps showing the spatial location of the detections are written.
139\item The integrated spectra of each detection are written to a
140  postscript file.
141\item If requested, the reconstructed array can be written to a new
142  FITS file.
143\end{enumerate}
144
145\subsection{Guide to terminology}
146
147First, a brief note on the use of terminology in this guide. Duchamp
148is designed to work on FITS ``cubes''. These are FITS\footnote{FITS is
149the Flexible Image Transport System -- see \citet{hanisch01} or
150websites such as
151\href{http://fits.cv.nrao.edu/FITS.html}{http://fits.cv.nrao.edu/FITS.html}
152for details.} image arrays with three dimensions -- they are assumed
153to have the following form: the first two dimensions (referred to as
154$x$ and $y$) are spatial directions (that is, relating to the position
155on the sky), while the third dimension, $z$, is the spectral
156direction, which can correspond to frequency, wavelength, or velocity.
157
158Each spatial pixel (a given $(x,y)$ coordinate) can be said to be a
159single spectrum, while a slice through the cube perpendicular to the
160spectral direction at a given $z$-value is a single channel (the 2-D
161image is a channel map).
162
163Features that are detected are assumed to be positive. The user can
164choose to search for negative features by setting an input parameter
165-- this inverts the cube prior to the search (see
166\S~\ref{sec-detection} for details).
167
168Note that it is possible to run Duchamp on a two-dimensional image
169(\ie one with no frequency or velocity information), or indeed a
170one-dimensional array, and many of the features of the program will
171work fine. The focus, however, is on object detection in three
172dimensions.
173
174\subsection{Why ``Duchamp''?}
175
176Well, it's important for a program to have a name, and it certainly
177beats the initial working title of ``cubefind''. I had planned to call
178it ``Picasso'' (as in the father of cubism), but sadly this had
179already been used before \citep{minchin99}. So I settled on naming it
180after Marcel Duchamp, another cubist, but also one of the first
181artists to work with ``found objects''.
182
183\section{User Inputs}
184\label{sec-param}
185
186Input to the program is provided by means of a parameter file. Parameters
187are listed in the file, followed by the value that should be assigned
188to them. The syntax used is {\tt paramName value}. The file is not
189case-sensitive, and lines in the input file that start with {\tt \#} are
190ignored. If a parameter is listed more than once, the latter value is
191used, but otherwise the order in which the parameters are listed in the
192input file is arbitrary.
193
194If a parameter is not listed, the default value is assumed. The
195defaults are chosen to provide a good result (using the reconstruction
196method), so the user doesn't need to specify many new parameters in
197the input file. Note that the image file {\bf must} be specified! The
198parameters that can be set are listed in Appendix~\ref{app-param},
199with their default values in parentheses.
200
201The 'flag' parameters are stored as {\tt bool} variables, and so are
202either {\tt true = 1} or {\tt false = 0}. Currently the program only
203reads them from the file as integers, and so they should be entered in
204the file as 0 or 1 (see example file in Appendix~\ref{app-input}).
205
206\section{What the program is doing}
207\label{sec-flow}
208
209The execution flow of the program is detailed here, indicating the
210main algorithmic steps that are used. The program is written in C/C++
211and makes use of the {\sc cfitsio}, {\sc wcslib} and {\sc pgplot}
212libraries.
213
214%\subsection{Parameter input}
215%
216%The user provides parameters that govern the selection of files and
217%the parameters used by the various subroutines in the program. This is
218%done via a parameter file, and the parameters are stored in a C++
219%class for use throughout the program. The form of the parameter file is
220%discussed in \S\ref{sec-param}, and the parameters themselves are
221%listed in Appendix~\ref{app-param}.
222
223\subsection{Image input}
224
225The cube is read in using basic {\sc cfitsio} commands, and stored as
226an array in a special C++ class structure. This class keeps track of
227the list of detected objects, as well as any reconstructed arrays that
228are made (see \S\ref{sec-recon}). The World Coordinate System (WCS)
229information for the cube is also obtained from the FITS header by {\sc
230wcslib} functions \citep{greisen02, calabretta02}, and this
231information, in the form of a {\tt wcsprm} structure, is also stored
232in the same class.
233
234A sub-section of an image can be requested via the {\tt subsection}
235parameter in the parameter file -- this can be a good idea if the cube
236has very noisy edges, which may produce many spurious detections. The
237generalised form of the subsection that is used by {\sc cfitsio} is
238{\tt [x1:x2:dx,y1:y2:dy,z1:z2:dz]}, such that the x-coordinates run
239from {\tt x1} to {\tt x2} (inclusive), with steps of {\tt dx}. The
240step value can be omitted (so a subsection of the form {\tt
241[2:50,2:50,10:1000]} is still valid). Duchamp does not at this stage
242deal with the presence of steps in the subsection string, and any that
243are present are removed before the file is opened.
244
245If one wants the full range of a coordinate then replace the range
246with an asterisk, \eg {\tt [2:50,2:50,*]}. If one wants to use just a
247subsection, one must set {\tt flagSubsection = 1}. A complete
248description of the section syntax can be found at the {\sc fitsio} web
249site
250\footnote{
251\href{http://heasarc.gsfc.nasa.gov/docs/software/fitsio/c/c\_user/node90.html}%
252{http://heasarc.gsfc.nasa.gov/docs/software/fitsio/c/c\_user/node90.html}}.
253
254\subsection{Image modification}
255\label{sec-modify}
256
257Several modifications to the cube can be made that improve the
258execution and efficiency of Duchamp (these are optional -- their
259use is indicated by the relevant flags set in the input parameter
260file).
261
262\subsubsection{Milky-Way removal}
263
264First, a single set of contiguous channels can be removed -- these may
265exhibit very strong emission, such as that from the Milky Way as seen
266in extragalactic \hi\ cubes (hence the references to ``Milky Way'' in
267relation to this task -- apologies to Galactic astronomers!). Such
268dominant channels will both produce many unnecessary, uninteresting
269and large (in size and hence in memory usage) detections, and will
270also affect any reconstruction that is performed (see next
271section). The use of this feature is controlled by the {\tt flagMW}
272parameter, and the exact channels concerned are able to be set by the
273user (using {\tt maxMW} and {\tt minMW}). When employed, the flux in
274these channels is set to zero. The information in those channels is
275not kept.
276
277\subsubsection{Blank pixel removal}
278
279Second, the cube is trimmed of any BLANK pixels that pad the image
280out to a rectangular shape. This is also optional, being determined by
281the {\tt flagBlankPix} parameter. The value for these pixels is read from
282the FITS header (using the BLANK, BSCALE and BZERO keywords), but if
283these are not present then the value can be specified by the user in
284the parameter file. If these blank pixels are stored as NaNs, then a
285normal number will be substituted (allowing these pixels to be
286accurately removed without adverse effects). [NOTE: this appears not
287  to be working correctly at time of writing. If your data has
288  unspecified BLANKs, be wary...]
289
290This stage is particularly important for the reconstruction step, as
291lots of BLANK pixels on the edges will smooth out features in the
292wavelet calculation stage. The trimming will also reduce the size of
293the cube's array, speeding up the execution. The amount of trimming is
294recorded, and these pixels are added back in once the source-detection
295is completed (so that quoted pixel positions are applicable to the
296original cube).
297
298Rows and columns are trimmed one at a time until the first non-BLANK
299pixel is reached, so that the image remains rectangular. In practice,
300this means that there will be BLANK pixels left in the trimmed image
301(if the non-BLANK region is non-rectangular). However, these are
302ignored in all further calculations done on the cube.
303
304\subsubsection{Baseline removal}
305
306Finally, the user may request the removal of baselines from the
307spectra, via the parameter {\tt flagBaseline}. This may be necessary
308if there is a strong baseline ripple present, which can result in
309spurious detections on the high points of the ripple. The baseline is
310calculated from a wavelet reconstruction procedure (see
311\S\ref{sec-recon}) that keeps only the two largest scales. This is
312done separately for each spatial pixel (\ie for each spectrum in the
313cube), and the baselines are stored and added back in before any
314output is done. In this way the quoted fluxes and displayed spectra
315are as one would see from the input cube itself -- even though the
316detection (and reconstruction if applicable) is done on the
317baseline-removed cube.
318
319The presence of very strong signals (for instance, masers at several
320hundred Jy) can affect the determination of the baseline, leading to a
321large dip centred on the signal in the baseline-subtracted
322spectrum. To prevent this, the signal is trimmed prior to the
323reconstruction process at some standard threshold (at $8\sigma$ above
324the mean). The baseline determined should thus be representative of
325the true, signal-free baseline. Note that this trimming is only a
326temporary measure which does not affect the source-detection.
327
328\subsection{Image reconstruction}
329\label{sec-recon}
330
331This is an optional step. The user can direct Duchamp to
332reconstruct the data cube using the {\it {\`a} trous} wavelet
333procedure. A good description of the procedure can be found in
334\citet{starck02:book}. This is an effective way of removing a
335lot of the noise in the image, but at this stage is relatively time-
336and memory-intensive. The steps in the procedure are as follows:
337\begin{enumerate}
338\item Set the reconstructed array to 0 everywhere.
339\item The cube is discretely convolved with a given filter
340  function. This is determined from the parameter file via the {\tt
341  filterCode} parameter -- see Appendix~\ref{app-param} for details on
342  the filters available.
343\item The wavelet coefficients are calculated by taking the difference
344  between the convolved array and the input array.
345\item If the wavelet coefficients at a given point are above the
346  threshold requested (given by {\tt snrRecon} as the number of
347  $\sigma$ above the mean and adjusted to the current scale), add
348  these to the reconstructed array.
349\item The separation of the filter coefficients is doubled.
350\item The procedure is repeated from step 2, using the convolved array
351  as the input array.
352\item Continue until the required maximum number of scales is reached.
353\item Add the final smoothed (\ie convolved) array to the
354  reconstructed array. This provides the ``DC offset'', as each of the
355  wavelet coefficient arrays will have zero mean.
356\end{enumerate}
357
358Note that any BLANK pixels that are still in the cube will not be
359altered by the reconstruction -- they will be left as BLANK so that
360the shape of the valid part of the cube is preserved.
361
362It is important to note that the {\it {\`a} trous} decomposition is
363an example of a ``redundant'' transformation. If no thresholding is
364performed, the sum of all the wavelet coefficient arrays and the final
365smoothed array is identical to the input array. The thresholding thus
366removes only the unwanted structure in the array.
367
368The statistics of the cube are estimated using robust methods, to
369avoid corruption by strong outlying points. The mean is actually
370estimated by the median, while the median absolute deviation from the
371median (MADFM) is calculated and corrected assuming Gaussianity to
372estimate the standard deviation $\sigma$. The Gaussianity (or
373Normality) assumption is critical, as the MADFM does not give the same
374value as the usual rms or standard deviation value -- for a normal
375distribution $N(\mu,\sigma)$ we find MADFM$=0.6744888\sigma$. The
376difference between the MADFM and $\sigma$ is corrected for, so the
377user need only think in the usual multiples of $\sigma$ when setting
378{\tt snrRecon}. See Appendix~\ref{app-madfm} for a derivation of this
379value.
380
381When thresholding the different wavelet scales, the value of $\sigma$
382as measured from the input array needs to be scaled to account for the
383increased amount of correlation between neighbouring pixels (due to
384the convolution). See Appendix~\ref{app-scaling} for details on this
385scaling.
386
387The user can also select the minimum scale to be used in the
388reconstruction -- the first scale exhibits the highest frequency
389variations, and so ignoring this one can sometimes be beneficial in
390removing excess noise. The default, however, is to use all scales
391({\tt minscale = 1}).
392
393The reconstruction has at least two iterations. The first iteration
394makes a first pass at the wavelet reconstruction (the process outlined
395in the 8 stages above), but the residual array will inevitably have
396some structure still in it, so the wavelet filtering is done on the
397residual, and any significant wavelet terms are added to the final
398reconstruction. This step is repeated until the change in the $\sigma$
399of the background is less than some fiducial amount.
400
401\subsection{Reconstruction I/O}
402
403The reconstruction stage can be relatively time-consuming,
404particularly for large cubes. Duchamp thus has a shortcut to allow
405users to quickly do multiple searches (\eg with different thresholds)
406on the same reconstruction.
407
408The first step is to select to save the reconstructed image as a
409FITS file -- at the moment this is just saved in the same directory as
410the input file, so it won't work if the user does not have write
411permissions on that directory. The name of the file will be derived
412from the input file, in the following manner: if the input file is
413{\tt image.fits}, the reconstructed array will be saved in {\tt
414image.RECON?.fits}, where {\tt ?} stands for the value of {\tt
415snrRecon} (for instance, if {\tt snrRecon}$=4$, it will be {\tt
416image.RECON4.fits}, and if {\tt snrRecon}$=4.5$, it will be {\tt
417image.RECON4.5.fits}). To save the reconstructed array, set {\tt
418  flagOutputRecon = true}.
419
420Likewise, the residual image, defined as the difference between the
421input image and the reconstructed image, can also be saved in the same
422manner -- its filename will be {\tt image.RESID?.fits}. This is done
423by setting {\tt flagOutputResid = true}.
424
425If a reconstructed image has been saved, it can be read in and used
426instead of redoing the reconstruction. To do so, the user should set
427{\tt flagReconExists = true}. The user can indicate the name of the
428reconstructed FITS file using the {\tt reconFile} parameter, or, if
429this is not specified, Duchamp searches for the file {\tt
430  image.RECON?.fits} (as defined above). If the file is not found, the
431reconstruction is performed as normal. Note that to do this, the user
432needs to set {\tt flagAtrous = true} (obviously, if this is {\tt
433  false}, the reconstruction is not needed).
434
435\subsection{Searching the image}
436\label{sec-detection}
437
438The image is searched for detections in two ways: spectrally (a
4391-dimensional search in the spectrum in each spatial pixel), and
440spatially (a 2-dimensional search in the spatial image in each
441channel). In both cases, the algorithm finds connected pixels that are
442above the user-specified threshold. In the case of the spatial image
443search, the algorithm of \citet{lutz80} is used to raster scan through
444the image and connect groups of pixels on neighbouring rows.
445
446Note that this algorithm cannot be applied directly to a 3-dimensional
447case, as it requires that objects are completely nested in a row: that
448is, if you are scanning along a row, and one object finishes and
449another starts, you know that you will not get back to the first one
450(if at all) until the second is finished for that
451row. Three-dimensional data does not have this property, which is why
452we break up the searching into 1- and 2-dimensional cases.
453
454The determination of the threshold is done in one of two ways. The
455first way is a simple sigma-clipping, where a threshold is set at
456$n\sigma$ above the mean and pixels above this threshold are
457flagged as detected. The value of $n$ is set with the parameter {\tt
458  snrCut}. As before, the value for $\sigma$ is estimated by
459the MADFM, and corrected by the ratio derived in
460Appendix~\ref{app-madfm}.
461
462The second method uses the False Discovery Rate (FDR) technique
463\citep{miller01,hopkins02}, whose basis we briefly detail here. The
464false discovery rate (given by the number of false detections divided
465by the total number of detections) is fixed at a certain value
466$\alpha$ (\eg $\alpha=0.05$ implies 5\% of detections are false
467positives). In practice, an $\alpha$ value is chosen, and the ensemble
468average FDR (\ie $<FDR>$) when the method is used will be less than
469$\alpha$.  One calculates $p$ -- the probability, assuming the null
470hypothesis is true, of obtaining a test statistic as extreme as the
471pixel value (the observed test statistic) -- for each pixel, and sorts
472them in increasing order. One then calculates $d$ where
473\[
474d = \max_j \left\{ j : P_j < \frac{j\alpha}{c_N N} \right\},
475\]
476and then rejects all hypotheses whose $p$-values are less than or equal
477to $P_d$. (So a $P_i<P_d$ will be rejected even if $P_i \geq
478j\alpha/c_N N$.) Note that ``reject hypothesis'' here means ``accept
479the pixel as an object pixel'' (\ie we are rejecting the null
480hypothesis that the pixel belongs to the background).
481
482The $c_N$ values here are normalisation constants that depend on the
483correlated nature of the pixel values. If all the pixels are
484uncorrelated, then $c_N=1$. If $N$ pixels are correlated, then their
485tests will be dependent on each other, and so $c_N = \sum_{i=1}^N
486i^{-1}$. \citet{hopkins02} consider real radio data, where the pixels
487are correlated over the beam. In this case the sum is made over the
488$N$ pixels that make up the beam. The value of $N$ is calculated from
489the FITS header (if the correct keywords -- BMAJ, BMIN -- are not
490present, a default value of 10 pixels is assumed).
491
492If a reconstruction has been made, the residuals (defined as original
493$-$ reconstruction) are used to estimate the noise parameters of the
494cube. Otherwise they are estimated directly from the cube itself. In
495both cases, the median is used as a robust estimator of the mean
496value, although the $\sigma$ is estimated by the standard deviation
497(of the residual array, in the case of the reconstruction, but of the
498original array otherwise).
499
500Detections must have a minimum number of pixels to be counted. This
501minimum number is given by the input parameters {\tt minPix} (for
5022-dimensional searches) and {\tt minChannels} (for 1-dimensional
503searches).
504
505The search only looks for positive features. If one is interested
506instead in negative features (such as absorption lines), set the
507parameter {\tt flagNegative = true}. This will invert the cube (\ie
508multiply all pixels by $-1$) prior to the search, and then re-invert
509the cube (and the fluxes of any detections) after searching is
510complete. All outputs are done in the same manner as normal, so that
511fluxes of detections will be negative.
512
513\subsection{Merging detected objects}
514\label{sec-merger}
515
516The searching step produces a list of detections that will have many
517repeated detections of a given object -- for instance, spectral
518detections in adjacent pixels of the same object and/or spatial
519detections in neighbouring channels. These are then combined in an
520algorithm that matches all objects judged to be ``close''. This
521determination is made in one of two ways.
522
523One way is to define two thresholds -- one spatial and one in velocity
524-- and say that two objects should be merged if there is at least one
525pair of pixels that lie within these threshold distances of each
526other. These thresholds are specified by the parameters {\tt
527threshSpatial} and {\tt threshVelocity} (in units of pixels and
528channels respectively).
529
530Alternatively, the spatial requirement can be changed to say that
531there must be a pair of pixels that are {\it adjacent} -- a stricter,
532but more realistic requirement, particularly when the spatial pixels
533have a large angular size (as is the case for \hi\ surveys). This
534method can be selected by setting the parameter
535{\tt flagAdjacent} to 1 (\ie {\tt true}) in the parameter file. The
536velocity thresholding is done in the same way as the first option.
537
538Once the detections have been merged, they may be ``grown''. This is a
539process of increasing the size of the detection by adding adjacent
540pixels that are above some secondary threshold. This threshold is
541lower than the one used for the initial detection, but above the noise
542level, so that faint pixels are only detected when they are close to a
543bright pixel. The value of this threshold is a possible input
544parameter ({\tt growthCut}), with a default value of $1.5\sigma$. The
545use of the growth algorithm is controlled by the {\tt flagGrowth}
546parameter -- the default value of which is {\tt false}. If the
547detections are grown, they are sent through the merging algorithm a
548second time, to pick up any detections that now overlap or have grown
549over each other.
550
551Finally, to be accepted, the detections must span {\it both} a minimum
552number of channels (to remove any spurious single-channel spikes that
553may be present), and a minimum number of spatial pixels. These
554numbers, as for the original detection step, are set with the {\tt
555minChannels} and {\tt minPix} parameters. The channel requirement
556means there must be at least one set of this many consecutive channels
557in the source for it to be accepted.
558
559\section{Outputs}
560\label{sec-output}
561
562\subsection{During execution}
563
564Duchamp provides the user with feedback whilst it is running, to
565keep the user informed on the progress of the analysis. Most of this
566consists of self-explanatory messages about the particular stage the
567program is up to. The relevant parameters are printed to the screen at
568the start (once the file has been successfully read in), so the user
569is able to make a quick check that the setup is correct.
570
571If the cube is being trimmed (\S\ref{sec-modify}), the resulting
572dimensions are printed to indicate how much has been trimmed. If a
573reconstruction is being done, a continually updating message shows the
574current iteration and scale (compared to the maximum scale).
575
576During the searching algorithms, the progress through the 1D and 2D
577searches are shown. When the searches have completed,
578the number of objects found in both the 1D and 2D searches are
579reported (see \S\ref{sec-detection} for details).
580
581In the merging process (where multiple detections of the same object
582are combined -- see \S\ref{sec-merger}), two stages of output
583occur. The first is when each object in the list is compared with all
584others. The output shows two numbers: the first being how far through
585the list we are, and the second being the length of the list. As the
586algorithm proceeds, the first number should increase and the second
587should decrease (as objects are combined). When the numbers meet (\ie
588the whole list has been compared), the second phase begins, in which
589multiply-appearing pixels in each object are removed, as are objects
590not meeting the minimum channels requirement. During this phase, the
591total number of accepted objects is shown, which should steadily
592increase until all have been accepted or rejected. Note that these
593steps can be very quick for small numbers of detections.
594
595Since this continual printing to screen has some overhead of time and
596CPU involved, the user can elect to not print this information by
597setting the parameter {\tt verbose = 0}. In this case, the user is
598still informed as to the steps being undertaken, but the details of
599the progress are not shown.
600
601\subsection{Results}
602
603Finally, we get to the results -- the reason for running Duchamp in
604the first place. Once the detection list is finalised, it is sorted by
605the mean velocity of the detections (or, if there is no good WCS
606associated with the cube, by the mean Z-pixel position). The results
607are then printed to the screen and to the output file, denoted by the
608{\tt OutFile} parameter. The results list, an example of which can be
609seen in Appendix~\ref{app-output}, contains the following columns
610(note that the title of the columns depending on WCS information will
611depend on the projection of the WCS):
612
613\begin{entry}
614\item[Obj\#] The ID number of the detection (simply the sequential
615  count for the list, which is ordered by increasing velocity).
616\item[Name] The IAU-format name of the detection (based on the WCS
617  projection).
618\item[X] The average X-pixel position.
619\item[Y] The average Y-pixel position.
620\item[Z] The average Z-pixel position.
621\item[RA/GLON] The Right Ascension or Galactic Longitude of the centre
622of the object.
623\item[DEC/GLAT] The Declination or Galactic Latitude of the centre of
624the object.
625\item[w\_RA/w\_GLON] The width of the object in Right Ascension or
626Galactic Longitude [arcmin].
627\item[w\_DEC/w\_GLAT] The width of the object in Declination Galactic
628  Latitude [arcmin].
629\item[VEL] The mean velocity of the object [km/s].
630\item[w\_VEL] The full velocity width of the detection (max channel
631  $-$ min channel, in velocity units [km/s]).
632\item[F\_tot] The integrated flux over the object, in the units of
633  flux times velocity (\eg Jy km/s).
634\item[F\_peak] The peak flux over the object, in the units of flux.
635\item[X1, X2] The minimum and maximum X-pixel coordinates.
636\item[Y1, Y2] The minimum and maximum Y-pixel coordinates.
637\item[Z1, Z2] The minimum and maximum Z-pixel coordinates.
638\item[Npix] The number of pixels \& channels (\ie distinct $(x,y,z)$
639  coordinates) in the detection.
640\item[Flag] Whether the detection has any warning flags (see below).
641\end{entry}
642The Name is derived from the WCS position. For instance, the (RA,Dec)
643position 12$^h$53$^m$45$^s$, -36$^\circ$24$'$12$''$ will be called
644J1253$-$3624 (if the epoch is J2000) or B1253$-$3624 (if B1950). An
645alternative form is used for Galactic coordinates: the position
646($l$,$b$) = (323.1245, 5.4567) will be called G323.12$+$05.45. If the
647WCS is not valid (\ie is not present or does not have all the
648necessary information), the Name, RA, DEC, VEL and related columns are
649not printed, but the pixel coordinates are still provided.
650
651\begin{figure}[t]
652\begin{center}
653\includegraphics[width=\textwidth]{example_spectrum}
654\end{center}
655\caption{\footnotesize An example of the spectrum output. Note several
656  of the features discussed in the text: the removal of the Milky Way
657  emission around 0 km/s; the red lines indicating the reconstructed
658  spectrum; the blue dashed lines indicating the spectral extent of
659  the detection; the blue border showing its spatial extent on the
660  0th moment map; and the 15~arcmin-long scale bar.}
661\label{fig-spect}
662\end{figure}
663
664The last column contains any warning flags about the detection. There
665are currently two options here. An `E' is printed if the detection is
666next to the edge of the image, meaning either the limit of the pixels,
667or the limit of the non-BLANK pixel region. An `N' is printed if the
668total flux, summed over all the (non-BLANK) pixels in the smallest box
669that completely encloses the detection, is negative. Note that this
670sum will possibly include non-detected pixels. It is of use in
671pointing out detections that lie next to strongly negative pixels,
672such as might arise due to interference -- the detected pixels might
673then also be due to the interference, so caution is advised.
674
675Two alternative results files can also be requested. One option is a
676VOTable-format XML file, containing just the RA, Dec, Velocity and the
677corresponding widths of the detections, as well as the fluxes. The
678user should set {\tt flagVOT = 1}, and put the desired filename in the
679parameter {\tt votFile} -- note that the default is for it not to be
680produced. This file should be compatible with all Virtual Observatory
681tools (such as Aladin\footnote{ Aladin can be found on the web at
682\href{http://aladin.u-strasbg.fr/}{http://aladin.u-strasbg.fr/}}). The
683second option is an annotation file for use with the Karma toolkit of
684visualisation tools (in particular, with {\tt kvis}). This will draw a
685circle at the position of each detection, and number it according to
686the Obj\# given above. To use, the user should set {\tt flagKarma = 1},
687and put the desired filename in the parameter {\tt karmaFile} -- again,
688the default is for it not to be produced.
689
690As the program is running, it also (optionally) records the detections
691made in each individual spectrum or channel (see
692\S\ref{sec-detection} for details on this process). This is
693recorded in the file denoted by the parameter {\tt LogFile}. This file
694does not include the columns {\tt Name, RA, DEC, w\_RA, w\_DEC, VEL,
695w\_VEL}. This file is designed primarily for diagnostic purposes: \eg
696to see if a given set of pixels is detected in, say, one channel
697image, but does not survive the merging process. The list of pixels
698(and their fluxes) in the final detection list are also printed to
699this file, again for diagnostic purposes. This feature can be turned
700off by setting {\tt flagLog = false}. (This may be a good idea if you
701are not interested in its contents, as it can be a large file.)
702
703\begin{figure}[!t]
704\begin{center}
705\includegraphics[width=\textwidth]{example_moment_map}
706\end{center}
707\caption{\footnotesize An example of the moment map created by
708  Duchamp. The full extent of the cube is covered, and the 0th moment
709  of each object is shown (integrated individually over all the
710  detected channels).}
711\label{fig-moment}
712\end{figure}
713
714As well as the output data file, a postscript file is created that
715shows the spectrum for each detection, together with a small cutout
716image (0th moment) and basic information about the detection (note
717that any flags are printed after the name of the detection, in the
718format {\tt [E]}). If the
719cube was reconstructed, the spectrum from the reconstruction is shown
720in red, over the top of the original spectrum. The spectrum that is
721plotted is governed by the {\tt spectralMethod} parameter. It can be
722either {\tt peak}, where the spectrum is from the spatial pixel
723containing the detection's peak flux; or {\tt sum}, where the spectrum
724is summed over all spatial pixels, and then corrected for the beam
725size.
726
727The spectral extent of the detection is indicated with blue lines, and
728a zoom is shown in a separate window. The cutout image can optionally
729include a border around the spatial pixels that are in the detection
730(turned on and off by the parameter {\tt drawBorders}). It also
731includes a scale bar in the bottom left corner to indicate size -- it
732is 15~arcmin long (note that due to projection effects it may be a
733slightly different physical length from object to object). An example
734detection can be seen below in Fig.~\ref{fig-spect}.
735
736Finally, a couple of images are optionally produced: a 0th moment map
737of the cube, combining just the detected channels in each object,
738showing the integrated flux in grey-scale; and a ``detection image'',
739a grey-scale image where the pixel values are the number of channels
740that spatial pixel is detected in. In both cases, if {\tt drawBorders =
741true}, a border is drawn around the spatial extent of each
742detection. An example moment map is shown in Fig.~\ref{fig-moment}.
743The production or otherwise of these images is governed by the {\tt
744flagMaps} parameter.
745
746The purpose of these images are to provide a visual guide to where the
747detections have been made, and, particularly in the case of the moment
748map, to provide an indication of the strength of the source. In both
749cases, the detections are numbered (in the same way as the output
750list), and the spatial borders are marked out as for the cutout images
751in the spectra file. Both these images are saved as postscript files
752(given by the parameters {\tt momentMap} and {\tt detectionMap}
753respectively), with the latter also displayed in a {\sc pgplot}
754window (regardless of the state of {\tt flagMaps}).
755
756\section{Notes and hints on the use of Duchamp}
757
758In using Duchamp, the user has to make a number of decisions about
759the way the program runs. This section is designed to give the user
760some idea about what to choose.
761
762The main choice is whether or not to use the wavelet
763reconstruction. The main benefits of this are the marked reduction in
764the noise level, leading to regularly-shaped detections, and good
765reliability for faint sources. The main drawback with its use is the
766long execution time: to reconstruct a $170\times160\times1024$
767(\hipass) cube often requires three iterations and takes about 20-25
768minutes. The searching part of the procedure is much quicker (although
769see the note on merging, below), so if one uses the FDR method on the
770un-reconstructed cube, the execution time is only a couple of
771minutes. Alternatively, using the ability to read in previously-saved
772reconstructed arrays makes running the reconstruction more than once a
773more feasible prospect.
774
775%A further drawback with the reconstruction is that it is susceptible
776%to edge effects. If the valid area in the cube (\ie the part that is
777%not BLANK) has very curved edges (such as the \hipass\ polar cap cube,
778%H001, which has a roughly circular shape after gridding), the
779%convolution can produce artefacts in the reconstruction that mimic the
780%edges and can lead (depending on the selection threshold) to some
781%spurious sources. Caution is advised with such data -- the user is
782%advised to check carefully the reconstructed cube for the presence of
783%such artefacts.
784
785If one chooses the reconstruction method, a further decision is
786required on the signal-to-noise cutoff used in determining acceptable
787wavelet coefficients. A larger value will remove more noise from the
788cube, at the expense of losing fainter sources, while a smaller value
789will include more noise, which may produce spurious detections, but
790will be more sensitive to faint sources. Values of less than about
791$3\sigma$ tend to not reduce the noise a great deal and can lead to
792many spurious sources (although this will depend on the nature of the
793cube).
794
795The FDR method certainly produces more reliable results than a simple
796sigma-clipping (\ie thresholding at some number of $\sigma$ above the
797mean), particularly if no reconstruction is done. However, at this
798point it does not seem to be giving the sensitivity expected for the
799supplied value of {\tt alpha} (\ie it is not finding as many sources
800as expected). Work is being done to assess this, and to judge whether
801there is a real problem (such as with the determination of the
802statistics), or simply a result of working in 3 dimensions as opposed
803to 2.
804
805A further point to bear in mind is that the shape of the detections in
806a cube that has been reconstructed will be much more regular and
807smooth -- the ragged edges that objects in the raw cube possess are
808smoothed by the removal of most of the noise.
809
810Finally, as Duchamp is still undergoing development, there are some
811elements that are not fully developed. In particular, it is not as
812clever as I would like at avoiding interference. The ability to place
813requirements on the minimum number of channels and pixels partially
814circumvents this problem, but work is being done to make Duchamp
815smarter at rejecting signals that are clearly (to a human eye at
816least) interference. See the following section for further
817improvements that are planned.
818
819%\section{Drawbacks of the current program}
820%
821%The program currently has a few problems/drawbacks/things to be aware
822%of that will hopefully be fixed in the future:
823%\begin{itemize}
824%
825%\item Narrow interference spikes are still getting found, particularly
826%  if there is no reconstruction, or reconstruction with a relatively
827%  low {\tt snrRecon} (such as 2 or 3). Increasing the {\tt
828%  minChannels} parameter is one way to circumvent this, but making the
829%  algorithm a bit more clever would be preferable.
830%
831%\item Sources that have strong continuum ripple and/or artefacts often
832%  generate many spurious detections. This needs some work to avoid
833%  Duchamp doing this, and until then users are advised to be aware
834%  of the possibility. Strong continuum ripples may generate many
835%  sources on the same spatial pixel, and this will be apparent on the
836%  detection images.
837%
838%\item Spectra are integrated over every spatial pixel of the
839%  detection, and this may dilute the actual detection, making it
840%  harder to see \ie the apparent strength of the line as plotted may
841%  not give a true indication of how strong it really is.
842%
843%%\item A caution on the merging part of the procedure. This can be time
844%%  consuming if there are many detections that do not require merging
845%%  -- in this case, the time will go like $N^2$ ($N$ = number of
846%%  detections). If there are plenty of mergers, the size of the list
847%%  reduces quickly, so the execution time will be less.
848%
849%
850%\end{itemize}
851
852
853%\section{Comparison with other software (to be developed further...)}
854%
855%\subsection{fred, by Matt Howlett}
856%
857%This is the program used in the \hipass\ analysis. It smoothes the
858%data spectrally with a boxcar filter of a size that varies over a
859%user-specified range, and then thresholds the data.
860%
861%Works effectively, but generally doesn't find as many sources as
862%Duchamp, particularly when the reconstruction is used. Sensitive to
863%faint, broad features that fall below the reconstruction threshold.
864%
865%Execution takes a long time, depending on the range of filter widths
866%that are used.
867%
868%\subsection{sfind}
869%
870%Hard to evaluate, as it does not (as far as I can see) output the
871%channel number at which detections are made, and does not merge
872%detections made at adjacent channels (\ie it just works in 2
873%dimensions).
874%
875
876\section{Future Developments}
877
878This is both a list of planned improvements and a wish-list of
879features that would be nice to include (but are not planned in the
880immediate future). Let me know if there are items not on this list, or
881items on the list you would like prioritised.
882
883\begin{itemize}
884
885\item More varied output formats. {\bf Planned.}
886
887\item Better determination of the noise characteristics of
888  spectral-line cubes, including understanding how the noise is
889  generated and developing a model for it. {\bf Planned.}
890 
891\item Include more source analysis. Examples could be: shape
892  information; measurements of HI mass; better measurements of
893  velocity width and profile... {\bf Some planned.}
894
895\item Provide some indication of the significance of the detection
896  (\ie some S/N-like value). {\bf Planned.}
897
898\item Improved ability to reject interference, possibly on the
899  spectral shape of features. {\bf Planned.}
900
901\item Ability to separate (de-blend) distinct sources that have been
902  merged. {\bf Planned.}
903
904\item Link to lists of possible counterparts (\eg via NED/SIMBAD/other
905  VO tools?). {\bf Wishlist.}
906
907\item At this point, the ``Milky Way'' channels are discarded and set
908  to zero. It may be that users would like to have those put back in
909  the final cube after the source detection is done, so at some point
910  this option may be added. {\bf Wishlist -- if needed.}
911
912\end{itemize}
913
914
915%\bibliographystyle{mn2e}
916%\bibliographystyle{abbrvnat}
917%\bibliography{mnrasmnemonic,sourceDetection}
918\begin{thebibliography}{}
919
920\bibitem[\protect\citeauthoryear{{Calabretta} \& {Greisen}}{{Calabretta} \&
921  {Greisen}}{2002}]{calabretta02}
922{Calabretta} M.,  {Greisen} E.,  2002, A\&A, 395, 1077
923
924\bibitem[\protect\citeauthoryear{{Greisen} \& {Calabretta}}{{Greisen} \&
925  {Calabretta}}{2002}]{greisen02}
926{Greisen} E.,  {Calabretta} M.,  2002, A\&A, 395, 1061
927
928\bibitem[\protect\citeauthoryear{{Hanisch}, {Farris}, {Greisen}, {Pence},
929  {Schlesinger}, {Teuben}, {Thompson} \& {Warnock}}{{Hanisch}
930  et~al.}{2001}]{hanisch01}
931{Hanisch} R.,  {Farris} A.,  {Greisen} E.,  {Pence} W.,  {Schlesinger} B.,
932  {Teuben} P.,  {Thompson} R.,    {Warnock} A.,  2001, A\&A, 376, 359
933
934\bibitem[\protect\citeauthoryear{{Hopkins}, {Miller}, {Connolly}, {Genovese},
935  {Nichol} \& {Wasserman}}{{Hopkins} et~al.}{2002}]{hopkins02}
936{Hopkins} A.,  {Miller} C.,  {Connolly} A.,  {Genovese} C.,  {Nichol} R.,
937  {Wasserman} L.,  2002, AJ, 123, 1086
938
939\bibitem[\protect\citeauthoryear{Lutz}{Lutz}{1980}]{lutz80}
940Lutz R.,  1980, The Computer Journal, 23, 262
941
942\bibitem[\protect\citeauthoryear{{Meyer} et~al.,}{{Meyer}
943  et~al.}{2004}]{meyer04:trunc}
944{Meyer} M.,  et~al., 2004, MNRAS, 350, 1195
945
946\bibitem[\protect\citeauthoryear{{Miller}, {Genovese}, {Nichol}, {Wasserman},
947  {Connolly}, {Reichart}, {Hopkins}, {Schneider} \& {Moore}}{{Miller}
948  et~al.}{2001}]{miller01}
949{Miller} C.,  {Genovese} C.,  {Nichol} R.,  {Wasserman} L.,  {Connolly} A.,
950  {Reichart} D.,  {Hopkins} A.,  {Schneider} J.,    {Moore} A.,  2001, AJ, 122,
951  3492
952
953\bibitem[\protect\citeauthoryear{Minchin}{Minchin}{1999}]{minchin99}
954Minchin R.,  1999, PASA, 16, 12
955
956\bibitem[\protect\citeauthoryear{Starck \& Murtagh}{Starck \&
957  Murtagh}{2002}]{starck02:book}
958Starck J.-L.,  Murtagh F.,  2002, {``Astronomical Image and Data Analysis''}.
959Springer
960
961\end{thebibliography}
962
963
964\appendix
965\newpage
966\section{Available parameters}
967\label{app-param}
968
969The full list of parameters that can be listed in the input file are
970given here. If not listed, they take the default value given in
971parentheses. Since the order of the parameters in the input file does
972not matter, they are grouped here in logical sections.
973
974\subsection*{Input-output related}
975\begin{entry}
976\item[ImageFile (no default assumed)] The filename of the
977  data cube to be analysed.
978\item[flagSubsection {\tt [false]}] A flag to indicate whether one
979  wants a subsection of the requested image.
980\item[Subsection {\tt [ [*,*,*] ]}] The requested subsection, which
981  should be specified in the format {\tt [x1:x2,y1:y2,z1:z2]}, where
982  the limits are inclusive. If the full range of a dimension is
983  required, use a {\tt *}, \eg if you want the full spectral range of
984  a subsection of the image, use {\tt [30:140,30:140,*]}.
985\item[flagReconExists {\tt [false]}] A flag to indicate whether the
986  reconstructed array has been saved by a previous run of Duchamp. If
987  set true, the reconstructed array will be read from the file given by
988  {\tt reconFile}, rather than calculated directly.
989\item[reconFile (no default assumed)] The FITS file that contains the
990  reconstructed array. If {\tt flagReconExists} is true and this
991  parameter is not defined, the default file searched will be
992  determined by the {\`a} trous parameters (see \S\ref{sec-recon}).
993\item[OutFile {\tt [duchamp-Results.txt]}] The file containing the
994  final list of detections. This also records the list of input
995  parameters.
996\item[SpectraFile {\tt [duchamp-Spectra.ps]}] The postscript file
997  containing the resulting integrated spectra and images of the
998  detections.
999\item[flagLog {\tt [true]}] A flag to indicate whether intermediate
1000  detections should be logged.
1001\item[LogFile {\tt [duchamp-Logfile.txt]}] The file in which intermediate
1002  detections are logged. These are detections that have not been
1003  merged. This is primarily for use in debugging and diagnostic
1004  purposes -- normal use of the program will probably not require
1005  this.
1006\item[flagOutputRecon {\tt [false]}] A flag to say whether or not to
1007  save the reconstructed cube as a FITS file. The filename will be
1008  derived from the ImageFile -- the reconstruction of {\tt image.fits}
1009  will be saved as {\tt image.RECON?.fits}, where {\tt ?} stands for
1010  the value of {\tt snrRecon} (see below).
1011\item[flagOutputResid {\tt [false]}] As for {\tt flagOutputRecon}, but
1012  for the residual array -- the difference between the original cube
1013  and the reconstructed cube. The filename will be {\tt
1014  image.RESID?.fits}.
1015\item[flagVOT {\tt [false]}] A flag to say whether to create a VOTable
1016  file corresponding to the information in {\tt outfile}. This will be
1017  an XML file in the Virtual Observatory VOTable format.
1018\item[votFile {\tt [duchamp-Results.xml]}] The VOTable file with the
1019  list of final detections. Some input parameters are also recorded.
1020\item[flagKarma {\tt [false]}] A flag to say whether to create a Karma
1021  annotation file corresponding to the information in {\tt
1022  outfile}. This can be used as an overlay for the Karma programs such
1023  as {\tt kvis}.
1024\item[karmaFile {\tt [duchamp-Results.ann]}] The Karma annotation
1025  file showing the list of final detections.
1026\item[flagMaps {\tt [true]}] A flag to say whether to save postscript
1027  files showing the 0th moment map of the whole cube (parameter {\tt
1028  momentMap}) and the detection image ({\tt detectionMap}).
1029\item[momentMap {\tt [duchamp-MomentMap.ps]}] A postscript file
1030  containing a map of the 0th moment of the detected sources, as well
1031  as pixel and WCS coordinates.
1032\item[detectionMap {\tt [duchamp-DetectionMap.ps]}] A postscript
1033  file showing each of the detected objects, coloured in greyscale by
1034  the number of channels they span. Also shows pixel and WCS
1035  coordinates.
1036\end{entry}
1037
1038\subsection*{Modifying the cube}
1039\begin{entry}
1040\item[flagBlankPix {\tt [true]}] A flag to say whether to remove BLANK
1041  pixels from the analysis -- these are pixels set to some particular
1042  value because they fall outside the imaged area.
1043\item[blankPixValue {\tt [-8.00061]}] The value of the BLANK pixels,
1044  if this information is not contained in the FITS header (the usual
1045  procedure is to obtain this value from the header information -- in
1046  which case the value set by this parameter is ignored).
1047\item[flagMW {\tt [false]}] A flag to say whether to remove channels
1048  contaminated by Milky Way (or other) emission -- the flux in these
1049  channels is currently just set to 0.
1050\item[maxMW {\tt [112]}] The maximum channel for the Milky Way
1051  emission.
1052\item[minMW {\tt [75]}] The minimum channel for the Milky Way
1053  emission. Note that the channels specified by {\tt maxMW} and {\tt
1054  minMW} are assumed to be Milky Way channels (\ie the range is
1055  inclusive).
1056\item[flagBaseline {\tt [false]}] A flag to say whether to remove the
1057  baseline from each spectrum in the cube for the purposes of
1058  reconstruction and detection.
1059\end{entry}
1060
1061\subsection*{Detection related}
1062
1063\subsubsection*{General detection}
1064\begin{entry}
1065\item[flagNegative {\tt [false]}] A flag to indicate that the features
1066  being searched for are negative. The cube will be inverted prior to
1067  searching.
1068\item[snrCut {\tt [3.]}] The cut-off value for thresholding, in terms
1069  of number of $\sigma$ above the mean.
1070\item[flagGrowth {\tt [false]}] A flag indicating whether or not to
1071  grow the detected objects to a smaller threshold.
1072\item[growthCut {\tt [2.]}] The smaller threshold using in growing
1073  detections. In units of $\sigma$ above the mean.
1074\end{entry}
1075
1076\subsubsection*{{\` a} trous reconstruction}
1077\begin{entry}
1078\item [flagATrous {\tt [true]}] A flag indicating whether or not to
1079  reconstruct the cube using the {\it {\`a} trous} wavelet
1080  reconstruction. Currently does this in 3-dimensions. See
1081  \S\ref{sec-recon} for details.
1082\item[scaleMin {\tt [1]}] The minimum wavelet scale to be used in the
1083  reconstruction. A value of 1 means ``use all scales''.
1084\item[snrRecon {\tt [4]}] The thresholding cutoff used in the
1085  reconstruction -- only wavelet coefficients this many $\sigma$ above
1086  the mean (or greater) are included in the reconstruction.
1087\item[filterCode {\tt [2]}] The code number of the filter to use in
1088  the reconstruction. The options are:
1089  \begin{itemize}
1090  \item {\bf 1:} B$_3$-spline filter: coefficients =
1091    $(\frac{1}{16}, \frac{1}{4}, \frac{3}{8}, \frac{1}{4}, \frac{1}{16})$
1092  \item {\bf 2:} Triangle filter: coefficients = $(\frac{1}{4}, \frac{1}{2}, \frac{1}{4})$
1093  \item {\bf 3:} Haar wavelet: coefficients = $(0, \frac{1}{2}, \frac{1}{2})$
1094  \end{itemize}
1095\end{entry}
1096
1097\subsubsection*{FDR method}
1098\begin{entry}
1099\item[flagFDR {\tt [false]}] A flag indicating whether or not to use
1100  the False Discovery Rate method in thresholding the pixels.
1101\item[alphaFDR {\tt [0.01]}] The $\alpha$ parameter used in the FDR
1102analysis. The average number of false detections, as a fraction of the
1103total number, will be less than $\alpha$ (see \S\ref{sec-detection}).
1104\end{entry}
1105
1106\subsubsection*{Merging detections}
1107\begin{entry}
1108\item[minPix {\tt [2]}] The minimum number of spatial pixels for a single
1109  detection to be counted.
1110\item[minChannels {\tt [3]}] The minimum number of consecutive
1111  channels that must be present in the detection for it to be accepted
1112  by the Merging algorithm.
1113%The minimum number of channels that a
1114%  detection must span for it to be accepted by the Merging algorithm.
1115\item[flagAdjacent {\tt [true]}] A flag indicating whether to use the
1116  ``adjacent pixel'' criterion to decide whether to merge objects. If
1117  not, the next two parameters are used to determine whether objects
1118  are within the necessary thresholds.
1119\item[threshSpatial {\tt [3.]}] The maximum allowed minimum spatial
1120  separation (in pixels) between two detections for them to be merged
1121  into one. Only used if {\tt flagAdjacent = false}.
1122\item[threshVelocity {\tt [7.]}] The maximum allowed minimum channel
1123  separation between two detections for them to be merged into
1124  one. %Only used if {\tt flagAdjacent = false}.
1125\end{entry}
1126
1127\subsubsection*{Other parameters}
1128\begin{entry}
1129\item[spectralMethod {\tt [peak]}] This indicates which method is used
1130  to plot the output spectra: {\tt peak} means plot the spectrum
1131  containing the detection's peak pixel; {\tt sum} means sum the
1132  spectra of each detected spatial pixel, and correct for the beam
1133  size. Any other choice defaults to {\tt peak}.
1134\item[drawBorders {\tt [true]}] A flag indicating whether borders
1135  are to be drawn around the detected objects in the moment maps
1136  included in the output (see for example Fig.~\ref{fig-spect}).
1137\item[verbose {\tt [true]}] A flag indicating whether to print the
1138  progress of computationally-intensive algorithms (such as the
1139  searching and merging) to screen.
1140\end{entry}
1141
1142
1143\newpage
1144\section{Example parameter files}
1145\label{app-input}
1146
1147This is what a typical parameter file would look like.
1148
1149\begin{verbatim}
1150imageFile       /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits
1151logFile         logfile.txt
1152outFile         results.txt
1153spectraFile     spectra.ps
1154flagSubsection  0
1155flagOutputRecon 0
1156flagOutputResid 0
1157flagBlankPix    1
1158flagMW          1
1159minMW           75
1160maxMW           112
1161minPix          3
1162flagGrowth      1
1163growthCut       1.5
1164flagATrous      0
1165scaleMin        1
1166snrRecon        4
1167flagFDR         1
1168alphaFDR        0.1
1169numPixPSF       20
1170snrCut          3
1171threshSpatial   3
1172threshVelocity  7
1173\end{verbatim}
1174
1175Note that it is not necessary to include all these parameters in the
1176file, only those that need to be changed from the defaults (as listed
1177in Appendix~\ref{app-param}), which in this case would be very few. A
1178minimal parameter file might look like:
1179\begin{verbatim}
1180imageFile       /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits
1181flagLog         0
1182snrRecon        3
1183snrCut          2.5
1184minChannels     4
1185\end{verbatim}
1186This will reconstruct the cube with a lower SNR value than the
1187default, select objects at a lower threshold,  with a looser minimum
1188channel requirement, and not keep a log of the intermediate
1189detections.
1190
1191The following page demonstrates how the parameters are presented to
1192the user, both on the screen at execution time and in the output and
1193log files:
1194\newpage
1195\begin{landscape}
1196Presentation of parameters in output and log files: 
1197\begin{verbatim}
1198---- Parameters ----
1199Image to be analysed                    = /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits
1200Intermediate Logfile                    = duchamp-Logfile.txt
1201Final Results file                      = duchamp-Results.txt
1202Spectrum file                           = duchamp-Spectra.ps
1203VOTable file                            = duchamp-Results.xml
12040th Moment Map                          = duchamp-MomentMap.ps
1205Detection Map                           = duchamp-DetectionMap.ps
1206Saving reconstructed cube?              = false
1207Saving residuals from reconstruction?   = false
1208------
1209Searching for Negative features?        = false
1210Fixing Blank Pixels?                    = true
1211Blank Pixel Value                       = -8.00061
1212Removing Milky Way channels?            = true
1213Milky Way Channels                      = 75-112
1214Beam Size (pixels)                      = 10.1788
1215Removing baselines before search?       = false
1216Minimum # Pixels in a detection         = 2
1217Growing objects after detection?        = false
1218Using A Trous reconstruction?           = true
1219Minimum scale in reconstruction         = 1
1220SNR Threshold within reconstruction     = 4
1221Filter being used for reconstruction    = B3 spline function
1222Using FDR analysis?                     = false
1223SNR Threshold                           = 2.5
1224Using Adjacent-pixel criterion?         = true
1225Max. velocity separation for merging    = 7
1226Min. # channels for merging             = 4
1227Method of spectral plotting             = peak
1228\end{verbatim}
1229
1230\newpage
1231\section{Example output file}
1232\label{app-output}
1233This the typical content of an output file, after running Duchamp
1234with the parameters illustrated on the previous page.
1235
1236{\scriptsize
1237  \begin{verbatim}
1238Results of the Duchamp source finder: Tue May 23 14:51:38 2006
1239---- Parameters ----
1240
1241(... omitted for clarity -- see previous page for examples...)
1242
1243--------------------
1244Total number of detections = 23
1245--------------------
1246 Obj#          Name     X     Y      Z           RA          DEC    w_RA   w_DEC       VEL    w_VEL     F_tot   F_peak  X1  X2  Y1  Y2   Z1   Z2  Npix Flag
1247-----------------------------------------------------------------------------------------------------------------------------------------------------------
1248    1    J0609-2156  59.4 140.6  114.7  06:09:21.03 -21:56:51.08   48.48   39.45   226.253   65.957    17.572    0.213  55  66 136 145  113  118   185     
1249    2    J0607-2601  65.2  79.6  116.2  06:07:52.21 -26:01:09.34   44.44   39.50   246.310   39.574     4.144    0.100  60  70  76  85  115  118    50     
1250    3    J0606-2720  70.8  59.8  121.4  06:06:14.90 -27:20:45.24   52.45   47.59   315.404   39.574    17.066    0.150  65  77  53  64  120  123   213     
1251    4    J0611-2138  52.5 145.1  162.5  06:11:18.85 -21:38:03.71   32.39   23.49   856.919  118.722    44.394    0.410  49  56 142 147  158  167   303    E
1252    5    J0600-2859  89.7  35.3  202.4  06:00:33.13 -28:59:01.59   23.92   28.10  1383.476  184.678    26.573    0.173  87  92  32  38  195  209   319     
1253    6    J0558-2639  95.5  70.2  222.6  05:58:52.79 -26:39:04.56   15.93   12.10  1650.508  105.531     1.925    0.063  94  97  69  71  219  227    35     
1254    7    J0617-2724  34.8  58.3  227.5  06:17:05.84 -27:24:00.93   20.75   23.42  1714.993  303.400    11.414    0.093  33  37  56  61  215  238   176     
1255    8    J0609-2141  60.3 144.4  229.6  06:09:05.74 -21:41:38.75   16.14   11.82  1742.470  105.531     1.476    0.068  59  62 143 145  225  233    25     
1256    9    J0558-2525  95.7  88.6  231.1  05:58:51.19 -25:25:33.12   27.87   24.16  1762.632  250.635    16.930    0.115  92  98  86  91  220  239   257     
1257   10    J0600-2141  88.9 144.4  232.3  06:00:52.94 -21:41:57.48   31.95   24.15  1777.848  224.252    34.030    0.166  86  93 142 147  222  239   415    E
1258   11    J0615-2634  40.0  70.8  232.6  06:15:25.93 -26:34:35.73   16.54   19.58  1782.224   52.765     2.757    0.068  38  41  69  73  231  235    44     
1259   12    J0604-2606  75.9  78.4  233.1  06:04:42.24 -26:06:22.98   28.12   23.86  1788.258  224.252    27.059    0.155  73  79  76  81  225  242   352     
1260   13    J0601-2340  88.0 114.9  235.7  06:01:08.27 -23:40:17.66   35.94   32.09  1822.941  263.826    85.132    0.297  84  92 112 119  226  246   724     
1261   14    J0615-2234  38.2 130.6  253.6  06:15:30.57 -22:34:51.69   12.38   15.71  2059.721  118.722     2.317    0.070  37  39 129 132  248  257    40     
1262   15    J0617-2305  31.4 122.8  258.0  06:17:33.18 -23:05:36.24   16.45   15.54  2117.104   39.574     1.424    0.062  30  33 121 124  256  259    23     
1263   16    J0612-2149  49.5 142.3  271.1  06:12:11.78 -21:49:20.22   24.35   19.58  2290.167  395.740    20.712    0.101  47  52 140 144  257  287   318     
1264   17    J0616-2133  35.2 145.9  300.0  06:16:16.44 -21:33:36.96   20.21    7.47  2671.799  224.252     3.851    0.127  33  37 145 146  294  311    40    E
1265   18    J0544-2736 144.0  54.9  325.4  05:44:13.62 -27:36:34.24    3.57   12.13  3006.575   39.574     0.436    0.057 144 144  54  56  324  327     7    E
1266   19    J0555-2956 107.2  20.7  367.5  05:55:10.37 -29:56:43.13   19.65   24.31  3561.004   39.574     6.482    0.169 105 109  18  23  366  369    72     
1267   20    J0558-2321  96.0 119.6  532.1  05:58:47.64 -23:21:17.38   11.91   16.09  5733.479   52.765     1.287    0.051  95  97 118 121  530  534    27     
1268   21    J0616-2649  37.9  67.0  547.0  06:16:04.62 -26:49:18.33   12.35   11.67  5929.923   39.574     1.637    0.064  37  39  66  68  546  549    25     
1269   22    J0619-2252  25.1 125.9  724.2  06:19:21.57 -22:52:13.98   12.38   11.61  8267.304   39.573     0.698    0.059  24  26 125 127  723  726    13    E
1270   23    J0552-2916 116.9  30.5  727.0  05:52:15.05 -29:16:49.65   11.59   20.25  8304.033  303.400    35.834    0.479 116 118  28  32  716  739   132     
1271  \end{verbatim}
1272}
1273Note that the
1274width of the table can make it hard to read. A good trick for those
1275using UNIX/Linux is to make use of the {\tt a2ps} command. The
1276following works well, producing a postscript file {\tt results.ps}:
1277\\\verb|a2ps -1 -r -f8 -o duchamp-Results.ps duchamp-Results.txt|
1278
1279%\end{landscape}
1280
1281\newpage
1282\section{Example VOTable output}
1283\label{app-votable}
1284This is part of the VOTable, in XML format, corresponding to the
1285output file in Appendix~\ref{app-output} (the indentation has been removed to make it fit on the page!).
1286
1287%\begin{landscape}
1288{\scriptsize
1289  \begin{verbatim}
1290<?xml version="1.0"?>
1291<VOTABLE version="1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
1292 xsi:noNamespaceSchemaLocation="http://www.ivoa.net/xml/VOTable/VOTable/v1.1">
1293<COOSYS ID="J2000" equinox="J2000." epoch="J2000." system="eq_FK5"/>
1294<RESOURCE name="Duchamp Output">
1295<TABLE name="Detections">
1296<DESCRIPTION>Detected sources and parameters from running the Duchamp source finder.</DESCRIPTION>
1297<PARAM name="FITS file" datatype="char" ucd="meta.file;meta.fits" value="/DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits"/>
1298<PARAM name="Threshold" datatype="float" ucd="stat.snr" value="2.5">
1299<PARAM name="ATrous note" datatype="char" ucd="meta.note" value="The a trous reconstruction method was used, with the following parameters.">
1300<PARAM name="ATrous Cut" datatype="float" ucd="stat.snr" value="4">
1301<PARAM name="ATrous Minimum Scale" datatype="int" ucd="stat.param" value="1">
1302<PARAM name="ATrous Filter" datatype="char" ucd="meta.code;stat" value="B3 spline function">
1303<FIELD name="ID" ID="col1" ucd="meta.id" datatype="int" width="4"/>
1304<FIELD name="Name" ID="col2" ucd="meta.id;meta.main" datatype="char" arraysize="14"/>
1305<FIELD name="RA" ID="col3" ucd="pos.eq.ra;meta.main" ref="J2000" datatype="float" width="10" precision="6" unit="deg"/>
1306<FIELD name="Dec" ID="col4" ucd="pos.eq.dec;meta.main" ref="J2000" datatype="float" width="10" precision="6" unit="deg"/>
1307<FIELD name="w_RA" ID="col3" ucd="phys.angSize;pos.eq.ra" ref="J2000" datatype="float" width="7" precision="2" unit="arcmin"/>
1308<FIELD name="w_Dec" ID="col4" ucd="phys.angSize;pos.eq.dec" ref="J2000" datatype="float" width="7" precision="2" unit="arcmin"/>
1309<FIELD name="Vel" ID="col4" ucd="phys.veloc;src.dopplerVeloc" datatype="float" width="9" precision="3" unit="km/s"/>
1310<FIELD name="w_Vel" ID="col4" ucd="phys.veloc;src.dopplerVeloc;spect.line.width" datatype="float" width="8" precision="3" unit="km/s"/>
1311<FIELD name="Integrated_Flux" ID="col4" ucd="phys.flux;spect.line.intensity" datatype="float" width="10" precision="3" unit="km/s"/>
1312<DATA>
1313<TABLEDATA>
1314<TR>
1315<TD>   1</TD><TD>    J0609-2200</TD><TD> 92.410416</TD><TD>-22.013390</TD><TD>  48.50</TD><TD>  39.42</TD><TD>  213.061</TD><TD>  65.957</TD><TD>    17.572</TD>
1316</TR>
1317<TR>
1318<TD>   2</TD><TD>    J0608-2605</TD><TD> 92.042633</TD><TD>-26.085157</TD><TD>  44.47</TD><TD>  39.47</TD><TD>  233.119</TD><TD>  39.574</TD><TD>     4.144</TD>
1319</TR>
1320<TR>
1321<TD>   3</TD><TD>    J0606-2724</TD><TD> 91.637840</TD><TD>-27.412022</TD><TD>  52.48</TD><TD>  47.57</TD><TD>  302.213</TD><TD>  39.574</TD><TD>    17.066</TD>
1322</TR>
1323(... table truncated for clarity ...)
1324</TABLEDATA>
1325</DATA>
1326</TABLE>
1327</RESOURCE>
1328</VOTABLE>
1329  \end{verbatim}
1330}
1331\end{landscape}
1332
1333\newpage
1334\section{Example Karma Annotation File output}
1335\label{app-karma}
1336
1337This is the format of the Karma Annotation file, showing the locations
1338of the detected objects. This can be loaded by the plotting tools of
1339the Karma package (for instance, {\tt kvis}) as an overlay on the FITS
1340file.
1341
1342\begin{verbatim}
1343# Duchamp Source Finder results for
1344#  cube /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits
1345COLOR RED
1346COORD W
1347CIRCLE 92.3376 -21.9475 0.403992
1348TEXT 92.3376 -21.9475 1
1349CIRCLE 91.9676 -26.0193 0.37034
1350TEXT 91.9676 -26.0193 2
1351CIRCLE 91.5621 -27.3459 0.437109
1352TEXT 91.5621 -27.3459 3
1353CIRCLE 92.8285 -21.6344 0.269914
1354TEXT 92.8285 -21.6344 4
1355CIRCLE 90.1381 -28.9838 0.234179
1356TEXT 90.1381 -28.9838 5
1357CIRCLE 89.72 -26.6513 0.132743
1358TEXT 89.72 -26.6513 6
1359CIRCLE 94.2743 -27.4003 0.195175
1360TEXT 94.2743 -27.4003 7
1361CIRCLE 92.2739 -21.6941 0.134538
1362TEXT 92.2739 -21.6941 8
1363CIRCLE 89.7133 -25.4259 0.232252
1364TEXT 89.7133 -25.4259 9
1365CIRCLE 90.2206 -21.6993 0.266247
1366TEXT 90.2206 -21.6993 10
1367CIRCLE 93.8581 -26.5766 0.163153
1368TEXT 93.8581 -26.5766 11
1369CIRCLE 91.176 -26.1064 0.234356
1370TEXT 91.176 -26.1064 12
1371CIRCLE 90.2844 -23.6716 0.299509
1372TEXT 90.2844 -23.6716 13
1373CIRCLE 93.8774 -22.581 0.130925
1374TEXT 93.8774 -22.581 14
1375CIRCLE 94.3882 -23.0934 0.137108
1376TEXT 94.3882 -23.0934 15
1377CIRCLE 93.0491 -21.8223 0.202928
1378TEXT 93.0491 -21.8223 16
1379CIRCLE 94.0685 -21.5603 0.168456
1380TEXT 94.0685 -21.5603 17
1381CIRCLE 86.0568 -27.6095 0.101113
1382TEXT 86.0568 -27.6095 18
1383CIRCLE 88.7932 -29.9453 0.202624
1384TEXT 88.7932 -29.9453 19
1385\end{verbatim}
1386
1387\newpage
1388\section{Installing Duchamp (README file)}
1389\begin{verbatim}
1390There is an executable (Duchamp) that has been compiled on a Debian
1391Linux kernel 2.6.8-2-686, with gcc version 3.3.5 (Debian 1:3.3.5-13)
1392
1393If that is no good to you, you can compile it yourself using the
1394Makefile included in this directory (sorry for not having a configure
1395script or similar yet!).
1396
1397Duchamp uses three main external libraries: pgplot, cfitsio and
1398wcslib. You will need to set the paths for the base directory and
1399three libraries, as they are currently configured for my use and will
1400not be of much use to you! These are:
1401
1402BASE --> the current directory
1403PGDIR --> where the pgplot libraries (and header files) are located
1404CFITSIODIR --> where the header file fitsio.h is
1405CFITSIOLDIR --> where the cfitsio library is located (libcfitsio.a)
1406WCSDIR --> where the wcslib header files are
1407WCSLDIR --> where the wcslib library is located (libwcs.a)
1408
1409If you do not have the libraries, they can be downloaded from the
1410following locations:
1411PGPlot -- http://www.astro.caltech.edu/~tjp/pgplot/
1412cfitsio -- http://heasarc.gsfc.nasa.gov/docs/software/fitsio/fitsio.html
1413wcslib -- http://www.atnf.csiro.au/people/Mark.Calabretta/WCS/index.html
1414
1415Once you've set up the Makefile correctly, then simply typing
1416> make duchamp
1417will compile the program.
1418
1419To run it, you need to use the syntax
1420> Duchamp -p parameterFile
1421where parameterFile is a file with the input parameters, including the
1422name of the cube you want to search.
1423
1424There are two example input files included with the distribution. The
1425smaller one, InputExample, shows the typical parameters one might want
1426to set. The large one, InputComplete, lists all parameters that can be
1427entered, and a brief description of them. Refer to the documentation
1428for further details.
1429
1430To get going quickly, just replace the "your-file-here" in
1431InputExample with your image name, and type
1432> Duchamp -p InputExample
1433and you're off!
1434\end{verbatim}
1435
1436\section{Robust statistics for a Normal distribution}
1437\label{app-madfm}
1438
1439The Normal, or Gaussian, distribution for mean $\mu$ and standard
1440deviation $\sigma$ can be written as
1441\[
1442f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\ e^{-(x-\mu)^2/2\sigma^2}.
1443 \]
1444
1445When one has a purely Gaussian signal, it is straightforward to
1446estimate $\sigma$ by calculating the standard deviation (or rms) of
1447the data. However, if there is a small amount of signal present on top
1448of Gaussian noise, and one wants to estimate the $\sigma$ for the
1449noise, the presence of the large values from the signal can bias the
1450estimator to higher values.
1451
1452An alternative way is to use the median ($m$) and median absolute deviation
1453from the median ($s$) to estimate $\mu$ and $\sigma$. The median is the
1454middle of the distribution, defined for a continuous distribution by
1455\[
1456\int_{-\infty}^{m} f(x) \diff x = \int_{m}^{\infty} f(x) \diff x.
1457\]
1458From symmetry, we quickly see that for the continuous Normal
1459distribution, $m=\mu$. We consider the case henceforth of $\mu=0$,
1460without loss of generality.
1461
1462To find $s$, we find the distribution of the absolute deviation from
1463the median, and then find the median of that distribution. This
1464distribution is given by
1465\begin{eqnarray*}
1466g(x) &= &{\mbox{\rm distribution of }} |x|\\
1467     &= &f(x) + f(-x),\ x\ge0\\
1468     &= &\sqrt{\frac{2}{\pi\sigma^2}}\, e^{-x^2/2\sigma^2},\ x\ge0.
1469\end{eqnarray*}
1470So, the median absolute deviation from the median, $s$, is given by
1471\[
1472\int_{0}^{s} g(x) \diff x = \int_{s}^{\infty} g(x) \diff x.
1473\]
1474Now, $\int_{0}^{\infty}e^{-x^2/2\sigma^2} \diff x = \sqrt{\pi\sigma^2/2}$, and
1475so $\int_{s}^{\infty} e^{-x^2/2\sigma^2} \diff x =
1476\sqrt{\pi\sigma^2/2} - \int_{0}^{s} e^{-\frac{x^2}{2\sigma^2}} \diff x
1477$. Hence, to find $s$ we simply solve the following equation (setting $\sigma=1$ for
1478simplicity -- equivalent to stating $x$ and $s$ in units of $\sigma$):
1479\[
1480\int_{0}^{s}e^{-x^2/2} \diff x - \sqrt{\pi/8} = 0.
1481\]
1482This is hard to solve analytically (no nice analytic solution exists
1483for the finite integral that I'm aware of), but straightforward to
1484solve numerically, yielding the value of $s=0.6744888$. Thus, to
1485estimate $\sigma$ for a Normally distributed data set, one can calculate
1486$s$, then divide by 0.6744888 (or multiply by 1.4826042) to obtain the
1487correct estimator.
1488
1489Note that this is different to solutions quoted elsewhere,
1490specifically in \citet{meyer04:trunc}, where the same robust estimator
1491is used but with an incorrect conversion to standard deviation -- they
1492assume $\sigma = s\sqrt{\pi/2}$. This, in fact, is the conversion used
1493to convert the {\it mean} absolute deviation from the mean to the
1494standard deviation. This means that the cube noise in the \hipass\
1495catalogue (their parameter Rms$_{\rm cube}$) should be 18\% larger
1496than quoted.
1497
1498\section{How Gaussian noise changes with wavelet scale.}
1499\label{app-scaling}
1500
1501The key element in the wavelet reconstruction of an array is the
1502thresholding of the individual wavelet coefficient arrays. This is
1503usually done by choosing a level to be some number of standard
1504deviations above the mean value.
1505
1506However, since the wavelet arrays are produced by convolving the input
1507array by an increasingly large filter, the pixels in the coefficient
1508arrays become increasingly correlated as the scale of the filter
1509increases. This results in the measured standard deviation from a
1510given coefficient array decreasing with increasing scale. To calculate
1511this, we need to take into account how many other pixels each pixel in
1512the convolved array depends on.
1513
1514To demonstrate, suppose we have a 1-D array with $N$ pixel values
1515given by $F_i,\ i=1,...,N$, and we convolve it with the B$_3$-spline
1516filter, defined by the set of coefficients
1517$\{1/16,1/4,3/8,1/4,1/16\}$. The flux of the $i$th pixel in the
1518convolved array will be
1519\[
1520F'_i = \frac{1}{16}F_{i-2} + \frac{1}{16}F_{i-2} + \frac{3}{8}F_{i}
1521+ \frac{1}{4}F_{i-1} + \frac{1}{16}F_{i+2}
1522\]
1523and the flux of the corresponding pixel in the wavelet array will be
1524\[
1525W'_i = F_i - F'_i = \frac{1}{16}F_{i-2} + \frac{1}{16}F_{i-2} + \frac{5}{8}F_{i}
1526+ \frac{1}{4}F_{i-1} + \frac{1}{16}F_{i+2}
1527\]
1528Now, assuming each pixel has the same standard deviation
1529$\sigma_i=\sigma$, we can work out the standard deviation for the
1530coefficient array:
1531\[
1532\sigma'_i = \sigma \sqrt{\left(\frac{1}{16}\right)^2 + \left(\frac{1}{4}\right)^2
1533  + \left(\frac{5}{8}\right)^2 + \left(\frac{1}{4}\right)^2 + \left(\frac{1}{16}\right)^2}
1534          = 0.72349\ \sigma
1535\]
1536Thus, the first scale wavelet coefficient array will have a standard
1537deviation of 72.3\% of the input array. This procedure can be followed
1538to calculate the necessary values for all scales, dimensions and
1539filters used by Duchamp.
1540
1541Calculating these values is, therefore, a critical step in performing
1542the reconstruction. \citet{starck02:book} did so by simulating data sets
1543with Gaussian noise, taking the wavelet transform, and measuring the
1544value of $\sigma$ for each scale. We take a different approach, by
1545calculating the scaling factors directly from the filter coefficients
1546by taking the wavelet transform of an array made up of a 1 in the
1547central pixel and 0s everywhere else. The scaling value is then
1548derived by adding in quadrature all the wavelet coefficient values at
1549each scale. We give the scaling factors for the three filters
1550available to Duchamp on the following page. These values are
1551hard-coded into Duchamp, so no on-the-fly calculation of them is
1552necessary.
1553
1554Memory limitations prevent us from calculating factors for large
1555scales, particularly for the three-dimensional case (hence the --
1556symbols in the tables). To calculate factors for
1557higher scales than those available, we note the following
1558relationships apply for large scales to a sufficient level of precision:
1559\begin{itemize}
1560\item 1-D: factor(scale $i$) = factor(scale $i-1$)$/\sqrt{2}$.
1561\item 2-D: factor(scale $i$) = factor(scale $i-1$)$/2$.
1562\item 1-D: factor(scale $i$) = factor(scale $i-1$)$/\sqrt{8}$.
1563\end{itemize}
1564
1565\newpage
1566\begin{itemize}
1567\item {\bf B$_3$-Spline Function:} $\{1/16,1/4,3/8,1/4,1/16\}$
1568
1569\begin{tabular}{llll}
1570Scale & 1 dimension      & 2 dimension     & 3 dimension\\ \hline
15711     & 0.723489806      & 0.890796310     & 0.956543592\\
15722     & 0.285450405      & 0.200663851     & 0.120336499\\
15733     & 0.177947535      & 0.0855075048    & 0.0349500154\\
15744     & 0.122223156      & 0.0412474444    & 0.0118164242\\
15755     & 0.0858113122     & 0.0204249666    & 0.00413233507\\
15766     & 0.0605703043     & 0.0101897592    & 0.00145703714\\
15777     & 0.0428107206     & 0.00509204670   & 0.000514791120\\
15788     & 0.0302684024     & 0.00254566946   & --\\
15799     & 0.0214024008     & 0.00127279050   & --\\
158010    & 0.0151336781     & 0.000636389722  & --\\
158111    & 0.0107011079     & 0.000318194170  & --\\
158212    & 0.00756682272    & --              & --\\
158313    & 0.00535055108    & --              & --\\
1584%14    & 0.00378341085   & --              & --\\
1585%15    & 0.00267527545   & --              & --\\
1586%16    & 0.00189170541   & --              & --\\
1587%17    & 0.00133763772   & --              & --\\
1588%18    & 0.000945852704   & --             & --
1589\end{tabular}
1590
1591\item {\bf Triangle Function:} $\{1/4,1/2,1/4\}$
1592
1593\begin{tabular}{llll}
1594Scale & 1 dimension      & 2 dimension     & 3 dimension\\ \hline
15951     & 0.612372436      & 0.800390530     & 0.895954449  \\
15962     & 0.330718914      & 0.272878894     & 0.192033014\\
15973     & 0.211947812      & 0.119779282     & 0.0576484078\\
15984     & 0.145740298      & 0.0577664785    & 0.0194912393\\
15995     & 0.102310944      & 0.0286163283    & 0.00681278387\\
16006     & 0.0722128185     & 0.0142747506    & 0.00240175885\\
16017     & 0.0510388224     & 0.00713319703   & 0.000848538128 \\
16028     & 0.0360857673     & 0.00356607618   & 0.000299949455 \\
16039     & 0.0255157615     & 0.00178297280   & -- \\
160410    & 0.0180422389     & 0.000891478237  & --  \\
160511    & 0.0127577667     & 0.000445738098  & --  \\
160612    & 0.00902109930    & 0.000222868922  & --  \\
160713    & 0.00637887978    & --              & -- \\
1608%14   & 0.00451054902    & --              & -- \\
1609%15   & 0.00318942978    & --              & -- \\
1610%16   & 0.00225527449    & --              & -- \\
1611%17   & 0.00159471988    & --              & -- \\
1612%18   & 0.000112763724   & --              & --
1613
1614\end{tabular}
1615
1616\item {\bf Haar Wavelet:} $\{0,1/2,1/2\}$
1617
1618\begin{tabular}{llll}
1619Scale & 1 dimension      & 2 dimension     & 3 dimension\\ \hline
16201     & 0.707167810      & 0.433012702     & 0.935414347 \\
16212     & 0.500000000      & 0.216506351     & 0.330718914\\
16223     & 0.353553391      & 0.108253175     & 0.116926793\\
16234     & 0.250000000      & 0.0541265877    & 0.0413398642\\
16245     & 0.176776695      & 0.0270632939    & 0.0146158492\\
16256     & 0.125000000      & 0.0135316469    & 0.00516748303
1626
1627\end{tabular}
1628
1629
1630\end{itemize}
1631
1632\end{document}
1633
1634
1635
1636
1637
1638
1639
1640
1641
1642
1643
1644
1645
1646
1647
1648
1649
1650
1651
1652
1653
1654
1655
1656
1657
1658
1659
1660
1661
1662
1663
1664
1665
Note: See TracBrowser for help on using the repository browser.