[18] | 1 | \documentclass[12pt,a4paper]{article} |
---|
| 2 | |
---|
| 3 | %Define a test for doing PDF format -- use different code below |
---|
| 4 | \newif\ifPDF |
---|
| 5 | \ifx\pdfoutput\undefined\PDFfalse |
---|
| 6 | \else\ifnum\pdfoutput > 0\PDFtrue |
---|
| 7 | \else\PDFfalse |
---|
| 8 | \fi |
---|
| 9 | \fi |
---|
| 10 | |
---|
[3] | 11 | \textwidth=161 mm |
---|
[71] | 12 | \textheight=245 mm |
---|
| 13 | \topmargin=-15 mm |
---|
[3] | 14 | \oddsidemargin=0 mm |
---|
| 15 | \parindent=6 mm |
---|
| 16 | |
---|
[18] | 17 | \usepackage[sort]{natbib} |
---|
| 18 | \usepackage{lscape} |
---|
| 19 | \bibpunct[,]{(}{)}{;}{a}{}{,} |
---|
| 20 | |
---|
[3] | 21 | \newcommand{\eg}{e.g.\ } |
---|
| 22 | \newcommand{\ie}{i.e.\ } |
---|
| 23 | \newcommand{\hi}{H{\sc i}} |
---|
| 24 | \newcommand{\hipass}{{\sc hipass}} |
---|
| 25 | \newcommand{\progname}{{\tt Duchamp}} |
---|
| 26 | \newcommand{\diff}{{\rm d}} |
---|
| 27 | \newcommand{\entrylabel}[1]{\mbox{\textsf{\bf{#1:}}}\hfil} |
---|
| 28 | \newenvironment{entry} |
---|
| 29 | {\begin{list}{}% |
---|
| 30 | {\renewcommand{\makelabel}{\entrylabel}% |
---|
| 31 | \setlength{\labelwidth}{30mm}% |
---|
| 32 | \setlength{\labelsep}{5pt}% |
---|
| 33 | \setlength{\itemsep}{2pt}% |
---|
| 34 | \setlength{\parsep}{2pt}% |
---|
| 35 | \setlength{\leftmargin}{35mm}% |
---|
| 36 | }% |
---|
| 37 | }% |
---|
| 38 | {\end{list}} |
---|
| 39 | |
---|
[18] | 40 | |
---|
[3] | 41 | \title{A Guide to the {\it Duchamp} Source Finding Software} |
---|
[18] | 42 | \author{Matthew Whiting\\ |
---|
| 43 | %{\small \href{mailto:Matthew.Whiting@csiro.au}{Matthew.Whiting@csiro.au}}\\ |
---|
[15] | 44 | Australia Telescope National Facility\\CSIRO} |
---|
[3] | 45 | %\date{January 2006} |
---|
| 46 | \date{} |
---|
| 47 | |
---|
[18] | 48 | % If we are creating a PDF, use different options for graphicx, hyperref. |
---|
| 49 | \ifPDF |
---|
| 50 | \usepackage[pdftex]{graphicx,color} |
---|
| 51 | \usepackage[pdftex]{hyperref} |
---|
| 52 | \hypersetup{colorlinks=true,% |
---|
[80] | 53 | citecolor=red,% |
---|
| 54 | filecolor=red,% |
---|
| 55 | linkcolor=red,% |
---|
| 56 | urlcolor=red,% |
---|
| 57 | } |
---|
[18] | 58 | \else |
---|
| 59 | \usepackage[dvips]{graphicx} |
---|
| 60 | \usepackage[dvips]{hyperref} |
---|
| 61 | \fi |
---|
| 62 | |
---|
[80] | 63 | \pagestyle{headings} |
---|
[3] | 64 | \begin{document} |
---|
| 65 | |
---|
| 66 | \maketitle |
---|
[80] | 67 | \thispagestyle{empty} |
---|
| 68 | \begin{figure}[!h] |
---|
| 69 | \begin{center} |
---|
| 70 | \includegraphics[width=\textwidth]{cover_image} |
---|
| 71 | \end{center} |
---|
| 72 | \end{figure} |
---|
| 73 | |
---|
| 74 | \newpage |
---|
[3] | 75 | \tableofcontents |
---|
| 76 | |
---|
| 77 | \newpage |
---|
| 78 | \section{Introduction and getting going quickly} |
---|
| 79 | |
---|
| 80 | This document gives details on the use of the program Duchamp. This |
---|
[80] | 81 | has been designed to provide a source-detection facility for |
---|
[3] | 82 | spectral-line data cubes. The basic execution of Duchamp is to read |
---|
| 83 | in a FITS data cube, find sources in the cube, and produce a text |
---|
| 84 | file of positions, velocities and fluxes of the detections, as well as |
---|
| 85 | a postscript file of the spectra of each detection. |
---|
| 86 | |
---|
| 87 | So, you have a FITS cube, and you want to find the sources in it. What |
---|
| 88 | do you do? The first step is to make an input file that contains the |
---|
| 89 | list of parameters. Brief and detailed examples are shown in |
---|
| 90 | Appendix~\ref{app-input}. This provides the input file name, the various |
---|
| 91 | output files, and defines various parameters that control the |
---|
| 92 | execution. |
---|
| 93 | |
---|
[85] | 94 | The standard way to run Duchamp is by the command |
---|
[3] | 95 | \begin{quote} |
---|
| 96 | {\tt Duchamp -p [parameter file]} |
---|
| 97 | \end{quote} |
---|
| 98 | replacing {\tt [parameter file]} with the name of the file you have |
---|
[85] | 99 | just created/copied. Alternatively, you can use the syntax |
---|
| 100 | \begin{quote} |
---|
| 101 | {\tt Duchamp -f [FITS file]} |
---|
| 102 | \end{quote} |
---|
| 103 | where {\tt [FITS file]} is the file you wish to search. In the latter |
---|
| 104 | case, the rest of the parameters will take their default values |
---|
| 105 | detailed in Appendix~\ref{app-param}. In either case, the program will |
---|
| 106 | then work away and give you the list of detections and their |
---|
[87] | 107 | spectra. The program execution is summarised below, and detailed in |
---|
[85] | 108 | \S\ref{sec-flow}. Information on inputs is in \S\ref{sec-param} and |
---|
| 109 | Appendix~\ref{app-param}, and descriptions of the output is in |
---|
| 110 | \S\ref{sec-output}. |
---|
[3] | 111 | |
---|
| 112 | \subsection{A summary of the execution steps} |
---|
| 113 | |
---|
| 114 | The basic flow of the program is summarised here. All these steps are |
---|
| 115 | discussed in more detail in the following sections, so read on if |
---|
[18] | 116 | you have questions! |
---|
[3] | 117 | \begin{enumerate} |
---|
| 118 | \item The parameter file given on the command line is read in, and the |
---|
| 119 | parameters absorbed. |
---|
| 120 | \item From the parameter file, the FITS image is located and read in |
---|
| 121 | to memory. |
---|
[80] | 122 | \item If requested, a FITS image with a previously reconstructed array |
---|
| 123 | is read in. |
---|
[3] | 124 | \item If requested, blank pixels are trimmed from the edges, and |
---|
| 125 | channels corresponding to bright (\eg Galactic) emission are |
---|
| 126 | excised. |
---|
| 127 | \item If requested, the baseline of each spectrum is removed. |
---|
[80] | 128 | \item If the reconstruction method is requested, and the reconstructed |
---|
| 129 | array has not been read in at Step 3 above, the cube is |
---|
[3] | 130 | reconstructed using the {\it {\' a} trous} wavelet method. |
---|
| 131 | \item Searching for objects then takes place, using the requested |
---|
| 132 | thresholding method. |
---|
| 133 | \item The list of objects is trimmed by merging neighbouring objects |
---|
| 134 | and removing those deemed unacceptable. |
---|
| 135 | \item The baselines and trimmed pixels are replaced prior to output. |
---|
| 136 | \item The details on the detections are written to screen and to the |
---|
| 137 | requested output file. |
---|
| 138 | \item Maps showing the spatial location of the detections are written. |
---|
| 139 | \item The integrated spectra of each detection are written to a |
---|
| 140 | postscript file. |
---|
| 141 | \item If requested, the reconstructed array can be written to a new |
---|
| 142 | FITS file. |
---|
| 143 | \end{enumerate} |
---|
| 144 | |
---|
| 145 | \subsection{Guide to terminology} |
---|
| 146 | |
---|
[18] | 147 | First, a brief note on the use of terminology in this guide. Duchamp |
---|
| 148 | is designed to work on FITS ``cubes''. These are FITS\footnote{FITS is |
---|
| 149 | the Flexible Image Transport System -- see \citet{hanisch01} or |
---|
| 150 | websites such as |
---|
| 151 | \href{http://fits.cv.nrao.edu/FITS.html}{http://fits.cv.nrao.edu/FITS.html} |
---|
| 152 | for details.} image arrays with three dimensions -- they are assumed |
---|
| 153 | to have the following form: the first two dimensions (referred to as |
---|
| 154 | $x$ and $y$) are spatial directions (that is, relating to the position |
---|
| 155 | on the sky), while the third dimension, $z$, is the spectral |
---|
| 156 | direction, which can correspond to frequency, wavelength, or velocity. |
---|
[3] | 157 | |
---|
| 158 | Each spatial pixel (a given $(x,y)$ coordinate) can be said to be a |
---|
| 159 | single spectrum, while a slice through the cube perpendicular to the |
---|
| 160 | spectral direction at a given $z$-value is a single channel (the 2-D |
---|
| 161 | image is a channel map). |
---|
| 162 | |
---|
[60] | 163 | Features that are detected are assumed to be positive. The user can |
---|
| 164 | choose to search for negative features by setting an input parameter |
---|
| 165 | -- this inverts the cube prior to the search (see |
---|
| 166 | \S~\ref{sec-detection} for details). |
---|
[24] | 167 | |
---|
[18] | 168 | Note that it is possible to run Duchamp on a two-dimensional image |
---|
| 169 | (\ie one with no frequency or velocity information), or indeed a |
---|
| 170 | one-dimensional array, and many of the features of the program will |
---|
| 171 | work fine. The focus, however, is on object detection in three |
---|
| 172 | dimensions. |
---|
| 173 | |
---|
[15] | 174 | \subsection{Why ``Duchamp''?} |
---|
| 175 | |
---|
| 176 | Well, it's important for a program to have a name, and it certainly |
---|
[80] | 177 | beats the initial working title of ``cubefind''. I had planned to call |
---|
| 178 | it ``Picasso'' (as in the father of cubism), but sadly this had |
---|
| 179 | already been used before \citep{minchin99}. So I settled on naming it |
---|
| 180 | after Marcel Duchamp, another cubist, but also one of the first |
---|
| 181 | artists to work with ``found objects''. |
---|
[15] | 182 | |
---|
[3] | 183 | \section{User Inputs} |
---|
| 184 | \label{sec-param} |
---|
| 185 | |
---|
| 186 | Input to the program is provided by means of a parameter file. Parameters |
---|
| 187 | are listed in the file, followed by the value that should be assigned |
---|
| 188 | to them. The syntax used is {\tt paramName value}. The file is not |
---|
| 189 | case-sensitive, and lines in the input file that start with {\tt \#} are |
---|
| 190 | ignored. If a parameter is listed more than once, the latter value is |
---|
| 191 | used, but otherwise the order in which the parameters are listed in the |
---|
| 192 | input file is arbitrary. |
---|
| 193 | |
---|
| 194 | If a parameter is not listed, the default value is assumed. The |
---|
| 195 | defaults are chosen to provide a good result (using the reconstruction |
---|
[24] | 196 | method), so the user doesn't need to specify many new parameters in |
---|
| 197 | the input file. Note that the image file {\bf must} be specified! The |
---|
| 198 | parameters that can be set are listed in Appendix~\ref{app-param}, |
---|
| 199 | with their default values in parentheses. |
---|
[3] | 200 | |
---|
| 201 | The 'flag' parameters are stored as {\tt bool} variables, and so are |
---|
[18] | 202 | either {\tt true = 1} or {\tt false = 0}. Currently the program only |
---|
[3] | 203 | reads them from the file as integers, and so they should be entered in |
---|
| 204 | the file as 0 or 1 (see example file in Appendix~\ref{app-input}). |
---|
| 205 | |
---|
| 206 | \section{What the program is doing} |
---|
| 207 | \label{sec-flow} |
---|
| 208 | |
---|
| 209 | The execution flow of the program is detailed here, indicating the |
---|
| 210 | main algorithmic steps that are used. The program is written in C/C++ |
---|
| 211 | and makes use of the {\sc cfitsio}, {\sc wcslib} and {\sc pgplot} |
---|
| 212 | libraries. |
---|
| 213 | |
---|
| 214 | %\subsection{Parameter input} |
---|
| 215 | % |
---|
| 216 | %The user provides parameters that govern the selection of files and |
---|
| 217 | %the parameters used by the various subroutines in the program. This is |
---|
| 218 | %done via a parameter file, and the parameters are stored in a C++ |
---|
| 219 | %class for use throughout the program. The form of the parameter file is |
---|
| 220 | %discussed in \S\ref{sec-param}, and the parameters themselves are |
---|
| 221 | %listed in Appendix~\ref{app-param}. |
---|
| 222 | |
---|
| 223 | \subsection{Image input} |
---|
| 224 | |
---|
| 225 | The cube is read in using basic {\sc cfitsio} commands, and stored as |
---|
| 226 | an array in a special C++ class structure. This class keeps track of |
---|
| 227 | the list of detected objects, as well as any reconstructed arrays that |
---|
[18] | 228 | are made (see \S\ref{sec-recon}). The World Coordinate System (WCS) |
---|
| 229 | information for the cube is also obtained from the FITS header by {\sc |
---|
| 230 | wcslib} functions \citep{greisen02, calabretta02}, and this |
---|
| 231 | information, in the form of a {\tt wcsprm} structure, is also stored |
---|
| 232 | in the same class. |
---|
[3] | 233 | |
---|
[80] | 234 | A sub-section of an image can be requested via the {\tt subsection} |
---|
| 235 | parameter in the parameter file -- this can be a good idea if the cube |
---|
| 236 | has very noisy edges, which may produce many spurious detections. The |
---|
| 237 | generalised form of the subsection that is used by {\sc cfitsio} is |
---|
| 238 | {\tt [x1:x2:dx,y1:y2:dy,z1:z2:dz]}, such that the x-coordinates run |
---|
| 239 | from {\tt x1} to {\tt x2} (inclusive), with steps of {\tt dx}. The |
---|
| 240 | step value can be omitted (so a subsection of the form {\tt |
---|
| 241 | [2:50,2:50,10:1000]} is still valid). Duchamp does not at this stage |
---|
| 242 | deal with the presence of steps in the subsection string, and any that |
---|
| 243 | are present are removed before the file is opened. |
---|
[3] | 244 | |
---|
[80] | 245 | If one wants the full range of a coordinate then replace the range |
---|
| 246 | with an asterisk, \eg {\tt [2:50,2:50,*]}. If one wants to use just a |
---|
[3] | 247 | subsection, one must set {\tt flagSubsection = 1}. A complete |
---|
| 248 | description of the section syntax can be found at the {\sc fitsio} web |
---|
| 249 | site |
---|
[15] | 250 | \footnote{ |
---|
| 251 | \href{http://heasarc.gsfc.nasa.gov/docs/software/fitsio/c/c\_user/node90.html}% |
---|
| 252 | {http://heasarc.gsfc.nasa.gov/docs/software/fitsio/c/c\_user/node90.html}}. |
---|
[3] | 253 | |
---|
| 254 | \subsection{Image modification} |
---|
| 255 | \label{sec-modify} |
---|
| 256 | |
---|
| 257 | Several modifications to the cube can be made that improve the |
---|
| 258 | execution and efficiency of Duchamp (these are optional -- their |
---|
| 259 | use is indicated by the relevant flags set in the input parameter |
---|
| 260 | file). |
---|
| 261 | |
---|
| 262 | \subsubsection{Milky-Way removal} |
---|
| 263 | |
---|
| 264 | First, a single set of contiguous channels can be removed -- these may |
---|
| 265 | exhibit very strong emission, such as that from the Milky Way as seen |
---|
| 266 | in extragalactic \hi\ cubes (hence the references to ``Milky Way'' in |
---|
| 267 | relation to this task -- apologies to Galactic astronomers!). Such |
---|
| 268 | dominant channels will both produce many unnecessary, uninteresting |
---|
[24] | 269 | and large (in size and hence in memory usage) detections, and will |
---|
| 270 | also affect any reconstruction that is performed (see next |
---|
| 271 | section). The use of this feature is controlled by the {\tt flagMW} |
---|
| 272 | parameter, and the exact channels concerned are able to be set by the |
---|
| 273 | user (using {\tt maxMW} and {\tt minMW}). When employed, the flux in |
---|
| 274 | these channels is set to zero. The information in those channels is |
---|
| 275 | not kept. |
---|
[3] | 276 | |
---|
| 277 | \subsubsection{Blank pixel removal} |
---|
| 278 | |
---|
[24] | 279 | Second, the cube is trimmed of any BLANK pixels that pad the image |
---|
[3] | 280 | out to a rectangular shape. This is also optional, being determined by |
---|
| 281 | the {\tt flagBlankPix} parameter. The value for these pixels is read from |
---|
| 282 | the FITS header (using the BLANK, BSCALE and BZERO keywords), but if |
---|
| 283 | these are not present then the value can be specified by the user in |
---|
| 284 | the parameter file. If these blank pixels are stored as NaNs, then a |
---|
| 285 | normal number will be substituted (allowing these pixels to be |
---|
[24] | 286 | accurately removed without adverse effects). [NOTE: this appears not |
---|
| 287 | to be working correctly at time of writing. If your data has |
---|
| 288 | unspecified BLANKs, be wary...] |
---|
[3] | 289 | |
---|
| 290 | This stage is particularly important for the reconstruction step, as |
---|
| 291 | lots of BLANK pixels on the edges will smooth out features in the |
---|
| 292 | wavelet calculation stage. The trimming will also reduce the size of |
---|
| 293 | the cube's array, speeding up the execution. The amount of trimming is |
---|
| 294 | recorded, and these pixels are added back in once the source-detection |
---|
| 295 | is completed (so that quoted pixel positions are applicable to the |
---|
| 296 | original cube). |
---|
| 297 | |
---|
| 298 | Rows and columns are trimmed one at a time until the first non-BLANK |
---|
| 299 | pixel is reached, so that the image remains rectangular. In practice, |
---|
| 300 | this means that there will be BLANK pixels left in the trimmed image |
---|
| 301 | (if the non-BLANK region is non-rectangular). However, these are |
---|
| 302 | ignored in all further calculations done on the cube. |
---|
| 303 | |
---|
| 304 | \subsubsection{Baseline removal} |
---|
| 305 | |
---|
| 306 | Finally, the user may request the removal of baselines from the |
---|
| 307 | spectra, via the parameter {\tt flagBaseline}. This may be necessary |
---|
| 308 | if there is a strong baseline ripple present, which can result in |
---|
| 309 | spurious detections on the high points of the ripple. The baseline is |
---|
| 310 | calculated from a wavelet reconstruction procedure (see |
---|
| 311 | \S\ref{sec-recon}) that keeps only the two largest scales. This is |
---|
| 312 | done separately for each spatial pixel (\ie for each spectrum in the |
---|
| 313 | cube), and the baselines are stored and added back in before any |
---|
| 314 | output is done. In this way the quoted fluxes and displayed spectra |
---|
[24] | 315 | are as one would see from the input cube itself -- even though the |
---|
| 316 | detection (and reconstruction if applicable) is done on the |
---|
| 317 | baseline-removed cube. |
---|
[3] | 318 | |
---|
[43] | 319 | The presence of very strong signals (for instance, masers at several |
---|
| 320 | hundred Jy) can affect the determination of the baseline, leading to a |
---|
| 321 | large dip centred on the signal in the baseline-subtracted |
---|
| 322 | spectrum. To prevent this, the signal is trimmed prior to the |
---|
[80] | 323 | reconstruction process at some standard threshold (at $8\sigma$ above |
---|
| 324 | the mean). The baseline determined should thus be representative of |
---|
| 325 | the true, signal-free baseline. Note that this trimming is only a |
---|
[43] | 326 | temporary measure which does not affect the source-detection. |
---|
| 327 | |
---|
[3] | 328 | \subsection{Image reconstruction} |
---|
| 329 | \label{sec-recon} |
---|
| 330 | |
---|
| 331 | This is an optional step. The user can direct Duchamp to |
---|
| 332 | reconstruct the data cube using the {\it {\`a} trous} wavelet |
---|
| 333 | procedure. A good description of the procedure can be found in |
---|
| 334 | \citet{starck02:book}. This is an effective way of removing a |
---|
| 335 | lot of the noise in the image, but at this stage is relatively time- |
---|
[24] | 336 | and memory-intensive. The steps in the procedure are as follows: |
---|
[3] | 337 | \begin{enumerate} |
---|
| 338 | \item Set the reconstructed array to 0 everywhere. |
---|
| 339 | \item The cube is discretely convolved with a given filter |
---|
| 340 | function. This is determined from the parameter file via the {\tt |
---|
| 341 | filterCode} parameter -- see Appendix~\ref{app-param} for details on |
---|
| 342 | the filters available. |
---|
| 343 | \item The wavelet coefficients are calculated by taking the difference |
---|
| 344 | between the convolved array and the input array. |
---|
| 345 | \item If the wavelet coefficients at a given point are above the |
---|
| 346 | threshold requested (given by {\tt snrRecon} as the number of |
---|
| 347 | $\sigma$ above the mean and adjusted to the current scale), add |
---|
| 348 | these to the reconstructed array. |
---|
| 349 | \item The separation of the filter coefficients is doubled. |
---|
| 350 | \item The procedure is repeated from step 2, using the convolved array |
---|
| 351 | as the input array. |
---|
| 352 | \item Continue until the required maximum number of scales is reached. |
---|
[24] | 353 | \item Add the final smoothed (\ie convolved) array to the |
---|
| 354 | reconstructed array. This provides the ``DC offset'', as each of the |
---|
| 355 | wavelet coefficient arrays will have zero mean. |
---|
[3] | 356 | \end{enumerate} |
---|
| 357 | |
---|
[71] | 358 | Note that any BLANK pixels that are still in the cube will not be |
---|
| 359 | altered by the reconstruction -- they will be left as BLANK so that |
---|
| 360 | the shape of the valid part of the cube is preserved. |
---|
| 361 | |
---|
[24] | 362 | It is important to note that the {\it {\`a} trous} decomposition is |
---|
| 363 | an example of a ``redundant'' transformation. If no thresholding is |
---|
| 364 | performed, the sum of all the wavelet coefficient arrays and the final |
---|
| 365 | smoothed array is identical to the input array. The thresholding thus |
---|
| 366 | removes only the unwanted structure in the array. |
---|
[3] | 367 | |
---|
| 368 | The statistics of the cube are estimated using robust methods, to |
---|
| 369 | avoid corruption by strong outlying points. The mean is actually |
---|
| 370 | estimated by the median, while the median absolute deviation from the |
---|
| 371 | median (MADFM) is calculated and corrected assuming Gaussianity to |
---|
| 372 | estimate the standard deviation $\sigma$. The Gaussianity (or |
---|
| 373 | Normality) assumption is critical, as the MADFM does not give the same |
---|
| 374 | value as the usual rms or standard deviation value -- for a normal |
---|
| 375 | distribution $N(\mu,\sigma)$ we find MADFM$=0.6744888\sigma$. The |
---|
| 376 | difference between the MADFM and $\sigma$ is corrected for, so the |
---|
| 377 | user need only think in the usual multiples of $\sigma$ when setting |
---|
| 378 | {\tt snrRecon}. See Appendix~\ref{app-madfm} for a derivation of this |
---|
| 379 | value. |
---|
| 380 | |
---|
| 381 | When thresholding the different wavelet scales, the value of $\sigma$ |
---|
| 382 | as measured from the input array needs to be scaled to account for the |
---|
| 383 | increased amount of correlation between neighbouring pixels (due to |
---|
[80] | 384 | the convolution). See Appendix~\ref{app-scaling} for details on this |
---|
[24] | 385 | scaling. |
---|
[3] | 386 | |
---|
| 387 | The user can also select the minimum scale to be used in the |
---|
| 388 | reconstruction -- the first scale exhibits the highest frequency |
---|
| 389 | variations, and so ignoring this one can sometimes be beneficial in |
---|
| 390 | removing excess noise. The default, however, is to use all scales |
---|
| 391 | ({\tt minscale = 1}). |
---|
| 392 | |
---|
| 393 | The reconstruction has at least two iterations. The first iteration |
---|
| 394 | makes a first pass at the wavelet reconstruction (the process outlined |
---|
| 395 | in the 8 stages above), but the residual array will inevitably have |
---|
| 396 | some structure still in it, so the wavelet filtering is done on the |
---|
| 397 | residual, and any significant wavelet terms are added to the final |
---|
| 398 | reconstruction. This step is repeated until the change in the $\sigma$ |
---|
| 399 | of the background is less than some fiducial amount. |
---|
| 400 | |
---|
[80] | 401 | \subsection{Reconstruction I/O} |
---|
| 402 | |
---|
| 403 | The reconstruction stage can be relatively time-consuming, |
---|
| 404 | particularly for large cubes. Duchamp thus has a shortcut to allow |
---|
| 405 | users to quickly do multiple searches (\eg with different thresholds) |
---|
| 406 | on the same reconstruction. |
---|
| 407 | |
---|
| 408 | The first step is to select to save the reconstructed image as a |
---|
[3] | 409 | FITS file -- at the moment this is just saved in the same directory as |
---|
| 410 | the input file, so it won't work if the user does not have write |
---|
[71] | 411 | permissions on that directory. The name of the file will be derived |
---|
| 412 | from the input file, in the following manner: if the input file is |
---|
| 413 | {\tt image.fits}, the reconstructed array will be saved in {\tt |
---|
| 414 | image.RECON?.fits}, where {\tt ?} stands for the value of {\tt |
---|
| 415 | snrRecon} (for instance, if {\tt snrRecon}$=4$, it will be {\tt |
---|
| 416 | image.RECON4.fits}, and if {\tt snrRecon}$=4.5$, it will be {\tt |
---|
| 417 | image.RECON4.5.fits}). To save the reconstructed array, set {\tt |
---|
| 418 | flagOutputRecon = true}. |
---|
[3] | 419 | |
---|
[71] | 420 | Likewise, the residual image, defined as the difference between the |
---|
| 421 | input image and the reconstructed image, can also be saved in the same |
---|
| 422 | manner -- its filename will be {\tt image.RESID?.fits}. This is done |
---|
| 423 | by setting {\tt flagOutputResid = true}. |
---|
[3] | 424 | |
---|
[80] | 425 | If a reconstructed image has been saved, it can be read in and used |
---|
| 426 | instead of redoing the reconstruction. To do so, the user should set |
---|
| 427 | {\tt flagReconExists = true}. The user can indicate the name of the |
---|
| 428 | reconstructed FITS file using the {\tt reconFile} parameter, or, if |
---|
| 429 | this is not specified, Duchamp searches for the file {\tt |
---|
| 430 | image.RECON?.fits} (as defined above). If the file is not found, the |
---|
| 431 | reconstruction is performed as normal. Note that to do this, the user |
---|
| 432 | needs to set {\tt flagAtrous = true} (obviously, if this is {\tt |
---|
| 433 | false}, the reconstruction is not needed). |
---|
[71] | 434 | |
---|
[3] | 435 | \subsection{Searching the image} |
---|
| 436 | \label{sec-detection} |
---|
| 437 | |
---|
[24] | 438 | The image is searched for detections in two ways: spectrally (a |
---|
| 439 | 1-dimensional search in the spectrum in each spatial pixel), and |
---|
| 440 | spatially (a 2-dimensional search in the spatial image in each |
---|
| 441 | channel). In both cases, the algorithm finds connected pixels that are |
---|
| 442 | above the user-specified threshold. In the case of the spatial image |
---|
| 443 | search, the algorithm of \citet{lutz80} is used to raster scan through |
---|
| 444 | the image and connect groups of pixels on neighbouring rows. |
---|
[3] | 445 | |
---|
| 446 | Note that this algorithm cannot be applied directly to a 3-dimensional |
---|
| 447 | case, as it requires that objects are completely nested in a row: that |
---|
| 448 | is, if you are scanning along a row, and one object finishes and |
---|
| 449 | another starts, you know that you will not get back to the first one |
---|
| 450 | (if at all) until the second is finished for that |
---|
| 451 | row. Three-dimensional data does not have this property, which is why |
---|
| 452 | we break up the searching into 1- and 2-dimensional cases. |
---|
| 453 | |
---|
| 454 | The determination of the threshold is done in one of two ways. The |
---|
[80] | 455 | first way is a simple sigma-clipping, where a threshold is set at |
---|
| 456 | $n\sigma$ above the mean and pixels above this threshold are |
---|
| 457 | flagged as detected. The value of $n$ is set with the parameter {\tt |
---|
| 458 | snrCut}. As before, the value for $\sigma$ is estimated by |
---|
[3] | 459 | the MADFM, and corrected by the ratio derived in |
---|
| 460 | Appendix~\ref{app-madfm}. |
---|
| 461 | |
---|
| 462 | The second method uses the False Discovery Rate (FDR) technique |
---|
| 463 | \citep{miller01,hopkins02}, whose basis we briefly detail here. The |
---|
| 464 | false discovery rate (given by the number of false detections divided |
---|
| 465 | by the total number of detections) is fixed at a certain value |
---|
| 466 | $\alpha$ (\eg $\alpha=0.05$ implies 5\% of detections are false |
---|
| 467 | positives). In practice, an $\alpha$ value is chosen, and the ensemble |
---|
| 468 | average FDR (\ie $<FDR>$) when the method is used will be less than |
---|
| 469 | $\alpha$. One calculates $p$ -- the probability, assuming the null |
---|
| 470 | hypothesis is true, of obtaining a test statistic as extreme as the |
---|
| 471 | pixel value (the observed test statistic) -- for each pixel, and sorts |
---|
| 472 | them in increasing order. One then calculates $d$ where |
---|
| 473 | \[ |
---|
[18] | 474 | d = \max_j \left\{ j : P_j < \frac{j\alpha}{c_N N} \right\}, |
---|
[3] | 475 | \] |
---|
[18] | 476 | and then rejects all hypotheses whose $p$-values are less than or equal |
---|
[3] | 477 | to $P_d$. (So a $P_i<P_d$ will be rejected even if $P_i \geq |
---|
| 478 | j\alpha/c_N N$.) Note that ``reject hypothesis'' here means ``accept |
---|
| 479 | the pixel as an object pixel'' (\ie we are rejecting the null |
---|
| 480 | hypothesis that the pixel belongs to the background). |
---|
| 481 | |
---|
| 482 | The $c_N$ values here are normalisation constants that depend on the |
---|
| 483 | correlated nature of the pixel values. If all the pixels are |
---|
| 484 | uncorrelated, then $c_N=1$. If $N$ pixels are correlated, then their |
---|
| 485 | tests will be dependent on each other, and so $c_N = \sum_{i=1}^N |
---|
| 486 | i^{-1}$. \citet{hopkins02} consider real radio data, where the pixels |
---|
| 487 | are correlated over the beam. In this case the sum is made over the |
---|
| 488 | $N$ pixels that make up the beam. The value of $N$ is calculated from |
---|
[24] | 489 | the FITS header (if the correct keywords -- BMAJ, BMIN -- are not |
---|
| 490 | present, a default value of 10 pixels is assumed). |
---|
[3] | 491 | |
---|
| 492 | If a reconstruction has been made, the residuals (defined as original |
---|
| 493 | $-$ reconstruction) are used to estimate the noise parameters of the |
---|
| 494 | cube. Otherwise they are estimated directly from the cube itself. In |
---|
| 495 | both cases, the median is used as a robust estimator of the mean |
---|
| 496 | value, although the $\sigma$ is estimated by the standard deviation |
---|
| 497 | (of the residual array, in the case of the reconstruction, but of the |
---|
| 498 | original array otherwise). |
---|
| 499 | |
---|
[24] | 500 | Detections must have a minimum number of pixels to be counted. This |
---|
| 501 | minimum number is given by the input parameters {\tt minPix} (for |
---|
| 502 | 2-dimensional searches) and {\tt minChannels} (for 1-dimensional |
---|
[60] | 503 | searches). |
---|
[24] | 504 | |
---|
[60] | 505 | The search only looks for positive features. If one is interested |
---|
| 506 | instead in negative features (such as absorption lines), set the |
---|
| 507 | parameter {\tt flagNegative = true}. This will invert the cube (\ie |
---|
[80] | 508 | multiply all pixels by $-1$) prior to the search, and then re-invert |
---|
| 509 | the cube (and the fluxes of any detections) after searching is |
---|
| 510 | complete. All outputs are done in the same manner as normal, so that |
---|
| 511 | fluxes of detections will be negative. |
---|
[60] | 512 | |
---|
[3] | 513 | \subsection{Merging detected objects} |
---|
| 514 | \label{sec-merger} |
---|
| 515 | |
---|
| 516 | The searching step produces a list of detections that will have many |
---|
| 517 | repeated detections of a given object -- for instance, spectral |
---|
[18] | 518 | detections in adjacent pixels of the same object and/or spatial |
---|
| 519 | detections in neighbouring channels. These are then combined in an |
---|
| 520 | algorithm that matches all objects judged to be ``close''. This |
---|
| 521 | determination is made in one of two ways. |
---|
[3] | 522 | |
---|
| 523 | One way is to define two thresholds -- one spatial and one in velocity |
---|
| 524 | -- and say that two objects should be merged if there is at least one |
---|
| 525 | pair of pixels that lie within these threshold distances of each |
---|
| 526 | other. These thresholds are specified by the parameters {\tt |
---|
| 527 | threshSpatial} and {\tt threshVelocity} (in units of pixels and |
---|
| 528 | channels respectively). |
---|
| 529 | |
---|
| 530 | Alternatively, the spatial requirement can be changed to say that |
---|
| 531 | there must be a pair of pixels that are {\it adjacent} -- a stricter, |
---|
| 532 | but more realistic requirement, particularly when the spatial pixels |
---|
| 533 | have a large angular size (as is the case for \hi\ surveys). This |
---|
| 534 | method can be selected by setting the parameter |
---|
| 535 | {\tt flagAdjacent} to 1 (\ie {\tt true}) in the parameter file. The |
---|
| 536 | velocity thresholding is done in the same way as the first option. |
---|
| 537 | |
---|
[18] | 538 | Once the detections have been merged, they may be ``grown''. This is a |
---|
| 539 | process of increasing the size of the detection by adding adjacent |
---|
| 540 | pixels that are above some secondary threshold. This threshold is |
---|
| 541 | lower than the one used for the initial detection, but above the noise |
---|
| 542 | level, so that faint pixels are only detected when they are close to a |
---|
| 543 | bright pixel. The value of this threshold is a possible input |
---|
| 544 | parameter ({\tt growthCut}), with a default value of $1.5\sigma$. The |
---|
| 545 | use of the growth algorithm is controlled by the {\tt flagGrowth} |
---|
| 546 | parameter -- the default value of which is {\tt false}. If the |
---|
| 547 | detections are grown, they are sent through the merging algorithm a |
---|
[24] | 548 | second time, to pick up any detections that now overlap or have grown |
---|
[18] | 549 | over each other. |
---|
[3] | 550 | |
---|
[24] | 551 | Finally, to be accepted, the detections must span {\it both} a minimum |
---|
[18] | 552 | number of channels (to remove any spurious single-channel spikes that |
---|
[24] | 553 | may be present), and a minimum number of spatial pixels. These |
---|
| 554 | numbers, as for the original detection step, are set with the {\tt |
---|
| 555 | minChannels} and {\tt minPix} parameters. The channel requirement |
---|
| 556 | means there must be at least one set of this many consecutive channels |
---|
| 557 | in the source for it to be accepted. |
---|
[18] | 558 | |
---|
[3] | 559 | \section{Outputs} |
---|
| 560 | \label{sec-output} |
---|
| 561 | |
---|
| 562 | \subsection{During execution} |
---|
| 563 | |
---|
| 564 | Duchamp provides the user with feedback whilst it is running, to |
---|
| 565 | keep the user informed on the progress of the analysis. Most of this |
---|
| 566 | consists of self-explanatory messages about the particular stage the |
---|
| 567 | program is up to. The relevant parameters are printed to the screen at |
---|
| 568 | the start (once the file has been successfully read in), so the user |
---|
| 569 | is able to make a quick check that the setup is correct. |
---|
| 570 | |
---|
| 571 | If the cube is being trimmed (\S\ref{sec-modify}), the resulting |
---|
| 572 | dimensions are printed to indicate how much has been trimmed. If a |
---|
| 573 | reconstruction is being done, a continually updating message shows the |
---|
| 574 | current iteration and scale (compared to the maximum scale). |
---|
| 575 | |
---|
| 576 | During the searching algorithms, the progress through the 1D and 2D |
---|
| 577 | searches are shown. When the searches have completed, |
---|
| 578 | the number of objects found in both the 1D and 2D searches are |
---|
| 579 | reported (see \S\ref{sec-detection} for details). |
---|
| 580 | |
---|
| 581 | In the merging process (where multiple detections of the same object |
---|
| 582 | are combined -- see \S\ref{sec-merger}), two stages of output |
---|
| 583 | occur. The first is when each object in the list is compared with all |
---|
| 584 | others. The output shows two numbers: the first being how far through |
---|
| 585 | the list we are, and the second being the length of the list. As the |
---|
| 586 | algorithm proceeds, the first number should increase and the second |
---|
[80] | 587 | should decrease (as objects are combined). When the numbers meet (\ie |
---|
| 588 | the whole list has been compared), the second phase begins, in which |
---|
| 589 | multiply-appearing pixels in each object are removed, as are objects |
---|
| 590 | not meeting the minimum channels requirement. During this phase, the |
---|
| 591 | total number of accepted objects is shown, which should steadily |
---|
| 592 | increase until all have been accepted or rejected. Note that these |
---|
| 593 | steps can be very quick for small numbers of detections. |
---|
[3] | 594 | |
---|
| 595 | Since this continual printing to screen has some overhead of time and |
---|
| 596 | CPU involved, the user can elect to not print this information by |
---|
| 597 | setting the parameter {\tt verbose = 0}. In this case, the user is |
---|
| 598 | still informed as to the steps being undertaken, but the details of |
---|
| 599 | the progress are not shown. |
---|
| 600 | |
---|
| 601 | \subsection{Results} |
---|
| 602 | |
---|
[15] | 603 | Finally, we get to the results -- the reason for running Duchamp in |
---|
[18] | 604 | the first place. Once the detection list is finalised, it is sorted by |
---|
[80] | 605 | the mean velocity of the detections (or, if there is no good WCS |
---|
| 606 | associated with the cube, by the mean Z-pixel position). The results |
---|
| 607 | are then printed to the screen and to the output file, denoted by the |
---|
| 608 | {\tt OutFile} parameter. The results list, an example of which can be |
---|
| 609 | seen in Appendix~\ref{app-output}, contains the following columns |
---|
| 610 | (note that the title of the columns depending on WCS information will |
---|
| 611 | depend on the projection of the WCS): |
---|
[15] | 612 | |
---|
[3] | 613 | \begin{entry} |
---|
| 614 | \item[Obj\#] The ID number of the detection (simply the sequential |
---|
| 615 | count for the list, which is ordered by increasing velocity). |
---|
[80] | 616 | \item[Name] The IAU-format name of the detection (based on the WCS |
---|
| 617 | projection). |
---|
[3] | 618 | \item[X] The average X-pixel position. |
---|
| 619 | \item[Y] The average Y-pixel position. |
---|
| 620 | \item[Z] The average Z-pixel position. |
---|
[80] | 621 | \item[RA/GLON] The Right Ascension or Galactic Longitude of the centre |
---|
| 622 | of the object. |
---|
| 623 | \item[DEC/GLAT] The Declination or Galactic Latitude of the centre of |
---|
| 624 | the object. |
---|
[85] | 625 | \item[w\_RA/w\_GLON] The width of the object in Right Ascension or |
---|
| 626 | Galactic Longitude [arcmin]. |
---|
[80] | 627 | \item[w\_DEC/w\_GLAT] The width of the object in Declination Galactic |
---|
| 628 | Latitude [arcmin]. |
---|
[3] | 629 | \item[VEL] The mean velocity of the object [km/s]. |
---|
| 630 | \item[w\_VEL] The full velocity width of the detection (max channel |
---|
| 631 | $-$ min channel, in velocity units [km/s]). |
---|
[87] | 632 | \item[F\_tot] The integrated flux over the object, in the units of |
---|
| 633 | flux times velocity (\eg Jy km/s). |
---|
| 634 | \item[F\_peak] The peak flux over the object, in the units of flux. |
---|
[3] | 635 | \item[X1, X2] The minimum and maximum X-pixel coordinates. |
---|
| 636 | \item[Y1, Y2] The minimum and maximum Y-pixel coordinates. |
---|
| 637 | \item[Z1, Z2] The minimum and maximum Z-pixel coordinates. |
---|
| 638 | \item[Npix] The number of pixels \& channels (\ie distinct $(x,y,z)$ |
---|
| 639 | coordinates) in the detection. |
---|
[87] | 640 | \item[Flag] Whether the detection has any warning flags (see below). |
---|
[3] | 641 | \end{entry} |
---|
[80] | 642 | The Name is derived from the WCS position. For instance, the (RA,Dec) |
---|
| 643 | position 12$^h$53$^m$45$^s$, -36$^\circ$24$'$12$''$ will be called |
---|
| 644 | J1253$-$3624 (if the epoch is J2000) or B1253$-$3624 (if B1950). An |
---|
| 645 | alternative form is used for Galactic coordinates: the position |
---|
| 646 | ($l$,$b$) = (323.1245, 5.4567) will be called G323.12$+$05.45. If the |
---|
| 647 | WCS is not valid (\ie is not present or does not have all the |
---|
| 648 | necessary information), the Name, RA, DEC, VEL and related columns are |
---|
| 649 | not printed, but the pixel coordinates are still provided. |
---|
[3] | 650 | |
---|
[18] | 651 | \begin{figure}[t] |
---|
| 652 | \begin{center} |
---|
| 653 | \includegraphics[width=\textwidth]{example_spectrum} |
---|
| 654 | \end{center} |
---|
| 655 | \caption{\footnotesize An example of the spectrum output. Note several |
---|
| 656 | of the features discussed in the text: the removal of the Milky Way |
---|
| 657 | emission around 0 km/s; the red lines indicating the reconstructed |
---|
| 658 | spectrum; the blue dashed lines indicating the spectral extent of |
---|
[85] | 659 | the detection; the blue border showing its spatial extent on the |
---|
| 660 | 0th moment map; and the 15~arcmin-long scale bar.} |
---|
[18] | 661 | \label{fig-spect} |
---|
| 662 | \end{figure} |
---|
[3] | 663 | |
---|
[87] | 664 | The last column contains any warning flags about the detection. There |
---|
| 665 | are currently two options here. An `E' is printed if the detection is |
---|
| 666 | next to the edge of the image, meaning either the limit of the pixels, |
---|
| 667 | or the limit of the non-BLANK pixel region. An `N' is printed if the |
---|
| 668 | total flux, summed over all the (non-BLANK) pixels in the smallest box |
---|
| 669 | that completely encloses the detection, is negative. Note that this |
---|
| 670 | sum will possibly include non-detected pixels. It is of use in |
---|
| 671 | pointing out detections that lie next to strongly negative pixels, |
---|
| 672 | such as might arise due to interference -- the detected pixels might |
---|
| 673 | then also be due to the interference, so caution is advised. |
---|
| 674 | |
---|
[21] | 675 | Two alternative results files can also be requested. One option is a |
---|
| 676 | VOTable-format XML file, containing just the RA, Dec, Velocity and the |
---|
| 677 | corresponding widths of the detections, as well as the fluxes. The |
---|
| 678 | user should set {\tt flagVOT = 1}, and put the desired filename in the |
---|
| 679 | parameter {\tt votFile} -- note that the default is for it not to be |
---|
| 680 | produced. This file should be compatible with all Virtual Observatory |
---|
| 681 | tools (such as Aladin\footnote{ Aladin can be found on the web at |
---|
| 682 | \href{http://aladin.u-strasbg.fr/}{http://aladin.u-strasbg.fr/}}). The |
---|
| 683 | second option is an annotation file for use with the Karma toolkit of |
---|
| 684 | visualisation tools (in particular, with {\tt kvis}). This will draw a |
---|
| 685 | circle at the position of each detection, and number it according to |
---|
| 686 | the Obj\# given above. To use, the user should set {\tt flagKarma = 1}, |
---|
| 687 | and put the desired filename in the parameter {\tt karmaFile} -- again, |
---|
| 688 | the default is for it not to be produced. |
---|
[18] | 689 | |
---|
[3] | 690 | As the program is running, it also (optionally) records the detections |
---|
| 691 | made in each individual spectrum or channel (see |
---|
| 692 | \S\ref{sec-detection} for details on this process). This is |
---|
| 693 | recorded in the file denoted by the parameter {\tt LogFile}. This file |
---|
| 694 | does not include the columns {\tt Name, RA, DEC, w\_RA, w\_DEC, VEL, |
---|
| 695 | w\_VEL}. This file is designed primarily for diagnostic purposes: \eg |
---|
| 696 | to see if a given set of pixels is detected in, say, one channel |
---|
| 697 | image, but does not survive the merging process. The list of pixels |
---|
| 698 | (and their fluxes) in the final detection list are also printed to |
---|
| 699 | this file, again for diagnostic purposes. This feature can be turned |
---|
| 700 | off by setting {\tt flagLog = false}. (This may be a good idea if you |
---|
| 701 | are not interested in its contents, as it can be a large file.) |
---|
| 702 | |
---|
[18] | 703 | \begin{figure}[!t] |
---|
| 704 | \begin{center} |
---|
| 705 | \includegraphics[width=\textwidth]{example_moment_map} |
---|
| 706 | \end{center} |
---|
| 707 | \caption{\footnotesize An example of the moment map created by |
---|
| 708 | Duchamp. The full extent of the cube is covered, and the 0th moment |
---|
| 709 | of each object is shown (integrated individually over all the |
---|
| 710 | detected channels).} |
---|
| 711 | \label{fig-moment} |
---|
| 712 | \end{figure} |
---|
| 713 | |
---|
[3] | 714 | As well as the output data file, a postscript file is created that |
---|
[80] | 715 | shows the spectrum for each detection, together with a small cutout |
---|
[87] | 716 | image (0th moment) and basic information about the detection (note |
---|
| 717 | that any flags are printed after the name of the detection, in the |
---|
| 718 | format {\tt [E]}). If the |
---|
| 719 | cube was reconstructed, the spectrum from the reconstruction is shown |
---|
| 720 | in red, over the top of the original spectrum. The spectrum that is |
---|
[45] | 721 | plotted is governed by the {\tt spectralMethod} parameter. It can be |
---|
[80] | 722 | either {\tt peak}, where the spectrum is from the spatial pixel |
---|
| 723 | containing the detection's peak flux; or {\tt sum}, where the spectrum |
---|
| 724 | is summed over all spatial pixels, and then corrected for the beam |
---|
| 725 | size. |
---|
[3] | 726 | |
---|
[45] | 727 | The spectral extent of the detection is indicated with blue lines, and |
---|
| 728 | a zoom is shown in a separate window. The cutout image can optionally |
---|
| 729 | include a border around the spatial pixels that are in the detection |
---|
| 730 | (turned on and off by the parameter {\tt drawBorders}). It also |
---|
| 731 | includes a scale bar in the bottom left corner to indicate size -- it |
---|
[85] | 732 | is 15~arcmin long (note that due to projection effects it may be a |
---|
| 733 | slightly different physical length from object to object). An example |
---|
| 734 | detection can be seen below in Fig.~\ref{fig-spect}. |
---|
[45] | 735 | |
---|
[17] | 736 | Finally, a couple of images are optionally produced: a 0th moment map |
---|
[3] | 737 | of the cube, combining just the detected channels in each object, |
---|
| 738 | showing the integrated flux in grey-scale; and a ``detection image'', |
---|
| 739 | a grey-scale image where the pixel values are the number of channels |
---|
| 740 | that spatial pixel is detected in. In both cases, if {\tt drawBorders = |
---|
| 741 | true}, a border is drawn around the spatial extent of each |
---|
| 742 | detection. An example moment map is shown in Fig.~\ref{fig-moment}. |
---|
| 743 | The production or otherwise of these images is governed by the {\tt |
---|
| 744 | flagMaps} parameter. |
---|
| 745 | |
---|
| 746 | The purpose of these images are to provide a visual guide to where the |
---|
| 747 | detections have been made, and, particularly in the case of the moment |
---|
| 748 | map, to provide an indication of the strength of the source. In both |
---|
| 749 | cases, the detections are numbered (in the same way as the output |
---|
| 750 | list), and the spatial borders are marked out as for the cutout images |
---|
| 751 | in the spectra file. Both these images are saved as postscript files |
---|
| 752 | (given by the parameters {\tt momentMap} and {\tt detectionMap} |
---|
| 753 | respectively), with the latter also displayed in a {\sc pgplot} |
---|
| 754 | window (regardless of the state of {\tt flagMaps}). |
---|
| 755 | |
---|
| 756 | \section{Notes and hints on the use of Duchamp} |
---|
| 757 | |
---|
| 758 | In using Duchamp, the user has to make a number of decisions about |
---|
| 759 | the way the program runs. This section is designed to give the user |
---|
[80] | 760 | some idea about what to choose. |
---|
[3] | 761 | |
---|
| 762 | The main choice is whether or not to use the wavelet |
---|
| 763 | reconstruction. The main benefits of this are the marked reduction in |
---|
| 764 | the noise level, leading to regularly-shaped detections, and good |
---|
| 765 | reliability for faint sources. The main drawback with its use is the |
---|
| 766 | long execution time: to reconstruct a $170\times160\times1024$ |
---|
| 767 | (\hipass) cube often requires three iterations and takes about 20-25 |
---|
| 768 | minutes. The searching part of the procedure is much quicker (although |
---|
| 769 | see the note on merging, below), so if one uses the FDR method on the |
---|
[71] | 770 | un-reconstructed cube, the execution time is only a couple of |
---|
| 771 | minutes. Alternatively, using the ability to read in previously-saved |
---|
| 772 | reconstructed arrays makes running the reconstruction more than once a |
---|
| 773 | more feasible prospect. |
---|
[3] | 774 | |
---|
[80] | 775 | %A further drawback with the reconstruction is that it is susceptible |
---|
| 776 | %to edge effects. If the valid area in the cube (\ie the part that is |
---|
| 777 | %not BLANK) has very curved edges (such as the \hipass\ polar cap cube, |
---|
| 778 | %H001, which has a roughly circular shape after gridding), the |
---|
| 779 | %convolution can produce artefacts in the reconstruction that mimic the |
---|
| 780 | %edges and can lead (depending on the selection threshold) to some |
---|
| 781 | %spurious sources. Caution is advised with such data -- the user is |
---|
| 782 | %advised to check carefully the reconstructed cube for the presence of |
---|
| 783 | %such artefacts. |
---|
[3] | 784 | |
---|
| 785 | If one chooses the reconstruction method, a further decision is |
---|
| 786 | required on the signal-to-noise cutoff used in determining acceptable |
---|
| 787 | wavelet coefficients. A larger value will remove more noise from the |
---|
| 788 | cube, at the expense of losing fainter sources, while a smaller value |
---|
| 789 | will include more noise, which may produce spurious detections, but |
---|
| 790 | will be more sensitive to faint sources. Values of less than about |
---|
| 791 | $3\sigma$ tend to not reduce the noise a great deal and can lead to |
---|
[71] | 792 | many spurious sources (although this will depend on the nature of the |
---|
| 793 | cube). |
---|
[3] | 794 | |
---|
| 795 | The FDR method certainly produces more reliable results than a simple |
---|
| 796 | sigma-clipping (\ie thresholding at some number of $\sigma$ above the |
---|
[80] | 797 | mean), particularly if no reconstruction is done. However, at this |
---|
| 798 | point it does not seem to be giving the sensitivity expected for the |
---|
| 799 | supplied value of {\tt alpha} (\ie it is not finding as many sources |
---|
| 800 | as expected). Work is being done to assess this, and to judge whether |
---|
| 801 | there is a real problem (such as with the determination of the |
---|
| 802 | statistics), or simply a result of working in 3 dimensions as opposed |
---|
| 803 | to 2. |
---|
[3] | 804 | |
---|
| 805 | A further point to bear in mind is that the shape of the detections in |
---|
| 806 | a cube that has been reconstructed will be much more regular and |
---|
| 807 | smooth -- the ragged edges that objects in the raw cube possess are |
---|
| 808 | smoothed by the removal of most of the noise. |
---|
| 809 | |
---|
[24] | 810 | Finally, as Duchamp is still undergoing development, there are some |
---|
| 811 | elements that are not fully developed. In particular, it is not as |
---|
| 812 | clever as I would like at avoiding interference. The ability to place |
---|
| 813 | requirements on the minimum number of channels and pixels partially |
---|
| 814 | circumvents this problem, but work is being done to make Duchamp |
---|
| 815 | smarter at rejecting signals that are clearly (to a human eye at |
---|
| 816 | least) interference. See the following section for further |
---|
| 817 | improvements that are planned. |
---|
[3] | 818 | |
---|
[24] | 819 | %\section{Drawbacks of the current program} |
---|
| 820 | % |
---|
| 821 | %The program currently has a few problems/drawbacks/things to be aware |
---|
| 822 | %of that will hopefully be fixed in the future: |
---|
| 823 | %\begin{itemize} |
---|
| 824 | % |
---|
| 825 | %\item Narrow interference spikes are still getting found, particularly |
---|
| 826 | % if there is no reconstruction, or reconstruction with a relatively |
---|
| 827 | % low {\tt snrRecon} (such as 2 or 3). Increasing the {\tt |
---|
| 828 | % minChannels} parameter is one way to circumvent this, but making the |
---|
| 829 | % algorithm a bit more clever would be preferable. |
---|
| 830 | % |
---|
| 831 | %\item Sources that have strong continuum ripple and/or artefacts often |
---|
| 832 | % generate many spurious detections. This needs some work to avoid |
---|
| 833 | % Duchamp doing this, and until then users are advised to be aware |
---|
| 834 | % of the possibility. Strong continuum ripples may generate many |
---|
| 835 | % sources on the same spatial pixel, and this will be apparent on the |
---|
| 836 | % detection images. |
---|
| 837 | % |
---|
| 838 | %\item Spectra are integrated over every spatial pixel of the |
---|
| 839 | % detection, and this may dilute the actual detection, making it |
---|
| 840 | % harder to see \ie the apparent strength of the line as plotted may |
---|
| 841 | % not give a true indication of how strong it really is. |
---|
| 842 | % |
---|
| 843 | %%\item A caution on the merging part of the procedure. This can be time |
---|
| 844 | %% consuming if there are many detections that do not require merging |
---|
| 845 | %% -- in this case, the time will go like $N^2$ ($N$ = number of |
---|
| 846 | %% detections). If there are plenty of mergers, the size of the list |
---|
| 847 | %% reduces quickly, so the execution time will be less. |
---|
| 848 | % |
---|
| 849 | % |
---|
| 850 | %\end{itemize} |
---|
[3] | 851 | |
---|
| 852 | |
---|
| 853 | %\section{Comparison with other software (to be developed further...)} |
---|
| 854 | % |
---|
| 855 | %\subsection{fred, by Matt Howlett} |
---|
| 856 | % |
---|
| 857 | %This is the program used in the \hipass\ analysis. It smoothes the |
---|
| 858 | %data spectrally with a boxcar filter of a size that varies over a |
---|
| 859 | %user-specified range, and then thresholds the data. |
---|
| 860 | % |
---|
| 861 | %Works effectively, but generally doesn't find as many sources as |
---|
| 862 | %Duchamp, particularly when the reconstruction is used. Sensitive to |
---|
| 863 | %faint, broad features that fall below the reconstruction threshold. |
---|
| 864 | % |
---|
| 865 | %Execution takes a long time, depending on the range of filter widths |
---|
| 866 | %that are used. |
---|
| 867 | % |
---|
| 868 | %\subsection{sfind} |
---|
| 869 | % |
---|
| 870 | %Hard to evaluate, as it does not (as far as I can see) output the |
---|
| 871 | %channel number at which detections are made, and does not merge |
---|
| 872 | %detections made at adjacent channels (\ie it just works in 2 |
---|
| 873 | %dimensions). |
---|
| 874 | % |
---|
| 875 | |
---|
| 876 | \section{Future Developments} |
---|
| 877 | |
---|
| 878 | This is both a list of planned improvements and a wish-list of |
---|
| 879 | features that would be nice to include (but are not planned in the |
---|
[71] | 880 | immediate future). Let me know if there are items not on this list, or |
---|
| 881 | items on the list you would like prioritised. |
---|
[3] | 882 | |
---|
| 883 | \begin{itemize} |
---|
| 884 | |
---|
[24] | 885 | \item More varied output formats. {\bf Planned.} |
---|
| 886 | |
---|
[3] | 887 | \item Better determination of the noise characteristics of |
---|
| 888 | spectral-line cubes, including understanding how the noise is |
---|
| 889 | generated and developing a model for it. {\bf Planned.} |
---|
| 890 | |
---|
| 891 | \item Include more source analysis. Examples could be: shape |
---|
| 892 | information; measurements of HI mass; better measurements of |
---|
| 893 | velocity width and profile... {\bf Some planned.} |
---|
| 894 | |
---|
| 895 | \item Provide some indication of the significance of the detection |
---|
| 896 | (\ie some S/N-like value). {\bf Planned.} |
---|
| 897 | |
---|
[24] | 898 | \item Improved ability to reject interference, possibly on the |
---|
| 899 | spectral shape of features. {\bf Planned.} |
---|
| 900 | |
---|
[71] | 901 | \item Ability to separate (de-blend) distinct sources that have been |
---|
| 902 | merged. {\bf Planned.} |
---|
| 903 | |
---|
[3] | 904 | \item Link to lists of possible counterparts (\eg via NED/SIMBAD/other |
---|
| 905 | VO tools?). {\bf Wishlist.} |
---|
| 906 | |
---|
| 907 | \item At this point, the ``Milky Way'' channels are discarded and set |
---|
| 908 | to zero. It may be that users would like to have those put back in |
---|
| 909 | the final cube after the source detection is done, so at some point |
---|
| 910 | this option may be added. {\bf Wishlist -- if needed.} |
---|
| 911 | |
---|
| 912 | \end{itemize} |
---|
| 913 | |
---|
| 914 | |
---|
| 915 | %\bibliographystyle{mn2e} |
---|
| 916 | %\bibliographystyle{abbrvnat} |
---|
| 917 | %\bibliography{mnrasmnemonic,sourceDetection} |
---|
| 918 | \begin{thebibliography}{} |
---|
| 919 | |
---|
| 920 | \bibitem[\protect\citeauthoryear{{Calabretta} \& {Greisen}}{{Calabretta} \& |
---|
| 921 | {Greisen}}{2002}]{calabretta02} |
---|
| 922 | {Calabretta} M., {Greisen} E., 2002, A\&A, 395, 1077 |
---|
| 923 | |
---|
| 924 | \bibitem[\protect\citeauthoryear{{Greisen} \& {Calabretta}}{{Greisen} \& |
---|
| 925 | {Calabretta}}{2002}]{greisen02} |
---|
| 926 | {Greisen} E., {Calabretta} M., 2002, A\&A, 395, 1061 |
---|
| 927 | |
---|
[18] | 928 | \bibitem[\protect\citeauthoryear{{Hanisch}, {Farris}, {Greisen}, {Pence}, |
---|
| 929 | {Schlesinger}, {Teuben}, {Thompson} \& {Warnock}}{{Hanisch} |
---|
| 930 | et~al.}{2001}]{hanisch01} |
---|
| 931 | {Hanisch} R., {Farris} A., {Greisen} E., {Pence} W., {Schlesinger} B., |
---|
| 932 | {Teuben} P., {Thompson} R., {Warnock} A., 2001, A\&A, 376, 359 |
---|
| 933 | |
---|
[3] | 934 | \bibitem[\protect\citeauthoryear{{Hopkins}, {Miller}, {Connolly}, {Genovese}, |
---|
| 935 | {Nichol} \& {Wasserman}}{{Hopkins} et~al.}{2002}]{hopkins02} |
---|
| 936 | {Hopkins} A., {Miller} C., {Connolly} A., {Genovese} C., {Nichol} R., |
---|
| 937 | {Wasserman} L., 2002, AJ, 123, 1086 |
---|
| 938 | |
---|
| 939 | \bibitem[\protect\citeauthoryear{Lutz}{Lutz}{1980}]{lutz80} |
---|
| 940 | Lutz R., 1980, The Computer Journal, 23, 262 |
---|
| 941 | |
---|
| 942 | \bibitem[\protect\citeauthoryear{{Meyer} et~al.,}{{Meyer} |
---|
| 943 | et~al.}{2004}]{meyer04:trunc} |
---|
| 944 | {Meyer} M., et~al., 2004, MNRAS, 350, 1195 |
---|
| 945 | |
---|
| 946 | \bibitem[\protect\citeauthoryear{{Miller}, {Genovese}, {Nichol}, {Wasserman}, |
---|
| 947 | {Connolly}, {Reichart}, {Hopkins}, {Schneider} \& {Moore}}{{Miller} |
---|
| 948 | et~al.}{2001}]{miller01} |
---|
| 949 | {Miller} C., {Genovese} C., {Nichol} R., {Wasserman} L., {Connolly} A., |
---|
| 950 | {Reichart} D., {Hopkins} A., {Schneider} J., {Moore} A., 2001, AJ, 122, |
---|
| 951 | 3492 |
---|
| 952 | |
---|
| 953 | \bibitem[\protect\citeauthoryear{Minchin}{Minchin}{1999}]{minchin99} |
---|
| 954 | Minchin R., 1999, PASA, 16, 12 |
---|
| 955 | |
---|
| 956 | \bibitem[\protect\citeauthoryear{Starck \& Murtagh}{Starck \& |
---|
| 957 | Murtagh}{2002}]{starck02:book} |
---|
| 958 | Starck J.-L., Murtagh F., 2002, {``Astronomical Image and Data Analysis''}. |
---|
| 959 | Springer |
---|
| 960 | |
---|
| 961 | \end{thebibliography} |
---|
| 962 | |
---|
| 963 | |
---|
| 964 | \appendix |
---|
| 965 | \newpage |
---|
| 966 | \section{Available parameters} |
---|
| 967 | \label{app-param} |
---|
| 968 | |
---|
| 969 | The full list of parameters that can be listed in the input file are |
---|
| 970 | given here. If not listed, they take the default value given in |
---|
| 971 | parentheses. Since the order of the parameters in the input file does |
---|
| 972 | not matter, they are grouped here in logical sections. |
---|
| 973 | |
---|
| 974 | \subsection*{Input-output related} |
---|
| 975 | \begin{entry} |
---|
| 976 | \item[ImageFile (no default assumed)] The filename of the |
---|
| 977 | data cube to be analysed. |
---|
[71] | 978 | \item[flagSubsection {\tt [false]}] A flag to indicate whether one |
---|
| 979 | wants a subsection of the requested image. |
---|
| 980 | \item[Subsection {\tt [ [*,*,*] ]}] The requested subsection, which |
---|
| 981 | should be specified in the format {\tt [x1:x2,y1:y2,z1:z2]}, where |
---|
| 982 | the limits are inclusive. If the full range of a dimension is |
---|
| 983 | required, use a {\tt *}, \eg if you want the full spectral range of |
---|
| 984 | a subsection of the image, use {\tt [30:140,30:140,*]}. |
---|
| 985 | \item[flagReconExists {\tt [false]}] A flag to indicate whether the |
---|
| 986 | reconstructed array has been saved by a previous run of Duchamp. If |
---|
| 987 | set true, the reconstructed array will be read from the file given by |
---|
| 988 | {\tt reconFile}, rather than calculated directly. |
---|
| 989 | \item[reconFile (no default assumed)] The FITS file that contains the |
---|
| 990 | reconstructed array. If {\tt flagReconExists} is true and this |
---|
| 991 | parameter is not defined, the default file searched will be |
---|
| 992 | determined by the {\`a} trous parameters (see \S\ref{sec-recon}). |
---|
[87] | 993 | \item[OutFile {\tt [duchamp-Results.txt]}] The file containing the |
---|
| 994 | final list of detections. This also records the list of input |
---|
[3] | 995 | parameters. |
---|
[80] | 996 | \item[SpectraFile {\tt [duchamp-Spectra.ps]}] The postscript file |
---|
[3] | 997 | containing the resulting integrated spectra and images of the |
---|
| 998 | detections. |
---|
| 999 | \item[flagLog {\tt [true]}] A flag to indicate whether intermediate |
---|
| 1000 | detections should be logged. |
---|
[80] | 1001 | \item[LogFile {\tt [duchamp-Logfile.txt]}] The file in which intermediate |
---|
[3] | 1002 | detections are logged. These are detections that have not been |
---|
| 1003 | merged. This is primarily for use in debugging and diagnostic |
---|
| 1004 | purposes -- normal use of the program will probably not require |
---|
| 1005 | this. |
---|
| 1006 | \item[flagOutputRecon {\tt [false]}] A flag to say whether or not to |
---|
| 1007 | save the reconstructed cube as a FITS file. The filename will be |
---|
| 1008 | derived from the ImageFile -- the reconstruction of {\tt image.fits} |
---|
| 1009 | will be saved as {\tt image.RECON?.fits}, where {\tt ?} stands for |
---|
| 1010 | the value of {\tt snrRecon} (see below). |
---|
| 1011 | \item[flagOutputResid {\tt [false]}] As for {\tt flagOutputRecon}, but |
---|
| 1012 | for the residual array -- the difference between the original cube |
---|
| 1013 | and the reconstructed cube. The filename will be {\tt |
---|
| 1014 | image.RESID?.fits}. |
---|
| 1015 | \item[flagVOT {\tt [false]}] A flag to say whether to create a VOTable |
---|
| 1016 | file corresponding to the information in {\tt outfile}. This will be |
---|
| 1017 | an XML file in the Virtual Observatory VOTable format. |
---|
[80] | 1018 | \item[votFile {\tt [duchamp-Results.xml]}] The VOTable file with the |
---|
[3] | 1019 | list of final detections. Some input parameters are also recorded. |
---|
[20] | 1020 | \item[flagKarma {\tt [false]}] A flag to say whether to create a Karma |
---|
| 1021 | annotation file corresponding to the information in {\tt |
---|
| 1022 | outfile}. This can be used as an overlay for the Karma programs such |
---|
| 1023 | as {\tt kvis}. |
---|
[80] | 1024 | \item[karmaFile {\tt [duchamp-Results.ann]}] The Karma annotation |
---|
[20] | 1025 | file showing the list of final detections. |
---|
[3] | 1026 | \item[flagMaps {\tt [true]}] A flag to say whether to save postscript |
---|
[17] | 1027 | files showing the 0th moment map of the whole cube (parameter {\tt |
---|
[3] | 1028 | momentMap}) and the detection image ({\tt detectionMap}). |
---|
[80] | 1029 | \item[momentMap {\tt [duchamp-MomentMap.ps]}] A postscript file |
---|
[17] | 1030 | containing a map of the 0th moment of the detected sources, as well |
---|
[3] | 1031 | as pixel and WCS coordinates. |
---|
[80] | 1032 | \item[detectionMap {\tt [duchamp-DetectionMap.ps]}] A postscript |
---|
[3] | 1033 | file showing each of the detected objects, coloured in greyscale by |
---|
| 1034 | the number of channels they span. Also shows pixel and WCS |
---|
| 1035 | coordinates. |
---|
| 1036 | \end{entry} |
---|
| 1037 | |
---|
| 1038 | \subsection*{Modifying the cube} |
---|
| 1039 | \begin{entry} |
---|
| 1040 | \item[flagBlankPix {\tt [true]}] A flag to say whether to remove BLANK |
---|
| 1041 | pixels from the analysis -- these are pixels set to some particular |
---|
| 1042 | value because they fall outside the imaged area. |
---|
[15] | 1043 | \item[blankPixValue {\tt [-8.00061]}] The value of the BLANK pixels, |
---|
| 1044 | if this information is not contained in the FITS header (the usual |
---|
[3] | 1045 | procedure is to obtain this value from the header information -- in |
---|
| 1046 | which case the value set by this parameter is ignored). |
---|
[55] | 1047 | \item[flagMW {\tt [false]}] A flag to say whether to remove channels |
---|
[3] | 1048 | contaminated by Milky Way (or other) emission -- the flux in these |
---|
| 1049 | channels is currently just set to 0. |
---|
| 1050 | \item[maxMW {\tt [112]}] The maximum channel for the Milky Way |
---|
[15] | 1051 | emission. |
---|
[3] | 1052 | \item[minMW {\tt [75]}] The minimum channel for the Milky Way |
---|
| 1053 | emission. Note that the channels specified by {\tt maxMW} and {\tt |
---|
| 1054 | minMW} are assumed to be Milky Way channels (\ie the range is |
---|
| 1055 | inclusive). |
---|
| 1056 | \item[flagBaseline {\tt [false]}] A flag to say whether to remove the |
---|
| 1057 | baseline from each spectrum in the cube for the purposes of |
---|
| 1058 | reconstruction and detection. |
---|
| 1059 | \end{entry} |
---|
| 1060 | |
---|
| 1061 | \subsection*{Detection related} |
---|
| 1062 | |
---|
| 1063 | \subsubsection*{General detection} |
---|
| 1064 | \begin{entry} |
---|
[80] | 1065 | \item[flagNegative {\tt [false]}] A flag to indicate that the features |
---|
| 1066 | being searched for are negative. The cube will be inverted prior to |
---|
| 1067 | searching. |
---|
[3] | 1068 | \item[snrCut {\tt [3.]}] The cut-off value for thresholding, in terms |
---|
| 1069 | of number of $\sigma$ above the mean. |
---|
[87] | 1070 | \item[flagGrowth {\tt [false]}] A flag indicating whether or not to |
---|
[3] | 1071 | grow the detected objects to a smaller threshold. |
---|
[80] | 1072 | \item[growthCut {\tt [2.]}] The smaller threshold using in growing |
---|
[3] | 1073 | detections. In units of $\sigma$ above the mean. |
---|
| 1074 | \end{entry} |
---|
| 1075 | |
---|
| 1076 | \subsubsection*{{\` a} trous reconstruction} |
---|
| 1077 | \begin{entry} |
---|
| 1078 | \item [flagATrous {\tt [true]}] A flag indicating whether or not to |
---|
| 1079 | reconstruct the cube using the {\it {\`a} trous} wavelet |
---|
| 1080 | reconstruction. Currently does this in 3-dimensions. See |
---|
| 1081 | \S\ref{sec-recon} for details. |
---|
| 1082 | \item[scaleMin {\tt [1]}] The minimum wavelet scale to be used in the |
---|
| 1083 | reconstruction. A value of 1 means ``use all scales''. |
---|
| 1084 | \item[snrRecon {\tt [4]}] The thresholding cutoff used in the |
---|
| 1085 | reconstruction -- only wavelet coefficients this many $\sigma$ above |
---|
| 1086 | the mean (or greater) are included in the reconstruction. |
---|
| 1087 | \item[filterCode {\tt [2]}] The code number of the filter to use in |
---|
| 1088 | the reconstruction. The options are: |
---|
| 1089 | \begin{itemize} |
---|
| 1090 | \item {\bf 1:} B$_3$-spline filter: coefficients = |
---|
| 1091 | $(\frac{1}{16}, \frac{1}{4}, \frac{3}{8}, \frac{1}{4}, \frac{1}{16})$ |
---|
| 1092 | \item {\bf 2:} Triangle filter: coefficients = $(\frac{1}{4}, \frac{1}{2}, \frac{1}{4})$ |
---|
| 1093 | \item {\bf 3:} Haar wavelet: coefficients = $(0, \frac{1}{2}, \frac{1}{2})$ |
---|
| 1094 | \end{itemize} |
---|
| 1095 | \end{entry} |
---|
| 1096 | |
---|
| 1097 | \subsubsection*{FDR method} |
---|
| 1098 | \begin{entry} |
---|
| 1099 | \item[flagFDR {\tt [false]}] A flag indicating whether or not to use |
---|
| 1100 | the False Discovery Rate method in thresholding the pixels. |
---|
| 1101 | \item[alphaFDR {\tt [0.01]}] The $\alpha$ parameter used in the FDR |
---|
| 1102 | analysis. The average number of false detections, as a fraction of the |
---|
| 1103 | total number, will be less than $\alpha$ (see \S\ref{sec-detection}). |
---|
| 1104 | \end{entry} |
---|
| 1105 | |
---|
| 1106 | \subsubsection*{Merging detections} |
---|
| 1107 | \begin{entry} |
---|
[17] | 1108 | \item[minPix {\tt [2]}] The minimum number of spatial pixels for a single |
---|
| 1109 | detection to be counted. |
---|
[3] | 1110 | \item[minChannels {\tt [3]}] The minimum number of consecutive |
---|
| 1111 | channels that must be present in the detection for it to be accepted |
---|
| 1112 | by the Merging algorithm. |
---|
| 1113 | %The minimum number of channels that a |
---|
| 1114 | % detection must span for it to be accepted by the Merging algorithm. |
---|
[80] | 1115 | \item[flagAdjacent {\tt [true]}] A flag indicating whether to use the |
---|
| 1116 | ``adjacent pixel'' criterion to decide whether to merge objects. If |
---|
| 1117 | not, the next two parameters are used to determine whether objects |
---|
| 1118 | are within the necessary thresholds. |
---|
[3] | 1119 | \item[threshSpatial {\tt [3.]}] The maximum allowed minimum spatial |
---|
| 1120 | separation (in pixels) between two detections for them to be merged |
---|
| 1121 | into one. Only used if {\tt flagAdjacent = false}. |
---|
| 1122 | \item[threshVelocity {\tt [7.]}] The maximum allowed minimum channel |
---|
| 1123 | separation between two detections for them to be merged into |
---|
| 1124 | one. %Only used if {\tt flagAdjacent = false}. |
---|
| 1125 | \end{entry} |
---|
| 1126 | |
---|
| 1127 | \subsubsection*{Other parameters} |
---|
| 1128 | \begin{entry} |
---|
[80] | 1129 | \item[spectralMethod {\tt [peak]}] This indicates which method is used |
---|
| 1130 | to plot the output spectra: {\tt peak} means plot the spectrum |
---|
| 1131 | containing the detection's peak pixel; {\tt sum} means sum the |
---|
| 1132 | spectra of each detected spatial pixel, and correct for the beam |
---|
| 1133 | size. Any other choice defaults to {\tt peak}. |
---|
[3] | 1134 | \item[drawBorders {\tt [true]}] A flag indicating whether borders |
---|
| 1135 | are to be drawn around the detected objects in the moment maps |
---|
| 1136 | included in the output (see for example Fig.~\ref{fig-spect}). |
---|
| 1137 | \item[verbose {\tt [true]}] A flag indicating whether to print the |
---|
| 1138 | progress of computationally-intensive algorithms (such as the |
---|
| 1139 | searching and merging) to screen. |
---|
| 1140 | \end{entry} |
---|
| 1141 | |
---|
| 1142 | |
---|
| 1143 | \newpage |
---|
| 1144 | \section{Example parameter files} |
---|
| 1145 | \label{app-input} |
---|
| 1146 | |
---|
| 1147 | This is what a typical parameter file would look like. |
---|
| 1148 | |
---|
| 1149 | \begin{verbatim} |
---|
| 1150 | imageFile /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits |
---|
[80] | 1151 | logFile logfile.txt |
---|
| 1152 | outFile results.txt |
---|
[3] | 1153 | spectraFile spectra.ps |
---|
| 1154 | flagSubsection 0 |
---|
| 1155 | flagOutputRecon 0 |
---|
| 1156 | flagOutputResid 0 |
---|
| 1157 | flagBlankPix 1 |
---|
| 1158 | flagMW 1 |
---|
| 1159 | minMW 75 |
---|
| 1160 | maxMW 112 |
---|
| 1161 | minPix 3 |
---|
| 1162 | flagGrowth 1 |
---|
| 1163 | growthCut 1.5 |
---|
| 1164 | flagATrous 0 |
---|
| 1165 | scaleMin 1 |
---|
| 1166 | snrRecon 4 |
---|
| 1167 | flagFDR 1 |
---|
| 1168 | alphaFDR 0.1 |
---|
| 1169 | numPixPSF 20 |
---|
| 1170 | snrCut 3 |
---|
| 1171 | threshSpatial 3 |
---|
| 1172 | threshVelocity 7 |
---|
| 1173 | \end{verbatim} |
---|
| 1174 | |
---|
| 1175 | Note that it is not necessary to include all these parameters in the |
---|
| 1176 | file, only those that need to be changed from the defaults (as listed |
---|
| 1177 | in Appendix~\ref{app-param}), which in this case would be very few. A |
---|
| 1178 | minimal parameter file might look like: |
---|
| 1179 | \begin{verbatim} |
---|
| 1180 | imageFile /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits |
---|
| 1181 | flagLog 0 |
---|
| 1182 | snrRecon 3 |
---|
| 1183 | snrCut 2.5 |
---|
[87] | 1184 | minChannels 4 |
---|
[3] | 1185 | \end{verbatim} |
---|
| 1186 | This will reconstruct the cube with a lower SNR value than the |
---|
| 1187 | default, select objects at a lower threshold, with a looser minimum |
---|
| 1188 | channel requirement, and not keep a log of the intermediate |
---|
| 1189 | detections. |
---|
| 1190 | |
---|
| 1191 | The following page demonstrates how the parameters are presented to |
---|
| 1192 | the user, both on the screen at execution time and in the output and |
---|
| 1193 | log files: |
---|
| 1194 | \newpage |
---|
| 1195 | \begin{landscape} |
---|
| 1196 | Presentation of parameters in output and log files: |
---|
| 1197 | \begin{verbatim} |
---|
[18] | 1198 | ---- Parameters ---- |
---|
[3] | 1199 | Image to be analysed = /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits |
---|
[87] | 1200 | Intermediate Logfile = duchamp-Logfile.txt |
---|
| 1201 | Final Results file = duchamp-Results.txt |
---|
| 1202 | Spectrum file = duchamp-Spectra.ps |
---|
| 1203 | VOTable file = duchamp-Results.xml |
---|
| 1204 | 0th Moment Map = duchamp-MomentMap.ps |
---|
| 1205 | Detection Map = duchamp-DetectionMap.ps |
---|
[3] | 1206 | Saving reconstructed cube? = false |
---|
| 1207 | Saving residuals from reconstruction? = false |
---|
[18] | 1208 | ------ |
---|
[87] | 1209 | Searching for Negative features? = false |
---|
[3] | 1210 | Fixing Blank Pixels? = true |
---|
| 1211 | Blank Pixel Value = -8.00061 |
---|
| 1212 | Removing Milky Way channels? = true |
---|
| 1213 | Milky Way Channels = 75-112 |
---|
[18] | 1214 | Beam Size (pixels) = 10.1788 |
---|
| 1215 | Removing baselines before search? = false |
---|
| 1216 | Minimum # Pixels in a detection = 2 |
---|
| 1217 | Growing objects after detection? = false |
---|
[3] | 1218 | Using A Trous reconstruction? = true |
---|
| 1219 | Minimum scale in reconstruction = 1 |
---|
[18] | 1220 | SNR Threshold within reconstruction = 4 |
---|
| 1221 | Filter being used for reconstruction = B3 spline function |
---|
[3] | 1222 | Using FDR analysis? = false |
---|
[18] | 1223 | SNR Threshold = 2.5 |
---|
[3] | 1224 | Using Adjacent-pixel criterion? = true |
---|
[87] | 1225 | Max. velocity separation for merging = 7 |
---|
[3] | 1226 | Min. # channels for merging = 4 |
---|
[87] | 1227 | Method of spectral plotting = peak |
---|
[3] | 1228 | \end{verbatim} |
---|
| 1229 | |
---|
| 1230 | \newpage |
---|
| 1231 | \section{Example output file} |
---|
| 1232 | \label{app-output} |
---|
| 1233 | This the typical content of an output file, after running Duchamp |
---|
[80] | 1234 | with the parameters illustrated on the previous page. |
---|
[3] | 1235 | |
---|
| 1236 | {\scriptsize |
---|
| 1237 | \begin{verbatim} |
---|
[87] | 1238 | Results of the Duchamp source finder: Tue May 23 14:51:38 2006 |
---|
[18] | 1239 | ---- Parameters ---- |
---|
[3] | 1240 | |
---|
| 1241 | (... omitted for clarity -- see previous page for examples...) |
---|
| 1242 | |
---|
[18] | 1243 | -------------------- |
---|
| 1244 | Total number of detections = 23 |
---|
| 1245 | -------------------- |
---|
[87] | 1246 | Obj# Name X Y Z RA DEC w_RA w_DEC VEL w_VEL F_tot F_peak X1 X2 Y1 Y2 Z1 Z2 Npix Flag |
---|
| 1247 | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
---|
| 1248 | 1 J0609-2156 59.4 140.6 114.7 06:09:21.03 -21:56:51.08 48.48 39.45 226.253 65.957 17.572 0.213 55 66 136 145 113 118 185 |
---|
| 1249 | 2 J0607-2601 65.2 79.6 116.2 06:07:52.21 -26:01:09.34 44.44 39.50 246.310 39.574 4.144 0.100 60 70 76 85 115 118 50 |
---|
| 1250 | 3 J0606-2720 70.8 59.8 121.4 06:06:14.90 -27:20:45.24 52.45 47.59 315.404 39.574 17.066 0.150 65 77 53 64 120 123 213 |
---|
| 1251 | 4 J0611-2138 52.5 145.1 162.5 06:11:18.85 -21:38:03.71 32.39 23.49 856.919 118.722 44.394 0.410 49 56 142 147 158 167 303 E |
---|
| 1252 | 5 J0600-2859 89.7 35.3 202.4 06:00:33.13 -28:59:01.59 23.92 28.10 1383.476 184.678 26.573 0.173 87 92 32 38 195 209 319 |
---|
| 1253 | 6 J0558-2639 95.5 70.2 222.6 05:58:52.79 -26:39:04.56 15.93 12.10 1650.508 105.531 1.925 0.063 94 97 69 71 219 227 35 |
---|
| 1254 | 7 J0617-2724 34.8 58.3 227.5 06:17:05.84 -27:24:00.93 20.75 23.42 1714.993 303.400 11.414 0.093 33 37 56 61 215 238 176 |
---|
| 1255 | 8 J0609-2141 60.3 144.4 229.6 06:09:05.74 -21:41:38.75 16.14 11.82 1742.470 105.531 1.476 0.068 59 62 143 145 225 233 25 |
---|
| 1256 | 9 J0558-2525 95.7 88.6 231.1 05:58:51.19 -25:25:33.12 27.87 24.16 1762.632 250.635 16.930 0.115 92 98 86 91 220 239 257 |
---|
| 1257 | 10 J0600-2141 88.9 144.4 232.3 06:00:52.94 -21:41:57.48 31.95 24.15 1777.848 224.252 34.030 0.166 86 93 142 147 222 239 415 E |
---|
| 1258 | 11 J0615-2634 40.0 70.8 232.6 06:15:25.93 -26:34:35.73 16.54 19.58 1782.224 52.765 2.757 0.068 38 41 69 73 231 235 44 |
---|
| 1259 | 12 J0604-2606 75.9 78.4 233.1 06:04:42.24 -26:06:22.98 28.12 23.86 1788.258 224.252 27.059 0.155 73 79 76 81 225 242 352 |
---|
| 1260 | 13 J0601-2340 88.0 114.9 235.7 06:01:08.27 -23:40:17.66 35.94 32.09 1822.941 263.826 85.132 0.297 84 92 112 119 226 246 724 |
---|
| 1261 | 14 J0615-2234 38.2 130.6 253.6 06:15:30.57 -22:34:51.69 12.38 15.71 2059.721 118.722 2.317 0.070 37 39 129 132 248 257 40 |
---|
| 1262 | 15 J0617-2305 31.4 122.8 258.0 06:17:33.18 -23:05:36.24 16.45 15.54 2117.104 39.574 1.424 0.062 30 33 121 124 256 259 23 |
---|
| 1263 | 16 J0612-2149 49.5 142.3 271.1 06:12:11.78 -21:49:20.22 24.35 19.58 2290.167 395.740 20.712 0.101 47 52 140 144 257 287 318 |
---|
| 1264 | 17 J0616-2133 35.2 145.9 300.0 06:16:16.44 -21:33:36.96 20.21 7.47 2671.799 224.252 3.851 0.127 33 37 145 146 294 311 40 E |
---|
| 1265 | 18 J0544-2736 144.0 54.9 325.4 05:44:13.62 -27:36:34.24 3.57 12.13 3006.575 39.574 0.436 0.057 144 144 54 56 324 327 7 E |
---|
| 1266 | 19 J0555-2956 107.2 20.7 367.5 05:55:10.37 -29:56:43.13 19.65 24.31 3561.004 39.574 6.482 0.169 105 109 18 23 366 369 72 |
---|
| 1267 | 20 J0558-2321 96.0 119.6 532.1 05:58:47.64 -23:21:17.38 11.91 16.09 5733.479 52.765 1.287 0.051 95 97 118 121 530 534 27 |
---|
| 1268 | 21 J0616-2649 37.9 67.0 547.0 06:16:04.62 -26:49:18.33 12.35 11.67 5929.923 39.574 1.637 0.064 37 39 66 68 546 549 25 |
---|
| 1269 | 22 J0619-2252 25.1 125.9 724.2 06:19:21.57 -22:52:13.98 12.38 11.61 8267.304 39.573 0.698 0.059 24 26 125 127 723 726 13 E |
---|
| 1270 | 23 J0552-2916 116.9 30.5 727.0 05:52:15.05 -29:16:49.65 11.59 20.25 8304.033 303.400 35.834 0.479 116 118 28 32 716 739 132 |
---|
[3] | 1271 | \end{verbatim} |
---|
| 1272 | } |
---|
[80] | 1273 | Note that the |
---|
| 1274 | width of the table can make it hard to read. A good trick for those |
---|
| 1275 | using UNIX/Linux is to make use of the {\tt a2ps} command. The |
---|
| 1276 | following works well, producing a postscript file {\tt results.ps}: |
---|
[87] | 1277 | \\\verb|a2ps -1 -r -f8 -o duchamp-Results.ps duchamp-Results.txt| |
---|
[3] | 1278 | |
---|
| 1279 | %\end{landscape} |
---|
| 1280 | |
---|
| 1281 | \newpage |
---|
| 1282 | \section{Example VOTable output} |
---|
| 1283 | \label{app-votable} |
---|
| 1284 | This is part of the VOTable, in XML format, corresponding to the |
---|
[20] | 1285 | output file in Appendix~\ref{app-output} (the indentation has been removed to make it fit on the page!). |
---|
[3] | 1286 | |
---|
| 1287 | %\begin{landscape} |
---|
| 1288 | {\scriptsize |
---|
| 1289 | \begin{verbatim} |
---|
| 1290 | <?xml version="1.0"?> |
---|
| 1291 | <VOTABLE version="1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" |
---|
| 1292 | xsi:noNamespaceSchemaLocation="http://www.ivoa.net/xml/VOTable/VOTable/v1.1"> |
---|
[20] | 1293 | <COOSYS ID="J2000" equinox="J2000." epoch="J2000." system="eq_FK5"/> |
---|
| 1294 | <RESOURCE name="Duchamp Output"> |
---|
| 1295 | <TABLE name="Detections"> |
---|
| 1296 | <DESCRIPTION>Detected sources and parameters from running the Duchamp source finder.</DESCRIPTION> |
---|
| 1297 | <PARAM name="FITS file" datatype="char" ucd="meta.file;meta.fits" value="/DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits"/> |
---|
[87] | 1298 | <PARAM name="Threshold" datatype="float" ucd="stat.snr" value="2.5"> |
---|
| 1299 | <PARAM name="ATrous note" datatype="char" ucd="meta.note" value="The a trous reconstruction method was used, with the following parameters."> |
---|
| 1300 | <PARAM name="ATrous Cut" datatype="float" ucd="stat.snr" value="4"> |
---|
| 1301 | <PARAM name="ATrous Minimum Scale" datatype="int" ucd="stat.param" value="1"> |
---|
| 1302 | <PARAM name="ATrous Filter" datatype="char" ucd="meta.code;stat" value="B3 spline function"> |
---|
[20] | 1303 | <FIELD name="ID" ID="col1" ucd="meta.id" datatype="int" width="4"/> |
---|
| 1304 | <FIELD name="Name" ID="col2" ucd="meta.id;meta.main" datatype="char" arraysize="14"/> |
---|
| 1305 | <FIELD name="RA" ID="col3" ucd="pos.eq.ra;meta.main" ref="J2000" datatype="float" width="10" precision="6" unit="deg"/> |
---|
| 1306 | <FIELD name="Dec" ID="col4" ucd="pos.eq.dec;meta.main" ref="J2000" datatype="float" width="10" precision="6" unit="deg"/> |
---|
| 1307 | <FIELD name="w_RA" ID="col3" ucd="phys.angSize;pos.eq.ra" ref="J2000" datatype="float" width="7" precision="2" unit="arcmin"/> |
---|
| 1308 | <FIELD name="w_Dec" ID="col4" ucd="phys.angSize;pos.eq.dec" ref="J2000" datatype="float" width="7" precision="2" unit="arcmin"/> |
---|
| 1309 | <FIELD name="Vel" ID="col4" ucd="phys.veloc;src.dopplerVeloc" datatype="float" width="9" precision="3" unit="km/s"/> |
---|
| 1310 | <FIELD name="w_Vel" ID="col4" ucd="phys.veloc;src.dopplerVeloc;spect.line.width" datatype="float" width="8" precision="3" unit="km/s"/> |
---|
| 1311 | <FIELD name="Integrated_Flux" ID="col4" ucd="phys.flux;spect.line.intensity" datatype="float" width="10" precision="3" unit="km/s"/> |
---|
| 1312 | <DATA> |
---|
| 1313 | <TABLEDATA> |
---|
| 1314 | <TR> |
---|
| 1315 | <TD> 1</TD><TD> J0609-2200</TD><TD> 92.410416</TD><TD>-22.013390</TD><TD> 48.50</TD><TD> 39.42</TD><TD> 213.061</TD><TD> 65.957</TD><TD> 17.572</TD> |
---|
| 1316 | </TR> |
---|
| 1317 | <TR> |
---|
| 1318 | <TD> 2</TD><TD> J0608-2605</TD><TD> 92.042633</TD><TD>-26.085157</TD><TD> 44.47</TD><TD> 39.47</TD><TD> 233.119</TD><TD> 39.574</TD><TD> 4.144</TD> |
---|
| 1319 | </TR> |
---|
| 1320 | <TR> |
---|
| 1321 | <TD> 3</TD><TD> J0606-2724</TD><TD> 91.637840</TD><TD>-27.412022</TD><TD> 52.48</TD><TD> 47.57</TD><TD> 302.213</TD><TD> 39.574</TD><TD> 17.066</TD> |
---|
| 1322 | </TR> |
---|
[3] | 1323 | (... table truncated for clarity ...) |
---|
[20] | 1324 | </TABLEDATA> |
---|
| 1325 | </DATA> |
---|
| 1326 | </TABLE> |
---|
| 1327 | </RESOURCE> |
---|
[3] | 1328 | </VOTABLE> |
---|
| 1329 | \end{verbatim} |
---|
| 1330 | } |
---|
| 1331 | \end{landscape} |
---|
| 1332 | |
---|
[87] | 1333 | \newpage |
---|
| 1334 | \section{Example Karma Annotation File output} |
---|
| 1335 | \label{app-karma} |
---|
| 1336 | |
---|
| 1337 | This is the format of the Karma Annotation file, showing the locations |
---|
| 1338 | of the detected objects. This can be loaded by the plotting tools of |
---|
| 1339 | the Karma package (for instance, {\tt kvis}) as an overlay on the FITS |
---|
| 1340 | file. |
---|
| 1341 | |
---|
| 1342 | \begin{verbatim} |
---|
| 1343 | # Duchamp Source Finder results for |
---|
| 1344 | # cube /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits |
---|
| 1345 | COLOR RED |
---|
| 1346 | COORD W |
---|
| 1347 | CIRCLE 92.3376 -21.9475 0.403992 |
---|
| 1348 | TEXT 92.3376 -21.9475 1 |
---|
| 1349 | CIRCLE 91.9676 -26.0193 0.37034 |
---|
| 1350 | TEXT 91.9676 -26.0193 2 |
---|
| 1351 | CIRCLE 91.5621 -27.3459 0.437109 |
---|
| 1352 | TEXT 91.5621 -27.3459 3 |
---|
| 1353 | CIRCLE 92.8285 -21.6344 0.269914 |
---|
| 1354 | TEXT 92.8285 -21.6344 4 |
---|
| 1355 | CIRCLE 90.1381 -28.9838 0.234179 |
---|
| 1356 | TEXT 90.1381 -28.9838 5 |
---|
| 1357 | CIRCLE 89.72 -26.6513 0.132743 |
---|
| 1358 | TEXT 89.72 -26.6513 6 |
---|
| 1359 | CIRCLE 94.2743 -27.4003 0.195175 |
---|
| 1360 | TEXT 94.2743 -27.4003 7 |
---|
| 1361 | CIRCLE 92.2739 -21.6941 0.134538 |
---|
| 1362 | TEXT 92.2739 -21.6941 8 |
---|
| 1363 | CIRCLE 89.7133 -25.4259 0.232252 |
---|
| 1364 | TEXT 89.7133 -25.4259 9 |
---|
| 1365 | CIRCLE 90.2206 -21.6993 0.266247 |
---|
| 1366 | TEXT 90.2206 -21.6993 10 |
---|
| 1367 | CIRCLE 93.8581 -26.5766 0.163153 |
---|
| 1368 | TEXT 93.8581 -26.5766 11 |
---|
| 1369 | CIRCLE 91.176 -26.1064 0.234356 |
---|
| 1370 | TEXT 91.176 -26.1064 12 |
---|
| 1371 | CIRCLE 90.2844 -23.6716 0.299509 |
---|
| 1372 | TEXT 90.2844 -23.6716 13 |
---|
| 1373 | CIRCLE 93.8774 -22.581 0.130925 |
---|
| 1374 | TEXT 93.8774 -22.581 14 |
---|
| 1375 | CIRCLE 94.3882 -23.0934 0.137108 |
---|
| 1376 | TEXT 94.3882 -23.0934 15 |
---|
| 1377 | CIRCLE 93.0491 -21.8223 0.202928 |
---|
| 1378 | TEXT 93.0491 -21.8223 16 |
---|
| 1379 | CIRCLE 94.0685 -21.5603 0.168456 |
---|
| 1380 | TEXT 94.0685 -21.5603 17 |
---|
| 1381 | CIRCLE 86.0568 -27.6095 0.101113 |
---|
| 1382 | TEXT 86.0568 -27.6095 18 |
---|
| 1383 | CIRCLE 88.7932 -29.9453 0.202624 |
---|
| 1384 | TEXT 88.7932 -29.9453 19 |
---|
| 1385 | \end{verbatim} |
---|
| 1386 | |
---|
| 1387 | \newpage |
---|
[80] | 1388 | \section{Installing Duchamp (README file)} |
---|
[71] | 1389 | \begin{verbatim} |
---|
| 1390 | There is an executable (Duchamp) that has been compiled on a Debian |
---|
| 1391 | Linux kernel 2.6.8-2-686, with gcc version 3.3.5 (Debian 1:3.3.5-13) |
---|
| 1392 | |
---|
| 1393 | If that is no good to you, you can compile it yourself using the |
---|
| 1394 | Makefile included in this directory (sorry for not having a configure |
---|
| 1395 | script or similar yet!). |
---|
| 1396 | |
---|
| 1397 | Duchamp uses three main external libraries: pgplot, cfitsio and |
---|
| 1398 | wcslib. You will need to set the paths for the base directory and |
---|
| 1399 | three libraries, as they are currently configured for my use and will |
---|
| 1400 | not be of much use to you! These are: |
---|
| 1401 | |
---|
| 1402 | BASE --> the current directory |
---|
| 1403 | PGDIR --> where the pgplot libraries (and header files) are located |
---|
| 1404 | CFITSIODIR --> where the header file fitsio.h is |
---|
| 1405 | CFITSIOLDIR --> where the cfitsio library is located (libcfitsio.a) |
---|
| 1406 | WCSDIR --> where the wcslib header files are |
---|
| 1407 | WCSLDIR --> where the wcslib library is located (libwcs.a) |
---|
| 1408 | |
---|
| 1409 | If you do not have the libraries, they can be downloaded from the |
---|
| 1410 | following locations: |
---|
| 1411 | PGPlot -- http://www.astro.caltech.edu/~tjp/pgplot/ |
---|
| 1412 | cfitsio -- http://heasarc.gsfc.nasa.gov/docs/software/fitsio/fitsio.html |
---|
| 1413 | wcslib -- http://www.atnf.csiro.au/people/Mark.Calabretta/WCS/index.html |
---|
| 1414 | |
---|
| 1415 | Once you've set up the Makefile correctly, then simply typing |
---|
| 1416 | > make duchamp |
---|
| 1417 | will compile the program. |
---|
| 1418 | |
---|
| 1419 | To run it, you need to use the syntax |
---|
| 1420 | > Duchamp -p parameterFile |
---|
| 1421 | where parameterFile is a file with the input parameters, including the |
---|
| 1422 | name of the cube you want to search. |
---|
| 1423 | |
---|
| 1424 | There are two example input files included with the distribution. The |
---|
| 1425 | smaller one, InputExample, shows the typical parameters one might want |
---|
| 1426 | to set. The large one, InputComplete, lists all parameters that can be |
---|
| 1427 | entered, and a brief description of them. Refer to the documentation |
---|
| 1428 | for further details. |
---|
| 1429 | |
---|
| 1430 | To get going quickly, just replace the "your-file-here" in |
---|
| 1431 | InputExample with your image name, and type |
---|
| 1432 | > Duchamp -p InputExample |
---|
| 1433 | and you're off! |
---|
| 1434 | \end{verbatim} |
---|
| 1435 | |
---|
[3] | 1436 | \section{Robust statistics for a Normal distribution} |
---|
| 1437 | \label{app-madfm} |
---|
| 1438 | |
---|
| 1439 | The Normal, or Gaussian, distribution for mean $\mu$ and standard |
---|
| 1440 | deviation $\sigma$ can be written as |
---|
| 1441 | \[ |
---|
| 1442 | f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\ e^{-(x-\mu)^2/2\sigma^2}. |
---|
| 1443 | \] |
---|
| 1444 | |
---|
| 1445 | When one has a purely Gaussian signal, it is straightforward to |
---|
| 1446 | estimate $\sigma$ by calculating the standard deviation (or rms) of |
---|
| 1447 | the data. However, if there is a small amount of signal present on top |
---|
| 1448 | of Gaussian noise, and one wants to estimate the $\sigma$ for the |
---|
| 1449 | noise, the presence of the large values from the signal can bias the |
---|
| 1450 | estimator to higher values. |
---|
| 1451 | |
---|
| 1452 | An alternative way is to use the median ($m$) and median absolute deviation |
---|
| 1453 | from the median ($s$) to estimate $\mu$ and $\sigma$. The median is the |
---|
| 1454 | middle of the distribution, defined for a continuous distribution by |
---|
| 1455 | \[ |
---|
| 1456 | \int_{-\infty}^{m} f(x) \diff x = \int_{m}^{\infty} f(x) \diff x. |
---|
| 1457 | \] |
---|
| 1458 | From symmetry, we quickly see that for the continuous Normal |
---|
| 1459 | distribution, $m=\mu$. We consider the case henceforth of $\mu=0$, |
---|
| 1460 | without loss of generality. |
---|
| 1461 | |
---|
| 1462 | To find $s$, we find the distribution of the absolute deviation from |
---|
| 1463 | the median, and then find the median of that distribution. This |
---|
| 1464 | distribution is given by |
---|
| 1465 | \begin{eqnarray*} |
---|
| 1466 | g(x) &= &{\mbox{\rm distribution of }} |x|\\ |
---|
[18] | 1467 | &= &f(x) + f(-x),\ x\ge0\\ |
---|
[37] | 1468 | &= &\sqrt{\frac{2}{\pi\sigma^2}}\, e^{-x^2/2\sigma^2},\ x\ge0. |
---|
[3] | 1469 | \end{eqnarray*} |
---|
[18] | 1470 | So, the median absolute deviation from the median, $s$, is given by |
---|
[3] | 1471 | \[ |
---|
| 1472 | \int_{0}^{s} g(x) \diff x = \int_{s}^{\infty} g(x) \diff x. |
---|
| 1473 | \] |
---|
| 1474 | Now, $\int_{0}^{\infty}e^{-x^2/2\sigma^2} \diff x = \sqrt{\pi\sigma^2/2}$, and |
---|
| 1475 | so $\int_{s}^{\infty} e^{-x^2/2\sigma^2} \diff x = |
---|
| 1476 | \sqrt{\pi\sigma^2/2} - \int_{0}^{s} e^{-\frac{x^2}{2\sigma^2}} \diff x |
---|
| 1477 | $. Hence, to find $s$ we simply solve the following equation (setting $\sigma=1$ for |
---|
| 1478 | simplicity -- equivalent to stating $x$ and $s$ in units of $\sigma$): |
---|
| 1479 | \[ |
---|
| 1480 | \int_{0}^{s}e^{-x^2/2} \diff x - \sqrt{\pi/8} = 0. |
---|
| 1481 | \] |
---|
| 1482 | This is hard to solve analytically (no nice analytic solution exists |
---|
| 1483 | for the finite integral that I'm aware of), but straightforward to |
---|
| 1484 | solve numerically, yielding the value of $s=0.6744888$. Thus, to |
---|
| 1485 | estimate $\sigma$ for a Normally distributed data set, one can calculate |
---|
| 1486 | $s$, then divide by 0.6744888 (or multiply by 1.4826042) to obtain the |
---|
| 1487 | correct estimator. |
---|
| 1488 | |
---|
| 1489 | Note that this is different to solutions quoted elsewhere, |
---|
[37] | 1490 | specifically in \citet{meyer04:trunc}, where the same robust estimator |
---|
| 1491 | is used but with an incorrect conversion to standard deviation -- they |
---|
| 1492 | assume $\sigma = s\sqrt{\pi/2}$. This, in fact, is the conversion used |
---|
| 1493 | to convert the {\it mean} absolute deviation from the mean to the |
---|
| 1494 | standard deviation. This means that the cube noise in the \hipass\ |
---|
[80] | 1495 | catalogue (their parameter Rms$_{\rm cube}$) should be 18\% larger |
---|
| 1496 | than quoted. |
---|
[3] | 1497 | |
---|
[24] | 1498 | \section{How Gaussian noise changes with wavelet scale.} |
---|
| 1499 | \label{app-scaling} |
---|
| 1500 | |
---|
| 1501 | The key element in the wavelet reconstruction of an array is the |
---|
| 1502 | thresholding of the individual wavelet coefficient arrays. This is |
---|
| 1503 | usually done by choosing a level to be some number of standard |
---|
| 1504 | deviations above the mean value. |
---|
| 1505 | |
---|
| 1506 | However, since the wavelet arrays are produced by convolving the input |
---|
| 1507 | array by an increasingly large filter, the pixels in the coefficient |
---|
| 1508 | arrays become increasingly correlated as the scale of the filter |
---|
| 1509 | increases. This results in the measured standard deviation from a |
---|
| 1510 | given coefficient array decreasing with increasing scale. To calculate |
---|
| 1511 | this, we need to take into account how many other pixels each pixel in |
---|
| 1512 | the convolved array depends on. |
---|
| 1513 | |
---|
| 1514 | To demonstrate, suppose we have a 1-D array with $N$ pixel values |
---|
| 1515 | given by $F_i,\ i=1,...,N$, and we convolve it with the B$_3$-spline |
---|
[87] | 1516 | filter, defined by the set of coefficients |
---|
| 1517 | $\{1/16,1/4,3/8,1/4,1/16\}$. The flux of the $i$th pixel in the |
---|
| 1518 | convolved array will be |
---|
[24] | 1519 | \[ |
---|
| 1520 | F'_i = \frac{1}{16}F_{i-2} + \frac{1}{16}F_{i-2} + \frac{3}{8}F_{i} |
---|
| 1521 | + \frac{1}{4}F_{i-1} + \frac{1}{16}F_{i+2} |
---|
| 1522 | \] |
---|
| 1523 | and the flux of the corresponding pixel in the wavelet array will be |
---|
| 1524 | \[ |
---|
| 1525 | W'_i = F_i - F'_i = \frac{1}{16}F_{i-2} + \frac{1}{16}F_{i-2} + \frac{5}{8}F_{i} |
---|
| 1526 | + \frac{1}{4}F_{i-1} + \frac{1}{16}F_{i+2} |
---|
| 1527 | \] |
---|
| 1528 | Now, assuming each pixel has the same standard deviation |
---|
| 1529 | $\sigma_i=\sigma$, we can work out the standard deviation for the |
---|
| 1530 | coefficient array: |
---|
| 1531 | \[ |
---|
| 1532 | \sigma'_i = \sigma \sqrt{\left(\frac{1}{16}\right)^2 + \left(\frac{1}{4}\right)^2 |
---|
| 1533 | + \left(\frac{5}{8}\right)^2 + \left(\frac{1}{4}\right)^2 + \left(\frac{1}{16}\right)^2} |
---|
| 1534 | = 0.72349\ \sigma |
---|
| 1535 | \] |
---|
| 1536 | Thus, the first scale wavelet coefficient array will have a standard |
---|
| 1537 | deviation of 72.3\% of the input array. This procedure can be followed |
---|
| 1538 | to calculate the necessary values for all scales, dimensions and |
---|
| 1539 | filters used by Duchamp. |
---|
| 1540 | |
---|
| 1541 | Calculating these values is, therefore, a critical step in performing |
---|
| 1542 | the reconstruction. \citet{starck02:book} did so by simulating data sets |
---|
| 1543 | with Gaussian noise, taking the wavelet transform, and measuring the |
---|
| 1544 | value of $\sigma$ for each scale. We take a different approach, by |
---|
| 1545 | calculating the scaling factors directly from the filter coefficients |
---|
| 1546 | by taking the wavelet transform of an array made up of a 1 in the |
---|
| 1547 | central pixel and 0s everywhere else. The scaling value is then |
---|
| 1548 | derived by adding in quadrature all the wavelet coefficient values at |
---|
| 1549 | each scale. We give the scaling factors for the three filters |
---|
| 1550 | available to Duchamp on the following page. These values are |
---|
| 1551 | hard-coded into Duchamp, so no on-the-fly calculation of them is |
---|
| 1552 | necessary. |
---|
| 1553 | |
---|
| 1554 | Memory limitations prevent us from calculating factors for large |
---|
| 1555 | scales, particularly for the three-dimensional case (hence the -- |
---|
| 1556 | symbols in the tables). To calculate factors for |
---|
| 1557 | higher scales than those available, we note the following |
---|
| 1558 | relationships apply for large scales to a sufficient level of precision: |
---|
| 1559 | \begin{itemize} |
---|
| 1560 | \item 1-D: factor(scale $i$) = factor(scale $i-1$)$/\sqrt{2}$. |
---|
| 1561 | \item 2-D: factor(scale $i$) = factor(scale $i-1$)$/2$. |
---|
| 1562 | \item 1-D: factor(scale $i$) = factor(scale $i-1$)$/\sqrt{8}$. |
---|
| 1563 | \end{itemize} |
---|
| 1564 | |
---|
| 1565 | \newpage |
---|
| 1566 | \begin{itemize} |
---|
| 1567 | \item {\bf B$_3$-Spline Function:} $\{1/16,1/4,3/8,1/4,1/16\}$ |
---|
| 1568 | |
---|
| 1569 | \begin{tabular}{llll} |
---|
| 1570 | Scale & 1 dimension & 2 dimension & 3 dimension\\ \hline |
---|
| 1571 | 1 & 0.723489806 & 0.890796310 & 0.956543592\\ |
---|
| 1572 | 2 & 0.285450405 & 0.200663851 & 0.120336499\\ |
---|
| 1573 | 3 & 0.177947535 & 0.0855075048 & 0.0349500154\\ |
---|
| 1574 | 4 & 0.122223156 & 0.0412474444 & 0.0118164242\\ |
---|
| 1575 | 5 & 0.0858113122 & 0.0204249666 & 0.00413233507\\ |
---|
| 1576 | 6 & 0.0605703043 & 0.0101897592 & 0.00145703714\\ |
---|
| 1577 | 7 & 0.0428107206 & 0.00509204670 & 0.000514791120\\ |
---|
| 1578 | 8 & 0.0302684024 & 0.00254566946 & --\\ |
---|
| 1579 | 9 & 0.0214024008 & 0.00127279050 & --\\ |
---|
| 1580 | 10 & 0.0151336781 & 0.000636389722 & --\\ |
---|
| 1581 | 11 & 0.0107011079 & 0.000318194170 & --\\ |
---|
| 1582 | 12 & 0.00756682272 & -- & --\\ |
---|
| 1583 | 13 & 0.00535055108 & -- & --\\ |
---|
| 1584 | %14 & 0.00378341085 & -- & --\\ |
---|
| 1585 | %15 & 0.00267527545 & -- & --\\ |
---|
| 1586 | %16 & 0.00189170541 & -- & --\\ |
---|
| 1587 | %17 & 0.00133763772 & -- & --\\ |
---|
| 1588 | %18 & 0.000945852704 & -- & -- |
---|
| 1589 | \end{tabular} |
---|
| 1590 | |
---|
| 1591 | \item {\bf Triangle Function:} $\{1/4,1/2,1/4\}$ |
---|
| 1592 | |
---|
| 1593 | \begin{tabular}{llll} |
---|
| 1594 | Scale & 1 dimension & 2 dimension & 3 dimension\\ \hline |
---|
| 1595 | 1 & 0.612372436 & 0.800390530 & 0.895954449 \\ |
---|
| 1596 | 2 & 0.330718914 & 0.272878894 & 0.192033014\\ |
---|
| 1597 | 3 & 0.211947812 & 0.119779282 & 0.0576484078\\ |
---|
| 1598 | 4 & 0.145740298 & 0.0577664785 & 0.0194912393\\ |
---|
| 1599 | 5 & 0.102310944 & 0.0286163283 & 0.00681278387\\ |
---|
| 1600 | 6 & 0.0722128185 & 0.0142747506 & 0.00240175885\\ |
---|
| 1601 | 7 & 0.0510388224 & 0.00713319703 & 0.000848538128 \\ |
---|
| 1602 | 8 & 0.0360857673 & 0.00356607618 & 0.000299949455 \\ |
---|
| 1603 | 9 & 0.0255157615 & 0.00178297280 & -- \\ |
---|
| 1604 | 10 & 0.0180422389 & 0.000891478237 & -- \\ |
---|
| 1605 | 11 & 0.0127577667 & 0.000445738098 & -- \\ |
---|
| 1606 | 12 & 0.00902109930 & 0.000222868922 & -- \\ |
---|
| 1607 | 13 & 0.00637887978 & -- & -- \\ |
---|
| 1608 | %14 & 0.00451054902 & -- & -- \\ |
---|
| 1609 | %15 & 0.00318942978 & -- & -- \\ |
---|
| 1610 | %16 & 0.00225527449 & -- & -- \\ |
---|
| 1611 | %17 & 0.00159471988 & -- & -- \\ |
---|
| 1612 | %18 & 0.000112763724 & -- & -- |
---|
| 1613 | |
---|
| 1614 | \end{tabular} |
---|
| 1615 | |
---|
| 1616 | \item {\bf Haar Wavelet:} $\{0,1/2,1/2\}$ |
---|
| 1617 | |
---|
| 1618 | \begin{tabular}{llll} |
---|
| 1619 | Scale & 1 dimension & 2 dimension & 3 dimension\\ \hline |
---|
| 1620 | 1 & 0.707167810 & 0.433012702 & 0.935414347 \\ |
---|
| 1621 | 2 & 0.500000000 & 0.216506351 & 0.330718914\\ |
---|
| 1622 | 3 & 0.353553391 & 0.108253175 & 0.116926793\\ |
---|
| 1623 | 4 & 0.250000000 & 0.0541265877 & 0.0413398642\\ |
---|
| 1624 | 5 & 0.176776695 & 0.0270632939 & 0.0146158492\\ |
---|
| 1625 | 6 & 0.125000000 & 0.0135316469 & 0.00516748303 |
---|
| 1626 | |
---|
| 1627 | \end{tabular} |
---|
| 1628 | |
---|
| 1629 | |
---|
| 1630 | \end{itemize} |
---|
| 1631 | |
---|
| 1632 | \end{document} |
---|
| 1633 | |
---|
| 1634 | |
---|
| 1635 | |
---|
| 1636 | |
---|
| 1637 | |
---|
| 1638 | |
---|
| 1639 | |
---|
| 1640 | |
---|
| 1641 | |
---|
| 1642 | |
---|
| 1643 | |
---|
| 1644 | |
---|
| 1645 | |
---|
| 1646 | |
---|
| 1647 | |
---|
| 1648 | |
---|
| 1649 | |
---|
| 1650 | |
---|
| 1651 | |
---|
| 1652 | |
---|
| 1653 | |
---|
| 1654 | |
---|
| 1655 | |
---|
| 1656 | |
---|
| 1657 | |
---|
| 1658 | |
---|
| 1659 | |
---|
| 1660 | |
---|
| 1661 | |
---|
| 1662 | |
---|
| 1663 | |
---|
| 1664 | |
---|
| 1665 | |
---|