1 | \documentclass[12pt,a4paper]{article} |
---|
2 | |
---|
3 | %Define a test for doing PDF format -- use different code below |
---|
4 | \newif\ifPDF |
---|
5 | \ifx\pdfoutput\undefined\PDFfalse |
---|
6 | \else\ifnum\pdfoutput > 0\PDFtrue |
---|
7 | \else\PDFfalse |
---|
8 | \fi |
---|
9 | \fi |
---|
10 | |
---|
11 | \textwidth=161 mm |
---|
12 | \textheight=240 mm |
---|
13 | \topmargin=-18 mm |
---|
14 | \oddsidemargin=0 mm |
---|
15 | \parindent=6 mm |
---|
16 | |
---|
17 | \usepackage[sort]{natbib} |
---|
18 | \usepackage{lscape} |
---|
19 | \bibpunct[,]{(}{)}{;}{a}{}{,} |
---|
20 | |
---|
21 | \newcommand{\eg}{e.g.\ } |
---|
22 | \newcommand{\ie}{i.e.\ } |
---|
23 | \newcommand{\hi}{H{\sc i}} |
---|
24 | \newcommand{\hipass}{{\sc hipass}} |
---|
25 | \newcommand{\progname}{{\tt Duchamp}} |
---|
26 | \newcommand{\diff}{{\rm d}} |
---|
27 | \newcommand{\entrylabel}[1]{\mbox{\textsf{\bf{#1:}}}\hfil} |
---|
28 | \newenvironment{entry} |
---|
29 | {\begin{list}{}% |
---|
30 | {\renewcommand{\makelabel}{\entrylabel}% |
---|
31 | \setlength{\labelwidth}{30mm}% |
---|
32 | \setlength{\labelsep}{5pt}% |
---|
33 | \setlength{\itemsep}{2pt}% |
---|
34 | \setlength{\parsep}{2pt}% |
---|
35 | \setlength{\leftmargin}{35mm}% |
---|
36 | }% |
---|
37 | }% |
---|
38 | {\end{list}} |
---|
39 | |
---|
40 | |
---|
41 | \title{A Guide to the {\it Duchamp} Source Finding Software} |
---|
42 | \author{Matthew Whiting\\ |
---|
43 | %{\small \href{mailto:Matthew.Whiting@csiro.au}{Matthew.Whiting@csiro.au}}\\ |
---|
44 | Australia Telescope National Facility\\CSIRO} |
---|
45 | %\date{January 2006} |
---|
46 | \date{} |
---|
47 | |
---|
48 | % If we are creating a PDF, use different options for graphicx, hyperref. |
---|
49 | \ifPDF |
---|
50 | \usepackage[pdftex]{graphicx,color} |
---|
51 | \usepackage[pdftex]{hyperref} |
---|
52 | \hypersetup{colorlinks=true,% |
---|
53 | % citecolor=black,% |
---|
54 | % filecolor=black,% |
---|
55 | % linkcolor=black,% |
---|
56 | % urlcolor=black,% |
---|
57 | } |
---|
58 | \else |
---|
59 | \usepackage[dvips]{graphicx} |
---|
60 | \usepackage[dvips]{hyperref} |
---|
61 | \fi |
---|
62 | |
---|
63 | \begin{document} |
---|
64 | |
---|
65 | \maketitle |
---|
66 | \tableofcontents |
---|
67 | |
---|
68 | \newpage |
---|
69 | \section{Introduction and getting going quickly} |
---|
70 | |
---|
71 | This document gives details on the use of the program Duchamp. This |
---|
72 | has been written to provide a source-detection facility for |
---|
73 | spectral-line data cubes. The basic execution of Duchamp is to read |
---|
74 | in a FITS data cube, find sources in the cube, and produce a text |
---|
75 | file of positions, velocities and fluxes of the detections, as well as |
---|
76 | a postscript file of the spectra of each detection. |
---|
77 | |
---|
78 | So, you have a FITS cube, and you want to find the sources in it. What |
---|
79 | do you do? The first step is to make an input file that contains the |
---|
80 | list of parameters. Brief and detailed examples are shown in |
---|
81 | Appendix~\ref{app-input}. This provides the input file name, the various |
---|
82 | output files, and defines various parameters that control the |
---|
83 | execution. |
---|
84 | |
---|
85 | The program is run by the command |
---|
86 | \begin{quote} |
---|
87 | {\tt Duchamp -p [parameter file]} |
---|
88 | \end{quote} |
---|
89 | replacing {\tt [parameter file]} with the name of the file you have |
---|
90 | just created/copied. The program will then work away and give you the |
---|
91 | list of detections and their spectra. The program execution is |
---|
92 | summarise below, and detailed in \S\ref{sec-flow}. Information on |
---|
93 | inputs is in \S\ref{sec-param} and Appendix~\ref{app-param}, and |
---|
94 | descriptions of the output is in \S\ref{sec-output}. |
---|
95 | |
---|
96 | \subsection{A summary of the execution steps} |
---|
97 | |
---|
98 | The basic flow of the program is summarised here. All these steps are |
---|
99 | discussed in more detail in the following sections, so read on if |
---|
100 | you have questions! |
---|
101 | \begin{enumerate} |
---|
102 | \item The parameter file given on the command line is read in, and the |
---|
103 | parameters absorbed. |
---|
104 | \item From the parameter file, the FITS image is located and read in |
---|
105 | to memory. |
---|
106 | \item If requested, blank pixels are trimmed from the edges, and |
---|
107 | channels corresponding to bright (\eg Galactic) emission are |
---|
108 | excised. |
---|
109 | \item If requested, the baseline of each spectrum is removed. |
---|
110 | \item If the reconstruction method is requested, the cube is |
---|
111 | reconstructed using the {\it {\' a} trous} wavelet method. |
---|
112 | \item Searching for objects then takes place, using the requested |
---|
113 | thresholding method. |
---|
114 | \item The list of objects is trimmed by merging neighbouring objects |
---|
115 | and removing those deemed unacceptable. |
---|
116 | \item The baselines and trimmed pixels are replaced prior to output. |
---|
117 | \item The details on the detections are written to screen and to the |
---|
118 | requested output file. |
---|
119 | \item Maps showing the spatial location of the detections are written. |
---|
120 | \item The integrated spectra of each detection are written to a |
---|
121 | postscript file. |
---|
122 | \item If requested, the reconstructed array can be written to a new |
---|
123 | FITS file. |
---|
124 | \end{enumerate} |
---|
125 | |
---|
126 | \subsection{Guide to terminology} |
---|
127 | |
---|
128 | First, a brief note on the use of terminology in this guide. Duchamp |
---|
129 | is designed to work on FITS ``cubes''. These are FITS\footnote{FITS is |
---|
130 | the Flexible Image Transport System -- see \citet{hanisch01} or |
---|
131 | websites such as |
---|
132 | \href{http://fits.cv.nrao.edu/FITS.html}{http://fits.cv.nrao.edu/FITS.html} |
---|
133 | for details.} image arrays with three dimensions -- they are assumed |
---|
134 | to have the following form: the first two dimensions (referred to as |
---|
135 | $x$ and $y$) are spatial directions (that is, relating to the position |
---|
136 | on the sky), while the third dimension, $z$, is the spectral |
---|
137 | direction, which can correspond to frequency, wavelength, or velocity. |
---|
138 | |
---|
139 | Each spatial pixel (a given $(x,y)$ coordinate) can be said to be a |
---|
140 | single spectrum, while a slice through the cube perpendicular to the |
---|
141 | spectral direction at a given $z$-value is a single channel (the 2-D |
---|
142 | image is a channel map). |
---|
143 | |
---|
144 | Features that are detected are assumed to be positive. If one wants to |
---|
145 | search for absorption (negative) features, try multiplying your cube |
---|
146 | by $-1$ before running Duchamp. |
---|
147 | |
---|
148 | Note that it is possible to run Duchamp on a two-dimensional image |
---|
149 | (\ie one with no frequency or velocity information), or indeed a |
---|
150 | one-dimensional array, and many of the features of the program will |
---|
151 | work fine. The focus, however, is on object detection in three |
---|
152 | dimensions. |
---|
153 | |
---|
154 | \subsection{Why ``Duchamp''?} |
---|
155 | |
---|
156 | Well, it's important for a program to have a name, and it certainly |
---|
157 | beats the initial version of ``cubefind''. I had planned to call it |
---|
158 | ``Picasso'' (as in the father of cubism), but sadly this had already |
---|
159 | been used before \citep{minchin99}. So I settled on naming it after |
---|
160 | Marcel Duchamp, another cubist, but also one of the first artists to |
---|
161 | work with ``found objects''. |
---|
162 | |
---|
163 | \section{User Inputs} |
---|
164 | \label{sec-param} |
---|
165 | |
---|
166 | Input to the program is provided by means of a parameter file. Parameters |
---|
167 | are listed in the file, followed by the value that should be assigned |
---|
168 | to them. The syntax used is {\tt paramName value}. The file is not |
---|
169 | case-sensitive, and lines in the input file that start with {\tt \#} are |
---|
170 | ignored. If a parameter is listed more than once, the latter value is |
---|
171 | used, but otherwise the order in which the parameters are listed in the |
---|
172 | input file is arbitrary. |
---|
173 | |
---|
174 | If a parameter is not listed, the default value is assumed. The |
---|
175 | defaults are chosen to provide a good result (using the reconstruction |
---|
176 | method), so the user doesn't need to specify many new parameters in |
---|
177 | the input file. Note that the image file {\bf must} be specified! The |
---|
178 | parameters that can be set are listed in Appendix~\ref{app-param}, |
---|
179 | with their default values in parentheses. |
---|
180 | |
---|
181 | The 'flag' parameters are stored as {\tt bool} variables, and so are |
---|
182 | either {\tt true = 1} or {\tt false = 0}. Currently the program only |
---|
183 | reads them from the file as integers, and so they should be entered in |
---|
184 | the file as 0 or 1 (see example file in Appendix~\ref{app-input}). |
---|
185 | |
---|
186 | \section{What the program is doing} |
---|
187 | \label{sec-flow} |
---|
188 | |
---|
189 | The execution flow of the program is detailed here, indicating the |
---|
190 | main algorithmic steps that are used. The program is written in C/C++ |
---|
191 | and makes use of the {\sc cfitsio}, {\sc wcslib} and {\sc pgplot} |
---|
192 | libraries. |
---|
193 | |
---|
194 | %\subsection{Parameter input} |
---|
195 | % |
---|
196 | %The user provides parameters that govern the selection of files and |
---|
197 | %the parameters used by the various subroutines in the program. This is |
---|
198 | %done via a parameter file, and the parameters are stored in a C++ |
---|
199 | %class for use throughout the program. The form of the parameter file is |
---|
200 | %discussed in \S\ref{sec-param}, and the parameters themselves are |
---|
201 | %listed in Appendix~\ref{app-param}. |
---|
202 | |
---|
203 | \subsection{Image input} |
---|
204 | |
---|
205 | The cube is read in using basic {\sc cfitsio} commands, and stored as |
---|
206 | an array in a special C++ class structure. This class keeps track of |
---|
207 | the list of detected objects, as well as any reconstructed arrays that |
---|
208 | are made (see \S\ref{sec-recon}). The World Coordinate System (WCS) |
---|
209 | information for the cube is also obtained from the FITS header by {\sc |
---|
210 | wcslib} functions \citep{greisen02, calabretta02}, and this |
---|
211 | information, in the form of a {\tt wcsprm} structure, is also stored |
---|
212 | in the same class. |
---|
213 | |
---|
214 | A sub-section of an image can be requested. This is done via the {\tt |
---|
215 | subsection} parameter in the parameter file. The generalised form of |
---|
216 | the subsection that is used by {\sc cfitsio} is {\tt |
---|
217 | [x1:x2:dx,y1:y2:dy,z1:z2:dz]}, such that the x-coordinates run from |
---|
218 | {\tt x1} to {\tt x2} (inclusive), with steps of {\tt dx}. The step |
---|
219 | value can be omitted (so a subsection of the form {\tt |
---|
220 | [2:50,2:50,10:1000]} is still valid). Duchamp does not at this |
---|
221 | stage deal with the presence of steps in the subsection string, and |
---|
222 | any that are present are removed before the file is opened. |
---|
223 | |
---|
224 | If one wants the full range of a value then replace the range with an |
---|
225 | asterisk, \eg {\tt [2:50,2:50,*]}. If one wants to use just a |
---|
226 | subsection, one must set {\tt flagSubsection = 1}. A complete |
---|
227 | description of the section syntax can be found at the {\sc fitsio} web |
---|
228 | site |
---|
229 | \footnote{ |
---|
230 | \href{http://heasarc.gsfc.nasa.gov/docs/software/fitsio/c/c\_user/node90.html}% |
---|
231 | {http://heasarc.gsfc.nasa.gov/docs/software/fitsio/c/c\_user/node90.html}}. |
---|
232 | |
---|
233 | \subsection{Image modification} |
---|
234 | \label{sec-modify} |
---|
235 | |
---|
236 | Several modifications to the cube can be made that improve the |
---|
237 | execution and efficiency of Duchamp (these are optional -- their |
---|
238 | use is indicated by the relevant flags set in the input parameter |
---|
239 | file). |
---|
240 | |
---|
241 | \subsubsection{Milky-Way removal} |
---|
242 | |
---|
243 | First, a single set of contiguous channels can be removed -- these may |
---|
244 | exhibit very strong emission, such as that from the Milky Way as seen |
---|
245 | in extragalactic \hi\ cubes (hence the references to ``Milky Way'' in |
---|
246 | relation to this task -- apologies to Galactic astronomers!). Such |
---|
247 | dominant channels will both produce many unnecessary, uninteresting |
---|
248 | and large (in size and hence in memory usage) detections, and will |
---|
249 | also affect any reconstruction that is performed (see next |
---|
250 | section). The use of this feature is controlled by the {\tt flagMW} |
---|
251 | parameter, and the exact channels concerned are able to be set by the |
---|
252 | user (using {\tt maxMW} and {\tt minMW}). When employed, the flux in |
---|
253 | these channels is set to zero. The information in those channels is |
---|
254 | not kept. |
---|
255 | |
---|
256 | \subsubsection{Blank pixel removal} |
---|
257 | |
---|
258 | Second, the cube is trimmed of any BLANK pixels that pad the image |
---|
259 | out to a rectangular shape. This is also optional, being determined by |
---|
260 | the {\tt flagBlankPix} parameter. The value for these pixels is read from |
---|
261 | the FITS header (using the BLANK, BSCALE and BZERO keywords), but if |
---|
262 | these are not present then the value can be specified by the user in |
---|
263 | the parameter file. If these blank pixels are stored as NaNs, then a |
---|
264 | normal number will be substituted (allowing these pixels to be |
---|
265 | accurately removed without adverse effects). [NOTE: this appears not |
---|
266 | to be working correctly at time of writing. If your data has |
---|
267 | unspecified BLANKs, be wary...] |
---|
268 | |
---|
269 | This stage is particularly important for the reconstruction step, as |
---|
270 | lots of BLANK pixels on the edges will smooth out features in the |
---|
271 | wavelet calculation stage. The trimming will also reduce the size of |
---|
272 | the cube's array, speeding up the execution. The amount of trimming is |
---|
273 | recorded, and these pixels are added back in once the source-detection |
---|
274 | is completed (so that quoted pixel positions are applicable to the |
---|
275 | original cube). |
---|
276 | |
---|
277 | Rows and columns are trimmed one at a time until the first non-BLANK |
---|
278 | pixel is reached, so that the image remains rectangular. In practice, |
---|
279 | this means that there will be BLANK pixels left in the trimmed image |
---|
280 | (if the non-BLANK region is non-rectangular). However, these are |
---|
281 | ignored in all further calculations done on the cube. |
---|
282 | |
---|
283 | \subsubsection{Baseline removal} |
---|
284 | |
---|
285 | Finally, the user may request the removal of baselines from the |
---|
286 | spectra, via the parameter {\tt flagBaseline}. This may be necessary |
---|
287 | if there is a strong baseline ripple present, which can result in |
---|
288 | spurious detections on the high points of the ripple. The baseline is |
---|
289 | calculated from a wavelet reconstruction procedure (see |
---|
290 | \S\ref{sec-recon}) that keeps only the two largest scales. This is |
---|
291 | done separately for each spatial pixel (\ie for each spectrum in the |
---|
292 | cube), and the baselines are stored and added back in before any |
---|
293 | output is done. In this way the quoted fluxes and displayed spectra |
---|
294 | are as one would see from the input cube itself -- even though the |
---|
295 | detection (and reconstruction if applicable) is done on the |
---|
296 | baseline-removed cube. |
---|
297 | |
---|
298 | \subsection{Image reconstruction} |
---|
299 | \label{sec-recon} |
---|
300 | |
---|
301 | This is an optional step. The user can direct Duchamp to |
---|
302 | reconstruct the data cube using the {\it {\`a} trous} wavelet |
---|
303 | procedure. A good description of the procedure can be found in |
---|
304 | \citet{starck02:book}. This is an effective way of removing a |
---|
305 | lot of the noise in the image, but at this stage is relatively time- |
---|
306 | and memory-intensive. The steps in the procedure are as follows: |
---|
307 | \begin{enumerate} |
---|
308 | \item Set the reconstructed array to 0 everywhere. |
---|
309 | \item The cube is discretely convolved with a given filter |
---|
310 | function. This is determined from the parameter file via the {\tt |
---|
311 | filterCode} parameter -- see Appendix~\ref{app-param} for details on |
---|
312 | the filters available. |
---|
313 | \item The wavelet coefficients are calculated by taking the difference |
---|
314 | between the convolved array and the input array. |
---|
315 | \item If the wavelet coefficients at a given point are above the |
---|
316 | threshold requested (given by {\tt snrRecon} as the number of |
---|
317 | $\sigma$ above the mean and adjusted to the current scale), add |
---|
318 | these to the reconstructed array. |
---|
319 | \item The separation of the filter coefficients is doubled. |
---|
320 | \item The procedure is repeated from step 2, using the convolved array |
---|
321 | as the input array. |
---|
322 | \item Continue until the required maximum number of scales is reached. |
---|
323 | \item Add the final smoothed (\ie convolved) array to the |
---|
324 | reconstructed array. This provides the ``DC offset'', as each of the |
---|
325 | wavelet coefficient arrays will have zero mean. |
---|
326 | \end{enumerate} |
---|
327 | |
---|
328 | It is important to note that the {\it {\`a} trous} decomposition is |
---|
329 | an example of a ``redundant'' transformation. If no thresholding is |
---|
330 | performed, the sum of all the wavelet coefficient arrays and the final |
---|
331 | smoothed array is identical to the input array. The thresholding thus |
---|
332 | removes only the unwanted structure in the array. |
---|
333 | |
---|
334 | The statistics of the cube are estimated using robust methods, to |
---|
335 | avoid corruption by strong outlying points. The mean is actually |
---|
336 | estimated by the median, while the median absolute deviation from the |
---|
337 | median (MADFM) is calculated and corrected assuming Gaussianity to |
---|
338 | estimate the standard deviation $\sigma$. The Gaussianity (or |
---|
339 | Normality) assumption is critical, as the MADFM does not give the same |
---|
340 | value as the usual rms or standard deviation value -- for a normal |
---|
341 | distribution $N(\mu,\sigma)$ we find MADFM$=0.6744888\sigma$. The |
---|
342 | difference between the MADFM and $\sigma$ is corrected for, so the |
---|
343 | user need only think in the usual multiples of $\sigma$ when setting |
---|
344 | {\tt snrRecon}. See Appendix~\ref{app-madfm} for a derivation of this |
---|
345 | value. |
---|
346 | |
---|
347 | When thresholding the different wavelet scales, the value of $\sigma$ |
---|
348 | as measured from the input array needs to be scaled to account for the |
---|
349 | increased amount of correlation between neighbouring pixels (due to |
---|
350 | the convolution). See Appendix~\ref{app-madfm} for details on this |
---|
351 | scaling. |
---|
352 | |
---|
353 | The user can also select the minimum scale to be used in the |
---|
354 | reconstruction -- the first scale exhibits the highest frequency |
---|
355 | variations, and so ignoring this one can sometimes be beneficial in |
---|
356 | removing excess noise. The default, however, is to use all scales |
---|
357 | ({\tt minscale = 1}). |
---|
358 | |
---|
359 | The reconstruction has at least two iterations. The first iteration |
---|
360 | makes a first pass at the wavelet reconstruction (the process outlined |
---|
361 | in the 8 stages above), but the residual array will inevitably have |
---|
362 | some structure still in it, so the wavelet filtering is done on the |
---|
363 | residual, and any significant wavelet terms are added to the final |
---|
364 | reconstruction. This step is repeated until the change in the $\sigma$ |
---|
365 | of the background is less than some fiducial amount. |
---|
366 | |
---|
367 | The user can optionally select to save the reconstructed image as a |
---|
368 | FITS file -- at the moment this is just saved in the same directory as |
---|
369 | the input file, so it won't work if the user does not have write |
---|
370 | permissions on that directory. See Appendix~\ref{app-param} for details |
---|
371 | on the naming of the output image. The residual image, which is the |
---|
372 | difference between the input image and the reconstructed image, can |
---|
373 | also be saved in the same manner. |
---|
374 | |
---|
375 | Finally, note that any BLANK pixels that are still in the cube |
---|
376 | will not be altered by the reconstruction -- they will be left as |
---|
377 | BLANK so that the shape of the valid part of the cube is preserved. |
---|
378 | |
---|
379 | \subsection{Searching the image} |
---|
380 | \label{sec-detection} |
---|
381 | |
---|
382 | The image is searched for detections in two ways: spectrally (a |
---|
383 | 1-dimensional search in the spectrum in each spatial pixel), and |
---|
384 | spatially (a 2-dimensional search in the spatial image in each |
---|
385 | channel). In both cases, the algorithm finds connected pixels that are |
---|
386 | above the user-specified threshold. In the case of the spatial image |
---|
387 | search, the algorithm of \citet{lutz80} is used to raster scan through |
---|
388 | the image and connect groups of pixels on neighbouring rows. |
---|
389 | |
---|
390 | Note that this algorithm cannot be applied directly to a 3-dimensional |
---|
391 | case, as it requires that objects are completely nested in a row: that |
---|
392 | is, if you are scanning along a row, and one object finishes and |
---|
393 | another starts, you know that you will not get back to the first one |
---|
394 | (if at all) until the second is finished for that |
---|
395 | row. Three-dimensional data does not have this property, which is why |
---|
396 | we break up the searching into 1- and 2-dimensional cases. |
---|
397 | |
---|
398 | The determination of the threshold is done in one of two ways. The |
---|
399 | first way is a simple sigma-clipping, where a threshold defined as |
---|
400 | $n\sigma$ above the mean is set and pixels above this threshold are |
---|
401 | flagged as detected. As before, the value for $\sigma$ is estimated by |
---|
402 | the MADFM, and corrected by the ratio derived in |
---|
403 | Appendix~\ref{app-madfm}. |
---|
404 | |
---|
405 | The second method uses the False Discovery Rate (FDR) technique |
---|
406 | \citep{miller01,hopkins02}, whose basis we briefly detail here. The |
---|
407 | false discovery rate (given by the number of false detections divided |
---|
408 | by the total number of detections) is fixed at a certain value |
---|
409 | $\alpha$ (\eg $\alpha=0.05$ implies 5\% of detections are false |
---|
410 | positives). In practice, an $\alpha$ value is chosen, and the ensemble |
---|
411 | average FDR (\ie $<FDR>$) when the method is used will be less than |
---|
412 | $\alpha$. One calculates $p$ -- the probability, assuming the null |
---|
413 | hypothesis is true, of obtaining a test statistic as extreme as the |
---|
414 | pixel value (the observed test statistic) -- for each pixel, and sorts |
---|
415 | them in increasing order. One then calculates $d$ where |
---|
416 | \[ |
---|
417 | d = \max_j \left\{ j : P_j < \frac{j\alpha}{c_N N} \right\}, |
---|
418 | \] |
---|
419 | and then rejects all hypotheses whose $p$-values are less than or equal |
---|
420 | to $P_d$. (So a $P_i<P_d$ will be rejected even if $P_i \geq |
---|
421 | j\alpha/c_N N$.) Note that ``reject hypothesis'' here means ``accept |
---|
422 | the pixel as an object pixel'' (\ie we are rejecting the null |
---|
423 | hypothesis that the pixel belongs to the background). |
---|
424 | |
---|
425 | The $c_N$ values here are normalisation constants that depend on the |
---|
426 | correlated nature of the pixel values. If all the pixels are |
---|
427 | uncorrelated, then $c_N=1$. If $N$ pixels are correlated, then their |
---|
428 | tests will be dependent on each other, and so $c_N = \sum_{i=1}^N |
---|
429 | i^{-1}$. \citet{hopkins02} consider real radio data, where the pixels |
---|
430 | are correlated over the beam. In this case the sum is made over the |
---|
431 | $N$ pixels that make up the beam. The value of $N$ is calculated from |
---|
432 | the FITS header (if the correct keywords -- BMAJ, BMIN -- are not |
---|
433 | present, a default value of 10 pixels is assumed). |
---|
434 | |
---|
435 | If a reconstruction has been made, the residuals (defined as original |
---|
436 | $-$ reconstruction) are used to estimate the noise parameters of the |
---|
437 | cube. Otherwise they are estimated directly from the cube itself. In |
---|
438 | both cases, the median is used as a robust estimator of the mean |
---|
439 | value, although the $\sigma$ is estimated by the standard deviation |
---|
440 | (of the residual array, in the case of the reconstruction, but of the |
---|
441 | original array otherwise). |
---|
442 | |
---|
443 | Detections must have a minimum number of pixels to be counted. This |
---|
444 | minimum number is given by the input parameters {\tt minPix} (for |
---|
445 | 2-dimensional searches) and {\tt minChannels} (for 1-dimensional |
---|
446 | searches). Note again that only positive thresholding is done -- |
---|
447 | negative features are not searched for. |
---|
448 | |
---|
449 | \subsection{Merging detected objects} |
---|
450 | \label{sec-merger} |
---|
451 | |
---|
452 | The searching step produces a list of detections that will have many |
---|
453 | repeated detections of a given object -- for instance, spectral |
---|
454 | detections in adjacent pixels of the same object and/or spatial |
---|
455 | detections in neighbouring channels. These are then combined in an |
---|
456 | algorithm that matches all objects judged to be ``close''. This |
---|
457 | determination is made in one of two ways. |
---|
458 | |
---|
459 | One way is to define two thresholds -- one spatial and one in velocity |
---|
460 | -- and say that two objects should be merged if there is at least one |
---|
461 | pair of pixels that lie within these threshold distances of each |
---|
462 | other. These thresholds are specified by the parameters {\tt |
---|
463 | threshSpatial} and {\tt threshVelocity} (in units of pixels and |
---|
464 | channels respectively). |
---|
465 | |
---|
466 | Alternatively, the spatial requirement can be changed to say that |
---|
467 | there must be a pair of pixels that are {\it adjacent} -- a stricter, |
---|
468 | but more realistic requirement, particularly when the spatial pixels |
---|
469 | have a large angular size (as is the case for \hi\ surveys). This |
---|
470 | method can be selected by setting the parameter |
---|
471 | {\tt flagAdjacent} to 1 (\ie {\tt true}) in the parameter file. The |
---|
472 | velocity thresholding is done in the same way as the first option. |
---|
473 | |
---|
474 | Once the detections have been merged, they may be ``grown''. This is a |
---|
475 | process of increasing the size of the detection by adding adjacent |
---|
476 | pixels that are above some secondary threshold. This threshold is |
---|
477 | lower than the one used for the initial detection, but above the noise |
---|
478 | level, so that faint pixels are only detected when they are close to a |
---|
479 | bright pixel. The value of this threshold is a possible input |
---|
480 | parameter ({\tt growthCut}), with a default value of $1.5\sigma$. The |
---|
481 | use of the growth algorithm is controlled by the {\tt flagGrowth} |
---|
482 | parameter -- the default value of which is {\tt false}. If the |
---|
483 | detections are grown, they are sent through the merging algorithm a |
---|
484 | second time, to pick up any detections that now overlap or have grown |
---|
485 | over each other. |
---|
486 | |
---|
487 | Finally, to be accepted, the detections must span {\it both} a minimum |
---|
488 | number of channels (to remove any spurious single-channel spikes that |
---|
489 | may be present), and a minimum number of spatial pixels. These |
---|
490 | numbers, as for the original detection step, are set with the {\tt |
---|
491 | minChannels} and {\tt minPix} parameters. The channel requirement |
---|
492 | means there must be at least one set of this many consecutive channels |
---|
493 | in the source for it to be accepted. |
---|
494 | |
---|
495 | \section{Outputs} |
---|
496 | \label{sec-output} |
---|
497 | |
---|
498 | \subsection{During execution} |
---|
499 | |
---|
500 | Duchamp provides the user with feedback whilst it is running, to |
---|
501 | keep the user informed on the progress of the analysis. Most of this |
---|
502 | consists of self-explanatory messages about the particular stage the |
---|
503 | program is up to. The relevant parameters are printed to the screen at |
---|
504 | the start (once the file has been successfully read in), so the user |
---|
505 | is able to make a quick check that the setup is correct. |
---|
506 | |
---|
507 | If the cube is being trimmed (\S\ref{sec-modify}), the resulting |
---|
508 | dimensions are printed to indicate how much has been trimmed. If a |
---|
509 | reconstruction is being done, a continually updating message shows the |
---|
510 | current iteration and scale (compared to the maximum scale). |
---|
511 | |
---|
512 | During the searching algorithms, the progress through the 1D and 2D |
---|
513 | searches are shown. When the searches have completed, |
---|
514 | the number of objects found in both the 1D and 2D searches are |
---|
515 | reported (see \S\ref{sec-detection} for details). |
---|
516 | |
---|
517 | In the merging process (where multiple detections of the same object |
---|
518 | are combined -- see \S\ref{sec-merger}), two stages of output |
---|
519 | occur. The first is when each object in the list is compared with all |
---|
520 | others. The output shows two numbers: the first being how far through |
---|
521 | the list we are, and the second being the length of the list. As the |
---|
522 | algorithm proceeds, the first number should increase and the second |
---|
523 | should decrease (as objects are being combined). When the numbers |
---|
524 | meet, the second phase begins, of removing multiply-appearing pixels |
---|
525 | in each object and removing objects not meeting the minimum channels |
---|
526 | requirement. During this phase, the total number of accepted objects |
---|
527 | is shown, which should steadily increase until all have been accepted |
---|
528 | or rejected. Note that these steps can be very quick for small numbers |
---|
529 | of detections. |
---|
530 | |
---|
531 | Since this continual printing to screen has some overhead of time and |
---|
532 | CPU involved, the user can elect to not print this information by |
---|
533 | setting the parameter {\tt verbose = 0}. In this case, the user is |
---|
534 | still informed as to the steps being undertaken, but the details of |
---|
535 | the progress are not shown. |
---|
536 | |
---|
537 | \subsection{Results} |
---|
538 | |
---|
539 | Finally, we get to the results -- the reason for running Duchamp in |
---|
540 | the first place. Once the detection list is finalised, it is sorted by |
---|
541 | the mean velocity of the detections (or, if there is no good WCS with |
---|
542 | the cube, by the mean Z-pixel position). The results are then |
---|
543 | printed to the screen and to the output file, denoted by the {\tt |
---|
544 | OutFile} parameter. The results list, an example of which can be seen |
---|
545 | in Appendix~\ref{app-output}, contains the following columns: |
---|
546 | |
---|
547 | \begin{entry} |
---|
548 | \item[Obj\#] The ID number of the detection (simply the sequential |
---|
549 | count for the list, which is ordered by increasing velocity). |
---|
550 | \item[Name] The IAU-format name of the detection (based on the RA \& |
---|
551 | Dec). |
---|
552 | \item[X] The average X-pixel position. |
---|
553 | \item[Y] The average Y-pixel position. |
---|
554 | \item[Z] The average Z-pixel position. |
---|
555 | \item[RA] The Right Ascension of the centre of the object. |
---|
556 | \item[DEC] The Declination of the centre of the object. |
---|
557 | \item[w\_RA] The width of the object in Right Ascension [arcmin]. |
---|
558 | \item[w\_DEC] The width of the object in Declination [arcmin]. |
---|
559 | \item[VEL] The mean velocity of the object [km/s]. |
---|
560 | \item[w\_VEL] The full velocity width of the detection (max channel |
---|
561 | $-$ min channel, in velocity units [km/s]). |
---|
562 | \item[X1, X2] The minimum and maximum X-pixel coordinates. |
---|
563 | \item[Y1, Y2] The minimum and maximum Y-pixel coordinates. |
---|
564 | \item[Z1, Z2] The minimum and maximum Z-pixel coordinates. |
---|
565 | \item[Npix] The number of pixels \& channels (\ie distinct $(x,y,z)$ |
---|
566 | coordinates) in the detection. |
---|
567 | \item[F\_tot] The integrated flux over the object, in the units of the |
---|
568 | FITS file. %This is calculated |
---|
569 | \item[F\_peak] The peak flux over the object, in the units of the FITS |
---|
570 | file. |
---|
571 | \end{entry} |
---|
572 | If the WCS is not valid (\ie is not present or does not have all the |
---|
573 | necessary information), the Name, RA, DEC, VEL and related columns are not |
---|
574 | printed, but the pixel coordinates are still provided. |
---|
575 | |
---|
576 | \begin{figure}[t] |
---|
577 | \begin{center} |
---|
578 | \includegraphics[width=\textwidth]{example_spectrum} |
---|
579 | \end{center} |
---|
580 | \caption{\footnotesize An example of the spectrum output. Note several |
---|
581 | of the features discussed in the text: the removal of the Milky Way |
---|
582 | emission around 0 km/s; the red lines indicating the reconstructed |
---|
583 | spectrum; the blue dashed lines indicating the spectral extent of |
---|
584 | the detection; and the blue border showing its spatial extent on the |
---|
585 | 0th moment map.} |
---|
586 | \label{fig-spect} |
---|
587 | \end{figure} |
---|
588 | |
---|
589 | Two alternative results files can also be requested. One option is a |
---|
590 | VOTable-format XML file, containing just the RA, Dec, Velocity and the |
---|
591 | corresponding widths of the detections, as well as the fluxes. The |
---|
592 | user should set {\tt flagVOT = 1}, and put the desired filename in the |
---|
593 | parameter {\tt votFile} -- note that the default is for it not to be |
---|
594 | produced. This file should be compatible with all Virtual Observatory |
---|
595 | tools (such as Aladin\footnote{ Aladin can be found on the web at |
---|
596 | \href{http://aladin.u-strasbg.fr/}{http://aladin.u-strasbg.fr/}}). The |
---|
597 | second option is an annotation file for use with the Karma toolkit of |
---|
598 | visualisation tools (in particular, with {\tt kvis}). This will draw a |
---|
599 | circle at the position of each detection, and number it according to |
---|
600 | the Obj\# given above. To use, the user should set {\tt flagKarma = 1}, |
---|
601 | and put the desired filename in the parameter {\tt karmaFile} -- again, |
---|
602 | the default is for it not to be produced. |
---|
603 | |
---|
604 | As the program is running, it also (optionally) records the detections |
---|
605 | made in each individual spectrum or channel (see |
---|
606 | \S\ref{sec-detection} for details on this process). This is |
---|
607 | recorded in the file denoted by the parameter {\tt LogFile}. This file |
---|
608 | does not include the columns {\tt Name, RA, DEC, w\_RA, w\_DEC, VEL, |
---|
609 | w\_VEL}. This file is designed primarily for diagnostic purposes: \eg |
---|
610 | to see if a given set of pixels is detected in, say, one channel |
---|
611 | image, but does not survive the merging process. The list of pixels |
---|
612 | (and their fluxes) in the final detection list are also printed to |
---|
613 | this file, again for diagnostic purposes. This feature can be turned |
---|
614 | off by setting {\tt flagLog = false}. (This may be a good idea if you |
---|
615 | are not interested in its contents, as it can be a large file.) |
---|
616 | |
---|
617 | \begin{figure}[!t] |
---|
618 | \begin{center} |
---|
619 | \includegraphics[width=\textwidth]{example_moment_map} |
---|
620 | \end{center} |
---|
621 | \caption{\footnotesize An example of the moment map created by |
---|
622 | Duchamp. The full extent of the cube is covered, and the 0th moment |
---|
623 | of each object is shown (integrated individually over all the |
---|
624 | detected channels).} |
---|
625 | \label{fig-moment} |
---|
626 | \end{figure} |
---|
627 | |
---|
628 | As well as the output data file, a postscript file is created that |
---|
629 | shows the integrated spectra of each detection, together with a small |
---|
630 | cutout image (0th moment) and basic information of the detection. If |
---|
631 | the cube was reconstructed, the spectrum from the reconstruction is |
---|
632 | shown in red, over the top of the original spectrum. The spectral |
---|
633 | extent of the detection is indicated with green lines, and a zoom is |
---|
634 | shown in a separate window. The cutout image can optionally include a |
---|
635 | border around the spatial pixels that are in the detection (turned on |
---|
636 | and off by the parameter {\tt drawBorders}). It also includes a scale |
---|
637 | bar in the bottom left corner to indicate size -- it is 30~arcmin |
---|
638 | long. An example detection can be seen below in Fig.~\ref{fig-spect}. |
---|
639 | |
---|
640 | Finally, a couple of images are optionally produced: a 0th moment map |
---|
641 | of the cube, combining just the detected channels in each object, |
---|
642 | showing the integrated flux in grey-scale; and a ``detection image'', |
---|
643 | a grey-scale image where the pixel values are the number of channels |
---|
644 | that spatial pixel is detected in. In both cases, if {\tt drawBorders = |
---|
645 | true}, a border is drawn around the spatial extent of each |
---|
646 | detection. An example moment map is shown in Fig.~\ref{fig-moment}. |
---|
647 | The production or otherwise of these images is governed by the {\tt |
---|
648 | flagMaps} parameter. |
---|
649 | |
---|
650 | The purpose of these images are to provide a visual guide to where the |
---|
651 | detections have been made, and, particularly in the case of the moment |
---|
652 | map, to provide an indication of the strength of the source. In both |
---|
653 | cases, the detections are numbered (in the same way as the output |
---|
654 | list), and the spatial borders are marked out as for the cutout images |
---|
655 | in the spectra file. Both these images are saved as postscript files |
---|
656 | (given by the parameters {\tt momentMap} and {\tt detectionMap} |
---|
657 | respectively), with the latter also displayed in a {\sc pgplot} |
---|
658 | window (regardless of the state of {\tt flagMaps}). |
---|
659 | |
---|
660 | \section{Notes and hints on the use of Duchamp} |
---|
661 | |
---|
662 | In using Duchamp, the user has to make a number of decisions about |
---|
663 | the way the program runs. This section is designed to give the user |
---|
664 | some idea about what the various selections do... |
---|
665 | |
---|
666 | The main choice is whether or not to use the wavelet |
---|
667 | reconstruction. The main benefits of this are the marked reduction in |
---|
668 | the noise level, leading to regularly-shaped detections, and good |
---|
669 | reliability for faint sources. The main drawback with its use is the |
---|
670 | long execution time: to reconstruct a $170\times160\times1024$ |
---|
671 | (\hipass) cube often requires three iterations and takes about 20-25 |
---|
672 | minutes. The searching part of the procedure is much quicker (although |
---|
673 | see the note on merging, below), so if one uses the FDR method on the |
---|
674 | un-reconstructed cube, the execution time is only a couple of minutes. |
---|
675 | |
---|
676 | A further drawback with the reconstruction is that it is susceptible |
---|
677 | to edge effects. If the valid area in the cube (\ie the part that is |
---|
678 | not BLANK) has very curved edges (such as the \hipass\ polar cap cube, |
---|
679 | H001, which has a roughly circular shape after gridding), the |
---|
680 | convolution can produce artefacts in the reconstruction that mimic |
---|
681 | the edges and lead to some spurious sources. Caution is advised with |
---|
682 | such data -- the user is advised to check carefully the reconstructed |
---|
683 | cube for the presence of such artefacts. |
---|
684 | |
---|
685 | If one chooses the reconstruction method, a further decision is |
---|
686 | required on the signal-to-noise cutoff used in determining acceptable |
---|
687 | wavelet coefficients. A larger value will remove more noise from the |
---|
688 | cube, at the expense of losing fainter sources, while a smaller value |
---|
689 | will include more noise, which may produce spurious detections, but |
---|
690 | will be more sensitive to faint sources. Values of less than about |
---|
691 | $3\sigma$ tend to not reduce the noise a great deal and can lead to |
---|
692 | many spurious sources. |
---|
693 | |
---|
694 | The FDR method certainly produces more reliable results than a simple |
---|
695 | sigma-clipping (\ie thresholding at some number of $\sigma$ above the |
---|
696 | mean). However, at this point it does not seem to be giving the |
---|
697 | sensitivity expected for the supplied value of {\tt alpha} (\ie it is |
---|
698 | not finding as many sources as expected). Work is |
---|
699 | being done to assess this, and to judge whether there is a real |
---|
700 | problem (such as with the determination of the statistics), or simply |
---|
701 | a result of working in 3 dimensions as opposed to 2. |
---|
702 | |
---|
703 | A further point to bear in mind is that the shape of the detections in |
---|
704 | a cube that has been reconstructed will be much more regular and |
---|
705 | smooth -- the ragged edges that objects in the raw cube possess are |
---|
706 | smoothed by the removal of most of the noise. |
---|
707 | |
---|
708 | Finally, as Duchamp is still undergoing development, there are some |
---|
709 | elements that are not fully developed. In particular, it is not as |
---|
710 | clever as I would like at avoiding interference. The ability to place |
---|
711 | requirements on the minimum number of channels and pixels partially |
---|
712 | circumvents this problem, but work is being done to make Duchamp |
---|
713 | smarter at rejecting signals that are clearly (to a human eye at |
---|
714 | least) interference. See the following section for further |
---|
715 | improvements that are planned. |
---|
716 | |
---|
717 | %\section{Drawbacks of the current program} |
---|
718 | % |
---|
719 | %The program currently has a few problems/drawbacks/things to be aware |
---|
720 | %of that will hopefully be fixed in the future: |
---|
721 | %\begin{itemize} |
---|
722 | % |
---|
723 | %\item Narrow interference spikes are still getting found, particularly |
---|
724 | % if there is no reconstruction, or reconstruction with a relatively |
---|
725 | % low {\tt snrRecon} (such as 2 or 3). Increasing the {\tt |
---|
726 | % minChannels} parameter is one way to circumvent this, but making the |
---|
727 | % algorithm a bit more clever would be preferable. |
---|
728 | % |
---|
729 | %\item Sources that have strong continuum ripple and/or artefacts often |
---|
730 | % generate many spurious detections. This needs some work to avoid |
---|
731 | % Duchamp doing this, and until then users are advised to be aware |
---|
732 | % of the possibility. Strong continuum ripples may generate many |
---|
733 | % sources on the same spatial pixel, and this will be apparent on the |
---|
734 | % detection images. |
---|
735 | % |
---|
736 | %\item Spectra are integrated over every spatial pixel of the |
---|
737 | % detection, and this may dilute the actual detection, making it |
---|
738 | % harder to see \ie the apparent strength of the line as plotted may |
---|
739 | % not give a true indication of how strong it really is. |
---|
740 | % |
---|
741 | %%\item A caution on the merging part of the procedure. This can be time |
---|
742 | %% consuming if there are many detections that do not require merging |
---|
743 | %% -- in this case, the time will go like $N^2$ ($N$ = number of |
---|
744 | %% detections). If there are plenty of mergers, the size of the list |
---|
745 | %% reduces quickly, so the execution time will be less. |
---|
746 | % |
---|
747 | % |
---|
748 | %\end{itemize} |
---|
749 | |
---|
750 | |
---|
751 | %\section{Comparison with other software (to be developed further...)} |
---|
752 | % |
---|
753 | %\subsection{fred, by Matt Howlett} |
---|
754 | % |
---|
755 | %This is the program used in the \hipass\ analysis. It smoothes the |
---|
756 | %data spectrally with a boxcar filter of a size that varies over a |
---|
757 | %user-specified range, and then thresholds the data. |
---|
758 | % |
---|
759 | %Works effectively, but generally doesn't find as many sources as |
---|
760 | %Duchamp, particularly when the reconstruction is used. Sensitive to |
---|
761 | %faint, broad features that fall below the reconstruction threshold. |
---|
762 | % |
---|
763 | %Execution takes a long time, depending on the range of filter widths |
---|
764 | %that are used. |
---|
765 | % |
---|
766 | %\subsection{sfind} |
---|
767 | % |
---|
768 | %Hard to evaluate, as it does not (as far as I can see) output the |
---|
769 | %channel number at which detections are made, and does not merge |
---|
770 | %detections made at adjacent channels (\ie it just works in 2 |
---|
771 | %dimensions). |
---|
772 | % |
---|
773 | |
---|
774 | \section{Future Developments} |
---|
775 | |
---|
776 | This is both a list of planned improvements and a wish-list of |
---|
777 | features that would be nice to include (but are not planned in the |
---|
778 | immediate future): |
---|
779 | |
---|
780 | \begin{itemize} |
---|
781 | |
---|
782 | \item Ability to invert cube to search for absorption features. {\bf |
---|
783 | Planned.} |
---|
784 | |
---|
785 | \item More varied output formats. {\bf Planned.} |
---|
786 | |
---|
787 | \item Better determination of the noise characteristics of |
---|
788 | spectral-line cubes, including understanding how the noise is |
---|
789 | generated and developing a model for it. {\bf Planned.} |
---|
790 | |
---|
791 | \item Include more source analysis. Examples could be: shape |
---|
792 | information; measurements of HI mass; better measurements of |
---|
793 | velocity width and profile... {\bf Some planned.} |
---|
794 | |
---|
795 | \item Provide some indication of the significance of the detection |
---|
796 | (\ie some S/N-like value). {\bf Planned.} |
---|
797 | |
---|
798 | \item Improved ability to reject interference, possibly on the |
---|
799 | spectral shape of features. {\bf Planned.} |
---|
800 | |
---|
801 | \item Link to lists of possible counterparts (\eg via NED/SIMBAD/other |
---|
802 | VO tools?). {\bf Wishlist.} |
---|
803 | |
---|
804 | \item Add ability to read in a reconstructed cube that has been |
---|
805 | saved. In this case the residual array will also need to be read |
---|
806 | in. The idea of this will be to avoid the extended time required for |
---|
807 | the reconstruction if the same cube is being analysed multiple |
---|
808 | times. {\bf Wishlist.} |
---|
809 | |
---|
810 | \item At this point, the ``Milky Way'' channels are discarded and set |
---|
811 | to zero. It may be that users would like to have those put back in |
---|
812 | the final cube after the source detection is done, so at some point |
---|
813 | this option may be added. {\bf Wishlist -- if needed.} |
---|
814 | |
---|
815 | \end{itemize} |
---|
816 | |
---|
817 | |
---|
818 | %\bibliographystyle{mn2e} |
---|
819 | %\bibliographystyle{abbrvnat} |
---|
820 | %\bibliography{mnrasmnemonic,sourceDetection} |
---|
821 | \begin{thebibliography}{} |
---|
822 | |
---|
823 | \bibitem[\protect\citeauthoryear{{Calabretta} \& {Greisen}}{{Calabretta} \& |
---|
824 | {Greisen}}{2002}]{calabretta02} |
---|
825 | {Calabretta} M., {Greisen} E., 2002, A\&A, 395, 1077 |
---|
826 | |
---|
827 | \bibitem[\protect\citeauthoryear{{Greisen} \& {Calabretta}}{{Greisen} \& |
---|
828 | {Calabretta}}{2002}]{greisen02} |
---|
829 | {Greisen} E., {Calabretta} M., 2002, A\&A, 395, 1061 |
---|
830 | |
---|
831 | \bibitem[\protect\citeauthoryear{{Hanisch}, {Farris}, {Greisen}, {Pence}, |
---|
832 | {Schlesinger}, {Teuben}, {Thompson} \& {Warnock}}{{Hanisch} |
---|
833 | et~al.}{2001}]{hanisch01} |
---|
834 | {Hanisch} R., {Farris} A., {Greisen} E., {Pence} W., {Schlesinger} B., |
---|
835 | {Teuben} P., {Thompson} R., {Warnock} A., 2001, A\&A, 376, 359 |
---|
836 | |
---|
837 | \bibitem[\protect\citeauthoryear{{Hopkins}, {Miller}, {Connolly}, {Genovese}, |
---|
838 | {Nichol} \& {Wasserman}}{{Hopkins} et~al.}{2002}]{hopkins02} |
---|
839 | {Hopkins} A., {Miller} C., {Connolly} A., {Genovese} C., {Nichol} R., |
---|
840 | {Wasserman} L., 2002, AJ, 123, 1086 |
---|
841 | |
---|
842 | \bibitem[\protect\citeauthoryear{Lutz}{Lutz}{1980}]{lutz80} |
---|
843 | Lutz R., 1980, The Computer Journal, 23, 262 |
---|
844 | |
---|
845 | \bibitem[\protect\citeauthoryear{{Meyer} et~al.,}{{Meyer} |
---|
846 | et~al.}{2004}]{meyer04:trunc} |
---|
847 | {Meyer} M., et~al., 2004, MNRAS, 350, 1195 |
---|
848 | |
---|
849 | \bibitem[\protect\citeauthoryear{{Miller}, {Genovese}, {Nichol}, {Wasserman}, |
---|
850 | {Connolly}, {Reichart}, {Hopkins}, {Schneider} \& {Moore}}{{Miller} |
---|
851 | et~al.}{2001}]{miller01} |
---|
852 | {Miller} C., {Genovese} C., {Nichol} R., {Wasserman} L., {Connolly} A., |
---|
853 | {Reichart} D., {Hopkins} A., {Schneider} J., {Moore} A., 2001, AJ, 122, |
---|
854 | 3492 |
---|
855 | |
---|
856 | \bibitem[\protect\citeauthoryear{Minchin}{Minchin}{1999}]{minchin99} |
---|
857 | Minchin R., 1999, PASA, 16, 12 |
---|
858 | |
---|
859 | \bibitem[\protect\citeauthoryear{Starck \& Murtagh}{Starck \& |
---|
860 | Murtagh}{2002}]{starck02:book} |
---|
861 | Starck J.-L., Murtagh F., 2002, {``Astronomical Image and Data Analysis''}. |
---|
862 | Springer |
---|
863 | |
---|
864 | \end{thebibliography} |
---|
865 | |
---|
866 | |
---|
867 | \appendix |
---|
868 | \newpage |
---|
869 | \section{Available parameters} |
---|
870 | \label{app-param} |
---|
871 | |
---|
872 | The full list of parameters that can be listed in the input file are |
---|
873 | given here. If not listed, they take the default value given in |
---|
874 | parentheses. Since the order of the parameters in the input file does |
---|
875 | not matter, they are grouped here in logical sections. |
---|
876 | |
---|
877 | \subsection*{Input-output related} |
---|
878 | \begin{entry} |
---|
879 | \item[ImageFile (no default assumed)] The filename of the |
---|
880 | data cube to be analysed. |
---|
881 | \item[OutFile {\tt [./duchamp-Results]}] The file where the final |
---|
882 | detections are to be recorded. This also records the list of input |
---|
883 | parameters. |
---|
884 | \item[SpectraFile {\tt [./duchamp-Spectra.ps]}] The postscript file |
---|
885 | containing the resulting integrated spectra and images of the |
---|
886 | detections. |
---|
887 | \item[flagLog {\tt [true]}] A flag to indicate whether intermediate |
---|
888 | detections should be logged. |
---|
889 | \item[LogFile {\tt [./duchamp-Logfile]}] The file in which intermediate |
---|
890 | detections are logged. These are detections that have not been |
---|
891 | merged. This is primarily for use in debugging and diagnostic |
---|
892 | purposes -- normal use of the program will probably not require |
---|
893 | this. |
---|
894 | \item[flagSubsection {\tt [false]}] A flag to indicate whether one |
---|
895 | wants a subsection of the requested image. |
---|
896 | \item[Subsection {\tt [ [*,*,*] ]}] The requested subsection, which |
---|
897 | should be specified in the format {\tt [x1:x2,y1:y2,z1:z2]}, where |
---|
898 | the limits are inclusive. If the full range of a dimension is |
---|
899 | required, use a {\tt *}, \eg if you want the full spectral range of |
---|
900 | a subsection of the image, use {\tt [30:140,30:140,*]}. |
---|
901 | \item[flagOutputRecon {\tt [false]}] A flag to say whether or not to |
---|
902 | save the reconstructed cube as a FITS file. The filename will be |
---|
903 | derived from the ImageFile -- the reconstruction of {\tt image.fits} |
---|
904 | will be saved as {\tt image.RECON?.fits}, where {\tt ?} stands for |
---|
905 | the value of {\tt snrRecon} (see below). |
---|
906 | \item[flagOutputResid {\tt [false]}] As for {\tt flagOutputRecon}, but |
---|
907 | for the residual array -- the difference between the original cube |
---|
908 | and the reconstructed cube. The filename will be {\tt |
---|
909 | image.RESID?.fits}. |
---|
910 | \item[flagVOT {\tt [false]}] A flag to say whether to create a VOTable |
---|
911 | file corresponding to the information in {\tt outfile}. This will be |
---|
912 | an XML file in the Virtual Observatory VOTable format. |
---|
913 | \item[votFile {\tt [./duchamp-Results.xml]}] The VOTable file with the |
---|
914 | list of final detections. Some input parameters are also recorded. |
---|
915 | \item[flagKarma {\tt [false]}] A flag to say whether to create a Karma |
---|
916 | annotation file corresponding to the information in {\tt |
---|
917 | outfile}. This can be used as an overlay for the Karma programs such |
---|
918 | as {\tt kvis}. |
---|
919 | \item[karmaFile {\tt [./duchamp-Results.ann]}] The Karma annotation |
---|
920 | file showing the list of final detections. |
---|
921 | \item[flagMaps {\tt [true]}] A flag to say whether to save postscript |
---|
922 | files showing the 0th moment map of the whole cube (parameter {\tt |
---|
923 | momentMap}) and the detection image ({\tt detectionMap}). |
---|
924 | \item[momentMap {\tt [./latest-moment-map.ps]}] A postscript file |
---|
925 | containing a map of the 0th moment of the detected sources, as well |
---|
926 | as pixel and WCS coordinates. |
---|
927 | \item[detectionMap {\tt [./latest-detection-map.ps]}] A postscript |
---|
928 | file showing each of the detected objects, coloured in greyscale by |
---|
929 | the number of channels they span. Also shows pixel and WCS |
---|
930 | coordinates. |
---|
931 | \end{entry} |
---|
932 | |
---|
933 | \subsection*{Modifying the cube} |
---|
934 | \begin{entry} |
---|
935 | \item[flagBlankPix {\tt [true]}] A flag to say whether to remove BLANK |
---|
936 | pixels from the analysis -- these are pixels set to some particular |
---|
937 | value because they fall outside the imaged area. |
---|
938 | \item[blankPixValue {\tt [-8.00061]}] The value of the BLANK pixels, |
---|
939 | if this information is not contained in the FITS header (the usual |
---|
940 | procedure is to obtain this value from the header information -- in |
---|
941 | which case the value set by this parameter is ignored). |
---|
942 | \item[flagMW {\tt [true]}] A flag to say whether to remove channels |
---|
943 | contaminated by Milky Way (or other) emission -- the flux in these |
---|
944 | channels is currently just set to 0. |
---|
945 | \item[maxMW {\tt [112]}] The maximum channel for the Milky Way |
---|
946 | emission. |
---|
947 | \item[minMW {\tt [75]}] The minimum channel for the Milky Way |
---|
948 | emission. Note that the channels specified by {\tt maxMW} and {\tt |
---|
949 | minMW} are assumed to be Milky Way channels (\ie the range is |
---|
950 | inclusive). |
---|
951 | \item[flagBaseline {\tt [false]}] A flag to say whether to remove the |
---|
952 | baseline from each spectrum in the cube for the purposes of |
---|
953 | reconstruction and detection. |
---|
954 | \end{entry} |
---|
955 | |
---|
956 | \subsection*{Detection related} |
---|
957 | |
---|
958 | \subsubsection*{General detection} |
---|
959 | \begin{entry} |
---|
960 | \item[snrCut {\tt [3.]}] The cut-off value for thresholding, in terms |
---|
961 | of number of $\sigma$ above the mean. |
---|
962 | \item[flagGrowth {\tt [true]}] A flag indicating whether or not to |
---|
963 | grow the detected objects to a smaller threshold. |
---|
964 | \item[growthCut {\tt [1.5]}] The smaller threshold using in growing |
---|
965 | detections. In units of $\sigma$ above the mean. |
---|
966 | \end{entry} |
---|
967 | |
---|
968 | \subsubsection*{{\` a} trous reconstruction} |
---|
969 | \begin{entry} |
---|
970 | \item [flagATrous {\tt [true]}] A flag indicating whether or not to |
---|
971 | reconstruct the cube using the {\it {\`a} trous} wavelet |
---|
972 | reconstruction. Currently does this in 3-dimensions. See |
---|
973 | \S\ref{sec-recon} for details. |
---|
974 | \item[scaleMin {\tt [1]}] The minimum wavelet scale to be used in the |
---|
975 | reconstruction. A value of 1 means ``use all scales''. |
---|
976 | \item[snrRecon {\tt [4]}] The thresholding cutoff used in the |
---|
977 | reconstruction -- only wavelet coefficients this many $\sigma$ above |
---|
978 | the mean (or greater) are included in the reconstruction. |
---|
979 | \item[filterCode {\tt [2]}] The code number of the filter to use in |
---|
980 | the reconstruction. The options are: |
---|
981 | \begin{itemize} |
---|
982 | \item {\bf 1:} B$_3$-spline filter: coefficients = |
---|
983 | $(\frac{1}{16}, \frac{1}{4}, \frac{3}{8}, \frac{1}{4}, \frac{1}{16})$ |
---|
984 | \item {\bf 2:} Triangle filter: coefficients = $(\frac{1}{4}, \frac{1}{2}, \frac{1}{4})$ |
---|
985 | \item {\bf 3:} Haar wavelet: coefficients = $(0, \frac{1}{2}, \frac{1}{2})$ |
---|
986 | \end{itemize} |
---|
987 | \end{entry} |
---|
988 | |
---|
989 | \subsubsection*{FDR method} |
---|
990 | \begin{entry} |
---|
991 | \item[flagFDR {\tt [false]}] A flag indicating whether or not to use |
---|
992 | the False Discovery Rate method in thresholding the pixels. |
---|
993 | \item[alphaFDR {\tt [0.01]}] The $\alpha$ parameter used in the FDR |
---|
994 | analysis. The average number of false detections, as a fraction of the |
---|
995 | total number, will be less than $\alpha$ (see \S\ref{sec-detection}). |
---|
996 | \end{entry} |
---|
997 | |
---|
998 | \subsubsection*{Merging detections} |
---|
999 | \begin{entry} |
---|
1000 | \item[flagAdjacent {\tt [true]}] A flag indicating whether to use the |
---|
1001 | ``adjacent pixel'' criterion to decide whether to merge objects. If |
---|
1002 | not, the next two parameters are used to determine whether objects |
---|
1003 | are within the necessary thresholds. |
---|
1004 | \item[minPix {\tt [2]}] The minimum number of spatial pixels for a single |
---|
1005 | detection to be counted. |
---|
1006 | \item[minChannels {\tt [3]}] The minimum number of consecutive |
---|
1007 | channels that must be present in the detection for it to be accepted |
---|
1008 | by the Merging algorithm. |
---|
1009 | %The minimum number of channels that a |
---|
1010 | % detection must span for it to be accepted by the Merging algorithm. |
---|
1011 | \item[threshSpatial {\tt [3.]}] The maximum allowed minimum spatial |
---|
1012 | separation (in pixels) between two detections for them to be merged |
---|
1013 | into one. Only used if {\tt flagAdjacent = false}. |
---|
1014 | \item[threshVelocity {\tt [7.]}] The maximum allowed minimum channel |
---|
1015 | separation between two detections for them to be merged into |
---|
1016 | one. %Only used if {\tt flagAdjacent = false}. |
---|
1017 | \end{entry} |
---|
1018 | |
---|
1019 | \subsubsection*{Other parameters} |
---|
1020 | \begin{entry} |
---|
1021 | \item[drawBorders {\tt [true]}] A flag indicating whether borders |
---|
1022 | are to be drawn around the detected objects in the moment maps |
---|
1023 | included in the output (see for example Fig.~\ref{fig-spect}). |
---|
1024 | \item[verbose {\tt [true]}] A flag indicating whether to print the |
---|
1025 | progress of computationally-intensive algorithms (such as the |
---|
1026 | searching and merging) to screen. |
---|
1027 | \end{entry} |
---|
1028 | |
---|
1029 | |
---|
1030 | \newpage |
---|
1031 | \section{Example parameter files} |
---|
1032 | \label{app-input} |
---|
1033 | |
---|
1034 | This is what a typical parameter file would look like. |
---|
1035 | |
---|
1036 | \begin{verbatim} |
---|
1037 | imageFile /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits |
---|
1038 | logFile temp-Logfile |
---|
1039 | outFile temp-Results |
---|
1040 | spectraFile spectra.ps |
---|
1041 | flagSubsection 0 |
---|
1042 | flagOutputRecon 0 |
---|
1043 | flagOutputResid 0 |
---|
1044 | flagBlankPix 1 |
---|
1045 | blankPixValue -8.00061 |
---|
1046 | flagMW 1 |
---|
1047 | minMW 75 |
---|
1048 | maxMW 112 |
---|
1049 | minPix 3 |
---|
1050 | flagGrowth 1 |
---|
1051 | growthCut 1.5 |
---|
1052 | flagATrous 0 |
---|
1053 | scaleMin 1 |
---|
1054 | snrRecon 4 |
---|
1055 | flagFDR 1 |
---|
1056 | alphaFDR 0.1 |
---|
1057 | numPixPSF 20 |
---|
1058 | snrCut 3 |
---|
1059 | threshSpatial 3 |
---|
1060 | threshVelocity 7 |
---|
1061 | minChannels 4 |
---|
1062 | \end{verbatim} |
---|
1063 | |
---|
1064 | Note that it is not necessary to include all these parameters in the |
---|
1065 | file, only those that need to be changed from the defaults (as listed |
---|
1066 | in Appendix~\ref{app-param}), which in this case would be very few. A |
---|
1067 | minimal parameter file might look like: |
---|
1068 | \begin{verbatim} |
---|
1069 | imageFile /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits |
---|
1070 | flagLog 0 |
---|
1071 | snrRecon 3 |
---|
1072 | snrCut 2.5 |
---|
1073 | minChannels 3 |
---|
1074 | \end{verbatim} |
---|
1075 | This will reconstruct the cube with a lower SNR value than the |
---|
1076 | default, select objects at a lower threshold, with a looser minimum |
---|
1077 | channel requirement, and not keep a log of the intermediate |
---|
1078 | detections. |
---|
1079 | |
---|
1080 | The following page demonstrates how the parameters are presented to |
---|
1081 | the user, both on the screen at execution time and in the output and |
---|
1082 | log files: |
---|
1083 | \newpage |
---|
1084 | \begin{landscape} |
---|
1085 | Presentation of parameters in output and log files: |
---|
1086 | \begin{verbatim} |
---|
1087 | ---- Parameters ---- |
---|
1088 | Image to be analysed = /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits |
---|
1089 | Intermediate Logfile = logfile.txt |
---|
1090 | Final Results file = results.txt |
---|
1091 | Spectrum file = spectra.ps |
---|
1092 | VOTable file = results.xml |
---|
1093 | 0th Moment Map = latest-moment-map.ps |
---|
1094 | Detection Map = latest-detection-map.ps |
---|
1095 | Saving reconstructed cube? = false |
---|
1096 | Saving residuals from reconstruction? = false |
---|
1097 | ------ |
---|
1098 | Fixing Blank Pixels? = true |
---|
1099 | Blank Pixel Value = -8.00061 |
---|
1100 | Removing Milky Way channels? = true |
---|
1101 | Milky Way Channels = 75-112 |
---|
1102 | Beam Size (pixels) = 10.1788 |
---|
1103 | Removing baselines before search? = false |
---|
1104 | Minimum # Pixels in a detection = 2 |
---|
1105 | Growing objects after detection? = false |
---|
1106 | Using A Trous reconstruction? = true |
---|
1107 | Minimum scale in reconstruction = 1 |
---|
1108 | SNR Threshold within reconstruction = 4 |
---|
1109 | Filter being used for reconstruction = B3 spline function |
---|
1110 | Using FDR analysis? = false |
---|
1111 | SNR Threshold = 2.5 |
---|
1112 | Using Adjacent-pixel criterion? = true |
---|
1113 | Min. # channels for merging = 4 |
---|
1114 | -------------------- |
---|
1115 | \end{verbatim} |
---|
1116 | |
---|
1117 | \newpage |
---|
1118 | \section{Example output file} |
---|
1119 | \label{app-output} |
---|
1120 | This the typical content of an output file, after running Duchamp |
---|
1121 | with the parameters illustrated on the previous page. |
---|
1122 | |
---|
1123 | {\scriptsize |
---|
1124 | \begin{verbatim} |
---|
1125 | Results of the Duchamp source finder: Tue Mar 21 16:28:50 EST 2006 |
---|
1126 | ---- Parameters ---- |
---|
1127 | |
---|
1128 | (... omitted for clarity -- see previous page for examples...) |
---|
1129 | |
---|
1130 | -------------------- |
---|
1131 | Total number of detections = 23 |
---|
1132 | -------------------- |
---|
1133 | Obj# Name X Y Z RA DEC w_RA w_DEC VEL w_VEL X1 X2 Y1 Y2 Z1 Z2 Npix F_tot F_peak |
---|
1134 | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
---|
1135 | 1 J0609-2200 59.4 140.6 114.7 06:09:38.50 -22:00:48.20 48.50 39.42 213.061 65.957 55 66 136 145 113 118 185 17.5725 0.2125 |
---|
1136 | 2 J0608-2605 65.2 79.6 116.2 06:08:10.23 -26:05:06.57 44.47 39.47 233.119 39.574 60 70 76 85 115 118 50 4.1441 0.1002 |
---|
1137 | 3 J0606-2724 70.8 59.8 121.4 06:06:33.08 -27:24:43.28 52.48 47.57 302.213 39.574 65 77 53 64 120 123 213 17.0659 0.1497 |
---|
1138 | 4 J0611-2142 52.5 145.1 162.5 06:11:36.34 -21:42:00.01 32.40 23.47 843.727 118.722 49 56 142 147 158 167 303 44.3940 0.4103 |
---|
1139 | 5 J0600-2903 89.7 35.3 202.4 06:00:51.38 -29:03:02.51 23.94 28.09 1370.285 184.679 87 92 32 38 195 209 319 26.5725 0.1729 |
---|
1140 | 6 J0559-2643 95.5 70.2 222.6 05:59:10.59 -26:43:05.94 15.94 12.09 1637.316 105.531 94 97 69 71 219 227 35 1.9253 0.0630 |
---|
1141 | 7 J0617-2727 34.8 58.3 227.5 06:17:24.45 -27:27:53.89 20.77 23.41 1701.802 303.400 33 37 56 61 215 238 176 11.4138 0.0929 |
---|
1142 | 8 J0609-2145 60.3 144.4 229.6 06:09:23.17 -21:45:36.06 16.15 11.81 1729.279 105.531 59 62 143 145 225 233 25 1.4760 0.0679 |
---|
1143 | 9 J0559-2529 95.7 88.6 231.1 05:59:08.81 -25:29:34.50 27.88 24.14 1749.440 250.635 92 98 86 91 220 239 257 16.9297 0.1155 |
---|
1144 | 10 J0601-2145 88.9 144.4 232.3 06:01:10.14 -21:45:58.59 31.96 24.13 1764.657 224.253 86 93 142 147 222 239 415 34.0304 0.1655 |
---|
1145 | 11 J0615-2638 40.0 70.8 232.6 06:15:44.32 -26:38:29.42 16.56 19.57 1769.033 52.765 38 41 69 73 231 235 44 2.7565 0.0685 |
---|
1146 | 12 J0605-2610 75.9 78.4 233.1 06:05:00.16 -26:10:21.68 28.14 23.84 1775.066 224.253 73 79 76 81 225 242 352 27.0587 0.1545 |
---|
1147 | 13 J0601-2344 88.0 114.9 235.7 06:01:25.72 -23:44:18.18 35.96 32.07 1809.749 263.826 84 92 112 119 226 246 724 85.1317 0.2968 |
---|
1148 | 14 J0615-2238 38.2 130.6 253.6 06:15:48.32 -22:38:45.75 12.39 15.70 2046.530 118.722 37 39 129 132 248 257 40 2.3169 0.0697 |
---|
1149 | 15 J0617-2309 31.4 122.8 258.0 06:17:51.07 -23:09:29.22 16.46 15.53 2103.912 39.574 30 33 121 124 256 259 23 1.4243 0.0624 |
---|
1150 | 16 J0612-2153 49.5 142.3 271.1 06:12:29.32 -21:53:16.05 24.36 19.56 2276.976 395.740 47 52 140 144 257 287 318 20.7117 0.1008 |
---|
1151 | 17 J0616-2137 35.2 145.9 300.0 06:16:34.07 -21:37:30.95 20.22 7.46 2658.607 224.252 33 37 145 146 294 311 40 3.8507 0.1271 |
---|
1152 | 18 J0544-2740 144.0 54.9 325.4 05:44:31.07 -27:40:42.30 3.58 12.13 2993.384 39.574 144 144 54 56 324 327 7 0.4362 0.0569 |
---|
1153 | 19 J0555-3000 107.2 20.7 367.5 05:55:28.58 -30:00:46.76 19.67 24.31 3547.812 39.574 105 109 18 23 366 369 72 6.4819 0.1692 |
---|
1154 | 20 J0559-2325 96.0 119.6 532.1 05:59:04.98 -23:25:19.04 11.92 16.08 5720.289 52.765 95 97 118 121 530 534 27 1.2865 0.0508 |
---|
1155 | 21 J0616-2653 37.9 67.0 547.0 06:16:23.08 -26:53:11.73 12.36 11.67 5916.731 39.574 37 39 66 68 546 549 25 1.6374 0.0642 |
---|
1156 | 22 J0619-2256 25.1 125.9 724.2 06:19:39.49 -22:56:06.15 12.38 11.60 8254.112 39.573 24 26 125 127 723 726 13 0.6982 0.0593 |
---|
1157 | 23 J0552-2920 116.9 30.5 727.0 05:52:33.03 -29:20:54.43 11.60 20.25 8290.842 303.400 116 118 28 32 716 739 132 35.8343 0.4787 |
---|
1158 | \end{verbatim} |
---|
1159 | } |
---|
1160 | |
---|
1161 | %\end{landscape} |
---|
1162 | |
---|
1163 | \newpage |
---|
1164 | \section{Example VOTable output} |
---|
1165 | \label{app-votable} |
---|
1166 | This is part of the VOTable, in XML format, corresponding to the |
---|
1167 | output file in Appendix~\ref{app-output} (the indentation has been removed to make it fit on the page!). |
---|
1168 | |
---|
1169 | %\begin{landscape} |
---|
1170 | {\scriptsize |
---|
1171 | \begin{verbatim} |
---|
1172 | <?xml version="1.0"?> |
---|
1173 | <VOTABLE version="1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" |
---|
1174 | xsi:noNamespaceSchemaLocation="http://www.ivoa.net/xml/VOTable/VOTable/v1.1"> |
---|
1175 | <COOSYS ID="J2000" equinox="J2000." epoch="J2000." system="eq_FK5"/> |
---|
1176 | <RESOURCE name="Duchamp Output"> |
---|
1177 | <TABLE name="Detections"> |
---|
1178 | <DESCRIPTION>Detected sources and parameters from running the Duchamp source finder.</DESCRIPTION> |
---|
1179 | <PARAM name="FITS file" datatype="char" ucd="meta.file;meta.fits" value="/DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits"/> |
---|
1180 | <FIELD name="ID" ID="col1" ucd="meta.id" datatype="int" width="4"/> |
---|
1181 | <FIELD name="Name" ID="col2" ucd="meta.id;meta.main" datatype="char" arraysize="14"/> |
---|
1182 | <FIELD name="RA" ID="col3" ucd="pos.eq.ra;meta.main" ref="J2000" datatype="float" width="10" precision="6" unit="deg"/> |
---|
1183 | <FIELD name="Dec" ID="col4" ucd="pos.eq.dec;meta.main" ref="J2000" datatype="float" width="10" precision="6" unit="deg"/> |
---|
1184 | <FIELD name="w_RA" ID="col3" ucd="phys.angSize;pos.eq.ra" ref="J2000" datatype="float" width="7" precision="2" unit="arcmin"/> |
---|
1185 | <FIELD name="w_Dec" ID="col4" ucd="phys.angSize;pos.eq.dec" ref="J2000" datatype="float" width="7" precision="2" unit="arcmin"/> |
---|
1186 | <FIELD name="Vel" ID="col4" ucd="phys.veloc;src.dopplerVeloc" datatype="float" width="9" precision="3" unit="km/s"/> |
---|
1187 | <FIELD name="w_Vel" ID="col4" ucd="phys.veloc;src.dopplerVeloc;spect.line.width" datatype="float" width="8" precision="3" unit="km/s"/> |
---|
1188 | <FIELD name="Integrated_Flux" ID="col4" ucd="phys.flux;spect.line.intensity" datatype="float" width="10" precision="3" unit="km/s"/> |
---|
1189 | <DATA> |
---|
1190 | <TABLEDATA> |
---|
1191 | <TR> |
---|
1192 | <TD> 1</TD><TD> J0609-2200</TD><TD> 92.410416</TD><TD>-22.013390</TD><TD> 48.50</TD><TD> 39.42</TD><TD> 213.061</TD><TD> 65.957</TD><TD> 17.572</TD> |
---|
1193 | </TR> |
---|
1194 | <TR> |
---|
1195 | <TD> 2</TD><TD> J0608-2605</TD><TD> 92.042633</TD><TD>-26.085157</TD><TD> 44.47</TD><TD> 39.47</TD><TD> 233.119</TD><TD> 39.574</TD><TD> 4.144</TD> |
---|
1196 | </TR> |
---|
1197 | <TR> |
---|
1198 | <TD> 3</TD><TD> J0606-2724</TD><TD> 91.637840</TD><TD>-27.412022</TD><TD> 52.48</TD><TD> 47.57</TD><TD> 302.213</TD><TD> 39.574</TD><TD> 17.066</TD> |
---|
1199 | </TR> |
---|
1200 | <TR> |
---|
1201 | <TD> 4</TD><TD> J0611-2142</TD><TD> 92.901421</TD><TD>-21.700003</TD><TD> 32.40</TD><TD> 23.47</TD><TD> 843.727</TD><TD> 118.722</TD><TD> 44.394</TD> |
---|
1202 | </TR> |
---|
1203 | <TR> |
---|
1204 | <TD> 5</TD><TD> J0600-2903</TD><TD> 90.214081</TD><TD>-29.050697</TD><TD> 23.94</TD><TD> 28.09</TD><TD> 1370.285</TD><TD> 184.679</TD><TD> 26.573</TD> |
---|
1205 | </TR> |
---|
1206 | (... table truncated for clarity ...) |
---|
1207 | </TABLEDATA> |
---|
1208 | </DATA> |
---|
1209 | </TABLE> |
---|
1210 | </RESOURCE> |
---|
1211 | </VOTABLE> |
---|
1212 | \end{verbatim} |
---|
1213 | } |
---|
1214 | \end{landscape} |
---|
1215 | |
---|
1216 | \section{Robust statistics for a Normal distribution} |
---|
1217 | \label{app-madfm} |
---|
1218 | |
---|
1219 | The Normal, or Gaussian, distribution for mean $\mu$ and standard |
---|
1220 | deviation $\sigma$ can be written as |
---|
1221 | \[ |
---|
1222 | f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\ e^{-(x-\mu)^2/2\sigma^2}. |
---|
1223 | \] |
---|
1224 | |
---|
1225 | When one has a purely Gaussian signal, it is straightforward to |
---|
1226 | estimate $\sigma$ by calculating the standard deviation (or rms) of |
---|
1227 | the data. However, if there is a small amount of signal present on top |
---|
1228 | of Gaussian noise, and one wants to estimate the $\sigma$ for the |
---|
1229 | noise, the presence of the large values from the signal can bias the |
---|
1230 | estimator to higher values. |
---|
1231 | |
---|
1232 | An alternative way is to use the median ($m$) and median absolute deviation |
---|
1233 | from the median ($s$) to estimate $\mu$ and $\sigma$. The median is the |
---|
1234 | middle of the distribution, defined for a continuous distribution by |
---|
1235 | \[ |
---|
1236 | \int_{-\infty}^{m} f(x) \diff x = \int_{m}^{\infty} f(x) \diff x. |
---|
1237 | \] |
---|
1238 | From symmetry, we quickly see that for the continuous Normal |
---|
1239 | distribution, $m=\mu$. We consider the case henceforth of $\mu=0$, |
---|
1240 | without loss of generality. |
---|
1241 | |
---|
1242 | To find $s$, we find the distribution of the absolute deviation from |
---|
1243 | the median, and then find the median of that distribution. This |
---|
1244 | distribution is given by |
---|
1245 | \begin{eqnarray*} |
---|
1246 | g(x) &= &{\mbox{\rm distribution of }} |x|\\ |
---|
1247 | &= &f(x) + f(-x),\ x\ge0\\ |
---|
1248 | &= &\sqrt{\frac{2}{\pi\sigma^2}}\, e^{-x^2/2\sigma^2},\ x\ge0. |
---|
1249 | \end{eqnarray*} |
---|
1250 | So, the median absolute deviation from the median, $s$, is given by |
---|
1251 | \[ |
---|
1252 | \int_{0}^{s} g(x) \diff x = \int_{s}^{\infty} g(x) \diff x. |
---|
1253 | \] |
---|
1254 | Now, $\int_{0}^{\infty}e^{-x^2/2\sigma^2} \diff x = \sqrt{\pi\sigma^2/2}$, and |
---|
1255 | so $\int_{s}^{\infty} e^{-x^2/2\sigma^2} \diff x = |
---|
1256 | \sqrt{\pi\sigma^2/2} - \int_{0}^{s} e^{-\frac{x^2}{2\sigma^2}} \diff x |
---|
1257 | $. Hence, to find $s$ we simply solve the following equation (setting $\sigma=1$ for |
---|
1258 | simplicity -- equivalent to stating $x$ and $s$ in units of $\sigma$): |
---|
1259 | \[ |
---|
1260 | \int_{0}^{s}e^{-x^2/2} \diff x - \sqrt{\pi/8} = 0. |
---|
1261 | \] |
---|
1262 | This is hard to solve analytically (no nice analytic solution exists |
---|
1263 | for the finite integral that I'm aware of), but straightforward to |
---|
1264 | solve numerically, yielding the value of $s=0.6744888$. Thus, to |
---|
1265 | estimate $\sigma$ for a Normally distributed data set, one can calculate |
---|
1266 | $s$, then divide by 0.6744888 (or multiply by 1.4826042) to obtain the |
---|
1267 | correct estimator. |
---|
1268 | |
---|
1269 | Note that this is different to solutions quoted elsewhere, |
---|
1270 | specifically in \citet{meyer04:trunc}, where the same robust estimator |
---|
1271 | is used but with an incorrect conversion to standard deviation -- they |
---|
1272 | assume $\sigma = s\sqrt{\pi/2}$. This, in fact, is the conversion used |
---|
1273 | to convert the {\it mean} absolute deviation from the mean to the |
---|
1274 | standard deviation. This means that the cube noise in the \hipass\ |
---|
1275 | catalogue (parameter Rms$_{\rm cube}$) should be 18\% larger than |
---|
1276 | quoted. |
---|
1277 | |
---|
1278 | %a different value for $s/\sigma$ (of |
---|
1279 | %$\sqrt{2/\pi}\approx0.797885$) is quoted. It should thus be noted that |
---|
1280 | %this means the values quoted by \citet{meyer04:trunc} for the cube noise in |
---|
1281 | %the \hipass\ catalogue should be 18\% larger (since 0.1486042 is 18\% |
---|
1282 | %larger than $\sqrt{\pi/2}\approx1.253314$). |
---|
1283 | |
---|
1284 | \section{How Gaussian noise changes with wavelet scale.} |
---|
1285 | \label{app-scaling} |
---|
1286 | |
---|
1287 | The key element in the wavelet reconstruction of an array is the |
---|
1288 | thresholding of the individual wavelet coefficient arrays. This is |
---|
1289 | usually done by choosing a level to be some number of standard |
---|
1290 | deviations above the mean value. |
---|
1291 | |
---|
1292 | However, since the wavelet arrays are produced by convolving the input |
---|
1293 | array by an increasingly large filter, the pixels in the coefficient |
---|
1294 | arrays become increasingly correlated as the scale of the filter |
---|
1295 | increases. This results in the measured standard deviation from a |
---|
1296 | given coefficient array decreasing with increasing scale. To calculate |
---|
1297 | this, we need to take into account how many other pixels each pixel in |
---|
1298 | the convolved array depends on. |
---|
1299 | |
---|
1300 | To demonstrate, suppose we have a 1-D array with $N$ pixel values |
---|
1301 | given by $F_i,\ i=1,...,N$, and we convolve it with the B$_3$-spline |
---|
1302 | filter with coefficients $\{1/16,1/4,3/8,1/4,1/16\}$. The flux of the |
---|
1303 | $i$th pixel in the convolved array will be |
---|
1304 | \[ |
---|
1305 | F'_i = \frac{1}{16}F_{i-2} + \frac{1}{16}F_{i-2} + \frac{3}{8}F_{i} |
---|
1306 | + \frac{1}{4}F_{i-1} + \frac{1}{16}F_{i+2} |
---|
1307 | \] |
---|
1308 | and the flux of the corresponding pixel in the wavelet array will be |
---|
1309 | \[ |
---|
1310 | W'_i = F_i - F'_i = \frac{1}{16}F_{i-2} + \frac{1}{16}F_{i-2} + \frac{5}{8}F_{i} |
---|
1311 | + \frac{1}{4}F_{i-1} + \frac{1}{16}F_{i+2} |
---|
1312 | \] |
---|
1313 | Now, assuming each pixel has the same standard deviation |
---|
1314 | $\sigma_i=\sigma$, we can work out the standard deviation for the |
---|
1315 | coefficient array: |
---|
1316 | \[ |
---|
1317 | \sigma'_i = \sigma \sqrt{\left(\frac{1}{16}\right)^2 + \left(\frac{1}{4}\right)^2 |
---|
1318 | + \left(\frac{5}{8}\right)^2 + \left(\frac{1}{4}\right)^2 + \left(\frac{1}{16}\right)^2} |
---|
1319 | = 0.72349\ \sigma |
---|
1320 | \] |
---|
1321 | Thus, the first scale wavelet coefficient array will have a standard |
---|
1322 | deviation of 72.3\% of the input array. This procedure can be followed |
---|
1323 | to calculate the necessary values for all scales, dimensions and |
---|
1324 | filters used by Duchamp. |
---|
1325 | |
---|
1326 | Calculating these values is, therefore, a critical step in performing |
---|
1327 | the reconstruction. \citet{starck02:book} did so by simulating data sets |
---|
1328 | with Gaussian noise, taking the wavelet transform, and measuring the |
---|
1329 | value of $\sigma$ for each scale. We take a different approach, by |
---|
1330 | calculating the scaling factors directly from the filter coefficients |
---|
1331 | by taking the wavelet transform of an array made up of a 1 in the |
---|
1332 | central pixel and 0s everywhere else. The scaling value is then |
---|
1333 | derived by adding in quadrature all the wavelet coefficient values at |
---|
1334 | each scale. We give the scaling factors for the three filters |
---|
1335 | available to Duchamp on the following page. These values are |
---|
1336 | hard-coded into Duchamp, so no on-the-fly calculation of them is |
---|
1337 | necessary. |
---|
1338 | |
---|
1339 | Memory limitations prevent us from calculating factors for large |
---|
1340 | scales, particularly for the three-dimensional case (hence the -- |
---|
1341 | symbols in the tables). To calculate factors for |
---|
1342 | higher scales than those available, we note the following |
---|
1343 | relationships apply for large scales to a sufficient level of precision: |
---|
1344 | \begin{itemize} |
---|
1345 | \item 1-D: factor(scale $i$) = factor(scale $i-1$)$/\sqrt{2}$. |
---|
1346 | \item 2-D: factor(scale $i$) = factor(scale $i-1$)$/2$. |
---|
1347 | \item 1-D: factor(scale $i$) = factor(scale $i-1$)$/\sqrt{8}$. |
---|
1348 | \end{itemize} |
---|
1349 | |
---|
1350 | \newpage |
---|
1351 | \begin{itemize} |
---|
1352 | \item {\bf B$_3$-Spline Function:} $\{1/16,1/4,3/8,1/4,1/16\}$ |
---|
1353 | |
---|
1354 | \begin{tabular}{llll} |
---|
1355 | Scale & 1 dimension & 2 dimension & 3 dimension\\ \hline |
---|
1356 | 1 & 0.723489806 & 0.890796310 & 0.956543592\\ |
---|
1357 | 2 & 0.285450405 & 0.200663851 & 0.120336499\\ |
---|
1358 | 3 & 0.177947535 & 0.0855075048 & 0.0349500154\\ |
---|
1359 | 4 & 0.122223156 & 0.0412474444 & 0.0118164242\\ |
---|
1360 | 5 & 0.0858113122 & 0.0204249666 & 0.00413233507\\ |
---|
1361 | 6 & 0.0605703043 & 0.0101897592 & 0.00145703714\\ |
---|
1362 | 7 & 0.0428107206 & 0.00509204670 & 0.000514791120\\ |
---|
1363 | 8 & 0.0302684024 & 0.00254566946 & --\\ |
---|
1364 | 9 & 0.0214024008 & 0.00127279050 & --\\ |
---|
1365 | 10 & 0.0151336781 & 0.000636389722 & --\\ |
---|
1366 | 11 & 0.0107011079 & 0.000318194170 & --\\ |
---|
1367 | 12 & 0.00756682272 & -- & --\\ |
---|
1368 | 13 & 0.00535055108 & -- & --\\ |
---|
1369 | %14 & 0.00378341085 & -- & --\\ |
---|
1370 | %15 & 0.00267527545 & -- & --\\ |
---|
1371 | %16 & 0.00189170541 & -- & --\\ |
---|
1372 | %17 & 0.00133763772 & -- & --\\ |
---|
1373 | %18 & 0.000945852704 & -- & -- |
---|
1374 | \end{tabular} |
---|
1375 | |
---|
1376 | \item {\bf Triangle Function:} $\{1/4,1/2,1/4\}$ |
---|
1377 | |
---|
1378 | \begin{tabular}{llll} |
---|
1379 | Scale & 1 dimension & 2 dimension & 3 dimension\\ \hline |
---|
1380 | 1 & 0.612372436 & 0.800390530 & 0.895954449 \\ |
---|
1381 | 2 & 0.330718914 & 0.272878894 & 0.192033014\\ |
---|
1382 | 3 & 0.211947812 & 0.119779282 & 0.0576484078\\ |
---|
1383 | 4 & 0.145740298 & 0.0577664785 & 0.0194912393\\ |
---|
1384 | 5 & 0.102310944 & 0.0286163283 & 0.00681278387\\ |
---|
1385 | 6 & 0.0722128185 & 0.0142747506 & 0.00240175885\\ |
---|
1386 | 7 & 0.0510388224 & 0.00713319703 & 0.000848538128 \\ |
---|
1387 | 8 & 0.0360857673 & 0.00356607618 & 0.000299949455 \\ |
---|
1388 | 9 & 0.0255157615 & 0.00178297280 & -- \\ |
---|
1389 | 10 & 0.0180422389 & 0.000891478237 & -- \\ |
---|
1390 | 11 & 0.0127577667 & 0.000445738098 & -- \\ |
---|
1391 | 12 & 0.00902109930 & 0.000222868922 & -- \\ |
---|
1392 | 13 & 0.00637887978 & -- & -- \\ |
---|
1393 | %14 & 0.00451054902 & -- & -- \\ |
---|
1394 | %15 & 0.00318942978 & -- & -- \\ |
---|
1395 | %16 & 0.00225527449 & -- & -- \\ |
---|
1396 | %17 & 0.00159471988 & -- & -- \\ |
---|
1397 | %18 & 0.000112763724 & -- & -- |
---|
1398 | |
---|
1399 | \end{tabular} |
---|
1400 | |
---|
1401 | \item {\bf Haar Wavelet:} $\{0,1/2,1/2\}$ |
---|
1402 | |
---|
1403 | \begin{tabular}{llll} |
---|
1404 | Scale & 1 dimension & 2 dimension & 3 dimension\\ \hline |
---|
1405 | 1 & 0.707167810 & 0.433012702 & 0.935414347 \\ |
---|
1406 | 2 & 0.500000000 & 0.216506351 & 0.330718914\\ |
---|
1407 | 3 & 0.353553391 & 0.108253175 & 0.116926793\\ |
---|
1408 | 4 & 0.250000000 & 0.0541265877 & 0.0413398642\\ |
---|
1409 | 5 & 0.176776695 & 0.0270632939 & 0.0146158492\\ |
---|
1410 | 6 & 0.125000000 & 0.0135316469 & 0.00516748303 |
---|
1411 | |
---|
1412 | \end{tabular} |
---|
1413 | |
---|
1414 | |
---|
1415 | \end{itemize} |
---|
1416 | |
---|
1417 | \end{document} |
---|
1418 | |
---|
1419 | |
---|
1420 | |
---|
1421 | |
---|
1422 | |
---|
1423 | |
---|
1424 | |
---|
1425 | |
---|
1426 | |
---|
1427 | |
---|
1428 | |
---|
1429 | |
---|
1430 | |
---|
1431 | |
---|
1432 | |
---|
1433 | |
---|
1434 | |
---|
1435 | |
---|
1436 | |
---|
1437 | |
---|
1438 | |
---|
1439 | |
---|
1440 | |
---|
1441 | |
---|
1442 | |
---|
1443 | |
---|
1444 | |
---|
1445 | |
---|
1446 | |
---|
1447 | |
---|
1448 | |
---|
1449 | |
---|
1450 | |
---|