1 | \documentclass[12pt,a4paper]{article} |
---|
2 | |
---|
3 | %%%%%% LINE SPACING %%%%%%%%%%%% |
---|
4 | \usepackage{setspace} |
---|
5 | \singlespacing |
---|
6 | %\onehalfspacing |
---|
7 | %\doublespacing |
---|
8 | |
---|
9 | %Define a test for doing PDF format -- use different code below |
---|
10 | \newif\ifPDF |
---|
11 | \ifx\pdfoutput\undefined\PDFfalse |
---|
12 | \else\ifnum\pdfoutput > 0\PDFtrue |
---|
13 | \else\PDFfalse |
---|
14 | \fi |
---|
15 | \fi |
---|
16 | |
---|
17 | \textwidth=161 mm |
---|
18 | \textheight=245 mm |
---|
19 | \topmargin=-15 mm |
---|
20 | \oddsidemargin=0 mm |
---|
21 | \parindent=6 mm |
---|
22 | |
---|
23 | \usepackage[sort]{natbib} |
---|
24 | \usepackage{lscape} |
---|
25 | \bibpunct[,]{(}{)}{;}{a}{}{,} |
---|
26 | |
---|
27 | \newcommand{\eg}{e.g.\ } |
---|
28 | \newcommand{\ie}{i.e.\ } |
---|
29 | \newcommand{\hi}{H{\sc i}} |
---|
30 | \newcommand{\hipass}{{\sc hipass}} |
---|
31 | \newcommand{\duchamp}{\emph{Duchamp}} |
---|
32 | \newcommand{\atrous}{\textit{{\`a} trous}} |
---|
33 | \newcommand{\Atrous}{\textit{{\`A} trous}} |
---|
34 | \newcommand{\diff}{{\rm d}} |
---|
35 | \newcommand{\entrylabel}[1]{\mbox{\textsf{\bf{#1:}}}\hfil} |
---|
36 | \newenvironment{entry} |
---|
37 | {\begin{list}{}% |
---|
38 | {\renewcommand{\makelabel}{\entrylabel}% |
---|
39 | \setlength{\labelwidth}{30mm}% |
---|
40 | \setlength{\labelsep}{5pt}% |
---|
41 | \setlength{\itemsep}{2pt}% |
---|
42 | \setlength{\parsep}{2pt}% |
---|
43 | \setlength{\leftmargin}{35mm}% |
---|
44 | }% |
---|
45 | }% |
---|
46 | {\end{list}} |
---|
47 | |
---|
48 | |
---|
49 | \title{Source Detection with \duchamp\ v1.0\\A User's Guide} |
---|
50 | \author{Matthew Whiting\\ |
---|
51 | Australia Telescope National Facility\\CSIRO} |
---|
52 | \date{} |
---|
53 | |
---|
54 | % If we are creating a PDF, use different options for graphicx, hyperref. |
---|
55 | \ifPDF |
---|
56 | \usepackage[pdftex]{graphicx,color} |
---|
57 | \usepackage[pdftex]{hyperref} |
---|
58 | \hypersetup{colorlinks=true,% |
---|
59 | citecolor=red,% |
---|
60 | filecolor=red,% |
---|
61 | linkcolor=red,% |
---|
62 | urlcolor=red,% |
---|
63 | } |
---|
64 | \else |
---|
65 | \usepackage[dvips]{graphicx} |
---|
66 | \usepackage[dvips]{hyperref} |
---|
67 | \fi |
---|
68 | |
---|
69 | \pagestyle{headings} |
---|
70 | \begin{document} |
---|
71 | |
---|
72 | \maketitle |
---|
73 | \thispagestyle{empty} |
---|
74 | \begin{figure}[!h] |
---|
75 | \begin{center} |
---|
76 | \includegraphics[width=\textwidth]{cover_image} |
---|
77 | \end{center} |
---|
78 | \end{figure} |
---|
79 | |
---|
80 | \newpage |
---|
81 | \tableofcontents |
---|
82 | |
---|
83 | \newpage |
---|
84 | \section{Introduction and getting going quickly} |
---|
85 | |
---|
86 | This document provides a user's guide to \duchamp, an object-finder |
---|
87 | for use on spectral-line data cubes. The basic execution of \duchamp\ |
---|
88 | is to read in a FITS data cube, find sources in the cube, and produce |
---|
89 | a text file of positions, velocities and fluxes of the detections, as |
---|
90 | well as a postscript file of the spectra of each detection. |
---|
91 | |
---|
92 | So, you have a FITS cube, and you want to find the sources in it. What |
---|
93 | do you do? First, you need to get \duchamp: there are instructions in |
---|
94 | Appendix~\ref{app-install} for obtaining and installing it. Once you |
---|
95 | have it running, the first step is to make an input file that contains |
---|
96 | the list of parameters. Brief and detailed examples are shown in |
---|
97 | Appendix~\ref{app-input}. This file provides the input file name, the |
---|
98 | various output files, and defines various parameters that control the |
---|
99 | execution. |
---|
100 | |
---|
101 | The standard way to run \duchamp\ is by the command |
---|
102 | \begin{quote} |
---|
103 | \texttt{Duchamp -p [parameter file]} |
---|
104 | \end{quote} |
---|
105 | replacing \texttt{[parameter file]} with the name of the file listing |
---|
106 | the parameters. |
---|
107 | |
---|
108 | An even easier way is to use the default values for all parameters |
---|
109 | (these are given in Appendix~\ref{app-param} and in the file |
---|
110 | InputComplete included in the distribution directory) and use the |
---|
111 | syntax |
---|
112 | \begin{quote} |
---|
113 | \texttt{Duchamp -f [FITS file]} |
---|
114 | \end{quote} |
---|
115 | where \texttt{[FITS file]} is the file you wish to search. |
---|
116 | |
---|
117 | In either case, the program will then work away and give you the list |
---|
118 | of detections and their spectra. The program execution is summarised |
---|
119 | below, and detailed in \S\ref{sec-flow}. Information on inputs is in |
---|
120 | \S\ref{sec-param} and Appendix~\ref{app-param}, and descriptions of |
---|
121 | the output is in \S\ref{sec-output}. |
---|
122 | |
---|
123 | \subsection{A summary of the execution steps} |
---|
124 | |
---|
125 | The basic flow of the program is summarised here -- all steps are |
---|
126 | discussed in more detail in the following sections. |
---|
127 | \begin{enumerate} |
---|
128 | \item If the \texttt{-p} option is used, the parameter file given on |
---|
129 | the command line is read in, and the parameters absorbed. |
---|
130 | \item The FITS image is located and read in to memory. |
---|
131 | \item If requested, a FITS image with a previously reconstructed array |
---|
132 | is read in. |
---|
133 | \item If requested, BLANK pixels are trimmed from the edges, and |
---|
134 | the baseline of each spectrum is removed. |
---|
135 | \item If the reconstruction method is requested, and the reconstructed |
---|
136 | array has not been read in at Step 3 above, the cube is |
---|
137 | reconstructed using the \atrous\ wavelet method. |
---|
138 | \item Searching for objects then takes place, using the requested |
---|
139 | thresholding method. |
---|
140 | \item The list of objects is condensed by merging neighbouring objects |
---|
141 | and removing those deemed unacceptable. |
---|
142 | \item The baselines and trimmed pixels are replaced prior to output. |
---|
143 | \item The details of the detections are written to screen and to the |
---|
144 | requested output file. |
---|
145 | \item Maps showing the spatial location of the detections are written. |
---|
146 | \item The integrated spectra of each detection are written to a |
---|
147 | postscript file. |
---|
148 | \item If requested, the reconstructed array can be written to a new |
---|
149 | FITS file. |
---|
150 | \end{enumerate} |
---|
151 | |
---|
152 | \subsection{Guide to terminology and conventions} |
---|
153 | |
---|
154 | First, a brief note on the use of terminology in this guide. \duchamp\ |
---|
155 | is designed to work on FITS ``cubes''. These are FITS\footnote{FITS is |
---|
156 | the Flexible Image Transport System -- see \citet{hanisch01} or |
---|
157 | websites such as |
---|
158 | \href{http://fits.cv.nrao.edu/FITS.html}{http://fits.cv.nrao.edu/FITS.html} |
---|
159 | for details.} image arrays with three dimensions -- they are assumed |
---|
160 | to have the following form: the first two dimensions (referred to as |
---|
161 | $x$ and $y$) are spatial directions (that is, relating to the position |
---|
162 | on the sky), while the third dimension, $z$, is the spectral |
---|
163 | direction, which can correspond to frequency, wavelength, or |
---|
164 | velocity. The three dimensional analogue of pixels are ``voxels'', or |
---|
165 | volume cells -- a voxel is defined by a unique $(x,y,z)$ location and |
---|
166 | has a unique flux or intensity value associated with it. |
---|
167 | |
---|
168 | Note that it is possible for the FITS file to have more than three |
---|
169 | dimensions (for instance, a fourth dimension representing Stokes |
---|
170 | parameters). \duchamp\ assumes that these extra dimensions occur |
---|
171 | \emph{after} the spatial and spectral ones -- the data will not be |
---|
172 | read correctly if the order is different! Herein, we discuss the data |
---|
173 | in terms of the three basic dimensions, but you should be aware it is |
---|
174 | possible to have more than three. |
---|
175 | |
---|
176 | Each spatial pixel (a given $(x,y)$ coordinate) can be said to be a |
---|
177 | single spectrum, while a slice through the cube perpendicular to the |
---|
178 | spectral direction at a given $z$-value is a single channel (the 2-D |
---|
179 | image is a channel map). |
---|
180 | |
---|
181 | Detection involves locating a contiguous group of voxels with fluxes |
---|
182 | above a certain threshold. \duchamp\ makes no assumptions as to the |
---|
183 | size or shape of the detected features, other than having |
---|
184 | user-selected minimum size criteria. |
---|
185 | |
---|
186 | Features that are detected are assumed to be positive. The user can |
---|
187 | choose to search for negative features by setting an input parameter |
---|
188 | -- this inverts the cube prior to the search (see |
---|
189 | \S\ref{sec-detection} for details). |
---|
190 | |
---|
191 | Note that it is possible to run \duchamp\ on a two-dimensional image |
---|
192 | (\ie one with no frequency or velocity information), or indeed a |
---|
193 | one-dimensional array, and many of the features of the program will |
---|
194 | work fine. The focus, however, is on object detection in three |
---|
195 | dimensions. |
---|
196 | |
---|
197 | \subsection{Why \duchamp?} |
---|
198 | |
---|
199 | Well, it's important for a program to have a name, and the initial |
---|
200 | working title of \emph{cubefind} was somewhat uninspiring. I wanted to |
---|
201 | avoid the classic astronomical approach of designing a cute acronym, |
---|
202 | and since it is designed to work on cubes, I looked at naming it after |
---|
203 | a cubist. \emph{Picasso}, sadly, was already taken \citep{minchin99}, |
---|
204 | so I settled on naming it after Marcel Duchamp, another cubist, but |
---|
205 | also one of the first artists to work with ``found objects''. |
---|
206 | |
---|
207 | \section{User Inputs} |
---|
208 | \label{sec-param} |
---|
209 | |
---|
210 | Input to the program is provided by means of a parameter |
---|
211 | file. Parameters are listed in the file, followed by the value that |
---|
212 | should be assigned to them. The syntax used is \texttt{paramName |
---|
213 | value}. Parameter names are not case-sensitive, and lines in the input |
---|
214 | file that start with \texttt{\#} are ignored. If a parameter is listed |
---|
215 | more than once, the latter value is used, but otherwise the order in |
---|
216 | which the parameters are listed in the input file is |
---|
217 | arbitrary. Example input files can be seen in |
---|
218 | Appendix~\ref{app-input}. |
---|
219 | |
---|
220 | If a parameter is not listed, the default value is assumed. The |
---|
221 | defaults are chosen to provide a good result (using the reconstruction |
---|
222 | method), so the user doesn't need to specify many new parameters in |
---|
223 | the input file. Note that the image file \textbf{must} be specified! |
---|
224 | The parameters that can be set are listed in Appendix~\ref{app-param}, |
---|
225 | with their default values in parentheses. |
---|
226 | |
---|
227 | The parameters with names starting with \texttt{flag} are stored as |
---|
228 | \texttt{bool} variables, and so are either \texttt{true = 1} or |
---|
229 | \texttt{false = 0}. They can be entered in the file either in text or |
---|
230 | integer format -- \duchamp\ will read them correctly in either case. |
---|
231 | |
---|
232 | \section{What \duchamp\ is doing} |
---|
233 | \label{sec-flow} |
---|
234 | |
---|
235 | The execution flow of \duchamp\ is detailed here, indicating the main |
---|
236 | algorithmic steps that are used. The program is written in C/C++ and |
---|
237 | makes use of the {\sc cfitsio}, {\sc wcslib} and {\sc pgplot} |
---|
238 | libraries. |
---|
239 | |
---|
240 | \subsection{Image input} |
---|
241 | \label{sec-input} |
---|
242 | |
---|
243 | The cube is read in using basic {\sc cfitsio} commands, and stored as |
---|
244 | an array in a special C++ class. This class keeps track of the list of |
---|
245 | detected objects, as well as any reconstructed arrays that are made |
---|
246 | (see \S\ref{sec-recon}). The World Coordinate System (WCS) information |
---|
247 | for the cube is also obtained from the FITS header by {\sc wcslib} |
---|
248 | functions \citep{greisen02, calabretta02}, and this information, in |
---|
249 | the form of a \texttt{wcsprm} structure, is also stored in the same |
---|
250 | class. |
---|
251 | |
---|
252 | A sub-section of an image can be requested via the \texttt{subsection} |
---|
253 | parameter in the parameter file -- this can be a good idea if the cube |
---|
254 | has very noisy edges, which may produce many spurious detections. The |
---|
255 | generalised form of the subsection that is used by {\sc cfitsio} is |
---|
256 | \texttt{[x1:x2:dx,y1:y2:dy,z1:z2:dz]}, such that the x-coordinates run |
---|
257 | from \texttt{x1} to \texttt{x2} (inclusive), with steps of |
---|
258 | \texttt{dx}. The step value can be omitted (so a subsection of the |
---|
259 | form \texttt{[2:50,2:50,10:1000]} is still valid). \duchamp\ does not |
---|
260 | make use of any step value present in the subsection string, and any |
---|
261 | that are present are removed before the file is opened. |
---|
262 | |
---|
263 | If one wants the full range of a coordinate then replace the range |
---|
264 | with an asterisk, \eg \texttt{[2:50,2:50,*]}. If one wants to use a |
---|
265 | subsection, one must set \texttt{flagSubsection = 1}. A complete |
---|
266 | description of the section syntax can be found at the {\sc fitsio} web |
---|
267 | site |
---|
268 | \footnote{ |
---|
269 | \href{http://heasarc.gsfc.nasa.gov/docs/software/fitsio/c/c\_user/node90.html}% |
---|
270 | {http://heasarc.gsfc.nasa.gov/docs/software/fitsio/c/c\_user/node90.html}}. |
---|
271 | |
---|
272 | \subsection{Image modification} |
---|
273 | \label{sec-modify} |
---|
274 | |
---|
275 | Several modifications to the cube can be made that improve the |
---|
276 | execution and efficiency of \duchamp\ (these are optional -- their |
---|
277 | use is indicated by the relevant flags set in the input parameter |
---|
278 | file). |
---|
279 | |
---|
280 | \subsubsection{BLANK pixel removal} |
---|
281 | |
---|
282 | If the imaged area of a cube is non-rectangular, BLANK pixels are used |
---|
283 | to pad it out to a rectangular shape. The value of these pixels is |
---|
284 | given by the FITS header keywords BLANK, BSCALE and BZERO. While these |
---|
285 | pixels make the image a nice shape, they will unnecessarily interfere |
---|
286 | with the processing (as well as taking up needless memory). The first |
---|
287 | step, then, is to trim them from the edge. This is done when the |
---|
288 | parameter \texttt{flagBlankPix=true}. If the above keywords are not |
---|
289 | present, the user can specify the BLANK value by the parameter |
---|
290 | \texttt{blankPixValue}. |
---|
291 | |
---|
292 | Removing BLANK pixels is particularly important for the reconstruction |
---|
293 | step, as lots of BLANK pixels on the edges will smooth out features in |
---|
294 | the wavelet calculation stage. The trimming will also reduce the size |
---|
295 | of the cube's array, speeding up the execution. The amount of trimming |
---|
296 | is recorded, and these pixels are added back in once the |
---|
297 | source-detection is completed (so that quoted pixel positions are |
---|
298 | applicable to the original cube). |
---|
299 | |
---|
300 | Rows and columns are trimmed one at a time until the first non-BLANK |
---|
301 | pixel is reached, so that the image remains rectangular. In practice, |
---|
302 | this means that there will be BLANK pixels left in the trimmed image |
---|
303 | (if the non-BLANK region is non-rectangular). However, these are |
---|
304 | ignored in all further calculations done on the cube. |
---|
305 | |
---|
306 | \subsubsection{Baseline removal} |
---|
307 | |
---|
308 | Second, the user may request the removal of baselines from the |
---|
309 | spectra, via the parameter \texttt{flagBaseline}. This may be |
---|
310 | necessary if there is a strong baseline ripple present, which can |
---|
311 | result in spurious detections at the high points of the ripple. The |
---|
312 | baseline is calculated from a wavelet reconstruction procedure (see |
---|
313 | \S\ref{sec-recon}) that keeps only the two largest scales. This is |
---|
314 | done separately for each spatial pixel (\ie for each spectrum in the |
---|
315 | cube), and the baselines are stored and added back in before any |
---|
316 | output is done. In this way the quoted fluxes and displayed spectra |
---|
317 | are as one would see from the input cube itself -- even though the |
---|
318 | detection (and reconstruction if applicable) is done on the |
---|
319 | baseline-removed cube. |
---|
320 | |
---|
321 | The presence of very strong signals (for instance, masers at several |
---|
322 | hundred Jy) can affect the determination of the baseline, leading to a |
---|
323 | large dip centred on the signal in the baseline-subtracted |
---|
324 | spectrum. To prevent this, the signal is trimmed prior to the |
---|
325 | reconstruction process at some standard threshold (at $8\sigma$ above |
---|
326 | the mean). The baseline determined should thus be representative of |
---|
327 | the true, signal-free baseline. Note that this trimming is only a |
---|
328 | temporary measure which does not affect the source-detection. |
---|
329 | |
---|
330 | \subsubsection{Ignoring bright Milky Way emission} |
---|
331 | |
---|
332 | Finally, a single set of contiguous channels can be ignored -- these |
---|
333 | may exhibit very strong emission, such as that from the Milky Way as |
---|
334 | seen in extragalactic \hi\ cubes (hence the references to ``Milky |
---|
335 | Way'' in relation to this task -- apologies to Galactic |
---|
336 | astronomers!). Such dominant channels will produce many detections |
---|
337 | that are unnecessary, uninteresting (if one is interested in |
---|
338 | extragalactic \hi) and large (in size and hence in memory usage), and |
---|
339 | so will slow the program down and detract from the interesting |
---|
340 | detections. |
---|
341 | |
---|
342 | The use of this feature is controlled by the \texttt{flagMW} |
---|
343 | parameter, and the exact channels concerned are able to be set by the |
---|
344 | user (using \texttt{maxMW} and \texttt{minMW} -- these give an |
---|
345 | inclusive range of channels). When employed, these channels are |
---|
346 | ignored for the searching, and the scaling of the spectral output (see |
---|
347 | Fig.~\ref{fig-spect}) will not take them into account. They will be |
---|
348 | present in the reconstructed array, however, and so will be included |
---|
349 | in the saved FITS file (see \S\ref{sec-reconIO}). When the final |
---|
350 | spectra are plotted, the range of channels covered by these parameters |
---|
351 | is indicated by a green hashed box. |
---|
352 | |
---|
353 | \subsection{Image reconstruction} |
---|
354 | \label{sec-recon} |
---|
355 | |
---|
356 | The user can direct \duchamp\ to reconstruct the data cube using the |
---|
357 | \atrous\ wavelet procedure. A good description of the procedure can be |
---|
358 | found in \citet{starck02:book}. The reconstruction is an effective way |
---|
359 | of removing a lot of the noise in the image, allowing one to search |
---|
360 | reliably to fainter levels, and reducing the number of spurious |
---|
361 | detections. This is an optional step, but one that greatly enhances |
---|
362 | the source-detection process, with the payoff that it can be |
---|
363 | relatively time- and memory-intensive. |
---|
364 | |
---|
365 | \subsubsection{Algorithm} |
---|
366 | |
---|
367 | The steps in the \atrous\ reconstruction are as follows: |
---|
368 | \begin{enumerate} |
---|
369 | \item Set the reconstructed array to 0 everywhere. |
---|
370 | \item The input array is discretely convolved with a given filter |
---|
371 | function. This is determined from the parameter file via the |
---|
372 | \texttt{filterCode} parameter -- see Appendix~\ref{app-param} for |
---|
373 | details on the filters available. |
---|
374 | \item The wavelet coefficients are calculated by taking the difference |
---|
375 | between the convolved array and the input array. |
---|
376 | \item If the wavelet coefficients at a given point are above the |
---|
377 | requested threshold (given by \texttt{snrRecon} as the number of |
---|
378 | $\sigma$ above the mean and adjusted to the current scale -- see |
---|
379 | Appendix~\ref{app-scaling}), add these to the reconstructed array. |
---|
380 | \item The separation of the filter coefficients is doubled. (Note that |
---|
381 | this step provides the name of the procedure\footnote{\atrous\ means |
---|
382 | ``with holes'' in French.}, as gaps or holes are created in the |
---|
383 | filter coverage.) |
---|
384 | \item The procedure is repeated from step 2, using the convolved array |
---|
385 | as the input array. |
---|
386 | \item Continue until the required maximum number of scales is reached. |
---|
387 | \item Add the final smoothed (\ie convolved) array to the |
---|
388 | reconstructed array. This provides the ``DC offset'', as each of the |
---|
389 | wavelet coefficient arrays will have zero mean. |
---|
390 | \end{enumerate} |
---|
391 | |
---|
392 | The reconstruction has at least two iterations. The first iteration |
---|
393 | makes a first pass at the wavelet reconstruction (the process outlined |
---|
394 | in the 8 stages above), but the residual array will inevitably have |
---|
395 | some structure still in it, so the wavelet filtering is done on the |
---|
396 | residual, and any significant wavelet terms are added to the final |
---|
397 | reconstruction. This step is repeated until the change in the $\sigma$ |
---|
398 | of the background is less than some fiducial amount. |
---|
399 | |
---|
400 | It is important to note that the \atrous\ decomposition is an example |
---|
401 | of a ``redundant'' transformation. If no thresholding is performed, |
---|
402 | the sum of all the wavelet coefficient arrays and the final smoothed |
---|
403 | array is identical to the input array. The thresholding thus removes |
---|
404 | only the unwanted structure in the array. |
---|
405 | |
---|
406 | Note that any BLANK pixels that are still in the cube will not be |
---|
407 | altered by the reconstruction -- they will be left as BLANK so that |
---|
408 | the shape of the valid part of the cube is preserved. |
---|
409 | |
---|
410 | \subsubsection{Note on Statistics} |
---|
411 | |
---|
412 | The correct calculation of the reconstructed array needs good |
---|
413 | estimation of the underlying mean and standard deviation of the |
---|
414 | background noise distribution. These statistics are estimated using |
---|
415 | robust methods, to avoid corruption by strong outlying points. The |
---|
416 | mean of the distribution is actually estimated by the median, while |
---|
417 | the median absolute deviation from the median (MADFM) is calculated |
---|
418 | and corrected assuming Gaussianity to estimate the underlying standard |
---|
419 | deviation $\sigma$. The Gaussianity (or Normality) assumption is |
---|
420 | critical, as the MADFM does not give the same value as the usual rms |
---|
421 | or standard deviation value -- for a normal distribution |
---|
422 | $N(\mu,\sigma)$ we find MADFM$=0.6744888\sigma$. The difference |
---|
423 | between the MADFM and $\sigma$ is corrected for, so the user need only |
---|
424 | think in the usual multiples of $\sigma$ when setting |
---|
425 | \texttt{snrRecon}. See Appendix~\ref{app-madfm} for a derivation of |
---|
426 | this value. |
---|
427 | |
---|
428 | When thresholding the different wavelet scales, the value of $\sigma$ |
---|
429 | as measured from the wavelet array needs to be scaled to account for |
---|
430 | the increased amount of correlation between neighbouring pixels (due |
---|
431 | to the convolution). See Appendix~\ref{app-scaling} for details on |
---|
432 | this scaling. |
---|
433 | |
---|
434 | \subsubsection{User control of reconstruction parameters} |
---|
435 | |
---|
436 | The most important parameter for the user to select in relation to the |
---|
437 | reconstruction is the threshold for each wavelet array. This is set |
---|
438 | using the \texttt{snrRecon} parameter, and is given as a multiple of |
---|
439 | the rms (estimated by the MADFM) above the mean (which for the wavelet |
---|
440 | arrays should be approximately zero). There are several other |
---|
441 | parameters that can be altered as well that affect the outcome of the |
---|
442 | reconstruction. |
---|
443 | |
---|
444 | By default, the cube is reconstructed in three dimensions, using a |
---|
445 | 3-dimensional filter and 3-dimensional convolution. This can be |
---|
446 | altered, however, using the parameter \texttt{reconDim}. If set to 1, |
---|
447 | this means the cube is reconstructed by considering each spectrum |
---|
448 | separately, whereas \texttt{reconDim=2} will mean the cube is |
---|
449 | reconstructed by doing each channel map separately. The merits of |
---|
450 | these choices are discussed in \S\ref{sec-notes}, but it should be |
---|
451 | noted that a 2-dimensional reconstruction can be susceptible to edge |
---|
452 | effects if the spatial shape is not rectangular. |
---|
453 | |
---|
454 | The user can also select the minimum scale to be used in the |
---|
455 | reconstruction -- the first scale exhibits the highest frequency |
---|
456 | variations, and so ignoring this one can sometimes be beneficial in |
---|
457 | removing excess noise. The default, however, is to use all scales |
---|
458 | (\texttt{minscale = 1}). |
---|
459 | |
---|
460 | Finally, the filter that is used for the convolution can be selected |
---|
461 | by using \texttt{filterCode} and the relevant code number -- the |
---|
462 | choices are listed in Appendix~\ref{app-param}. A larger filter will |
---|
463 | give a better reconstruction, but take longer and use more memory when |
---|
464 | executing. When multi-dimensional reconstruction is selected, this |
---|
465 | filter is used to construct a 2- or 3-dimensional equivalent. |
---|
466 | |
---|
467 | \subsection{Input/Output of reconstructed arrays} |
---|
468 | \label{sec-reconIO} |
---|
469 | |
---|
470 | The reconstruction stage can be relatively time-consuming, |
---|
471 | particularly for large cubes and reconstructions in 3-D. To get around |
---|
472 | this, \duchamp\ provides a shortcut to allow users to perform multiple |
---|
473 | searches (\eg with different thresholds) on the same reconstruction |
---|
474 | without calculating the reconstruction each time. |
---|
475 | |
---|
476 | The first step is to choose to save the reconstructed array as a FITS |
---|
477 | file by setting \texttt{flagOutputRecon = true}. The file will be |
---|
478 | saved in the same directory as the input image, so the user needs to |
---|
479 | have write permissions for that directory. |
---|
480 | |
---|
481 | The filename will be derived from the input filename, with extra |
---|
482 | information detailing the reconstruction that has been done. For |
---|
483 | example, suppose \texttt{image.fits} has been reconstructed using a |
---|
484 | 3-dimensional reconstruction with filter 2, thresholded at $4\sigma$ |
---|
485 | using all scales. The output filename will then be |
---|
486 | \texttt{image.RECON-3-2-4-1.fits} (\ie it uses the four parameters |
---|
487 | relevant for the \atrous\ reconstruction as listed in |
---|
488 | Appendix~\ref{app-param}). The new FITS file will also have these |
---|
489 | parameters as header keywords. If a subsection of the input image has |
---|
490 | been used (see \S\ref{sec-input}), the format of the output filename |
---|
491 | will be \texttt{image.sub.RECON-3-2-4-1.fits}, and the subsection that |
---|
492 | has been used is also stored in the FITS header. |
---|
493 | |
---|
494 | Likewise, the residual image, defined as the difference between the |
---|
495 | input and reconstructed arrays, can also be saved in the same manner |
---|
496 | by setting \texttt{flagOutputResid = true}. Its filename will be the |
---|
497 | same as above, with RESID replacing RECON. |
---|
498 | |
---|
499 | If a reconstructed image has been saved, it can be read in and used |
---|
500 | instead of redoing the reconstruction. To do so, the user should set |
---|
501 | \texttt{flagReconExists = true}. The user can indicate the name of the |
---|
502 | reconstructed FITS file using the \texttt{reconFile} parameter, or, if |
---|
503 | this is not specified, \duchamp\ searches for the file with the name |
---|
504 | as defined above. If the file is not found, the reconstruction is |
---|
505 | performed as normal. Note that to do this, the user needs to set |
---|
506 | \texttt{flagAtrous = true} (obviously, if this is \texttt{false}, the |
---|
507 | reconstruction is not needed). |
---|
508 | |
---|
509 | \subsection{Searching the image} |
---|
510 | \label{sec-detection} |
---|
511 | |
---|
512 | The image is searched for detections in two ways: spectrally (a |
---|
513 | 1-dimensional search in the spectrum in each spatial pixel), and |
---|
514 | spatially (a 2-dimensional search in the spatial image in each |
---|
515 | channel). In both cases, the algorithm finds connected pixels that are |
---|
516 | above the user-specified threshold. In the case of the spatial image |
---|
517 | search, the algorithm of \citet{lutz80} is used to raster scan through |
---|
518 | the image and connect groups of pixels on neighbouring rows. |
---|
519 | |
---|
520 | Note that this algorithm cannot be applied directly to a 3-dimensional |
---|
521 | case, as it requires that objects are completely nested in a row: that |
---|
522 | is, if you are scanning along a row, and one object finishes and |
---|
523 | another starts, you know that you will not get back to the first one |
---|
524 | (if at all) until the second is completely finished for that |
---|
525 | row. Three-dimensional data does not have this property, which is why |
---|
526 | we break up the searching into 1- and 2-dimensional cases. |
---|
527 | |
---|
528 | The determination of the threshold is done in one of two ways. The |
---|
529 | first way is a simple sigma-clipping, where a threshold is set at a |
---|
530 | fixed number $n$ of standard deviations above the mean, and pixels |
---|
531 | above this threshold are flagged as detected. The value of $n$ is set |
---|
532 | with the parameter \texttt{snrCut}. As before, the value of the |
---|
533 | standard deviation is estimated by the MADFM, and corrected by the |
---|
534 | ratio derived in Appendix~\ref{app-madfm}. |
---|
535 | |
---|
536 | The second method uses the False Discovery Rate (FDR) technique |
---|
537 | \citep{miller01,hopkins02}, whose basis we briefly detail here. The |
---|
538 | false discovery rate (given by the number of false detections divided |
---|
539 | by the total number of detections) is fixed at a certain value |
---|
540 | $\alpha$ (\eg $\alpha=0.05$ implies 5\% of detections are false |
---|
541 | positives). In practice, an $\alpha$ value is chosen, and the ensemble |
---|
542 | average FDR (\ie $\langle FDR \rangle$) when the method is used will |
---|
543 | be less than $\alpha$. One calculates $p$ -- the probability, |
---|
544 | assuming the null hypothesis is true, of obtaining a test statistic as |
---|
545 | extreme as the pixel value (the observed test statistic) -- for each |
---|
546 | pixel, and sorts them in increasing order. One then calculates $d$ |
---|
547 | where |
---|
548 | \[ |
---|
549 | d = \max_j \left\{ j : P_j < \frac{j\alpha}{c_N N} \right\}, |
---|
550 | \] |
---|
551 | and then rejects all hypotheses whose $p$-values are less than or |
---|
552 | equal to $P_d$. (So a $P_i<P_d$ will be rejected even if $P_i \geq |
---|
553 | j\alpha/c_N N$.) Note that ``reject hypothesis'' here means ``accept |
---|
554 | the pixel as an object pixel'' (\ie we are rejecting the null |
---|
555 | hypothesis that the pixel belongs to the background). |
---|
556 | |
---|
557 | The $c_N$ values here are normalisation constants that depend on the |
---|
558 | correlated nature of the pixel values. If all the pixels are |
---|
559 | uncorrelated, then $c_N=1$. If $N$ pixels are correlated, then their |
---|
560 | tests will be dependent on each other, and so $c_N = \sum_{i=1}^N |
---|
561 | i^{-1}$. \citet{hopkins02} consider real radio data, where the pixels |
---|
562 | are correlated over the beam. In this case the sum is made over the |
---|
563 | $N$ pixels that make up the beam. The value of $N$ is calculated from |
---|
564 | the FITS header (if the correct keywords -- BMAJ, BMIN -- are not |
---|
565 | present, a default value of 10 pixels is assumed). |
---|
566 | |
---|
567 | The theory behind the FDR method implies a direct connection between |
---|
568 | the choice of $\alpha$ and the fraction of detections that will be |
---|
569 | false positives. However, due to the merging process, this direct |
---|
570 | connection is lost when looking at the final number of detections -- |
---|
571 | see discussion in \S\ref{sec-notes}. The effect is that the number of |
---|
572 | false detections will be less than indicated by the $\alpha$ value |
---|
573 | used. |
---|
574 | |
---|
575 | If a reconstruction has been made, the residuals (defined in the sense |
---|
576 | of original $-$ reconstruction) are used to estimate the noise |
---|
577 | parameters of the cube. Otherwise they are estimated directly from the |
---|
578 | cube itself. In both cases, robust estimators are used as described |
---|
579 | above. |
---|
580 | |
---|
581 | Detections must have a minimum number of pixels to be counted. This |
---|
582 | minimum number is given by the input parameters \texttt{minPix} (for |
---|
583 | 2-dimensional searches) and \texttt{minChannels} (for 1-dimensional |
---|
584 | searches). |
---|
585 | |
---|
586 | The search only looks for positive features. If one is interested |
---|
587 | instead in negative features (such as absorption lines), set the |
---|
588 | parameter \texttt{flagNegative = true}. This will invert the cube (\ie |
---|
589 | multiply all pixels by $-1$) prior to the search, and then re-invert |
---|
590 | the cube (and the fluxes of any detections) after searching is |
---|
591 | complete. All outputs are done in the same manner as normal, so that |
---|
592 | fluxes of detections will be negative. |
---|
593 | |
---|
594 | \subsection{Merging detected objects} |
---|
595 | \label{sec-merger} |
---|
596 | |
---|
597 | The searching step produces a list of detected objects that will have |
---|
598 | many repeated detections of a given object -- for instance, spectral |
---|
599 | detections in adjacent pixels of the same object and/or spatial |
---|
600 | detections in neighbouring channels. These are then combined in an |
---|
601 | algorithm that matches all objects judged to be ``close''. This |
---|
602 | determination is made in one of two ways. |
---|
603 | |
---|
604 | One way is to define two thresholds -- one spatial and one in velocity |
---|
605 | -- and say that two objects should be merged if there is at least one |
---|
606 | pair of pixels that lie within these threshold distances of each |
---|
607 | other. These thresholds are specified by the parameters |
---|
608 | \texttt{threshSpatial} and \texttt{threshVelocity} (in units of pixels |
---|
609 | and channels respectively). |
---|
610 | |
---|
611 | Alternatively, the spatial requirement can be changed to say that |
---|
612 | there must be a pair of pixels that are \emph{adjacent} -- a stricter, |
---|
613 | but perhaps more realistic requirement, particularly when the spatial |
---|
614 | pixels have a large angular size (as is the case for \hi\ |
---|
615 | surveys). This method can be selected by setting the parameter |
---|
616 | \texttt{flagAdjacent} to 1 (\ie \texttt{true}) in the parameter |
---|
617 | file. The velocity thresholding is done in the same way as the first |
---|
618 | option. |
---|
619 | |
---|
620 | Once the detections have been merged, they may be ``grown''. This is a |
---|
621 | process of increasing the size of the detection by adding adjacent |
---|
622 | pixels that are above some secondary threshold. This threshold is |
---|
623 | lower than the one used for the initial detection, but above the noise |
---|
624 | level, so that faint pixels are only detected when they are close to a |
---|
625 | bright pixel. The value of this threshold is a possible input |
---|
626 | parameter (\texttt{growthCut}), with a default value of |
---|
627 | $1.5\sigma$. The use of the growth algorithm is controlled by the |
---|
628 | \texttt{flagGrowth} parameter -- the default value of which is |
---|
629 | \texttt{false}. If the detections are grown, they are sent through the |
---|
630 | merging algorithm a second time, to pick up any detections that now |
---|
631 | overlap or have grown over each other. |
---|
632 | |
---|
633 | Finally, to be accepted, the detections must span \emph{both} a |
---|
634 | minimum number of channels (to remove any spurious single-channel |
---|
635 | spikes that may be present), and a minimum number of spatial |
---|
636 | pixels. These numbers, as for the original detection step, are set |
---|
637 | with the \texttt{minChannels} and \texttt{minPix} parameters. The |
---|
638 | channel requirement means there must be at least one set of |
---|
639 | \texttt{minChannels} consecutive channels in the source for it to be |
---|
640 | accepted. |
---|
641 | |
---|
642 | \section{Outputs} |
---|
643 | \label{sec-output} |
---|
644 | |
---|
645 | \subsection{During execution} |
---|
646 | |
---|
647 | \duchamp\ provides the user with feedback whilst it is running, to |
---|
648 | keep the user informed on the progress of the analysis. Most of this |
---|
649 | consists of self-explanatory messages about the particular stage the |
---|
650 | program is up to. The relevant parameters are printed to the screen at |
---|
651 | the start (once the file has been successfully read in), so the user |
---|
652 | is able to make a quick check that the setup is correct (see |
---|
653 | Appendix~{app-input} for an example). |
---|
654 | |
---|
655 | If the cube is being trimmed (\S\ref{sec-modify}), the resulting |
---|
656 | dimensions are printed to indicate how much has been trimmed. If a |
---|
657 | reconstruction is being done, a continually updating message shows |
---|
658 | either the current iteration and scale, compared to the maximum scale |
---|
659 | (when \texttt{reconDim=3}), or a progress bar showing the amount of |
---|
660 | the cube that has been reconstructed (for smaller values of |
---|
661 | \texttt{reconDim}). |
---|
662 | |
---|
663 | During the searching algorithms, the progress through the 1D and 2D |
---|
664 | searches are shown. When the searches have completed, the number of |
---|
665 | objects found in both the 1D and 2D searches are reported (see |
---|
666 | \S\ref{sec-detection} for details). |
---|
667 | |
---|
668 | In the merging process (where multiple detections of the same object |
---|
669 | are combined -- see \S\ref{sec-merger}), two stages of output |
---|
670 | occur. The first is when each object in the list is compared with all |
---|
671 | others. The output shows two numbers: the first being how far through |
---|
672 | the list the current object is, and the second being the length of the |
---|
673 | list. As the algorithm proceeds, the first number should increase and |
---|
674 | the second should decrease (as objects are combined). When the numbers |
---|
675 | meet (\ie the whole list has been compared), the second phase begins, |
---|
676 | in which multiply-appearing pixels in each object are removed, as are |
---|
677 | objects not meeting the minimum channels requirement. During this |
---|
678 | phase, the total number of accepted objects is shown, which should |
---|
679 | steadily increase until all have been accepted or rejected. Note that |
---|
680 | these steps can be very quick for small numbers of detections. |
---|
681 | |
---|
682 | Since this continual printing to screen has some overhead of time and |
---|
683 | CPU involved, the user can elect to not print this information by |
---|
684 | setting the parameter \texttt{verbose = 0}. In this case, the user is |
---|
685 | still informed as to the steps being undertaken, but the details of |
---|
686 | the progress are not shown. |
---|
687 | |
---|
688 | \subsection{Results} |
---|
689 | |
---|
690 | \subsubsection{Table of results} |
---|
691 | |
---|
692 | Finally, we get to the results -- the reason for running \duchamp\ in |
---|
693 | the first place. Once the detection list is finalised, it is sorted by |
---|
694 | the mean velocity of the detections (or, if there is no good WCS |
---|
695 | associated with the cube, by the mean Z-pixel position). The results |
---|
696 | are then printed to the screen and to the output file, given by the |
---|
697 | \texttt{OutFile} parameter. The results list, an example of which can |
---|
698 | be seen in Appendix~\ref{app-output}, contains the following columns |
---|
699 | (note that the title of the columns depending on WCS information will |
---|
700 | depend on the projection of the WCS): |
---|
701 | |
---|
702 | \begin{entry} |
---|
703 | \item[Obj\#] The ID number of the detection (simply the sequential |
---|
704 | count for the list, which is ordered by increasing velocity, or |
---|
705 | channel number, if the WCS is not good enough to find the velocity). |
---|
706 | \item[Name] The ``IAU''-format name of the detection (derived from the |
---|
707 | WCS position -- see below for a description of the format). |
---|
708 | \item[X] The average X-pixel position (averaged over all detected |
---|
709 | voxels). |
---|
710 | \item[Y] The average Y-pixel position. |
---|
711 | \item[Z] The average Z-pixel position. |
---|
712 | \item[RA/GLON] The Right Ascension or Galactic Longitude of the centre |
---|
713 | of the object. |
---|
714 | \item[DEC/GLAT] The Declination or Galactic Latitude of the centre of |
---|
715 | the object. |
---|
716 | \item[VEL] The mean velocity of the object [units given by the |
---|
717 | \texttt{spectralUnits} parameter]. |
---|
718 | \item[w\_RA/w\_GLON] The width of the object in Right Ascension or |
---|
719 | Galactic Longitude [arcmin]. |
---|
720 | \item[w\_DEC/w\_GLAT] The width of the object in Declination Galactic |
---|
721 | Latitude [arcmin]. |
---|
722 | \item[w\_VEL] The full velocity width of the detection (max channel |
---|
723 | $-$ min channel, in velocity units [see note below]). |
---|
724 | \item[F\_int] The integrated flux over the object, in the units of |
---|
725 | flux times velocity, corrected for the beam if necessary. |
---|
726 | \item[F\_peak] The peak flux over the object, in the units of flux. |
---|
727 | \item[X1, X2] The minimum and maximum X-pixel coordinates. |
---|
728 | \item[Y1, Y2] The minimum and maximum Y-pixel coordinates. |
---|
729 | \item[Z1, Z2] The minimum and maximum Z-pixel coordinates. |
---|
730 | \item[Npix] The number of voxels (\ie distinct $(x,y,z)$ coordinates) |
---|
731 | in the detection. |
---|
732 | \item[Flag] Whether the detection has any warning flags (see below). |
---|
733 | \end{entry} |
---|
734 | The Name is derived from the WCS position. For instance, a source |
---|
735 | centred on the RA,Dec position 12$^h$53$^m$45$^s$, |
---|
736 | -36$^\circ$24$'$12$''$ will be called J125345$-$362412 (if the epoch |
---|
737 | is J2000) or B125345$-$362412 (if B1950). An alternative form is used |
---|
738 | for Galactic coordinates: a source centred on the position ($l$,$b$) = |
---|
739 | (323.1245, 5.4567) will be called G323.124$+$05.457. If the WCS is not |
---|
740 | valid (\ie is not present or does not have all the necessary |
---|
741 | information), the Name, RA, DEC, VEL and related columns are not |
---|
742 | printed, but the pixel coordinates are still provided. |
---|
743 | |
---|
744 | The velocity units can be specified by the user, using the parameter |
---|
745 | \texttt{spectralUnits} (enter it as a single string). The default |
---|
746 | value is km/s, which should be suitable for most users. These units |
---|
747 | are also used to give the units of integrated flux. Note that if there |
---|
748 | is no rest frequency specified in the FITS header, the \duchamp\ |
---|
749 | output will instead default to using Frequency, with units of MHz. |
---|
750 | |
---|
751 | If the WCS is not specified sufficiently to be used, then all columns |
---|
752 | from RA/GLON to w\_VEL will be left blank. Also, F\_int will be |
---|
753 | replaced with the more simple F\_tot -- the total flux in the |
---|
754 | detection, being the sum of all detected voxels. |
---|
755 | |
---|
756 | The last column contains any warning flags about the detection. There |
---|
757 | are currently two options here. An `E' is printed if the detection is |
---|
758 | next to the edge of the image, meaning either the limit of the pixels, |
---|
759 | or the limit of the non-BLANK pixel region. An `N' is printed if the |
---|
760 | total flux, summed over all the (non-BLANK) pixels in the smallest box |
---|
761 | that completely encloses the detection, is negative. Note that this |
---|
762 | sum is likely to include non-detected pixels. It is of use in pointing |
---|
763 | out detections that lie next to strongly negative pixels, such as |
---|
764 | might arise due to interference -- the detected pixels might then also |
---|
765 | be due to the interference, so caution is advised. |
---|
766 | |
---|
767 | \subsubsection{Other results lists} |
---|
768 | |
---|
769 | Two alternative results files can also be requested. One option is a |
---|
770 | VOTable-format XML file, containing just the RA, Dec, Velocity and the |
---|
771 | corresponding widths of the detections, as well as the fluxes. The |
---|
772 | user should set \texttt{flagVOT = 1}, and put the desired filename in |
---|
773 | the parameter \texttt{votFile} -- note that the default is for it not |
---|
774 | to be produced. This file should be compatible with all Virtual |
---|
775 | Observatory tools (such as Aladin\footnote{ Aladin can be found on the |
---|
776 | web at |
---|
777 | \href{http://aladin.u-strasbg.fr/}{http://aladin.u-strasbg.fr/}}). The |
---|
778 | second option is an annotation file for use with the Karma toolkit of |
---|
779 | visualisation tools (in particular, with \texttt{kvis}). This will |
---|
780 | draw a circle at the position of each detection, and number it |
---|
781 | according to the Obj\# given above. To make use of this option, the |
---|
782 | user should set \texttt{flagKarma = 1}, and put the desired filename |
---|
783 | in the parameter \texttt{karmaFile} -- again, the default is for it |
---|
784 | not to be produced. |
---|
785 | |
---|
786 | As the program is running, it also (optionally) records the detections |
---|
787 | made in each individual spectrum or channel (see \S\ref{sec-detection} |
---|
788 | for details on this process). This is recorded in the file given by |
---|
789 | the parameter \texttt{LogFile}. This file does not include the columns |
---|
790 | \texttt{Name, RA, DEC, w\_RA, w\_DEC, VEL, w\_VEL}. This file is |
---|
791 | designed primarily for diagnostic purposes: \eg to see if a given set |
---|
792 | of pixels is detected in, say, one channel image, but does not survive |
---|
793 | the merging process. The list of pixels (and their fluxes) in the |
---|
794 | final detection list are also printed to this file, again for |
---|
795 | diagnostic purposes. The file also records the execution time, as well |
---|
796 | as the command-line statement used to run \duchamp. The creation of |
---|
797 | this log file can be prevented by setting \texttt{flagLog = |
---|
798 | false}. (This may be a good idea if you are not interested in its |
---|
799 | contents, as it can be a large file if many pixels are being |
---|
800 | detected.) |
---|
801 | |
---|
802 | \subsubsection{Graphical output -- spectra} |
---|
803 | |
---|
804 | As well as the output data file, a postscript file is created that |
---|
805 | shows the spectrum for each detection, together with a small cutout |
---|
806 | image (the 0th moment) and basic information about the detection (note |
---|
807 | that any flags are printed after the name of the detection, in the |
---|
808 | format \texttt{[E]}). If the cube was reconstructed, the spectrum from |
---|
809 | the reconstruction is shown in red, over the top of the original |
---|
810 | spectrum. The spectral extent of the detected object is indicated by |
---|
811 | two dashed blue lines, and the region covered by the ``Milky Way'' |
---|
812 | channels is shown by a green hashed box. An example detection can be |
---|
813 | seen below in Fig.~\ref{fig-spect}. |
---|
814 | |
---|
815 | The spectrum that is plotted is governed by the |
---|
816 | \texttt{spectralMethod} parameter. It can be either \texttt{peak}, |
---|
817 | where the spectrum is from the spatial pixel containing the |
---|
818 | detection's peak flux; or \texttt{sum}, where the spectrum is summed |
---|
819 | over all spatial pixels, and then corrected for the beam size. The |
---|
820 | spectral extent of the detection is indicated with blue lines, and a |
---|
821 | zoom is shown in a separate window. |
---|
822 | |
---|
823 | The cutout image can optionally include a border around the spatial |
---|
824 | pixels that are in the detection (turned on and off by the parameter |
---|
825 | \texttt{drawBorders} -- the default is \texttt{true}). It includes a |
---|
826 | scale bar in the bottom left corner to indicate size -- its length is |
---|
827 | indicated next to it (the choice of length depends on the size of the |
---|
828 | image). |
---|
829 | |
---|
830 | There may also be one or two extra lines on the image. A yellow line |
---|
831 | shows the limits of the cube -- the detected object is obviously close |
---|
832 | to the edge of the cube, and the box size extends outside the region |
---|
833 | covered by the data. A purple line, however, shows the dividing line |
---|
834 | between BLANK and non-BLANK pixels. The BLANK pixels will always be |
---|
835 | shown in black. The first type of line is always drawn, while the |
---|
836 | second is governed by the parameter \texttt{drawBlankEdges} (whose |
---|
837 | default is \texttt{true}), and obviously whether there are any BLANK |
---|
838 | pixel present. |
---|
839 | |
---|
840 | \begin{figure}[t] |
---|
841 | \begin{center} |
---|
842 | \includegraphics[width=\textwidth]{example_spectrum} |
---|
843 | \end{center} |
---|
844 | \caption{\footnotesize An example of the spectrum output. Note several |
---|
845 | of the features discussed in the text: the red lines indicating the |
---|
846 | reconstructed spectrum; the blue dashed lines indicating the |
---|
847 | spectral extent of the detection; the green hashed area indicating |
---|
848 | the Milky Way channels that are ignored by the searching algorithm; |
---|
849 | the blue border showing its spatial extent on the 0th moment map; |
---|
850 | and the 15~arcmin-long scale bar.} |
---|
851 | \label{fig-spect} |
---|
852 | \end{figure} |
---|
853 | |
---|
854 | \subsubsection{Graphical output -- maps} |
---|
855 | |
---|
856 | \begin{figure}[!t] |
---|
857 | \begin{center} |
---|
858 | \includegraphics[width=\textwidth]{example_moment_map} |
---|
859 | \end{center} |
---|
860 | \caption{\footnotesize An example of the moment map created by |
---|
861 | \duchamp. The full extent of the cube is covered, and the 0th moment |
---|
862 | of each object is shown (integrated individually over all the |
---|
863 | detected channels). The purple line indicates the limit of the |
---|
864 | non-BLANK pixels.} |
---|
865 | \label{fig-moment} |
---|
866 | \end{figure} |
---|
867 | |
---|
868 | Finally, a couple of images are optionally produced: a 0th moment map |
---|
869 | of the cube, combining just the detected channels in each object, |
---|
870 | showing the integrated flux in grey-scale; and a ``detection image'', |
---|
871 | a grey-scale image where the pixel values are the number of channels |
---|
872 | that spatial pixel is detected in. In both cases, if |
---|
873 | \texttt{drawBorders = true}, a border is drawn around the spatial |
---|
874 | extent of each detection, and if \texttt{drawBlankEdges = true}, the |
---|
875 | purple line dividing BLANK and non-BLANK pixels (as described above) |
---|
876 | is drawn. An example moment map is shown in Fig.~\ref{fig-moment}. |
---|
877 | The production or otherwise of these images is governed by the |
---|
878 | \texttt{flagMaps} parameter. |
---|
879 | |
---|
880 | The purpose of these images are to provide a visual guide to where the |
---|
881 | detections have been made, and, particularly in the case of the moment |
---|
882 | map, to provide an indication of the strength of the source. In both |
---|
883 | cases, the detections are numbered (in the same sense as the output |
---|
884 | list and as the spectral plots), and the spatial borders are marked |
---|
885 | out as for the cutout images in the spectra file. Both these images |
---|
886 | are saved as postscript files (given by the parameters |
---|
887 | \texttt{momentMap} and \texttt{detectionMap} respectively), with the |
---|
888 | latter also displayed in a {\sc pgplot} window (regardless of the |
---|
889 | state of \texttt{flagMaps}). |
---|
890 | |
---|
891 | \section{Notes and hints on the use of \duchamp} |
---|
892 | \label{sec-notes} |
---|
893 | |
---|
894 | In using \duchamp, the user has to make a number of decisions about |
---|
895 | the way the program runs. This section is designed to give the user |
---|
896 | some idea about what to choose. |
---|
897 | |
---|
898 | The main choice is whether or not to use the wavelet |
---|
899 | reconstruction. The main benefits of this are the marked reduction in |
---|
900 | the noise level, leading to regularly-shaped detections, and good |
---|
901 | reliability for faint sources. The main drawback with its use is the |
---|
902 | long execution time: to reconstruct a $170\times160\times1024$ |
---|
903 | (\hipass) cube often requires three iterations and takes about 20-25 |
---|
904 | minutes to run completely. Note that this is for the three-dimensional |
---|
905 | reconstruction: using \texttt{reconDim=1} makes the reconstruction |
---|
906 | quicker (the full program then takes about 6 minutes), but it is still |
---|
907 | the largest part of the time. |
---|
908 | |
---|
909 | The searching part of the procedure is much quicker: searching an |
---|
910 | un-reconstructed cube leads to execution times of only a couple of |
---|
911 | minutes. Alternatively, using the ability to read in previously-saved |
---|
912 | reconstructed arrays makes running the reconstruction more than once a |
---|
913 | more feasible prospect. |
---|
914 | |
---|
915 | On the positive side, the shape of the detections in a cube that has |
---|
916 | been reconstructed will be much more regular and smooth -- the ragged |
---|
917 | edges that objects in the raw cube possess are smoothed by the removal |
---|
918 | of most of the noise. This enables better determination of the shapes |
---|
919 | and characteristics of objects. |
---|
920 | |
---|
921 | A further point to consider when using the reconstruction is that if |
---|
922 | the two-dimensional reconstruction is chosen (\texttt{reconDim=2}), it |
---|
923 | can be susceptible to edge effects. If the valid area in the cube (\ie |
---|
924 | the part that is not BLANK) has non-rectangular edges, the convolution |
---|
925 | can produce artefacts in the reconstruction that mimic the edges and |
---|
926 | can lead (depending on the selection threshold) to some spurious |
---|
927 | sources. Caution is advised with such data -- the user is advised to |
---|
928 | check carefully the reconstructed cube for the presence of such |
---|
929 | artefacts. Note, however, that the 1- and 3-dimensional |
---|
930 | reconstructions are \emph{not} susceptible in the same way, since the |
---|
931 | spectral direction does not generally exhibit these BLANK edges, and |
---|
932 | so we recommend the use of either of these. |
---|
933 | |
---|
934 | If one chooses the reconstruction method, a further decision is |
---|
935 | required on the signal-to-noise cutoff used in determining acceptable |
---|
936 | wavelet coefficients. A larger value will remove more noise from the |
---|
937 | cube, at the expense of losing fainter sources, while a smaller value |
---|
938 | will include more noise, which may produce spurious detections, but |
---|
939 | will be more sensitive to faint sources. Values of less than about |
---|
940 | $3\sigma$ tend to not reduce the noise a great deal and can lead to |
---|
941 | many spurious sources (although this will depend on the nature of the |
---|
942 | cube). |
---|
943 | |
---|
944 | When it comes to searching, the FDR method produces more reliable |
---|
945 | results than simple sigma-clipping, particularly in the absence of |
---|
946 | reconstruction. However, it does not work in exactly the way one |
---|
947 | would expect for a given value of \texttt{alpha}. For instance, |
---|
948 | setting fairly liberal values of \texttt{alpha} (say, 0.1) will often |
---|
949 | lead to a much smaller fraction of false detections (\ie much less |
---|
950 | than 10\%). This is the effect of the merging algorithms, that combine |
---|
951 | the sources after the detection stage, and reject detections not |
---|
952 | meeting the minimum pixel or channel requirements. It is thus better |
---|
953 | to aim for larger \texttt{alpha} values than those derived from a |
---|
954 | straight conversion of the desired false detection rate. |
---|
955 | |
---|
956 | Finally, as \duchamp\ is still undergoing development, there are some |
---|
957 | elements that are not fully developed. In particular, it is not as |
---|
958 | clever as I would like at avoiding interference. The ability to place |
---|
959 | requirements on the minimum number of channels and pixels partially |
---|
960 | circumvents this problem, but work is being done to make \duchamp\ |
---|
961 | smarter at rejecting signals that are clearly (to a human eye at |
---|
962 | least) interference. See the following section for further |
---|
963 | improvements that are planned. |
---|
964 | |
---|
965 | \section{Future developments} |
---|
966 | |
---|
967 | This is both a list of planned improvements and a wish-list of |
---|
968 | features that would be nice to include (but are not planned in the |
---|
969 | immediate future). Let me know if there are items not on this list, or |
---|
970 | items on the list you would like prioritised. |
---|
971 | |
---|
972 | \begin{itemize} |
---|
973 | |
---|
974 | \item Better determination of the noise characteristics of |
---|
975 | spectral-line cubes, including understanding how the noise is |
---|
976 | generated and developing a model for it. \textbf{Planned.} |
---|
977 | |
---|
978 | \item Include more source analysis. Examples could be: shape |
---|
979 | information; measurements of HI mass; more variety of measurements |
---|
980 | of velocity width and profile. \textbf{Some planned.} |
---|
981 | |
---|
982 | \item Provide some indication of the significance of the detection |
---|
983 | (\ie some S/N-like value). \textbf{Planned.} |
---|
984 | |
---|
985 | \item Improved ability to reject interference, possibly on the |
---|
986 | spectral shape of features. \textbf{Planned.} |
---|
987 | |
---|
988 | \item Ability to separate (de-blend) distinct sources that have been |
---|
989 | merged. \textbf{Planned.} |
---|
990 | |
---|
991 | \item Link to lists of possible counterparts (\eg via NED/SIMBAD/other |
---|
992 | VO tools?). \textbf{Wish-list.} |
---|
993 | |
---|
994 | \item On-line web service interface, so a user can upload a cube and |
---|
995 | get back a source-list. \textbf{Wish-list}. |
---|
996 | |
---|
997 | \item Embed \duchamp\ in a GUI, to move away from the text-based |
---|
998 | interaction. \textbf{Wish-list}. |
---|
999 | \end{itemize} |
---|
1000 | |
---|
1001 | |
---|
1002 | %\bibliographystyle{mn2e} |
---|
1003 | %\bibliographystyle{abbrvnat} |
---|
1004 | %\bibliography{mnrasmnemonic,sourceDetection} |
---|
1005 | \begin{thebibliography}{} |
---|
1006 | |
---|
1007 | \bibitem[\protect\citeauthoryear{{Calabretta} \& {Greisen}}{{Calabretta} \& |
---|
1008 | {Greisen}}{2002}]{calabretta02} |
---|
1009 | {Calabretta} M., {Greisen} E., 2002, A\&A, 395, 1077 |
---|
1010 | |
---|
1011 | \bibitem[\protect\citeauthoryear{{Greisen} \& {Calabretta}}{{Greisen} \& |
---|
1012 | {Calabretta}}{2002}]{greisen02} |
---|
1013 | {Greisen} E., {Calabretta} M., 2002, A\&A, 395, 1061 |
---|
1014 | |
---|
1015 | \bibitem[\protect\citeauthoryear{{Hanisch}, {Farris}, {Greisen}, {Pence}, |
---|
1016 | {Schlesinger}, {Teuben}, {Thompson} \& {Warnock}}{{Hanisch} |
---|
1017 | et~al.}{2001}]{hanisch01} |
---|
1018 | {Hanisch} R., {Farris} A., {Greisen} E., {Pence} W., {Schlesinger} B., |
---|
1019 | {Teuben} P., {Thompson} R., {Warnock} A., 2001, A\&A, 376, 359 |
---|
1020 | |
---|
1021 | \bibitem[\protect\citeauthoryear{{Hopkins}, {Miller}, {Connolly}, {Genovese}, |
---|
1022 | {Nichol} \& {Wasserman}}{{Hopkins} et~al.}{2002}]{hopkins02} |
---|
1023 | {Hopkins} A., {Miller} C., {Connolly} A., {Genovese} C., {Nichol} R., |
---|
1024 | {Wasserman} L., 2002, AJ, 123, 1086 |
---|
1025 | |
---|
1026 | \bibitem[\protect\citeauthoryear{Lutz}{Lutz}{1980}]{lutz80} |
---|
1027 | Lutz R., 1980, The Computer Journal, 23, 262 |
---|
1028 | |
---|
1029 | \bibitem[\protect\citeauthoryear{{Meyer} et~al.,}{{Meyer} |
---|
1030 | et~al.}{2004}]{meyer04:trunc} |
---|
1031 | {Meyer} M., et~al., 2004, MNRAS, 350, 1195 |
---|
1032 | |
---|
1033 | \bibitem[\protect\citeauthoryear{{Miller}, {Genovese}, {Nichol}, {Wasserman}, |
---|
1034 | {Connolly}, {Reichart}, {Hopkins}, {Schneider} \& {Moore}}{{Miller} |
---|
1035 | et~al.}{2001}]{miller01} |
---|
1036 | {Miller} C., {Genovese} C., {Nichol} R., {Wasserman} L., {Connolly} A., |
---|
1037 | {Reichart} D., {Hopkins} A., {Schneider} J., {Moore} A., 2001, AJ, 122, |
---|
1038 | 3492 |
---|
1039 | |
---|
1040 | \bibitem[\protect\citeauthoryear{Minchin}{Minchin}{1999}]{minchin99} |
---|
1041 | Minchin R., 1999, PASA, 16, 12 |
---|
1042 | |
---|
1043 | \bibitem[\protect\citeauthoryear{Starck \& Murtagh}{Starck \& |
---|
1044 | Murtagh}{2002}]{starck02:book} |
---|
1045 | Starck J.-L., Murtagh F., 2002, {``Astronomical Image and Data Analysis''}. |
---|
1046 | Springer |
---|
1047 | |
---|
1048 | \end{thebibliography} |
---|
1049 | |
---|
1050 | |
---|
1051 | \appendix |
---|
1052 | \newpage |
---|
1053 | \section{Obtaining and installing \duchamp} |
---|
1054 | |
---|
1055 | The \duchamp\ web page can be found at the following location:\\ |
---|
1056 | \href{http://www.atnf.csiro.au/people/Matthew.Whiting/Duchamp}% |
---|
1057 | {http://www.atnf.csiro.au/people/Matthew.Whiting/Duchamp}\\ |
---|
1058 | Here you can find a gzipped tar archive of the source code that can be |
---|
1059 | downloaded and extracted, as well as this User's Guide in postscript |
---|
1060 | and hyperlinked PDF formats. |
---|
1061 | |
---|
1062 | To build \duchamp, you will need three main external libraries: |
---|
1063 | pgplot, cfitsio and wcslib. If these are not present on your system, |
---|
1064 | you can download them from the following locations: |
---|
1065 | \begin{itemize} |
---|
1066 | \item PGPlot: |
---|
1067 | \href{http://www.astro.caltech.edu/~tjp/pgplot/}% |
---|
1068 | {http://www.astro.caltech.edu/~tjp/pgplot/} |
---|
1069 | \item CFITSIO: |
---|
1070 | \href{http://heasarc.gsfc.nasa.gov/docs/software/fitsio/fitsio.html}% |
---|
1071 | {http://heasarc.gsfc.nasa.gov/docs/software/fitsio/fitsio.html} |
---|
1072 | \item WCSLIB: |
---|
1073 | \href{http://www.atnf.csiro.au/people/Mark.Calabretta/WCS/index.html}% |
---|
1074 | {http://www.atnf.csiro.au/people/Mark.Calabretta/WCS/index.html} |
---|
1075 | \end{itemize} |
---|
1076 | |
---|
1077 | \duchamp\ can be built on Unix systems by typing (assuming that the |
---|
1078 | prompt your terminal provides is a \texttt{> } -- don't type this |
---|
1079 | character!): |
---|
1080 | \begin{quote} |
---|
1081 | \texttt{% |
---|
1082 | > ./configure\\ |
---|
1083 | > make\\ |
---|
1084 | > make clean (optional -- to remove the object files)} |
---|
1085 | \end{quote} |
---|
1086 | |
---|
1087 | Run in this manner, \texttt{configure} should find all the necessary |
---|
1088 | libraries, but if some libraries have been installed in non-standard |
---|
1089 | locations, it may fail. In this case, you can specify additional |
---|
1090 | directories to look in by giving extra command-line arguments. There |
---|
1091 | are separate options for library files (eg. libcpgplot.a) and header |
---|
1092 | files (eg. cpgplot.h). |
---|
1093 | |
---|
1094 | For example, if \textsc{wcslib} had been installed in |
---|
1095 | \texttt{/home/mduchamp/wcslib}, there are two libraries that are |
---|
1096 | likely to be in separate subdirectories: \texttt{C/} and |
---|
1097 | \texttt{pgsbox/}. Each subdirectory needs to be searched for library |
---|
1098 | and header files, so one could build Duchamp by typing: |
---|
1099 | \begin{quote} |
---|
1100 | \texttt{% |
---|
1101 | > ./configure $\backslash$ \\ |
---|
1102 | LIBDIRS="/home/mduchamp/wcslib/C /home/mduchamp/wcslib/pgsbox" |
---|
1103 | $\backslash$\\ |
---|
1104 | INCDIRS="/home/mduchamp/wcslib/C /home/mduchamp/wcslib/pgsbox"} |
---|
1105 | \end{quote} |
---|
1106 | And then just run make in the usual fashion: |
---|
1107 | \begin{quote} |
---|
1108 | \texttt{> make} |
---|
1109 | \end{quote} |
---|
1110 | |
---|
1111 | This will produce the executable \texttt{Duchamp}. |
---|
1112 | You can verify that it is running correctly by running the test |
---|
1113 | script: |
---|
1114 | \begin{quote} |
---|
1115 | \texttt{> VerifyDuchamp.sh} |
---|
1116 | \end{quote} |
---|
1117 | This will compile and run a bit of |
---|
1118 | C++ code in the directory \texttt{verification/} that creates a dummy FITS |
---|
1119 | image. It then runs Duchamp on this image with three different sets of |
---|
1120 | inputs, and compares to known results, looking for differences and |
---|
1121 | reporting any. There should be none reported if everything is working |
---|
1122 | correctly. |
---|
1123 | |
---|
1124 | You can then run \duchamp\ on your own data. This can be done in one |
---|
1125 | of two ways. The first is: |
---|
1126 | \begin{quote} |
---|
1127 | \texttt{> Duchamp -f [FITS file]} |
---|
1128 | \end{quote} |
---|
1129 | where \texttt{[FITS file]} is the file you wish to search. This method |
---|
1130 | simply uses the default values of all parameters. |
---|
1131 | |
---|
1132 | The second method allows some determination of the parameter values by |
---|
1133 | the user. Type: |
---|
1134 | \begin{quote} |
---|
1135 | \texttt{> Duchamp -p [parameter file]} |
---|
1136 | \end{quote} |
---|
1137 | where \texttt{[parameterFile]} is a file with the input parameters, |
---|
1138 | including the name of the cube you want to search. There are two |
---|
1139 | example input files included with the distribution. The smaller one, |
---|
1140 | \texttt{InputExample}, shows the typical parameters one might want to |
---|
1141 | set. The large one, \texttt{InputComplete}, lists all possible |
---|
1142 | parameters that can be entered, and a brief description of them. To |
---|
1143 | get going quickly, just replace the "your-file-here" in |
---|
1144 | \texttt{InputExample} with your image name, and type |
---|
1145 | \begin{quote} |
---|
1146 | \texttt{> Duchamp -p InputExample} |
---|
1147 | \end{quote} |
---|
1148 | |
---|
1149 | The following appendices provide details on the individual parameters, |
---|
1150 | and show examples of the output files that \duchamp\ produces. |
---|
1151 | |
---|
1152 | \newpage |
---|
1153 | \section{Available parameters} |
---|
1154 | \label{app-param} |
---|
1155 | |
---|
1156 | The full list of parameters that can be listed in the input file are |
---|
1157 | given here. If not listed, they take the default value given in |
---|
1158 | parentheses. Since the order of the parameters in the input file does |
---|
1159 | not matter, they are grouped here in logical sections. |
---|
1160 | |
---|
1161 | \subsection*{Input-output related} |
---|
1162 | \begin{entry} |
---|
1163 | \item[ImageFile (no default assumed)] The filename of the |
---|
1164 | data cube to be analysed. |
---|
1165 | \item[flagSubsection \texttt{[false]}] A flag to indicate whether one |
---|
1166 | wants a subsection of the requested image. |
---|
1167 | \item[Subsection \texttt{[ [*,*,*] ]}] The requested subsection, which |
---|
1168 | should be specified in the format \texttt{[x1:x2,y1:y2,z1:z2]}, |
---|
1169 | where the limits are inclusive. If the full range of a dimension is |
---|
1170 | required, use a \texttt{*}, \eg if you want the full spectral range |
---|
1171 | of a subsection of the image, use \texttt{[30:140,30:140,*]}. |
---|
1172 | \item[flagReconExists \texttt{[false]}] A flag to indicate whether the |
---|
1173 | reconstructed array has been saved by a previous run of \duchamp. If |
---|
1174 | set true, the reconstructed array will be read from the file given |
---|
1175 | by \texttt{reconFile}, rather than calculated directly. |
---|
1176 | \item[reconFile (no default assumed)] The FITS file that contains the |
---|
1177 | reconstructed array. If \texttt{flagReconExists} is true and this |
---|
1178 | parameter is not defined, the default file searched will be |
---|
1179 | determined by the \atrous\ parameters (see \S\ref{sec-recon}). |
---|
1180 | \item[OutFile \texttt{[duchamp-Results.txt]}] The file containing the |
---|
1181 | final list of detections. This also records the list of input |
---|
1182 | parameters. |
---|
1183 | \item[SpectraFile \texttt{[duchamp-Spectra.ps]}] The postscript file |
---|
1184 | containing the resulting integrated spectra and images of the |
---|
1185 | detections. |
---|
1186 | \item[flagLog \texttt{[true]}] A flag to indicate whether intermediate |
---|
1187 | detections should be logged. |
---|
1188 | \item[LogFile \texttt{[duchamp-Logfile.txt]}] The file in which |
---|
1189 | intermediate detections are logged. These are detections that have |
---|
1190 | not been merged. This is primarily for use in debugging and |
---|
1191 | diagnostic purposes -- normal use of the program will probably not |
---|
1192 | require this. |
---|
1193 | \item[flagOutputRecon \texttt{[false]}] A flag to say whether or not |
---|
1194 | to save the reconstructed cube as a FITS file. The filename will be |
---|
1195 | derived from the ImageFile -- the reconstruction of |
---|
1196 | \texttt{image.fits} will be saved as \texttt{image.RECON?.fits}, |
---|
1197 | where \texttt{?} stands for the value of \texttt{snrRecon} (see |
---|
1198 | below). |
---|
1199 | \item[flagOutputResid \texttt{[false]}] As for |
---|
1200 | \texttt{flagOutputRecon}, but for the residual array -- the |
---|
1201 | difference between the original cube and the reconstructed cube. The |
---|
1202 | filename will be \texttt{image.RESID?.fits}. |
---|
1203 | \item[flagVOT \texttt{[false]}] A flag to say whether to create a |
---|
1204 | VOTable file corresponding to the information in |
---|
1205 | \texttt{outfile}. This will be an XML file in the Virtual |
---|
1206 | Observatory VOTable format. |
---|
1207 | \item[votFile \texttt{[duchamp-Results.xml]}] The VOTable file with |
---|
1208 | the list of final detections. Some input parameters are also |
---|
1209 | recorded. |
---|
1210 | \item[flagKarma \texttt{[false]}] A flag to say whether to create a |
---|
1211 | Karma annotation file corresponding to the information in |
---|
1212 | \texttt{outfile}. This can be used as an overlay for the Karma |
---|
1213 | programs such as \texttt{kvis}. |
---|
1214 | \item[karmaFile \texttt{[duchamp-Results.ann]}] The Karma annotation |
---|
1215 | file showing the list of final detections. |
---|
1216 | \item[flagMaps \texttt{[true]}] A flag to say whether to save |
---|
1217 | postscript files showing the 0th moment map of the whole cube |
---|
1218 | (parameter \texttt{momentMap}) and the detection image |
---|
1219 | (\texttt{detectionMap}). |
---|
1220 | \item[momentMap \texttt{[duchamp-MomentMap.ps]}] A postscript file |
---|
1221 | containing a map of the 0th moment of the detected sources, as well |
---|
1222 | as pixel and WCS coordinates. |
---|
1223 | \item[detectionMap \texttt{[duchamp-DetectionMap.ps]}] A postscript |
---|
1224 | file showing each of the detected objects, coloured in greyscale by |
---|
1225 | the number of channels spanned by each pixel. Also shows pixel and |
---|
1226 | WCS coordinates. |
---|
1227 | \end{entry} |
---|
1228 | |
---|
1229 | \subsection*{Modifying the cube} |
---|
1230 | \begin{entry} |
---|
1231 | \item[flagBlankPix \texttt{[true]}] A flag to say whether to remove |
---|
1232 | BLANK pixels from the analysis -- these are pixels set to some |
---|
1233 | particular value because they fall outside the imaged area. |
---|
1234 | \item[blankPixValue \texttt{[-8.00061]}] The value of the BLANK |
---|
1235 | pixels, if this information is not contained in the FITS header (the |
---|
1236 | usual procedure is to obtain this value from the header information |
---|
1237 | -- in which case the value set by this parameter is ignored). |
---|
1238 | \item[flagMW \texttt{[false]}] A flag to say whether to ignore |
---|
1239 | channels contaminated by Milky Way (or other) emission -- the |
---|
1240 | searching algorithms will not look at these channels. |
---|
1241 | \item[maxMW \texttt{[112]}] The maximum channel number containing |
---|
1242 | ``Milky Way'' emission. |
---|
1243 | \item[minMW \texttt{[75]}] The minimum channel number containing |
---|
1244 | ``Milky Way'' emission. Note that the range specified by |
---|
1245 | \texttt{maxMW} and \texttt{minMW} is inclusive. |
---|
1246 | \item[flagBaseline \texttt{[false]}] A flag to say whether to remove |
---|
1247 | the baseline from each spectrum in the cube for the purposes of |
---|
1248 | reconstruction and detection. |
---|
1249 | \end{entry} |
---|
1250 | |
---|
1251 | \subsection*{Detection related} |
---|
1252 | |
---|
1253 | \subsubsection*{General detection} |
---|
1254 | \begin{entry} |
---|
1255 | \item[flagNegative \texttt{[false]}] A flag to indicate that the |
---|
1256 | features being searched for are negative. The cube will be inverted |
---|
1257 | prior to searching. |
---|
1258 | \item[snrCut \texttt{[3.]}] The cut-off value for thresholding, in |
---|
1259 | terms of number of $\sigma$ above the mean. |
---|
1260 | \item[flagGrowth \texttt{[false]}] A flag indicating whether or not to |
---|
1261 | grow the detected objects to a smaller threshold. |
---|
1262 | \item[growthCut \texttt{[2.]}] The smaller threshold using in growing |
---|
1263 | detections. In units of $\sigma$ above the mean. |
---|
1264 | \end{entry} |
---|
1265 | |
---|
1266 | \subsubsection*{\Atrous\ reconstruction} |
---|
1267 | \begin{entry} |
---|
1268 | \item [flagATrous \texttt{[true]}] A flag indicating whether or not to |
---|
1269 | reconstruct the cube using the \atrous\ wavelet |
---|
1270 | reconstruction. See \S\ref{sec-recon} for details. |
---|
1271 | \item[reconDim \texttt{[3]}] The number of dimensions to use in the |
---|
1272 | reconstruction. 1 means reconstruct each spectrum separately, 2 |
---|
1273 | means each channel map is done separately, and 3 means do the whole |
---|
1274 | cube in one go. |
---|
1275 | \item[scaleMin \texttt{[1]}] The minimum wavelet scale to be used in the |
---|
1276 | reconstruction. A value of 1 means ``use all scales''. |
---|
1277 | \item[snrRecon \texttt{[4]}] The thresholding cutoff used in the |
---|
1278 | reconstruction -- only wavelet coefficients this many $\sigma$ above |
---|
1279 | the mean (or greater) are included in the reconstruction. |
---|
1280 | \item[filterCode \texttt{[1]}] The code number of the filter to use in |
---|
1281 | the reconstruction. The options are: |
---|
1282 | \begin{itemize} |
---|
1283 | \item \textbf{1:} B$_3$-spline filter: coefficients = |
---|
1284 | $(\frac{1}{16}, \frac{1}{4}, \frac{3}{8}, \frac{1}{4}, \frac{1}{16})$ |
---|
1285 | \item \textbf{2:} Triangle filter: coefficients = |
---|
1286 | $(\frac{1}{4}, \frac{1}{2}, \frac{1}{4})$ |
---|
1287 | \item \textbf{3:} Haar wavelet: coefficients = |
---|
1288 | $(0, \frac{1}{2}, \frac{1}{2})$ |
---|
1289 | \end{itemize} |
---|
1290 | \end{entry} |
---|
1291 | |
---|
1292 | \subsubsection*{FDR method} |
---|
1293 | \begin{entry} |
---|
1294 | \item[flagFDR \texttt{[false]}] A flag indicating whether or not to use |
---|
1295 | the False Discovery Rate method in thresholding the pixels. |
---|
1296 | \item[alphaFDR \texttt{[0.01]}] The $\alpha$ parameter used in the FDR |
---|
1297 | analysis. The average number of false detections, as a fraction of the |
---|
1298 | total number, will be less than $\alpha$ (see \S\ref{sec-detection}). |
---|
1299 | \end{entry} |
---|
1300 | |
---|
1301 | \subsubsection*{Merging detections} |
---|
1302 | \begin{entry} |
---|
1303 | \item[minPix \texttt{[2]}] The minimum number of spatial pixels for a |
---|
1304 | single detection to be counted. |
---|
1305 | \item[minChannels \texttt{[3]}] The minimum number of consecutive |
---|
1306 | channels that must be present in a detection. |
---|
1307 | \item[flagAdjacent \texttt{[true]}] A flag indicating whether to use |
---|
1308 | the ``adjacent pixel'' criterion to decide whether to merge |
---|
1309 | objects. If not, the next two parameters are used to determine |
---|
1310 | whether objects are within the necessary thresholds. |
---|
1311 | \item[threshSpatial \texttt{[3.]}] The maximum allowed minimum spatial |
---|
1312 | separation (in pixels) between two detections for them to be merged |
---|
1313 | into one. Only used if \texttt{flagAdjacent = false}. |
---|
1314 | \item[threshVelocity \texttt{[7.]}] The maximum allowed minimum channel |
---|
1315 | separation between two detections for them to be merged into |
---|
1316 | one. |
---|
1317 | \end{entry} |
---|
1318 | |
---|
1319 | \subsubsection*{Other parameters} |
---|
1320 | \begin{entry} |
---|
1321 | \item[spectralMethod \texttt{[peak]}] This indicates which method is used |
---|
1322 | to plot the output spectra: \texttt{peak} means plot the spectrum |
---|
1323 | containing the detection's peak pixel; \texttt{sum} means sum the |
---|
1324 | spectra of each detected spatial pixel, and correct for the beam |
---|
1325 | size. Any other choice defaults to \texttt{peak}. |
---|
1326 | \item[spectralUnits \texttt{[km/s]}] The user can specify the units of |
---|
1327 | the spectral axis. Assuming the WCS of the FITS file is valid, the |
---|
1328 | spectral axis is transformed into velocity, and put into these units |
---|
1329 | for all output and for calculations such as the integrated flux of a |
---|
1330 | detection. |
---|
1331 | \item[drawBorders \texttt{[true]}] A flag indicating whether borders |
---|
1332 | are to be drawn around the detected objects in the moment maps |
---|
1333 | included in the output (see for example Fig.~\ref{fig-spect}). |
---|
1334 | \item[drawBlankEdges \texttt{[true]}] A flag indicating whether to |
---|
1335 | draw the dividing line between BLANK and non-BLANK pixels on the |
---|
1336 | 2-dimensional images (see for example Fig.~\ref{fig-moment}). |
---|
1337 | \item[verbose \texttt{[true]}] A flag indicating whether to print the |
---|
1338 | progress of computationally-intensive algorithms (such as the |
---|
1339 | searching and merging) to screen. |
---|
1340 | \end{entry} |
---|
1341 | |
---|
1342 | |
---|
1343 | \newpage |
---|
1344 | \section{Example parameter files} |
---|
1345 | \label{app-input} |
---|
1346 | |
---|
1347 | This is what a typical parameter file would look like. |
---|
1348 | |
---|
1349 | \begin{verbatim} |
---|
1350 | imageFile /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits |
---|
1351 | logFile logfile.txt |
---|
1352 | outFile results.txt |
---|
1353 | spectraFile spectra.ps |
---|
1354 | flagSubsection false |
---|
1355 | flagOutputRecon false |
---|
1356 | flagOutputResid 0 |
---|
1357 | flagBlankPix 1 |
---|
1358 | flagMW 1 |
---|
1359 | minMW 75 |
---|
1360 | maxMW 112 |
---|
1361 | minPix 3 |
---|
1362 | flagGrowth 1 |
---|
1363 | growthCut 1.5 |
---|
1364 | flagATrous 0 |
---|
1365 | scaleMin 1 |
---|
1366 | snrRecon 4 |
---|
1367 | flagFDR 1 |
---|
1368 | alphaFDR 0.1 |
---|
1369 | numPixPSF 20 |
---|
1370 | snrCut 3 |
---|
1371 | threshSpatial 3 |
---|
1372 | threshVelocity 7 |
---|
1373 | \end{verbatim} |
---|
1374 | |
---|
1375 | Note that, as in this example, the flag parameters can be entered as |
---|
1376 | strings (true/false) or integers (1/0). Also, note that it is not |
---|
1377 | necessary to include all these parameters in the file, only those that |
---|
1378 | need to be changed from the defaults (as listed in |
---|
1379 | Appendix~\ref{app-param}), which in this case would be very few. A |
---|
1380 | minimal parameter file might look like: |
---|
1381 | \begin{verbatim} |
---|
1382 | imageFile /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits |
---|
1383 | flagLog false |
---|
1384 | snrRecon 3 |
---|
1385 | snrCut 2.5 |
---|
1386 | minChannels 4 |
---|
1387 | \end{verbatim} |
---|
1388 | This will reconstruct the cube with a lower SNR value than the |
---|
1389 | default, select objects at a lower threshold, with a looser minimum |
---|
1390 | channel requirement, and not keep a log of the intermediate |
---|
1391 | detections. |
---|
1392 | |
---|
1393 | The following page demonstrates how the parameters are presented to |
---|
1394 | the user, both on the screen at execution time, and in the output and |
---|
1395 | log files. On each line, there is a description on the parameter, the |
---|
1396 | relevant parameter name that is used in the input file (if there is |
---|
1397 | one that the user can enter), and the value of the parameter being |
---|
1398 | used. |
---|
1399 | \newpage |
---|
1400 | \begin{landscape} |
---|
1401 | Typical presentation of parameters in output and log files: |
---|
1402 | \begin{verbatim} |
---|
1403 | ---- Parameters ---- |
---|
1404 | Image to be analysed.........................[imageFile] = input.fits |
---|
1405 | Intermediate Logfile...........................[logFile] = duchamp-Logfile.txt |
---|
1406 | Final Results file.............................[outFile] = duchamp-Results.txt |
---|
1407 | Spectrum file..............................[spectraFile] = duchamp-Spectra.ps |
---|
1408 | 0th Moment Map...............................[momentMap] = duchamp-MomentMap.ps |
---|
1409 | Detection Map.............................[detectionMap] = duchamp-DetectionMap.ps |
---|
1410 | Saving reconstructed cube?.............[flagoutputrecon] = false |
---|
1411 | Saving residuals from reconstruction?..[flagoutputresid] = false |
---|
1412 | ------ |
---|
1413 | Searching for Negative features?..........[flagNegative] = false |
---|
1414 | Fixing Blank Pixels?......................[flagBlankPix] = true |
---|
1415 | Blank Pixel Value....................................... = -8.00061 |
---|
1416 | Removing Milky Way channels?....................[flagMW] = true |
---|
1417 | Milky Way Channels.......................[minMW - maxMW] = 75-112 |
---|
1418 | Beam Size (pixels)...................................... = 10.1788 |
---|
1419 | Removing baselines before search?.........[flagBaseline] = false |
---|
1420 | Minimum # Pixels in a detection.................[minPix] = 2 |
---|
1421 | Minimum # Channels in a detection..........[minChannels] = 3 |
---|
1422 | Growing objects after detection?............[flagGrowth] = false |
---|
1423 | Using A Trous reconstruction?...............[flagATrous] = true |
---|
1424 | Number of dimensions in reconstruction........[reconDim] = 3 |
---|
1425 | Minimum scale in reconstruction...............[scaleMin] = 1 |
---|
1426 | SNR Threshold within reconstruction...........[snrRecon] = 4 |
---|
1427 | Filter being used for reconstruction........[filterCode] = 1 (B3 spline function) |
---|
1428 | Using FDR analysis?............................[flagFDR] = false |
---|
1429 | SNR Threshold...................................[snrCut] = 3 |
---|
1430 | Using Adjacent-pixel criterion?...........[flagAdjacent] = true |
---|
1431 | Max. velocity separation for merging....[threshVelocity] = 7 |
---|
1432 | Method of spectral plotting.............[spectralMethod] = peak |
---|
1433 | \end{verbatim} |
---|
1434 | |
---|
1435 | \newpage |
---|
1436 | \section{Example results file} |
---|
1437 | \label{app-output} |
---|
1438 | This the typical content of an output file, after running \duchamp\ |
---|
1439 | with the parameters illustrated on the previous page. |
---|
1440 | |
---|
1441 | {\scriptsize |
---|
1442 | \begin{verbatim} |
---|
1443 | Results of the \duchamp\ source finder: Tue May 23 14:51:38 2006 |
---|
1444 | ---- Parameters ---- |
---|
1445 | (... omitted for clarity -- see previous page for examples...) |
---|
1446 | -------------------- |
---|
1447 | Total number of detections = 25 |
---|
1448 | -------------------- |
---|
1449 | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
---|
1450 | Obj# Name X Y Z RA DEC VEL w_RA w_DEC w_VEL F_int F_peak X1 X2 Y1 Y2 Z1 Z2 Npix Flag |
---|
1451 | [km/s] [arcmin] [arcmin] [km/s] [Jy km/s] [Jy/beam] [pix] |
---|
1452 | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
---|
1453 | 1 J0618-2532 30.2 86.0 113.3 06:18:12.54 -25:32:44.79 208.502 45.17 34.61 26.383 24.394 0.350 25 35 82 90 112 114 137 E |
---|
1454 | 2 J0609-2156 59.5 140.6 114.6 06:09:19.66 -21:56:31.20 225.572 44.39 31.47 65.957 16.128 0.213 55 65 137 144 113 118 153 |
---|
1455 | 3 J0545-2143 141.2 143.2 114.8 05:45:51.71 -21:43:36.20 228.470 19.61 16.66 26.383 2.412 0.090 139 143 142 145 114 116 29 |
---|
1456 | 4 J0617-2633 33.3 70.8 115.6 06:17:25.52 -26:33:33.83 238.736 65.02 30.10 26.383 9.776 0.117 26 41 68 75 115 117 104 E |
---|
1457 | 5 J0601-2500 86.2 94.9 117.9 06:01:39.54 -25:00:32.46 269.419 27.99 24.02 26.383 3.920 0.124 83 89 92 97 117 119 44 |
---|
1458 | 6 J0602-2547 84.0 83.1 118.0 06:02:18.29 -25:47:31.69 270.319 20.01 19.99 26.383 2.999 0.118 82 86 81 85 117 119 34 |
---|
1459 | 7 J0547-2448 133.0 97.2 118.7 05:47:52.53 -24:48:38.16 279.113 19.72 12.54 26.383 1.474 0.074 131 135 96 98 118 120 21 |
---|
1460 | 8 J0606-2719 71.1 60.0 121.3 06:06:10.99 -27:19:48.61 314.090 52.36 39.59 39.574 14.268 0.150 65 77 55 64 120 123 154 |
---|
1461 | 9 J0611-2137 52.4 145.3 162.5 06:11:20.92 -21:37:29.57 857.955 32.39 23.49 118.722 43.178 0.410 49 56 142 147 158 167 265 E |
---|
1462 | 10 J0600-2859 89.7 35.3 202.4 06:00:34.08 -28:59:00.43 1383.160 23.93 24.10 171.487 24.439 0.173 87 92 33 38 196 209 271 |
---|
1463 | 11 J0558-2638 95.4 70.3 223.1 05:58:53.03 -26:38:45.91 1656.140 11.93 12.07 92.339 1.045 0.063 94 96 69 71 220 227 18 |
---|
1464 | 12 J0617-2723 34.7 58.3 227.4 06:17:07.07 -27:23:50.65 1712.868 16.75 23.53 290.209 8.529 0.093 33 36 56 61 215 237 118 |
---|
1465 | 13 J0558-2525 95.8 88.6 231.7 05:58:49.27 -25:25:33.60 1770.134 27.87 24.16 237.444 12.863 0.115 92 98 86 91 221 239 175 |
---|
1466 | 14 J0600-2141 88.8 144.4 232.5 06:00:54.02 -21:41:57.06 1780.188 27.96 24.13 224.252 30.743 0.166 86 92 142 147 222 239 344 E |
---|
1467 | 15 J0615-2634 40.0 70.8 232.6 06:15:25.50 -26:34:20.04 1782.214 12.44 15.69 52.765 2.084 0.068 39 41 69 72 231 235 31 |
---|
1468 | 16 J0604-2606 76.0 78.4 233.0 06:04:41.13 -26:06:21.19 1787.226 24.13 23.87 211.061 23.563 0.155 73 78 76 81 225 241 278 |
---|
1469 | 17 J0601-2340 87.9 114.9 235.8 06:01:08.83 -23:40:19.37 1824.122 31.95 28.09 237.444 82.380 0.297 85 92 112 118 227 245 647 |
---|
1470 | 18 J0615-2235 38.2 130.5 254.5 06:15:32.09 -22:35:37.24 2070.934 12.29 11.70 105.531 1.555 0.070 37 39 129 131 249 257 24 |
---|
1471 | 19 J0617-2305 31.4 122.8 258.1 06:17:33.45 -23:05:28.94 2118.752 12.34 11.65 26.383 1.022 0.062 30 32 122 124 257 259 16 |
---|
1472 | 20 J0612-2149 49.6 142.2 270.3 06:12:11.04 -21:49:29.72 2279.926 16.27 15.73 395.740 15.156 0.101 48 51 141 144 257 287 204 |
---|
1473 | 21 J0616-2133 35.3 146.0 300.6 06:16:15.78 -21:33:09.69 2679.148 20.22 7.47 224.252 3.014 0.127 33 37 145 146 294 311 28 E |
---|
1474 | 22 J0555-2956 107.3 20.9 367.6 05:55:08.02 -29:56:09.08 3562.236 19.71 20.30 39.574 5.891 0.169 105 109 19 23 366 369 58 |
---|
1475 | 23 J0557-2246 99.8 128.2 434.0 05:57:43.77 -22:46:42.95 4438.776 11.88 16.12 105.531 1.703 0.167 99 101 127 130 430 438 17 N |
---|
1476 | 24 J0616-2648 38.1 67.2 546.8 06:16:02.10 -26:48:35.49 5926.464 12.35 11.67 26.383 1.276 0.064 37 39 66 68 546 548 18 |
---|
1477 | 25 J0552-2916 117.0 30.5 727.0 05:52:13.64 -29:16:58.02 8303.952 11.59 20.25 303.400 35.523 0.479 116 118 28 32 716 739 111 |
---|
1478 | \end{verbatim} |
---|
1479 | } |
---|
1480 | Note that the |
---|
1481 | width of the table can make it hard to read. A good trick for those |
---|
1482 | using UNIX/Linux is to make use of the \texttt{a2ps} command. The |
---|
1483 | following works well, producing a postscript file \texttt{results.ps}: |
---|
1484 | \\\verb|a2ps -1 -r -f8 -o duchamp-Results.ps duchamp-Results.txt| |
---|
1485 | |
---|
1486 | %\end{landscape} |
---|
1487 | |
---|
1488 | \newpage |
---|
1489 | \section{Example VOTable output} |
---|
1490 | \label{app-votable} |
---|
1491 | This is part of the VOTable, in XML format, corresponding to the |
---|
1492 | output file in Appendix~\ref{app-output} (the indentation has been |
---|
1493 | removed to make it fit on the page). |
---|
1494 | |
---|
1495 | %\begin{landscape} |
---|
1496 | {\scriptsize |
---|
1497 | \begin{verbatim} |
---|
1498 | <?xml version="1.0"?> |
---|
1499 | <VOTABLE version="1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" |
---|
1500 | xsi:noNamespaceSchemaLocation="http://www.ivoa.net/xml/VOTable/VOTable/v1.1"> |
---|
1501 | <COOSYS ID="J2000" equinox="J2000." epoch="J2000." system="eq_FK5"/> |
---|
1502 | <RESOURCE name="Duchamp Output"> |
---|
1503 | <TABLE name="Detections"> |
---|
1504 | <DESCRIPTION>Detected sources and parameters from running the Duchamp source finder.</DESCRIPTION> |
---|
1505 | <PARAM name="FITS file" datatype="char" ucd="meta.file;meta.fits" value="/DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits"/> |
---|
1506 | <PARAM name="Threshold" datatype="float" ucd="stat.snr" value="2.5"> |
---|
1507 | <PARAM name="ATrous note" datatype="char" ucd="meta.note" value="The a trous reconstruction method was used, with the following parameters."> |
---|
1508 | <PARAM name="ATrous Dimension" datatype="int" ucd="meta.code;stat" value="3"> |
---|
1509 | <PARAM name="ATrous Cut" datatype="float" ucd="stat.snr" value="4"> |
---|
1510 | <PARAM name="ATrous Minimum Scale" datatype="int" ucd="stat.param" value="1"> |
---|
1511 | <PARAM name="ATrous Filter" datatype="char" ucd="meta.code;stat" value="B3 spline function"> |
---|
1512 | <FIELD name="ID" ID="col1" ucd="meta.id" datatype="int" width="4"/> |
---|
1513 | <FIELD name="Name" ID="col2" ucd="meta.id;meta.main" datatype="char" arraysize="14"/> |
---|
1514 | <FIELD name="RA" ID="col3" ucd="pos.eq.ra;meta.main" ref="J2000" datatype="float" width="10" precision="6" unit="deg"/> |
---|
1515 | <FIELD name="Dec" ID="col4" ucd="pos.eq.dec;meta.main" ref="J2000" datatype="float" width="10" precision="6" unit="deg"/> |
---|
1516 | <FIELD name="w_RA" ID="col3" ucd="phys.angSize;pos.eq.ra" ref="J2000" datatype="float" width="7" precision="2" unit="arcmin"/> |
---|
1517 | <FIELD name="w_Dec" ID="col4" ucd="phys.angSize;pos.eq.dec" ref="J2000" datatype="float" width="7" precision="2" unit="arcmin"/> |
---|
1518 | <FIELD name="Vel" ID="col4" ucd="phys.veloc;src.dopplerVeloc" datatype="float" width="9" precision="3" unit="km/s"/> |
---|
1519 | <FIELD name="w_Vel" ID="col4" ucd="phys.veloc;src.dopplerVeloc;spect.line.width" datatype="float" width="8" precision="3" unit="km/s"/> |
---|
1520 | <FIELD name="Integrated_Flux" ID="col4" ucd="phys.flux;spect.line.intensity" datatype="float" width="10" precision="3" unit="km/s"/> |
---|
1521 | <DATA> |
---|
1522 | <TABLEDATA> |
---|
1523 | <TR> |
---|
1524 | <TD> 1</TD><TD> J0609-2200</TD><TD> 92.410416</TD><TD>-22.013390</TD><TD> 48.50</TD><TD> 39.42</TD><TD> 213.061</TD><TD> 65.957</TD><TD> 17.572</TD> |
---|
1525 | </TR> |
---|
1526 | <TR> |
---|
1527 | <TD> 2</TD><TD> J0608-2605</TD><TD> 92.042633</TD><TD>-26.085157</TD><TD> 44.47</TD><TD> 39.47</TD><TD> 233.119</TD><TD> 39.574</TD><TD> 4.144</TD> |
---|
1528 | </TR> |
---|
1529 | <TR> |
---|
1530 | <TD> 3</TD><TD> J0606-2724</TD><TD> 91.637840</TD><TD>-27.412022</TD><TD> 52.48</TD><TD> 47.57</TD><TD> 302.213</TD><TD> 39.574</TD><TD> 17.066</TD> |
---|
1531 | </TR> |
---|
1532 | (... table truncated for clarity ...) |
---|
1533 | </TABLEDATA> |
---|
1534 | </DATA> |
---|
1535 | </TABLE> |
---|
1536 | </RESOURCE> |
---|
1537 | </VOTABLE> |
---|
1538 | \end{verbatim} |
---|
1539 | } |
---|
1540 | \end{landscape} |
---|
1541 | |
---|
1542 | \newpage |
---|
1543 | \section{Example Karma Annotation file output} |
---|
1544 | \label{app-karma} |
---|
1545 | |
---|
1546 | This is the format of the Karma Annotation file, showing the locations |
---|
1547 | of the detected objects. This can be loaded by the plotting tools of |
---|
1548 | the Karma package (for instance, \texttt{kvis}) as an overlay on the FITS |
---|
1549 | file. |
---|
1550 | |
---|
1551 | \begin{verbatim} |
---|
1552 | # Duchamp Source Finder results for |
---|
1553 | # cube /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits |
---|
1554 | COLOR RED |
---|
1555 | COORD W |
---|
1556 | CIRCLE 92.3376 -21.9475 0.403992 |
---|
1557 | TEXT 92.3376 -21.9475 1 |
---|
1558 | CIRCLE 91.9676 -26.0193 0.37034 |
---|
1559 | TEXT 91.9676 -26.0193 2 |
---|
1560 | CIRCLE 91.5621 -27.3459 0.437109 |
---|
1561 | TEXT 91.5621 -27.3459 3 |
---|
1562 | CIRCLE 92.8285 -21.6344 0.269914 |
---|
1563 | TEXT 92.8285 -21.6344 4 |
---|
1564 | CIRCLE 90.1381 -28.9838 0.234179 |
---|
1565 | TEXT 90.1381 -28.9838 5 |
---|
1566 | CIRCLE 89.72 -26.6513 0.132743 |
---|
1567 | TEXT 89.72 -26.6513 6 |
---|
1568 | CIRCLE 94.2743 -27.4003 0.195175 |
---|
1569 | TEXT 94.2743 -27.4003 7 |
---|
1570 | CIRCLE 92.2739 -21.6941 0.134538 |
---|
1571 | TEXT 92.2739 -21.6941 8 |
---|
1572 | CIRCLE 89.7133 -25.4259 0.232252 |
---|
1573 | TEXT 89.7133 -25.4259 9 |
---|
1574 | CIRCLE 90.2206 -21.6993 0.266247 |
---|
1575 | TEXT 90.2206 -21.6993 10 |
---|
1576 | CIRCLE 93.8581 -26.5766 0.163153 |
---|
1577 | TEXT 93.8581 -26.5766 11 |
---|
1578 | CIRCLE 91.176 -26.1064 0.234356 |
---|
1579 | TEXT 91.176 -26.1064 12 |
---|
1580 | CIRCLE 90.2844 -23.6716 0.299509 |
---|
1581 | TEXT 90.2844 -23.6716 13 |
---|
1582 | CIRCLE 93.8774 -22.581 0.130925 |
---|
1583 | TEXT 93.8774 -22.581 14 |
---|
1584 | CIRCLE 94.3882 -23.0934 0.137108 |
---|
1585 | TEXT 94.3882 -23.0934 15 |
---|
1586 | CIRCLE 93.0491 -21.8223 0.202928 |
---|
1587 | TEXT 93.0491 -21.8223 16 |
---|
1588 | CIRCLE 94.0685 -21.5603 0.168456 |
---|
1589 | TEXT 94.0685 -21.5603 17 |
---|
1590 | CIRCLE 86.0568 -27.6095 0.101113 |
---|
1591 | TEXT 86.0568 -27.6095 18 |
---|
1592 | CIRCLE 88.7932 -29.9453 0.202624 |
---|
1593 | TEXT 88.7932 -29.9453 19 |
---|
1594 | \end{verbatim} |
---|
1595 | |
---|
1596 | \newpage |
---|
1597 | \section{Robust statistics for a Normal distribution} |
---|
1598 | \label{app-madfm} |
---|
1599 | |
---|
1600 | The Normal, or Gaussian, distribution for mean $\mu$ and standard |
---|
1601 | deviation $\sigma$ can be written as |
---|
1602 | \[ |
---|
1603 | f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\ e^{-(x-\mu)^2/2\sigma^2}. |
---|
1604 | \] |
---|
1605 | |
---|
1606 | When one has a purely Gaussian signal, it is straightforward to |
---|
1607 | estimate $\sigma$ by calculating the standard deviation (or rms) of |
---|
1608 | the data. However, if there is a small amount of signal present on top |
---|
1609 | of Gaussian noise, and one wants to estimate the $\sigma$ for the |
---|
1610 | noise, the presence of the large values from the signal can bias the |
---|
1611 | estimator to higher values. |
---|
1612 | |
---|
1613 | An alternative way is to use the median ($m$) and median absolute |
---|
1614 | deviation from the median ($s$) to estimate $\mu$ and $\sigma$. The |
---|
1615 | median is the middle of the distribution, defined for a continuous |
---|
1616 | distribution by |
---|
1617 | \[ |
---|
1618 | \int_{-\infty}^{m} f(x) \diff x = \int_{m}^{\infty} f(x) \diff x. |
---|
1619 | \] |
---|
1620 | From symmetry, we quickly see that for the continuous Normal |
---|
1621 | distribution, $m=\mu$. We consider the case henceforth of $\mu=0$, |
---|
1622 | without loss of generality. |
---|
1623 | |
---|
1624 | To find $s$, we find the distribution of the absolute deviation from |
---|
1625 | the median, and then find the median of that distribution. This |
---|
1626 | distribution is given by |
---|
1627 | \begin{eqnarray*} |
---|
1628 | g(x) &= &{\mbox{\rm distribution of }} |x|\\ |
---|
1629 | &= &f(x) + f(-x),\ x\ge0\\ |
---|
1630 | &= &\sqrt{\frac{2}{\pi\sigma^2}}\, e^{-x^2/2\sigma^2},\ x\ge0. |
---|
1631 | \end{eqnarray*} |
---|
1632 | So, the median absolute deviation from the median, $s$, is given by |
---|
1633 | \[ |
---|
1634 | \int_{0}^{s} g(x) \diff x = \int_{s}^{\infty} g(x) \diff x. |
---|
1635 | \] |
---|
1636 | Now, $\int_{0}^{\infty}e^{-x^2/2\sigma^2} \diff x = \sqrt{\pi\sigma^2/2}$, and |
---|
1637 | so $\int_{s}^{\infty} e^{-x^2/2\sigma^2} \diff x = |
---|
1638 | \sqrt{\pi\sigma^2/2} - \int_{0}^{s} e^{-\frac{x^2}{2\sigma^2}} \diff x |
---|
1639 | $. Hence, to find $s$ we simply solve the following equation (setting $\sigma=1$ for |
---|
1640 | simplicity -- equivalent to stating $x$ and $s$ in units of $\sigma$): |
---|
1641 | \[ |
---|
1642 | \int_{0}^{s}e^{-x^2/2} \diff x - \sqrt{\pi/8} = 0. |
---|
1643 | \] |
---|
1644 | This is hard to solve analytically (no nice analytic solution exists |
---|
1645 | for the finite integral that I'm aware of), but straightforward to |
---|
1646 | solve numerically, yielding the value of $s=0.6744888$. Thus, to |
---|
1647 | estimate $\sigma$ for a Normally distributed data set, one can |
---|
1648 | calculate $s$, then divide by 0.6744888 (or multiply by 1.4826042) to |
---|
1649 | obtain the correct estimator. |
---|
1650 | |
---|
1651 | Note that this is different to solutions quoted elsewhere, |
---|
1652 | specifically in \citet{meyer04:trunc}, where the same robust estimator |
---|
1653 | is used but with an incorrect conversion to standard deviation -- they |
---|
1654 | assume $\sigma = s\sqrt{\pi/2}$. This, in fact, is the conversion used |
---|
1655 | to convert the \emph{mean} absolute deviation from the mean to the |
---|
1656 | standard deviation. This means that the cube noise in the \hipass\ |
---|
1657 | catalogue (their parameter Rms$_{\rm cube}$) should be 18\% larger |
---|
1658 | than quoted. |
---|
1659 | |
---|
1660 | \section{How Gaussian noise changes with wavelet scale} |
---|
1661 | \label{app-scaling} |
---|
1662 | |
---|
1663 | The key element in the wavelet reconstruction of an array is the |
---|
1664 | thresholding of the individual wavelet coefficient arrays. This is |
---|
1665 | usually done by choosing a level to be some number of standard |
---|
1666 | deviations above the mean value. |
---|
1667 | |
---|
1668 | However, since the wavelet arrays are produced by convolving the input |
---|
1669 | array by an increasingly large filter, the pixels in the coefficient |
---|
1670 | arrays become increasingly correlated as the scale of the filter |
---|
1671 | increases. This results in the measured standard deviation from a |
---|
1672 | given coefficient array decreasing with increasing scale. To calculate |
---|
1673 | this, we need to take into account how many other pixels each pixel in |
---|
1674 | the convolved array depends on. |
---|
1675 | |
---|
1676 | To demonstrate, suppose we have a 1-D array with $N$ pixel values |
---|
1677 | given by $F_i,\ i=1,...,N$, and we convolve it with the B$_3$-spline |
---|
1678 | filter, defined by the set of coefficients |
---|
1679 | $\{1/16,1/4,3/8,1/4,1/16\}$. The flux of the $i$th pixel in the |
---|
1680 | convolved array will be |
---|
1681 | \[ |
---|
1682 | F'_i = \frac{1}{16}F_{i-2} + \frac{1}{4}F_{i-1} + \frac{3}{8}F_{i} |
---|
1683 | + \frac{1}{4}F_{i+1} + \frac{1}{16}F_{i+2} |
---|
1684 | \] |
---|
1685 | and the flux of the corresponding pixel in the wavelet array will be |
---|
1686 | \[ |
---|
1687 | W'_i = F_i - F'_i = \frac{-1}{16}F_{i-2} - \frac{1}{4}F_{i-1} |
---|
1688 | + \frac{5}{8}F_{i} - \frac{1}{4}F_{i+1} - \frac{1}{16}F_{i+2} |
---|
1689 | \] |
---|
1690 | Now, assuming each pixel has the same standard deviation |
---|
1691 | $\sigma_i=\sigma$, we can work out the standard deviation for the |
---|
1692 | wavelet array: |
---|
1693 | \[ |
---|
1694 | \sigma'_i = \sigma \sqrt{\left(\frac{1}{16}\right)^2 |
---|
1695 | + \left(\frac{1}{4}\right)^2 + \left(\frac{5}{8}\right)^2 |
---|
1696 | + \left(\frac{1}{4}\right)^2 + \left(\frac{1}{16}\right)^2} |
---|
1697 | = 0.72349\ \sigma |
---|
1698 | \] |
---|
1699 | Thus, the first scale wavelet coefficient array will have a standard |
---|
1700 | deviation of 72.3\% of the input array. This procedure can be followed |
---|
1701 | to calculate the necessary values for all scales, dimensions and |
---|
1702 | filters used by \duchamp. |
---|
1703 | |
---|
1704 | Calculating these values is clearly a critical step in performing the |
---|
1705 | reconstruction. \citet{starck02:book} did so by simulating data sets |
---|
1706 | with Gaussian noise, taking the wavelet transform, and measuring the |
---|
1707 | value of $\sigma$ for each scale. We take a different approach, by |
---|
1708 | calculating the scaling factors directly from the filter coefficients |
---|
1709 | by taking the wavelet transform of an array made up of a 1 in the |
---|
1710 | central pixel and 0s everywhere else. The scaling value is then |
---|
1711 | derived by taking the square root of the sum (in quadrature) of all |
---|
1712 | the wavelet coefficient values at each scale. We give the scaling |
---|
1713 | factors for the three filters available to \duchamp\ on the following |
---|
1714 | page. These values are hard-coded into \duchamp, so no on-the-fly |
---|
1715 | calculation of them is necessary. |
---|
1716 | |
---|
1717 | Memory limitations prevent us from calculating factors for large |
---|
1718 | scales, particularly for the three-dimensional case (hence the -- |
---|
1719 | symbols in the tables). To calculate factors for higher scales than |
---|
1720 | those available, we note the following relationships apply for large |
---|
1721 | scales to a sufficient level of precision: |
---|
1722 | \begin{itemize} |
---|
1723 | \item 1-D: factor(scale $i$) = factor(scale $i-1$)$/\sqrt{2}$. |
---|
1724 | \item 2-D: factor(scale $i$) = factor(scale $i-1$)$/2$. |
---|
1725 | \item 1-D: factor(scale $i$) = factor(scale $i-1$)$/\sqrt{8}$. |
---|
1726 | \end{itemize} |
---|
1727 | |
---|
1728 | \newpage |
---|
1729 | \begin{itemize} |
---|
1730 | \item \textbf{B$_3$-Spline Function:} $\{1/16,1/4,3/8,1/4,1/16\}$ |
---|
1731 | |
---|
1732 | \begin{tabular}{llll} |
---|
1733 | Scale & 1 dimension & 2 dimension & 3 dimension\\ \hline |
---|
1734 | 1 & 0.723489806 & 0.890796310 & 0.956543592\\ |
---|
1735 | 2 & 0.285450405 & 0.200663851 & 0.120336499\\ |
---|
1736 | 3 & 0.177947535 & 0.0855075048 & 0.0349500154\\ |
---|
1737 | 4 & 0.122223156 & 0.0412474444 & 0.0118164242\\ |
---|
1738 | 5 & 0.0858113122 & 0.0204249666 & 0.00413233507\\ |
---|
1739 | 6 & 0.0605703043 & 0.0101897592 & 0.00145703714\\ |
---|
1740 | 7 & 0.0428107206 & 0.00509204670 & 0.000514791120\\ |
---|
1741 | 8 & 0.0302684024 & 0.00254566946 & --\\ |
---|
1742 | 9 & 0.0214024008 & 0.00127279050 & --\\ |
---|
1743 | 10 & 0.0151336781 & 0.000636389722 & --\\ |
---|
1744 | 11 & 0.0107011079 & 0.000318194170 & --\\ |
---|
1745 | 12 & 0.00756682272 & -- & --\\ |
---|
1746 | 13 & 0.00535055108 & -- & --\\ |
---|
1747 | %14 & 0.00378341085 & -- & --\\ |
---|
1748 | %15 & 0.00267527545 & -- & --\\ |
---|
1749 | %16 & 0.00189170541 & -- & --\\ |
---|
1750 | %17 & 0.00133763772 & -- & --\\ |
---|
1751 | %18 & 0.000945852704 & -- & -- |
---|
1752 | \end{tabular} |
---|
1753 | |
---|
1754 | \item \textbf{Triangle Function:} $\{1/4,1/2,1/4\}$ |
---|
1755 | |
---|
1756 | \begin{tabular}{llll} |
---|
1757 | Scale & 1 dimension & 2 dimension & 3 dimension\\ \hline |
---|
1758 | 1 & 0.612372436 & 0.800390530 & 0.895954449 \\ |
---|
1759 | 2 & 0.330718914 & 0.272878894 & 0.192033014\\ |
---|
1760 | 3 & 0.211947812 & 0.119779282 & 0.0576484078\\ |
---|
1761 | 4 & 0.145740298 & 0.0577664785 & 0.0194912393\\ |
---|
1762 | 5 & 0.102310944 & 0.0286163283 & 0.00681278387\\ |
---|
1763 | 6 & 0.0722128185 & 0.0142747506 & 0.00240175885\\ |
---|
1764 | 7 & 0.0510388224 & 0.00713319703 & 0.000848538128 \\ |
---|
1765 | 8 & 0.0360857673 & 0.00356607618 & 0.000299949455 \\ |
---|
1766 | 9 & 0.0255157615 & 0.00178297280 & -- \\ |
---|
1767 | 10 & 0.0180422389 & 0.000891478237 & -- \\ |
---|
1768 | 11 & 0.0127577667 & 0.000445738098 & -- \\ |
---|
1769 | 12 & 0.00902109930 & 0.000222868922 & -- \\ |
---|
1770 | 13 & 0.00637887978 & -- & -- \\ |
---|
1771 | %14 & 0.00451054902 & -- & -- \\ |
---|
1772 | %15 & 0.00318942978 & -- & -- \\ |
---|
1773 | %16 & 0.00225527449 & -- & -- \\ |
---|
1774 | %17 & 0.00159471988 & -- & -- \\ |
---|
1775 | %18 & 0.000112763724 & -- & -- |
---|
1776 | |
---|
1777 | \end{tabular} |
---|
1778 | |
---|
1779 | \item \textbf{Haar Wavelet:} $\{0,1/2,1/2\}$ |
---|
1780 | |
---|
1781 | \begin{tabular}{llll} |
---|
1782 | Scale & 1 dimension & 2 dimension & 3 dimension\\ \hline |
---|
1783 | 1 & 0.707167810 & 0.433012702 & 0.935414347 \\ |
---|
1784 | 2 & 0.500000000 & 0.216506351 & 0.330718914\\ |
---|
1785 | 3 & 0.353553391 & 0.108253175 & 0.116926793\\ |
---|
1786 | 4 & 0.250000000 & 0.0541265877 & 0.0413398642\\ |
---|
1787 | 5 & 0.176776695 & 0.0270632939 & 0.0146158492\\ |
---|
1788 | 6 & 0.125000000 & 0.0135316469 & 0.00516748303 |
---|
1789 | |
---|
1790 | \end{tabular} |
---|
1791 | |
---|
1792 | |
---|
1793 | \end{itemize} |
---|
1794 | |
---|
1795 | \end{document} |
---|