1 | \documentclass[12pt,a4paper]{article} |
---|
2 | |
---|
3 | %%%%%% LINE SPACING %%%%%%%%%%%% |
---|
4 | \usepackage{setspace} |
---|
5 | \singlespacing |
---|
6 | %\onehalfspacing |
---|
7 | %\doublespacing |
---|
8 | |
---|
9 | %Define a test for doing PDF format -- use different code below |
---|
10 | \newif\ifPDF |
---|
11 | \ifx\pdfoutput\undefined\PDFfalse |
---|
12 | \else\ifnum\pdfoutput > 0\PDFtrue |
---|
13 | \else\PDFfalse |
---|
14 | \fi |
---|
15 | \fi |
---|
16 | |
---|
17 | \textwidth=161 mm |
---|
18 | \textheight=245 mm |
---|
19 | \topmargin=-15 mm |
---|
20 | \oddsidemargin=0 mm |
---|
21 | \parindent=6 mm |
---|
22 | |
---|
23 | \usepackage[sort]{natbib} |
---|
24 | \usepackage{lscape} |
---|
25 | \bibpunct[,]{(}{)}{;}{a}{}{,} |
---|
26 | |
---|
27 | \newcommand{\eg}{e.g.\ } |
---|
28 | \newcommand{\ie}{i.e.\ } |
---|
29 | \newcommand{\hi}{H{\sc i}} |
---|
30 | \newcommand{\hipass}{{\sc hipass}} |
---|
31 | \newcommand{\duchamp}{\emph{Duchamp}} |
---|
32 | \newcommand{\atrous}{\textit{{\`a} trous}} |
---|
33 | \newcommand{\Atrous}{\textit{{\`A} trous}} |
---|
34 | \newcommand{\diff}{{\rm d}} |
---|
35 | \newcommand{\entrylabel}[1]{\mbox{\textsf{\bf{#1:}}}\hfil} |
---|
36 | \newenvironment{entry} |
---|
37 | {\begin{list}{}% |
---|
38 | {\renewcommand{\makelabel}{\entrylabel}% |
---|
39 | \setlength{\labelwidth}{30mm}% |
---|
40 | \setlength{\labelsep}{5pt}% |
---|
41 | \setlength{\itemsep}{2pt}% |
---|
42 | \setlength{\parsep}{2pt}% |
---|
43 | \setlength{\leftmargin}{35mm}% |
---|
44 | }% |
---|
45 | }% |
---|
46 | {\end{list}} |
---|
47 | |
---|
48 | |
---|
49 | \title{Source Detection with \duchamp\ v1.0\\A User's Guide} |
---|
50 | \author{Matthew Whiting\\ |
---|
51 | %{\small \href{mailto:Matthew.Whiting@csiro.au}{Matthew.Whiting@csiro.au}}\\ |
---|
52 | Australia Telescope National Facility\\CSIRO} |
---|
53 | %\date{January 2006} |
---|
54 | \date{} |
---|
55 | |
---|
56 | % If we are creating a PDF, use different options for graphicx, hyperref. |
---|
57 | \ifPDF |
---|
58 | \usepackage[pdftex]{graphicx,color} |
---|
59 | \usepackage[pdftex]{hyperref} |
---|
60 | \hypersetup{colorlinks=true,% |
---|
61 | citecolor=red,% |
---|
62 | filecolor=red,% |
---|
63 | linkcolor=red,% |
---|
64 | urlcolor=red,% |
---|
65 | } |
---|
66 | \else |
---|
67 | \usepackage[dvips]{graphicx} |
---|
68 | \usepackage[dvips]{hyperref} |
---|
69 | \fi |
---|
70 | |
---|
71 | \pagestyle{headings} |
---|
72 | \begin{document} |
---|
73 | |
---|
74 | \maketitle |
---|
75 | \thispagestyle{empty} |
---|
76 | \begin{figure}[!h] |
---|
77 | \begin{center} |
---|
78 | \includegraphics[width=\textwidth]{cover_image} |
---|
79 | \end{center} |
---|
80 | \end{figure} |
---|
81 | |
---|
82 | \newpage |
---|
83 | \tableofcontents |
---|
84 | |
---|
85 | \newpage |
---|
86 | \section{Introduction and getting going quickly} |
---|
87 | |
---|
88 | This document provides a user's guide to \duchamp, an object-finder |
---|
89 | for use on spectral-line data cubes. The basic execution of |
---|
90 | \duchamp\ is to read in a FITS data cube, find sources in the cube, |
---|
91 | and produce a text file of positions, velocities and fluxes of the |
---|
92 | detections, as well as a postscript file of the spectra of each |
---|
93 | detection. |
---|
94 | |
---|
95 | So, you have a FITS cube, and you want to find the sources in it. What |
---|
96 | do you do? The first step is to make an input file that contains the |
---|
97 | list of parameters. Brief and detailed examples are shown in |
---|
98 | Appendix~\ref{app-input}. This file provides the input file name, the various |
---|
99 | output files, and defines various parameters that control the |
---|
100 | execution. |
---|
101 | |
---|
102 | The standard way to run \duchamp\ is by the command |
---|
103 | \begin{quote} |
---|
104 | \texttt{Duchamp -p [parameter file]} |
---|
105 | \end{quote} |
---|
106 | replacing \texttt{[parameter file]} with the name of the file listing |
---|
107 | the parameters. Alternatively, you can use the syntax |
---|
108 | \begin{quote} |
---|
109 | \texttt{Duchamp -f [FITS file]} |
---|
110 | \end{quote} |
---|
111 | where \texttt{[FITS file]} is the file you wish to search. In the latter |
---|
112 | case, all parameters will take their default values detailed in |
---|
113 | Appendix~\ref{app-param}. In either case, the program will then work |
---|
114 | away and give you the list of detections and their spectra. The |
---|
115 | program execution is summarised below, and detailed in |
---|
116 | \S\ref{sec-flow}. Information on inputs is in \S\ref{sec-param} and |
---|
117 | Appendix~\ref{app-param}, and descriptions of the output is in |
---|
118 | \S\ref{sec-output}. |
---|
119 | |
---|
120 | \subsection{A summary of the execution steps} |
---|
121 | |
---|
122 | The basic flow of the program is summarised here -- all steps are |
---|
123 | discussed in more detail in the following sections. |
---|
124 | \begin{enumerate} |
---|
125 | \item If the \texttt{-p} option is used, the parameter file given on |
---|
126 | the command line is read in, and the parameters absorbed. |
---|
127 | \item The FITS image is located and read in to memory. |
---|
128 | \item If requested, a FITS image with a previously reconstructed array |
---|
129 | is read in. |
---|
130 | \item If requested, blank pixels are trimmed from the edges, and |
---|
131 | the baseline of each spectrum is removed. |
---|
132 | \item If the reconstruction method is requested, and the reconstructed |
---|
133 | array has not been read in at Step 3 above, the cube is |
---|
134 | reconstructed using the \atrous\ wavelet method. |
---|
135 | \item Searching for objects then takes place, using the requested |
---|
136 | thresholding method. |
---|
137 | \item The list of objects is condensed by merging neighbouring objects |
---|
138 | and removing those deemed unacceptable. |
---|
139 | \item The baselines and trimmed pixels are replaced prior to output. |
---|
140 | \item The details of the detections are written to screen and to the |
---|
141 | requested output file. |
---|
142 | \item Maps showing the spatial location of the detections are written. |
---|
143 | \item The integrated spectra of each detection are written to a |
---|
144 | postscript file. |
---|
145 | \item If requested, the reconstructed array can be written to a new |
---|
146 | FITS file. |
---|
147 | \end{enumerate} |
---|
148 | |
---|
149 | \subsection{Guide to terminology} |
---|
150 | |
---|
151 | First, a brief note on the use of terminology in this guide. \duchamp\ |
---|
152 | is designed to work on FITS ``cubes''. These are FITS\footnote{FITS is |
---|
153 | the Flexible Image Transport System -- see \citet{hanisch01} or |
---|
154 | websites such as |
---|
155 | \href{http://fits.cv.nrao.edu/FITS.html}{http://fits.cv.nrao.edu/FITS.html} |
---|
156 | for details.} image arrays with three dimensions -- they are assumed |
---|
157 | to have the following form: the first two dimensions (referred to as |
---|
158 | $x$ and $y$) are spatial directions (that is, relating to the position |
---|
159 | on the sky), while the third dimension, $z$, is the spectral |
---|
160 | direction, which can correspond to frequency, wavelength, or |
---|
161 | velocity. The three dimensional analogue of pixels are ``voxels'', or |
---|
162 | volume cells -- a voxel is defined by a unique $(x,y,z)$ location and |
---|
163 | has a unique flux or intensity value associated with it. |
---|
164 | |
---|
165 | Each spatial pixel (a given $(x,y)$ coordinate) can be said to be a |
---|
166 | single spectrum, while a slice through the cube perpendicular to the |
---|
167 | spectral direction at a given $z$-value is a single channel (the 2-D |
---|
168 | image is a channel map). |
---|
169 | |
---|
170 | Detection involves locating a contiguous group of voxels with fluxes |
---|
171 | above a certain threshold. \duchamp\ makes no assumptions as to the |
---|
172 | size or shape of the detected features, other than having |
---|
173 | user-selected minimum size criteria. |
---|
174 | |
---|
175 | Features that are detected are assumed to be positive. The user can |
---|
176 | choose to search for negative features by setting an input parameter |
---|
177 | -- this inverts the cube prior to the search (see |
---|
178 | \S\ref{sec-detection} for details). |
---|
179 | |
---|
180 | Note that it is possible to run \duchamp\ on a two-dimensional image |
---|
181 | (\ie one with no frequency or velocity information), or indeed a |
---|
182 | one-dimensional array, and many of the features of the program will |
---|
183 | work fine. The focus, however, is on object detection in three |
---|
184 | dimensions. |
---|
185 | |
---|
186 | \subsection{Why \duchamp?} |
---|
187 | |
---|
188 | Well, it's important for a program to have a name, and the initial |
---|
189 | working title of \emph{cubefind} was somewhat uninspiring. I wanted to |
---|
190 | avoid the classic astronomical approach of designing a cute acronym, |
---|
191 | and since it is designed to work on cubes, I looked at naming it after |
---|
192 | a cubist. \emph{Picasso}, sadly, was already taken \citep{minchin99}, |
---|
193 | so I settled on naming it after Marcel Duchamp, another cubist, but |
---|
194 | also one of the first artists to work with ``found objects''. |
---|
195 | |
---|
196 | \section{User Inputs} |
---|
197 | \label{sec-param} |
---|
198 | |
---|
199 | Input to the program is provided by means of a parameter |
---|
200 | file. Parameters are listed in the file, followed by the value that |
---|
201 | should be assigned to them. The syntax used is \texttt{paramName |
---|
202 | value}. Parameter names are not case-sensitive, and lines in the input |
---|
203 | file that start with \texttt{\#} are ignored. If a parameter is listed |
---|
204 | more than once, the latter value is used, but otherwise the order in |
---|
205 | which the parameters are listed in the input file is arbitrary. |
---|
206 | |
---|
207 | If a parameter is not listed, the default value is assumed. The |
---|
208 | defaults are chosen to provide a good result (using the reconstruction |
---|
209 | method), so the user doesn't need to specify many new parameters in |
---|
210 | the input file. Note that the image file \textbf{must} be specified! The |
---|
211 | parameters that can be set are listed in Appendix~\ref{app-param}, |
---|
212 | with their default values in parentheses. |
---|
213 | |
---|
214 | The parameters with names starting with \texttt{flag} are stored as |
---|
215 | \texttt{bool} variables, and so are either \texttt{true = 1} or |
---|
216 | \texttt{false = 0}. \duchamp\ will only read them from the file as |
---|
217 | integers, and so they should be entered in the file as 0 or 1 (see |
---|
218 | example file in Appendix~\ref{app-input}). |
---|
219 | |
---|
220 | \section{What \duchamp\ is doing} |
---|
221 | \label{sec-flow} |
---|
222 | |
---|
223 | The execution flow of \duchamp\ is detailed here, indicating the |
---|
224 | main algorithmic steps that are used. The program is written in C/C++ |
---|
225 | and makes use of the {\sc cfitsio}, {\sc wcslib} and {\sc pgplot} |
---|
226 | libraries. |
---|
227 | |
---|
228 | %\subsection{Parameter input} |
---|
229 | % |
---|
230 | %The user provides parameters that govern the selection of files and |
---|
231 | %the parameters used by the various subroutines in the program. This is |
---|
232 | %done via a parameter file, and the parameters are stored in a C++ |
---|
233 | %class for use throughout the program. The form of the parameter file is |
---|
234 | %discussed in \S\ref{sec-param}, and the parameters themselves are |
---|
235 | %listed in Appendix~\ref{app-param}. |
---|
236 | |
---|
237 | \subsection{Image input} |
---|
238 | \label{sec-input} |
---|
239 | |
---|
240 | The cube is read in using basic {\sc cfitsio} commands, and stored as |
---|
241 | an array in a special C++ class. This class keeps track of |
---|
242 | the list of detected objects, as well as any reconstructed arrays that |
---|
243 | are made (see \S\ref{sec-recon}). The World Coordinate System (WCS) |
---|
244 | information for the cube is also obtained from the FITS header by {\sc |
---|
245 | wcslib} functions \citep{greisen02, calabretta02}, and this |
---|
246 | information, in the form of a \texttt{wcsprm} structure, is also stored |
---|
247 | in the same class. |
---|
248 | |
---|
249 | A sub-section of an image can be requested via the \texttt{subsection} |
---|
250 | parameter in the parameter file -- this can be a good idea if the cube |
---|
251 | has very noisy edges, which may produce many spurious detections. The |
---|
252 | generalised form of the subsection that is used by {\sc cfitsio} is |
---|
253 | \texttt{[x1:x2:dx,y1:y2:dy,z1:z2:dz]}, such that the x-coordinates run |
---|
254 | from \texttt{x1} to \texttt{x2} (inclusive), with steps of |
---|
255 | \texttt{dx}. The step value can be omitted (so a subsection of the |
---|
256 | form \texttt{[2:50,2:50,10:1000]} is still valid). \duchamp\ does not |
---|
257 | make use of any step value present in the subsection string, and any |
---|
258 | that are present are removed before the file is opened. |
---|
259 | |
---|
260 | If one wants the full range of a coordinate then replace the range |
---|
261 | with an asterisk, \eg \texttt{[2:50,2:50,*]}. If one wants to use a |
---|
262 | subsection, one must set \texttt{flagSubsection = 1}. A complete |
---|
263 | description of the section syntax can be found at the {\sc fitsio} web |
---|
264 | site |
---|
265 | \footnote{ |
---|
266 | \href{http://heasarc.gsfc.nasa.gov/docs/software/fitsio/c/c\_user/node90.html}% |
---|
267 | {http://heasarc.gsfc.nasa.gov/docs/software/fitsio/c/c\_user/node90.html}}. |
---|
268 | |
---|
269 | \subsection{Image modification} |
---|
270 | \label{sec-modify} |
---|
271 | |
---|
272 | Several modifications to the cube can be made that improve the |
---|
273 | execution and efficiency of \duchamp\ (these are optional -- their |
---|
274 | use is indicated by the relevant flags set in the input parameter |
---|
275 | file). |
---|
276 | |
---|
277 | \subsubsection{Blank pixel removal} |
---|
278 | |
---|
279 | First, the cube is trimmed of any BLANK pixels that pad the image out |
---|
280 | to a rectangular shape. This is optional, its use determined by the |
---|
281 | \texttt{flagBlankPix} parameter. The value for these pixels is read from |
---|
282 | the FITS header (using the BLANK, BSCALE and BZERO keywords), but if |
---|
283 | these are not present then the value can be specified by the user in |
---|
284 | the parameter file using \texttt{blankPixValue}. |
---|
285 | |
---|
286 | This stage is particularly important for the reconstruction step, as |
---|
287 | lots of BLANK pixels on the edges will smooth out features in the |
---|
288 | wavelet calculation stage. The trimming will also reduce the size of |
---|
289 | the cube's array, speeding up the execution. The amount of trimming is |
---|
290 | recorded, and these pixels are added back in once the source-detection |
---|
291 | is completed (so that quoted pixel positions are applicable to the |
---|
292 | original cube). |
---|
293 | |
---|
294 | Rows and columns are trimmed one at a time until the first non-BLANK |
---|
295 | pixel is reached, so that the image remains rectangular. In practice, |
---|
296 | this means that there will be BLANK pixels left in the trimmed image |
---|
297 | (if the non-BLANK region is non-rectangular). However, these are |
---|
298 | ignored in all further calculations done on the cube. |
---|
299 | |
---|
300 | \subsubsection{Baseline removal} |
---|
301 | |
---|
302 | Second, the user may request the removal of baselines from the |
---|
303 | spectra, via the parameter \texttt{flagBaseline}. This may be necessary |
---|
304 | if there is a strong baseline ripple present, which can result in |
---|
305 | spurious detections at the high points of the ripple. The baseline is |
---|
306 | calculated from a wavelet reconstruction procedure (see |
---|
307 | \S\ref{sec-recon}) that keeps only the two largest scales. This is |
---|
308 | done separately for each spatial pixel (\ie for each spectrum in the |
---|
309 | cube), and the baselines are stored and added back in before any |
---|
310 | output is done. In this way the quoted fluxes and displayed spectra |
---|
311 | are as one would see from the input cube itself -- even though the |
---|
312 | detection (and reconstruction if applicable) is done on the |
---|
313 | baseline-removed cube. |
---|
314 | |
---|
315 | The presence of very strong signals (for instance, masers at several |
---|
316 | hundred Jy) can affect the determination of the baseline, leading to a |
---|
317 | large dip centred on the signal in the baseline-subtracted |
---|
318 | spectrum. To prevent this, the signal is trimmed prior to the |
---|
319 | reconstruction process at some standard threshold (at $8\sigma$ above |
---|
320 | the mean). The baseline determined should thus be representative of |
---|
321 | the true, signal-free baseline. Note that this trimming is only a |
---|
322 | temporary measure which does not affect the source-detection. |
---|
323 | |
---|
324 | \subsubsection{Ignoring bright Milky Way emission} |
---|
325 | |
---|
326 | Finally, a single set of contiguous channels can be ignored -- these |
---|
327 | may exhibit very strong emission, such as that from the Milky Way as |
---|
328 | seen in extragalactic \hi\ cubes (hence the references to ``Milky |
---|
329 | Way'' in relation to this task -- apologies to Galactic |
---|
330 | astronomers!). Such dominant channels will produce many detections |
---|
331 | that are unnecessary, uninteresting (if one is interested in |
---|
332 | extragalactic \hi) and large (in size and hence in memory usage), and |
---|
333 | so will slow the program down and detract from the interesting |
---|
334 | detections. The use of this feature is controlled by the |
---|
335 | \texttt{flagMW} parameter, and the exact channels concerned are able |
---|
336 | to be set by the user (using \texttt{maxMW} and \texttt{minMW} -- |
---|
337 | these give an inclusive range of channels). When employed, these |
---|
338 | channels are temporarily blanked out for the searching, and the |
---|
339 | scaling of the spectral output (see Fig.~\ref{fig-spect}) will not |
---|
340 | take them into account. They will be present in the reconstructed |
---|
341 | array, however, and so will be included in the saved FITS file (see |
---|
342 | \S\ref{sec-reconIO}). When the final spectra are plotted, the range of |
---|
343 | channels covered by these parameters is indicated by a green hashed |
---|
344 | box. |
---|
345 | |
---|
346 | \subsection{Image reconstruction} |
---|
347 | \label{sec-recon} |
---|
348 | |
---|
349 | The user can direct \duchamp\ to reconstruct the data cube using the |
---|
350 | \atrous\ wavelet procedure. A good description of the procedure can be |
---|
351 | found in \citet{starck02:book}. The reconstruction is an effective way |
---|
352 | of removing a lot of the noise in the image, allowing one to search |
---|
353 | reliably to fainter levels, and reducing the number of spurious |
---|
354 | detections. This is an optional step, but one that greatly enhances |
---|
355 | the source-detection process, with the payoff that it can be |
---|
356 | relatively time- and memory-intensive. |
---|
357 | |
---|
358 | \subsubsection{Algorithm} |
---|
359 | |
---|
360 | The steps in the \atrous\ reconstruction are as follows: |
---|
361 | \begin{enumerate} |
---|
362 | \item Set the reconstructed array to 0 everywhere. |
---|
363 | \item The input array is discretely convolved with a given filter |
---|
364 | function. This is determined from the parameter file via the |
---|
365 | \texttt{filterCode} parameter -- see Appendix~\ref{app-param} for |
---|
366 | details on the filters available. |
---|
367 | \item The wavelet coefficients are calculated by taking the difference |
---|
368 | between the convolved array and the input array. |
---|
369 | \item If the wavelet coefficients at a given point are above the |
---|
370 | requested threshold (given by \texttt{snrRecon} as the number of |
---|
371 | $\sigma$ above the mean and adjusted to the current scale -- see |
---|
372 | Appendix~\ref{app-scaling}), add these to the reconstructed array. |
---|
373 | \item The separation of the filter coefficients is doubled. (Note that |
---|
374 | this step provides the name of the procedure\footnote{\atrous\ means |
---|
375 | ``with holes'' in French.}, as gaps or holes are created in the |
---|
376 | filter coverage.) |
---|
377 | \item The procedure is repeated from step 2, using the convolved array |
---|
378 | as the input array. |
---|
379 | \item Continue until the required maximum number of scales is reached. |
---|
380 | \item Add the final smoothed (\ie convolved) array to the |
---|
381 | reconstructed array. This provides the ``DC offset'', as each of the |
---|
382 | wavelet coefficient arrays will have zero mean. |
---|
383 | \end{enumerate} |
---|
384 | |
---|
385 | The reconstruction has at least two iterations. The first iteration |
---|
386 | makes a first pass at the wavelet reconstruction (the process outlined |
---|
387 | in the 8 stages above), but the residual array will inevitably have |
---|
388 | some structure still in it, so the wavelet filtering is done on the |
---|
389 | residual, and any significant wavelet terms are added to the final |
---|
390 | reconstruction. This step is repeated until the change in the $\sigma$ |
---|
391 | of the background is less than some fiducial amount. |
---|
392 | |
---|
393 | It is important to note that the \atrous\ decomposition is an |
---|
394 | example of a ``redundant'' transformation. If no thresholding is |
---|
395 | performed, the sum of all the wavelet coefficient arrays and the final |
---|
396 | smoothed array is identical to the input array. The thresholding thus |
---|
397 | removes only the unwanted structure in the array. |
---|
398 | |
---|
399 | Note that any BLANK pixels that are still in the cube will not be |
---|
400 | altered by the reconstruction -- they will be left as BLANK so that |
---|
401 | the shape of the valid part of the cube is preserved. |
---|
402 | |
---|
403 | \subsubsection{Note on Statistics} |
---|
404 | |
---|
405 | The correct calculation of the reconstructed array needs good |
---|
406 | estimation of the underlying mean and standard deviation of the |
---|
407 | background noise distribution. These statistics are estimated using |
---|
408 | robust methods, to avoid corruption by strong outlying points. The |
---|
409 | mean of the distribution is actually estimated by the median, while |
---|
410 | the median absolute deviation from the median (MADFM) is calculated |
---|
411 | and corrected assuming Gaussianity to estimate the underlying standard |
---|
412 | deviation $\sigma$. The Gaussianity (or Normality) assumption is |
---|
413 | critical, as the MADFM does not give the same value as the usual rms |
---|
414 | or standard deviation value -- for a normal distribution |
---|
415 | $N(\mu,\sigma)$ we find MADFM$=0.6744888\sigma$. The difference |
---|
416 | between the MADFM and $\sigma$ is corrected for, so the user need only |
---|
417 | think in the usual multiples of $\sigma$ when setting |
---|
418 | \texttt{snrRecon}. See Appendix~\ref{app-madfm} for a derivation of |
---|
419 | this value. |
---|
420 | |
---|
421 | When thresholding the different wavelet scales, the value of $\sigma$ |
---|
422 | as measured from the wavelet array needs to be scaled to account for the |
---|
423 | increased amount of correlation between neighbouring pixels (due to |
---|
424 | the convolution). See Appendix~\ref{app-scaling} for details on this |
---|
425 | scaling. |
---|
426 | |
---|
427 | \subsubsection{User control of reconstruction parameters} |
---|
428 | |
---|
429 | The most important parameter for the user to select in relation to the |
---|
430 | reconstruction is the threshold for each wavelet array. This is set |
---|
431 | using the \texttt{snrRecon} parameter, and is given as a multiple of the |
---|
432 | rms (estimated by the MADFM) above the mean (which for the wavelet |
---|
433 | arrays should be approximately zero). There are several other |
---|
434 | parameters that can be altered as well that affect the outcome of the |
---|
435 | reconstruction. |
---|
436 | |
---|
437 | By default, the cube is reconstructed in three dimensions, using a |
---|
438 | 3-dimensional filter and 3-dimensional convolution. This can be |
---|
439 | altered, however, using the parameter \texttt{reconDim}. If set to 1, |
---|
440 | this means the cube is reconstructed by considering each spectrum |
---|
441 | separately, whereas \texttt{reconDim=2} will mean the cube is |
---|
442 | reconstructed by doing each channel map separately. The merits of |
---|
443 | these choices are discussed in \S\ref{sec-notes}, but it should be |
---|
444 | noted that a 2-dimensional reconstruction can be susceptible to edge |
---|
445 | effects if the spatial shape is not rectangular. |
---|
446 | |
---|
447 | The user can also select the minimum scale to be used in the |
---|
448 | reconstruction -- the first scale exhibits the highest frequency |
---|
449 | variations, and so ignoring this one can sometimes be beneficial in |
---|
450 | removing excess noise. The default, however, is to use all scales |
---|
451 | (\texttt{minscale = 1}). |
---|
452 | |
---|
453 | Finally, the filter that is used for the convolution can be selected |
---|
454 | by using \texttt{filterCode} and the relevant code number -- the |
---|
455 | choices are listed in Appendix~\ref{app-param}. A larger filter will |
---|
456 | give a better reconstruction, but take longer and use more memory when |
---|
457 | executing. When multi-dimensional reconstruction is selected, this |
---|
458 | filter is used to construct a 2- or 3-dimensional equivalent. |
---|
459 | |
---|
460 | \subsection{Reconstruction I/O} |
---|
461 | \label{sec-reconIO} |
---|
462 | |
---|
463 | The reconstruction stage can be relatively time-consuming, particularly |
---|
464 | for large cubes and reconstructions in 3-D. To get around this, \duchamp\ |
---|
465 | provides a shortcut to allow users to perform multiple searches (\eg with |
---|
466 | different thresholds) on the same reconstruction without calculating the |
---|
467 | reconstruction each time. |
---|
468 | |
---|
469 | The first step is to choose to save the reconstructed array as a FITS |
---|
470 | file by setting \texttt{flagOutputRecon = true}. The file will be saved |
---|
471 | in the same directory as the input image, so the user needs to have write |
---|
472 | permissions for that directory. |
---|
473 | |
---|
474 | The filename will be derived from the input filename, with extra |
---|
475 | information detailing the reconstruction that has been done. For |
---|
476 | example, suppose \texttt{image.fits} has been reconstructed using a |
---|
477 | 3-dimensional reconstruction with filter 2, thresholded at $4\sigma$ |
---|
478 | using all scales. The output filename will then be |
---|
479 | \texttt{image.RECON-3-2-4-1.fits} (\ie it uses the four parameters |
---|
480 | relevant for the \atrous\ reconstruction as listed in |
---|
481 | Appendix~\ref{app-param}). The new FITS file will also have these |
---|
482 | parameters as header keywords. If a subsection of the input image has |
---|
483 | been used (see \S\ref{sec-input}), the format of the output filename |
---|
484 | will be \texttt{image.sub.RECON-3-2-4-1.fits}, and the subsection that |
---|
485 | has been used is also stored in the FITS header. |
---|
486 | |
---|
487 | Likewise, the residual image, defined as the difference between the input |
---|
488 | and reconstructed arrays, can also be saved in the same manner by setting |
---|
489 | \texttt{flagOutputResid = true}. Its filename will be the same as above, |
---|
490 | with RESID replacing RECON. |
---|
491 | |
---|
492 | If a reconstructed image has been saved, it can be read in and used |
---|
493 | instead of redoing the reconstruction. To do so, the user should set |
---|
494 | \texttt{flagReconExists = true}. The user can indicate the name of the |
---|
495 | reconstructed FITS file using the \texttt{reconFile} parameter, or, if |
---|
496 | this is not specified, \duchamp\ searches for the file with the name |
---|
497 | as defined above. If the file is not found, the reconstruction is |
---|
498 | performed as normal. Note that to do this, the user needs to set |
---|
499 | \texttt{flagAtrous = true} (obviously, if this is \texttt{false}, the |
---|
500 | reconstruction is not needed). |
---|
501 | |
---|
502 | \subsection{Searching the image} |
---|
503 | \label{sec-detection} |
---|
504 | |
---|
505 | The image is searched for detections in two ways: spectrally (a |
---|
506 | 1-dimensional search in the spectrum in each spatial pixel), and |
---|
507 | spatially (a 2-dimensional search in the spatial image in each |
---|
508 | channel). In both cases, the algorithm finds connected pixels that are |
---|
509 | above the user-specified threshold. In the case of the spatial image |
---|
510 | search, the algorithm of \citet{lutz80} is used to raster scan through |
---|
511 | the image and connect groups of pixels on neighbouring rows. |
---|
512 | |
---|
513 | Note that this algorithm cannot be applied directly to a 3-dimensional |
---|
514 | case, as it requires that objects are completely nested in a row: that |
---|
515 | is, if you are scanning along a row, and one object finishes and |
---|
516 | another starts, you know that you will not get back to the first one |
---|
517 | (if at all) until the second is completely finished for that |
---|
518 | row. Three-dimensional data does not have this property, which is why |
---|
519 | we break up the searching into 1- and 2-dimensional cases. |
---|
520 | |
---|
521 | The determination of the threshold is done in one of two ways. The |
---|
522 | first way is a simple sigma-clipping, where a threshold is set at a |
---|
523 | fixed number $n$ of standard deviations above the mean, and pixels |
---|
524 | above this threshold are flagged as detected. The value of $n$ is set |
---|
525 | with the parameter \texttt{snrCut}. As before, the value of the |
---|
526 | standard deviation is estimated by the MADFM, and corrected by the |
---|
527 | ratio derived in Appendix~\ref{app-madfm}. |
---|
528 | |
---|
529 | The second method uses the False Discovery Rate (FDR) technique |
---|
530 | \citep{miller01,hopkins02}, whose basis we briefly detail here. The |
---|
531 | false discovery rate (given by the number of false detections divided |
---|
532 | by the total number of detections) is fixed at a certain value |
---|
533 | $\alpha$ (\eg $\alpha=0.05$ implies 5\% of detections are false |
---|
534 | positives). In practice, an $\alpha$ value is chosen, and the ensemble |
---|
535 | average FDR (\ie $\langle FDR \rangle$) when the method is used will |
---|
536 | be less than $\alpha$. One calculates $p$ -- the probability, |
---|
537 | assuming the null hypothesis is true, of obtaining a test statistic as |
---|
538 | extreme as the pixel value (the observed test statistic) -- for each |
---|
539 | pixel, and sorts them in increasing order. One then calculates $d$ |
---|
540 | where |
---|
541 | \[ |
---|
542 | d = \max_j \left\{ j : P_j < \frac{j\alpha}{c_N N} \right\}, |
---|
543 | \] |
---|
544 | and then rejects all hypotheses whose $p$-values are less than or equal |
---|
545 | to $P_d$. (So a $P_i<P_d$ will be rejected even if $P_i \geq |
---|
546 | j\alpha/c_N N$.) Note that ``reject hypothesis'' here means ``accept |
---|
547 | the pixel as an object pixel'' (\ie we are rejecting the null |
---|
548 | hypothesis that the pixel belongs to the background). |
---|
549 | |
---|
550 | The $c_N$ values here are normalisation constants that depend on the |
---|
551 | correlated nature of the pixel values. If all the pixels are |
---|
552 | uncorrelated, then $c_N=1$. If $N$ pixels are correlated, then their |
---|
553 | tests will be dependent on each other, and so $c_N = \sum_{i=1}^N |
---|
554 | i^{-1}$. \citet{hopkins02} consider real radio data, where the pixels |
---|
555 | are correlated over the beam. In this case the sum is made over the |
---|
556 | $N$ pixels that make up the beam. The value of $N$ is calculated from |
---|
557 | the FITS header (if the correct keywords -- BMAJ, BMIN -- are not |
---|
558 | present, a default value of 10 pixels is assumed). |
---|
559 | |
---|
560 | The theory behind the FDR method implies a direct connection between the |
---|
561 | choice of $\alpha$ and the fraction of detections that will be false |
---|
562 | positives. However, due to the merging process, this direct connection is |
---|
563 | lost when looking at the final number of detections -- see discussion in |
---|
564 | \S\ref{sec-notes}. The effect is that the number of false detections will |
---|
565 | be less than indicated by the $\alpha$ value used. |
---|
566 | |
---|
567 | If a reconstruction has been made, the residuals (defined as original |
---|
568 | $-$ reconstruction) are used to estimate the noise parameters of the |
---|
569 | cube. Otherwise they are estimated directly from the cube itself. In |
---|
570 | both cases, robust estimators are used as described above. |
---|
571 | |
---|
572 | Detections must have a minimum number of pixels to be counted. This |
---|
573 | minimum number is given by the input parameters \texttt{minPix} (for |
---|
574 | 2-dimensional searches) and \texttt{minChannels} (for 1-dimensional |
---|
575 | searches). |
---|
576 | |
---|
577 | The search only looks for positive features. If one is interested |
---|
578 | instead in negative features (such as absorption lines), set the |
---|
579 | parameter \texttt{flagNegative = true}. This will invert the cube (\ie |
---|
580 | multiply all pixels by $-1$) prior to the search, and then re-invert |
---|
581 | the cube (and the fluxes of any detections) after searching is |
---|
582 | complete. All outputs are done in the same manner as normal, so that |
---|
583 | fluxes of detections will be negative. |
---|
584 | |
---|
585 | \subsection{Merging detected objects} |
---|
586 | \label{sec-merger} |
---|
587 | |
---|
588 | The searching step produces a list of detected objects that will have many |
---|
589 | repeated detections of a given object -- for instance, spectral |
---|
590 | detections in adjacent pixels of the same object and/or spatial |
---|
591 | detections in neighbouring channels. These are then combined in an |
---|
592 | algorithm that matches all objects judged to be ``close''. This |
---|
593 | determination is made in one of two ways. |
---|
594 | |
---|
595 | One way is to define two thresholds -- one spatial and one in velocity |
---|
596 | -- and say that two objects should be merged if there is at least one |
---|
597 | pair of pixels that lie within these threshold distances of each |
---|
598 | other. These thresholds are specified by the parameters |
---|
599 | \texttt{threshSpatial} and \texttt{threshVelocity} (in units of pixels |
---|
600 | and channels respectively). |
---|
601 | |
---|
602 | Alternatively, the spatial requirement can be changed to say that |
---|
603 | there must be a pair of pixels that are \emph{adjacent} -- a stricter, |
---|
604 | but perhaps more realistic requirement, particularly when the spatial pixels |
---|
605 | have a large angular size (as is the case for \hi\ surveys). This |
---|
606 | method can be selected by setting the parameter |
---|
607 | \texttt{flagAdjacent} to 1 (\ie \texttt{true}) in the parameter file. The |
---|
608 | velocity thresholding is done in the same way as the first option. |
---|
609 | |
---|
610 | Once the detections have been merged, they may be ``grown''. This is a |
---|
611 | process of increasing the size of the detection by adding adjacent |
---|
612 | pixels that are above some secondary threshold. This threshold is |
---|
613 | lower than the one used for the initial detection, but above the noise |
---|
614 | level, so that faint pixels are only detected when they are close to a |
---|
615 | bright pixel. The value of this threshold is a possible input |
---|
616 | parameter (\texttt{growthCut}), with a default value of $1.5\sigma$. The |
---|
617 | use of the growth algorithm is controlled by the \texttt{flagGrowth} |
---|
618 | parameter -- the default value of which is \texttt{false}. If the |
---|
619 | detections are grown, they are sent through the merging algorithm a |
---|
620 | second time, to pick up any detections that now overlap or have grown |
---|
621 | over each other. |
---|
622 | |
---|
623 | Finally, to be accepted, the detections must span \emph{both} a minimum |
---|
624 | number of channels (to remove any spurious single-channel spikes that |
---|
625 | may be present), and a minimum number of spatial pixels. These |
---|
626 | numbers, as for the original detection step, are set with the |
---|
627 | \texttt{minChannels} and \texttt{minPix} parameters. The channel |
---|
628 | requirement means there must be at least one set of \texttt{minChannels} |
---|
629 | consecutive channels in the source for it to be accepted. |
---|
630 | |
---|
631 | \section{Outputs} |
---|
632 | \label{sec-output} |
---|
633 | |
---|
634 | \subsection{During execution} |
---|
635 | |
---|
636 | \duchamp\ provides the user with feedback whilst it is running, to |
---|
637 | keep the user informed on the progress of the analysis. Most of this |
---|
638 | consists of self-explanatory messages about the particular stage the |
---|
639 | program is up to. The relevant parameters are printed to the screen at |
---|
640 | the start (once the file has been successfully read in), so the user |
---|
641 | is able to make a quick check that the setup is correct (see |
---|
642 | Appendix~{app-input} for an example). |
---|
643 | |
---|
644 | If the cube is being trimmed (\S\ref{sec-modify}), the resulting |
---|
645 | dimensions are printed to indicate how much has been trimmed. If a |
---|
646 | reconstruction is being done, a continually updating message shows |
---|
647 | either the current iteration and scale, compared to the maximum scale |
---|
648 | (when \texttt{reconDim=3}), or a progress bar showing the amount of |
---|
649 | the cube that has been reconstructed (for smaller values of |
---|
650 | \texttt{reconDim}). |
---|
651 | |
---|
652 | During the searching algorithms, the progress through the 1D and 2D |
---|
653 | searches are shown. When the searches have completed, |
---|
654 | the number of objects found in both the 1D and 2D searches are |
---|
655 | reported (see \S\ref{sec-detection} for details). |
---|
656 | |
---|
657 | In the merging process (where multiple detections of the same object |
---|
658 | are combined -- see \S\ref{sec-merger}), two stages of output |
---|
659 | occur. The first is when each object in the list is compared with all |
---|
660 | others. The output shows two numbers: the first being how far through |
---|
661 | the list the current object is, and the second being the length of the |
---|
662 | list. As the algorithm proceeds, the first number should increase and |
---|
663 | the second should decrease (as objects are combined). When the numbers |
---|
664 | meet (\ie the whole list has been compared), the second phase begins, |
---|
665 | in which multiply-appearing pixels in each object are removed, as are |
---|
666 | objects not meeting the minimum channels requirement. During this |
---|
667 | phase, the total number of accepted objects is shown, which should |
---|
668 | steadily increase until all have been accepted or rejected. Note that |
---|
669 | these steps can be very quick for small numbers of detections. |
---|
670 | |
---|
671 | Since this continual printing to screen has some overhead of time and |
---|
672 | CPU involved, the user can elect to not print this information by |
---|
673 | setting the parameter \texttt{verbose = 0}. In this case, the user is |
---|
674 | still informed as to the steps being undertaken, but the details of |
---|
675 | the progress are not shown. |
---|
676 | |
---|
677 | \subsection{Results} |
---|
678 | |
---|
679 | \subsubsection{Table of Results} |
---|
680 | |
---|
681 | Finally, we get to the results -- the reason for running \duchamp\ in |
---|
682 | the first place. Once the detection list is finalised, it is sorted by |
---|
683 | the mean velocity of the detections (or, if there is no good WCS |
---|
684 | associated with the cube, by the mean Z-pixel position). The results |
---|
685 | are then printed to the screen and to the output file, given by the |
---|
686 | \texttt{OutFile} parameter. The results list, an example of which can be |
---|
687 | seen in Appendix~\ref{app-output}, contains the following columns |
---|
688 | (note that the title of the columns depending on WCS information will |
---|
689 | depend on the projection of the WCS): |
---|
690 | |
---|
691 | \begin{entry} |
---|
692 | \item[Obj\#] The ID number of the detection (simply the sequential |
---|
693 | count for the list, which is ordered by increasing velocity). |
---|
694 | \item[Name] The IAU-format name of the detection (derived from the WCS |
---|
695 | projection). |
---|
696 | \item[X] The average X-pixel position. |
---|
697 | \item[Y] The average Y-pixel position. |
---|
698 | \item[Z] The average Z-pixel position. |
---|
699 | \item[RA/GLON] The Right Ascension or Galactic Longitude of the centre |
---|
700 | of the object. |
---|
701 | \item[DEC/GLAT] The Declination or Galactic Latitude of the centre of |
---|
702 | the object. |
---|
703 | \item[VEL] The mean velocity of the object [units given by the |
---|
704 | \texttt{spectralUnits} parameter]. |
---|
705 | \item[w\_RA/w\_GLON] The width of the object in Right Ascension or |
---|
706 | Galactic Longitude [arcmin]. |
---|
707 | \item[w\_DEC/w\_GLAT] The width of the object in Declination Galactic |
---|
708 | Latitude [arcmin]. |
---|
709 | \item[w\_VEL] The full velocity width of the detection (max channel |
---|
710 | $-$ min channel, in velocity units [see note below]). |
---|
711 | \item[F\_int] The integrated flux over the object, in the units of |
---|
712 | flux times velocity, corrected for the beam if necessary. |
---|
713 | \item[F\_peak] The peak flux over the object, in the units of flux. |
---|
714 | \item[X1, X2] The minimum and maximum X-pixel coordinates. |
---|
715 | \item[Y1, Y2] The minimum and maximum Y-pixel coordinates. |
---|
716 | \item[Z1, Z2] The minimum and maximum Z-pixel coordinates. |
---|
717 | \item[Npix] The number of voxels (\ie distinct $(x,y,z)$ coordinates) |
---|
718 | in the detection. |
---|
719 | \item[Flag] Whether the detection has any warning flags (see below). |
---|
720 | \end{entry} |
---|
721 | The Name is derived from the WCS position. For instance, a source |
---|
722 | centred on the RA,Dec position 12$^h$53$^m$45$^s$, |
---|
723 | -36$^\circ$24$'$12$''$ will be called J125345$-$362412 (if the epoch |
---|
724 | is J2000) or B125345$-$362412 (if B1950). An alternative form is used |
---|
725 | for Galactic coordinates: a source centred on the position ($l$,$b$) = |
---|
726 | (323.1245, 5.4567) will be called G323.124$+$05.457. If the WCS is not |
---|
727 | valid (\ie is not present or does not have all the necessary |
---|
728 | information), the Name, RA, DEC, VEL and related columns are not |
---|
729 | printed, but the pixel coordinates are still provided. |
---|
730 | |
---|
731 | The velocity units can be specified by the user, using the parameter |
---|
732 | \texttt{spectralUnits} (enter it as a single string). The default value |
---|
733 | is km/s, which should be suitable for most users. These units are also |
---|
734 | used to give the units of integrated flux. |
---|
735 | |
---|
736 | The last column contains any warning flags about the detection. There |
---|
737 | are currently two options here. An `E' is printed if the detection is |
---|
738 | next to the edge of the image, meaning either the limit of the pixels, |
---|
739 | or the limit of the non-BLANK pixel region. An `N' is printed if the |
---|
740 | total flux, summed over all the (non-BLANK) pixels in the smallest box |
---|
741 | that completely encloses the detection, is negative. Note that this |
---|
742 | sum is likely to include non-detected pixels. It is of use in |
---|
743 | pointing out detections that lie next to strongly negative pixels, |
---|
744 | such as might arise due to interference -- the detected pixels might |
---|
745 | then also be due to the interference, so caution is advised. |
---|
746 | |
---|
747 | \subsubsection{Other results lists} |
---|
748 | |
---|
749 | Two alternative results files can also be requested. One option is a |
---|
750 | VOTable-format XML file, containing just the RA, Dec, Velocity and the |
---|
751 | corresponding widths of the detections, as well as the fluxes. The |
---|
752 | user should set \texttt{flagVOT = 1}, and put the desired filename in the |
---|
753 | parameter \texttt{votFile} -- note that the default is for it not to be |
---|
754 | produced. This file should be compatible with all Virtual Observatory |
---|
755 | tools (such as Aladin\footnote{ Aladin can be found on the web at |
---|
756 | \href{http://aladin.u-strasbg.fr/}{http://aladin.u-strasbg.fr/}}). The |
---|
757 | second option is an annotation file for use with the Karma toolkit of |
---|
758 | visualisation tools (in particular, with \texttt{kvis}). This will draw a |
---|
759 | circle at the position of each detection, and number it according to |
---|
760 | the Obj\# given above. To make use of this option, the user should |
---|
761 | set \texttt{flagKarma = 1}, and put the desired filename in the parameter |
---|
762 | \texttt{karmaFile} -- again, the default is for it not to be produced. |
---|
763 | |
---|
764 | As the program is running, it also (optionally) records the detections |
---|
765 | made in each individual spectrum or channel (see |
---|
766 | \S\ref{sec-detection} for details on this process). This is |
---|
767 | recorded in the file given by the parameter \texttt{LogFile}. This file |
---|
768 | does not include the columns \texttt{Name, RA, DEC, w\_RA, w\_DEC, VEL, |
---|
769 | w\_VEL}. This file is designed primarily for diagnostic purposes: \eg |
---|
770 | to see if a given set of pixels is detected in, say, one channel |
---|
771 | image, but does not survive the merging process. The list of pixels |
---|
772 | (and their fluxes) in the final detection list are also printed to |
---|
773 | this file, again for diagnostic purposes. This feature can be turned |
---|
774 | off by setting \texttt{flagLog = false}. (This may be a good idea if you |
---|
775 | are not interested in its contents, as it can be a large file.) |
---|
776 | |
---|
777 | \begin{figure}[t] |
---|
778 | \begin{center} |
---|
779 | \includegraphics[width=\textwidth]{example_spectrum} |
---|
780 | \end{center} |
---|
781 | \caption{\footnotesize An example of the spectrum output. Note several |
---|
782 | of the features discussed in the text: the red lines indicating the |
---|
783 | reconstructed spectrum; the blue dashed lines indicating the |
---|
784 | spectral extent of the detection; the green hashed area indicating |
---|
785 | the Milky Way channels that are ignored by the searching algorithm; |
---|
786 | the blue border showing its spatial extent on the 0th moment map; |
---|
787 | and the 15~arcmin-long scale bar.} |
---|
788 | \label{fig-spect} |
---|
789 | \end{figure} |
---|
790 | |
---|
791 | \begin{figure}[!t] |
---|
792 | \begin{center} |
---|
793 | \includegraphics[width=\textwidth]{example_moment_map} |
---|
794 | \end{center} |
---|
795 | \caption{\footnotesize An example of the moment map created by |
---|
796 | \duchamp. The full extent of the cube is covered, and the 0th moment |
---|
797 | of each object is shown (integrated individually over all the |
---|
798 | detected channels).} |
---|
799 | \label{fig-moment} |
---|
800 | \end{figure} |
---|
801 | |
---|
802 | \subsubsection{Graphical output -- spectra} |
---|
803 | |
---|
804 | As well as the output data file, a postscript file is created that |
---|
805 | shows the spectrum for each detection, together with a small cutout |
---|
806 | image (the 0th moment) and basic information about the detection (note |
---|
807 | that any flags are printed after the name of the detection, in the |
---|
808 | format \texttt{[E]}). If the cube was reconstructed, the spectrum from |
---|
809 | the reconstruction is shown in red, over the top of the original |
---|
810 | spectrum. The spectral extent of the detected object is indicated by |
---|
811 | two dashed blue lines, and the region covered by the ``Milky Way'' |
---|
812 | channels is shown by a green hashed box. |
---|
813 | |
---|
814 | The spectrum that is plotted is governed by the |
---|
815 | \texttt{spectralMethod} parameter. It can be either \texttt{peak}, |
---|
816 | where the spectrum is from the spatial pixel containing the |
---|
817 | detection's peak flux; or \texttt{sum}, where the spectrum is summed |
---|
818 | over all spatial pixels, and then corrected for the beam size. |
---|
819 | |
---|
820 | The spectral extent of the detection is indicated with blue lines, and |
---|
821 | a zoom is shown in a separate window. The cutout image can optionally |
---|
822 | include a border around the spatial pixels that are in the detection |
---|
823 | (turned on and off by the parameter \texttt{drawBorders} -- the |
---|
824 | default is \texttt{true}). It also includes a scale bar in the bottom |
---|
825 | left corner to indicate size -- it is 15~arcmin long (note that due to |
---|
826 | projection effects it may be a slightly different physical length from |
---|
827 | object to object). An example detection can be seen below in |
---|
828 | Fig.~\ref{fig-spect}. |
---|
829 | |
---|
830 | \subsubsection{Graphical output -- maps} |
---|
831 | |
---|
832 | Finally, a couple of images are optionally produced: a 0th moment map |
---|
833 | of the cube, combining just the detected channels in each object, |
---|
834 | showing the integrated flux in grey-scale; and a ``detection image'', |
---|
835 | a grey-scale image where the pixel values are the number of channels |
---|
836 | that spatial pixel is detected in. In both cases, if |
---|
837 | \texttt{drawBorders = true}, a border is drawn around the spatial |
---|
838 | extent of each detection. An example moment map is shown in |
---|
839 | Fig.~\ref{fig-moment}. The production or otherwise of these images is |
---|
840 | governed by the \texttt{flagMaps} parameter. |
---|
841 | |
---|
842 | The purpose of these images are to provide a visual guide to where the |
---|
843 | detections have been made, and, particularly in the case of the moment |
---|
844 | map, to provide an indication of the strength of the source. In both |
---|
845 | cases, the detections are numbered (in the same sense as the output |
---|
846 | list), and the spatial borders are marked out as for the cutout images |
---|
847 | in the spectra file. Both these images are saved as postscript files |
---|
848 | (given by the parameters \texttt{momentMap} and \texttt{detectionMap} |
---|
849 | respectively), with the latter also displayed in a {\sc pgplot} |
---|
850 | window (regardless of the state of \texttt{flagMaps}). |
---|
851 | |
---|
852 | \section{Notes and hints on the use of \duchamp} |
---|
853 | \label{sec-notes} |
---|
854 | |
---|
855 | In using \duchamp, the user has to make a number of decisions about |
---|
856 | the way the program runs. This section is designed to give the user |
---|
857 | some idea about what to choose. |
---|
858 | |
---|
859 | The main choice is whether or not to use the wavelet |
---|
860 | reconstruction. The main benefits of this are the marked reduction in |
---|
861 | the noise level, leading to regularly-shaped detections, and good |
---|
862 | reliability for faint sources. The main drawback with its use is the |
---|
863 | long execution time: to reconstruct a $170\times160\times1024$ |
---|
864 | (\hipass) cube often requires three iterations and takes about 20-25 |
---|
865 | minutes to run completely. Note that this is for the three-dimensional |
---|
866 | reconstruction: using \texttt{reconDim=1} makes the reconstruction |
---|
867 | quicker (the full program then takes about 6 minutes), but it is still |
---|
868 | the largest part of the time. |
---|
869 | |
---|
870 | The searching part of the procedure is much quicker: searching an |
---|
871 | un-reconstructed cube leads to execution times of only a couple of |
---|
872 | minutes. Alternatively, using the ability to read in previously-saved |
---|
873 | reconstructed arrays makes running the reconstruction more than once a |
---|
874 | more feasible prospect. |
---|
875 | |
---|
876 | On the positive side, the shape of the detections in a cube that has |
---|
877 | been reconstructed will be much more regular and smooth -- the ragged |
---|
878 | edges that objects in the raw cube possess are smoothed by the removal |
---|
879 | of most of the noise. This enables better determination of the shapes |
---|
880 | and characteristics of objects. |
---|
881 | |
---|
882 | A further point to consider when using the reconstruction is that if |
---|
883 | the two-dimensional reconstruction is chosen (\texttt{reconDim=2}), it |
---|
884 | can be susceptible to edge effects. If the valid area in the cube (\ie |
---|
885 | the part that is not BLANK) has non-rectangular edges, the convolution |
---|
886 | can produce artefacts in the reconstruction that mimic the edges and |
---|
887 | can lead (depending on the selection threshold) to some spurious |
---|
888 | sources. Caution is advised with such data -- the user is advised to |
---|
889 | check carefully the reconstructed cube for the presence of such |
---|
890 | artefacts. Note, however, that the 1- and 3-dimensional |
---|
891 | reconstructions are \emph{not} susceptible in the same way, since the |
---|
892 | spectral direction does not generally exhibit these BLANK edges, and |
---|
893 | so we recommend the use of either of these. |
---|
894 | |
---|
895 | If one chooses the reconstruction method, a further decision is |
---|
896 | required on the signal-to-noise cutoff used in determining acceptable |
---|
897 | wavelet coefficients. A larger value will remove more noise from the |
---|
898 | cube, at the expense of losing fainter sources, while a smaller value |
---|
899 | will include more noise, which may produce spurious detections, but |
---|
900 | will be more sensitive to faint sources. Values of less than about |
---|
901 | $3\sigma$ tend to not reduce the noise a great deal and can lead to |
---|
902 | many spurious sources (although this will depend on the nature of the |
---|
903 | cube). |
---|
904 | |
---|
905 | When it comes to searching, the FDR method produces more reliable results |
---|
906 | than simple sigma-clipping, particularly in the absence of reconstruction. |
---|
907 | However, it does not work in exactly the way one would expect for a |
---|
908 | given value of \texttt{alpha}. For instance, setting fairly liberal values |
---|
909 | of \texttt{alpha} (say, 0.1) will often lead to a much smaller fraction |
---|
910 | of false detections (\ie much less than 10\%). This is the effect of the |
---|
911 | merging algorithms, that combine the sources after the detection stage, |
---|
912 | and reject detections not meeting the minimum pixel or channel requirements. |
---|
913 | It is thus better to aim for larger \texttt{alpha} values than those derived |
---|
914 | from a straight conversion of the desired false detection rate. |
---|
915 | |
---|
916 | Finally, as \duchamp\ is still undergoing development, there are some |
---|
917 | elements that are not fully developed. In particular, it is not as |
---|
918 | clever as I would like at avoiding interference. The ability to place |
---|
919 | requirements on the minimum number of channels and pixels partially |
---|
920 | circumvents this problem, but work is being done to make \duchamp\ |
---|
921 | smarter at rejecting signals that are clearly (to a human eye at |
---|
922 | least) interference. See the following section for further |
---|
923 | improvements that are planned. |
---|
924 | |
---|
925 | %\section{Drawbacks of the current program} |
---|
926 | % |
---|
927 | %The program currently has a few problems/drawbacks/things to be aware |
---|
928 | %of that will hopefully be fixed in the future: |
---|
929 | %\begin{itemize} |
---|
930 | % |
---|
931 | %\item Narrow interference spikes are still getting found, particularly |
---|
932 | % if there is no reconstruction, or reconstruction with a relatively |
---|
933 | % low \texttt{snrRecon} (such as 2 or 3). Increasing the |
---|
934 | % \texttt{minChannels} parameter is one way to circumvent this, but |
---|
935 | % making the algorithm a bit more clever would be preferable. |
---|
936 | % |
---|
937 | %\item Sources that have strong continuum ripple and/or artefacts often |
---|
938 | % generate many spurious detections. This needs some work to avoid |
---|
939 | % \duchamp\ doing this, and until then users are advised to be aware |
---|
940 | % of the possibility. Strong continuum ripples may generate many |
---|
941 | % sources on the same spatial pixel, and this will be apparent on the |
---|
942 | % detection images. |
---|
943 | % |
---|
944 | %\item Spectra are integrated over every spatial pixel of the |
---|
945 | % detection, and this may dilute the actual detection, making it |
---|
946 | % harder to see \ie the apparent strength of the line as plotted may |
---|
947 | % not give a true indication of how strong it really is. |
---|
948 | % |
---|
949 | %%\item A caution on the merging part of the procedure. This can be time |
---|
950 | %% consuming if there are many detections that do not require merging |
---|
951 | %% -- in this case, the time will go like $N^2$ ($N$ = number of |
---|
952 | %% detections). If there are plenty of mergers, the size of the list |
---|
953 | %% reduces quickly, so the execution time will be less. |
---|
954 | % |
---|
955 | % |
---|
956 | %\end{itemize} |
---|
957 | |
---|
958 | |
---|
959 | %\section{Comparison with other software (to be developed further...)} |
---|
960 | % |
---|
961 | %\subsection{fred, by Matt Howlett} |
---|
962 | % |
---|
963 | %This is the program used in the \hipass\ analysis. It smoothes the |
---|
964 | %data spectrally with a boxcar filter of a size that varies over a |
---|
965 | %user-specified range, and then thresholds the data. |
---|
966 | % |
---|
967 | %Works effectively, but generally doesn't find as many sources as |
---|
968 | %\duchamp, particularly when the reconstruction is used. Sensitive to |
---|
969 | %faint, broad features that fall below the reconstruction threshold. |
---|
970 | % |
---|
971 | %Execution takes a long time, depending on the range of filter widths |
---|
972 | %that are used. |
---|
973 | % |
---|
974 | %\subsection{sfind} |
---|
975 | % |
---|
976 | %Hard to evaluate, as it does not (as far as I can see) output the |
---|
977 | %channel number at which detections are made, and does not merge |
---|
978 | %detections made at adjacent channels (\ie it just works in 2 |
---|
979 | %dimensions). |
---|
980 | % |
---|
981 | |
---|
982 | \section{Future Developments} |
---|
983 | |
---|
984 | This is both a list of planned improvements and a wish-list of |
---|
985 | features that would be nice to include (but are not planned in the |
---|
986 | immediate future). Let me know if there are items not on this list, or |
---|
987 | items on the list you would like prioritised. |
---|
988 | |
---|
989 | \begin{itemize} |
---|
990 | |
---|
991 | \item Better determination of the noise characteristics of |
---|
992 | spectral-line cubes, including understanding how the noise is |
---|
993 | generated and developing a model for it. \textbf{Planned.} |
---|
994 | |
---|
995 | \item Include more source analysis. Examples could be: shape |
---|
996 | information; measurements of HI mass; more variety of measurements |
---|
997 | of velocity width and profile. \textbf{Some planned.} |
---|
998 | |
---|
999 | \item Provide some indication of the significance of the detection |
---|
1000 | (\ie some S/N-like value). \textbf{Planned.} |
---|
1001 | |
---|
1002 | \item Improved ability to reject interference, possibly on the |
---|
1003 | spectral shape of features. \textbf{Planned.} |
---|
1004 | |
---|
1005 | \item Ability to separate (de-blend) distinct sources that have been |
---|
1006 | merged. \textbf{Planned.} |
---|
1007 | |
---|
1008 | \item Link to lists of possible counterparts (\eg via NED/SIMBAD/other |
---|
1009 | VO tools?). \textbf{Wish-list.} |
---|
1010 | |
---|
1011 | \item On-line web service interface, so a user can upload a cube and |
---|
1012 | get back a source-list. \textbf{Wish-list}. |
---|
1013 | |
---|
1014 | \item Embed \duchamp\ in a GUI, to move away from the text-based |
---|
1015 | interaction. \textbf{Wish-list}. |
---|
1016 | \end{itemize} |
---|
1017 | |
---|
1018 | |
---|
1019 | %\bibliographystyle{mn2e} |
---|
1020 | %\bibliographystyle{abbrvnat} |
---|
1021 | %\bibliography{mnrasmnemonic,sourceDetection} |
---|
1022 | \begin{thebibliography}{} |
---|
1023 | |
---|
1024 | \bibitem[\protect\citeauthoryear{{Calabretta} \& {Greisen}}{{Calabretta} \& |
---|
1025 | {Greisen}}{2002}]{calabretta02} |
---|
1026 | {Calabretta} M., {Greisen} E., 2002, A\&A, 395, 1077 |
---|
1027 | |
---|
1028 | \bibitem[\protect\citeauthoryear{{Greisen} \& {Calabretta}}{{Greisen} \& |
---|
1029 | {Calabretta}}{2002}]{greisen02} |
---|
1030 | {Greisen} E., {Calabretta} M., 2002, A\&A, 395, 1061 |
---|
1031 | |
---|
1032 | \bibitem[\protect\citeauthoryear{{Hanisch}, {Farris}, {Greisen}, {Pence}, |
---|
1033 | {Schlesinger}, {Teuben}, {Thompson} \& {Warnock}}{{Hanisch} |
---|
1034 | et~al.}{2001}]{hanisch01} |
---|
1035 | {Hanisch} R., {Farris} A., {Greisen} E., {Pence} W., {Schlesinger} B., |
---|
1036 | {Teuben} P., {Thompson} R., {Warnock} A., 2001, A\&A, 376, 359 |
---|
1037 | |
---|
1038 | \bibitem[\protect\citeauthoryear{{Hopkins}, {Miller}, {Connolly}, {Genovese}, |
---|
1039 | {Nichol} \& {Wasserman}}{{Hopkins} et~al.}{2002}]{hopkins02} |
---|
1040 | {Hopkins} A., {Miller} C., {Connolly} A., {Genovese} C., {Nichol} R., |
---|
1041 | {Wasserman} L., 2002, AJ, 123, 1086 |
---|
1042 | |
---|
1043 | \bibitem[\protect\citeauthoryear{Lutz}{Lutz}{1980}]{lutz80} |
---|
1044 | Lutz R., 1980, The Computer Journal, 23, 262 |
---|
1045 | |
---|
1046 | \bibitem[\protect\citeauthoryear{{Meyer} et~al.,}{{Meyer} |
---|
1047 | et~al.}{2004}]{meyer04:trunc} |
---|
1048 | {Meyer} M., et~al., 2004, MNRAS, 350, 1195 |
---|
1049 | |
---|
1050 | \bibitem[\protect\citeauthoryear{{Miller}, {Genovese}, {Nichol}, {Wasserman}, |
---|
1051 | {Connolly}, {Reichart}, {Hopkins}, {Schneider} \& {Moore}}{{Miller} |
---|
1052 | et~al.}{2001}]{miller01} |
---|
1053 | {Miller} C., {Genovese} C., {Nichol} R., {Wasserman} L., {Connolly} A., |
---|
1054 | {Reichart} D., {Hopkins} A., {Schneider} J., {Moore} A., 2001, AJ, 122, |
---|
1055 | 3492 |
---|
1056 | |
---|
1057 | \bibitem[\protect\citeauthoryear{Minchin}{Minchin}{1999}]{minchin99} |
---|
1058 | Minchin R., 1999, PASA, 16, 12 |
---|
1059 | |
---|
1060 | \bibitem[\protect\citeauthoryear{Starck \& Murtagh}{Starck \& |
---|
1061 | Murtagh}{2002}]{starck02:book} |
---|
1062 | Starck J.-L., Murtagh F., 2002, {``Astronomical Image and Data Analysis''}. |
---|
1063 | Springer |
---|
1064 | |
---|
1065 | \end{thebibliography} |
---|
1066 | |
---|
1067 | |
---|
1068 | \appendix |
---|
1069 | \newpage |
---|
1070 | \section{Obtaining and Installing \duchamp} |
---|
1071 | |
---|
1072 | The \duchamp\ web page can be found at the following location:\\ |
---|
1073 | \href{http://www.atnf.csiro.au/people/Matthew.Whiting/Duchamp}% |
---|
1074 | {http://www.atnf.csiro.au/people/Matthew.Whiting/Duchamp}\\ |
---|
1075 | Here you can find a gzipped tar archive of the source code that can be |
---|
1076 | downloaded and extracted, as well as this User's Guide in postscript |
---|
1077 | and hyperlinked PDF formats. |
---|
1078 | |
---|
1079 | \duchamp\ can be built on Unix systems by typing: |
---|
1080 | \begin{quote} |
---|
1081 | \texttt{% |
---|
1082 | > ./configure\\ |
---|
1083 | > make\\ |
---|
1084 | > make clean (optional -- to remove the object files)} |
---|
1085 | \end{quote} |
---|
1086 | |
---|
1087 | Run in this manner, \texttt{configure} should find all the necessary |
---|
1088 | libraries, but if the above-mentioned libraries have been installed in |
---|
1089 | non-standard locations, you can specify additional directories to look |
---|
1090 | in. There are separate options for library files (eg. libcpgplot.a) |
---|
1091 | and header files (eg. cpgplot.h). |
---|
1092 | |
---|
1093 | For example, if \textsc{wcslib} had been installed in |
---|
1094 | \texttt{/home/mduchamp/wcslib}, there are two libraries that are |
---|
1095 | likely to be in separate subdirectories: \texttt{C/} and |
---|
1096 | \texttt{pgsbox/}. Each subdirectory needs to be searched for library |
---|
1097 | and header files, so one could build Duchamp by typing: |
---|
1098 | \begin{quote} |
---|
1099 | \texttt{% |
---|
1100 | > ./configure $\backslash$ \\ |
---|
1101 | LIBDIRS="/home/mduchamp/wcslib/C /home/mduchamp/wcslib/pgsbox" |
---|
1102 | $\backslash$\\ |
---|
1103 | INCDIRS="/home/mduchamp/wcslib/C /home/mduchamp/wcslib/pgsbox"} |
---|
1104 | \end{quote} |
---|
1105 | And then just run make in the usual fashion: |
---|
1106 | \begin{quote} |
---|
1107 | \texttt{> make} |
---|
1108 | \end{quote} |
---|
1109 | |
---|
1110 | This will produce the executable \texttt{Duchamp}. There are two |
---|
1111 | possible ways to run it. The first is: |
---|
1112 | \begin{quote} |
---|
1113 | \texttt{> Duchamp -f [FITS file]} |
---|
1114 | \end{quote} |
---|
1115 | where \texttt{[FITS file]} is the file you wish to search. This method |
---|
1116 | simply uses the default values of all parameters. |
---|
1117 | |
---|
1118 | The second method allows some determination of the parameter values by |
---|
1119 | the user. Type: |
---|
1120 | \begin{quote} |
---|
1121 | \texttt{> Duchamp -p [parameter file]} |
---|
1122 | \end{quote} |
---|
1123 | where \texttt{[parameterFile]} is a file with the input parameters, |
---|
1124 | including the name of the cube you want to search. There are two |
---|
1125 | example input files included with the distribution. The smaller one, |
---|
1126 | \texttt{InputExample}, shows the typical parameters one might want to |
---|
1127 | set. The large one, \texttt{InputComplete}, lists all possible |
---|
1128 | parameters that can be entered, and a brief description of them. To |
---|
1129 | get going quickly, just replace the "your-file-here" in |
---|
1130 | \texttt{InputExample} with your image name, and type |
---|
1131 | \begin{quote} |
---|
1132 | \texttt{> Duchamp -p InputExample} |
---|
1133 | \end{quote} |
---|
1134 | |
---|
1135 | The following appendices provide details on the individual parameters, |
---|
1136 | and show examples of the output files that \duchamp\ produces. |
---|
1137 | |
---|
1138 | \newpage |
---|
1139 | \section{Available parameters} |
---|
1140 | \label{app-param} |
---|
1141 | |
---|
1142 | The full list of parameters that can be listed in the input file are |
---|
1143 | given here. If not listed, they take the default value given in |
---|
1144 | parentheses. Since the order of the parameters in the input file does |
---|
1145 | not matter, they are grouped here in logical sections. |
---|
1146 | |
---|
1147 | \subsection*{Input-output related} |
---|
1148 | \begin{entry} |
---|
1149 | \item[ImageFile (no default assumed)] The filename of the |
---|
1150 | data cube to be analysed. |
---|
1151 | \item[flagSubsection \texttt{[false]}] A flag to indicate whether one |
---|
1152 | wants a subsection of the requested image. |
---|
1153 | \item[Subsection \texttt{[ [*,*,*] ]}] The requested subsection, which |
---|
1154 | should be specified in the format \texttt{[x1:x2,y1:y2,z1:z2]}, where |
---|
1155 | the limits are inclusive. If the full range of a dimension is |
---|
1156 | required, use a \texttt{*}, \eg if you want the full spectral range of |
---|
1157 | a subsection of the image, use \texttt{[30:140,30:140,*]}. |
---|
1158 | \item[flagReconExists \texttt{[false]}] A flag to indicate whether the |
---|
1159 | reconstructed array has been saved by a previous run of \duchamp. If |
---|
1160 | set true, the reconstructed array will be read from the file given by |
---|
1161 | \texttt{reconFile}, rather than calculated directly. |
---|
1162 | \item[reconFile (no default assumed)] The FITS file that contains the |
---|
1163 | reconstructed array. If \texttt{flagReconExists} is true and this |
---|
1164 | parameter is not defined, the default file searched will be |
---|
1165 | determined by the \atrous\ parameters (see \S\ref{sec-recon}). |
---|
1166 | \item[OutFile \texttt{[duchamp-Results.txt]}] The file containing the |
---|
1167 | final list of detections. This also records the list of input |
---|
1168 | parameters. |
---|
1169 | \item[SpectraFile \texttt{[duchamp-Spectra.ps]}] The postscript file |
---|
1170 | containing the resulting integrated spectra and images of the |
---|
1171 | detections. |
---|
1172 | \item[flagLog \texttt{[true]}] A flag to indicate whether intermediate |
---|
1173 | detections should be logged. |
---|
1174 | \item[LogFile \texttt{[duchamp-Logfile.txt]}] The file in which intermediate |
---|
1175 | detections are logged. These are detections that have not been |
---|
1176 | merged. This is primarily for use in debugging and diagnostic |
---|
1177 | purposes -- normal use of the program will probably not require |
---|
1178 | this. |
---|
1179 | \item[flagOutputRecon \texttt{[false]}] A flag to say whether or not to |
---|
1180 | save the reconstructed cube as a FITS file. The filename will be |
---|
1181 | derived from the ImageFile -- the reconstruction of \texttt{image.fits} |
---|
1182 | will be saved as \texttt{image.RECON?.fits}, where \texttt{?} stands for |
---|
1183 | the value of \texttt{snrRecon} (see below). |
---|
1184 | \item[flagOutputResid \texttt{[false]}] As for \texttt{flagOutputRecon}, but |
---|
1185 | for the residual array -- the difference between the original cube |
---|
1186 | and the reconstructed cube. The filename will be \texttt{image.RESID?.fits}. |
---|
1187 | \item[flagVOT \texttt{[false]}] A flag to say whether to create a VOTable |
---|
1188 | file corresponding to the information in \texttt{outfile}. This will be |
---|
1189 | an XML file in the Virtual Observatory VOTable format. |
---|
1190 | \item[votFile \texttt{[duchamp-Results.xml]}] The VOTable file with the |
---|
1191 | list of final detections. Some input parameters are also recorded. |
---|
1192 | \item[flagKarma \texttt{[false]}] A flag to say whether to create a |
---|
1193 | Karma annotation file corresponding to the information in |
---|
1194 | \texttt{outfile}. This can be used as an overlay for the Karma |
---|
1195 | programs such as \texttt{kvis}. |
---|
1196 | \item[karmaFile \texttt{[duchamp-Results.ann]}] The Karma annotation |
---|
1197 | file showing the list of final detections. |
---|
1198 | \item[flagMaps \texttt{[true]}] A flag to say whether to save |
---|
1199 | postscript files showing the 0th moment map of the whole cube |
---|
1200 | (parameter \texttt{momentMap}) and the detection image |
---|
1201 | (\texttt{detectionMap}). |
---|
1202 | \item[momentMap \texttt{[duchamp-MomentMap.ps]}] A postscript file |
---|
1203 | containing a map of the 0th moment of the detected sources, as well |
---|
1204 | as pixel and WCS coordinates. |
---|
1205 | \item[detectionMap \texttt{[duchamp-DetectionMap.ps]}] A postscript |
---|
1206 | file showing each of the detected objects, coloured in greyscale by |
---|
1207 | the number of channels spanned by each pixel. Also shows pixel and WCS |
---|
1208 | coordinates. |
---|
1209 | \end{entry} |
---|
1210 | |
---|
1211 | \subsection*{Modifying the cube} |
---|
1212 | \begin{entry} |
---|
1213 | \item[flagBlankPix \texttt{[true]}] A flag to say whether to remove BLANK |
---|
1214 | pixels from the analysis -- these are pixels set to some particular |
---|
1215 | value because they fall outside the imaged area. |
---|
1216 | \item[blankPixValue \texttt{[-8.00061]}] The value of the BLANK pixels, |
---|
1217 | if this information is not contained in the FITS header (the usual |
---|
1218 | procedure is to obtain this value from the header information -- in |
---|
1219 | which case the value set by this parameter is ignored). |
---|
1220 | \item[flagMW \texttt{[false]}] A flag to say whether to ignore channels |
---|
1221 | contaminated by Milky Way (or other) emission -- the searching |
---|
1222 | algorithms will not look at these channels. |
---|
1223 | \item[maxMW \texttt{[112]}] The maximum channel number containing |
---|
1224 | ``Milky Way'' emission. |
---|
1225 | \item[minMW \texttt{[75]}] The minimum channel number containing |
---|
1226 | ``Milky Way'' emission. Note that the range specified by |
---|
1227 | \texttt{maxMW} and \texttt{minMW} is inclusive. |
---|
1228 | \item[flagBaseline \texttt{[false]}] A flag to say whether to remove the |
---|
1229 | baseline from each spectrum in the cube for the purposes of |
---|
1230 | reconstruction and detection. |
---|
1231 | \end{entry} |
---|
1232 | |
---|
1233 | \subsection*{Detection related} |
---|
1234 | |
---|
1235 | \subsubsection*{General detection} |
---|
1236 | \begin{entry} |
---|
1237 | \item[flagNegative \texttt{[false]}] A flag to indicate that the features |
---|
1238 | being searched for are negative. The cube will be inverted prior to |
---|
1239 | searching. |
---|
1240 | \item[snrCut \texttt{[3.]}] The cut-off value for thresholding, in terms |
---|
1241 | of number of $\sigma$ above the mean. |
---|
1242 | \item[flagGrowth \texttt{[false]}] A flag indicating whether or not to |
---|
1243 | grow the detected objects to a smaller threshold. |
---|
1244 | \item[growthCut \texttt{[2.]}] The smaller threshold using in growing |
---|
1245 | detections. In units of $\sigma$ above the mean. |
---|
1246 | \end{entry} |
---|
1247 | |
---|
1248 | \subsubsection*{\Atrous\ reconstruction} |
---|
1249 | \begin{entry} |
---|
1250 | \item [flagATrous \texttt{[true]}] A flag indicating whether or not to |
---|
1251 | reconstruct the cube using the \atrous\ wavelet |
---|
1252 | reconstruction. See \S\ref{sec-recon} for details. |
---|
1253 | \item[reconDim \texttt{[3]}] The number of dimensions to use in the |
---|
1254 | reconstruction. 1 means reconstruct each spectrum separately, 2 |
---|
1255 | means each channel map is done separately, and 3 means do the whole |
---|
1256 | cube in one go. |
---|
1257 | \item[scaleMin \texttt{[1]}] The minimum wavelet scale to be used in the |
---|
1258 | reconstruction. A value of 1 means ``use all scales''. |
---|
1259 | \item[snrRecon \texttt{[4]}] The thresholding cutoff used in the |
---|
1260 | reconstruction -- only wavelet coefficients this many $\sigma$ above |
---|
1261 | the mean (or greater) are included in the reconstruction. |
---|
1262 | \item[filterCode \texttt{[1]}] The code number of the filter to use in |
---|
1263 | the reconstruction. The options are: |
---|
1264 | \begin{itemize} |
---|
1265 | \item \textbf{1:} B$_3$-spline filter: coefficients = |
---|
1266 | $(\frac{1}{16}, \frac{1}{4}, \frac{3}{8}, \frac{1}{4}, \frac{1}{16})$ |
---|
1267 | \item \textbf{2:} Triangle filter: coefficients = $(\frac{1}{4}, \frac{1}{2}, \frac{1}{4})$ |
---|
1268 | \item \textbf{3:} Haar wavelet: coefficients = $(0, \frac{1}{2}, \frac{1}{2})$ |
---|
1269 | \end{itemize} |
---|
1270 | \end{entry} |
---|
1271 | |
---|
1272 | \subsubsection*{FDR method} |
---|
1273 | \begin{entry} |
---|
1274 | \item[flagFDR \texttt{[false]}] A flag indicating whether or not to use |
---|
1275 | the False Discovery Rate method in thresholding the pixels. |
---|
1276 | \item[alphaFDR \texttt{[0.01]}] The $\alpha$ parameter used in the FDR |
---|
1277 | analysis. The average number of false detections, as a fraction of the |
---|
1278 | total number, will be less than $\alpha$ (see \S\ref{sec-detection}). |
---|
1279 | \end{entry} |
---|
1280 | |
---|
1281 | \subsubsection*{Merging detections} |
---|
1282 | \begin{entry} |
---|
1283 | \item[minPix \texttt{[2]}] The minimum number of spatial pixels for a single |
---|
1284 | detection to be counted. |
---|
1285 | \item[minChannels \texttt{[3]}] The minimum number of consecutive |
---|
1286 | channels that must be present in a detection. |
---|
1287 | \item[flagAdjacent \texttt{[true]}] A flag indicating whether to use the |
---|
1288 | ``adjacent pixel'' criterion to decide whether to merge objects. If |
---|
1289 | not, the next two parameters are used to determine whether objects |
---|
1290 | are within the necessary thresholds. |
---|
1291 | \item[threshSpatial \texttt{[3.]}] The maximum allowed minimum spatial |
---|
1292 | separation (in pixels) between two detections for them to be merged |
---|
1293 | into one. Only used if \texttt{flagAdjacent = false}. |
---|
1294 | \item[threshVelocity \texttt{[7.]}] The maximum allowed minimum channel |
---|
1295 | separation between two detections for them to be merged into |
---|
1296 | one. |
---|
1297 | \end{entry} |
---|
1298 | |
---|
1299 | \subsubsection*{Other parameters} |
---|
1300 | \begin{entry} |
---|
1301 | \item[spectralMethod \texttt{[peak]}] This indicates which method is used |
---|
1302 | to plot the output spectra: \texttt{peak} means plot the spectrum |
---|
1303 | containing the detection's peak pixel; \texttt{sum} means sum the |
---|
1304 | spectra of each detected spatial pixel, and correct for the beam |
---|
1305 | size. Any other choice defaults to \texttt{peak}. |
---|
1306 | \item[spectralUnits \texttt{[km/s]}] The user can specify the units of |
---|
1307 | the spectral axis. Assuming the WCS of the FITS file is valid, the |
---|
1308 | spectral axis is transformed into velocity, and put into these units |
---|
1309 | for all output and for calculations such as the integrated flux of a |
---|
1310 | detection. |
---|
1311 | \item[drawBorders \texttt{[true]}] A flag indicating whether borders |
---|
1312 | are to be drawn around the detected objects in the moment maps |
---|
1313 | included in the output (see for example Fig.~\ref{fig-spect}). |
---|
1314 | \item[verbose \texttt{[true]}] A flag indicating whether to print the |
---|
1315 | progress of computationally-intensive algorithms (such as the |
---|
1316 | searching and merging) to screen. |
---|
1317 | \end{entry} |
---|
1318 | |
---|
1319 | |
---|
1320 | \newpage |
---|
1321 | \section{Example parameter files} |
---|
1322 | \label{app-input} |
---|
1323 | |
---|
1324 | This is what a typical parameter file would look like. |
---|
1325 | |
---|
1326 | \begin{verbatim} |
---|
1327 | imageFile /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits |
---|
1328 | logFile logfile.txt |
---|
1329 | outFile results.txt |
---|
1330 | spectraFile spectra.ps |
---|
1331 | flagSubsection 0 |
---|
1332 | flagOutputRecon 0 |
---|
1333 | flagOutputResid 0 |
---|
1334 | flagBlankPix 1 |
---|
1335 | flagMW 1 |
---|
1336 | minMW 75 |
---|
1337 | maxMW 112 |
---|
1338 | minPix 3 |
---|
1339 | flagGrowth 1 |
---|
1340 | growthCut 1.5 |
---|
1341 | flagATrous 0 |
---|
1342 | scaleMin 1 |
---|
1343 | snrRecon 4 |
---|
1344 | flagFDR 1 |
---|
1345 | alphaFDR 0.1 |
---|
1346 | numPixPSF 20 |
---|
1347 | snrCut 3 |
---|
1348 | threshSpatial 3 |
---|
1349 | threshVelocity 7 |
---|
1350 | \end{verbatim} |
---|
1351 | |
---|
1352 | Note that it is not necessary to include all these parameters in the |
---|
1353 | file, only those that need to be changed from the defaults (as listed |
---|
1354 | in Appendix~\ref{app-param}), which in this case would be very few. A |
---|
1355 | minimal parameter file might look like: |
---|
1356 | \begin{verbatim} |
---|
1357 | imageFile /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits |
---|
1358 | flagLog 0 |
---|
1359 | snrRecon 3 |
---|
1360 | snrCut 2.5 |
---|
1361 | minChannels 4 |
---|
1362 | \end{verbatim} |
---|
1363 | This will reconstruct the cube with a lower SNR value than the |
---|
1364 | default, select objects at a lower threshold, with a looser minimum |
---|
1365 | channel requirement, and not keep a log of the intermediate |
---|
1366 | detections. |
---|
1367 | |
---|
1368 | The following page demonstrates how the parameters are presented to the |
---|
1369 | user, both on the screen at execution time, and in the output and log |
---|
1370 | files. On each line, there is a description on the parameter, the relevant |
---|
1371 | parameter name that is used in the input file (if there is one that the |
---|
1372 | user can enter), and the value of the parameter being used. |
---|
1373 | \newpage |
---|
1374 | \begin{landscape} |
---|
1375 | Typical presentation of parameters in output and log files: |
---|
1376 | \begin{verbatim} |
---|
1377 | ---- Parameters ---- |
---|
1378 | Image to be analysed.........................[imageFile] = input.fits |
---|
1379 | Intermediate Logfile...........................[logFile] = duchamp-Logfile.txt |
---|
1380 | Final Results file.............................[outFile] = duchamp-Results.txt |
---|
1381 | Spectrum file..............................[spectraFile] = duchamp-Spectra.ps |
---|
1382 | 0th Moment Map...............................[momentMap] = duchamp-MomentMap.ps |
---|
1383 | Detection Map.............................[detectionMap] = duchamp-DetectionMap.ps |
---|
1384 | Saving reconstructed cube?.............[flagoutputrecon] = false |
---|
1385 | Saving residuals from reconstruction?..[flagoutputresid] = false |
---|
1386 | ------ |
---|
1387 | Searching for Negative features?..........[flagNegative] = false |
---|
1388 | Fixing Blank Pixels?......................[flagBlankPix] = true |
---|
1389 | Blank Pixel Value....................................... = -8.00061 |
---|
1390 | Removing Milky Way channels?....................[flagMW] = true |
---|
1391 | Milky Way Channels.......................[minMW - maxMW] = 75-112 |
---|
1392 | Beam Size (pixels)...................................... = 10.1788 |
---|
1393 | Removing baselines before search?.........[flagBaseline] = false |
---|
1394 | Minimum # Pixels in a detection.................[minPix] = 2 |
---|
1395 | Minimum # Channels in a detection..........[minChannels] = 3 |
---|
1396 | Growing objects after detection?............[flagGrowth] = false |
---|
1397 | Using A Trous reconstruction?...............[flagATrous] = true |
---|
1398 | Number of dimensions in reconstruction........[reconDim] = 3 |
---|
1399 | Minimum scale in reconstruction...............[scaleMin] = 1 |
---|
1400 | SNR Threshold within reconstruction...........[snrRecon] = 4 |
---|
1401 | Filter being used for reconstruction........[filterCode] = 1 (B3 spline function) |
---|
1402 | Using FDR analysis?............................[flagFDR] = false |
---|
1403 | SNR Threshold...................................[snrCut] = 3 |
---|
1404 | Using Adjacent-pixel criterion?...........[flagAdjacent] = true |
---|
1405 | Max. velocity separation for merging....[threshVelocity] = 7 |
---|
1406 | Method of spectral plotting.............[spectralMethod] = peak |
---|
1407 | \end{verbatim} |
---|
1408 | |
---|
1409 | \newpage |
---|
1410 | \section{Example results file} |
---|
1411 | \label{app-output} |
---|
1412 | This the typical content of an output file, after running \duchamp\ |
---|
1413 | with the parameters illustrated on the previous page. |
---|
1414 | |
---|
1415 | {\scriptsize |
---|
1416 | \begin{verbatim} |
---|
1417 | Results of the \duchamp\ source finder: Tue May 23 14:51:38 2006 |
---|
1418 | ---- Parameters ---- |
---|
1419 | (... omitted for clarity -- see previous page for examples...) |
---|
1420 | -------------------- |
---|
1421 | Total number of detections = 25 |
---|
1422 | -------------------- |
---|
1423 | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
---|
1424 | Obj# Name X Y Z RA DEC VEL w_RA w_DEC w_VEL F_int F_peak X1 X2 Y1 Y2 Z1 Z2 Npix Flag |
---|
1425 | [km/s] [arcmin] [arcmin] [km/s] [Jy km/s] [Jy/beam] [pix] |
---|
1426 | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
---|
1427 | 1 J0618-2532 30.2 86.0 113.3 06:18:12.54 -25:32:44.79 208.502 45.17 34.61 26.383 24.394 0.350 25 35 82 90 112 114 137 E |
---|
1428 | 2 J0609-2156 59.5 140.6 114.6 06:09:19.66 -21:56:31.20 225.572 44.39 31.47 65.957 16.128 0.213 55 65 137 144 113 118 153 |
---|
1429 | 3 J0545-2143 141.2 143.2 114.8 05:45:51.71 -21:43:36.20 228.470 19.61 16.66 26.383 2.412 0.090 139 143 142 145 114 116 29 |
---|
1430 | 4 J0617-2633 33.3 70.8 115.6 06:17:25.52 -26:33:33.83 238.736 65.02 30.10 26.383 9.776 0.117 26 41 68 75 115 117 104 E |
---|
1431 | 5 J0601-2500 86.2 94.9 117.9 06:01:39.54 -25:00:32.46 269.419 27.99 24.02 26.383 3.920 0.124 83 89 92 97 117 119 44 |
---|
1432 | 6 J0602-2547 84.0 83.1 118.0 06:02:18.29 -25:47:31.69 270.319 20.01 19.99 26.383 2.999 0.118 82 86 81 85 117 119 34 |
---|
1433 | 7 J0547-2448 133.0 97.2 118.7 05:47:52.53 -24:48:38.16 279.113 19.72 12.54 26.383 1.474 0.074 131 135 96 98 118 120 21 |
---|
1434 | 8 J0606-2719 71.1 60.0 121.3 06:06:10.99 -27:19:48.61 314.090 52.36 39.59 39.574 14.268 0.150 65 77 55 64 120 123 154 |
---|
1435 | 9 J0611-2137 52.4 145.3 162.5 06:11:20.92 -21:37:29.57 857.955 32.39 23.49 118.722 43.178 0.410 49 56 142 147 158 167 265 E |
---|
1436 | 10 J0600-2859 89.7 35.3 202.4 06:00:34.08 -28:59:00.43 1383.160 23.93 24.10 171.487 24.439 0.173 87 92 33 38 196 209 271 |
---|
1437 | 11 J0558-2638 95.4 70.3 223.1 05:58:53.03 -26:38:45.91 1656.140 11.93 12.07 92.339 1.045 0.063 94 96 69 71 220 227 18 |
---|
1438 | 12 J0617-2723 34.7 58.3 227.4 06:17:07.07 -27:23:50.65 1712.868 16.75 23.53 290.209 8.529 0.093 33 36 56 61 215 237 118 |
---|
1439 | 13 J0558-2525 95.8 88.6 231.7 05:58:49.27 -25:25:33.60 1770.134 27.87 24.16 237.444 12.863 0.115 92 98 86 91 221 239 175 |
---|
1440 | 14 J0600-2141 88.8 144.4 232.5 06:00:54.02 -21:41:57.06 1780.188 27.96 24.13 224.252 30.743 0.166 86 92 142 147 222 239 344 E |
---|
1441 | 15 J0615-2634 40.0 70.8 232.6 06:15:25.50 -26:34:20.04 1782.214 12.44 15.69 52.765 2.084 0.068 39 41 69 72 231 235 31 |
---|
1442 | 16 J0604-2606 76.0 78.4 233.0 06:04:41.13 -26:06:21.19 1787.226 24.13 23.87 211.061 23.563 0.155 73 78 76 81 225 241 278 |
---|
1443 | 17 J0601-2340 87.9 114.9 235.8 06:01:08.83 -23:40:19.37 1824.122 31.95 28.09 237.444 82.380 0.297 85 92 112 118 227 245 647 |
---|
1444 | 18 J0615-2235 38.2 130.5 254.5 06:15:32.09 -22:35:37.24 2070.934 12.29 11.70 105.531 1.555 0.070 37 39 129 131 249 257 24 |
---|
1445 | 19 J0617-2305 31.4 122.8 258.1 06:17:33.45 -23:05:28.94 2118.752 12.34 11.65 26.383 1.022 0.062 30 32 122 124 257 259 16 |
---|
1446 | 20 J0612-2149 49.6 142.2 270.3 06:12:11.04 -21:49:29.72 2279.926 16.27 15.73 395.740 15.156 0.101 48 51 141 144 257 287 204 |
---|
1447 | 21 J0616-2133 35.3 146.0 300.6 06:16:15.78 -21:33:09.69 2679.148 20.22 7.47 224.252 3.014 0.127 33 37 145 146 294 311 28 E |
---|
1448 | 22 J0555-2956 107.3 20.9 367.6 05:55:08.02 -29:56:09.08 3562.236 19.71 20.30 39.574 5.891 0.169 105 109 19 23 366 369 58 |
---|
1449 | 23 J0557-2246 99.8 128.2 434.0 05:57:43.77 -22:46:42.95 4438.776 11.88 16.12 105.531 1.703 0.167 99 101 127 130 430 438 17 N |
---|
1450 | 24 J0616-2648 38.1 67.2 546.8 06:16:02.10 -26:48:35.49 5926.464 12.35 11.67 26.383 1.276 0.064 37 39 66 68 546 548 18 |
---|
1451 | 25 J0552-2916 117.0 30.5 727.0 05:52:13.64 -29:16:58.02 8303.952 11.59 20.25 303.400 35.523 0.479 116 118 28 32 716 739 111 |
---|
1452 | \end{verbatim} |
---|
1453 | } |
---|
1454 | Note that the |
---|
1455 | width of the table can make it hard to read. A good trick for those |
---|
1456 | using UNIX/Linux is to make use of the \texttt{a2ps} command. The |
---|
1457 | following works well, producing a postscript file \texttt{results.ps}: |
---|
1458 | \\\verb|a2ps -1 -r -f8 -o duchamp-Results.ps duchamp-Results.txt| |
---|
1459 | |
---|
1460 | %\end{landscape} |
---|
1461 | |
---|
1462 | \newpage |
---|
1463 | \section{Example VOTable output} |
---|
1464 | \label{app-votable} |
---|
1465 | This is part of the VOTable, in XML format, corresponding to the |
---|
1466 | output file in Appendix~\ref{app-output} (the indentation has been |
---|
1467 | removed to make it fit on the page). |
---|
1468 | |
---|
1469 | %\begin{landscape} |
---|
1470 | {\scriptsize |
---|
1471 | \begin{verbatim} |
---|
1472 | <?xml version="1.0"?> |
---|
1473 | <VOTABLE version="1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" |
---|
1474 | xsi:noNamespaceSchemaLocation="http://www.ivoa.net/xml/VOTable/VOTable/v1.1"> |
---|
1475 | <COOSYS ID="J2000" equinox="J2000." epoch="J2000." system="eq_FK5"/> |
---|
1476 | <RESOURCE name="Duchamp Output"> |
---|
1477 | <TABLE name="Detections"> |
---|
1478 | <DESCRIPTION>Detected sources and parameters from running the Duchamp source finder.</DESCRIPTION> |
---|
1479 | <PARAM name="FITS file" datatype="char" ucd="meta.file;meta.fits" value="/DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits"/> |
---|
1480 | <PARAM name="Threshold" datatype="float" ucd="stat.snr" value="2.5"> |
---|
1481 | <PARAM name="ATrous note" datatype="char" ucd="meta.note" value="The a trous reconstruction method was used, with the following parameters."> |
---|
1482 | <PARAM name="ATrous Dimension" datatype="int" ucd="meta.code;stat" value="3"> |
---|
1483 | <PARAM name="ATrous Cut" datatype="float" ucd="stat.snr" value="4"> |
---|
1484 | <PARAM name="ATrous Minimum Scale" datatype="int" ucd="stat.param" value="1"> |
---|
1485 | <PARAM name="ATrous Filter" datatype="char" ucd="meta.code;stat" value="B3 spline function"> |
---|
1486 | <FIELD name="ID" ID="col1" ucd="meta.id" datatype="int" width="4"/> |
---|
1487 | <FIELD name="Name" ID="col2" ucd="meta.id;meta.main" datatype="char" arraysize="14"/> |
---|
1488 | <FIELD name="RA" ID="col3" ucd="pos.eq.ra;meta.main" ref="J2000" datatype="float" width="10" precision="6" unit="deg"/> |
---|
1489 | <FIELD name="Dec" ID="col4" ucd="pos.eq.dec;meta.main" ref="J2000" datatype="float" width="10" precision="6" unit="deg"/> |
---|
1490 | <FIELD name="w_RA" ID="col3" ucd="phys.angSize;pos.eq.ra" ref="J2000" datatype="float" width="7" precision="2" unit="arcmin"/> |
---|
1491 | <FIELD name="w_Dec" ID="col4" ucd="phys.angSize;pos.eq.dec" ref="J2000" datatype="float" width="7" precision="2" unit="arcmin"/> |
---|
1492 | <FIELD name="Vel" ID="col4" ucd="phys.veloc;src.dopplerVeloc" datatype="float" width="9" precision="3" unit="km/s"/> |
---|
1493 | <FIELD name="w_Vel" ID="col4" ucd="phys.veloc;src.dopplerVeloc;spect.line.width" datatype="float" width="8" precision="3" unit="km/s"/> |
---|
1494 | <FIELD name="Integrated_Flux" ID="col4" ucd="phys.flux;spect.line.intensity" datatype="float" width="10" precision="3" unit="km/s"/> |
---|
1495 | <DATA> |
---|
1496 | <TABLEDATA> |
---|
1497 | <TR> |
---|
1498 | <TD> 1</TD><TD> J0609-2200</TD><TD> 92.410416</TD><TD>-22.013390</TD><TD> 48.50</TD><TD> 39.42</TD><TD> 213.061</TD><TD> 65.957</TD><TD> 17.572</TD> |
---|
1499 | </TR> |
---|
1500 | <TR> |
---|
1501 | <TD> 2</TD><TD> J0608-2605</TD><TD> 92.042633</TD><TD>-26.085157</TD><TD> 44.47</TD><TD> 39.47</TD><TD> 233.119</TD><TD> 39.574</TD><TD> 4.144</TD> |
---|
1502 | </TR> |
---|
1503 | <TR> |
---|
1504 | <TD> 3</TD><TD> J0606-2724</TD><TD> 91.637840</TD><TD>-27.412022</TD><TD> 52.48</TD><TD> 47.57</TD><TD> 302.213</TD><TD> 39.574</TD><TD> 17.066</TD> |
---|
1505 | </TR> |
---|
1506 | (... table truncated for clarity ...) |
---|
1507 | </TABLEDATA> |
---|
1508 | </DATA> |
---|
1509 | </TABLE> |
---|
1510 | </RESOURCE> |
---|
1511 | </VOTABLE> |
---|
1512 | \end{verbatim} |
---|
1513 | } |
---|
1514 | \end{landscape} |
---|
1515 | |
---|
1516 | \newpage |
---|
1517 | \section{Example Karma Annotation File output} |
---|
1518 | \label{app-karma} |
---|
1519 | |
---|
1520 | This is the format of the Karma Annotation file, showing the locations |
---|
1521 | of the detected objects. This can be loaded by the plotting tools of |
---|
1522 | the Karma package (for instance, \texttt{kvis}) as an overlay on the FITS |
---|
1523 | file. |
---|
1524 | |
---|
1525 | \begin{verbatim} |
---|
1526 | # Duchamp Source Finder results for |
---|
1527 | # cube /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits |
---|
1528 | COLOR RED |
---|
1529 | COORD W |
---|
1530 | CIRCLE 92.3376 -21.9475 0.403992 |
---|
1531 | TEXT 92.3376 -21.9475 1 |
---|
1532 | CIRCLE 91.9676 -26.0193 0.37034 |
---|
1533 | TEXT 91.9676 -26.0193 2 |
---|
1534 | CIRCLE 91.5621 -27.3459 0.437109 |
---|
1535 | TEXT 91.5621 -27.3459 3 |
---|
1536 | CIRCLE 92.8285 -21.6344 0.269914 |
---|
1537 | TEXT 92.8285 -21.6344 4 |
---|
1538 | CIRCLE 90.1381 -28.9838 0.234179 |
---|
1539 | TEXT 90.1381 -28.9838 5 |
---|
1540 | CIRCLE 89.72 -26.6513 0.132743 |
---|
1541 | TEXT 89.72 -26.6513 6 |
---|
1542 | CIRCLE 94.2743 -27.4003 0.195175 |
---|
1543 | TEXT 94.2743 -27.4003 7 |
---|
1544 | CIRCLE 92.2739 -21.6941 0.134538 |
---|
1545 | TEXT 92.2739 -21.6941 8 |
---|
1546 | CIRCLE 89.7133 -25.4259 0.232252 |
---|
1547 | TEXT 89.7133 -25.4259 9 |
---|
1548 | CIRCLE 90.2206 -21.6993 0.266247 |
---|
1549 | TEXT 90.2206 -21.6993 10 |
---|
1550 | CIRCLE 93.8581 -26.5766 0.163153 |
---|
1551 | TEXT 93.8581 -26.5766 11 |
---|
1552 | CIRCLE 91.176 -26.1064 0.234356 |
---|
1553 | TEXT 91.176 -26.1064 12 |
---|
1554 | CIRCLE 90.2844 -23.6716 0.299509 |
---|
1555 | TEXT 90.2844 -23.6716 13 |
---|
1556 | CIRCLE 93.8774 -22.581 0.130925 |
---|
1557 | TEXT 93.8774 -22.581 14 |
---|
1558 | CIRCLE 94.3882 -23.0934 0.137108 |
---|
1559 | TEXT 94.3882 -23.0934 15 |
---|
1560 | CIRCLE 93.0491 -21.8223 0.202928 |
---|
1561 | TEXT 93.0491 -21.8223 16 |
---|
1562 | CIRCLE 94.0685 -21.5603 0.168456 |
---|
1563 | TEXT 94.0685 -21.5603 17 |
---|
1564 | CIRCLE 86.0568 -27.6095 0.101113 |
---|
1565 | TEXT 86.0568 -27.6095 18 |
---|
1566 | CIRCLE 88.7932 -29.9453 0.202624 |
---|
1567 | TEXT 88.7932 -29.9453 19 |
---|
1568 | \end{verbatim} |
---|
1569 | |
---|
1570 | \newpage |
---|
1571 | \section{Robust statistics for a Normal distribution} |
---|
1572 | \label{app-madfm} |
---|
1573 | |
---|
1574 | The Normal, or Gaussian, distribution for mean $\mu$ and standard |
---|
1575 | deviation $\sigma$ can be written as |
---|
1576 | \[ |
---|
1577 | f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\ e^{-(x-\mu)^2/2\sigma^2}. |
---|
1578 | \] |
---|
1579 | |
---|
1580 | When one has a purely Gaussian signal, it is straightforward to |
---|
1581 | estimate $\sigma$ by calculating the standard deviation (or rms) of |
---|
1582 | the data. However, if there is a small amount of signal present on top |
---|
1583 | of Gaussian noise, and one wants to estimate the $\sigma$ for the |
---|
1584 | noise, the presence of the large values from the signal can bias the |
---|
1585 | estimator to higher values. |
---|
1586 | |
---|
1587 | An alternative way is to use the median ($m$) and median absolute deviation |
---|
1588 | from the median ($s$) to estimate $\mu$ and $\sigma$. The median is the |
---|
1589 | middle of the distribution, defined for a continuous distribution by |
---|
1590 | \[ |
---|
1591 | \int_{-\infty}^{m} f(x) \diff x = \int_{m}^{\infty} f(x) \diff x. |
---|
1592 | \] |
---|
1593 | From symmetry, we quickly see that for the continuous Normal |
---|
1594 | distribution, $m=\mu$. We consider the case henceforth of $\mu=0$, |
---|
1595 | without loss of generality. |
---|
1596 | |
---|
1597 | To find $s$, we find the distribution of the absolute deviation from |
---|
1598 | the median, and then find the median of that distribution. This |
---|
1599 | distribution is given by |
---|
1600 | \begin{eqnarray*} |
---|
1601 | g(x) &= &{\mbox{\rm distribution of }} |x|\\ |
---|
1602 | &= &f(x) + f(-x),\ x\ge0\\ |
---|
1603 | &= &\sqrt{\frac{2}{\pi\sigma^2}}\, e^{-x^2/2\sigma^2},\ x\ge0. |
---|
1604 | \end{eqnarray*} |
---|
1605 | So, the median absolute deviation from the median, $s$, is given by |
---|
1606 | \[ |
---|
1607 | \int_{0}^{s} g(x) \diff x = \int_{s}^{\infty} g(x) \diff x. |
---|
1608 | \] |
---|
1609 | Now, $\int_{0}^{\infty}e^{-x^2/2\sigma^2} \diff x = \sqrt{\pi\sigma^2/2}$, and |
---|
1610 | so $\int_{s}^{\infty} e^{-x^2/2\sigma^2} \diff x = |
---|
1611 | \sqrt{\pi\sigma^2/2} - \int_{0}^{s} e^{-\frac{x^2}{2\sigma^2}} \diff x |
---|
1612 | $. Hence, to find $s$ we simply solve the following equation (setting $\sigma=1$ for |
---|
1613 | simplicity -- equivalent to stating $x$ and $s$ in units of $\sigma$): |
---|
1614 | \[ |
---|
1615 | \int_{0}^{s}e^{-x^2/2} \diff x - \sqrt{\pi/8} = 0. |
---|
1616 | \] |
---|
1617 | This is hard to solve analytically (no nice analytic solution exists |
---|
1618 | for the finite integral that I'm aware of), but straightforward to |
---|
1619 | solve numerically, yielding the value of $s=0.6744888$. Thus, to |
---|
1620 | estimate $\sigma$ for a Normally distributed data set, one can calculate |
---|
1621 | $s$, then divide by 0.6744888 (or multiply by 1.4826042) to obtain the |
---|
1622 | correct estimator. |
---|
1623 | |
---|
1624 | Note that this is different to solutions quoted elsewhere, |
---|
1625 | specifically in \citet{meyer04:trunc}, where the same robust estimator |
---|
1626 | is used but with an incorrect conversion to standard deviation -- they |
---|
1627 | assume $\sigma = s\sqrt{\pi/2}$. This, in fact, is the conversion used |
---|
1628 | to convert the \emph{mean} absolute deviation from the mean to the |
---|
1629 | standard deviation. This means that the cube noise in the \hipass\ |
---|
1630 | catalogue (their parameter Rms$_{\rm cube}$) should be 18\% larger |
---|
1631 | than quoted. |
---|
1632 | |
---|
1633 | \section{How Gaussian noise changes with wavelet scale.} |
---|
1634 | \label{app-scaling} |
---|
1635 | |
---|
1636 | The key element in the wavelet reconstruction of an array is the |
---|
1637 | thresholding of the individual wavelet coefficient arrays. This is |
---|
1638 | usually done by choosing a level to be some number of standard |
---|
1639 | deviations above the mean value. |
---|
1640 | |
---|
1641 | However, since the wavelet arrays are produced by convolving the input |
---|
1642 | array by an increasingly large filter, the pixels in the coefficient |
---|
1643 | arrays become increasingly correlated as the scale of the filter |
---|
1644 | increases. This results in the measured standard deviation from a |
---|
1645 | given coefficient array decreasing with increasing scale. To calculate |
---|
1646 | this, we need to take into account how many other pixels each pixel in |
---|
1647 | the convolved array depends on. |
---|
1648 | |
---|
1649 | To demonstrate, suppose we have a 1-D array with $N$ pixel values |
---|
1650 | given by $F_i,\ i=1,...,N$, and we convolve it with the B$_3$-spline |
---|
1651 | filter, defined by the set of coefficients |
---|
1652 | $\{1/16,1/4,3/8,1/4,1/16\}$. The flux of the $i$th pixel in the |
---|
1653 | convolved array will be |
---|
1654 | \[ |
---|
1655 | F'_i = \frac{1}{16}F_{i-2} + \frac{1}{4}F_{i-1} + \frac{3}{8}F_{i} |
---|
1656 | + \frac{1}{4}F_{i+1} + \frac{1}{16}F_{i+2} |
---|
1657 | \] |
---|
1658 | and the flux of the corresponding pixel in the wavelet array will be |
---|
1659 | \[ |
---|
1660 | W'_i = F_i - F'_i = \frac{-1}{16}F_{i-2} - \frac{1}{4}F_{i-1} + \frac{5}{8}F_{i} |
---|
1661 | - \frac{1}{4}F_{i+1} - \frac{1}{16}F_{i+2} |
---|
1662 | \] |
---|
1663 | Now, assuming each pixel has the same standard deviation |
---|
1664 | $\sigma_i=\sigma$, we can work out the standard deviation for the |
---|
1665 | wavelet array: |
---|
1666 | \[ |
---|
1667 | \sigma'_i = \sigma \sqrt{\left(\frac{1}{16}\right)^2 + \left(\frac{1}{4}\right)^2 |
---|
1668 | + \left(\frac{5}{8}\right)^2 + \left(\frac{1}{4}\right)^2 + \left(\frac{1}{16}\right)^2} |
---|
1669 | = 0.72349\ \sigma |
---|
1670 | \] |
---|
1671 | Thus, the first scale wavelet coefficient array will have a standard |
---|
1672 | deviation of 72.3\% of the input array. This procedure can be followed |
---|
1673 | to calculate the necessary values for all scales, dimensions and |
---|
1674 | filters used by \duchamp. |
---|
1675 | |
---|
1676 | Calculating these values is clearly a critical step in performing the |
---|
1677 | reconstruction. \citet{starck02:book} did so by simulating data sets |
---|
1678 | with Gaussian noise, taking the wavelet transform, and measuring the |
---|
1679 | value of $\sigma$ for each scale. We take a different approach, by |
---|
1680 | calculating the scaling factors directly from the filter coefficients |
---|
1681 | by taking the wavelet transform of an array made up of a 1 in the |
---|
1682 | central pixel and 0s everywhere else. The scaling value is then |
---|
1683 | derived by taking the square root of the sum (in quadrature) of all |
---|
1684 | the wavelet coefficient values at each scale. We give the scaling |
---|
1685 | factors for the three filters available to \duchamp\ on the following |
---|
1686 | page. These values are hard-coded into \duchamp, so no on-the-fly |
---|
1687 | calculation of them is necessary. |
---|
1688 | |
---|
1689 | Memory limitations prevent us from calculating factors for large |
---|
1690 | scales, particularly for the three-dimensional case (hence the -- |
---|
1691 | symbols in the tables). To calculate factors for |
---|
1692 | higher scales than those available, we note the following |
---|
1693 | relationships apply for large scales to a sufficient level of precision: |
---|
1694 | \begin{itemize} |
---|
1695 | \item 1-D: factor(scale $i$) = factor(scale $i-1$)$/\sqrt{2}$. |
---|
1696 | \item 2-D: factor(scale $i$) = factor(scale $i-1$)$/2$. |
---|
1697 | \item 1-D: factor(scale $i$) = factor(scale $i-1$)$/\sqrt{8}$. |
---|
1698 | \end{itemize} |
---|
1699 | |
---|
1700 | \newpage |
---|
1701 | \begin{itemize} |
---|
1702 | \item \textbf{B$_3$-Spline Function:} $\{1/16,1/4,3/8,1/4,1/16\}$ |
---|
1703 | |
---|
1704 | \begin{tabular}{llll} |
---|
1705 | Scale & 1 dimension & 2 dimension & 3 dimension\\ \hline |
---|
1706 | 1 & 0.723489806 & 0.890796310 & 0.956543592\\ |
---|
1707 | 2 & 0.285450405 & 0.200663851 & 0.120336499\\ |
---|
1708 | 3 & 0.177947535 & 0.0855075048 & 0.0349500154\\ |
---|
1709 | 4 & 0.122223156 & 0.0412474444 & 0.0118164242\\ |
---|
1710 | 5 & 0.0858113122 & 0.0204249666 & 0.00413233507\\ |
---|
1711 | 6 & 0.0605703043 & 0.0101897592 & 0.00145703714\\ |
---|
1712 | 7 & 0.0428107206 & 0.00509204670 & 0.000514791120\\ |
---|
1713 | 8 & 0.0302684024 & 0.00254566946 & --\\ |
---|
1714 | 9 & 0.0214024008 & 0.00127279050 & --\\ |
---|
1715 | 10 & 0.0151336781 & 0.000636389722 & --\\ |
---|
1716 | 11 & 0.0107011079 & 0.000318194170 & --\\ |
---|
1717 | 12 & 0.00756682272 & -- & --\\ |
---|
1718 | 13 & 0.00535055108 & -- & --\\ |
---|
1719 | %14 & 0.00378341085 & -- & --\\ |
---|
1720 | %15 & 0.00267527545 & -- & --\\ |
---|
1721 | %16 & 0.00189170541 & -- & --\\ |
---|
1722 | %17 & 0.00133763772 & -- & --\\ |
---|
1723 | %18 & 0.000945852704 & -- & -- |
---|
1724 | \end{tabular} |
---|
1725 | |
---|
1726 | \item \textbf{Triangle Function:} $\{1/4,1/2,1/4\}$ |
---|
1727 | |
---|
1728 | \begin{tabular}{llll} |
---|
1729 | Scale & 1 dimension & 2 dimension & 3 dimension\\ \hline |
---|
1730 | 1 & 0.612372436 & 0.800390530 & 0.895954449 \\ |
---|
1731 | 2 & 0.330718914 & 0.272878894 & 0.192033014\\ |
---|
1732 | 3 & 0.211947812 & 0.119779282 & 0.0576484078\\ |
---|
1733 | 4 & 0.145740298 & 0.0577664785 & 0.0194912393\\ |
---|
1734 | 5 & 0.102310944 & 0.0286163283 & 0.00681278387\\ |
---|
1735 | 6 & 0.0722128185 & 0.0142747506 & 0.00240175885\\ |
---|
1736 | 7 & 0.0510388224 & 0.00713319703 & 0.000848538128 \\ |
---|
1737 | 8 & 0.0360857673 & 0.00356607618 & 0.000299949455 \\ |
---|
1738 | 9 & 0.0255157615 & 0.00178297280 & -- \\ |
---|
1739 | 10 & 0.0180422389 & 0.000891478237 & -- \\ |
---|
1740 | 11 & 0.0127577667 & 0.000445738098 & -- \\ |
---|
1741 | 12 & 0.00902109930 & 0.000222868922 & -- \\ |
---|
1742 | 13 & 0.00637887978 & -- & -- \\ |
---|
1743 | %14 & 0.00451054902 & -- & -- \\ |
---|
1744 | %15 & 0.00318942978 & -- & -- \\ |
---|
1745 | %16 & 0.00225527449 & -- & -- \\ |
---|
1746 | %17 & 0.00159471988 & -- & -- \\ |
---|
1747 | %18 & 0.000112763724 & -- & -- |
---|
1748 | |
---|
1749 | \end{tabular} |
---|
1750 | |
---|
1751 | \item \textbf{Haar Wavelet:} $\{0,1/2,1/2\}$ |
---|
1752 | |
---|
1753 | \begin{tabular}{llll} |
---|
1754 | Scale & 1 dimension & 2 dimension & 3 dimension\\ \hline |
---|
1755 | 1 & 0.707167810 & 0.433012702 & 0.935414347 \\ |
---|
1756 | 2 & 0.500000000 & 0.216506351 & 0.330718914\\ |
---|
1757 | 3 & 0.353553391 & 0.108253175 & 0.116926793\\ |
---|
1758 | 4 & 0.250000000 & 0.0541265877 & 0.0413398642\\ |
---|
1759 | 5 & 0.176776695 & 0.0270632939 & 0.0146158492\\ |
---|
1760 | 6 & 0.125000000 & 0.0135316469 & 0.00516748303 |
---|
1761 | |
---|
1762 | \end{tabular} |
---|
1763 | |
---|
1764 | |
---|
1765 | \end{itemize} |
---|
1766 | |
---|
1767 | \end{document} |
---|