Context Navigation

Guide.tex @ 87

Visit:

Last change on this file since 87 was 87, checked in by Matthew Whiting, 18 years ago

Added code to produce warning flags for detections: for either edge location
or negative enclosed flux. Summary of changes:
Cubes/cubes.cc -- three new routines
Cubes/cubes.hh -- prototypes for new routines. New isBlank functions.
Detection/outputDetection.cc -- output of warning flags.
mainDuchamp.cc -- calling of new routines. Other minor changes.
docs/Guide.tex -- explanation of new warning flags. Other minor changes.
docs/example_spectrum.pdf -- shows the new formatting.
docs/example_spectrum.ps -- ditto
InputComplete? -- all values are now the same as the default param values

File size: 75.6 KB

Rev	Line
[18]	1	\documentclass[12pt,a4paper]{article}
	2
	3	%Define a test for doing PDF format -- use different code below
	4	\newif\ifPDF
	5	\ifx\pdfoutput\undefined\PDFfalse
	6	\else\ifnum\pdfoutput > 0\PDFtrue
	7	\else\PDFfalse
	8	\fi
	9	\fi
	10
[3]	11	\textwidth=161 mm
[71]	12	\textheight=245 mm
	13	\topmargin=-15 mm
[3]	14	\oddsidemargin=0 mm
	15	\parindent=6 mm
	16
[18]	17	\usepackage[sort]{natbib}
	18	\usepackage{lscape}
	19	\bibpunct[,]{(}{)}{;}{a}{}{,}
	20
[3]	21	\newcommand{\eg}{e.g.\ }
	22	\newcommand{\ie}{i.e.\ }
	23	\newcommand{\hi}{H{\sc i}}
	24	\newcommand{\hipass}{{\sc hipass}}
	25	\newcommand{\progname}{{\tt Duchamp}}
	26	\newcommand{\diff}{{\rm d}}
	27	\newcommand{\entrylabel}[1]{\mbox{\textsf{\bf{#1:}}}\hfil}
	28	\newenvironment{entry}
	29	{\begin{list}{}%
	30	{\renewcommand{\makelabel}{\entrylabel}%
	31	\setlength{\labelwidth}{30mm}%
	32	\setlength{\labelsep}{5pt}%
	33	\setlength{\itemsep}{2pt}%
	34	\setlength{\parsep}{2pt}%
	35	\setlength{\leftmargin}{35mm}%
	36	}%
	37	}%
	38	{\end{list}}
	39
[18]	40
[3]	41	\title{A Guide to the {\it Duchamp} Source Finding Software}
[18]	42	\author{Matthew Whiting\\
	43	%{\small \href{mailto:Matthew.Whiting@csiro.au}{Matthew.Whiting@csiro.au}}\\
[15]	44	Australia Telescope National Facility\\CSIRO}
[3]	45	%\date{January 2006}
	46	\date{}
	47
[18]	48	% If we are creating a PDF, use different options for graphicx, hyperref.
	49	\ifPDF
	50	\usepackage[pdftex]{graphicx,color}
	51	\usepackage[pdftex]{hyperref}
	52	\hypersetup{colorlinks=true,%
[80]	53	citecolor=red,%
	54	filecolor=red,%
	55	linkcolor=red,%
	56	urlcolor=red,%
	57	}
[18]	58	\else
	59	\usepackage[dvips]{graphicx}
	60	\usepackage[dvips]{hyperref}
	61	\fi
	62
[80]	63	\pagestyle{headings}
[3]	64	\begin{document}
	65
	66	\maketitle
[80]	67	\thispagestyle{empty}
	68	\begin{figure}[!h]
	69	\begin{center}
	70	\includegraphics[width=\textwidth]{cover_image}
	71	\end{center}
	72	\end{figure}
	73
	74	\newpage
[3]	75	\tableofcontents
	76
	77	\newpage
	78	\section{Introduction and getting going quickly}
	79
	80	This document gives details on the use of the program Duchamp. This
[80]	81	has been designed to provide a source-detection facility for
[3]	82	spectral-line data cubes. The basic execution of Duchamp is to read
	83	in a FITS data cube, find sources in the cube, and produce a text
	84	file of positions, velocities and fluxes of the detections, as well as
	85	a postscript file of the spectra of each detection.
	86
	87	So, you have a FITS cube, and you want to find the sources in it. What
	88	do you do? The first step is to make an input file that contains the
	89	list of parameters. Brief and detailed examples are shown in
	90	Appendix~\ref{app-input}. This provides the input file name, the various
	91	output files, and defines various parameters that control the
	92	execution.
	93
[85]	94	The standard way to run Duchamp is by the command
[3]	95	\begin{quote}
	96	{\tt Duchamp -p [parameter file]}
	97	\end{quote}
	98	replacing {\tt [parameter file]} with the name of the file you have
[85]	99	just created/copied. Alternatively, you can use the syntax
	100	\begin{quote}
	101	{\tt Duchamp -f [FITS file]}
	102	\end{quote}
	103	where {\tt [FITS file]} is the file you wish to search. In the latter
	104	case, the rest of the parameters will take their default values
	105	detailed in Appendix~\ref{app-param}. In either case, the program will
	106	then work away and give you the list of detections and their
[87]	107	spectra. The program execution is summarised below, and detailed in
[85]	108	\S\ref{sec-flow}. Information on inputs is in \S\ref{sec-param} and
	109	Appendix~\ref{app-param}, and descriptions of the output is in
	110	\S\ref{sec-output}.
[3]	111
	112	\subsection{A summary of the execution steps}
	113
	114	The basic flow of the program is summarised here. All these steps are
	115	discussed in more detail in the following sections, so read on if
[18]	116	you have questions!
[3]	117	\begin{enumerate}
	118	\item The parameter file given on the command line is read in, and the
	119	parameters absorbed.
	120	\item From the parameter file, the FITS image is located and read in
	121	to memory.
[80]	122	\item If requested, a FITS image with a previously reconstructed array
	123	is read in.
[3]	124	\item If requested, blank pixels are trimmed from the edges, and
	125	channels corresponding to bright (\eg Galactic) emission are
	126	excised.
	127	\item If requested, the baseline of each spectrum is removed.
[80]	128	\item If the reconstruction method is requested, and the reconstructed
	129	array has not been read in at Step 3 above, the cube is
[3]	130	reconstructed using the {\it {\' a} trous} wavelet method.
	131	\item Searching for objects then takes place, using the requested
	132	thresholding method.
	133	\item The list of objects is trimmed by merging neighbouring objects
	134	and removing those deemed unacceptable.
	135	\item The baselines and trimmed pixels are replaced prior to output.
	136	\item The details on the detections are written to screen and to the
	137	requested output file.
	138	\item Maps showing the spatial location of the detections are written.
	139	\item The integrated spectra of each detection are written to a
	140	postscript file.
	141	\item If requested, the reconstructed array can be written to a new
	142	FITS file.
	143	\end{enumerate}
	144
	145	\subsection{Guide to terminology}
	146
[18]	147	First, a brief note on the use of terminology in this guide. Duchamp
	148	is designed to work on FITS ``cubes''. These are FITS\footnote{FITS is
	149	the Flexible Image Transport System -- see \citet{hanisch01} or
	150	websites such as
	151	\href{http://fits.cv.nrao.edu/FITS.html}{http://fits.cv.nrao.edu/FITS.html}
	152	for details.} image arrays with three dimensions -- they are assumed
	153	to have the following form: the first two dimensions (referred to as
	154	$x$ and $y$) are spatial directions (that is, relating to the position
	155	on the sky), while the third dimension, $z$, is the spectral
	156	direction, which can correspond to frequency, wavelength, or velocity.
[3]	157
	158	Each spatial pixel (a given $(x,y)$ coordinate) can be said to be a
	159	single spectrum, while a slice through the cube perpendicular to the
	160	spectral direction at a given $z$-value is a single channel (the 2-D
	161	image is a channel map).
	162
[60]	163	Features that are detected are assumed to be positive. The user can
	164	choose to search for negative features by setting an input parameter
	165	-- this inverts the cube prior to the search (see
	166	\S~\ref{sec-detection} for details).
[24]	167
[18]	168	Note that it is possible to run Duchamp on a two-dimensional image
	169	(\ie one with no frequency or velocity information), or indeed a
	170	one-dimensional array, and many of the features of the program will
	171	work fine. The focus, however, is on object detection in three
	172	dimensions.
	173
[15]	174	\subsection{Why ``Duchamp''?}
	175
	176	Well, it's important for a program to have a name, and it certainly
[80]	177	beats the initial working title of ``cubefind''. I had planned to call
	178	it ``Picasso'' (as in the father of cubism), but sadly this had
	179	already been used before \citep{minchin99}. So I settled on naming it
	180	after Marcel Duchamp, another cubist, but also one of the first
	181	artists to work with ``found objects''.
[15]	182
[3]	183	\section{User Inputs}
	184	\label{sec-param}
	185
	186	Input to the program is provided by means of a parameter file. Parameters
	187	are listed in the file, followed by the value that should be assigned
	188	to them. The syntax used is {\tt paramName value}. The file is not
	189	case-sensitive, and lines in the input file that start with {\tt \#} are
	190	ignored. If a parameter is listed more than once, the latter value is
	191	used, but otherwise the order in which the parameters are listed in the
	192	input file is arbitrary.
	193
	194	If a parameter is not listed, the default value is assumed. The
	195	defaults are chosen to provide a good result (using the reconstruction
[24]	196	method), so the user doesn't need to specify many new parameters in
	197	the input file. Note that the image file {\bf must} be specified! The
	198	parameters that can be set are listed in Appendix~\ref{app-param},
	199	with their default values in parentheses.
[3]	200
	201	The 'flag' parameters are stored as {\tt bool} variables, and so are
[18]	202	either {\tt true = 1} or {\tt false = 0}. Currently the program only
[3]	203	reads them from the file as integers, and so they should be entered in
	204	the file as 0 or 1 (see example file in Appendix~\ref{app-input}).
	205
	206	\section{What the program is doing}
	207	\label{sec-flow}
	208
	209	The execution flow of the program is detailed here, indicating the
	210	main algorithmic steps that are used. The program is written in C/C++
	211	and makes use of the {\sc cfitsio}, {\sc wcslib} and {\sc pgplot}
	212	libraries.
	213
	214	%\subsection{Parameter input}
	215	%
	216	%The user provides parameters that govern the selection of files and
	217	%the parameters used by the various subroutines in the program. This is
	218	%done via a parameter file, and the parameters are stored in a C++
	219	%class for use throughout the program. The form of the parameter file is
	220	%discussed in \S\ref{sec-param}, and the parameters themselves are
	221	%listed in Appendix~\ref{app-param}.
	222
	223	\subsection{Image input}
	224
	225	The cube is read in using basic {\sc cfitsio} commands, and stored as
	226	an array in a special C++ class structure. This class keeps track of
	227	the list of detected objects, as well as any reconstructed arrays that
[18]	228	are made (see \S\ref{sec-recon}). The World Coordinate System (WCS)
	229	information for the cube is also obtained from the FITS header by {\sc
	230	wcslib} functions \citep{greisen02, calabretta02}, and this
	231	information, in the form of a {\tt wcsprm} structure, is also stored
	232	in the same class.
[3]	233
[80]	234	A sub-section of an image can be requested via the {\tt subsection}
	235	parameter in the parameter file -- this can be a good idea if the cube
	236	has very noisy edges, which may produce many spurious detections. The
	237	generalised form of the subsection that is used by {\sc cfitsio} is
	238	{\tt [x1:x2:dx,y1:y2:dy,z1:z2:dz]}, such that the x-coordinates run
	239	from {\tt x1} to {\tt x2} (inclusive), with steps of {\tt dx}. The
	240	step value can be omitted (so a subsection of the form {\tt
	241	[2:50,2:50,10:1000]} is still valid). Duchamp does not at this stage
	242	deal with the presence of steps in the subsection string, and any that
	243	are present are removed before the file is opened.
[3]	244
[80]	245	If one wants the full range of a coordinate then replace the range
	246	with an asterisk, \eg {\tt [2:50,2:50,*]}. If one wants to use just a
[3]	247	subsection, one must set {\tt flagSubsection = 1}. A complete
	248	description of the section syntax can be found at the {\sc fitsio} web
	249	site
[15]	250	\footnote{
	251	\href{http://heasarc.gsfc.nasa.gov/docs/software/fitsio/c/c\_user/node90.html}%
	252	{http://heasarc.gsfc.nasa.gov/docs/software/fitsio/c/c\_user/node90.html}}.
[3]	253
	254	\subsection{Image modification}
	255	\label{sec-modify}
	256
	257	Several modifications to the cube can be made that improve the
	258	execution and efficiency of Duchamp (these are optional -- their
	259	use is indicated by the relevant flags set in the input parameter
	260	file).
	261
	262	\subsubsection{Milky-Way removal}
	263
	264	First, a single set of contiguous channels can be removed -- these may
	265	exhibit very strong emission, such as that from the Milky Way as seen
	266	in extragalactic \hi\ cubes (hence the references to ``Milky Way'' in
	267	relation to this task -- apologies to Galactic astronomers!). Such
	268	dominant channels will both produce many unnecessary, uninteresting
[24]	269	and large (in size and hence in memory usage) detections, and will
	270	also affect any reconstruction that is performed (see next
	271	section). The use of this feature is controlled by the {\tt flagMW}
	272	parameter, and the exact channels concerned are able to be set by the
	273	user (using {\tt maxMW} and {\tt minMW}). When employed, the flux in
	274	these channels is set to zero. The information in those channels is
	275	not kept.
[3]	276
	277	\subsubsection{Blank pixel removal}
	278
[24]	279	Second, the cube is trimmed of any BLANK pixels that pad the image
[3]	280	out to a rectangular shape. This is also optional, being determined by
	281	the {\tt flagBlankPix} parameter. The value for these pixels is read from
	282	the FITS header (using the BLANK, BSCALE and BZERO keywords), but if
	283	these are not present then the value can be specified by the user in
	284	the parameter file. If these blank pixels are stored as NaNs, then a
	285	normal number will be substituted (allowing these pixels to be
[24]	286	accurately removed without adverse effects). [NOTE: this appears not
	287	to be working correctly at time of writing. If your data has
	288	unspecified BLANKs, be wary...]
[3]	289
	290	This stage is particularly important for the reconstruction step, as
	291	lots of BLANK pixels on the edges will smooth out features in the
	292	wavelet calculation stage. The trimming will also reduce the size of
	293	the cube's array, speeding up the execution. The amount of trimming is
	294	recorded, and these pixels are added back in once the source-detection
	295	is completed (so that quoted pixel positions are applicable to the
	296	original cube).
	297
	298	Rows and columns are trimmed one at a time until the first non-BLANK
	299	pixel is reached, so that the image remains rectangular. In practice,
	300	this means that there will be BLANK pixels left in the trimmed image
	301	(if the non-BLANK region is non-rectangular). However, these are
	302	ignored in all further calculations done on the cube.
	303
	304	\subsubsection{Baseline removal}
	305
	306	Finally, the user may request the removal of baselines from the
	307	spectra, via the parameter {\tt flagBaseline}. This may be necessary
	308	if there is a strong baseline ripple present, which can result in
	309	spurious detections on the high points of the ripple. The baseline is
	310	calculated from a wavelet reconstruction procedure (see
	311	\S\ref{sec-recon}) that keeps only the two largest scales. This is
	312	done separately for each spatial pixel (\ie for each spectrum in the
	313	cube), and the baselines are stored and added back in before any
	314	output is done. In this way the quoted fluxes and displayed spectra
[24]	315	are as one would see from the input cube itself -- even though the
	316	detection (and reconstruction if applicable) is done on the
	317	baseline-removed cube.
[3]	318
[43]	319	The presence of very strong signals (for instance, masers at several
	320	hundred Jy) can affect the determination of the baseline, leading to a
	321	large dip centred on the signal in the baseline-subtracted
	322	spectrum. To prevent this, the signal is trimmed prior to the
[80]	323	reconstruction process at some standard threshold (at $8\sigma$ above
	324	the mean). The baseline determined should thus be representative of
	325	the true, signal-free baseline. Note that this trimming is only a
[43]	326	temporary measure which does not affect the source-detection.
	327
[3]	328	\subsection{Image reconstruction}
	329	\label{sec-recon}
	330
	331	This is an optional step. The user can direct Duchamp to
	332	reconstruct the data cube using the {\it {\`a} trous} wavelet
	333	procedure. A good description of the procedure can be found in
	334	\citet{starck02:book}. This is an effective way of removing a
	335	lot of the noise in the image, but at this stage is relatively time-
[24]	336	and memory-intensive. The steps in the procedure are as follows:
[3]	337	\begin{enumerate}
	338	\item Set the reconstructed array to 0 everywhere.
	339	\item The cube is discretely convolved with a given filter
	340	function. This is determined from the parameter file via the {\tt
	341	filterCode} parameter -- see Appendix~\ref{app-param} for details on
	342	the filters available.
	343	\item The wavelet coefficients are calculated by taking the difference
	344	between the convolved array and the input array.
	345	\item If the wavelet coefficients at a given point are above the
	346	threshold requested (given by {\tt snrRecon} as the number of
	347	$\sigma$ above the mean and adjusted to the current scale), add
	348	these to the reconstructed array.
	349	\item The separation of the filter coefficients is doubled.
	350	\item The procedure is repeated from step 2, using the convolved array
	351	as the input array.
	352	\item Continue until the required maximum number of scales is reached.
[24]	353	\item Add the final smoothed (\ie convolved) array to the
	354	reconstructed array. This provides the ``DC offset'', as each of the
	355	wavelet coefficient arrays will have zero mean.
[3]	356	\end{enumerate}
	357
[71]	358	Note that any BLANK pixels that are still in the cube will not be
	359	altered by the reconstruction -- they will be left as BLANK so that
	360	the shape of the valid part of the cube is preserved.
	361
[24]	362	It is important to note that the {\it {\`a} trous} decomposition is
	363	an example of a ``redundant'' transformation. If no thresholding is
	364	performed, the sum of all the wavelet coefficient arrays and the final
	365	smoothed array is identical to the input array. The thresholding thus
	366	removes only the unwanted structure in the array.
[3]	367
	368	The statistics of the cube are estimated using robust methods, to
	369	avoid corruption by strong outlying points. The mean is actually
	370	estimated by the median, while the median absolute deviation from the
	371	median (MADFM) is calculated and corrected assuming Gaussianity to
	372	estimate the standard deviation $\sigma$. The Gaussianity (or
	373	Normality) assumption is critical, as the MADFM does not give the same
	374	value as the usual rms or standard deviation value -- for a normal
	375	distribution $N(\mu,\sigma)$ we find MADFM$=0.6744888\sigma$. The
	376	difference between the MADFM and $\sigma$ is corrected for, so the
	377	user need only think in the usual multiples of $\sigma$ when setting
	378	{\tt snrRecon}. See Appendix~\ref{app-madfm} for a derivation of this
	379	value.
	380
	381	When thresholding the different wavelet scales, the value of $\sigma$
	382	as measured from the input array needs to be scaled to account for the
	383	increased amount of correlation between neighbouring pixels (due to
[80]	384	the convolution). See Appendix~\ref{app-scaling} for details on this
[24]	385	scaling.
[3]	386
	387	The user can also select the minimum scale to be used in the
	388	reconstruction -- the first scale exhibits the highest frequency
	389	variations, and so ignoring this one can sometimes be beneficial in
	390	removing excess noise. The default, however, is to use all scales
	391	({\tt minscale = 1}).
	392
	393	The reconstruction has at least two iterations. The first iteration
	394	makes a first pass at the wavelet reconstruction (the process outlined
	395	in the 8 stages above), but the residual array will inevitably have
	396	some structure still in it, so the wavelet filtering is done on the
	397	residual, and any significant wavelet terms are added to the final
	398	reconstruction. This step is repeated until the change in the $\sigma$
	399	of the background is less than some fiducial amount.
	400
[80]	401	\subsection{Reconstruction I/O}
	402
	403	The reconstruction stage can be relatively time-consuming,
	404	particularly for large cubes. Duchamp thus has a shortcut to allow
	405	users to quickly do multiple searches (\eg with different thresholds)
	406	on the same reconstruction.
	407
	408	The first step is to select to save the reconstructed image as a
[3]	409	FITS file -- at the moment this is just saved in the same directory as
	410	the input file, so it won't work if the user does not have write
[71]	411	permissions on that directory. The name of the file will be derived
	412	from the input file, in the following manner: if the input file is
	413	{\tt image.fits}, the reconstructed array will be saved in {\tt
	414	image.RECON?.fits}, where {\tt ?} stands for the value of {\tt
	415	snrRecon} (for instance, if {\tt snrRecon}$=4$, it will be {\tt
	416	image.RECON4.fits}, and if {\tt snrRecon}$=4.5$, it will be {\tt
	417	image.RECON4.5.fits}). To save the reconstructed array, set {\tt
	418	flagOutputRecon = true}.
[3]	419
[71]	420	Likewise, the residual image, defined as the difference between the
	421	input image and the reconstructed image, can also be saved in the same
	422	manner -- its filename will be {\tt image.RESID?.fits}. This is done
	423	by setting {\tt flagOutputResid = true}.
[3]	424
[80]	425	If a reconstructed image has been saved, it can be read in and used
	426	instead of redoing the reconstruction. To do so, the user should set
	427	{\tt flagReconExists = true}. The user can indicate the name of the
	428	reconstructed FITS file using the {\tt reconFile} parameter, or, if
	429	this is not specified, Duchamp searches for the file {\tt
	430	image.RECON?.fits} (as defined above). If the file is not found, the
	431	reconstruction is performed as normal. Note that to do this, the user
	432	needs to set {\tt flagAtrous = true} (obviously, if this is {\tt
	433	false}, the reconstruction is not needed).
[71]	434
[3]	435	\subsection{Searching the image}
	436	\label{sec-detection}
	437
[24]	438	The image is searched for detections in two ways: spectrally (a
	439	1-dimensional search in the spectrum in each spatial pixel), and
	440	spatially (a 2-dimensional search in the spatial image in each
	441	channel). In both cases, the algorithm finds connected pixels that are
	442	above the user-specified threshold. In the case of the spatial image
	443	search, the algorithm of \citet{lutz80} is used to raster scan through
	444	the image and connect groups of pixels on neighbouring rows.
[3]	445
	446	Note that this algorithm cannot be applied directly to a 3-dimensional
	447	case, as it requires that objects are completely nested in a row: that
	448	is, if you are scanning along a row, and one object finishes and
	449	another starts, you know that you will not get back to the first one
	450	(if at all) until the second is finished for that
	451	row. Three-dimensional data does not have this property, which is why
	452	we break up the searching into 1- and 2-dimensional cases.
	453
	454	The determination of the threshold is done in one of two ways. The
[80]	455	first way is a simple sigma-clipping, where a threshold is set at
	456	$n\sigma$ above the mean and pixels above this threshold are
	457	flagged as detected. The value of $n$ is set with the parameter {\tt
	458	snrCut}. As before, the value for $\sigma$ is estimated by
[3]	459	the MADFM, and corrected by the ratio derived in
	460	Appendix~\ref{app-madfm}.
	461
	462	The second method uses the False Discovery Rate (FDR) technique
	463	\citep{miller01,hopkins02}, whose basis we briefly detail here. The
	464	false discovery rate (given by the number of false detections divided
	465	by the total number of detections) is fixed at a certain value
	466	$\alpha$ (\eg $\alpha=0.05$ implies 5\% of detections are false
	467	positives). In practice, an $\alpha$ value is chosen, and the ensemble
	468	average FDR (\ie $<FDR>$) when the method is used will be less than
	469	$\alpha$. One calculates $p$ -- the probability, assuming the null
	470	hypothesis is true, of obtaining a test statistic as extreme as the
	471	pixel value (the observed test statistic) -- for each pixel, and sorts
	472	them in increasing order. One then calculates $d$ where
	473	\[
[18]	474	d = \max_j \left\{ j : P_j < \frac{j\alpha}{c_N N} \right\},
[3]	475	\]
[18]	476	and then rejects all hypotheses whose $p$-values are less than or equal
[3]	477	to $P_d$. (So a $P_i<P_d$ will be rejected even if $P_i \geq
	478	j\alpha/c_N N$.) Note that ``reject hypothesis'' here means ``accept
	479	the pixel as an object pixel'' (\ie we are rejecting the null
	480	hypothesis that the pixel belongs to the background).
	481
	482	The $c_N$ values here are normalisation constants that depend on the
	483	correlated nature of the pixel values. If all the pixels are
	484	uncorrelated, then $c_N=1$. If $N$ pixels are correlated, then their
	485	tests will be dependent on each other, and so $c_N = \sum_{i=1}^N
	486	i^{-1}$. \citet{hopkins02} consider real radio data, where the pixels
	487	are correlated over the beam. In this case the sum is made over the
	488	$N$ pixels that make up the beam. The value of $N$ is calculated from
[24]	489	the FITS header (if the correct keywords -- BMAJ, BMIN -- are not
	490	present, a default value of 10 pixels is assumed).
[3]	491
	492	If a reconstruction has been made, the residuals (defined as original
	493	$-$ reconstruction) are used to estimate the noise parameters of the
	494	cube. Otherwise they are estimated directly from the cube itself. In
	495	both cases, the median is used as a robust estimator of the mean
	496	value, although the $\sigma$ is estimated by the standard deviation
	497	(of the residual array, in the case of the reconstruction, but of the
	498	original array otherwise).
	499
[24]	500	Detections must have a minimum number of pixels to be counted. This
	501	minimum number is given by the input parameters {\tt minPix} (for
	502	2-dimensional searches) and {\tt minChannels} (for 1-dimensional
[60]	503	searches).
[24]	504
[60]	505	The search only looks for positive features. If one is interested
	506	instead in negative features (such as absorption lines), set the
	507	parameter {\tt flagNegative = true}. This will invert the cube (\ie
[80]	508	multiply all pixels by $-1$) prior to the search, and then re-invert
	509	the cube (and the fluxes of any detections) after searching is
	510	complete. All outputs are done in the same manner as normal, so that
	511	fluxes of detections will be negative.
[60]	512
[3]	513	\subsection{Merging detected objects}
	514	\label{sec-merger}
	515
	516	The searching step produces a list of detections that will have many
	517	repeated detections of a given object -- for instance, spectral
[18]	518	detections in adjacent pixels of the same object and/or spatial
	519	detections in neighbouring channels. These are then combined in an
	520	algorithm that matches all objects judged to be ``close''. This
	521	determination is made in one of two ways.
[3]	522
	523	One way is to define two thresholds -- one spatial and one in velocity
	524	-- and say that two objects should be merged if there is at least one
	525	pair of pixels that lie within these threshold distances of each
	526	other. These thresholds are specified by the parameters {\tt
	527	threshSpatial} and {\tt threshVelocity} (in units of pixels and
	528	channels respectively).
	529
	530	Alternatively, the spatial requirement can be changed to say that
	531	there must be a pair of pixels that are {\it adjacent} -- a stricter,
	532	but more realistic requirement, particularly when the spatial pixels
	533	have a large angular size (as is the case for \hi\ surveys). This
	534	method can be selected by setting the parameter
	535	{\tt flagAdjacent} to 1 (\ie {\tt true}) in the parameter file. The
	536	velocity thresholding is done in the same way as the first option.
	537
[18]	538	Once the detections have been merged, they may be ``grown''. This is a
	539	process of increasing the size of the detection by adding adjacent
	540	pixels that are above some secondary threshold. This threshold is
	541	lower than the one used for the initial detection, but above the noise
	542	level, so that faint pixels are only detected when they are close to a
	543	bright pixel. The value of this threshold is a possible input
	544	parameter ({\tt growthCut}), with a default value of $1.5\sigma$. The
	545	use of the growth algorithm is controlled by the {\tt flagGrowth}
	546	parameter -- the default value of which is {\tt false}. If the
	547	detections are grown, they are sent through the merging algorithm a
[24]	548	second time, to pick up any detections that now overlap or have grown
[18]	549	over each other.
[3]	550
[24]	551	Finally, to be accepted, the detections must span {\it both} a minimum
[18]	552	number of channels (to remove any spurious single-channel spikes that
[24]	553	may be present), and a minimum number of spatial pixels. These
	554	numbers, as for the original detection step, are set with the {\tt
	555	minChannels} and {\tt minPix} parameters. The channel requirement
	556	means there must be at least one set of this many consecutive channels
	557	in the source for it to be accepted.
[18]	558
[3]	559	\section{Outputs}
	560	\label{sec-output}
	561
	562	\subsection{During execution}
	563
	564	Duchamp provides the user with feedback whilst it is running, to
	565	keep the user informed on the progress of the analysis. Most of this
	566	consists of self-explanatory messages about the particular stage the
	567	program is up to. The relevant parameters are printed to the screen at
	568	the start (once the file has been successfully read in), so the user
	569	is able to make a quick check that the setup is correct.
	570
	571	If the cube is being trimmed (\S\ref{sec-modify}), the resulting
	572	dimensions are printed to indicate how much has been trimmed. If a
	573	reconstruction is being done, a continually updating message shows the
	574	current iteration and scale (compared to the maximum scale).
	575
	576	During the searching algorithms, the progress through the 1D and 2D
	577	searches are shown. When the searches have completed,
	578	the number of objects found in both the 1D and 2D searches are
	579	reported (see \S\ref{sec-detection} for details).
	580
	581	In the merging process (where multiple detections of the same object
	582	are combined -- see \S\ref{sec-merger}), two stages of output
	583	occur. The first is when each object in the list is compared with all
	584	others. The output shows two numbers: the first being how far through
	585	the list we are, and the second being the length of the list. As the
	586	algorithm proceeds, the first number should increase and the second
[80]	587	should decrease (as objects are combined). When the numbers meet (\ie
	588	the whole list has been compared), the second phase begins, in which
	589	multiply-appearing pixels in each object are removed, as are objects
	590	not meeting the minimum channels requirement. During this phase, the
	591	total number of accepted objects is shown, which should steadily
	592	increase until all have been accepted or rejected. Note that these
	593	steps can be very quick for small numbers of detections.
[3]	594
	595	Since this continual printing to screen has some overhead of time and
	596	CPU involved, the user can elect to not print this information by
	597	setting the parameter {\tt verbose = 0}. In this case, the user is
	598	still informed as to the steps being undertaken, but the details of
	599	the progress are not shown.
	600
	601	\subsection{Results}
	602
[15]	603	Finally, we get to the results -- the reason for running Duchamp in
[18]	604	the first place. Once the detection list is finalised, it is sorted by
[80]	605	the mean velocity of the detections (or, if there is no good WCS
	606	associated with the cube, by the mean Z-pixel position). The results
	607	are then printed to the screen and to the output file, denoted by the
	608	{\tt OutFile} parameter. The results list, an example of which can be
	609	seen in Appendix~\ref{app-output}, contains the following columns
	610	(note that the title of the columns depending on WCS information will
	611	depend on the projection of the WCS):
[15]	612
[3]	613	\begin{entry}
	614	\item[Obj\#] The ID number of the detection (simply the sequential
	615	count for the list, which is ordered by increasing velocity).
[80]	616	\item[Name] The IAU-format name of the detection (based on the WCS
	617	projection).
[3]	618	\item[X] The average X-pixel position.
	619	\item[Y] The average Y-pixel position.
	620	\item[Z] The average Z-pixel position.
[80]	621	\item[RA/GLON] The Right Ascension or Galactic Longitude of the centre
	622	of the object.
	623	\item[DEC/GLAT] The Declination or Galactic Latitude of the centre of
	624	the object.
[85]	625	\item[w\_RA/w\_GLON] The width of the object in Right Ascension or
	626	Galactic Longitude [arcmin].
[80]	627	\item[w\_DEC/w\_GLAT] The width of the object in Declination Galactic
	628	Latitude [arcmin].
[3]	629	\item[VEL] The mean velocity of the object [km/s].
	630	\item[w\_VEL] The full velocity width of the detection (max channel
	631	$-$ min channel, in velocity units [km/s]).
[87]	632	\item[F\_tot] The integrated flux over the object, in the units of
	633	flux times velocity (\eg Jy km/s).
	634	\item[F\_peak] The peak flux over the object, in the units of flux.
[3]	635	\item[X1, X2] The minimum and maximum X-pixel coordinates.
	636	\item[Y1, Y2] The minimum and maximum Y-pixel coordinates.
	637	\item[Z1, Z2] The minimum and maximum Z-pixel coordinates.
	638	\item[Npix] The number of pixels \& channels (\ie distinct $(x,y,z)$
	639	coordinates) in the detection.
[87]	640	\item[Flag] Whether the detection has any warning flags (see below).
[3]	641	\end{entry}
[80]	642	The Name is derived from the WCS position. For instance, the (RA,Dec)
	643	position 12$^h$53$^m$45$^s$, -36$^\circ$24$'$12$''$ will be called
	644	J1253$-$3624 (if the epoch is J2000) or B1253$-$3624 (if B1950). An
	645	alternative form is used for Galactic coordinates: the position
	646	($l$,$b$) = (323.1245, 5.4567) will be called G323.12$+$05.45. If the
	647	WCS is not valid (\ie is not present or does not have all the
	648	necessary information), the Name, RA, DEC, VEL and related columns are
	649	not printed, but the pixel coordinates are still provided.
[3]	650
[18]	651	\begin{figure}[t]
	652	\begin{center}
	653	\includegraphics[width=\textwidth]{example_spectrum}
	654	\end{center}
	655	\caption{\footnotesize An example of the spectrum output. Note several
	656	of the features discussed in the text: the removal of the Milky Way
	657	emission around 0 km/s; the red lines indicating the reconstructed
	658	spectrum; the blue dashed lines indicating the spectral extent of
[85]	659	the detection; the blue border showing its spatial extent on the
	660	0th moment map; and the 15~arcmin-long scale bar.}
[18]	661	\label{fig-spect}
	662	\end{figure}
[3]	663
[87]	664	The last column contains any warning flags about the detection. There
	665	are currently two options here. An `E' is printed if the detection is
	666	next to the edge of the image, meaning either the limit of the pixels,
	667	or the limit of the non-BLANK pixel region. An `N' is printed if the
	668	total flux, summed over all the (non-BLANK) pixels in the smallest box
	669	that completely encloses the detection, is negative. Note that this
	670	sum will possibly include non-detected pixels. It is of use in
	671	pointing out detections that lie next to strongly negative pixels,
	672	such as might arise due to interference -- the detected pixels might
	673	then also be due to the interference, so caution is advised.
	674
[21]	675	Two alternative results files can also be requested. One option is a
	676	VOTable-format XML file, containing just the RA, Dec, Velocity and the
	677	corresponding widths of the detections, as well as the fluxes. The
	678	user should set {\tt flagVOT = 1}, and put the desired filename in the
	679	parameter {\tt votFile} -- note that the default is for it not to be
	680	produced. This file should be compatible with all Virtual Observatory
	681	tools (such as Aladin\footnote{ Aladin can be found on the web at
	682	\href{http://aladin.u-strasbg.fr/}{http://aladin.u-strasbg.fr/}}). The
	683	second option is an annotation file for use with the Karma toolkit of
	684	visualisation tools (in particular, with {\tt kvis}). This will draw a
	685	circle at the position of each detection, and number it according to
	686	the Obj\# given above. To use, the user should set {\tt flagKarma = 1},
	687	and put the desired filename in the parameter {\tt karmaFile} -- again,
	688	the default is for it not to be produced.
[18]	689
[3]	690	As the program is running, it also (optionally) records the detections
	691	made in each individual spectrum or channel (see
	692	\S\ref{sec-detection} for details on this process). This is
	693	recorded in the file denoted by the parameter {\tt LogFile}. This file
	694	does not include the columns {\tt Name, RA, DEC, w\_RA, w\_DEC, VEL,
	695	w\_VEL}. This file is designed primarily for diagnostic purposes: \eg
	696	to see if a given set of pixels is detected in, say, one channel
	697	image, but does not survive the merging process. The list of pixels
	698	(and their fluxes) in the final detection list are also printed to
	699	this file, again for diagnostic purposes. This feature can be turned
	700	off by setting {\tt flagLog = false}. (This may be a good idea if you
	701	are not interested in its contents, as it can be a large file.)
	702
[18]	703	\begin{figure}[!t]
	704	\begin{center}
	705	\includegraphics[width=\textwidth]{example_moment_map}
	706	\end{center}
	707	\caption{\footnotesize An example of the moment map created by
	708	Duchamp. The full extent of the cube is covered, and the 0th moment
	709	of each object is shown (integrated individually over all the
	710	detected channels).}
	711	\label{fig-moment}
	712	\end{figure}
	713
[3]	714	As well as the output data file, a postscript file is created that
[80]	715	shows the spectrum for each detection, together with a small cutout
[87]	716	image (0th moment) and basic information about the detection (note
	717	that any flags are printed after the name of the detection, in the
	718	format {\tt [E]}). If the
	719	cube was reconstructed, the spectrum from the reconstruction is shown
	720	in red, over the top of the original spectrum. The spectrum that is
[45]	721	plotted is governed by the {\tt spectralMethod} parameter. It can be
[80]	722	either {\tt peak}, where the spectrum is from the spatial pixel
	723	containing the detection's peak flux; or {\tt sum}, where the spectrum
	724	is summed over all spatial pixels, and then corrected for the beam
	725	size.
[3]	726
[45]	727	The spectral extent of the detection is indicated with blue lines, and
	728	a zoom is shown in a separate window. The cutout image can optionally
	729	include a border around the spatial pixels that are in the detection
	730	(turned on and off by the parameter {\tt drawBorders}). It also
	731	includes a scale bar in the bottom left corner to indicate size -- it
[85]	732	is 15~arcmin long (note that due to projection effects it may be a
	733	slightly different physical length from object to object). An example
	734	detection can be seen below in Fig.~\ref{fig-spect}.
[45]	735
[17]	736	Finally, a couple of images are optionally produced: a 0th moment map
[3]	737	of the cube, combining just the detected channels in each object,
	738	showing the integrated flux in grey-scale; and a ``detection image'',
	739	a grey-scale image where the pixel values are the number of channels
	740	that spatial pixel is detected in. In both cases, if {\tt drawBorders =
	741	true}, a border is drawn around the spatial extent of each
	742	detection. An example moment map is shown in Fig.~\ref{fig-moment}.
	743	The production or otherwise of these images is governed by the {\tt
	744	flagMaps} parameter.
	745
	746	The purpose of these images are to provide a visual guide to where the
	747	detections have been made, and, particularly in the case of the moment
	748	map, to provide an indication of the strength of the source. In both
	749	cases, the detections are numbered (in the same way as the output
	750	list), and the spatial borders are marked out as for the cutout images
	751	in the spectra file. Both these images are saved as postscript files
	752	(given by the parameters {\tt momentMap} and {\tt detectionMap}
	753	respectively), with the latter also displayed in a {\sc pgplot}
	754	window (regardless of the state of {\tt flagMaps}).
	755
	756	\section{Notes and hints on the use of Duchamp}
	757
	758	In using Duchamp, the user has to make a number of decisions about
	759	the way the program runs. This section is designed to give the user
[80]	760	some idea about what to choose.
[3]	761
	762	The main choice is whether or not to use the wavelet
	763	reconstruction. The main benefits of this are the marked reduction in
	764	the noise level, leading to regularly-shaped detections, and good
	765	reliability for faint sources. The main drawback with its use is the
	766	long execution time: to reconstruct a $170\times160\times1024$
	767	(\hipass) cube often requires three iterations and takes about 20-25
	768	minutes. The searching part of the procedure is much quicker (although
	769	see the note on merging, below), so if one uses the FDR method on the
[71]	770	un-reconstructed cube, the execution time is only a couple of
	771	minutes. Alternatively, using the ability to read in previously-saved
	772	reconstructed arrays makes running the reconstruction more than once a
	773	more feasible prospect.
[3]	774
[80]	775	%A further drawback with the reconstruction is that it is susceptible
	776	%to edge effects. If the valid area in the cube (\ie the part that is
	777	%not BLANK) has very curved edges (such as the \hipass\ polar cap cube,
	778	%H001, which has a roughly circular shape after gridding), the
	779	%convolution can produce artefacts in the reconstruction that mimic the
	780	%edges and can lead (depending on the selection threshold) to some
	781	%spurious sources. Caution is advised with such data -- the user is
	782	%advised to check carefully the reconstructed cube for the presence of
	783	%such artefacts.
[3]	784
	785	If one chooses the reconstruction method, a further decision is
	786	required on the signal-to-noise cutoff used in determining acceptable
	787	wavelet coefficients. A larger value will remove more noise from the
	788	cube, at the expense of losing fainter sources, while a smaller value
	789	will include more noise, which may produce spurious detections, but
	790	will be more sensitive to faint sources. Values of less than about
	791	$3\sigma$ tend to not reduce the noise a great deal and can lead to
[71]	792	many spurious sources (although this will depend on the nature of the
	793	cube).
[3]	794
	795	The FDR method certainly produces more reliable results than a simple
	796	sigma-clipping (\ie thresholding at some number of $\sigma$ above the
[80]	797	mean), particularly if no reconstruction is done. However, at this
	798	point it does not seem to be giving the sensitivity expected for the
	799	supplied value of {\tt alpha} (\ie it is not finding as many sources
	800	as expected). Work is being done to assess this, and to judge whether
	801	there is a real problem (such as with the determination of the
	802	statistics), or simply a result of working in 3 dimensions as opposed
	803	to 2.
[3]	804
	805	A further point to bear in mind is that the shape of the detections in
	806	a cube that has been reconstructed will be much more regular and
	807	smooth -- the ragged edges that objects in the raw cube possess are
	808	smoothed by the removal of most of the noise.
	809
[24]	810	Finally, as Duchamp is still undergoing development, there are some
	811	elements that are not fully developed. In particular, it is not as
	812	clever as I would like at avoiding interference. The ability to place
	813	requirements on the minimum number of channels and pixels partially
	814	circumvents this problem, but work is being done to make Duchamp
	815	smarter at rejecting signals that are clearly (to a human eye at
	816	least) interference. See the following section for further
	817	improvements that are planned.
[3]	818
[24]	819	%\section{Drawbacks of the current program}
	820	%
	821	%The program currently has a few problems/drawbacks/things to be aware
	822	%of that will hopefully be fixed in the future:
	823	%\begin{itemize}
	824	%
	825	%\item Narrow interference spikes are still getting found, particularly
	826	% if there is no reconstruction, or reconstruction with a relatively
	827	% low {\tt snrRecon} (such as 2 or 3). Increasing the {\tt
	828	% minChannels} parameter is one way to circumvent this, but making the
	829	% algorithm a bit more clever would be preferable.
	830	%
	831	%\item Sources that have strong continuum ripple and/or artefacts often
	832	% generate many spurious detections. This needs some work to avoid
	833	% Duchamp doing this, and until then users are advised to be aware
	834	% of the possibility. Strong continuum ripples may generate many
	835	% sources on the same spatial pixel, and this will be apparent on the
	836	% detection images.
	837	%
	838	%\item Spectra are integrated over every spatial pixel of the
	839	% detection, and this may dilute the actual detection, making it
	840	% harder to see \ie the apparent strength of the line as plotted may
	841	% not give a true indication of how strong it really is.
	842	%
	843	%%\item A caution on the merging part of the procedure. This can be time
	844	%% consuming if there are many detections that do not require merging
	845	%% -- in this case, the time will go like $N^2$ ($N$ = number of
	846	%% detections). If there are plenty of mergers, the size of the list
	847	%% reduces quickly, so the execution time will be less.
	848	%
	849	%
	850	%\end{itemize}
[3]	851
	852
	853	%\section{Comparison with other software (to be developed further...)}
	854	%
	855	%\subsection{fred, by Matt Howlett}
	856	%
	857	%This is the program used in the \hipass\ analysis. It smoothes the
	858	%data spectrally with a boxcar filter of a size that varies over a
	859	%user-specified range, and then thresholds the data.
	860	%
	861	%Works effectively, but generally doesn't find as many sources as
	862	%Duchamp, particularly when the reconstruction is used. Sensitive to
	863	%faint, broad features that fall below the reconstruction threshold.
	864	%
	865	%Execution takes a long time, depending on the range of filter widths
	866	%that are used.
	867	%
	868	%\subsection{sfind}
	869	%
	870	%Hard to evaluate, as it does not (as far as I can see) output the
	871	%channel number at which detections are made, and does not merge
	872	%detections made at adjacent channels (\ie it just works in 2
	873	%dimensions).
	874	%
	875
	876	\section{Future Developments}
	877
	878	This is both a list of planned improvements and a wish-list of
	879	features that would be nice to include (but are not planned in the
[71]	880	immediate future). Let me know if there are items not on this list, or
	881	items on the list you would like prioritised.
[3]	882
	883	\begin{itemize}
	884
[24]	885	\item More varied output formats. {\bf Planned.}
	886
[3]	887	\item Better determination of the noise characteristics of
	888	spectral-line cubes, including understanding how the noise is
	889	generated and developing a model for it. {\bf Planned.}
	890
	891	\item Include more source analysis. Examples could be: shape
	892	information; measurements of HI mass; better measurements of
	893	velocity width and profile... {\bf Some planned.}
	894
	895	\item Provide some indication of the significance of the detection
	896	(\ie some S/N-like value). {\bf Planned.}
	897
[24]	898	\item Improved ability to reject interference, possibly on the
	899	spectral shape of features. {\bf Planned.}
	900
[71]	901	\item Ability to separate (de-blend) distinct sources that have been
	902	merged. {\bf Planned.}
	903
[3]	904	\item Link to lists of possible counterparts (\eg via NED/SIMBAD/other
	905	VO tools?). {\bf Wishlist.}
	906
	907	\item At this point, the ``Milky Way'' channels are discarded and set
	908	to zero. It may be that users would like to have those put back in
	909	the final cube after the source detection is done, so at some point
	910	this option may be added. {\bf Wishlist -- if needed.}
	911
	912	\end{itemize}
	913
	914
	915	%\bibliographystyle{mn2e}
	916	%\bibliographystyle{abbrvnat}
	917	%\bibliography{mnrasmnemonic,sourceDetection}
	918	\begin{thebibliography}{}
	919
	920	\bibitem[\protect\citeauthoryear{{Calabretta} \& {Greisen}}{{Calabretta} \&
	921	{Greisen}}{2002}]{calabretta02}
	922	{Calabretta} M., {Greisen} E., 2002, A\&A, 395, 1077
	923
	924	\bibitem[\protect\citeauthoryear{{Greisen} \& {Calabretta}}{{Greisen} \&
	925	{Calabretta}}{2002}]{greisen02}
	926	{Greisen} E., {Calabretta} M., 2002, A\&A, 395, 1061
	927
[18]	928	\bibitem[\protect\citeauthoryear{{Hanisch}, {Farris}, {Greisen}, {Pence},
	929	{Schlesinger}, {Teuben}, {Thompson} \& {Warnock}}{{Hanisch}
	930	et~al.}{2001}]{hanisch01}
	931	{Hanisch} R., {Farris} A., {Greisen} E., {Pence} W., {Schlesinger} B.,
	932	{Teuben} P., {Thompson} R., {Warnock} A., 2001, A\&A, 376, 359
	933
[3]	934	\bibitem[\protect\citeauthoryear{{Hopkins}, {Miller}, {Connolly}, {Genovese},
	935	{Nichol} \& {Wasserman}}{{Hopkins} et~al.}{2002}]{hopkins02}
	936	{Hopkins} A., {Miller} C., {Connolly} A., {Genovese} C., {Nichol} R.,
	937	{Wasserman} L., 2002, AJ, 123, 1086
	938
	939	\bibitem[\protect\citeauthoryear{Lutz}{Lutz}{1980}]{lutz80}
	940	Lutz R., 1980, The Computer Journal, 23, 262
	941
	942	\bibitem[\protect\citeauthoryear{{Meyer} et~al.,}{{Meyer}
	943	et~al.}{2004}]{meyer04:trunc}
	944	{Meyer} M., et~al., 2004, MNRAS, 350, 1195
	945
	946	\bibitem[\protect\citeauthoryear{{Miller}, {Genovese}, {Nichol}, {Wasserman},
	947	{Connolly}, {Reichart}, {Hopkins}, {Schneider} \& {Moore}}{{Miller}
	948	et~al.}{2001}]{miller01}
	949	{Miller} C., {Genovese} C., {Nichol} R., {Wasserman} L., {Connolly} A.,
	950	{Reichart} D., {Hopkins} A., {Schneider} J., {Moore} A., 2001, AJ, 122,
	951	3492
	952
	953	\bibitem[\protect\citeauthoryear{Minchin}{Minchin}{1999}]{minchin99}
	954	Minchin R., 1999, PASA, 16, 12
	955
	956	\bibitem[\protect\citeauthoryear{Starck \& Murtagh}{Starck \&
	957	Murtagh}{2002}]{starck02:book}
	958	Starck J.-L., Murtagh F., 2002, {``Astronomical Image and Data Analysis''}.
	959	Springer
	960
	961	\end{thebibliography}
	962
	963
	964	\appendix
	965	\newpage
	966	\section{Available parameters}
	967	\label{app-param}
	968
	969	The full list of parameters that can be listed in the input file are
	970	given here. If not listed, they take the default value given in
	971	parentheses. Since the order of the parameters in the input file does
	972	not matter, they are grouped here in logical sections.
	973
	974	\subsection*{Input-output related}
	975	\begin{entry}
	976	\item[ImageFile (no default assumed)] The filename of the
	977	data cube to be analysed.
[71]	978	\item[flagSubsection {\tt [false]}] A flag to indicate whether one
	979	wants a subsection of the requested image.
	980	\item[Subsection {\tt [ [,,*] ]}] The requested subsection, which
	981	should be specified in the format {\tt [x1:x2,y1:y2,z1:z2]}, where
	982	the limits are inclusive. If the full range of a dimension is
	983	required, use a {\tt *}, \eg if you want the full spectral range of
	984	a subsection of the image, use {\tt [30:140,30:140,*]}.
	985	\item[flagReconExists {\tt [false]}] A flag to indicate whether the
	986	reconstructed array has been saved by a previous run of Duchamp. If
	987	set true, the reconstructed array will be read from the file given by
	988	{\tt reconFile}, rather than calculated directly.
	989	\item[reconFile (no default assumed)] The FITS file that contains the
	990	reconstructed array. If {\tt flagReconExists} is true and this
	991	parameter is not defined, the default file searched will be
	992	determined by the {\`a} trous parameters (see \S\ref{sec-recon}).
[87]	993	\item[OutFile {\tt [duchamp-Results.txt]}] The file containing the
	994	final list of detections. This also records the list of input
[3]	995	parameters.
[80]	996	\item[SpectraFile {\tt [duchamp-Spectra.ps]}] The postscript file
[3]	997	containing the resulting integrated spectra and images of the
	998	detections.
	999	\item[flagLog {\tt [true]}] A flag to indicate whether intermediate
	1000	detections should be logged.
[80]	1001	\item[LogFile {\tt [duchamp-Logfile.txt]}] The file in which intermediate
[3]	1002	detections are logged. These are detections that have not been
	1003	merged. This is primarily for use in debugging and diagnostic
	1004	purposes -- normal use of the program will probably not require
	1005	this.
	1006	\item[flagOutputRecon {\tt [false]}] A flag to say whether or not to
	1007	save the reconstructed cube as a FITS file. The filename will be
	1008	derived from the ImageFile -- the reconstruction of {\tt image.fits}
	1009	will be saved as {\tt image.RECON?.fits}, where {\tt ?} stands for
	1010	the value of {\tt snrRecon} (see below).
	1011	\item[flagOutputResid {\tt [false]}] As for {\tt flagOutputRecon}, but
	1012	for the residual array -- the difference between the original cube
	1013	and the reconstructed cube. The filename will be {\tt
	1014	image.RESID?.fits}.
	1015	\item[flagVOT {\tt [false]}] A flag to say whether to create a VOTable
	1016	file corresponding to the information in {\tt outfile}. This will be
	1017	an XML file in the Virtual Observatory VOTable format.
[80]	1018	\item[votFile {\tt [duchamp-Results.xml]}] The VOTable file with the
[3]	1019	list of final detections. Some input parameters are also recorded.
[20]	1020	\item[flagKarma {\tt [false]}] A flag to say whether to create a Karma
	1021	annotation file corresponding to the information in {\tt
	1022	outfile}. This can be used as an overlay for the Karma programs such
	1023	as {\tt kvis}.
[80]	1024	\item[karmaFile {\tt [duchamp-Results.ann]}] The Karma annotation
[20]	1025	file showing the list of final detections.
[3]	1026	\item[flagMaps {\tt [true]}] A flag to say whether to save postscript
[17]	1027	files showing the 0th moment map of the whole cube (parameter {\tt
[3]	1028	momentMap}) and the detection image ({\tt detectionMap}).
[80]	1029	\item[momentMap {\tt [duchamp-MomentMap.ps]}] A postscript file
[17]	1030	containing a map of the 0th moment of the detected sources, as well
[3]	1031	as pixel and WCS coordinates.
[80]	1032	\item[detectionMap {\tt [duchamp-DetectionMap.ps]}] A postscript
[3]	1033	file showing each of the detected objects, coloured in greyscale by
	1034	the number of channels they span. Also shows pixel and WCS
	1035	coordinates.
	1036	\end{entry}
	1037
	1038	\subsection*{Modifying the cube}
	1039	\begin{entry}
	1040	\item[flagBlankPix {\tt [true]}] A flag to say whether to remove BLANK
	1041	pixels from the analysis -- these are pixels set to some particular
	1042	value because they fall outside the imaged area.
[15]	1043	\item[blankPixValue {\tt [-8.00061]}] The value of the BLANK pixels,
	1044	if this information is not contained in the FITS header (the usual
[3]	1045	procedure is to obtain this value from the header information -- in
	1046	which case the value set by this parameter is ignored).
[55]	1047	\item[flagMW {\tt [false]}] A flag to say whether to remove channels
[3]	1048	contaminated by Milky Way (or other) emission -- the flux in these
	1049	channels is currently just set to 0.
	1050	\item[maxMW {\tt [112]}] The maximum channel for the Milky Way
[15]	1051	emission.
[3]	1052	\item[minMW {\tt [75]}] The minimum channel for the Milky Way
	1053	emission. Note that the channels specified by {\tt maxMW} and {\tt
	1054	minMW} are assumed to be Milky Way channels (\ie the range is
	1055	inclusive).
	1056	\item[flagBaseline {\tt [false]}] A flag to say whether to remove the
	1057	baseline from each spectrum in the cube for the purposes of
	1058	reconstruction and detection.
	1059	\end{entry}
	1060
	1061	\subsection*{Detection related}
	1062
	1063	\subsubsection*{General detection}
	1064	\begin{entry}
[80]	1065	\item[flagNegative {\tt [false]}] A flag to indicate that the features
	1066	being searched for are negative. The cube will be inverted prior to
	1067	searching.
[3]	1068	\item[snrCut {\tt [3.]}] The cut-off value for thresholding, in terms
	1069	of number of $\sigma$ above the mean.
[87]	1070	\item[flagGrowth {\tt [false]}] A flag indicating whether or not to
[3]	1071	grow the detected objects to a smaller threshold.
[80]	1072	\item[growthCut {\tt [2.]}] The smaller threshold using in growing
[3]	1073	detections. In units of $\sigma$ above the mean.
	1074	\end{entry}
	1075
	1076	\subsubsection*{{\` a} trous reconstruction}
	1077	\begin{entry}
	1078	\item [flagATrous {\tt [true]}] A flag indicating whether or not to
	1079	reconstruct the cube using the {\it {\`a} trous} wavelet
	1080	reconstruction. Currently does this in 3-dimensions. See
	1081	\S\ref{sec-recon} for details.
	1082	\item[scaleMin {\tt [1]}] The minimum wavelet scale to be used in the
	1083	reconstruction. A value of 1 means ``use all scales''.
	1084	\item[snrRecon {\tt [4]}] The thresholding cutoff used in the
	1085	reconstruction -- only wavelet coefficients this many $\sigma$ above
	1086	the mean (or greater) are included in the reconstruction.
	1087	\item[filterCode {\tt [2]}] The code number of the filter to use in
	1088	the reconstruction. The options are:
	1089	\begin{itemize}
	1090	\item {\bf 1:} B$_3$-spline filter: coefficients =
	1091	$(\frac{1}{16}, \frac{1}{4}, \frac{3}{8}, \frac{1}{4}, \frac{1}{16})$
	1092	\item {\bf 2:} Triangle filter: coefficients = $(\frac{1}{4}, \frac{1}{2}, \frac{1}{4})$
	1093	\item {\bf 3:} Haar wavelet: coefficients = $(0, \frac{1}{2}, \frac{1}{2})$
	1094	\end{itemize}
	1095	\end{entry}
	1096
	1097	\subsubsection*{FDR method}
	1098	\begin{entry}
	1099	\item[flagFDR {\tt [false]}] A flag indicating whether or not to use
	1100	the False Discovery Rate method in thresholding the pixels.
	1101	\item[alphaFDR {\tt [0.01]}] The $\alpha$ parameter used in the FDR
	1102	analysis. The average number of false detections, as a fraction of the
	1103	total number, will be less than $\alpha$ (see \S\ref{sec-detection}).
	1104	\end{entry}
	1105
	1106	\subsubsection*{Merging detections}
	1107	\begin{entry}
[17]	1108	\item[minPix {\tt [2]}] The minimum number of spatial pixels for a single
	1109	detection to be counted.
[3]	1110	\item[minChannels {\tt [3]}] The minimum number of consecutive
	1111	channels that must be present in the detection for it to be accepted
	1112	by the Merging algorithm.
	1113	%The minimum number of channels that a
	1114	% detection must span for it to be accepted by the Merging algorithm.
[80]	1115	\item[flagAdjacent {\tt [true]}] A flag indicating whether to use the
	1116	``adjacent pixel'' criterion to decide whether to merge objects. If
	1117	not, the next two parameters are used to determine whether objects
	1118	are within the necessary thresholds.
[3]	1119	\item[threshSpatial {\tt [3.]}] The maximum allowed minimum spatial
	1120	separation (in pixels) between two detections for them to be merged
	1121	into one. Only used if {\tt flagAdjacent = false}.
	1122	\item[threshVelocity {\tt [7.]}] The maximum allowed minimum channel
	1123	separation between two detections for them to be merged into
	1124	one. %Only used if {\tt flagAdjacent = false}.
	1125	\end{entry}
	1126
	1127	\subsubsection*{Other parameters}
	1128	\begin{entry}
[80]	1129	\item[spectralMethod {\tt [peak]}] This indicates which method is used
	1130	to plot the output spectra: {\tt peak} means plot the spectrum
	1131	containing the detection's peak pixel; {\tt sum} means sum the
	1132	spectra of each detected spatial pixel, and correct for the beam
	1133	size. Any other choice defaults to {\tt peak}.
[3]	1134	\item[drawBorders {\tt [true]}] A flag indicating whether borders
	1135	are to be drawn around the detected objects in the moment maps
	1136	included in the output (see for example Fig.~\ref{fig-spect}).
	1137	\item[verbose {\tt [true]}] A flag indicating whether to print the
	1138	progress of computationally-intensive algorithms (such as the
	1139	searching and merging) to screen.
	1140	\end{entry}
	1141
	1142
	1143	\newpage
	1144	\section{Example parameter files}
	1145	\label{app-input}
	1146
	1147	This is what a typical parameter file would look like.
	1148
	1149	\begin{verbatim}
	1150	imageFile /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits
[80]	1151	logFile logfile.txt
	1152	outFile results.txt
[3]	1153	spectraFile spectra.ps
	1154	flagSubsection 0
	1155	flagOutputRecon 0
	1156	flagOutputResid 0
	1157	flagBlankPix 1
	1158	flagMW 1
	1159	minMW 75
	1160	maxMW 112
	1161	minPix 3
	1162	flagGrowth 1
	1163	growthCut 1.5
	1164	flagATrous 0
	1165	scaleMin 1
	1166	snrRecon 4
	1167	flagFDR 1
	1168	alphaFDR 0.1
	1169	numPixPSF 20
	1170	snrCut 3
	1171	threshSpatial 3
	1172	threshVelocity 7
	1173	\end{verbatim}
	1174
	1175	Note that it is not necessary to include all these parameters in the
	1176	file, only those that need to be changed from the defaults (as listed
	1177	in Appendix~\ref{app-param}), which in this case would be very few. A
	1178	minimal parameter file might look like:
	1179	\begin{verbatim}
	1180	imageFile /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits
	1181	flagLog 0
	1182	snrRecon 3
	1183	snrCut 2.5
[87]	1184	minChannels 4
[3]	1185	\end{verbatim}
	1186	This will reconstruct the cube with a lower SNR value than the
	1187	default, select objects at a lower threshold, with a looser minimum
	1188	channel requirement, and not keep a log of the intermediate
	1189	detections.
	1190
	1191	The following page demonstrates how the parameters are presented to
	1192	the user, both on the screen at execution time and in the output and
	1193	log files:
	1194	\newpage
	1195	\begin{landscape}
	1196	Presentation of parameters in output and log files:
	1197	\begin{verbatim}
[18]	1198	---- Parameters ----
[3]	1199	Image to be analysed = /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits
[87]	1200	Intermediate Logfile = duchamp-Logfile.txt
	1201	Final Results file = duchamp-Results.txt
	1202	Spectrum file = duchamp-Spectra.ps
	1203	VOTable file = duchamp-Results.xml
	1204	0th Moment Map = duchamp-MomentMap.ps
	1205	Detection Map = duchamp-DetectionMap.ps
[3]	1206	Saving reconstructed cube? = false
	1207	Saving residuals from reconstruction? = false
[18]	1208	------
[87]	1209	Searching for Negative features? = false
[3]	1210	Fixing Blank Pixels? = true
	1211	Blank Pixel Value = -8.00061
	1212	Removing Milky Way channels? = true
	1213	Milky Way Channels = 75-112
[18]	1214	Beam Size (pixels) = 10.1788
	1215	Removing baselines before search? = false
	1216	Minimum # Pixels in a detection = 2
	1217	Growing objects after detection? = false
[3]	1218	Using A Trous reconstruction? = true
	1219	Minimum scale in reconstruction = 1
[18]	1220	SNR Threshold within reconstruction = 4
	1221	Filter being used for reconstruction = B3 spline function
[3]	1222	Using FDR analysis? = false
[18]	1223	SNR Threshold = 2.5
[3]	1224	Using Adjacent-pixel criterion? = true
[87]	1225	Max. velocity separation for merging = 7
[3]	1226	Min. # channels for merging = 4
[87]	1227	Method of spectral plotting = peak
[3]	1228	\end{verbatim}
	1229
	1230	\newpage
	1231	\section{Example output file}
	1232	\label{app-output}
	1233	This the typical content of an output file, after running Duchamp
[80]	1234	with the parameters illustrated on the previous page.
[3]	1235
	1236	{\scriptsize
	1237	\begin{verbatim}
[87]	1238	Results of the Duchamp source finder: Tue May 23 14:51:38 2006
[18]	1239	---- Parameters ----
[3]	1240
	1241	(... omitted for clarity -- see previous page for examples...)
	1242
[18]	1243	--------------------
	1244	Total number of detections = 23
	1245	--------------------
[87]	1246	Obj# Name X Y Z RA DEC w_RA w_DEC VEL w_VEL F_tot F_peak X1 X2 Y1 Y2 Z1 Z2 Npix Flag
	1247	-----------------------------------------------------------------------------------------------------------------------------------------------------------
	1248	1 J0609-2156 59.4 140.6 114.7 06:09:21.03 -21:56:51.08 48.48 39.45 226.253 65.957 17.572 0.213 55 66 136 145 113 118 185
	1249	2 J0607-2601 65.2 79.6 116.2 06:07:52.21 -26:01:09.34 44.44 39.50 246.310 39.574 4.144 0.100 60 70 76 85 115 118 50
	1250	3 J0606-2720 70.8 59.8 121.4 06:06:14.90 -27:20:45.24 52.45 47.59 315.404 39.574 17.066 0.150 65 77 53 64 120 123 213
	1251	4 J0611-2138 52.5 145.1 162.5 06:11:18.85 -21:38:03.71 32.39 23.49 856.919 118.722 44.394 0.410 49 56 142 147 158 167 303 E
	1252	5 J0600-2859 89.7 35.3 202.4 06:00:33.13 -28:59:01.59 23.92 28.10 1383.476 184.678 26.573 0.173 87 92 32 38 195 209 319
	1253	6 J0558-2639 95.5 70.2 222.6 05:58:52.79 -26:39:04.56 15.93 12.10 1650.508 105.531 1.925 0.063 94 97 69 71 219 227 35
	1254	7 J0617-2724 34.8 58.3 227.5 06:17:05.84 -27:24:00.93 20.75 23.42 1714.993 303.400 11.414 0.093 33 37 56 61 215 238 176
	1255	8 J0609-2141 60.3 144.4 229.6 06:09:05.74 -21:41:38.75 16.14 11.82 1742.470 105.531 1.476 0.068 59 62 143 145 225 233 25
	1256	9 J0558-2525 95.7 88.6 231.1 05:58:51.19 -25:25:33.12 27.87 24.16 1762.632 250.635 16.930 0.115 92 98 86 91 220 239 257
	1257	10 J0600-2141 88.9 144.4 232.3 06:00:52.94 -21:41:57.48 31.95 24.15 1777.848 224.252 34.030 0.166 86 93 142 147 222 239 415 E
	1258	11 J0615-2634 40.0 70.8 232.6 06:15:25.93 -26:34:35.73 16.54 19.58 1782.224 52.765 2.757 0.068 38 41 69 73 231 235 44
	1259	12 J0604-2606 75.9 78.4 233.1 06:04:42.24 -26:06:22.98 28.12 23.86 1788.258 224.252 27.059 0.155 73 79 76 81 225 242 352
	1260	13 J0601-2340 88.0 114.9 235.7 06:01:08.27 -23:40:17.66 35.94 32.09 1822.941 263.826 85.132 0.297 84 92 112 119 226 246 724
	1261	14 J0615-2234 38.2 130.6 253.6 06:15:30.57 -22:34:51.69 12.38 15.71 2059.721 118.722 2.317 0.070 37 39 129 132 248 257 40
	1262	15 J0617-2305 31.4 122.8 258.0 06:17:33.18 -23:05:36.24 16.45 15.54 2117.104 39.574 1.424 0.062 30 33 121 124 256 259 23
	1263	16 J0612-2149 49.5 142.3 271.1 06:12:11.78 -21:49:20.22 24.35 19.58 2290.167 395.740 20.712 0.101 47 52 140 144 257 287 318
	1264	17 J0616-2133 35.2 145.9 300.0 06:16:16.44 -21:33:36.96 20.21 7.47 2671.799 224.252 3.851 0.127 33 37 145 146 294 311 40 E
	1265	18 J0544-2736 144.0 54.9 325.4 05:44:13.62 -27:36:34.24 3.57 12.13 3006.575 39.574 0.436 0.057 144 144 54 56 324 327 7 E
	1266	19 J0555-2956 107.2 20.7 367.5 05:55:10.37 -29:56:43.13 19.65 24.31 3561.004 39.574 6.482 0.169 105 109 18 23 366 369 72
	1267	20 J0558-2321 96.0 119.6 532.1 05:58:47.64 -23:21:17.38 11.91 16.09 5733.479 52.765 1.287 0.051 95 97 118 121 530 534 27
	1268	21 J0616-2649 37.9 67.0 547.0 06:16:04.62 -26:49:18.33 12.35 11.67 5929.923 39.574 1.637 0.064 37 39 66 68 546 549 25
	1269	22 J0619-2252 25.1 125.9 724.2 06:19:21.57 -22:52:13.98 12.38 11.61 8267.304 39.573 0.698 0.059 24 26 125 127 723 726 13 E
	1270	23 J0552-2916 116.9 30.5 727.0 05:52:15.05 -29:16:49.65 11.59 20.25 8304.033 303.400 35.834 0.479 116 118 28 32 716 739 132
[3]	1271	\end{verbatim}
	1272	}
[80]	1273	Note that the
	1274	width of the table can make it hard to read. A good trick for those
	1275	using UNIX/Linux is to make use of the {\tt a2ps} command. The
	1276	following works well, producing a postscript file {\tt results.ps}:
[87]	1277	\\\verb\|a2ps -1 -r -f8 -o duchamp-Results.ps duchamp-Results.txt\|
[3]	1278
	1279	%\end{landscape}
	1280
	1281	\newpage
	1282	\section{Example VOTable output}
	1283	\label{app-votable}
	1284	This is part of the VOTable, in XML format, corresponding to the
[20]	1285	output file in Appendix~\ref{app-output} (the indentation has been removed to make it fit on the page!).
[3]	1286
	1287	%\begin{landscape}
	1288	{\scriptsize
	1289	\begin{verbatim}
	1290	<?xml version="1.0"?>
	1291	<VOTABLE version="1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	1292	xsi:noNamespaceSchemaLocation="http://www.ivoa.net/xml/VOTable/VOTable/v1.1">
[20]	1293	<COOSYS ID="J2000" equinox="J2000." epoch="J2000." system="eq_FK5"/>
	1294	<RESOURCE name="Duchamp Output">
	1295	<TABLE name="Detections">
	1296	<DESCRIPTION>Detected sources and parameters from running the Duchamp source finder.</DESCRIPTION>
	1297	<PARAM name="FITS file" datatype="char" ucd="meta.file;meta.fits" value="/DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits"/>
[87]	1298	<PARAM name="Threshold" datatype="float" ucd="stat.snr" value="2.5">
	1299	<PARAM name="ATrous note" datatype="char" ucd="meta.note" value="The a trous reconstruction method was used, with the following parameters.">
	1300	<PARAM name="ATrous Cut" datatype="float" ucd="stat.snr" value="4">
	1301	<PARAM name="ATrous Minimum Scale" datatype="int" ucd="stat.param" value="1">
	1302	<PARAM name="ATrous Filter" datatype="char" ucd="meta.code;stat" value="B3 spline function">
[20]	1303	<FIELD name="ID" ID="col1" ucd="meta.id" datatype="int" width="4"/>
	1304	<FIELD name="Name" ID="col2" ucd="meta.id;meta.main" datatype="char" arraysize="14"/>
	1305	<FIELD name="RA" ID="col3" ucd="pos.eq.ra;meta.main" ref="J2000" datatype="float" width="10" precision="6" unit="deg"/>
	1306	<FIELD name="Dec" ID="col4" ucd="pos.eq.dec;meta.main" ref="J2000" datatype="float" width="10" precision="6" unit="deg"/>
	1307	<FIELD name="w_RA" ID="col3" ucd="phys.angSize;pos.eq.ra" ref="J2000" datatype="float" width="7" precision="2" unit="arcmin"/>
	1308	<FIELD name="w_Dec" ID="col4" ucd="phys.angSize;pos.eq.dec" ref="J2000" datatype="float" width="7" precision="2" unit="arcmin"/>
	1309	<FIELD name="Vel" ID="col4" ucd="phys.veloc;src.dopplerVeloc" datatype="float" width="9" precision="3" unit="km/s"/>
	1310	<FIELD name="w_Vel" ID="col4" ucd="phys.veloc;src.dopplerVeloc;spect.line.width" datatype="float" width="8" precision="3" unit="km/s"/>
	1311	<FIELD name="Integrated_Flux" ID="col4" ucd="phys.flux;spect.line.intensity" datatype="float" width="10" precision="3" unit="km/s"/>
	1312	<DATA>
	1313	<TABLEDATA>
	1314	<TR>
	1315	<TD> 1</TD><TD> J0609-2200</TD><TD> 92.410416</TD><TD>-22.013390</TD><TD> 48.50</TD><TD> 39.42</TD><TD> 213.061</TD><TD> 65.957</TD><TD> 17.572</TD>
	1316	</TR>
	1317	<TR>
	1318	<TD> 2</TD><TD> J0608-2605</TD><TD> 92.042633</TD><TD>-26.085157</TD><TD> 44.47</TD><TD> 39.47</TD><TD> 233.119</TD><TD> 39.574</TD><TD> 4.144</TD>
	1319	</TR>
	1320	<TR>
	1321	<TD> 3</TD><TD> J0606-2724</TD><TD> 91.637840</TD><TD>-27.412022</TD><TD> 52.48</TD><TD> 47.57</TD><TD> 302.213</TD><TD> 39.574</TD><TD> 17.066</TD>
	1322	</TR>
[3]	1323	(... table truncated for clarity ...)
[20]	1324	</TABLEDATA>
	1325	</DATA>
	1326	</TABLE>
	1327	</RESOURCE>
[3]	1328	</VOTABLE>
	1329	\end{verbatim}
	1330	}
	1331	\end{landscape}
	1332
[87]	1333	\newpage
	1334	\section{Example Karma Annotation File output}
	1335	\label{app-karma}
	1336
	1337	This is the format of the Karma Annotation file, showing the locations
	1338	of the detected objects. This can be loaded by the plotting tools of
	1339	the Karma package (for instance, {\tt kvis}) as an overlay on the FITS
	1340	file.
	1341
	1342	\begin{verbatim}
	1343	# Duchamp Source Finder results for
	1344	# cube /DATA/SITAR_1/whi550/cubes/H201_abcde_luther_chop.fits
	1345	COLOR RED
	1346	COORD W
	1347	CIRCLE 92.3376 -21.9475 0.403992
	1348	TEXT 92.3376 -21.9475 1
	1349	CIRCLE 91.9676 -26.0193 0.37034
	1350	TEXT 91.9676 -26.0193 2
	1351	CIRCLE 91.5621 -27.3459 0.437109
	1352	TEXT 91.5621 -27.3459 3
	1353	CIRCLE 92.8285 -21.6344 0.269914
	1354	TEXT 92.8285 -21.6344 4
	1355	CIRCLE 90.1381 -28.9838 0.234179
	1356	TEXT 90.1381 -28.9838 5
	1357	CIRCLE 89.72 -26.6513 0.132743
	1358	TEXT 89.72 -26.6513 6
	1359	CIRCLE 94.2743 -27.4003 0.195175
	1360	TEXT 94.2743 -27.4003 7
	1361	CIRCLE 92.2739 -21.6941 0.134538
	1362	TEXT 92.2739 -21.6941 8
	1363	CIRCLE 89.7133 -25.4259 0.232252
	1364	TEXT 89.7133 -25.4259 9
	1365	CIRCLE 90.2206 -21.6993 0.266247
	1366	TEXT 90.2206 -21.6993 10
	1367	CIRCLE 93.8581 -26.5766 0.163153
	1368	TEXT 93.8581 -26.5766 11
	1369	CIRCLE 91.176 -26.1064 0.234356
	1370	TEXT 91.176 -26.1064 12
	1371	CIRCLE 90.2844 -23.6716 0.299509
	1372	TEXT 90.2844 -23.6716 13
	1373	CIRCLE 93.8774 -22.581 0.130925
	1374	TEXT 93.8774 -22.581 14
	1375	CIRCLE 94.3882 -23.0934 0.137108
	1376	TEXT 94.3882 -23.0934 15
	1377	CIRCLE 93.0491 -21.8223 0.202928
	1378	TEXT 93.0491 -21.8223 16
	1379	CIRCLE 94.0685 -21.5603 0.168456
	1380	TEXT 94.0685 -21.5603 17
	1381	CIRCLE 86.0568 -27.6095 0.101113
	1382	TEXT 86.0568 -27.6095 18
	1383	CIRCLE 88.7932 -29.9453 0.202624
	1384	TEXT 88.7932 -29.9453 19
	1385	\end{verbatim}
	1386
	1387	\newpage
[80]	1388	\section{Installing Duchamp (README file)}
[71]	1389	\begin{verbatim}
	1390	There is an executable (Duchamp) that has been compiled on a Debian
	1391	Linux kernel 2.6.8-2-686, with gcc version 3.3.5 (Debian 1:3.3.5-13)
	1392
	1393	If that is no good to you, you can compile it yourself using the
	1394	Makefile included in this directory (sorry for not having a configure
	1395	script or similar yet!).
	1396
	1397	Duchamp uses three main external libraries: pgplot, cfitsio and
	1398	wcslib. You will need to set the paths for the base directory and
	1399	three libraries, as they are currently configured for my use and will
	1400	not be of much use to you! These are:
	1401
	1402	BASE --> the current directory
	1403	PGDIR --> where the pgplot libraries (and header files) are located
	1404	CFITSIODIR --> where the header file fitsio.h is
	1405	CFITSIOLDIR --> where the cfitsio library is located (libcfitsio.a)
	1406	WCSDIR --> where the wcslib header files are
	1407	WCSLDIR --> where the wcslib library is located (libwcs.a)
	1408
	1409	If you do not have the libraries, they can be downloaded from the
	1410	following locations:
	1411	PGPlot -- http://www.astro.caltech.edu/~tjp/pgplot/
	1412	cfitsio -- http://heasarc.gsfc.nasa.gov/docs/software/fitsio/fitsio.html
	1413	wcslib -- http://www.atnf.csiro.au/people/Mark.Calabretta/WCS/index.html
	1414
	1415	Once you've set up the Makefile correctly, then simply typing
	1416	> make duchamp
	1417	will compile the program.
	1418
	1419	To run it, you need to use the syntax
	1420	> Duchamp -p parameterFile
	1421	where parameterFile is a file with the input parameters, including the
	1422	name of the cube you want to search.
	1423
	1424	There are two example input files included with the distribution. The
	1425	smaller one, InputExample, shows the typical parameters one might want
	1426	to set. The large one, InputComplete, lists all parameters that can be
	1427	entered, and a brief description of them. Refer to the documentation
	1428	for further details.
	1429
	1430	To get going quickly, just replace the "your-file-here" in
	1431	InputExample with your image name, and type
	1432	> Duchamp -p InputExample
	1433	and you're off!
	1434	\end{verbatim}
	1435
[3]	1436	\section{Robust statistics for a Normal distribution}
	1437	\label{app-madfm}
	1438
	1439	The Normal, or Gaussian, distribution for mean $\mu$ and standard
	1440	deviation $\sigma$ can be written as
	1441	\[
	1442	f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\ e^{-(x-\mu)^2/2\sigma^2}.
	1443	\]
	1444
	1445	When one has a purely Gaussian signal, it is straightforward to
	1446	estimate $\sigma$ by calculating the standard deviation (or rms) of
	1447	the data. However, if there is a small amount of signal present on top
	1448	of Gaussian noise, and one wants to estimate the $\sigma$ for the
	1449	noise, the presence of the large values from the signal can bias the
	1450	estimator to higher values.
	1451
	1452	An alternative way is to use the median ($m$) and median absolute deviation
	1453	from the median ($s$) to estimate $\mu$ and $\sigma$. The median is the
	1454	middle of the distribution, defined for a continuous distribution by
	1455	\[
	1456	\int_{-\infty}^{m} f(x) \diff x = \int_{m}^{\infty} f(x) \diff x.
	1457	\]
	1458	From symmetry, we quickly see that for the continuous Normal
	1459	distribution, $m=\mu$. We consider the case henceforth of $\mu=0$,
	1460	without loss of generality.
	1461
	1462	To find $s$, we find the distribution of the absolute deviation from
	1463	the median, and then find the median of that distribution. This
	1464	distribution is given by
	1465	\begin{eqnarray*}
	1466	g(x) &= &{\mbox{\rm distribution of }} \|x\|\\
[18]	1467	&= &f(x) + f(-x),\ x\ge0\\
[37]	1468	&= &\sqrt{\frac{2}{\pi\sigma^2}}\, e^{-x^2/2\sigma^2},\ x\ge0.
[3]	1469	\end{eqnarray*}
[18]	1470	So, the median absolute deviation from the median, $s$, is given by
[3]	1471	\[
	1472	\int_{0}^{s} g(x) \diff x = \int_{s}^{\infty} g(x) \diff x.
	1473	\]
	1474	Now, $\int_{0}^{\infty}e^{-x^2/2\sigma^2} \diff x = \sqrt{\pi\sigma^2/2}$, and
	1475	so $\int_{s}^{\infty} e^{-x^2/2\sigma^2} \diff x =
	1476	\sqrt{\pi\sigma^2/2} - \int_{0}^{s} e^{-\frac{x^2}{2\sigma^2}} \diff x
	1477	$. Hence, to find $s$ we simply solve the following equation (setting $\sigma=1$ for
	1478	simplicity -- equivalent to stating $x$ and $s$ in units of $\sigma$):
	1479	\[
	1480	\int_{0}^{s}e^{-x^2/2} \diff x - \sqrt{\pi/8} = 0.
	1481	\]
	1482	This is hard to solve analytically (no nice analytic solution exists
	1483	for the finite integral that I'm aware of), but straightforward to
	1484	solve numerically, yielding the value of $s=0.6744888$. Thus, to
	1485	estimate $\sigma$ for a Normally distributed data set, one can calculate
	1486	$s$, then divide by 0.6744888 (or multiply by 1.4826042) to obtain the
	1487	correct estimator.
	1488
	1489	Note that this is different to solutions quoted elsewhere,
[37]	1490	specifically in \citet{meyer04:trunc}, where the same robust estimator
	1491	is used but with an incorrect conversion to standard deviation -- they
	1492	assume $\sigma = s\sqrt{\pi/2}$. This, in fact, is the conversion used
	1493	to convert the {\it mean} absolute deviation from the mean to the
	1494	standard deviation. This means that the cube noise in the \hipass\
[80]	1495	catalogue (their parameter Rms$_{\rm cube}$) should be 18\% larger
	1496	than quoted.
[3]	1497
[24]	1498	\section{How Gaussian noise changes with wavelet scale.}
	1499	\label{app-scaling}
	1500
	1501	The key element in the wavelet reconstruction of an array is the
	1502	thresholding of the individual wavelet coefficient arrays. This is
	1503	usually done by choosing a level to be some number of standard
	1504	deviations above the mean value.
	1505
	1506	However, since the wavelet arrays are produced by convolving the input
	1507	array by an increasingly large filter, the pixels in the coefficient
	1508	arrays become increasingly correlated as the scale of the filter
	1509	increases. This results in the measured standard deviation from a
	1510	given coefficient array decreasing with increasing scale. To calculate
	1511	this, we need to take into account how many other pixels each pixel in
	1512	the convolved array depends on.
	1513
	1514	To demonstrate, suppose we have a 1-D array with $N$ pixel values
	1515	given by $F_i,\ i=1,...,N$, and we convolve it with the B$_3$-spline
[87]	1516	filter, defined by the set of coefficients
	1517	$\{1/16,1/4,3/8,1/4,1/16\}$. The flux of the $i$th pixel in the
	1518	convolved array will be
[24]	1519	\[
	1520	F'_i = \frac{1}{16}F_{i-2} + \frac{1}{16}F_{i-2} + \frac{3}{8}F_{i}
	1521	+ \frac{1}{4}F_{i-1} + \frac{1}{16}F_{i+2}
	1522	\]
	1523	and the flux of the corresponding pixel in the wavelet array will be
	1524	\[
	1525	W'_i = F_i - F'_i = \frac{1}{16}F_{i-2} + \frac{1}{16}F_{i-2} + \frac{5}{8}F_{i}
	1526	+ \frac{1}{4}F_{i-1} + \frac{1}{16}F_{i+2}
	1527	\]
	1528	Now, assuming each pixel has the same standard deviation
	1529	$\sigma_i=\sigma$, we can work out the standard deviation for the
	1530	coefficient array:
	1531	\[
	1532	\sigma'_i = \sigma \sqrt{\left(\frac{1}{16}\right)^2 + \left(\frac{1}{4}\right)^2
	1533	+ \left(\frac{5}{8}\right)^2 + \left(\frac{1}{4}\right)^2 + \left(\frac{1}{16}\right)^2}
	1534	= 0.72349\ \sigma
	1535	\]
	1536	Thus, the first scale wavelet coefficient array will have a standard
	1537	deviation of 72.3\% of the input array. This procedure can be followed
	1538	to calculate the necessary values for all scales, dimensions and
	1539	filters used by Duchamp.
	1540
	1541	Calculating these values is, therefore, a critical step in performing
	1542	the reconstruction. \citet{starck02:book} did so by simulating data sets
	1543	with Gaussian noise, taking the wavelet transform, and measuring the
	1544	value of $\sigma$ for each scale. We take a different approach, by
	1545	calculating the scaling factors directly from the filter coefficients
	1546	by taking the wavelet transform of an array made up of a 1 in the
	1547	central pixel and 0s everywhere else. The scaling value is then
	1548	derived by adding in quadrature all the wavelet coefficient values at
	1549	each scale. We give the scaling factors for the three filters
	1550	available to Duchamp on the following page. These values are
	1551	hard-coded into Duchamp, so no on-the-fly calculation of them is
	1552	necessary.
	1553
	1554	Memory limitations prevent us from calculating factors for large
	1555	scales, particularly for the three-dimensional case (hence the --
	1556	symbols in the tables). To calculate factors for
	1557	higher scales than those available, we note the following
	1558	relationships apply for large scales to a sufficient level of precision:
	1559	\begin{itemize}
	1560	\item 1-D: factor(scale $i$) = factor(scale $i-1$)$/\sqrt{2}$.
	1561	\item 2-D: factor(scale $i$) = factor(scale $i-1$)$/2$.
	1562	\item 1-D: factor(scale $i$) = factor(scale $i-1$)$/\sqrt{8}$.
	1563	\end{itemize}
	1564
	1565	\newpage
	1566	\begin{itemize}
	1567	\item {\bf B$_3$-Spline Function:} $\{1/16,1/4,3/8,1/4,1/16\}$
	1568
	1569	\begin{tabular}{llll}
	1570	Scale & 1 dimension & 2 dimension & 3 dimension\\ \hline
	1571	1 & 0.723489806 & 0.890796310 & 0.956543592\\
	1572	2 & 0.285450405 & 0.200663851 & 0.120336499\\
	1573	3 & 0.177947535 & 0.0855075048 & 0.0349500154\\
	1574	4 & 0.122223156 & 0.0412474444 & 0.0118164242\\
	1575	5 & 0.0858113122 & 0.0204249666 & 0.00413233507\\
	1576	6 & 0.0605703043 & 0.0101897592 & 0.00145703714\\
	1577	7 & 0.0428107206 & 0.00509204670 & 0.000514791120\\
	1578	8 & 0.0302684024 & 0.00254566946 & --\\
	1579	9 & 0.0214024008 & 0.00127279050 & --\\
	1580	10 & 0.0151336781 & 0.000636389722 & --\\
	1581	11 & 0.0107011079 & 0.000318194170 & --\\
	1582	12 & 0.00756682272 & -- & --\\
	1583	13 & 0.00535055108 & -- & --\\
	1584	%14 & 0.00378341085 & -- & --\\
	1585	%15 & 0.00267527545 & -- & --\\
	1586	%16 & 0.00189170541 & -- & --\\
	1587	%17 & 0.00133763772 & -- & --\\
	1588	%18 & 0.000945852704 & -- & --
	1589	\end{tabular}
	1590
	1591	\item {\bf Triangle Function:} $\{1/4,1/2,1/4\}$
	1592
	1593	\begin{tabular}{llll}
	1594	Scale & 1 dimension & 2 dimension & 3 dimension\\ \hline
	1595	1 & 0.612372436 & 0.800390530 & 0.895954449 \\
	1596	2 & 0.330718914 & 0.272878894 & 0.192033014\\
	1597	3 & 0.211947812 & 0.119779282 & 0.0576484078\\
	1598	4 & 0.145740298 & 0.0577664785 & 0.0194912393\\
	1599	5 & 0.102310944 & 0.0286163283 & 0.00681278387\\
	1600	6 & 0.0722128185 & 0.0142747506 & 0.00240175885\\
	1601	7 & 0.0510388224 & 0.00713319703 & 0.000848538128 \\
	1602	8 & 0.0360857673 & 0.00356607618 & 0.000299949455 \\
	1603	9 & 0.0255157615 & 0.00178297280 & -- \\
	1604	10 & 0.0180422389 & 0.000891478237 & -- \\
	1605	11 & 0.0127577667 & 0.000445738098 & -- \\
	1606	12 & 0.00902109930 & 0.000222868922 & -- \\
	1607	13 & 0.00637887978 & -- & -- \\
	1608	%14 & 0.00451054902 & -- & -- \\
	1609	%15 & 0.00318942978 & -- & -- \\
	1610	%16 & 0.00225527449 & -- & -- \\
	1611	%17 & 0.00159471988 & -- & -- \\
	1612	%18 & 0.000112763724 & -- & --
	1613
	1614	\end{tabular}
	1615
	1616	\item {\bf Haar Wavelet:} $\{0,1/2,1/2\}$
	1617
	1618	\begin{tabular}{llll}
	1619	Scale & 1 dimension & 2 dimension & 3 dimension\\ \hline
	1620	1 & 0.707167810 & 0.433012702 & 0.935414347 \\
	1621	2 & 0.500000000 & 0.216506351 & 0.330718914\\
	1622	3 & 0.353553391 & 0.108253175 & 0.116926793\\
	1623	4 & 0.250000000 & 0.0541265877 & 0.0413398642\\
	1624	5 & 0.176776695 & 0.0270632939 & 0.0146158492\\
	1625	6 & 0.125000000 & 0.0135316469 & 0.00516748303
	1626
	1627	\end{tabular}
	1628
	1629
	1630	\end{itemize}
	1631
	1632	\end{document}
	1633
	1634
	1635
	1636
	1637
	1638
	1639
	1640
	1641
	1642
	1643
	1644
	1645
	1646
	1647
	1648
	1649
	1650
	1651
	1652
	1653
	1654
	1655
	1656
	1657
	1658
	1659
	1660
	1661
	1662
	1663
	1664
	1665

Note: See TracBrowser for help on using the repository browser.

Context Navigation

source: trunk/docs/Guide.tex @ 87

Download in other formats: