Context Navigation

source: trunk/docs/app-stats.tex @ 258

Visit:

Last change on this file since 258 was 258, checked in by Matthew Whiting, 17 years ago
Merging pixel-map-branch revisions 236:257 back into trunk. The use of the PixelMap? functions is sorted now, so we put everything back into a uniform location.
File size: 2.7 KB

Rev	Line
[158]	1	\secA{Robust statistics for a Normal distribution}
	2	\label{app-madfm}
	3
	4	The Normal, or Gaussian, distribution for mean $\mu$ and standard
	5	deviation $\sigma$ can be written as
	6	\[
	7	f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\ e^{-(x-\mu)^2/2\sigma^2}.
	8	\]
	9
	10	When one has a purely Gaussian signal, it is straightforward to
	11	estimate $\sigma$ by calculating the standard deviation (or rms) of
	12	the data. However, if there is a small amount of signal present on top
	13	of Gaussian noise, and one wants to estimate the $\sigma$ for the
	14	noise, the presence of the large values from the signal can bias the
	15	estimator to higher values.
	16
	17	An alternative way is to use the median ($m$) and median absolute
	18	deviation from the median ($s$) to estimate $\mu$ and $\sigma$. The
	19	median is the middle of the distribution, defined for a continuous
	20	distribution by
	21	\[
[231]	22	\int_{-\infty}^{m} f(x) dx = \int_{m}^{\infty} f(x) dx.
[158]	23	\]
	24	From symmetry, we quickly see that for the continuous Normal
	25	distribution, $m=\mu$. We consider the case henceforth of $\mu=0$,
	26	without loss of generality.
	27
	28	To find $s$, we find the distribution of the absolute deviation from
	29	the median, and then find the median of that distribution. This
	30	distribution is given by
	31	\begin{eqnarray*}
[231]	32	g(x) &= &{\text{distribution of }} \|x\|\\
[158]	33	&= &f(x) + f(-x),\ x\ge0\\
	34	&= &\sqrt{\frac{2}{\pi\sigma^2}}\, e^{-x^2/2\sigma^2},\ x\ge0.
	35	\end{eqnarray*}
	36	So, the median absolute deviation from the median, $s$, is given by
	37	\[
[231]	38	\int_{0}^{s} g(x) dx = \int_{s}^{\infty} g(x) dx.
[158]	39	\]
[231]	40	If we use the identity
	41	\[
	42	\int_{0}^{\infty}e^{-x^2/2\sigma^2} dx = \sqrt{\pi\sigma^2/2}
	43	\]
	44	we find that
	45	\[
	46	\int_{s}^{\infty} e^{-x^2/2\sigma^2} dx =
	47	\sqrt{\pi\sigma^2/2}-\int_{0}^{s} e^{-x^2/2\sigma^2}dx.
	48	\]
[158]	49	Hence, to find $s$ we simply solve the following equation (setting
	50	$\sigma=1$ for simplicity -- equivalent to stating $x$ and $s$ in
	51	units of $\sigma$):
	52	\[
[231]	53	\int_{0}^{s}e^{-x^2/2} dx - \sqrt{\pi/8} = 0.
[158]	54	\]
	55	This is hard to solve analytically (no nice analytic solution exists
	56	for the finite integral that I'm aware of), but straightforward to
	57	solve numerically, yielding the value of $s=0.6744888$. Thus, to
	58	estimate $\sigma$ for a Normally distributed data set, one can
	59	calculate $s$, then divide by 0.6744888 (or multiply by 1.4826042) to
	60	obtain the correct estimator.
	61
	62	Note that this is different to solutions quoted elsewhere,
	63	specifically in \citet{meyer04:trunc}, where the same robust estimator
	64	is used but with an incorrect conversion to standard deviation -- they
	65	assume $\sigma = s\sqrt{\pi/2}$. This, in fact, is the conversion used
	66	to convert the \emph{mean} absolute deviation from the mean to the
[258]	67	standard deviation. This means that the cube noise in the \hipass
[158]	68	catalogue (their parameter Rms$_{\rm cube}$) should be 18\% larger
	69	than quoted.

Note: See TracBrowser for help on using the repository browser.

Download in other formats: