Download ppt

Lecture 8 • Source detection NASSP Masters 5003S - Computational Astronomy - 2009 Different sorts of model All models Background + signal Background + many similar signals NASSP Masters 5003S - Computational Astronomy - 2009 b + s vs b + Σsi mr   br, θb   Asrr0  mr   br, θ b   i Ai srr0,i  s may often be assumed to be: -slowly varying with r; -with compact support NASSP Masters 5003S - Computational Astronomy - 2009 Source detection • The basic idea is related to Null Hypothesis testing… – But if the sources can be assumed to be localized, we can cut the data up and test each source-sized bit at a time. Survival function • sliding window. Some missed jargon: the probability at the intercept is called the P-value (you can google it) NASSP Masters 5003S - Computational Astronomy - 2009 Testing the NH: • Not all tests are equally good at finding signals! • Eg Cash statistic is better than χ2 (in circumstances where the Cash test is appropriate – eg bkg is a subset of the signal model). • Cash stat makes use of knowledge about the signal shape – in general any stat which does similar (eg a matched filter) will also perform well. • There is an infinite variety of ‘statistics’ to choose from. NASSP Masters 5003S - Computational Astronomy - 2009 Source detection. – If the SF probability in each patch (the Pvalue) is smaller than a previously chosen cutoff, we can call this a positive detection. • BUT! Note that there is no certainty. – Sometimes the null model will by chance give a large χ2 => ‘false positives.’ For given data, background and cutoff, there will be a fixed number of false positives expected in the source list. • => ‘reliability’. More on this later. – Sometimes a real source will give a small nullhypothesis χ2 => ‘false negatives’, real sources which are missed. • => ‘completeness’. More on this later. NASSP Masters 5003F - Computational Astronomy - 2009 Problems with the NH approach: • We don’t have exact knowledge of the background. – Have to estimate it either from • separate data – in which case we need the separate data! (Don’t always have the luxury.) • or from the same data… but this may be dominated by the source... – Or our background model may be wrong. • Same issues as other model fitting. In particular: – χ2 has to be used with care when the noise is Poisson. NASSP Masters 5003F - Computational Astronomy - 2009 But where are the sources? • Applying some sort of NH test in a sliding window will return a new random signal – now correlated.. • Finding the sources consists rather of looking for peaks in this random signal. • The simplest example is when the noise is uncorrelated and the source peaks have width=0. NASSP Masters 5003F - Computational Astronomy - 2009 Looking for sources 1 channel at a time: • In each channel, we test the NH with N=1. – Since there are no fitted parameters, υ=1 also. – If the source occupies a single channel, this procedure is optimal. – If, however, the source is spread over several channels (as is usual), this procedure is not efficient. – We want a statistic which uses the maximum amount of information about the source shape. NASSP Masters 5003F - Computational Astronomy - 2009 A generic source-detection algorithm • We shall assume that: – The data is ‘binned’ (eg CCD data). – We have a good independent estimate of the background. – The sources are sparsely distributed – such that we can deal with them one at a time. – The shape of the source profile is known. – The source position is unknown. – The source amplitude is unknown (but >0). NASSP Masters 5003F - Computational Astronomy - 2009 Generic source-detection algorithm: The algorithm has 3 steps: 1: Calculate a sliding-window map. 2: Find the peaks in this map. 3: Rejects For each peak, calculate the probability that it could arise by chance from the background (the null hypothesis P-value). No P < Pcutoff? Yes Choose a Pcutoff Sources NASSP Masters 5003F - Computational Astronomy - 2009 1: The sliding window. y U y U y U NASSP Masters 5003F - Computational Astronomy - 2009 1: The sliding window. Same thing. • For each position of the sliding window, a single number U is calculated from the values falling within the window. • The output is a map of the U values. • The intent is to: – Raise the signal-to-noise – Improve sensitivity – Amplify the sources at the expense of the noise. • Sliding-window processing only has value when the source has a width > 1 pixel. • Edges need special treatment. NASSP Masters 5003F - Computational Astronomy - 2009 1: Window functions • A weighted sum (= a convolution). – Simplest with all weights = 1: “sliding box”. – Optimum weights – a “matched filter”: • For uniform Gaussian noise, wopt = s. • Trickier to optimize for Poisson noise. • Per-window null-hypothesis χ2. – With either an independent value of bkg (in which case degrees of freedom = number of pixels Nw in the window), or… – …one fitted from the data (deg free = Nw-1). • Likelihood (same bkg provisions as χ2). NASSP Masters 5003F - Computational Astronomy - 2009 1: Window functions Parent function Data NASSP Masters 5003F - Computational Astronomy - 2009 1: Window functions Parent function Chi squared, size=100 Matched filter, size=10 Log-likelihood, size=100 NASSP Masters 5003F - Computational Astronomy - 2009 2: Peak finding Gaussian noise, convolved with a gaussian filter. …don’t get the gaussians mixed up! NASSP Masters 5003F - Computational Astronomy - 2009 2: Peak finding • How best to do it? • There’s no single neat prescription. • Naive prescription: – Pixel i is a peak pixel if yi > any other y within a patch of pixels from i-j to i+j. • This probably looks familiar to you. • But what value to choose for j? • Things to avoid are: – j too small – results in more than 1 peak per source; – j too large – misses a close adjacent source. NASSP Masters 5003F - Computational Astronomy - 2009 2: Peak finding Box too small: Box too large: NASSP Masters 5003F - Computational Astronomy - 2009 3: Decision time – is it a source or not? • To calculate a P-value we need the probability distribution of peaks in the postwindow map of U values (given the null hypothesis). • This is not the same as the probability distribution of the original data values… • …nor is it even the same as the probability distribution of U values. • In fact, little work seems to have been done on ppeaks. (Though there is quite a lot on the distribution of extrema – not quite the same thing.) NASSP Masters 5003F - Computational Astronomy - 2009 3: The decision ‘Map’ vs ‘peak’ distributions for Gaussian noise. Black: all pixels Red: peaks NASSP Masters 5003F - Computational Astronomy - 2009 3: Cash to the rescue • A practical recipe for applying Cash to source detection goes as follows: – Choose a window area surrounding each peak. – Within this window, calculate Lnull with model mi = bi (the background map values). – Calculate Lbest by fitting a model mi = bi + θ1 s(ri – θr) • Degrees of freedom ν = 1 (the amplitude) + d (the dimensions of the spatial fit). – The Cash statistic 2(Lbest-Lnull) behaves like χ2 with 1+d deg. free. NASSP Masters 5003F - Computational Astronomy - 2009 3: Cash to the rescue • The only difficult point (which is a problem for every method) is to calculate the fraction of pixels which are peaks. – Monte Carlo – Possibly a Fourier technique? • Also, don’t want to use the fit for final parameter values. A Mighell fit is better. From my 2009 Cash paper. NASSP Masters 5003F - Computational Astronomy - 2009 What is the best detection method? From my 2009 Cash paper. NASSP Masters 5003F - Computational Astronomy - 2009

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download ppt