Basic Time Series
Analyzing variable star data for the amateur astronomer

What Is a Time Series?
- A single variable x that changes over time t (there can be multiple variables)
- Light curve: x = brightness (magnitude)
- Each observation consists of two numbers
- The time t is considered perfectly precise
- The data value x (observation/measurement/estimate) is not perfectly precise

Two Meanings of "Time Series"
- A TS is a process: how the variable changes over time
- A TS is also an execution of the process, often called a realization of the TS process
- A realization (observed TS) consists of pairs of numbers (tn, xn), one such pair for each observation

Goals of TS Analysis
- Use the process to define the behavior of its realizations
- Use a realization, i.e. the observed data (tn, xn), to discover the process; this is our main goal

Special Needs
- Astronomical data create special circumstances for time series analysis
- Mainly because the data are irregularly spaced in time (uneven sampling)
- Sometimes the time spacing (the "sampling") is even pathological, with big gaps that have periods all their own

Analysis Step 1
- Plot the data and look at the graph! (visual inspection)
- The eye+brain combination is the world's best pattern-recognition system
- BUT it is also the most easily fooled ("pictures in the clouds")
- Use visual inspection to get ideas; confirm them with numerical analysis

Data = Signal + Noise
- The true brightness is a function of time f(t); it's probably smooth (or nearly so)
- There's some measurement error ε; it's random, and almost certainly not smooth
- Additive model: the datum xn at time tn is the sum of signal f(tn) and noise εn:
  xn = f(tn) + εn

Noise Is Random
- That's its definition!
- Deterministic part = signal; random part = noise
- Usually the true brightness is deterministic, therefore it's the signal
- Usually the noise is measurement error

Achieve the Goal
- We have to figure out how the signal behaves and how the noise behaves
- For light curves, we usually just assume how the noise behaves
- But we still should determine its parameters

What Determines "Random"?
- A probability distribution (pdf or pmf)
- pdf: the probability that the value falls in a small range of width dε, centered on ε, is P(ε) dε
- pmf: the probability that the value is ε is P(ε)
- The pdf/pmf has some mean value μ and some standard deviation σ

Most Common Noise Model
- i.i.d. = "independent, identically distributed"
- Each noise value is independent of the others: P12(x1, x2) = P1(x1) P2(x2)
- They're all identically distributed: P1(x1) = P2(x2)

What Is the Distribution?
- Most common is Gaussian (a.k.a. normal):
  P(ε) = [1 / √(2πσ²)] exp[ -(ε - μ)² / (2σ²) ]

Noise Parameters
- μ = mean = <ε>; usually assumed to be zero (i.e., the data are unbiased)
- σ² = variance = <(ε - μ)²>
- σ = √(σ²) = standard deviation
- A typical value is 0.2 mag for visual data; smaller for CCD/photoelectric (we hope!)
- Note: don't disparage visual data; what they lack in individual precision they make up for by the power of sheer numbers

Is the Default Noise Model Right?
- No! We know it's wrong
- Bias: the μ values are not zero
- NOT identically distributed: different observers have different μ and σ values
- Sometimes not even independent (autocorrelated noise)
- BUT i.i.d. Gaussian is still a useful working hypothesis

Even If …
- Even if we know the form of the noise, we still have to figure out its parameters
- Is it unbiased (i.e., centered at zero, so μ = 0)?
- How big does it tend to be (what's σ)?

And …
- We still have to separate the signal from the noise
- And of course figure out the form of the signal, i.e., figure out the process which determines the signal
- Whew!
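To make the additive model and the i.i.d. Gaussian noise assumption concrete, here is a minimal Python sketch (not part of the original slides) that simulates an unevenly sampled light curve as signal plus noise. The sinusoidal signal, its period and semi-amplitude, the mean magnitude, and the observation times are made-up illustrative values; only the 0.2 mag noise level comes from the slides.

    import numpy as np

    rng = np.random.default_rng(42)

    # Unevenly sampled observation times (days), mimicking irregular coverage
    t = np.sort(rng.uniform(0.0, 300.0, size=150))

    # Assumed signal: a sinusoid with a made-up period and semi-amplitude
    period = 27.3          # days (illustrative, not from the slides)
    semi_amplitude = 0.6   # magnitudes
    mean_mag = 9.5
    signal = mean_mag + semi_amplitude * np.sin(2.0 * np.pi * t / period)

    # i.i.d. Gaussian noise with sigma = 0.2 mag, the "typical" visual scatter
    noise = rng.normal(loc=0.0, scale=0.2, size=t.size)

    # Additive model: data = signal + noise
    x = signal + noise

Plotting x against t (step 1 of the analysis) would show the periodic variation partly hidden by the scatter.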
Simplest Possible Signal
- None at all! f(t) = constant = β0
- This is the null hypothesis for many tests
- But we can't be sure f(t) is constant; that's only a model of the signal

Separate Signal from Noise
- We already said data = signal + noise
- Therefore data - signal = noise
- Approximate the signal by a model; approximate the noise by the residuals
- data - model = residuals: xn - yn = Rn
- If the model is correct, the residuals are all noise

Estimate Noise Parameters
- Use the residuals Rn to estimate the noise parameters
- Estimate the mean μ by the average: <R> = (1/N) Σj Rj
- Estimate the standard deviation σ by the sample standard deviation: s = √[ Σj (Rj - <R>)² / (N - 1) ]

Averages
- When we average i.i.d. noise we expect to get the mean
- The standard deviation of the average (usually called the standard error) is less than the standard deviation of the data:
  σ(ave) = "s.e." = σ(raw) / √N

Confidence Interval
- The 95% confidence interval is the range in which we expect the average to lie, 95% of the time
- It is about 2 standard errors above or below the expected value:
  95% C.I. = <x> ± 2σ(ave) = <x> ± 2σ(raw)/√N

Does the Average Change?
- Divide time into bins, usually of equal time width (often 10 days), sometimes of equal number of data N
- Compute the average and standard deviation within each bin
- IF the signal is constant AND the noise is consistent, THEN the expected value of the data average will be constant
- So: do the "bin averages" show more variation than is expected from noise?

ANOVA Test
- Compare the variance of the averages to the variance of the data (ANalysis Of VAriance = ANOVA)
- In other words, compare the variance between bins to the variance within bins
- The "F-test" gives a "p-value": the probability of getting that result IF the data are just noise
- A low p-value means the data are probably NOT just noise
- Either we haven't found all the signal, or the noise isn't the simple kind
- (A code sketch of this binned-average test appears further below.)

ANOVA Test: first example (no detectable change)

  averages   Fstat      df.between   df.within   p          result
  50-day     0.315563   2            147         0.729871   NOT significant
  10-day     0.728138   14           135         0.743133   NOT significant

ANOVA Test: second example (significant change)

  averages   Fstat      df.between   df.within   p          result
  50-day     13.25758   2            147         5e-06      IS significant
  10-day     2.546476   14           135         0.002879   IS significant

Averages Rule!
- An excellent way to reduce the noise, because σ(ave) = σ(raw) / √N
- An excellent way to measure the noise
- Very little change to the signal, unless the signal changes faster than the averaging time
- So in most cases averages smooth the data, i.e., reduce the noise but not the signal

Decompose the Signal
- Additive model: a sum of component signals
- A non-periodic part, sometimes called the trend, sometimes called the secular variation
- A repeating (periodic) part, or an almost-periodic (pseudoperiodic) part; there can be multiple periodic parts (multiperiodic)
- f(t) = S(t) + P(t)

Periodic Signal
- Discover that it's periodic!
- Find the period P, or the frequency ν: Pν = 1, so ν = 1/P and P = 1/ν
- Find the amplitude A = size of the variation; often A denotes the semi-amplitude, which is half the full amplitude
- Find the waveform (i.e., the cycle shape)

Periodogram
- Searches for periodic behavior
- Test many frequencies (i.e., many periods); for each frequency, compute a power
- Higher power means it's more likely that the data are periodic with that frequency (that period)
- A plot of power vs. frequency is a periodogram, a.k.a. a power spectrum
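The constant-model residuals, the noise-parameter estimates, and the binned-average ANOVA test described above can be sketched in a few lines of Python. NumPy and SciPy's f_oneway are my tool choices (the slides don't name any software), and the simulated "constant signal plus noise" data are made up for illustration.

    import numpy as np
    from scipy.stats import f_oneway   # SciPy's one-way ANOVA F-test

    # Simulated data: a constant signal plus i.i.d. Gaussian noise (made-up values)
    rng = np.random.default_rng(1)
    t = np.sort(rng.uniform(0.0, 150.0, size=150))   # observation times, days
    x = 9.5 + rng.normal(0.0, 0.2, t.size)           # magnitudes

    # Model: f(t) = constant (the null hypothesis); residuals R = data - model
    model = np.mean(x)
    R = x - model

    # Noise-parameter estimates from the residuals
    mu_hat = np.mean(R)                  # estimate of mu (zero here by construction)
    s = np.std(R, ddof=1)                # sample standard deviation, estimate of sigma
    se = s / np.sqrt(R.size)             # standard error of the average
    ci = (np.mean(x) - 2.0 * se, np.mean(x) + 2.0 * se)   # approximate 95% C.I.

    # Bin the data into 10-day bins and compare between-bin to within-bin variance
    bin_width = 10.0
    bin_index = np.floor((t - t[0]) / bin_width).astype(int)
    groups = [x[bin_index == b] for b in np.unique(bin_index)
              if np.sum(bin_index == b) >= 2]

    fstat, pvalue = f_oneway(*groups)
    print(f"F = {fstat:.3f}, p = {pvalue:.4f}")   # high p-value: consistent with "just noise"

Because the simulated signal really is constant, the p-value should usually come out well above 0.05; replacing the constant with a slow trend or a sinusoid should drive it down.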
Periodograms
- Fourier analysis: the Fourier periodogram
- Don't use the DFT or FFT, because of the uneven time sampling
- Instead use the Lomb-Scargle modified periodogram, OR the DCDFT (date-compensated discrete Fourier transform)
- Folded light curve: the AoV periodogram
- Many more exist; these are the most common
- [The original slides show example plots here: a DCDFT periodogram and an AoV periodogram]

Lots, Lots More …
- Non-periodic signals
- Signals that are periodic but not perfectly periodic (the parameters are changing)
- What if the noise is something "different"?
- Come to the next workshop!

Enjoy Observing Variables
- See your own data used in real scientific studies (AJ, ApJ, MNRAS, A&A, PASP, …)
- Participate in monitoring and observing programs
- Assist in space science and astronomy
- Make your own discoveries!

http://www.aavso.org/
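As a rough illustration of a period search on unevenly sampled data, here is a sketch of a Lomb-Scargle periodogram followed by a folded light curve. The astropy library and the simulated light curve (the same made-up parameters as the first sketch) are my assumptions; the slides name the method but not any particular software.

    import numpy as np
    from astropy.timeseries import LombScargle   # library choice is an assumption

    # Unevenly sampled simulated light curve (made-up period, amplitude, noise)
    rng = np.random.default_rng(42)
    t = np.sort(rng.uniform(0.0, 300.0, size=150))   # days
    x = 9.5 + 0.6 * np.sin(2.0 * np.pi * t / 27.3) + rng.normal(0.0, 0.2, t.size)

    # Lomb-Scargle periodogram: a power for each of many trial frequencies
    frequency, power = LombScargle(t, x).autopower()

    # The highest peak marks the most likely frequency; the period is its reciprocal
    best_freq = frequency[np.argmax(power)]
    best_period = 1.0 / best_freq
    print(f"best period ~ {best_period:.2f} days")

    # Folded light curve: assign each observation a phase within the best period
    phase = (t / best_period) % 1.0
    # Plotting magnitude against phase (with the magnitude axis inverted, since a
    # smaller magnitude means brighter) reveals the waveform, i.e. the cycle shape.

Roughly speaking, an AoV periodogram applies the same kind of between-bin versus within-bin variance comparison as the earlier ANOVA test, but to phase bins of the folded light curve at each trial period.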