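Functions of random variables
• Sometimes what we can measure is not what we are interested in!
• Example: mass of binary-star system:
$$M = \frac{V^2 a}{G} = \frac{V^3 P}{2\pi G}$$
• We want $M$ but can only measure $V$ and $P$.
• Must conserve probability: for $Y = g(X)$,
$$f(Y)\,dY = f(X)\,dX$$
[Figure: $f(X)$ on the $X$ axis mapped onto $f(Y)$ on the $Y$ axis through the curve $Y = g(X)$, with local slope $dY/dX$.]

Non-linear transformations
• e.g. Flux distributions vs. wavelength, frequency:
$$f_\nu(\nu)\,d\nu = f_\lambda(\lambda)\,d\lambda, \qquad \lambda = \frac{c}{\nu}, \qquad \left|\frac{d\lambda}{d\nu}\right| = \frac{c}{\nu^2}, \qquad f_\nu(\nu) = \frac{c}{\nu^2}\,f_\lambda(\lambda)$$
• Fluxes and magnitudes: $M = -2.5 \log X$
– Gaussian distribution: $X \sim G(X_0, \sigma^2)$
– Nonlinear transformation induces a bias $a$:
$$\langle M \rangle = -2.5 \log X_0 + a$$
– PROBLEM: evaluate $a$, $\sigma(M)$ in terms of $X_0$, $\sigma$.
[Figure: $f(X)$ mapped onto $f(M)$ through $M = -2.5 \log X$.]

Nonlinear transformations bias the mean
• To find $\langle Y \rangle$, use Taylor expansion around $X = \langle X \rangle$:
$$Y = g(X) = g(\langle X \rangle) + g'(\langle X \rangle)\,(X - \langle X \rangle) + \frac{1}{2}\,g''(\langle X \rangle)\,(X - \langle X \rangle)^2 + \dots$$
• Hence
$$\langle Y \rangle = \langle g(X) \rangle = g(\langle X \rangle) + g'(\langle X \rangle)\,\underbrace{\langle X - \langle X \rangle \rangle}_{=\,0} + \frac{1}{2}\,g''(\langle X \rangle)\,\langle (X - \langle X \rangle)^2 \rangle + \dots$$
$$\langle Y \rangle = g(\langle X \rangle) + \frac{1}{2}\,g''(\langle X \rangle)\,\sigma^2(X) + \dots$$
The term $\frac{1}{2}\,g''(\langle X \rangle)\,\sigma^2(X)$ is the bias.
[Figure: $f(X)$ mapped onto $f(Y)$ through $Y = g(X)$.]

Variance of a transformed variable
• Get variance of $Y$ from first principles:
$$\sigma^2(Y) = \langle (Y - \langle Y \rangle)^2 \rangle$$
$$= \left\langle \left[ g(\langle X \rangle) + g'(\langle X \rangle)(X - \langle X \rangle) + \frac{1}{2}\,g''(\langle X \rangle)(X - \langle X \rangle)^2 + \dots - g(\langle X \rangle) - \frac{1}{2}\,g''(\langle X \rangle)\,\sigma^2(X) - \dots \right]^2 \right\rangle$$
$$= \left\langle \left[ g'(\langle X \rangle)(X - \langle X \rangle) + \dots \right]^2 \right\rangle \approx \left[ g'(\langle X \rangle) \right]^2 \sigma^2(X)$$

What is a statistic?
• Anything you measure or compute from the data.
• Any function of the data.
• Because the data “jiggle”, every statistic also “jiggles”.
• Example: the mean value of a sample of $N$ data points is a statistic:
$$\bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i$$
• It has a definite value for a particular dataset, but it also “jiggles” with the ensemble of datasets to trace out its own PDF.
• NB: $\bar{X} \neq \langle X \rangle$

Sample mean and variance - 1
• Sample mean:
$$\bar{X} \equiv \frac{1}{N} \sum_{i=1}^{N} X_i$$
• The distribution of sample means has a mean:
$$\langle \bar{X} \rangle = \left\langle \frac{1}{N} \sum_i X_i \right\rangle = \frac{1}{N} \left\langle \sum_i X_i \right\rangle = \frac{1}{N} \sum_i \langle X_i \rangle$$
• ...and a variance:
$$\sigma^2(\bar{X}) = \sigma^2\!\left( \frac{1}{N} \sum_i X_i \right) = \frac{1}{N^2}\,\sigma^2\!\left( \sum_i X_i \right) = \frac{1}{N^2} \sum_i \sigma^2(X_i)$$
if the $X_i$ are independent.

Sample mean and variance - 2
• If the $X_i$ are all drawn from a single parent distribution with mean $\langle X \rangle$ and variance $\sigma^2$, then:
$$\langle \bar{X} \rangle = \frac{1}{N} \sum_{i=1}^{N} \langle X \rangle = \langle X \rangle,$$
i.e. $\bar{X}$ is an unbiased estimator of $\langle X \rangle$.
• And:
$$\sigma^2(\bar{X}) = \frac{1}{N^2} \sum_i \sigma^2(X_i) = \frac{N \sigma^2}{N^2} = \frac{\sigma^2}{N}, \qquad \sigma(\bar{X}) = \frac{\sigma(X_i)}{\sqrt{N}},$$
i.e. $\bar{X}$ “jiggles” much less than a single data value $X_i$ does.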
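The Taylor-expansion results for the bias and variance of a transformed variable can be checked numerically on the magnitude example, M = -2.5 log X with Gaussian X. The sketch below is only an illustration under stated assumptions: "log" is taken to be base 10, the values X0 = 100 and sigma = 5, the seed, and the sample size are made up, and `delta_method` is a hypothetical helper name rather than anything from the slides. It compares the second-order bias and first-order standard deviation with a Monte Carlo estimate.

```python
import math
import random

def delta_method(g1, g2, sigma_x):
    """Approximate bias and standard deviation of Y = g(X).

    g1, g2: first and second derivatives of g evaluated at <X>.
    Returns (bias, sigma_y) with bias = g'' sigma^2 / 2 and sigma_y = |g'| sigma.
    """
    bias = 0.5 * g2 * sigma_x ** 2
    sigma_y = abs(g1) * sigma_x
    return bias, sigma_y

# Assumed illustrative values (not from the slides): X ~ G(X0, SIGMA^2)
X0, SIGMA = 100.0, 5.0
LN10 = math.log(10.0)

# Derivatives of g(X) = -2.5 log10(X) at X = X0 (base 10 is an assumption)
g1 = -2.5 / (X0 * LN10)
g2 = 2.5 / (X0 ** 2 * LN10)
bias, sigma_m = delta_method(g1, g2, SIGMA)

# Monte Carlo check: draw fluxes, convert to magnitudes, measure mean and scatter
rng = random.Random(42)
n_draws = 200_000
mags = [-2.5 * math.log10(rng.gauss(X0, SIGMA)) for _ in range(n_draws)]
mc_mean = math.fsum(mags) / n_draws
mc_bias = mc_mean - (-2.5 * math.log10(X0))
mc_sigma = math.sqrt(math.fsum((m - mc_mean) ** 2 for m in mags) / n_draws)

print("bias:  Taylor", bias, " Monte Carlo", mc_bias)
print("sigma: Taylor", sigma_m, " Monte Carlo", mc_sigma)
```

Under these assumptions the Taylor results reduce to a ≈ (1.25 / ln 10)(sigma / X0)^2 and sigma(M) ≈ (2.5 / ln 10)(sigma / X0). The approximation should degrade as sigma / X0 grows, because the neglected higher-order terms stop being small.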
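Other unbiased statistics
• Sample median (half points above, half below)
• $(X_{\max} + X_{\min}) / 2$
• Any single point $X_i$ chosen at random from sequence
• Weighted average:
$$\frac{\sum_i w_i X_i}{\sum_i w_i}$$

Inverse variance weighting is best!
• Let’s evaluate the variance of the weighted average for some weighting function $w_i$:
$$\sigma^2\!\left( \frac{\sum_i w_i X_i}{\sum_i w_i} \right) = \frac{\sigma^2\!\left( \sum_i w_i X_i \right)}{\left( \sum_i w_i \right)^2} = \frac{\sum_i w_i^2\,\sigma^2(X_i)}{\left( \sum_i w_i \right)^2} = \frac{\sum_i w_i^2 \sigma_i^2}{\left( \sum_i w_i \right)^2}.$$
• The variance of the weighted average is minimised when:
$$w_i = \frac{1}{\mathrm{Var}(X_i)} = \frac{1}{\sigma_i^2}.$$
• Let’s verify this -- it’s important!

Choosing the best weighting function
• To minimise the variance of the weighted average, set:
$$0 = \frac{\partial}{\partial w_k} \left[ \frac{\sum_i w_i^2 \sigma_i^2}{\left( \sum_i w_i \right)^2} \right] = \frac{2 w_k \sigma_k^2}{\left( \sum_i w_i \right)^2} - \frac{2 \sum_i w_i^2 \sigma_i^2}{\left( \sum_i w_i \right)^3}$$
$$\Rightarrow \quad w_k \sigma_k^2 = \frac{\sum_i w_i^2 \sigma_i^2}{\sum_i w_i} \quad \Rightarrow \quad w_k \propto \frac{1}{\sigma_k^2}.$$
(Note: $\sum_i w_i^2 \sigma_i^2 = \sum_i w_i$ for $w_i = 1/\sigma_i^2$.)

Using optimal weights
• Good principles for constructing statistics:
– Unbiased -> no systematic error
– Minimum variance -> smallest possible statistical error
• Optimally (inverse-variance) weighted average:
$$\hat{X} \equiv \frac{\sum_i w_i X_i}{\sum_i w_i} = \frac{\sum_i X_i / \sigma_i^2}{\sum_i 1 / \sigma_i^2}$$
• Is unbiased, since:
$$\langle \hat{X} \rangle = \frac{\sum_i \langle X_i \rangle / \sigma_i^2}{\sum_i 1 / \sigma_i^2} = \langle X \rangle$$
• And has minimum variance:
$$\sigma^2(\hat{X}) = \frac{1}{\sum_i 1 / \sigma_i^2}$$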
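The claim that inverse-variance weights minimise the variance of the weighted average can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions: the data values, the per-point errors (1, 2 and 4), the seed, and the function names `weighted_mean` and `variance_of_weighted_mean` are all hypothetical choices for illustration, and the measurements are assumed independent.

```python
import math
import random

def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its standard deviation."""
    weights = [1.0 / s ** 2 for s in sigmas]
    wsum = math.fsum(weights)
    mean = math.fsum(w * x for w, x in zip(weights, values)) / wsum
    return mean, math.sqrt(1.0 / wsum)

def variance_of_weighted_mean(weights, sigmas):
    """Variance of sum(w_i X_i) / sum(w_i) for independent X_i."""
    numerator = math.fsum((w * s) ** 2 for w, s in zip(weights, sigmas))
    return numerator / math.fsum(weights) ** 2

# Assumed per-point errors and data values (illustrative only)
sigmas = [1.0, 2.0, 4.0]
values = [10.0, 12.0, 14.0]

optimal = [1.0 / s ** 2 for s in sigmas]   # w_i = 1 / sigma_i^2
equal = [1.0] * len(sigmas)                # plain unweighted average

var_opt = variance_of_weighted_mean(optimal, sigmas)
var_equal = variance_of_weighted_mean(equal, sigmas)

# Try many random positive weight sets; none should beat the optimal weights
rng = random.Random(0)
random_trials = [
    variance_of_weighted_mean([rng.uniform(0.01, 5.0) for _ in sigmas], sigmas)
    for _ in range(1000)
]

x_hat, sigma_hat = weighted_mean(values, sigmas)
print("optimal variance:", var_opt, " equal-weight variance:", var_equal)
print("smallest variance among random weights:", min(random_trials))
print("weighted mean:", x_hat, "+/-", sigma_hat)
```

With these errors the equal-weight average is dominated by the noisiest point, so its variance is larger than that of the inverse-variance average. The random trials are a spot check, not a proof; the derivative argument gives the general result.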