COMMON STATISTICAL TEST PROBLEMS

Tests dealing with the mean of data samples
  - Testing a sample mean: Is it equal to, larger than, or smaller than a prescribed value?
  - Comparing two sample sets: Are the mean values different?
  - Comparing paired samples: Are the differences equal to, larger than, or smaller than a certain value?

Tests dealing with correlation coefficients
  - Testing the correlation coefficient obtained from two paired samples: Is the correlation equal to 0, larger than 0, or smaller than 0?

Tests dealing with the variance of the samples
  - Testing a single sample variance: Is the variance equal to, greater than, or smaller than a prescribed value?
  - Testing the ratio between the estimated variances from two sample sets: Are the variances equal? Is the ratio between the variances equal to 1, greater than 1, or smaller than 1?

Tests dealing with regression parameters
  - Testing a simple linear regression model:
    (a) Is the regression coefficient different from 0, greater than 0, or smaller than 0?
    (b) With multiple predictors: Are all regression coefficients as a whole significantly different from 0? Which individual regression parameters are different from 0?

Testing the significance of the differences in the speed (of the Starling bird flying through a corridor with striped walls)

  Experiment            Mean speed    Sample size n    Standard deviation (guessed)
  Horizontal stripes    16.5 ft/s     10               1.5
  Vertical stripes      15.3 ft/s     10               1

Step 1: Identifying the type of statistical test
We want to test the difference in the two mean values. The test compares two estimated means.
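As a rough numerical sketch of the starling comparison set up above, the snippet below computes a two-sample test statistic from the table values. It assumes the standard large-sample form z = (mean1 - mean2) / sqrt(s1^2/n1 + s2^2/n2) with a standard Gaussian reference distribution. The function name `two_sample_z` and the choice of a two-sided p-value are illustrative assumptions, not taken from the slides. With only 10 flights per experiment the Gaussian approximation is crude, so the numbers should be read as indicative only.

```python
import math
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Difference of two independent sample means divided by its standard error."""
    # Variance of a difference of independent means is the sum of the variances.
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Values from the starling table (standard deviations were guessed in the source).
z = two_sample_z(16.5, 1.5, 10, 15.3, 1.0, 10)

# Two-sided p-value under the (assumed) standard Gaussian null distribution.
p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(z)))

print(f"z = {z:.3f}, two-sided p = {p_two_sided:.4f}")
```

For samples this small, a t reference distribution would normally be preferred over the Gaussian one used here.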
[Both are random variables with an underlying probability density function (PDF).]
The variances of the samples (and the variances of the means) are also unknown and must be estimated from the data.
The samples are not paired (the experiments were all done independently).

The appropriate test is "a test for the differences of means under independence" (or "comparing two independent population means with unknown population standard deviations").

The null hypothesis is H0: the average speed is the same in both experiments.
If H0 is true, then the random variable z is a realization from a population with an approximately standard Gaussian distribution.*
*Note: only for large sample sizes n1 and n2.

The classical Student+ t-test*

Testing if Albany temperature anomalies from 1950-1980 were different from 0.

[Figure: January 1950-1980 anomalies with respect to the 1981-2010 climatological mean. Dashed line: theoretical probability density function of our test variable. Marked value: the test value calculated from the sample.]

If H0 were true, our test value would be a random sample from this distribution, so we would expect it to be close to zero. The further our test value lies in the tails of the distribution, the less likely it is to be part of the distribution.

* 'Student' (1908a). The probable error of a mean. Biometrika, 6, 1-25.
+ William S. Gosset: 'He received a degree from Oxford University in Chemistry and went to work as a "brewer" in 1899 at Arthur Guinness Son and Co. Ltd. in Dublin, Ireland' (Steve Fienberg. "William Sealy Gosset" (version 4). StatProb: The Encyclopedia Sponsored by Statistics and Probability Societies.
Freely available at http://statprob.com/encyclopedia/WilliamSealyGOSSET.html)

Testing if Albany temperature anomalies from 1950-1980 were different from zero: annual mean 1950-1980 anomalies with respect to the 1981-2010 climatological mean.

Test variable (the test value calculated from the sample):

\[ t = \frac{\bar{x} - \mu_0}{\sqrt{S_x^2 / n}} \]

  \bar{x} : sample mean
  n       : sample size
  \mu_0   : population mean (here \mu_0 = 0)
  S_x^2   : sample variance

The test variable t is calculated from a random sample. Like any other quantity estimated from random samples, it is a random variable drawn from a theoretical population with its own probability distribution (for this test, Student's t-distribution with n - 1 degrees of freedom).

Testing H0: Albany (New York Central Park) temperature anomalies from 1950-1980 are not different from 0.

[Figure: panels "Albany 1950-1980 Jan" and "NYC 1950-1980 Jan". Solid lines: cumulative distribution function of the test variable if H0 is true.]

Alternative hypothesis: the mean anomaly was less than 0 (i.e. it was colder in 1950-1980 than in 1981-2010).
Choose a significance level of 5% for a one-sided t-test.

[Figure: the same two panels with the 0.05 level marked.]

  Albany 1950-1980 Jan: Reject H0! Accept the alternative!
  NYC 1950-1980 Jan: Accept H0! (That is, H0 cannot be rejected.)

The single-sided t-test

Null hypothesis H0: Albany temperature anomalies from 1950-1980 are not different from 0.
Alternative hypothesis Ha: temperature anomalies were negative.*

[Figure: probability density function of t, with tcrit and 0 marked on the t axis. The area under the curve to the left of tcrit gives the probability P(t < tcrit).]

*Note that we formed anomalies with respect to the 1981-2010 climatology. Thus we test if 1950-1980 was significantly cooler than 1981-2010.
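The one-sided test described above can be sketched in plain Python. The anomalies below are synthetic placeholders (31 made-up values, one per year from 1950 to 1980), not the Albany record, and the helper names `t_statistic`, `t_pdf`, and `t_cdf` are hypothetical. The cumulative distribution is obtained by integrating the Student t density with Simpson's rule, which is one simple way to avoid external libraries.

```python
import math
import random
import statistics


def t_statistic(sample, mu0=0.0):
    """One-sample t value: (sample mean - mu0) / sqrt(sample variance / n)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    variance = statistics.variance(sample)  # unbiased estimate, divides by n - 1
    return (mean - mu0) / math.sqrt(variance / n)


def t_pdf(t, df):
    """Probability density of Student's t-distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1.0 + t * t / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T < t), by Simpson's rule on [0, |t|] plus the symmetry of the density."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3.0
    return 0.5 + area if t >= 0 else 0.5 - area


# Synthetic anomalies for illustration only (NOT observed temperatures).
random.seed(0)
anomalies = [random.gauss(-1.0, 2.5) for _ in range(31)]

alpha = 0.05
t_value = t_statistic(anomalies, mu0=0.0)
p_lower = t_cdf(t_value, df=len(anomalies) - 1)  # area in the lower tail

# One-sided test with Ha: mean anomaly < 0.
decision = "reject H0" if p_lower < alpha else "cannot reject H0"
print(f"t = {t_value:.3f}, P(T < t) = {p_lower:.4f}: {decision}")
```

Comparing `p_lower` with `alpha` is equivalent to checking whether the calculated t falls to the left of tcrit.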
[Figure: the same density, with tcrit, the calculated t, and 0 marked on the t axis.]

We reject the null hypothesis if the calculated t-value falls into the tail of the distribution. The significance level α is usually chosen to be small: 0.1, 0.05, and 0.01 are typical values. We then say: "We reject the null hypothesis at the significance level of 10% (5%) (1%)." The area under the curve to the left of tcrit gives the probability P(t < tcrit), which equals the chosen significance level.

The two-sided t-test

Null hypothesis H0: Albany temperature anomalies from 1950-1980 are not different from 0.
Alternative hypothesis Ha: temperature anomalies were different from zero.

[Figure: probability density function of t, with -tcrit, 0, and +tcrit marked on the t axis. The area under the curve to the left of -tcrit gives the probability P(t < -tcrit); the area to the right of +tcrit gives the probability P(t > +tcrit).]

[Figure: the same density with tcrit, 0, and the calculated t marked on the t axis.]
We cannot reject H0 at the two-sided significance level of 'p' percent (e.g. 5%).

TESTING A NULL HYPOTHESIS

  Hypothesis/Conclusion       Null hypothesis H0 true          Null hypothesis H0 false
  Null hypothesis accepted    Correct decision                 False decision (Type II error)
  Null hypothesis rejected    False decision (Type I error)    Correct decision

TEST FOR DIFFERENCES IN THE MEAN

[Figure 5.1 from Wilks, "Statistical Methods in the Atmospheric Sciences" (2006), with the calculated test value marked.]
First case: here we would reject H0 at the given significance level (α = 0.05).
Second case: here we would accept H0 at the given significance level (α = 0.05).

Error probabilities for the decision table:
  - Type I error (H0 is true but rejected): the probability of this error is given by the significance level (α, "alpha").
  - Type II error (H0 is false but accepted): the probability of this type of error is usually hard to quantify (β, "beta").
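A small Monte Carlo experiment can make the two error types concrete. The sketch below repeatedly draws synthetic Gaussian samples of size 31, applies a two-sided t-test at α = 0.05, and counts how often H0 is rejected when it is true (Type I error) and accepted when it is false (Type II error). The critical value 2.042 is the standard tabulated two-sided 5% value for 30 degrees of freedom. The sample size, the unit standard deviation, the true mean shift of 0.5, and all function names are assumptions chosen for illustration.

```python
import math
import random
import statistics

T_CRIT = 2.042  # two-sided 5% critical value for 30 degrees of freedom (t table)


def rejects_h0(sample, mu0=0.0, t_crit=T_CRIT):
    """Two-sided one-sample t-test: True if |t| exceeds the critical value."""
    n = len(sample)
    t = (statistics.fmean(sample) - mu0) / math.sqrt(statistics.variance(sample) / n)
    return abs(t) > t_crit


def rejection_rate(true_mean, n=31, trials=5000, seed=1):
    """Fraction of synthetic samples for which H0: mean = 0 is rejected."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if rejects_h0(sample):
            rejected += 1
    return rejected / trials


# H0 true (mean really is 0): every rejection is a Type I error.
alpha_hat = rejection_rate(true_mean=0.0)

# H0 false (assumed true mean 0.5): every non-rejection is a Type II error.
beta_hat = 1.0 - rejection_rate(true_mean=0.5)

print(f"estimated Type I error rate:  {alpha_hat:.3f}")
print(f"estimated Type II error rate: {beta_hat:.3f}")
```

The estimated Type I rate should land near the chosen 0.05, while the Type II rate depends on the assumed true mean, which is exactly why β is hard to quantify in practice: the true state under Ha is not known.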