Sampling Theory
Dennis Sun
Data 301

The Need for Theory

When we were testing whether a coin was fair, the null hypothesis completely specified the box model: a box containing the tickets 0 and 1. This allowed us to simulate from the box model and see whether the observed data were consistent with those simulations.

Now suppose we have an instrument that measures the height of a mountain. The measurements are centered around the true height, with a standard deviation of 10 feet. We make 15 measurements of the height of Mt. Everest and find that the average is 29,023 feet. The official height is 29,029 feet. Has Mt. Everest gotten shorter?

The null hypothesis is that Mt. Everest has not gotten shorter. What does this tell us about the box?

[ ? ? ... ? ]   µ = 29029, σ = 10

We don’t actually know the tickets in the box, so we can’t simulate from the box model. We need theory!

Central Limit Theorem

[Figure: distribution of the mean of 100 draws from the box with tickets 0 and 1]

If you take the mean of n draws from any box, the distribution of the means will be approximately Normal(µ, σ/√n) when n is large, where µ and σ are the mean and SD of the box.

• The Central Limit Theorem says that the mean of 100 draws from the box with tickets 0 and 1 is approximately Normal(0.5, 0.5/√100) = Normal(0.5, 0.05).
• Instead of simulating from the box model, we can simulate from the normal distribution.
• Using the normal approximation, the probability of observing 60 or more heads is 2.3%. (The exact answer is 2.8%.)

The benefit of the Central Limit Theorem is that you just need to know the mean and SD of the box. You don’t need to know the tickets in the box! This allows us to answer the Everest problem, where we don’t know the exact composition of the box, but we do know its mean and SD under the null hypothesis.
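The two calculations above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the helper names `normal_cdf` and `clt_sd` are hypothetical, and the Everest part assumes a one-sided test (a lower-tail probability, since the question is whether the mountain has gotten shorter). Only the standard library is used (Python 3.8+ for `math.comb`).

```python
import math


def normal_cdf(x, mean=0.0, sd=1.0):
    """P(X <= x) for X ~ Normal(mean, sd), computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def clt_sd(box_sd, n):
    """SD of the mean of n draws from a box with the given SD (sigma / sqrt(n))."""
    return box_sd / math.sqrt(n)


# Coin example: the box holds the tickets 0 and 1, so its mean and SD are both 0.5.
n_flips = 100
coin_mean, coin_sd = 0.5, 0.5

# Normal approximation: probability that the mean of 100 draws is at least 0.60.
p_normal = 1.0 - normal_cdf(0.60, coin_mean, clt_sd(coin_sd, n_flips))

# Exact binomial probability of 60 or more heads in 100 fair flips.
p_exact = sum(math.comb(n_flips, k) for k in range(60, n_flips + 1)) / 2 ** n_flips

# Everest example: only the mean and SD of the box are known under the null.
null_mean, box_sd, n_measurements, observed_mean = 29029, 10, 15, 29023

# Assumption: one-sided test, so we look at the lower tail of the null distribution.
p_everest = normal_cdf(observed_mean, null_mean, clt_sd(box_sd, n_measurements))

print(f"coin, normal approximation: {p_normal:.4f}")
print(f"coin, exact binomial:       {p_exact:.4f}")
print(f"Everest, lower-tail p:      {p_everest:.4f}")
```

Under these assumptions, the Everest calculation gives a lower-tail probability of about 1%: if the true height were still 29,029 feet, an average this low from 15 measurements would be unusual.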