Transcript

Chapter 9 Sampling Distributions
AP Statistics
St. Francis High School
Fr. Chris, 2001

Two Key Ideas
A statistic is a random variable.
As such, its mean and standard deviation can be found by combining the basic random variables that make up the statistic.

Pick Pennies from a Hat
Recall how we did this. Try it again:
– Pick at random
– Note the year
– Compute the mean and standard deviation of your sample
– NEW: Compute what you think the mean and standard deviation of the entire hat are!

Formulas
$\mu_{\bar{x}} = \mu$, $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
$\mu_{\hat{p}} = p$, $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$

Statistic vs. Parameter
A statistic describes a sample and is used to estimate a parameter.
A parameter describes a population.
Which is a statistic, which is a parameter?
42% of today’s 15-year-old girls will get pregnant in their teens. 42%: parameter
37% said they would vote for Joan Smith; on election day 41% actually did. 37%: statistic, 41%: parameter
The NIH reports that the mean systolic blood pressure for males 35-44 years of age is 128 and the standard deviation is 15. 72 male stockbrokers in this age group have a mean blood pressure of 126.07. 128, 15: parameters; 126.07: statistic

Bias vs. Variability
Bias: Is your statistic centered on the population’s parameter?
Variability: Is your sampling distribution scattered or focused?
Identify the bias and variability of each:
[Four diagrams, each labeled "Population Parameter"]
What about your sample? Is it variable? Is it biased? How can you tell?
http://www.mathorama.com/stat/penny97hist.html
http://www.mathorama.com/stat/penny99hist.html

Confidence Intervals
Use your sample statistics and what you know of the Central Limit Theorem to make an assertion about the population parameter.
$\bar{x} \pm z \cdot (\text{std. error})$, where z is the z-score for the desired %
[Figure: standard normal curve $y = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$, x from -3 to 3]
By hand: http://www.mathorama.com/stat/Confidence.html
Computer Simulation: http://www.mathorama.com/stat/RandomSamp.html

What about a proportion?
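One way to explore that question is by simulation. The sketch below is a minimal illustration, not part of the original slides: it draws many samples, computes p-hat for each, and compares the simulated mean and standard deviation with the formulas $\mu_{\hat{p}} = p$ and $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$. The values p = 0.5 and n = 20 (a fair coin flipped 20 times), the number of trials, and the function name `simulate_p_hats` are all assumptions chosen for illustration.

```python
import random
import statistics
from math import sqrt


def simulate_p_hats(p, n, trials, seed=0):
    """Return `trials` simulated sample proportions, each from n yes/no draws."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.random() < p for _ in range(n)) / n
        for _ in range(trials)
    ]


# Illustrative values (assumptions): a fair coin flipped 20 times.
p, n = 0.5, 20
p_hats = simulate_p_hats(p, n, trials=5000)

sim_mean = statistics.mean(p_hats)
sim_sd = statistics.pstdev(p_hats)
formula_sd = sqrt(p * (1 - p) / n)  # sigma of p-hat from the formula

print(f"simulated mean of p-hat: {sim_mean:.4f} (formula: {p})")
print(f"simulated sd of p-hat:   {sim_sd:.4f} (formula: {formula_sd:.4f})")
```

The simulated values should land close to the formula values; increasing `trials` tightens the agreement.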
The Gallup poll asked a probability sample of 1785 adults whether they attended church or synagogue during the past week. Suppose 40% of all adults did attend. How likely is it that an SRS of 1785 would be within 3% of this actual value?
$\mu_{\hat{p}} = p$, $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$

Two rules of thumb:
The population must be at least 10 times as large as your sample to use this formula for the standard deviation.
np > 10 and n(1-p) > 10 in order to use the normal curve to approximate the distribution of p-hat.

Compute the standard deviation
Since the population is more than 10 times 1785,
$\sigma(\hat{p}) = \sqrt{p(1-p)/n} = \sqrt{.4(.6)/1785} = 0.0116$

The probability that p-hat is between 37% and 43%
Since (.4)(1785) > 10 and (.6)(1785) > 10, we can convert to z-scores and use the normal curve.
$z = \frac{x - \mu}{\sigma}$: $\frac{.37 - .4}{0.0116} = -2.586$ and $\frac{.43 - .4}{0.0116} = 2.586$

Using the Normal Distribution…
P(-2.586 < Z < 2.586) = P(Z < 2.586) - P(Z < -2.586) = normalcdf(-2.586, 2.586) = normalcdf(.37, .43, .4, 0.0116) = .9903!

Okay, what if you flip a coin 20 times and it’s heads 14 times?
Is it a fair coin? How can you justify your answer?
Did you mention sample variability? Bias?
Do the rules of thumb apply to find a sigma? To use the normal distribution?
If you suspect that 70% is this coin’s true proportion, how many times should we flip it so we can use the normal curve?
$(.3)(n) > 10 \Rightarrow n > 10/.3 \Rightarrow n \ge 34$

Dishonest Cola?
DC Cola is suspected of underfilling its cans of cola. They say each can has 12 ounces, with a standard deviation of 0.4 oz. If this is true, how likely is it to get an average of 11.9 oz. or less by taking a random sample of 50 cans?

Work it out...
Z score? $z = \frac{11.9 - 12}{0.4/\sqrt{50}} = -1.77$
Look up -1.77 in Table A, or normalcdf(-1E99, -1.77)
Or normalcdf(-1E99, 11.9, 12, .4/√50) = .0384

This leads to inference...
If these were your results and the company’s claim is true (12 oz.), there is still about a 3.8% chance that sample variation alone would lead to an average of 11.9 oz. or less.
At what point do you reject the company’s claim? At 5%? 1%? 0.1%?
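The two normal-curve calculations above (the Gallup proportion and the DC Cola mean) can also be checked outside a calculator. The following is a small sketch using only Python's standard library. The helper name `normal_cdf` is an assumption introduced here; it plays the role of the calculator's normalcdf with a lower bound of negative infinity.

```python
from math import erf, sqrt


def normal_cdf(x, mean=0.0, sd=1.0):
    """P(X <= x) for a normal distribution with the given mean and sd."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


# Gallup poll: p = 0.4, n = 1785, probability p-hat falls within 3% of p.
p, n = 0.4, 1785
sd_p_hat = sqrt(p * (1 - p) / n)
prob_within_3 = normal_cdf(0.43, p, sd_p_hat) - normal_cdf(0.37, p, sd_p_hat)

# DC Cola: claimed mean 12 oz, sd 0.4 oz, sample of 50 cans.
mu, sigma, n_cans = 12.0, 0.4, 50
std_error = sigma / sqrt(n_cans)
z_cola = (11.9 - mu) / std_error
prob_low_mean = normal_cdf(11.9, mu, std_error)

print(f"sd of p-hat: {sd_p_hat:.4f}")
print(f"P(0.37 < p-hat < 0.43): {prob_within_3:.4f}")   # slides report .9903
print(f"cola z-score: {z_cola:.2f}")
print(f"P(x-bar <= 11.9): {prob_low_mean:.4f}")          # slides report .0384 using z = -1.77
```

Small differences from the slide values come from rounding the standard deviation or the z-score before looking up the probability.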
Inferential Statistics
We choose a level of rejection (alpha).
We assume that our results are no different, and any variation is from chance (Null Hypothesis).
If our result is unlikely (probability less than our chosen alpha), we reject the “Null Hypothesis”,
then claim our results are SIGNIFICANTLY different.

Central Limit Theorem
Draw an SRS of size n from any population whatsoever with mean µ and a finite standard deviation σ. When n is large, the sampling distribution of the sample mean x-bar is close to the normal distribution N[µ, σ/√n] (page 488).

Law of Large Numbers
Draw observations at random from any population with finite mean µ. As the number of observations drawn increases, the mean x-bar of the observed values gets closer and closer to µ.

Homework

9.1-9.4 (489) Parameter or a statistic?
$\mu = 2.5003$: parameter
$\hat{p} = 7.2\%$: statistic
$\bar{x} = 2.5009$: statistic
$\hat{p} = 48\%$: statistic; $p = 52\%$: parameter
$\bar{x}_1 = 335$: statistic
$\bar{x}_2 = 289$: statistic

9.5 (492) Tumbling Toast
Toss a coin 20 times and compute p-hat. Repeat 10 more times… Make a histogram of your p-hats… Is the center close to .5?
Pool your work… Is the center near .5? Is it normal?

9.9 (500) Dead Guinea Pigs
[Picture not available]

9.10 (510)
A) Large Bias, Large Variability
B) Small Bias, Small Variability
C) Small Bias, Large Variability
D) Large Bias, Small Variability

9.17 (503) School Vouchers
Assuming the poll’s sample size is less than 780,000 (10% of the population of NJ)… the variability would be about the same.

9.19 (511) Got Milk?
n = 1012, p = .7
$\mu_{\hat{p}} = .7$; $\sigma_{\hat{p}} = \sqrt{p(1-p)/n} = \sqrt{(.7)(.3)/1012} = .0144$
US population > 10120
np = (1012)(.7) = 708.4 > 10
n(1-p) = (1012)(.3) = 303.6 > 10
$P(\hat{p} \le .67) = P(Z \le -2.08) = .0186$
4 × 1012 = 4048

9.33 (519) Juan’s results
σ = 10
$\sigma/\sqrt{n} = 10/\sqrt{3} = 5.7735$ mg
$10/\sqrt{n} \le 3$; $n \ge 12$

9.35 (524) Bad Rug
Mean = 1.6, sd = 1.2
normalcdf(2, 9999999, 1.6, 1.2/√200) ≈ 0

9.39 (525) Cheap Cola
µ = 298, σ = 3
P(X < 295)? P(x-bar < 295, n = 6)?
$P(X < 295) = P\left(Z < \frac{295 - 298}{3}\right) = P(Z < -1) = .1587$
$P(\bar{x} < 295) = P\left(Z < \frac{295 - 298}{3/\sqrt{6}}\right) = P(Z < -2.4495) = .0072$

9.41 (526) What a Wreck!
µ = 2.2, σ = 1.4
Not normal, but the distribution of x-bar is (approximately)!
$N(2.2, 1.4/\sqrt{52}) = N(2.2, .1941)$
$P(\bar{x} < 2) = P\left(Z < \frac{2 - 2.2}{1.4/\sqrt{52}}\right) = P(Z < -1.0302) = .1515$
$P(\bar{x} < 100/52) = P\left(Z < \frac{100/52 - 2.2}{1.4/\sqrt{52}}\right) = P(Z < -1.4267) = .0768$
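The idea behind 9.41, that the population need not be normal for x-bar to be approximately normal, can be illustrated with a short simulation. This is only a sketch under stated assumptions: the skewed population used here is an exponential distribution with mean 1 and standard deviation 1 (not the accident data from the exercise), the sample size n = 52 is borrowed from the exercise, and the function name `sample_means` is hypothetical.

```python
import random
import statistics
from math import sqrt


def sample_means(n, trials, seed=1):
    """Means of `trials` samples of size n from a skewed (exponential) population."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.expovariate(1.0) for _ in range(n)) / n
        for _ in range(trials)
    ]


n = 52                      # sample size borrowed from exercise 9.41
pop_mean, pop_sd = 1.0, 1.0  # exponential(1) population: mean 1, sd 1
means = sample_means(n, trials=4000)

mean_of_means = statistics.mean(means)
sd_of_means = statistics.pstdev(means)
expected_sd = pop_sd / sqrt(n)  # sigma / sqrt(n) from the Central Limit Theorem

# For a normal distribution, about 68% of values fall within one sd of the mean.
within_one_sd = sum(abs(m - pop_mean) <= expected_sd for m in means) / len(means)

print(f"mean of x-bars: {mean_of_means:.4f} (population mean: {pop_mean})")
print(f"sd of x-bars:   {sd_of_means:.4f} (sigma/sqrt(n): {expected_sd:.4f})")
print(f"fraction within one sd: {within_one_sd:.3f}")
```

Rerunning with a smaller n (say 2 or 5) shows the histogram of x-bars keeping more of the population's skew, which is why the theorem asks for n to be large.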