Download Lecture 8 - Statistics

Statistics 400 - Lecture 8

- Completed so far (any material discussed in these sections is fair game):
  - 2.1-2.5
  - 4.1-4.5
  - 5.1-5.8 (READ 5.7)
  - 6.1-6.4; 6.6
  - 7.1-7.2
- Today: finish 7.3, 8.1-8.3
- READ 7.4!!!
- Assignment #3: 6.2, 6.6, 6.34, 6.78 (interpret the plot in terms of Normality), 7.20, 7.28, 8.14, 8.22, 8.36
- Due: Tuesday, Oct 16

Central Limit Theorem

- In a random sample (iid sample) from any population with mean $\mu$ and standard deviation $\sigma$, when n is large, the distribution of the sample mean $\bar{x}$ is approximately normal.
- That is, $\bar{x}$ is approximately $N(\mu, \sigma/\sqrt{n})$
- Thus, $Z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ is approximately $N(0, 1)$

Implications

- So, for random samples, if we have enough data, the sample mean is approximately normally distributed, even if the data are not normally distributed
- If we have enough data, we can use the normal distribution to make probability statements about $\bar{x}$

Example

- A busy intersection has an average of 2.2 accidents per week with a standard deviation of 1.4 accidents
- Suppose you monitor this intersection over a given year, recording the number of accidents per week.
- The data take on integer values (0, 1, 2, ...), thus the distribution of the number of accidents is not normal.
- What is the distribution of the mean number of accidents per week based on a sample of 52 weeks of data?

Example

- What is the approximate probability that $\bar{x}$ is less than 2?
- What is the approximate probability that there are fewer than 100 accidents in a given year?

Statistical Inference (Chapter 8)

- Would like to make inferences about a population based on samples
- The fatality rate for a disease is 50%. In a controlled study, 100 patients with the disease are given a new drug.
- Would you conclude that the drug is successful if:
  - 100% of the patients survived
  - 75% of the patients survived
  - 55% of the patients survived
  - 52% of the patients survived
- Statistical inference deals with drawing conclusions about population parameters from the analysis of sample data
- Estimation of parameters
  - Estimate a single value for a parameter (point estimation)
  - Estimate a plausible range of values for a parameter (interval estimation)
- Testing of hypotheses
  - Procedure for testing whether data support a hypothesis or theory

Point Estimation

- Objective: to estimate a population parameter based on sample data
- A point estimator is a statistic that estimates a population parameter
- The standard deviation of the statistic is called the standard error (most of the time)

Example

- Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
- How do you estimate the standard error?
- If we have a random sample of size n from a normal population, what is the distribution of the sample mean?
- If the sampling procedure is done repeatedly, what proportion of sample means lie in the interval $(\mu - 2\sigma_{\bar{x}},\ \mu + 2\sigma_{\bar{x}})$, where $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ is the standard error?
- If the sampling procedure is done repeatedly, what proportion of sample means lie in the interval $(\mu - 3\sigma_{\bar{x}},\ \mu + 3\sigma_{\bar{x}})$?
- When estimating $\mu$ with $\bar{x}$, the $100(1-\alpha)\%$ margin of error, d, is the value such that $100(1-\alpha)\%$ of the sample means will fall in the interval $(\mu - d,\ \mu + d)$
- For large samples, $d = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$

Sample Size Calculation

- Before collecting data, should have some desired margin of error, d, and an associated probability
- Based on this, can determine the appropriate sample size
- $d = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$, so $n = \left(\dfrac{z_{\alpha/2}\,\sigma}{d}\right)^2$ (round up to the next whole number)
- What does this sample size guarantee?

Example (8.12)

- Standard deviation of heights of 5-year-old boys is 3.5 inches
- How many boys must be sampled if we want to be 90% certain that the sample mean height is within 0.5 inches of the population mean height?
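The sample size formula above can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function names `margin_of_error` and `required_sample_size` are made up for illustration, and the code assumes the large-sample normal approximation with a known population standard deviation. It uses `statistics.NormalDist` from the standard library to obtain $z_{\alpha/2}$.

```python
import math
from statistics import NormalDist


def margin_of_error(sigma, n, confidence):
    """Large-sample margin of error d = z_{alpha/2} * sigma / sqrt(n).

    Assumes sigma (the population standard deviation) is known.
    """
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 critical value
    return z * sigma / math.sqrt(n)


def required_sample_size(sigma, d, confidence):
    """Smallest n with margin of error at most d: n = (z * sigma / d)^2, rounded up."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.ceil((z * sigma / d) ** 2)


# Example (8.12): sigma = 3.5 inches, d = 0.5 inches, 90% confidence.
n_boys = required_sample_size(sigma=3.5, d=0.5, confidence=0.90)
print(n_boys)  # -> 133

# Check: the margin of error at this n should not exceed 0.5 inches.
print(margin_of_error(sigma=3.5, n=n_boys, confidence=0.90))
```

Rounding up matters here: the unrounded value is about 132.6, and a sample of 132 boys would give a margin of error slightly above 0.5 inches.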
Confidence Intervals for the Mean

- Last day, introduced a point estimator: a statistic that estimates a population parameter
- Often more desirable to present a plausible range for the parameter, based on the data
- We will call this a confidence interval
- Ideally, the interval contains the true parameter value
- In practice, not possible to guarantee because of sample-to-sample variation
- Instead, we compute the interval so that, before sampling, the interval will contain the true value with high probability
- This high probability is called the confidence level of the interval

Confidence Interval for $\mu$ for a Normal Population

- Situation:
  - Have a random sample of size n from $N(\mu, \sigma)$
  - Suppose the value of the standard deviation is known
  - Value of the population mean is unknown
- Last day we saw that $100(1-\alpha)\%$ of sample means will fall in the interval:
  $\left(\mu - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}},\ \mu + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right)$
- Therefore, before sampling, the probability of getting a sample mean in this interval is $(1-\alpha)$
- Equivalently,
  $P\left(\mu - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} \le \bar{X} \le \mu + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right) = (1-\alpha)$
- Equivalently,
  $P\left(\bar{X} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right) = (1-\alpha)$
- The interval below is called a $100(1-\alpha)\%$ confidence interval for $\mu$:
  $\left(\bar{X} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right)$

Example

- To assess the accuracy of a laboratory scale, a standard weight known to be 10 grams is weighed 5 times
- The readings are normally distributed with unknown mean and a standard deviation of 0.0002 grams
- Mean result is 10.0023 grams
- Find a 90% confidence interval for the mean
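The known-$\sigma$ interval can be computed directly for the scale example. The Python sketch below is an illustration rather than part of the original slides: the function names `z_confidence_interval` and `simulated_coverage` are hypothetical, and the simulation assumes (purely for demonstration) a true mean of exactly 10 grams. The simulation repeats the sampling procedure many times to show what the confidence level means before sampling.

```python
import math
import random
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, confidence):
    """Known-sigma interval: xbar +/- z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin


def simulated_coverage(mu, sigma, n, confidence, trials=2000, seed=0):
    """Fraction of repeated normal samples whose interval contains mu.

    mu is an assumed "true" mean used only for the simulation.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        lower, upper = z_confidence_interval(xbar, sigma, n, confidence)
        if lower <= mu <= upper:
            hits += 1
    return hits / trials


# Scale example: n = 5 readings, sigma = 0.0002 g, mean reading 10.0023 g.
lower, upper = z_confidence_interval(xbar=10.0023, sigma=0.0002, n=5, confidence=0.90)
print(f"90% CI for the mean reading: ({lower:.6f}, {upper:.6f})")

# Repeated sampling with an assumed true mean of 10 g: coverage should be near 0.90.
coverage = simulated_coverage(mu=10.0, sigma=0.0002, n=5, confidence=0.90)
print(f"Simulated coverage: {coverage:.3f}")
```

With these numbers the interval is roughly (10.00215, 10.00245) grams, which does not include the nominal 10 grams.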
