SAMPLING DISTRIBUTIONS

Parameter: a population characteristic, e.g. $\mu$, $\sigma$, $P$, the median, percentiles.

Sample statistic: any quantity computed from the values in a sample, e.g. $\bar{x}$, $s$, the sample proportion.

The value of a population characteristic is fixed. The value of a statistic varies from one sample to another. Hence a statistic is a random variable, and its probability distribution is known as its sampling distribution.

[Diagram: one population, from which Sample 1, Sample 2, Sample 3 and Sample 4 are drawn.]

PROPERTIES OF SAMPLING DISTRIBUTIONS

A point estimator is a formula that uses sample data to calculate a single number (a sample statistic) that can be used as an estimate of a population parameter, e.g. $\bar{x}$ and $s^2$ are point estimators of $\mu$ and $\sigma^2$, respectively.

If the sampling distribution of a statistic has a mean equal to the parameter being estimated, the statistic is an unbiased estimator of that parameter; otherwise it is biased.

We have a population of $x$ values whose histogram is the probability distribution of $x$. Select a sample of size $n$ from this population and calculate a sample statistic, e.g. $\bar{x}$. Repeating this procedure indefinitely generates a population of values for the sample statistic, and the histogram of those values is the sampling distribution of the sample statistic.

SAMPLING DISTRIBUTION OF THE MEAN (THE CENTRAL LIMIT THEOREM)

Frequently we are interested in $\mu$ and estimate it using $\bar{x}$, so we need to know about the sampling distribution of $\bar{x}$. Theory says that for random samples of size $n$ from any population with mean $\mu$ and standard deviation $\sigma$:

$$\mu_{\bar{x}} = E(\bar{x}) = \mu$$

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \quad \text{(standard error of the mean)}$$

Central Limit Theorem: if $n$ is sufficiently large ($n \ge 30$ is a common rule of thumb), the sampling distribution of $\bar{x}$ will be approximately normal, whatever the shape of the population distribution.

Example of the Central Limit Theorem in Practice: Roll 30 dice and calculate the average (sample mean) of the numbers that you get on the dice. Now repeat this experiment 1000 times, each time rolling 30 dice and computing a new sample mean. Plot a histogram of the 1000 sample means that you have obtained.
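The dice experiment can be tried without real dice. What follows is a minimal simulation sketch, assuming fair six-sided dice and Python's standard `random` module. The function name `simulate_sample_means`, the fixed seed, and the bin width of 0.1 are illustrative choices and do not come from the original slides.

```python
import random
from collections import Counter
from statistics import mean, stdev

def simulate_sample_means(n_dice=30, n_repeats=1000, seed=0):
    """Return n_repeats sample means, each computed from n_dice fair die rolls."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    return [
        mean([rng.randint(1, 6) for _ in range(n_dice)])
        for _ in range(n_repeats)
    ]

sample_means = simulate_sample_means()

# Centre and spread of the simulated sampling distribution.
# Theory: mean 3.5 and standard error sqrt(35/12) / sqrt(30), about 0.31.
print(f"mean of sample means:  {mean(sample_means):.3f}")
print(f"stdev of sample means: {stdev(sample_means):.3f}")

# Crude text histogram: group the sample means into bins of width 0.1.
bins = Counter(round(m, 1) for m in sample_means)
for value in sorted(bins):
    print(f"{value:.1f} {'*' * (bins[value] // 5)}")
```

The mean and standard deviation of the simulated sample means should land close to 3.5 and 0.31, and the rows of asterisks are a rough stand-in for the plotted histogram.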
This histogram will look approximately normal.

Example

A manufacturer claims that the life of a battery type has a mean of 54 months and a standard deviation of 6 months. A consumer group purchases a sample of 50 of these batteries and tests them. They find an average life of 52 months. What should they conclude?

If the manufacturer's claim is true, then $\mu_{\bar{x}} = 54$ and $\sigma_{\bar{x}} = 6/\sqrt{50} \approx 0.85$. Since $n = 50 > 30$, the Central Limit Theorem applies, so

$$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}$$

is approximately standard normal, and

$$P(\bar{x} \le 52) = P\left(z \le \frac{52 - 54}{0.85}\right) = P(z \le -2.35) = 0.0094$$

If the manufacturer's claim is true, then what the consumer group observed is very unlikely. A more plausible explanation is that the true value of $\mu$ is less than 54.
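The battery calculation can be checked numerically. The sketch below assumes the normal approximation from the Central Limit Theorem and uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The helper name `prob_mean_at_most` is hypothetical and was chosen only for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_at_most(x_bar, mu, sigma, n):
    """Return (z, P(sample mean <= x_bar)) under the normal approximation."""
    std_error = sigma / sqrt(n)      # standard error of the mean
    z = (x_bar - mu) / std_error     # standardised sample mean
    return z, NormalDist().cdf(z)    # lower-tail standard normal probability

# Manufacturer's claim: mu = 54 months, sigma = 6 months; sample of n = 50.
z, p = prob_mean_at_most(x_bar=52, mu=54, sigma=6, n=50)
print(f"z = {z:.2f}, P(x_bar <= 52) = {p:.4f}")
```

Because the code does not round the standard error to 0.85 before dividing, its z value and probability can differ from the hand calculation in the last decimal place. The probability is still roughly 0.009, so the conclusion is unchanged.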