Sampling Distributions
Chapter 7

The Concept of a Sampling Distribution
Repeated samples of the same size are selected from the same population.
The same sample statistic is calculated from the data in EACH sample.
The distribution of the sample statistics is the SAMPLING DISTRIBUTION of that sample statistic.

The Sampling Process
[Diagram: a SAMPLE $x_1, x_2, \ldots, x_n$ is drawn from a POPULATION with mean $\mu$; the sample gives $\bar{x}$ and $s$.]

The Sampling Distribution
[Diagram: Repeated Sampling from a POPULATION $x_1, x_2, x_3, \ldots, x_N$ with mean $\mu$ produces the Sampling Distribution.]

What is Standard Error?
Standard Error has been identified as a quantity that is not well understood.
Is it a Standard Deviation?
Standard Error of what?
What does it tell us?
The purpose of this presentation is to make the concept of Standard Error clearer and more understandable.

The Sampling Process
SAMPLE: 30, 42, 48, 49, 61, 54, 41, 38, 59, 57
Calculate Mean = 47.9
POPULATION Mean = 50
[Figure: histogram "Population"; horizontal axis x (20 to 80), vertical axis Count.]
This sample mean is an ESTIMATE of the population mean. We should not be surprised that the estimate does not equal the true mean for the population!

The Sampling Process
Plot the Sample Mean (47.9) on a dot plot of sample means.
[Figure: dot plot "Means of the Separate Samples"; horizontal axis mean (40 to 60).]

The Sampling Distribution
Repeated Sampling: calculate the mean of each sample ($m_1, m_2, \ldots$), then plot all sample means.
[Figure: histogram "Population" beside the dot plot "Means of the Separate Samples", captioned "Sampling Distribution of the Sample Means".]

What about this sampling distribution?
[Figure: dot plot "Means of the Separate Samples"; horizontal axis mean (40 to 60).]
Each dot represents a mean from one of the samples. Each sample mean is an ESTIMATE of the population mean. Notice that the center of this graph is around 50 and the spread ranges from 45 to 55.

What about this sampling distribution?
[Figure: dot plot "Means of the Separate Samples"; horizontal axis mean (40 to 60).]
The mean of the sampling distribution (that is, the mean of the sample means) is the MEAN of the population! AND…
We call the standard deviation of the distribution of sample means the STANDARD ERROR OF THE ESTIMATE OF THE POPULATION MEAN.

In Summary
STANDARD DEVIATION is a measure of the spread of data in a population or in a sample.
STANDARD ERROR is a measure of the spread of the ESTIMATES of a measure of a population calculated from repeated sampling.

In short…
STANDARD DEVIATION: variation in DATA
STANDARD ERROR: variation in ESTIMATES FROM SAMPLES

Point Estimators
When inferences are made from the sample to the population, the sample mean is viewed as an estimator of the mean of the population from which the sample was selected. Similarly, the proportion of successes in a sample is an estimator of the proportion of successes in the population.

Properties of Point Estimators
The summary statistic should be UNBIASED; that is, the mean of the sampling distribution is equal to the value you would get if you computed the summary statistic using the entire population. More formally, an estimator is unbiased if its expected value equals the parameter being estimated.
The summary statistic should have as little variability as possible (be more precise than other estimates) and should have a standard error that decreases as the sample size increases.
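
The repeated-sampling idea above can be tried out in a short simulation. The following is only a sketch, not part of the original slides: it assumes a normal population with mean 50 (the value shown on the slides) and a standard deviation of 10 (an assumed value), draws many samples of size 10, and looks at the mean and spread of the resulting sample means. All names in the code are illustrative.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same numbers

POP_MEAN = 50       # population mean shown on the slides
POP_SD = 10         # ASSUMPTION: population standard deviation chosen for illustration
SAMPLE_SIZE = 10    # the slide sample has 10 values
NUM_SAMPLES = 5000  # ASSUMPTION: number of repeated samples to draw


def draw_sample_mean(n):
    """Draw one random sample of size n and return its mean."""
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(n)]
    return statistics.fmean(sample)


# Each entry is one dot in the "Means of the Separate Samples" dot plot.
sample_means = [draw_sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

# Center of the sampling distribution: should sit near the population mean.
mean_of_means = statistics.fmean(sample_means)

# Spread of the sampling distribution: this is the standard error of the mean.
standard_error = statistics.stdev(sample_means)

print(f"mean of sample means: {mean_of_means:.2f}")
print(f"standard error (SD of sample means): {standard_error:.2f}")
```

Under these assumptions the mean of the sample means should land close to 50, and the standard error should come out noticeably smaller than the population standard deviation of 10, which is the distinction between variation in data and variation in estimates.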
                      Population     Sample        Sampling distribution
                      (Parameter)    (Statistic)   of $\bar{X}$
Mean                  $\mu$          $\bar{x}$     $\mu_{\bar{X}}$
Standard deviation    $\sigma$       $s$           $\sigma_{\bar{X}}$
Size                  $N$            $n$

Properties of the Sampling Distribution of the Sample Mean
If a random sample of size $n$ is selected from a population with mean $\mu$ and standard deviation $\sigma$, then:
The mean $\mu_{\bar{X}}$ of the sampling distribution of $\bar{X}$ equals the mean of the population: $\mu_{\bar{X}} = \mu$.
The standard deviation $\sigma_{\bar{X}}$ of the sampling distribution of $\bar{X}$, sometimes called the standard error of the mean, equals the standard deviation of the population divided by the square root of the sample size: $\sigma_{\bar{X}} = \sigma/\sqrt{n}$. (Use this formula only when the population is more than 10 times the sample size, $N > 10n$.)
The shape of the sampling distribution will be approximately normal if the population is approximately normal; for other populations, the sampling distribution becomes more normal as $n$ increases. This property is called the CENTRAL LIMIT THEOREM (CLT).

Reasonably Likely Averages
Mean ± 1.96(SE)
The value 1.96 is the z-score that cuts off the middle 95% of a normal distribution.

If the Sampling Distribution is known…
Probability questions about sample statistics can be answered. For example:
A simple random sample of 50 is selected from a normal population with a mean of 50 and a standard deviation of 10. What is the probability that the sample mean will be greater than 53?

The Answer…
Population: Normal($\mu$, $\sigma$). Sampling distribution of the sample mean: Normal($\mu$, $\sigma/\sqrt{n}$).
Here the population is Normal(50, 10) and $n = 50$, so the sampling distribution is Normal(50, $10/\sqrt{50}$).
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{53 - 50}{10/\sqrt{50}} \approx 2.12$, so $P(\bar{X} > 53) \approx 0.017$.

Properties of the Sampling Distribution of the Sum of a Sample
If a random sample of size $n$ is selected from a distribution with mean $\mu$ and standard deviation $\sigma$, then:
The mean of the sampling distribution of the sum is $\mu_{\text{sum}} = n\mu$.
The standard error of the sampling distribution of the sum is $\sigma_{\text{sum}} = \sqrt{n}\,\sigma$.
The CLT applies to the sum as well.

Sampling Distribution of the Sample Proportion
We will now move from studying the behavior of the sample mean to studying the behavior of sample proportions (the proportion of "successes" in the sample).

Properties of the Sampling Distribution of the Number of Successes
If a random sample of size $n$ is selected from a population with proportion of successes $p$, then the sampling distribution of the number of successes $X$:
Has mean $\mu_X = np$.
Has standard error $\sigma_X = \sqrt{np(1-p)}$.
Will be approximately normal as long as $n$ is large enough. As a guideline, both $np$ and $n(1-p)$ should be at least 10: $np \ge 10$ and $n(1-p) \ge 10$.

Example
The use of seat belts continues to rise in the United States, with overall seat belt usage of 82%. Mississippi lags behind the rest of the nation: only about 60% wear seat belts. Suppose you take a random sample of 40 Mississippians. How many do you expect will wear seat belts? What is the probability that 30 or more of the people in your sample wear seat belts?

Solution

Sampling Distributions of the Sample Proportion
The true proportion of successes is represented by "p".
The sample proportion of successes is represented by "p hat": $\hat{p}$ = (number of successes)/(sample size).

Sampling Distribution of p-hat
How does p-hat behave? To study its behavior, imagine taking many random samples of size $n$ and computing a p-hat for each sample. Then plot this set of p-hats in a histogram.

Sampling Distribution of p-hat
Properties of p-hat
When sample sizes are fairly large, the shape of the p-hat distribution will be approximately normal.
The mean of the distribution is the value of the population parameter $p$.
The standard deviation of this distribution is the square root of $p(1-p)/n$:
$\mathrm{sd}(\hat{p}) = \sqrt{\dfrac{p(1-p)}{n}}$
As a guideline, use $np \ge 10$ and $n(1-p) \ge 10$.

Example
About 60% of Mississippians use seat belts. Suppose your class conducts a survey of 40 randomly selected Mississippians.
A. What is the chance that 75% or more of those selected wear seat belts?
B. Would it be quite unusual to find that fewer than 25% of the Mississippians selected wear seat belts?

Solution
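
The slide's own solution is not reproduced here. As one possible way to work the seat belt questions, the sketch below applies the normal approximation with the mean and standard deviation of p-hat given above. The function names are illustrative, and no continuity correction is applied (an assumption; a course may or may not use one). Note that 30 of 40 is 75%, so "30 or more wear seat belts" and "75% or more" describe the same event.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def proportion_tail(p, n, threshold, upper=True):
    """Normal-approximation tail probability for the sample proportion.

    Returns (z, probability) for P(p-hat >= threshold) when upper is True,
    or P(p-hat <= threshold) when upper is False. No continuity correction.
    """
    sd_p_hat = math.sqrt(p * (1 - p) / n)  # sd(p-hat) = sqrt(p(1-p)/n)
    z = (threshold - p) / sd_p_hat
    prob = 1.0 - normal_cdf(z) if upper else normal_cdf(z)
    return z, prob


p, n = 0.60, 40

# Guideline check: np and n(1-p) should both be at least 10.
guideline_ok = n * p >= 10 and n * (1 - p) >= 10

# Expected number of seat belt wearers in the sample: np.
expected_count = n * p

# A. Chance that 75% or more (30 or more of 40) wear seat belts.
z_a, prob_a = proportion_tail(p, n, 0.75, upper=True)

# B. Chance that fewer than 25% wear seat belts.
z_b, prob_b = proportion_tail(p, n, 0.25, upper=False)

print(f"guideline satisfied: {guideline_ok}")
print(f"expected count: {expected_count:.0f}")
print(f"A: z = {z_a:.2f}, probability = {prob_a:.4f}")
print(f"B: z = {z_b:.2f}, probability = {prob_b:.2e}")
```

Under this approximation, result A is a small but not negligible probability, while result B lies more than four standard deviations below the mean, so a sample proportion under 25% would be very unusual if the true proportion is 60%.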