Sampling Distributions
Chapter 18

• A parameter is a measure of the population. This value is typically unknown. (µ, σ, and now p or π)
• A statistic is a measure from a sample. We often use a statistic to estimate an unknown parameter. (x̄, s_x, and now p̂)
• A proportion is computed from a set of categorical data. A sample proportion is a random quantity that has a distribution. We call that distribution the sampling distribution of the proportion.
• Sampling variability: in repeated random sampling, the value of the statistic will vary. (We don't expect the same value every time we sample, do we?)
• To describe a sampling distribution, use the same descriptions as for any other distribution: overall shape, outliers, center, and spread. (CUSS & BS)
• Bias suggests that a sampling technique favors a certain outcome (the sampling technique is unfair). Bias in a sampling distribution is the idea that the center of the sampling distribution is not the same as the population center.
• A statistic is unbiased if the mean of its sampling distribution is equal to the population parameter it is estimating.
• The variability of a statistic (σ) is described by the spread of its sampling distribution. The spread is determined by the sampling design and the sample size. Larger samples give less variability.

Bias & Variability
• Target Practice!!!!

Sampling Distributions of Proportions
• Sampling Distribution of a Sample Proportion – Categorical Data
Choose an SRS of size n from a large population in which a proportion p has some characteristic of interest. p̂ is the proportion of the sample having that same characteristic. The sampling distribution of p̂ is approximately normal under these conditions:
• Conditions: 1) Randomization. The sample should be a simple random sample (SRS) of the population. (This is often difficult to achieve in reality.
We at least need to be very confident that the sampling method was unbiased and that the sample is representative of the population.)
• Conditions: 2) 10% Rule. To ensure independence, we cannot take too large a sample without replacement. As long as our sample is no more than 10% of the population, we protect independence.
• Conditions: 3) Success/Failure. To ensure that the sample size is large enough for the normal approximation, we must expect at least 10 successes and at least 10 failures: np ≥ 10 and n(1 - p) ≥ 10.
• Provided the conditions are met, the sampling distribution of a proportion will be approximately normal with mean p and standard deviation $\sqrt{p(1-p)/n}$, or in notation, $N\left(p, \sqrt{p(1-p)/n}\right)$.

Sampling Distributions of Sample Means
• Sampling Distribution of a Sample Mean – Quantitative Data
• A distribution is created from the means of many samples. The data are quantitative. What is the purpose? Averages are less variable and more normal than individual observations.
• The shape of the distribution of x-bar depends on the shape of the population.
** If the population is normal, then the distribution of the sample mean will be normal (regardless of sample size).
** For skewed or odd-shaped populations, if the sample size is large enough, the sampling distribution will be approximately normal. This idea leads us to…

The Central Limit Theorem (CLT)
CLT addresses two things in a distribution, shape and spread.
As the sample size increases:
• The shape of the sampling distribution becomes more normal
• The variability of the sampling distribution decreases

• The Law of Large Numbers
Draw observations at random from any population with given mean μ. As n increases, the mean of the observed values (x-bar) gets closer and closer to the true mean, μ.
• Conditions: (1st 2 are the same for both kinds of data) 1) Randomization. 2) 10% rule.
• Conditions: 3) Large Enough Sample. There is no "for sure" way to tell if your sample is large enough. It is common practice that if your sample size is at least 30 (n ≥ 30), you are OK to assume normal for the sampling distribution. (Remember, if the population is given as normal, then any sample size is OK.)
• When conditions are met, and the data are quantitative, the sampling distribution is approximately normal with a center at the population mean, μ, and a standard deviation of $\sigma/\sqrt{n}$. So… $N\left(\mu, \sigma/\sqrt{n}\right)$
• Since the standard deviation is proportional to 1/√n, we must take a sample 4 times as large to reduce the standard deviation by ½.

Sampling Distributions
• We said at the beginning that in most real-life cases we will not know the population parameters (µ, σ, p or π), so we will have to use the sample statistics as estimates of those. Our terminology changes just a little…
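The conditions and the standard deviation formula for a sample proportion can be checked with a small simulation. The sketch below is only an illustration: the values n = 100, p = 0.30, and a population of 10,000 are hypothetical, the function names are invented here, and each sample is drawn as independent yes/no trials, which is a reasonable stand-in for sampling from a large population.

```python
import math
import random
import statistics


def conditions_met(n, p, population_size):
    """Check the 10% rule and the success/failure condition for a sample proportion."""
    ten_percent_ok = n <= 0.10 * population_size
    success_failure_ok = n * p >= 10 and n * (1 - p) >= 10
    return ten_percent_ok and success_failure_ok


def simulate_sample_proportions(n, p, repetitions, seed=0):
    """Draw `repetitions` samples of size n and return the sample proportion of each.

    Assumption: each observation is an independent success with probability p.
    """
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(repetitions)]


# Hypothetical values chosen for illustration only.
n, p, population_size = 100, 0.30, 10_000
p_hats = simulate_sample_proportions(n, p, repetitions=5_000)

theoretical_sd = math.sqrt(p * (1 - p) / n)  # sqrt(p(1 - p)/n)
simulated_mean = statistics.mean(p_hats)     # should land close to p
simulated_sd = statistics.stdev(p_hats)      # should land close to theoretical_sd

print("conditions met:", conditions_met(n, p, population_size))
print("theoretical sd:", round(theoretical_sd, 4))
print("simulated mean:", round(simulated_mean, 4))
print("simulated sd:  ", round(simulated_sd, 4))
```

If the conditions hold, the simulated mean should sit near p and the simulated spread near the formula value. A histogram of p_hats would be the natural next step for judging the shape.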
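The claims about sample means (center at μ, standard deviation σ/√n, and a sample 4 times as large halving the spread) can also be sketched in code. This is a rough illustration, not a proof: the right-skewed population used here is an exponential distribution with mean 1 and standard deviation 1, chosen only as an assumption for the demo, and the function name is hypothetical.

```python
import math
import random
import statistics


def simulate_sample_means(n, repetitions, seed=0):
    """Return the means of `repetitions` samples of size n from a right-skewed population.

    Assumption: the population is exponential with rate 1 (mean 1, standard deviation 1).
    """
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(repetitions)
    ]


mu, sigma = 1.0, 1.0  # mean and standard deviation of the assumed population

for n in (30, 120):  # 120 is 4 times 30
    means = simulate_sample_means(n, repetitions=4_000)
    print(
        "n =", n,
        "| mean of x-bar:", round(statistics.fmean(means), 3),  # close to mu
        "| sd of x-bar:", round(statistics.stdev(means), 3),     # close to sigma / sqrt(n)
        "| sigma/sqrt(n):", round(sigma / math.sqrt(n), 3),
    )
```

Comparing the two rows, the spread for n = 120 should come out at roughly half the spread for n = 30, which is the 4-times rule from the last sample-means slide.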