UNIT 3 Section 7 SAMPLING DISTRIBUTIONS

When we describe characteristics of a population, we refer to these as “parameters,” and when we describe characteristics of a sample, we refer to these as “statistics.” (Characteristics being mean, proportion, standard deviation, etc.)

Average GPA of all PBHS students: μ = 3.59 (a parameter)
Average GPA of an SRS of PBHS students: x̄ = 3.561 (a statistic)

When information about all individuals of a population is collected to provide a parameter, the distribution is a “population distribution.” When information about a sample of individuals is collected to provide a statistic, the distribution displays “sample data.” We can then combine the values of the statistic taken from all possible SRSs of size n to form a “sampling distribution.” Sampling variability is the chance variation in the value of a statistic, such as x̄, from one sample to the next.

FOR EXAMPLE: Consider the graphs below: there are 200 chips in a bag, 100 red and 100 blue. Suppose we want to look at the proportion of red chips among chips randomly selected. The population distribution shows the distribution for all chips in the bag (left column); the distributions of sample data show the results of drawing SRSs of size n = 20 from the bag (center column); and the sampling distribution shows the distribution of the proportions p̂ collected from those samples (right column).

Central Limit Theorem

The Central Limit Theorem refers to the fundamental idea that the sampling distribution of the sample mean gets closer and closer to a Normal distribution as the sample size increases, provided the population has a finite standard deviation σ. Regarding sample size: if the original population distribution is skewed, then the sampling distribution will better approximate a Normal distribution with a bigger sample size. (If the population is Normally distributed, then you can “get away with” a smaller sample size.)
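The chip example above can be explored with a short simulation. The sketch below is only an illustration, not part of the original notes: it draws many SRSs of size n = 20 from a bag of 100 red and 100 blue chips and collects the sample proportion of red chips from each one. The function names, the number of repetitions (5000), and the random seed are assumptions chosen for the demonstration. A true sampling distribution uses all possible samples, so repeated random draws only approximate it.

```python
import random
import statistics

def sample_proportion_red(bag, n, rng):
    """Draw one SRS of size n (without replacement) and return the proportion of red chips."""
    sample = rng.sample(bag, n)
    return sample.count("red") / n

def simulate_sampling_distribution(num_samples=5000, n=20, seed=1):
    """Approximate the sampling distribution of p-hat by repeated sampling.

    num_samples and seed are illustrative choices, not values from the notes.
    """
    rng = random.Random(seed)
    bag = ["red"] * 100 + ["blue"] * 100  # population from the example, so p = 0.5
    return [sample_proportion_red(bag, n, rng) for _ in range(num_samples)]

p_hats = simulate_sampling_distribution()

# Center: should land close to the population proportion p = 0.5.
print("mean of p-hat values:", round(statistics.mean(p_hats), 3))
# Spread: the standard deviation of the simulated p-hat values.
print("std dev of p-hat values:", round(statistics.pstdev(p_hats), 3))
```

Each individual value in `p_hats` comes from one sample (sample data), while the whole list plays the role of the sampling distribution. Its mean should sit near 0.5, which is the sense in which p̂ is centered on the parameter.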
The Normal distribution condition (#3 under “Conditions” ahead) requires, for proportions, that np ≥ 10 and n(1 − p) ≥ 10, and, for means, that n ≥ 30 or that the population is Normally distributed. When the condition is met, we can treat the sampling distribution as approximately Normal.

Bias (think mean/center of distribution)

Sample means (x̄) and sample proportions (p̂) are considered “unbiased estimators” of the population parameters μ and p, respectively: the mean of the sampling distribution equals the value of the population parameter.

Variability (think standard deviation/spread of distribution)

The variability of a statistic is described by the spread of its sampling distribution, and the spread is determined by the sampling design and the sample size. The larger the sample size, the smaller the standard deviation and the less the spread. Statistics from larger samples have less variability, which increases the precision of the estimate.

The 10% Condition (#2 under “Conditions” ahead) requires the population size to be at least 10 times the sample size. If the condition is met, the spread of the sampling distribution does not depend on the size of the population. If it is not met, we cannot use the standard deviation formulas for the sampling distribution of p̂ or x̄.

Sample Proportion, p̂ (to estimate population proportion p)

N(p, √(p(1 − p)/n))

In order to determine what proportion (percentage) of a population falls into some category of a categorical variable (such as the proportion of Americans who are registered Republicans or the proportion of teenagers who own a car), we will find p̂ from an SRS to estimate the unknown parameter p. Recall that the sampling distribution of p̂ describes how the sample proportion p̂ varies in all possible samples from a population.

The mean of the sampling distribution of p̂ is equal to the population proportion p; therefore, p̂ is an “unbiased estimator” of the population proportion p.
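As a rough worked example of the conditions and the spread described so far, the sketch below applies them to the bag of 200 chips from the earlier example (n = 20, p = 0.5). It checks the 10% Condition and the requirement that np and n(1 − p) are both at least 10, then computes the standard deviation of p̂ as √(p(1 − p)/n). The function names are hypothetical helpers written for this illustration.

```python
import math

def check_proportion_conditions(population_size, n, p):
    """Return (ten_percent_ok, large_counts_ok) for a sample proportion.

    A tiny tolerance guards against floating-point round-off in n * (1 - p).
    """
    tol = 1e-9
    ten_percent_ok = population_size >= 10 * n
    large_counts_ok = (n * p >= 10 - tol) and (n * (1 - p) >= 10 - tol)
    return ten_percent_ok, large_counts_ok

def sd_of_p_hat(n, p):
    """Standard deviation of the sampling distribution of p-hat: sqrt(p(1 - p)/n)."""
    return math.sqrt(p * (1 - p) / n)

# Chip example: population of 200 chips, SRS of size 20, true proportion of red = 0.5.
ten_ok, counts_ok = check_proportion_conditions(population_size=200, n=20, p=0.5)
sd = sd_of_p_hat(n=20, p=0.5)

print(ten_ok, counts_ok, round(sd, 4))  # prints: True True 0.1118
```

Both conditions are met here only just: 200 is exactly 10 times 20, and np = n(1 − p) = 10. Under these assumptions the sampling distribution of p̂ would be modeled as approximately N(0.5, 0.1118).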
A statistic is considered to be an unbiased estimator if the mean of its sampling distribution is equal to the value of the parameter being estimated.

When sample size n is large, the sampling distribution of p̂ is close to a Normal distribution. We will use the Normal approximation when the Large Counts Condition is met (Condition 3 below).

In order to make inferences about a population, certain assumptions/conditions must be met:

Assumptions
Independent sample values
Large enough sample size

Conditions
1. Random samples must be used; we will use SRS. (Randomization in selecting subjects, assigning treatments, sampling methods, ...)
2. Population must be at least 10 times the sample size. (required for finding standard deviation)
3. Both np ≥ 10 AND n(1 − p) ≥ 10. (required for Normal approximation of sampling distribution)

(Because the sample size n appears under a square root in the denominator of the standard deviation formula, a sample size four times larger is needed to cut the standard deviation in half.)

Sample Mean, x̄ (to estimate population mean μ)

N(μ, σ/√n)

In order to determine the mean (average) of some quantitative variable in a population, such as the mean SAT score among high school seniors or the mean salary of U.S. adults, we will find x̄ from an SRS to estimate the unknown parameter μ.

The mean of the sampling distribution of x̄ is μ, so x̄ is an “unbiased estimator” of μ. A statistic is considered to be an unbiased estimator if the mean of its sampling distribution is equal to the value of the parameter being estimated.

If the population is Normally distributed, then so is the sampling distribution of the sample mean x̄, even when the sample size is small. If the population is not Normally distributed and the sample size is small, then the sampling distribution of x̄ will resemble the population shape (left-skewed, bimodal, etc.). However, according to the Central Limit Theorem, the shape of the sampling distribution will become approximately Normal as sample size n increases, regardless of the shape of the population distribution.

In order to make inferences about a population, certain assumptions/conditions must be met:

Assumptions
Independent sample values
Large enough sample size

Conditions
1. Random samples must be used; we will use SRS. (Randomization in selecting subjects, assigning treatments, sampling methods, ...)
2. Population must be at least 10 times the sample size. (required for finding standard deviation)
3. Population is Normally distributed OR n ≥ 30. (required for Normal approximation of sampling distribution)
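The behavior of the sample mean described above can be checked with a small simulation. This is a sketch under stated assumptions, not a method from the notes: the population is a hypothetical right-skewed (exponential) distribution with μ = 1 and σ = 1, and the sample sizes (2, 30, 120), the number of repetitions, and the seed are illustrative choices. For each sample size it reports the mean of the simulated x̄ values (compare to μ), their standard deviation (compare to σ/√n), and a skewness measure, where values near 0 indicate a more symmetric, Normal-like shape.

```python
import math
import random
import statistics

def skewness(values):
    """Simple skewness measure: average cubed z-score (0 for a symmetric shape)."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def simulate_sample_means(n, num_samples, rng):
    """Return num_samples sample means, each from n draws of the skewed population."""
    means = []
    for _ in range(num_samples):
        draws = [rng.expovariate(1.0) for _ in range(n)]  # population: mu = 1, sigma = 1
        means.append(sum(draws) / n)
    return means

rng = random.Random(7)  # fixed seed so the run is repeatable (illustrative choice)
results = {}
for n in (2, 30, 120):
    x_bars = simulate_sample_means(n, 4000, rng)
    results[n] = {
        "mean": statistics.mean(x_bars),   # should be close to mu = 1
        "sd": statistics.pstdev(x_bars),   # should be close to sigma / sqrt(n)
        "theory_sd": 1 / math.sqrt(n),
        "skew": skewness(x_bars),          # should shrink toward 0 as n grows
    }
    print(n, {k: round(v, 3) for k, v in results[n].items()})
```

With n = 2 the simulated sampling distribution should still look clearly right-skewed, like the population. By n = 30 the skewness should be much smaller, and going from n = 30 to n = 120 (four times the sample size) should cut the standard deviation roughly in half.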
