Sampling Distributions
A sampling distribution is created by, as the name suggests, sampling from a population and then calculating some statistic, such as the sample mean [X-bar], the sample proportion [p-hat], a difference in means, or a difference in proportions. We use sampling distributions to estimate population parameters, such as the population mean, and to test hypotheses, such as the claim that the average fill volume of Coke cans is truly 12 fl. oz. [H0: μ = 12].

Example
• A fair die is thrown an infinite number of times, with the random variable X = number of spots on any throw.
• The probability distribution of X is:

X       1     2     3     4     5     6
P(X)    1/6   1/6   1/6   1/6   1/6   1/6

• The mean and variance can be calculated to be μ = 3.5 and σ² = 2.92.

Sampling Distribution of Two Dice
• A sampling distribution is created by looking at all samples of size n = 2 (i.e. two dice) and their means.
• While there are 36 possible samples of size 2, there are only 11 distinct values for X-bar, and some (e.g. X-bar = 3.5) occur more frequently than others.

Sampling Distribution of Two Dice…
• The sampling distribution of X-bar is shown below:

X-bar      1.0   1.5   2.0   2.5   3.0   3.5   4.0   4.5   5.0   5.5   6.0
P(X-bar)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

[Figure: bar chart of P(X-bar) against X-bar, for X-bar from 1.0 to 6.0]

Compare distribution of X and sampling distribution of X-bar
[Figure: distribution of X (values 1 to 6) beside the sampling distribution of X-bar (values 1.0 to 6.0)]
• The relationship between the population parameters and the parameters of the sampling distribution of the sample mean is: mean of X-bar = μ, variance of X-bar = σ²/n, and standard deviation of X-bar (the standard error) = σ/√n.

Central Limit Theorem
• The sampling distribution of the mean of a random sample drawn from any population is approximately normal for a sufficiently large sample size.
• The larger the sample size, the more closely the sampling distribution of X-bar will resemble a normal distribution.
• If the population is normal, then X-bar is normally distributed for all values of n.
• If the population is non-normal, then X-bar is approximately normal only for larger values of n.

Sampling Distribution of Sample Mean
• If X is normal, X-bar is normal. If X is non-normal, X-bar is approximately normal for sufficiently large sample sizes.
• Note: the definition of “sufficiently large” depends on the extent of non-normality of X (e.g. heavily skewed; multimodal).

Example
• A quality engineer has observed that the amount of soda in each “32-ounce” bottle of Coke is actually a normally distributed random variable, with a mean of 32.2 ounces and a standard deviation of 0.3 ounce. The “32-ounce” is what is on the label of the bottle.
• If a customer buys one bottle, what is the probability that the bottle will contain more than 32 ounces (the label)? This was covered in the chapter on normal distributions.

Example
• We want to find P(X > 32), where X is normally distributed with mean 32.2 and standard deviation 0.3.
• Z = (32 – 32.2)/0.3 ≈ –0.67, so “the probability that a single bottle contains more than 32 fl. oz. is approximately 0.75.”
• This is good because it means that about 75% of your bottles actually contain more than the label.

Example
• If you go to the store and buy a carton of four bottles, you know that each individual bottle should contain somewhere around 32.2 fl. oz., and some may actually contain less than 32.2 fl. oz. (some may even contain less than the label of 32). You now wish to check whether the mean volume of Coke in a 4-pack will be greater than 32 ounces. In other words, you want to know if your 4 bottles average at least 32 fl. oz.
• This requires that we know the sampling distribution of the sample mean based on a sample size of 4.

Example
• Mean of X-bar = 32.2; standard error of X-bar = 0.3/√4 = 0.15
• Z = (X-bar – 32.2)/0.15 = (32 – 32.2)/0.15 = –1.33, so P(X-bar > 32) = P(Z > –1.33) ≈ 0.91.

Example Problem
• The dean of the School of Business claims that the average salary of the school’s graduates one year after graduation is $800 per week (μx) with a standard deviation of $100 (σx). Note: This is the population.
• A second-year student would like to check whether the claim about the mean is correct. He surveys 25 people who graduated one year ago and determines their weekly salary. He finds the sample mean to be $750. Is this consistent with the dean’s claim?
• If the claim is true, the mean of X-bar = μx = 800 and the standard error of X-bar = σx/√n = 100/√25 = 20.

Sample Proportions
• The estimator of a population proportion of successes is the sample proportion. That is, we count the number of successes in a sample and compute:
• p-hat = X/n
• X is the number of successes, n is the sample size.

Sampling Distribution of Sample Proportion
• We can determine the mean, variance, and standard deviation of p-hat: mean = p, variance = p(1–p)/n, standard deviation = √(p(1–p)/n).
• (The standard deviation of p-hat is called the standard error of the proportion.)

Sampling Distribution of Sample Proportion
• The normal approximation to the binomial works best when the number of experiments, n (the sample size), is large and the probability of success, p, is close to 0.5, but it works fine if two conditions are met:
• 1) np ≥ 5
• 2) n(1–p) ≥ 5
• If these conditions are met, we can use the normal distribution to work proportion problems, which means we will eventually use the Z-score.

Example
• Assume the probability of an infection during an operation is 0.1 (p) and you observe the number of infections during the next 100 (n) operations. Are the conditions satisfied to assume normality?
• What is the sampling distribution of the sample proportion?
• What is the probability that you get more than 20 infections in the next 100 operations?

Other Common Sampling Distributions
• Sampling distribution of the difference between two sample means.
• Sampling distribution of the difference between two sample proportions.

Homework – Chapter Advice
• Don’t worry about “finite population issues” – Sections 5.4, 5.6
• HW: 5.3.1, 5.3.3, 5.3.5, 5.5.1, 5.5.5
• Review questions and exercises HW: 1, 4, 7
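
Worked sketch: two-dice sampling distribution
The two-dice slides can be checked by brute force. The sketch below is an illustration only, not part of the original slides: it lists all 36 equally likely samples of size 2, tabulates the sample mean, and compares the mean and variance of X-bar with the population values. The variable names (faces, counts, dist, and so on) are chosen here for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)   # a fair six-sided die
n = 2                 # sample size: two dice

# Population mean and variance of a single throw (exact fractions).
mu = Fraction(sum(faces), len(faces))
sigma2 = sum((x - mu) ** 2 for x in faces) / len(faces)

# Count how many of the 6**n equally likely samples give each sample mean.
counts = Counter(Fraction(sum(s), n) for s in product(faces, repeat=n))
total = len(faces) ** n
dist = {xbar: Fraction(c, total) for xbar, c in counts.items()}

# Mean and variance of the sampling distribution of X-bar.
mean_xbar = sum(xbar * p for xbar, p in dist.items())
var_xbar = sum((xbar - mean_xbar) ** 2 * p for xbar, p in dist.items())

for xbar in sorted(counts):
    print(f"X-bar = {float(xbar):.1f}   P = {counts[xbar]}/{total}")
print("population mean, variance:", float(mu), round(float(sigma2), 2))
print("X-bar mean, variance:     ", float(mean_xbar), round(float(var_xbar), 2))
```

The enumeration gives 11 distinct values of X-bar, the same mean as the population (3.5), and a variance equal to the population variance divided by n. Changing n to 3 or 4 shows the distribution narrowing and becoming more bell-shaped.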
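
Worked sketch: normal probabilities for the examples
The bottle, dean, and infection examples all reduce to a normal probability using the appropriate standard error. The sketch below is one way to compute them with Python's standard library (statistics.NormalDist, Python 3.8 or later). The helper prob_above and the variable names are made up for this illustration. The infection example is treated as P(p-hat > 0.20) with no continuity correction, which is an assumption of this sketch rather than something the slides specify.

```python
from math import sqrt
from statistics import NormalDist


def prob_above(value, mean, sd):
    """Return P(Y > value) for Y ~ Normal(mean, sd)."""
    return 1.0 - NormalDist(mean, sd).cdf(value)


# One bottle: X ~ Normal(32.2, 0.3).
p_single = prob_above(32, 32.2, 0.3)

# Mean of a 4-pack: the standard error is sigma / sqrt(n) = 0.15.
p_pack = prob_above(32, 32.2, 0.3 / sqrt(4))

# Dean's claim: mu = 800, sigma = 100, n = 25, so the standard error is 20.
# Probability of a sample mean of 750 or less if the claim is true.
se_salary = 100 / sqrt(25)
p_dean = NormalDist(800, se_salary).cdf(750)

# Infections: p = 0.1, n = 100. Check the two conditions first.
p, n = 0.1, 100
conditions_ok = n * p >= 5 and n * (1 - p) >= 5
se_prop = sqrt(p * (1 - p) / n)          # standard error of the proportion
p_infect = prob_above(0.20, p, se_prop)  # no continuity correction (assumption)

print(f"P(X > 32), one bottle:     {p_single:.4f}")
print(f"P(X-bar > 32), 4-pack:     {p_pack:.4f}")
print(f"P(X-bar <= 750), n = 25:   {p_dean:.4f}")
print(f"np >= 5 and n(1-p) >= 5:   {conditions_ok}")
print(f"P(p-hat > 0.20), n = 100:  {p_infect:.4f}")
```

The results come out at roughly 0.75, 0.91, 0.006, and 0.0004. A sample mean of $750 would therefore be very unlikely if the dean's claim were true, and more than 20 infections in 100 operations would be very unlikely if p really is 0.1. Values read from a printed Z table can differ in the third decimal place because the table rounds Z to two decimals.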