Sampling Distributions
Martina Litschmannová
EA 538

Populations vs. Sample
- A population includes each element from the set of observations that can be made.
- A sample consists only of observations drawn from the population.
[Diagram: population, sampling, sample; Exploratory Data Analysis; Inferential Statistics]

Characteristic of a population vs. characteristic of a sample
- A measurable characteristic of a population, such as a mean or standard deviation, is called a parameter; a measurable characteristic of a sample is called a statistic.

Population | Sample
Expected value (mean): $E(X)$, resp. $\mu$ | Sample mean (average): $\bar{X}$
Median: $x_{0.5}$ | Sample median: $X_{0.5}$
Variance (dispersion): $D(X)$, resp. $\sigma^2$ | Sample variance: $S^2$
Std. deviation: $\sigma$ | Sample std. deviation: $S$
Probability: $\pi$ | Relative frequency: $p$

Sampling Distributions
- A sampling distribution is created by, as the name suggests, sampling.
- The method we will employ relies on the rules of probability and the laws of expected value and variance to derive the sampling distribution. For example, consider the roll of one and two dice.

The roll of one die
- A fair die is thrown infinitely many times, with the random variable $X$ = # of spots on any throw.
- The probability distribution of $X$ is:

$x_i$    | 1   | 2   | 3   | 4   | 5   | 6
$P(x_i)$ | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6

- The mean, variance and standard deviation are calculated as:
$E(X) = \sum_{i=1}^{6} x_i P(x_i) = 3.5$,
$E(X^2) = \sum_{i=1}^{6} x_i^2 P(x_i) = 15.17$,
$D(X) = E(X^2) - (E(X))^2 = 2.92$,
$\sigma = \sqrt{D(X)} = 1.71$

The roll of two dice
The Sampling Distribution of the Mean
- A sampling distribution is created by looking at all samples of size $n = 2$ (i.e. two dice) and their means.
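As a baseline for the two-dice samples that follow, the one-die moments ($E(X) = 3.5$, $E(X^2) \approx 15.17$, $D(X) \approx 2.92$, $\sigma \approx 1.71$) can be reproduced with a minimal Python sketch. The variable names are illustrative and not part of the slides; exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction
from math import sqrt

# Outcomes of one fair die; each face has probability 1/6.
faces = range(1, 7)
p = Fraction(1, 6)

mean = sum(x * p for x in faces)          # E(X)
mean_sq = sum(x * x * p for x in faces)   # E(X^2)
variance = mean_sq - mean ** 2            # D(X) = E(X^2) - (E(X))^2
std_dev = sqrt(variance)                  # sigma = sqrt(D(X))

print(float(mean), round(float(mean_sq), 2),
      round(float(variance), 2), round(std_dev, 2))
# prints: 3.5 15.17 2.92 1.71
```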
Sample means for all 36 ordered samples {first die, second die}:

First die \ Second die | 1   | 2   | 3   | 4   | 5   | 6
1                      | 1.0 | 1.5 | 2.0 | 2.5 | 3.0 | 3.5
2                      | 1.5 | 2.0 | 2.5 | 3.0 | 3.5 | 4.0
3                      | 2.0 | 2.5 | 3.0 | 3.5 | 4.0 | 4.5
4                      | 2.5 | 3.0 | 3.5 | 4.0 | 4.5 | 5.0
5                      | 3.0 | 3.5 | 4.0 | 4.5 | 5.0 | 5.5
6                      | 3.5 | 4.0 | 4.5 | 5.0 | 5.5 | 6.0

Sampling distribution of the mean:

$\bar{x}$    | 1.0  | 1.5  | 2.0  | 2.5  | 3.0  | 3.5  | 4.0  | 4.5  | 5.0  | 5.5  | 6.0
$P(\bar{x})$ | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36

[Figure: bar chart of probability $P(\bar{x})$ against the mean, from 1.0 to 6.0]

- While there are 36 possible samples of size 2, there are only 11 values for $\bar{X}$, and some (e.g. $\bar{x} = 3.5$) occur more frequently than others (e.g. $\bar{x} = 1$).

$E(\bar{X}) = \sum_{i=1}^{11} \bar{x}_i P(\bar{x}_i) = 3.5$,
$E(\bar{X}^2) = \sum_{i=1}^{11} \bar{x}_i^2 P(\bar{x}_i) = 13.71$,
$D(\bar{X}) = E(\bar{X}^2) - (E(\bar{X}))^2 = 1.46$,
$\sigma(\bar{X}) = \sqrt{D(\bar{X})} = 1.21$
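The two-dice sampling distribution above can be rebuilt by brute-force enumeration. The following is a minimal Python sketch, not the slides' own method; the names `samples`, `dist` and `die_var` are illustrative. It also compares $D(\bar{X})$ with half the variance of a single die.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 ordered samples of size n = 2 from a fair die, each with probability 1/36.
samples = list(product(range(1, 7), repeat=2))

# Count how often each sample mean occurs; Fraction keeps 1.5, 2.5, ... exact.
counts = Counter(Fraction(a + b, 2) for a, b in samples)
dist = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

mean = sum(m * p for m, p in dist.items())          # E(X-bar)
mean_sq = sum(m * m * p for m, p in dist.items())   # E(X-bar^2)
variance = mean_sq - mean ** 2                      # D(X-bar)

# Variance of a single die, for comparison with D(X-bar) = sigma^2 / 2.
die_var = sum(Fraction(x * x, 6) for x in range(1, 7)) - Fraction(7, 2) ** 2

print(len(dist), float(mean), round(float(mean_sq), 2), round(float(variance), 2))
print(variance == die_var / 2)
```

Under these assumptions the script reports 11 distinct means, and the final comparison confirms that the variance of the mean of two throws is exactly half the variance of one throw.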
- $X_i$ = # of spots on the $i$-th die, $E(X_1) = E(X_2) = \mu$, $D(X_1) = D(X_2) = \sigma^2$
- $E(\bar{X}) = E\left(\frac{1}{2}\sum_{i=1}^{2} X_i\right) = \frac{1}{2} \cdot 2\mu = \mu = 3.5$,
  $D(\bar{X}) = D\left(\frac{1}{2}\sum_{i=1}^{2} X_i\right) = \frac{1}{4} \cdot 2\sigma^2 = \frac{\sigma^2}{2} = 1.46$ (the two throws are independent),
  $\sigma(\bar{X}) = \frac{\sigma}{\sqrt{2}} = 1.21$

Compare
- $X_i$ = # of spots on the $i$-th die, $i = 1, 2$, $E(X_i) = \mu$, $D(X_i) = \sigma^2$
[Figure: two bar charts side by side. Left: Distribution of $X$, with $P(x) = 1/6$ for $x = 1, \dots, 6$. Right: Sampling Distribution of $\bar{X}$, with means from 1.0 to 6.0.]
Note that: $E(\bar{X}) = \mu$, $D(\bar{X}) = \frac{\sigma^2}{2}$

Generalize - Central Limit Theorem
- The sampling distribution of the mean of a random sample drawn from any population (with finite mean and variance) is approximately normal for a sufficiently large sample size.
  $E(\bar{X}) = \mu$, $D(\bar{X}) = \frac{\sigma^2}{n}$, $\sigma(\bar{X}) = \frac{\sigma}{\sqrt{n}}$
- The larger the sample size, the more closely the sampling distribution of $\bar{X}$ will resemble a normal distribution.
[Figure: densities $f(x)$ of the sample mean for $n = 1$, $n = 5$, $n = 10$, $n = 30$]

Central Limit Theorem
- $X_i$ ... random variables with the same distribution, $i = 1, \dots, n$, $n \to \infty$, $E(X_i) = \mu$, $D(X_i) = \sigma^2$
- Note that: $E(\bar{X}) = \mu$, $D(\bar{X}) = \frac{\sigma^2}{n}$, $\sigma(\bar{X}) = \frac{\sigma}{\sqrt{n}}$ ... standard error
[Figure: Sampling Distribution of $\bar{X}$]

Generalize - Central Limit Theorem
- The sampling distribution of $\bar{X}$ drawn from any population is approximately normal for a sufficiently large sample size:
  $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \Rightarrow \frac{\bar{X} - \mu}{\sigma}\sqrt{n} \sim N(0, 1)$
- In many practical situations, a sample size of 30 may be sufficiently large to allow us to use the normal distribution as an approximation for the sampling distribution of $\bar{X}$.
- Note: If $X$ is normal, $\bar{X}$ is normal. We don't need the Central Limit Theorem in this case.

1. The foreman of a bottling plant has observed that the amount of soda in each "32-ounce" bottle is actually a normally distributed random variable, with a mean of 32.2 ounces and a standard deviation of 0.3 ounce.
A) If a customer buys one bottle, what is the probability that the bottle will contain more than 32 ounces?
B) If a customer buys a carton of four bottles, what is the probability that the mean amount of the four bottles will be greater than 32 ounces?

Graphically Speaking
- What is the probability that one bottle will contain more than 32 ounces? $X \sim N(32.2;\ 0.09)$
- What is the probability that the mean of four bottles will exceed 32 oz? $\bar{X} \sim N\left(32.2;\ \frac{0.09}{4}\right)$

2. The probability distribution of 6-month incomes of account executives has mean $20,000 and standard deviation $5,000.
A) A single executive's income is $20,000. Can it be said that this executive's income exceeds 50% of all account executive incomes?
Answer: No. No information is given about the shape of the distribution of $X$, so we do not know the median of 6-month incomes.
B) $n = 64$ account executives are randomly selected. What is the probability that the sample mean exceeds $20,500?

3. A sample of size $n = 16$ is drawn from a normally distributed population with $E(X_i) = 20$ and $\sigma(X_i) = 8$. Find P(16 < …

4. Battery life $X \sim N(20, 10)$. Guarantee: avg. battery life in a case of 24 exceeds 16 hrs. Find the probability that a randomly selected case meets the guarantee.

5. Cans of salmon are supposed to have a net weight of 6 oz. The producer says that the net weight is a random variable with mean $\mu = 6.05$ oz and standard deviation $\sigma = 0.18$ oz. Suppose you take a random sample of 36 cans and calculate the sample mean weight to be 5.97 oz. Find the probability that the mean weight of the sample is less than or equal to 5.97 oz.
Since $P(\bar{X} < 5.97) = 0.0038$, either you observed a "rare" event (recall: 5.97 oz is 2.67 standard deviations of $\bar{X}$ below the mean) and the mean fill $E(X)$ is in fact 6.05 oz (the value claimed by the producer), or the true mean fill is less than 6.05 oz (the producer is lying).

Sampling Distribution of a Proportion
- The estimator of a population proportion $\pi$ of successes is the sample proportion $p$. That is, we count the number of successes in a sample and compute $p = \frac{X}{n}$.
- $X$ is the number of successes, $n$ is the sample size.

Normal Approximation to Binomial
- Binomial distribution with $n = 20$ and $\pi = 0.5$ with a normal approximation superimposed. For the number of successes $X$: $\mu = n\pi = 10$ and $\sigma^2 = n\pi(1 - \pi) = 5$. Equivalently, for the sample proportion $p$: mean $\pi = 0.5$ and variance $\frac{\pi(1 - \pi)}{n} = 0.0125$.
[Figure: binomial probabilities for $n = 20$, $\pi = 0.5$ with the normal density superimposed]

Normal Approximation to Binomial
- The normal approximation to the binomial works best when the number of experiments $n$ (sample size) is large and the probability of success $\pi$ is close to 0.5.
- For the approximation to provide good results, one condition should be met: $n > \frac{9}{\pi(1 - \pi)}$.

Sampling Distribution of a Sample Proportion
- Using the laws of expected value and variance, we can determine the mean, variance, and standard deviation of $p$:
  $E(p) = \pi$, $D(p) = \frac{\pi(1 - \pi)}{n}$, $\sigma(p) = \sqrt{\frac{\pi(1 - \pi)}{n}}$ (standard error of the proportion),
  $p \sim N\left(\pi, \frac{\pi(1 - \pi)}{n}\right)$
- Sample proportions can be standardized to a standard normal distribution using this formulation: $\frac{p - \pi}{\sqrt{\pi(1 - \pi)/n}} \sim N(0, 1)$.

6. Find the probability that of the next 120 births, no more than 40% will be boys. Assume equal probabilities for the births of boys and girls.

7. 12% of students at NCSU are left-handed. What is the probability that in a sample of 50 students, the sample proportion that are left-handed is less than 11%?

Sampling Distribution: Difference of two means
Assumption: Independent random samples are drawn from each of two normal populations.
- If this condition is met, the sampling distribution of the difference between the two sample means is normal.
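- Note: If the two populations are not both normally distributed, but the sample sizes are "large" (> 30), the distribution of $\bar{X}_1 - \bar{X}_2$ is approximately normal (Central Limit Theorem).

Sampling Distribution: Difference of two means
$\bar{X}_1 \sim N\left(\mu_1, \frac{\sigma_1^2}{n_1}\right)$, $\bar{X}_2 \sim N\left(\mu_2, \frac{\sigma_2^2}{n_2}\right)$ $\Rightarrow$
$E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2$, $D(\bar{X}_1 - \bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$,
$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2, \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$,
$\frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \sim N(0, 1)$
The denominator $\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$ is the standard error of the difference between two means.

Sampling Distribution: Difference of two proportions
Assumption (Central Limit Theorem): $n_1 > \frac{9}{\pi_1(1 - \pi_1)}$, $n_2 > \frac{9}{\pi_2(1 - \pi_2)}$
$p_1 \sim N\left(\pi_1, \frac{\pi_1(1 - \pi_1)}{n_1}\right)$, $p_2 \sim N\left(\pi_2, \frac{\pi_2(1 - \pi_2)}{n_2}\right)$ $\Rightarrow$
$E(p_1 - p_2) = \pi_1 - \pi_2$, $D(p_1 - p_2) = \frac{\pi_1(1 - \pi_1)}{n_1} + \frac{\pi_2(1 - \pi_2)}{n_2}$,
$p_1 - p_2 \sim N\left(\pi_1 - \pi_2, \frac{\pi_1(1 - \pi_1)}{n_1} + \frac{\pi_2(1 - \pi_2)}{n_2}\right)$,
$\frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\frac{\pi_1(1 - \pi_1)}{n_1} + \frac{\pi_2(1 - \pi_2)}{n_2}}} \sim N(0, 1)$
The denominator $\sqrt{\frac{\pi_1(1 - \pi_1)}{n_1} + \frac{\pi_2(1 - \pi_2)}{n_2}}$ is the standard error of the difference between two proportions.

Special Continuous Distributions

$\chi^2$ Distribution
If $Z_i \sim N(0; 1)$ for all $i = 1, \dots, n$ and the $Z_i$ are independent, then $X = \sum_{i=1}^{n} Z_i^2 \sim \chi^2_n$, where $n$ is the number of degrees of freedom.

Using the $\chi^2$ Distribution
$\frac{(n - 1) S^2}{\sigma^2} \sim \chi^2_{n-1}$

8. The Acme Battery Company has developed a new cell phone battery. On average, the battery lasts 60 minutes on a single charge. The standard deviation is 4 minutes. Suppose the manufacturing department runs a quality control test. They randomly select 7 batteries. What is the probability that the standard deviation of the selected batteries is greater than 6 minutes?

Student's t Distribution
$Z \sim N(0, 1)$, $X \sim \chi^2_n$, $Z$ and $X$ are independent variables $\Rightarrow$ if $T = \frac{Z}{\sqrt{X/n}}$, then $T$ has Student's t distribution with $n$ degrees of freedom, $T \sim t_n$.

Using Student's t Distribution
$\frac{\bar{X} - \mu}{S}\sqrt{n} \sim t_{n-1}$
The t distribution should not be used with small samples from populations that are not approximately normal.

9. Acme Corporation manufactures light bulbs. The CEO claims that an average Acme light bulb lasts 300 days. A researcher randomly selects 15 bulbs for testing.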
The sampled bulbs last an average of 290 days, with a standard deviation of 50 days. If the CEO's claim were true, what is the probability that 15 randomly selected bulbs would have an average life of no more than 290 days?

F Distribution
The f Statistic
- The f statistic, also known as an f value, is a random variable that has an F distribution.
Here are the steps required to compute an f statistic:
- Select a random sample of size $n_1$ from a normal population having a standard deviation equal to $\sigma_1$.
- Select an independent random sample of size $n_2$ from a normal population having a standard deviation equal to $\sigma_2$.
- The f statistic is the ratio of $s_1^2/\sigma_1^2$ and $s_2^2/\sigma_2^2$:
  $F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F_{n_1 - 1,\ n_2 - 1}$, where $n_1 - 1$ and $n_2 - 1$ are the degrees of freedom.

10. Suppose you randomly select 7 women from a population of women, and 12 men from a population of men. The table below shows the standard deviation in each sample and in each population.

Population | Population standard deviation | Sample standard deviation
Women      | 30                            | 35
Men        | 50                            | 45

Find the probability that the sample standard deviation of men is greater than twice the sample standard deviation of women.

Study materials:
- http://homel.vsb.cz/~bri10/Teaching/Bris%20Prob%20&%20Stat.pdf (p. 93 - p. 104)
- http://stattrek.com/tutorials/statistics-tutorial.aspx?Tutorial=Stat (Distributions: Continuous (Student's, $\chi^2$, F Distribution) + Estimation)
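The exercises that rely only on the normal distribution (1, 2B, 5, 6 and 7) can be checked numerically. The following is a minimal Python sketch using the standard library's `statistics.NormalDist`; the helper names `prob_mean_below` and `prob_prop_below` are hypothetical, and exercises 2B, 6 and 7 assume that the normal approximation applies. For exercise 5 it should agree with the value 0.0038 quoted above.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal N(0, 1)

def prob_mean_below(x, mu, sigma, n=1):
    """P(mean of n observations <= x), with standard error sigma / sqrt(n)."""
    return Z.cdf((x - mu) / (sigma / sqrt(n)))

def prob_prop_below(p, pi, n):
    """P(sample proportion <= p), with standard error sqrt(pi * (1 - pi) / n)."""
    return Z.cdf((p - pi) / sqrt(pi * (1 - pi) / n))

answers = {
    "1A": 1 - prob_mean_below(32, 32.2, 0.3),              # one bottle over 32 oz
    "1B": 1 - prob_mean_below(32, 32.2, 0.3, n=4),         # mean of four bottles over 32 oz
    "2B": 1 - prob_mean_below(20500, 20000, 5000, n=64),   # mean income over $20,500
    "5": prob_mean_below(5.97, 6.05, 0.18, n=36),          # mean can weight at most 5.97 oz
    "6": prob_prop_below(0.40, 0.5, 120),                  # at most 40% boys in 120 births
    "7": prob_prop_below(0.11, 0.12, 50),                  # under 11% left-handed in 50 students
}

for label, value in answers.items():
    print(f"Exercise {label}: {value:.4f}")
```

Exercise 7 falls short of the rule of thumb $n > \frac{9}{\pi(1 - \pi)}$ (about 85 for $\pi = 0.12$), so its normal-approximation value should be read as rough. Exercises 8, 9 and 10 need the $\chi^2$, t and F distributions, which the standard library does not provide.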