Download Sampling Distributions and the Central Limit

Sampling Distribution Models and the Central Limit Theorem

Transition from Data Analysis and Probability to Statistics

OBJECTIVES
At the conclusion of this unit you should be able to:
1) Derive the correct sampling distribution model when given the population parameters
2) Correctly apply the Central Limit Theorem to calculate probabilities associated with a sample proportion and sample mean

Probability: from population to sample (deduction)
Statistics: from sample to population (induction)

Sampling Distributions

Population parameter: a numerical descriptive measure of a population (for example, $\mu$ (a population mean) or $p$ (a population proportion)). The numerical value of a population parameter is usually not known.
Examples:
$\mu$ = mean height of all NCSU students
$p$ = proportion of Raleigh residents who favor stricter gun control laws

Sample statistic: a numerical descriptive measure calculated from sample data (e.g., $\bar{x}$, $s$, $\hat{p}$ (the sample proportion)).

Parameters; Statistics

In real life, parameters of populations are unknown and unknowable. For example, the mean height of US adult (18+) men is unknown and unknowable.
Rather than investigating the whole population, we take a sample, calculate a statistic related to the parameter of interest, and make an inference.
The sampling distribution of the statistic tells us how the value of the statistic varies from sample to sample.

DEFINITION: Sampling Distribution

The sampling distribution of a sample statistic calculated from a sample of n measurements is the probability distribution of the values taken by the statistic in all possible samples of size n taken from the same population. It is based on all possible samples of size n.

Constructing a Sampling Distribution

In some cases the sampling distribution can be determined exactly.
In other cases it must be approximated by using a computer to draw some of the possible samples of size n and drawing a histogram of the resulting values of the statistic.
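The computer approximation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is a yes/no population with success probability p, and the values p = 0.5, n = 4, the number of samples, the seed, and the helper names `simulate_sample_proportions` and `text_histogram` are all hypothetical choices made here, not part of the slides.

```python
import random
from collections import Counter


def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n from a yes/no population
    with success probability p; return the sample proportion of each."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs repeat
    proportions = []
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions


def text_histogram(values, width=40):
    """Return a crude text histogram: one line per distinct value."""
    counts = Counter(values)
    largest = max(counts.values())
    lines = []
    for value in sorted(counts):
        bar = "#" * round(width * counts[value] / largest)
        lines.append(f"{value:5.2f} | {bar} {counts[value]}")
    return "\n".join(lines)


# Illustrative (assumed) settings: p = 0.5, n = 4, 10,000 simulated samples.
proportions = simulate_sample_proportions(p=0.5, n=4, num_samples=10_000)
print(text_histogram(proportions))
print("average of the simulated proportions:",
      sum(proportions) / len(proportions))
```

The histogram only approximates the sampling distribution because it uses some, not all, of the possible samples; increasing `num_samples` tightens the approximation.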
Sampling Distribution Models of Sample Proportions

Example: sampling distribution of $\hat{p}$, the sample proportion

If a coin is fair, the probability of a head on any toss of the coin is p = 0.5 (p is the population parameter).
Imagine tossing this fair coin 4 times and calculating the proportion $\hat{p}$ of the 4 tosses that result in heads (note that $\hat{p} = x/4$, where x is the number of heads in 4 tosses).
Objective: determine the sampling distribution of $\hat{p}$, the proportion of heads in 4 tosses of a fair coin.

Possible values of $\hat{p}$: 0/4 = 0, 1/4 = .25, 2/4 = .50, 3/4 = .75, 4/4 = 1

There are $2^4 = 16$ equally likely possible outcomes (1 = head, 0 = tail):
(1,1,1,1) (1,1,1,0) (1,1,0,1) (1,0,1,1)
(0,1,1,1) (1,1,0,0) (1,0,1,0) (1,0,0,1)
(0,1,1,0) (0,1,0,1) (0,0,1,1) (1,0,0,0)
(0,1,0,0) (0,0,1,0) (0,0,0,1) (0,0,0,0)

Sampling distribution of $\hat{p}$

phat              P(phat)
0.0   (0 heads)   1/16 = 0.0625
0.25  (1 head)    4/16 = 0.25
0.50  (2 heads)   6/16 = 0.375
0.75  (3 heads)   4/16 = 0.25
1.0   (4 heads)   1/16 = 0.0625

Sampling distribution of $\hat{p}$ (cont.)

$E(\hat{p}) = 0(0.0625) + 0.25(0.25) + 0.50(0.375) + 0.75(0.25) + 1.0(0.0625) = 0.5 = p$ (the probability of heads)

$Var(\hat{p}) = (0 - 0.5)^2(0.0625) + (0.25 - 0.5)^2(0.25) + (0.5 - 0.5)^2(0.375) + (0.75 - 0.5)^2(0.25) + (1 - 0.5)^2(0.0625) = 0.0625$

$SD(\hat{p}) = \sqrt{Var(\hat{p})} = \sqrt{0.0625} = 0.25$

Note that $SD(\hat{p}) = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{.5 \times .5}{4}} = \frac{.5}{2} = 0.25$

Expected Value and Standard Deviation of the Sampling Distribution of $\hat{p}$

$E(\hat{p}) = p$
$SD(\hat{p}) = \sqrt{\frac{p(1-p)}{n}}$
where p is the "success" probability in the sampled population and n is the sample size.

Shape of Sampling Distribution of $\hat{p}$

The sampling distribution of $\hat{p}$ is approximately normal when the sample size n is large enough.
"n large enough" means $np \ge 10$ and $n(1-p) \ge 10$.

Shape of Sampling Distribution of $\hat{p}$

[Figure: population distribution with p = .65 (bar of height 0.35 at 0 and 0.65 at 1), shown with the sampling distribution of $\hat{p}$ for samples of size n]

Example

8% of the American Caucasian male population is color blind.
Use a computer to simulate random samples of size n = 1000.

[Figure: histogram of phat's from simulated samples (2000 independent samples, each of size n = 1000 men); horizontal axis: phat, vertical axis: # of samples]

The sampling distribution model for a sample proportion $\hat{p}$

Provided that the sampled values are independent and the sample size n is large enough, the sampling distribution of $\hat{p}$ is modeled by a normal distribution with $E(\hat{p}) = p$ and standard deviation $SD(\hat{p}) = \sqrt{\frac{pq}{n}}$, that is
$\hat{p} \sim N\left(p, \sqrt{\frac{pq}{n}}\right)$
where q = 1 - p and where "n large enough" means $np \ge 10$ and $nq \ge 10$.
The Central Limit Theorem will be a formal statement of this fact.

Example: binge drinking by college students

Study by Harvard School of Public Health: 44% of college students binge drink.
At a particular college 244 students were surveyed; 36% admitted to binge drinking in the past week.
Assume the value 0.44 given in the Harvard study is the proportion p of college students that binge drink; that is, 0.44 is the population proportion p.
Compute the probability that in a sample of 244 students, 36% or less have engaged in binge drinking.

Example: binge drinking by college students (cont.)

Let $\hat{p}$ be the proportion in a sample of 244 that engage in binge drinking.
We want to compute $P(\hat{p} \le .36)$.
$E(\hat{p}) = p = .44$; $SD(\hat{p}) = \sqrt{\frac{pq}{n}} = \sqrt{\frac{.44 \times .56}{244}} = .032$
Since np = 244(.44) = 107.36 and nq = 244(.56) = 136.64 are both greater than 10, we can model the sampling distribution of $\hat{p}$ with a normal distribution, so …

Example: binge drinking by college students (cont.)
$\hat{p} \sim N(.44, .032)$

So $P(\hat{p} \le .36) = P\left(\frac{\hat{p} - .44}{.032} \le \frac{.36 - .44}{.032}\right) = P(z \le -2.5) = .0062$

Example: snapchat by college students

A recent scientifically valid survey: 77% of college students use snapchat.
1136 college students were surveyed; 75% reported that they use snapchat.
Assume the value 0.77 given in the survey is the proportion p of college students that use snapchat; that is, 0.77 is the population proportion p.
Compute the probability that in a sample of 1136 students, 75% or less use snapchat.

Example: snapchat by college students (cont.)

Let $\hat{p}$ be the proportion in a sample of 1136 that use snapchat.
We want to compute $P(\hat{p} \le .75)$.
$E(\hat{p}) = p = .77$; $SD(\hat{p}) = \sqrt{\frac{pq}{n}} = \sqrt{\frac{.77 \times .23}{1136}} = .0125$
Since np = 1136(.77) = 874.72 and nq = 1136(.23) = 261.28 are both greater than 10, we can model the sampling distribution of $\hat{p}$ with a normal distribution, so …

Example: snapchat by college students (cont.)

$\hat{p} \sim N(.77, .0125)$

So $P(\hat{p} \le .75) = P\left(\frac{\hat{p} - .77}{.0125} \le \frac{.75 - .77}{.0125}\right) = P(z \le -1.6) = .0548$

Sampling Distribution Models of Sample Means

Another Population Parameter of Frequent Interest: the Population Mean µ

To estimate the unknown value of µ, the sample mean $\bar{x}$ is often used.
We need to examine the sampling distribution of the sample mean $\bar{x}$ (the probability distribution of all possible values of $\bar{x}$ based on a sample of size n).

Example

Professor Stickler has a large statistics class of over 300 students. He asked them the ages of their cars and obtained the following probability distribution:

x      2     3     4     5     6     7     8
p(x)   1/14  1/14  2/14  2/14  2/14  3/14  3/14

A simple random sample (SRS) of size n = 2 is to be drawn from the population.
Find the sampling distribution of the sample mean $\bar{x}$ for samples of size n = 2.
Solution

7 possible ages (ages 2 through 8)
Total of $7^2 = 49$ possible samples of size 2
All 49 possible samples, with the corresponding sample means and probabilities, are listed below.

All 49 possible samples of size n = 2

Population: ages of cars and their distribution

x      2     3     4     5     6     7     8
p(x)   1/14  1/14  2/14  2/14  2/14  3/14  3/14

Sample  xbar  Prob     Sample  xbar  Prob     Sample  xbar  Prob
2,2     2     1/196    6,8     7     6/196    5,3     4     2/196
2,4     3     2/196    6,5     5.5   4/196    5,7     6     6/196
2,6     4     2/196    6,3     4.5   2/196    3,2     2.5   1/196
2,8     5     3/196    6,7     6.5   6/196    3,4     3.5   2/196
2,5     3.5   2/196    8,2     5     3/196    3,6     4.5   2/196
2,3     2.5   1/196    8,4     6     6/196    3,8     5.5   3/196
2,7     4.5   3/196    8,6     7     6/196    3,5     4     2/196
4,2     3     2/196    8,8     8     9/196    3,3     3     1/196
4,4     4     4/196    8,5     6.5   6/196    3,7     5     3/196
4,6     5     4/196    8,3     5.5   3/196    7,2     4.5   3/196
4,8     6     6/196    8,7     7.5   9/196    7,4     5.5   6/196
4,5     4.5   4/196    5,2     3.5   2/196    7,6     6.5   6/196
4,3     3.5   2/196    5,4     4.5   4/196    7,8     7.5   9/196
4,7     5.5   6/196    5,6     5.5   4/196    7,5     6     6/196
6,2     4     2/196    5,8     6.5   6/196    7,3     5     3/196
6,4     5     4/196    5,5     5     4/196    7,7     7     9/196
6,6     6     4/196

Probability Distribution of the Sample Mean Age of 2 Cars

Solution (cont.)

Adding the probabilities of the samples that share the same mean gives the probability distribution of $\bar{x}$:

xbar     2      2.5    3      3.5    4       4.5     5       5.5     6       6.5     7       7.5     8
p(xbar)  1/196  2/196  5/196  8/196  12/196  18/196  24/196  26/196  28/196  24/196  21/196  18/196  9/196

This is the sampling distribution of $\bar{x}$ because it specifies the probability associated with each possible value of $\bar{x}$.

From the sampling distribution above,
$P(4 \le \bar{x} \le 6) = p(4) + p(4.5) + p(5) + p(5.5) + p(6) = 12/196 + 18/196 + 24/196 + 26/196 + 28/196 = 108/196$

Expected Value and Standard Deviation of the Sampling Distribution of $\bar{x}$

Example (cont.)

Population mean:
$E(X) = 2(1/14) + 3(1/14) + 4(2/14) + \dots + 8(3/14) = 5.714$, so $\mu = E(X) = 5.714$

Mean of the sampling distribution of $\bar{x}$:
$E(\bar{X}) = 2(1/196) + 2.5(2/196) + 3(5/196) + 3.5(8/196) + 4(12/196) + 4.5(18/196) + 5(24/196) + 5.5(26/196) + 6(28/196) + 6.5(24/196) + 7(21/196) + 7.5(18/196) + 8(9/196) = 5.714$

Example (cont.)
Population from which the sample is selected:
$\mu = E(X) = 2(1/14) + 3(1/14) + 4(2/14) + \dots + 8(3/14) = 5.714$
$\sigma^2 = Var(X) = 3.4898$
$\sigma = SD(X) = \sqrt{Var(X)} = \sqrt{3.4898} = 1.8681$

Sampling distribution of $\bar{X}$:
$E(\bar{X}) = 2(1/196) + 2.5(2/196) + \dots + 8(9/196) = 5.714$
$Var(\bar{X}) = \frac{Var(X)}{2} = \frac{3.4898}{2} = 1.7449$
$SD(\bar{X}) = \sqrt{Var(\bar{X})} = \sqrt{\frac{Var(X)}{2}} = \frac{SD(X)}{\sqrt{2}} = \frac{1.8681}{\sqrt{2}} = 1.3209$

IMPORTANT

Numerical Summaries of the Sampling Distribution of $\bar{X}$ are Related to the Numerical Summaries of the Population X from Which the Sample is Selected

$E(\bar{X}) = E(X)$ (the mean of the sampling distribution of $\bar{X}$ is always equal to the mean of the population from which the sample is selected)

$Var(\bar{X}) = \frac{Var(X)}{n}$

$SD(\bar{X}) = \sqrt{Var(\bar{X})} = \sqrt{\frac{Var(X)}{n}} = \frac{SD(X)}{\sqrt{n}}$
(the standard deviation of the sampling distribution of $\bar{X}$ is always equal to the standard deviation of the population from which the sample is selected, divided by the square root of the sample size n)

Sampling Distribution of the Sample Mean $\bar{X}$: Example

A fair 6-sided die is thrown; let X represent the number of dots showing on the upper face.
The probability distribution of X is

x      1    2    3    4    5    6
p(x)   1/6  1/6  1/6  1/6  1/6  1/6

Population mean: $\mu = E(X) = 1(1/6) + 2(1/6) + 3(1/6) + \dots = 3.5$
Population variance: $\sigma^2 = V(X) = (1 - 3.5)^2(1/6) + (2 - 3.5)^2(1/6) + \dots = 2.92$

Suppose we want to estimate $\mu$ from the mean $\bar{x}$ of a sample of size n = 2.
What is the sampling distribution of $\bar{x}$ in this situation?
No.  Sample  Mean     No.  Sample  Mean     No.  Sample  Mean
1    1,1     1        13   3,1     2        25   5,1     3
2    1,2     1.5      14   3,2     2.5      26   5,2     3.5
3    1,3     2        15   3,3     3        27   5,3     4
4    1,4     2.5      16   3,4     3.5      28   5,4     4.5
5    1,5     3        17   3,5     4        29   5,5     5
6    1,6     3.5      18   3,6     4.5      30   5,6     5.5
7    2,1     1.5      19   4,1     2.5      31   6,1     3.5
8    2,2     2        20   4,2     3        32   6,2     4
9    2,3     2.5      21   4,3     3.5      33   6,3     4.5
10   2,4     3        22   4,4     4        34   6,4     5
11   2,5     3.5      23   4,5     4.5      35   6,5     5.5
12   2,6     4        24   4,6     5        36   6,6     6

Note: $E(\bar{X}) = E(X)$ and $Var(\bar{X}) = \frac{Var(X)}{2}$

$E(\bar{x}) = 1.0(1/36) + 1.5(2/36) + \dots = 3.5$
$V(\bar{X}) = (1.0 - 3.5)^2(1/36) + (1.5 - 3.5)^2(2/36) + \dots = 1.46$

[Figure: histogram of the sampling distribution of $\bar{x}$ for n = 2; horizontal axis: $\bar{x}$ from 1.0 to 6.0 in steps of 0.5, vertical axis: probability from 1/36 to 6/36]

n = 5:  $E(\bar{X}) = 3.5$, $Var(\bar{X}) = .5833$ ($= Var(X)/5$)
n = 10: $E(\bar{X}) = 3.5$, $Var(\bar{X}) = .2917$ ($= Var(X)/10$)
n = 25: $E(\bar{X}) = 3.5$, $Var(\bar{X}) = .1167$ ($= Var(X)/25$)

Notice that $Var(\bar{X})$ is smaller than Var(X). The larger the sample size, the smaller $Var(\bar{X})$ is. Therefore, $\bar{x}$ tends to fall closer to $\mu$ as the sample size increases.

The variance of the sample mean is smaller than the variance of the population.

[Figure: a population with values 1, 2, 3; samples of two observations are taken, giving sample means 1.5, 2 and 2.5. Compare the variability of the population to the variability of the sample mean.]

Also,
Expected value of the population = (1 + 2 + 3)/3 = 2
Expected value of the sample mean = (1.5 + 2 + 2.5)/3 = 2

Properties of the Sampling Distribution of $\bar{x}$

1.
$E(\bar{x}) = \mu$ (the expected value of the sampling distribution of $\bar{x}$ equals the expected value $\mu$ of the sampled population)

2. $SD(\bar{x}) = \frac{SD(x)}{\sqrt{n}} = \frac{\sigma}{\sqrt{n}}$, where $\sigma$ is the standard deviation of the population from which the sample is taken and n is the sample size.

[Figure: unbiased versus biased estimators. For an unbiased estimator the central tendency of the sampling distribution is down the center, at µ; for a biased estimator it is off to one side of µ. Source: BUS 350, Topic 6.1, Handout 6.1]

Consequences

1. $E(\bar{x}) = \mu$. This is why we use $\bar{x}$ to estimate an unknown population mean $\mu$. The sampling distribution of $\bar{x}$ is "centered" at the parameter we are trying to estimate.
2. $SD(\bar{x}) = \frac{SD(x)}{\sqrt{n}}$; the standard deviation of $\bar{x}$ is smaller than SD(x), the standard deviation of the population from which the sample is taken. The values of $\bar{x}$ will cluster tightly around $\mu$ when n is large.

A Billion Dollar Mistake

"Conventional" wisdom: smaller schools are better than larger schools.
Late 90's: Gates Foundation, Annenberg Foundation, Carnegie Foundation.
Among the 50 top-scoring Pennsylvania elementary schools, 6 (12%) were from the smallest 3% of the schools.
But …, they didn't notice …
Among the 50 lowest-scoring Pennsylvania elementary schools, 9 (18%) were from the smallest 3% of the schools.

A Billion Dollar Mistake (cont.)

Smaller schools have (by definition) smaller n's.
When n is small, $SD(\bar{x}) = \frac{SD(x)}{\sqrt{n}}$ is larger.
That is, the sampling distributions of small-school mean scores have larger SD's, so small schools are over-represented among both the highest and the lowest mean scores.

We Know More!

We know 2 parameters of the sampling distribution of $\bar{x}$:
$E(\bar{x}) = \mu$
$SD(\bar{x}) = \frac{SD(x)}{\sqrt{n}}$

The Central Limit Theorem tells us about the shape of the distribution of $\bar{x}$ when the sample size n is sufficiently large.
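The two facts about the center and spread of the sampling distribution of the sample mean can be checked by brute force. The sketch below enumerates every possible sample of size n = 2 from the car-age population of Professor Stickler's example and uses exact fractions; the helper names `sampling_distribution_of_mean` and `mean_and_variance` are hypothetical, and the draws are assumed to be independent, as in the 49-sample enumeration.

```python
from fractions import Fraction
from itertools import product


def sampling_distribution_of_mean(population, n):
    """population maps each value to its probability.
    Returns a dict mapping each possible sample mean to its probability,
    assuming the n draws are independent."""
    dist = {}
    for sample in product(population.items(), repeat=n):
        xbar = Fraction(sum(value for value, _ in sample), n)
        prob = Fraction(1)
        for _, p in sample:
            prob *= p
        dist[xbar] = dist.get(xbar, Fraction(0)) + prob
    return dist


def mean_and_variance(dist):
    """Mean and variance of a discrete distribution given as value -> prob."""
    mean = sum(x * p for x, p in dist.items())
    variance = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, variance


# Car-age population from the example: ages 2..8 with weights out of 14.
weights = {2: 1, 3: 1, 4: 2, 5: 2, 6: 2, 7: 3, 8: 3}
car_ages = {age: Fraction(w, 14) for age, w in weights.items()}

pop_mean, pop_var = mean_and_variance(car_ages)
xbar_dist = sampling_distribution_of_mean(car_ages, n=2)
xbar_mean, xbar_var = mean_and_variance(xbar_dist)

print("E(X) =", float(pop_mean), " Var(X) =", float(pop_var))
print("E(Xbar) =", float(xbar_mean), " Var(Xbar) =", float(xbar_var))
print("Var(X)/2 =", float(pop_var / 2))
for xbar in sorted(xbar_dist):
    # Express each probability over the common denominator 196.
    print(float(xbar), f"{xbar_dist[xbar] * 196}/196")
```

Working in exact fractions avoids rounding questions: the mean of the enumerated distribution equals the population mean, and its variance equals the population variance divided by n, exactly as stated above.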
THE CENTRAL LIMIT THEOREM

The "World is Normal" Theorem

But first, … Sampling Distribution of $\bar{x}$: Normally Distributed Population

[Figure: population distribution $N(\mu, \sigma)$ and, for n = 10, the sampling distribution of $\bar{x}$, $N(\mu, \sigma/\sqrt{10})$]

Normal Populations

Important Fact: If the population is normally distributed, then the sampling distribution of $\bar{x}$ is normally distributed for any sample size n. The previous slide illustrates this for n = 10.

Non-normal Populations

What can we say about the shape of the sampling distribution of $\bar{x}$ when the population from which the sample is selected is not normal?

[Figure: frequency histogram of baseball salaries ($1,000's), a clearly non-normal population]

The Central Limit Theorem (for the sample mean $\bar{x}$)

If a random sample of n observations is selected from a population (any population), then when n is sufficiently large, the sampling distribution of $\bar{x}$ will be approximately normal.
(The larger the sample size, the better the normal approximation to the sampling distribution of $\bar{x}$ will be.)

The Importance of the Central Limit Theorem

When we select simple random samples of size n, the sample means $\bar{x}$ will vary from sample to sample. We can model the distribution of these sample means with a probability model that is …
$N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$

How Large Should n Be?

For the purpose of applying the Central Limit Theorem, we will consider a sample size to be large when n > 30.

Even if the population from which the sample is selected looks like this …

[Figure: frequency histogram of baseball salaries ($1,000's)]

… the Central Limit Theorem tells us that a good model for the sampling distribution of the sample mean $\bar{x}$ is the normal model $N(\mu, \sigma/\sqrt{n})$.

Summary

Population: mean $\mu$; standard deviation $\sigma$; shape of population distribution is unknown; value of $\mu$ is unknown; select a random sample of size n.
Sampling distribution of $\bar{x}$: mean $\mu$; standard deviation $\sigma/\sqrt{n}$; always true!
By the Central Limit Theorem: the shape of the sampling distribution is approximately normal, that is, $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$.

The Central Limit Theorem (for the sample proportion $\hat{p}$)

If x "successes" occur in a random sample of n observations selected from a population (any population), then when n is sufficiently large, the sampling distribution of $\hat{p} = x/n$ will be approximately normal.
(The larger the sample size, the better the normal approximation to the sampling distribution of $\hat{p}$ will be.)

The Importance of the Central Limit Theorem

When we select simple random samples of size n from a population with "success" probability p and observe x "successes", the sample proportions $\hat{p} = x/n$ will vary from sample to sample. We can model the distribution of these sample proportions with a probability model that is …
$N\left(p, \sqrt{\frac{p(1-p)}{n}}\right)$

How Large Should n Be?

For the purpose of applying the Central Limit Theorem, we will consider a sample size n to be large when $np \ge 10$ and $n(1-p) \ge 10$.

If the population from which the sample is selected looks like this …

[Figure: population with "success" proportion p: bar of height 1-p at 0 and bar of height p at 1]

… the Central Limit Theorem tells us that a good model for the sampling distribution of the sample proportion $\hat{p} = x/n$ is the normal model $N\left(p, \sqrt{p(1-p)/n}\right)$.

Population Parameters and Sample Statistics

Population parameter                                            Value     Sample statistic used to estimate
p   proportion of population with a certain characteristic      Unknown   phat
µ   mean value of a population variable                         Unknown   xbar

The value of a population parameter is a fixed number; it is NOT random, and its value is not known.
The value of a sample statistic is calculated from sample data.
The value of a sample statistic will vary from sample to sample (sampling distributions).

Example

A random sample of n = 64 observations is drawn from a population with mean $\mu = 15$ and standard deviation $\sigma = 4$.

a. $E(\bar{X}) = \mu = 15$; $SD(\bar{X}) = \frac{SD(X)}{\sqrt{n}} = \frac{4}{8} = 0.5$

b. The shape of the sampling distribution model for $\bar{x}$ is approx.
normal (by the CLT) with mean $E(\bar{X}) = 15$ and $SD(\bar{X}) = 0.5$. (The answer depends on the sample size n, since $SD(\bar{X}) = \frac{SD(X)}{\sqrt{n}} = \frac{4}{\sqrt{64}} = \frac{4}{8} = 0.5$.)

Example (cont.)

c. $\bar{x} = 15.5$; $z = \frac{\bar{x} - \mu}{SD(\bar{X})} = \frac{15.5 - 15}{.5} = \frac{.5}{.5} = 1$
This means that $\bar{x} = 15.5$ is one standard deviation above the mean $\mu = E(\bar{X}) = 15$.

Example 2

The probability distribution of 6-month incomes of account executives has mean $20,000 and standard deviation $5,000.

a) A single executive's income is $20,000. Can it be said that this executive's income exceeds 50% of all account executive incomes?
ANSWER: No. P(X < $20,000) = ? No information is given about the shape of the distribution of X; we do not know the median of 6-month incomes.

Example 2 (cont.)

b) n = 64 account executives are randomly selected. What is the probability that the sample mean exceeds $20,500?
ANSWER:
E(X) = $20,000, SD(X) = $5,000
$E(\bar{X}) = \$20{,}000$, $SD(\bar{X}) = \frac{SD(X)}{\sqrt{n}} = \frac{5{,}000}{\sqrt{64}} = 625$
By the CLT, $\bar{X} \sim N(20{,}000, 625)$
$P(\bar{X} > 20{,}500) = P\left(\frac{\bar{X} - 20{,}000}{625} > \frac{20{,}500 - 20{,}000}{625}\right) = P(z > .8) = 1 - .7881 = .2119$

Example 3

A sample of size n = 16 is drawn from a normally distributed population with E(X) = 20 and SD(X) = 8.
$X \sim N(20, 8)$; $\bar{X} \sim N\left(20, \frac{8}{\sqrt{16}}\right) = N(20, 2)$

a) $P(\bar{X} > 24) = P\left(\frac{\bar{X} - 20}{2} > \frac{24 - 20}{2}\right) = P(z > 2) = 1 - .9772 = .0228$

b) $P(16 \le \bar{X} \le 24) = P\left(\frac{16 - 20}{2} \le z \le \frac{24 - 20}{2}\right) = P(-2 \le z \le 2) = .9772 - .0228 = .9544$

Example 3 (cont.)

c. Do we need the Central Limit Theorem to solve part a or part b?
NO. We are given that the population is normal, so the sampling distribution of the mean will also be normal for any sample size n. The CLT is not needed.

Example 4

Battery life $X \sim N(20, 10)$. Guarantee: the average battery life in a case of 24 exceeds 16 hrs. Find the probability that a randomly selected case meets the guarantee.
$E(\bar{x}) = 20$; $SD(\bar{x}) = \frac{10}{\sqrt{24}} = 2.04$; $\bar{X} \sim N(20, 2.04)$
$P(\bar{X} > 16) = P\left(\frac{\bar{X} - 20}{2.04} > \frac{16 - 20}{2.04}\right) = P(z > -1.96) = 1 - .0250 = .9750$

Example 5

Cans of salmon are supposed to have a net weight of 6 oz.
The canner says that the net weight is a random variable with mean $\mu = 6.05$ oz. and standard deviation $\sigma = .18$ oz.
Suppose you take a random sample of 36 cans and calculate the sample mean weight to be 5.97 oz.
Find the probability that the mean weight of the sample is less than or equal to 5.97 oz.

Population X: amount of salmon in a can
E(x) = 6.05 oz, SD(x) = .18 oz
$\bar{X}$ sampling distribution: $E(\bar{x}) = 6.05$, $SD(\bar{x}) = .18/6 = .03$
By the CLT, the sampling distribution of $\bar{X}$ is approx. normal.
$P(\bar{X} \le 5.97) = P(z \le [5.97 - 6.05]/.03) = P(z \le -.08/.03) = P(z \le -2.67) = .0038$

How could you use this answer?

Suppose you work for a "consumer watchdog" group.
If you sampled the weights of 36 cans and obtained a sample mean $\bar{x} \le 5.97$ oz., what would you think?
Since $P(\bar{x} \le 5.97) = .0038$, either
you observed a "rare" event (recall: 5.97 oz is 2.67 standard deviations below the mean) and the mean fill E(x) is in fact 6.05 oz. (the value claimed by the canner),
or the true mean fill is less than 6.05 oz. (the canner is lying).

Example 6

X: weekly income. E(X) = 1050, SD(X) = 100
n = 64; $\bar{X}$ sampling distribution: $E(\bar{X}) = 1050$, $SD(\bar{X}) = 100/8 = 12.5$
$P(\bar{X} \le 1022) = P(z \le [1022 - 1050]/12.5) = P(z \le -28/12.5) = P(z \le -2.24) = .0125$
Suspicious of the claim that the average is $1050; the evidence is that the average income is less.

Example 7

12% of students at NCSU are left-handed. What is the probability that in a sample of 100 students, the sample proportion that are left-handed is less than 11%?
$E(\hat{p}) = p = .12$; $SD(\hat{p}) = \sqrt{\frac{.12 \times .88}{100}} = .032$
$np = 100(.12) = 12 \ge 10$; $n(1-p) = 100(.88) = 88 \ge 10$
So by the CLT, $\hat{p} \sim N(.12, .032)$

Example 7 (cont.)

$P(\hat{p} < .11) = P\left(\frac{\hat{p} - .12}{.032} < \frac{.11 - .12}{.032}\right) = P(z < -.31) = .3783$

[Figure: normal curve for $\hat{p}$ with the area to the left of $\hat{p} = .11$ (z = -.31) shaded; $P(\hat{p} < .11) = P(z < -.31) = .3783$]
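The table lookups in the worked examples can be cross-checked numerically. The following is a minimal sketch, assuming only the Python standard library; the function names (`normal_cdf`, `sd_of_sample_proportion`, `sd_of_sample_mean`, `prob_at_most`, `large_enough_for_proportion`) are hypothetical helpers introduced here, and the standard normal CDF is computed from `math.erf` rather than read from a z-table.

```python
from math import erf, sqrt


def normal_cdf(z):
    """P(Z <= z) for a standard normal Z, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def sd_of_sample_proportion(p, n):
    """SD(phat) = sqrt(p(1 - p) / n)."""
    return sqrt(p * (1.0 - p) / n)


def sd_of_sample_mean(sigma, n):
    """SD(xbar) = sigma / sqrt(n)."""
    return sigma / sqrt(n)


def prob_at_most(cutoff, center, sd):
    """P(statistic <= cutoff) under the normal model N(center, sd)."""
    return normal_cdf((cutoff - center) / sd)


def large_enough_for_proportion(p, n):
    """The np >= 10 and n(1 - p) >= 10 rule of thumb used in the slides."""
    return n * p >= 10 and n * (1 - p) >= 10


# Example 5: salmon cans, mu = 6.05, sigma = .18, n = 36, cutoff 5.97.
salmon = prob_at_most(5.97, 6.05, sd_of_sample_mean(0.18, 36))

# Example 2b: incomes, P(sample mean > 20,500) with n = 64.
incomes = 1.0 - prob_at_most(20_500, 20_000, sd_of_sample_mean(5_000, 64))

# Example 7: left-handed students, p = .12, n = 100, cutoff .11.
assert large_enough_for_proportion(0.12, 100)
left_handed = prob_at_most(0.11, 0.12, sd_of_sample_proportion(0.12, 100))

print(f"Example 5:  {salmon:.4f}")
print(f"Example 2b: {incomes:.4f}")
print(f"Example 7:  {left_handed:.4f}")
```

Because the slides round the standard deviation and the z-score before using the normal table, the unrounded values printed here can differ slightly from the table-based answers.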
