Sampling Distribution of the Mean: Central Limit Theorem

Given a population with mean μ and variance σ², the sampling distribution of the mean will have:
- a mean: μ_x̄ = μ
- a variance: σ²_x̄ = σ²/N
- a standard error: σ_x̄ = σ/√N

As N increases, the shape of the sampling distribution becomes normal, whatever the shape of the population.

Testing Hypotheses When μ and σ Are Known

Remember: we can test a hypothesis concerning a population and a single score with
  z = (x − μ)/σ,
obtain p(z), and use the z table. We will continue the same logic with a sample mean.

Example. Behavior Problem Scores of 10-year-olds: μ = 50, σ = 10. A sample of N = 5 ten-year-olds under stress gives x̄ = 56.
  H0: μ = 50    H1: μ ≠ 50
Because we know μ and σ, we can use the Central Limit Theorem to obtain the sampling distribution when H0 is true. That sampling distribution has
  μ_x̄ = 50,  σ²_x̄ = σ²/N = 10²/5 = 20,  σ_x̄ = √20 = 4.47  (the standard error).
We can find areas under the distribution by referring to the z table. We need to know p(x̄ ≥ 56). The only change from the single-score z is that the standard error replaces σ:
  z = (x̄ − μ)/σ_x̄ = (x̄ − μ)/(σ/√N)
The formula changes because we are dealing with a distribution of means, NOT of individual scores. With our data:
  z = (56 − 50)/4.47 = 6/4.47 = 1.34
From the z table, p(z ≥ 1.34) = 0.0901. Because we want a two-tailed test, we double this: (2)(0.0901) = 0.1802. Since 0.1802 > 0.05, we do NOT reject H0 (equivalently, the one-tailed 0.0901 > 0.025).

One-Sample t Test

For a population with μ known but σ² unknown, we must estimate σ² with S². Because we use S, we can no longer declare the answer to be a z; now it is a t. Why?
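The z-test example above can be checked numerically. This is a minimal sketch using only Python's standard library; the normal CDF is built from `math.erf` rather than a z table, so the p-value differs from the table value (0.0901 at z = 1.34) only by rounding.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, n, xbar = 50, 10, 5, 56    # population parameters and sample mean from the example

se = sigma / math.sqrt(n)             # standard error of the mean: 10/sqrt(5) ~ 4.47
z = (xbar - mu) / se                  # ~ 1.34
p_one_tail = 1.0 - normal_cdf(z)      # ~ 0.090 (z table: 0.0901)
p_two_tail = 2.0 * p_one_tail         # ~ 0.18, greater than 0.05, so H0 is not rejected

print(f"se = {se:.2f}, z = {z:.2f}, two-tailed p = {p_two_tail:.3f}")
```

Since the two-tailed p exceeds α = 0.05, the code reaches the same decision as the text: do not reject H0.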
Sampling Distribution of t

- S² is an unbiased estimator of σ².
- The problem is the shape of the sampling distribution of S²: it is positively skewed, thus S² is more likely to UNDERESTIMATE σ² (especially with small N), and thus t is likely to be larger than z (S² is in the denominator).

The t statistic substitutes S for σ in the z formula:
  z = (x̄ − μ)/σ_x̄ = (x̄ − μ)/(σ/√N)
becomes
  t = (x̄ − μ)/S_x̄ = (x̄ − μ)/(S/√N)
To treat t as a z would give us too many significant results.

Student's t Distribution

("Student" was W. S. Gosset of the Guinness Brewing Company.) When we use S², we switch from the z table to the t table. Unlike z, the t distribution is a function of df (degrees of freedom); as N → ∞, t → z.

Degrees of Freedom

For one-sample cases, df = N − 1. One df is lost because we used x̄ (the sample mean) to calculate S²: since Σ(x − x̄) = 0, all the x values can vary save for one.

Example: One Sample, σ Unknown

Effect of statistics tutorials. Over the last 100 years (no tutorials): μ = 76.0. This year (tutorials): x̄ = 79.3, N = 20, S = 6.4.
  H0: μ = 76    H1: μ ≠ 76
  t = (sample mean − population mean)/(standard error) = (x̄ − μ)/S_x̄, where S_x̄ = S/√N
  t = (79.3 − 76)/(6.4/√20) = 3.3/1.43 = 2.31

The t Table

The t table does not give the area (p) above or below a value of t; it gives the t values that cut off certain critical areas, e.g., α = 0.05. t is also defined for each df. With N = 20, df = N − 1 = 20 − 1 = 19. From the table, t.05(19) = 2.093, the critical value. Since 2.31 > 2.093, reject H0.

Factors Affecting the Magnitude of t and the Decision

1. The difference between x̄ and μ: the larger the numerator, the larger the t value.
2. The size of S²: as S² decreases, t increases.
3. The size of N: as N increases, the denominator decreases and t increases.
4. One- versus two-tailed test.
5. The α level.

Confidence Limits on the Mean

- Point estimate: a specific value taken as the estimator of a parameter.
- Interval estimate: a range of values estimated to include the parameter.
- Confidence limits: a range of values that has a specified probability (p) of bracketing the parameter; the end points are the confidence limits.

The question: how large or small could μ be without our rejecting H0 if we ran a t test on the obtained sample mean?

Confidence Limits (C.I.)
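The one-sample t test and the confidence limits that follow from it can be sketched in a few lines. This is a minimal numeric check of the tutorial example; the critical value 2.093 for t.05(19) is hard-coded from the t table rather than computed.

```python
import math

mu0 = 76.0                   # long-run mean under H0 (no tutorials)
xbar, s, n = 79.3, 6.4, 20   # this year's sample (tutorials)

se = s / math.sqrt(n)        # estimated standard error, ~ 1.43
t = (xbar - mu0) / se        # ~ 2.31

t_crit = 2.093               # t.05(19), two-tailed, from the t table
print(f"t = {t:.2f}, reject H0: {abs(t) > t_crit}")

# 95% confidence limits on mu: xbar +/- t_crit * se, ~ (76.3, 82.3)
lower = xbar - t_crit * se
upper = xbar + t_crit * se
print(f"CI.95 = ({lower:.2f}, {upper:.2f})")
```

Note the duality: the obtained t exceeds the critical value exactly when μ0 = 76 falls outside the 95% confidence interval.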
Starting from t = (x̄ − μ)/S_x̄ with S_x̄ = S/√N: we already know x̄, S, and N, and we know the critical value for t at α = .05, t.05(19) = 2.093. We solve for μ:
  ±2.093 = (79.3 − μ)/(6.4/√20) = (79.3 − μ)/1.43
Rearranging:
  μ = 79.3 ∓ (2.093)(1.43) = 79.3 ∓ 2.993
Using +2.993 and −2.993:
  upper limit = 79.3 + 2.993 = 82.29
  lower limit = 79.3 − 2.993 = 76.31
  C.I.95: 76.31 ≤ μ ≤ 82.29

Two Related Samples t

Related samples: a design in which the same subject is observed under more than one condition (repeated measures, matched samples). Each subject has two measures, x1 and x2, which will be correlated; this must be taken into account.

Example: promoting social skills in adolescents, measured before and after an intervention.
  H0: μ1 = μ2, or μ1 − μ2 = 0   (before vs. after)

Difference scores: the set of scores representing the difference between the subject's performance on the two occasions, D = x1 − x2. For the N = 15 subjects, the summary statistics are:
  x1 (before):  mean = 13.333, S = 6.914
  x2 (after):   mean = 11.133, S = 5.998
  D = x1 − x2:  mean = 2.200,  S_D = 2.933
Our data become the D column: H0: μ1 − μ2 = 0 becomes H0: μ_D = 0, so we are testing a hypothesis using ONE sample.

Related-Samples t

  t = (x̄ − μ)/S_x̄ now becomes t = (D̄ − 0)/S_D̄ = D̄/(S_D/√N), where N = the number of D scores.

Degrees of freedom: the same as for the one-sample case; for our data, df = N − 1 = 15 − 1 = 14.

  t = (2.20 − 0)/(2.933/√15) = 2.20/0.757 = 2.91
From the table, t.05(14) = 2.145. Since 2.91 > 2.145, reject H0.

Advantages of Related Samples

1. Avoids the problems that come with subject-to-subject variability: the difference between x1 = 26 and x2 = 24 is the same as between x1 = 6 and x2 = 4. (Less variance means a lower denominator and a greater t, which increases power.)
2. Control of extraneous variables.
3. Requires fewer subjects.

Disadvantages

1. Order effects.
2.
Carry-over effects.

Two Independent Samples t

  H0: μ1 − μ2 = 0

The sampling distribution of differences between means. Suppose we have two populations with means μ1, μ2 and variances σ1², σ2². Draw pairs of samples of sizes N1 and N2, record the means x̄1 and x̄2 and the difference x̄1 − x̄2 for each pair of samples, and repeat many times. The distribution of the differences x̄1 − x̄2 has:
  Mean:            μ1 − μ2
  Variance:        σ²_(x̄1−x̄2) = σ1²/N1 + σ2²/N2
  Standard error:  σ_(x̄1−x̄2) = √(σ1²/N1 + σ2²/N2)

Variance Sum Law: the variance of a sum or difference of two INDEPENDENT variables equals the sum of their variances. The distribution of the differences is also normal.

t for the Difference Between Means

  z = [(x̄1 − x̄2) − (μ1 − μ2)] / σ_(x̄1−x̄2) = [(x̄1 − x̄2) − (μ1 − μ2)] / √(σ1²/N1 + σ2²/N2)
We must estimate σ² with s²:
  t = [(x̄1 − x̄2) − (μ1 − μ2)] / s_(x̄1−x̄2)
Because H0: μ1 − μ2 = 0,
  t = (x̄1 − x̄2) / s_(x̄1−x̄2) = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2)
Using s1² and s2² separately like this is O.K. only when the n's are the same size (n1 = n2). When n1 ≠ n2, we need a better estimate of σ². We must assume homogeneity of variance (σ1² = σ2²). Rather than using s1² or s2² to estimate σ², we use their average; because n1 ≠ n2, it must be a weighted average, weighted by their degrees of freedom:
  s_p² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2)   (the pooled variance)
Now
  t = (x̄1 − x̄2) / s_(x̄1−x̄2) = (x̄1 − x̄2) / √(s_p²/n1 + s_p²/n2) = (x̄1 − x̄2) / √[s_p²(1/n1 + 1/n2)]
The terms 1/n1 and 1/n2 come from the formula for the standard error.

Degrees of Freedom

Two means have been used to calculate s_p², so
  df = (n1 − 1) + (n2 − 1) = n1 + n2 − 2

Example:

  Group 1: 17 17 21 18 22 18 16 15 18 20 21 16 15 16 20                    (n1 = 15)
  Group 2: 13 18 17 13 14 13 18 19 16 14 13 15 14 16 15 15 13 17 17 15    (n2 = 20)

  x̄1 = 18.00, s1² = 5.286;  x̄2 = 15.25, s2² = 3.671

We have the numerator, 18.00 − 15.25. We need the denominator: the pooled variance, because n1 ≠ n2.
  s_p² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2)
       = [14(5.286) + 19(3.671)] / (15 + 20 − 2)
       = (74.004 + 69.749) / 33
       = 4.356

The denominator becomes
  √(s_p²/n1 + s_p²/n2) = √(4.356/15 + 4.356/20) = √0.5082 = 0.713

  t = (x̄1 − x̄2) / √(s_p²/n1 + s_p²/n2) = (18.00 − 15.25)/0.713 = 2.75/0.713 = 3.86
  df = n1 + n2 − 2 = 15 + 20 − 2 = 33
From the table, t.05(33) = 2.04. Since 3.86 > 2.04, reject H0.

Summary

- If μ and σ² are known, treat x̄ as a score in the z formula; σ_x̄ replaces σ:
    z = (x̄ − μ) / (σ/√n)
- If μ is known and σ² is unknown, s replaces σ:
    t = (x̄ − μ) / (s/√n)
- If two related samples, D̄ replaces x̄ and s_D replaces s:
    t = (D̄ − 0) / (s_D/√n)
- If two independent samples and the n's are of equal size:
    t = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2)
- If two independent samples and the n's are NOT equal, s1² and s2² are replaced by s_p²:
    t = (x̄1 − x̄2) / √[s_p²(1/n1 + 1/n2)]
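The two remaining tests in the summary can be reproduced numerically. This sketch, using only the standard library, runs the related-samples test from its summary statistics and the independent-samples test from the raw scores in the example table, recomputing the means, variances, and pooled variance along the way.

```python
import math

# --- Related samples (summary statistics from the text; N = 15 difference scores) ---
d_bar, s_d, n = 2.200, 2.933, 15
t_related = (d_bar - 0) / (s_d / math.sqrt(n))       # ~ 2.91 on df = 14

# --- Independent samples (raw scores from the example table) ---
g1 = [17, 17, 21, 18, 22, 18, 16, 15, 18, 20, 21, 16, 15, 16, 20]
g2 = [13, 18, 17, 13, 14, 13, 18, 19, 16, 14, 13, 15, 14, 16, 15,
      15, 13, 17, 17, 15]

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    """Unbiased sample variance S^2 (divides by N - 1)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

n1, n2 = len(g1), len(g2)          # 15 and 20
m1, m2 = mean(g1), mean(g2)        # 18.00 and 15.25
v1, v2 = var(g1), var(g2)          # ~ 5.286 and ~ 3.671

# Pooled variance, weighted by degrees of freedom
sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)   # ~ 4.356
se_diff = math.sqrt(sp2 / n1 + sp2 / n2)                # ~ 0.713
t_indep = (m1 - m2) / se_diff                           # ~ 3.86 on df = 33

print(f"related t({n - 1}) = {t_related:.2f}")
print(f"independent t({n1 + n2 - 2}) = {t_indep:.2f}")
```

Both statistics exceed their tabled critical values (2.145 and 2.04), matching the decisions reached in the text. In practice the same results come from `scipy.stats.ttest_rel` and `scipy.stats.ttest_ind`; the long-hand version above makes the pooled-variance step explicit.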