Two-Sample Inference Procedures

Two-Sample Inference Procedures with Means

Two-Sample Procedures with Means
• Goal: Compare two different populations/treatments
• INDEPENDENT samples from each population/treatment

Remember: When combining two random variables X and Y,

$\mu_{X \pm Y} = \mu_X \pm \mu_Y$

$\sigma_{X \pm Y} = \sqrt{\sigma_X^2 + \sigma_Y^2}$

The standard deviation formula only works if X and Y are independent.

Suppose we have a population of adult men with a mean height of 71 inches and standard deviation 2.6 inches. We also have a population of adult women with a mean height of 65 inches and standard deviation 2.3 inches. Heights are normally distributed.

Describe the distribution of the difference in heights between males and females (male - female).

Normal distribution with $\mu_{M-F} = 71 - 65 = 6$ inches and $\sigma_{M-F} = \sqrt{2.6^2 + 2.3^2} = 3.471$ inches

[Figure: normal curves for female heights (mean 65), male heights (mean 71), and the difference male - female (mean 6, σ = 3.471)]

a) What is the probability that a randomly selected man is at most 5 inches taller than a randomly selected woman?
$P(x_M - x_F \le 5)$ = normalcdf(-1E99, 5, 6, 3.471) = .3866

b) What is the 70th percentile for the difference (male - female) in heights of a randomly selected man and woman?
$(x_M - x_F)$ = invNorm(.7, 6, 3.471) = 7.82 inches

Calculator Simulation!

For the difference of two sample means with n = 30 each, the mean is still 6 inches, but the standard deviation becomes $\sqrt{\frac{2.6^2}{30} + \frac{2.3^2}{30}} \approx 0.634$ inches.

a) What is the probability that the mean height of 30 men is at most 5 inches taller than the mean height of 30 women?
$P(\bar{x}_M - \bar{x}_F \le 5)$ = .057

b) What is the 70th percentile for the difference (male - female) in mean heights of 30 men and 30 women?
6.332 inches

Conditions for Two Means
• Two independent SRS's (or randomly assigned treatments)
• Both sampling distributions are approximately normal (any one of the following)
  – Both populations normal
  – Both n's > 30
  – Both normal probability plots roughly linear

Degrees of Freedom
Option 1: Use the smaller df: $n_1 - 1$ or $n_2 - 1$
  Using the larger one overestimates the collective sample sizes
Option 2: Welch-Satterthwaite approximation

$df = \dfrac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$

Calculator does this automatically!
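The calculator steps above (normalcdf, invNorm, and the automatic Welch-Satterthwaite df) can be checked with a short Python sketch. This is only an illustration using the standard library, not the calculator's own routine; the names `difference_dist` and `welch_df` are made up for this example, and the inputs to `welch_df` are the sample SDs and sizes from the headache-remedy example later in these notes.

```python
from math import sqrt
from statistics import NormalDist

# Population parameters from the heights example (inches).
MU_M, SIGMA_M = 71.0, 2.6
MU_F, SIGMA_F = 65.0, 2.3


def difference_dist(n=1):
    """Normal distribution of (male - female).

    n=1 gives the difference of two individual heights; n>1 gives the
    difference of two independent sample means of size n each.
    """
    mean = MU_M - MU_F
    sd = sqrt(SIGMA_M**2 / n + SIGMA_F**2 / n)
    return NormalDist(mean, sd)


def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite degrees of freedom from sample SDs and sizes."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


single = difference_dist()        # one man vs. one woman
means30 = difference_dist(n=30)   # mean of 30 men vs. mean of 30 women

print("SD of difference:", round(single.stdev, 3))
print("P(diff <= 5):", round(single.cdf(5), 4))
print("70th percentile:", round(single.inv_cdf(0.70), 2))
print("P(mean diff <= 5), n=30:", round(means30.cdf(5), 3))
print("70th percentile, n=30:", round(means30.inv_cdf(0.70), 3))
print("Welch df (s=8.7, 7.5; n=12, 12):", round(welch_df(8.7, 12, 7.5, 12), 2))
```

Dividing each variance by n inside `difference_dist` is what shrinks the spread from about 3.47 inches for individuals to about 0.63 inches for means of 30.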
Confidence Interval for the Difference of Two Means

CI = statistic ± (critical value)(SD of statistic)

$(\bar{x}_1 - \bar{x}_2) \pm t^*_{df}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

The square-root term is the standard error (estimated standard deviation) of the statistic.

Two competing headache remedies claim to give fast-acting relief. An experiment was performed to compare the mean lengths of time required for bodily absorption of brand A and brand B. Absorption time is normally distributed. Twelve people were randomly selected and given a dosage of brand A. Another 12 were randomly selected and given an equal dosage of brand B. The length of time in minutes for the drugs to reach a specified level in the blood was recorded. The results follow:

          mean    SD    n
Brand A   20.1   8.7   12
Brand B   18.9   7.5   12

a) Describe the sampling distribution of the differences in the mean speed of absorption (A - B).
Normal, with standard error $\sqrt{\frac{8.7^2}{12} + \frac{7.5^2}{12}} = 3.316$ minutes

b) Construct a 95% confidence interval for the difference in mean lengths of time (A - B) required for bodily absorption of each brand.

Conditions:
• 2 independent randomly assigned treatments
• Populations are normal

df = 21.53 (from calculator). When using a t table, think "Price is Right": take the closest df without going over, here df = 21, so $t^* = 2.080$.

$(\bar{x}_1 - \bar{x}_2) \pm t^*_{df}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (20.1 - 18.9) \pm 2.080\sqrt{\frac{8.7^2}{12} + \frac{7.5^2}{12}} \approx (-5.70, 8.10)$

The calculator, which uses df = 21.53, reports (-5.685, 8.085).

We are 95% confident that the true difference in mean absorption time (A minus B) is between -5.685 minutes and 8.085 minutes.

If we made lots of intervals this way, 95% of them would contain the true difference in means.

A Subtle Distinction
• Matched pairs: "mean difference"
• Two-sample inference: "difference in means"

Hypothesis Statements
H0: $\mu_1 - \mu_2 = 0$ (equivalently $\mu_1 = \mu_2$)
Ha: $\mu_1 - \mu_2 < 0$ (equivalently $\mu_1 < \mu_2$)
Ha: $\mu_1 - \mu_2 > 0$ (equivalently $\mu_1 > \mu_2$)
Ha: $\mu_1 - \mu_2 \ne 0$ (equivalently $\mu_1 \ne \mu_2$)
Be sure to define BOTH $\mu_1$ and $\mu_2$!

Test Statistic

test statistic = (statistic - parameter) / (SD of statistic)

$t_{df} = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

Since we assume H0 is true, $\mu_1 - \mu_2$ equals 0, so we can leave it out.

c) Is there sufficient evidence that the two brands differ in the speed at which they enter the bloodstream?
Conditions:
• 2 independent randomly assigned treatments
• Populations are normal

H0: $\mu_A = \mu_B$
Ha: $\mu_A \ne \mu_B$
Where $\mu_A$ and $\mu_B$ are the true mean absorption times for brand A and brand B

$t_{21.53} = \dfrac{\bar{x}_A - \bar{x}_B}{\sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}} = \dfrac{20.1 - 18.9}{\sqrt{\frac{8.7^2}{12} + \frac{7.5^2}{12}}} = 0.362$

p-value = .7210
α = .05

Since p-value > α, we fail to reject H0. There is not sufficient evidence to suggest these drugs differ in their absorption time.

Pooling
• Used for two populations with the same variance ($\sigma^2$)
• Pooling = Averaging the two $s^2$ to estimate $\sigma^2$
• We almost never pool for means, since we don't know the σ's and so cannot check that they are equal

Robustness
• Two-sample procedures: more robust than one-sample procedures
• Most robust with equal sample sizes (but not necessary!)

A modification has been made to the process for producing a certain type of film. Since the modification costs extra, it will be incorporated only if sample data indicate that the modification decreases the true average development time. At a significance level of 10%, should the company incorporate the modification?

Original: 8.6  5.1  4.5  5.4  6.3  6.6  5.7  8.5
Modified: 5.5  4.0  3.8  6.0  5.8  4.9  7.0  5.7

Conditions:
• 2 independent SRS's of film
• Normal probability plots linear, so the sampling distributions are approximately normal

H0: $\mu_O = \mu_M$
Ha: $\mu_O > \mu_M$
Where $\mu_O$ and $\mu_M$ are the true mean developing times for original and modified film

$t_{12.55} = \dfrac{\bar{x}_O - \bar{x}_M}{\sqrt{\frac{s_O^2}{n_O} + \frac{s_M^2}{n_M}}} = \dfrac{6.3375 - 5.3375}{\sqrt{\frac{1.5146^2}{8} + \frac{1.0636^2}{8}}} = 1.53$

p-value = .076
α = .1

Since p-value < α, we reject H0. There is sufficient evidence to suggest that the modification decreases the true mean developing time, so the company should incorporate it.
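Both tests above can be reproduced without a calculator. The sketch below is a standard-library Python stand-in, not the calculator's algorithm: `welch_t`, `t_pdf`, and `t_upper_tail` are hypothetical helper names, and the tail probability is approximated by Simpson's rule on the t density. The split of the sixteen film times into two groups of eight is an assumption, chosen because it reproduces the reported means (6.3375 and 5.3375) and standard deviations (1.5146 and 1.0636).

```python
from math import exp, lgamma, pi, sqrt
from statistics import mean, stdev


def welch_t(m1, s1, n1, m2, s2, n2):
    """Unpooled two-sample t statistic and Welch-Satterthwaite df."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (m1 - m2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


def t_pdf(x, df):
    """Density of Student's t distribution; df may be fractional."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2)
    return exp(log_const) / sqrt(df * pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) with Simpson's rule (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3   # 0.5 minus the area between 0 and |t|
    return upper if t >= 0 else 1 - upper


# Headache remedies: summary statistics, two-sided test (Ha: mu_A != mu_B).
t_ab, df_ab = welch_t(20.1, 8.7, 12, 18.9, 7.5, 12)
p_ab = 2 * t_upper_tail(abs(t_ab), df_ab)

# Film times: raw data, one-sided test (Ha: mu_O > mu_M).
# Assumed grouping; it matches the means and SDs reported in the notes.
original = [8.6, 5.1, 4.5, 5.4, 6.3, 6.6, 5.7, 8.5]
modified = [5.5, 4.0, 3.8, 6.0, 5.8, 4.9, 7.0, 5.7]
t_om, df_om = welch_t(mean(original), stdev(original), len(original),
                      mean(modified), stdev(modified), len(modified))
p_om = t_upper_tail(t_om, df_om)

print(f"headache: t = {t_ab:.3f}, df = {df_ab:.2f}, p = {p_ab:.4f}")
print(f"film:     t = {t_om:.3f}, df = {df_om:.2f}, p = {p_om:.4f}")
```

The two-sided test doubles the upper-tail area, while the one-sided film test uses it directly, which mirrors the difference between the ≠ and > alternatives.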
