Two-Sample Inference Procedures with Means

Two-Sample Procedures with Means
• Goal: Compare two different populations/treatments
• INDEPENDENT samples from each population/treatment

Remember: When combining two random variables X and Y,
$\mu_{X \pm Y} = \mu_X \pm \mu_Y$
$\sigma_{X \pm Y} = \sqrt{\sigma_X^2 + \sigma_Y^2}$
This formula only works if X and Y are independent.

Suppose we have a population of adult men with a mean height of 71 inches and a standard deviation of 2.6 inches. We also have a population of adult women with a mean height of 65 inches and a standard deviation of 2.3 inches. Heights are normally distributed.

Describe the distribution of the difference in heights between males and females (male – female).
Normal distribution with $\mu_{M-F} = 71 - 65 = 6$ inches and $\sigma_{M-F} = \sqrt{2.6^2 + 2.3^2} = 3.471$ inches

[Figure: normal curves for Female (centered at 65), Male (centered at 71), and Difference (male – female) (centered at 6, σ = 3.471)]

a) What is the probability that a randomly selected man is at most 5 inches taller than a randomly selected woman?
$P(x_M - x_F < 5)$ = normalcdf(-1E99, 5, 6, 3.471) = .3866

b) What is the 70th percentile for the difference (male – female) in heights of a randomly selected man and woman?
$(x_M - x_F)$ = invNorm(.7, 6, 3.471) = 7.82

Calculator Simulation!

a) What is the probability that the mean height of 30 men is at most 5 inches taller than the mean height of 30 women?
$P(\bar{x}_M - \bar{x}_W < 5)$ = .057

b) What is the 70th percentile for the difference (male – female) in mean heights of 30 men and 30 women?
6.332 inches

Conditions for Two Means
• Two independent SRS's (or randomly assigned treatments)
• Both samp. dist. are approx. normal
  – Both populations normal
  – Both n's > 30
  – Both graphs linear

Degrees of Freedom
Option 1: Use the smaller df: $n_1 - 1$ or $n_2 - 1$
Using the larger one overestimates the collective sample sizes
Option 2: Welch-Satterthwaite approximation
$df = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2}$
Calculator does this automatically!
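For readers without a TI calculator, the height calculations and the Welch-Satterthwaite df can be checked with a short script. This is only a sketch: it uses Python's standard-library `statistics.NormalDist` in place of normalcdf/invNorm, the helper names `difference_dist` and `welch_df` are made up for illustration, and the df check borrows the summary statistics (s = 8.7 and 7.5, n = 12 each) from the headache-remedy example in these slides.

```python
from math import sqrt
from statistics import NormalDist

# Population parameters from the height example (inches).
MU_M, SD_M = 71.0, 2.6   # adult men
MU_F, SD_F = 65.0, 2.3   # adult women


def difference_dist(n=1):
    """Distribution of (male - female) for means of n men and n women.

    n = 1 is the difference for one randomly selected man and woman.
    Independence is assumed, so the variances add.
    """
    mean = MU_M - MU_F
    sd = sqrt(SD_M**2 / n + SD_F**2 / n)
    return NormalDist(mean, sd)


def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite degrees of freedom (Option 2 on the slide)."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


single = difference_dist(1)
p_single = single.cdf(5)              # plays the role of normalcdf(-1E99, 5, 6, 3.471)
pct70_single = single.inv_cdf(0.70)   # plays the role of invNorm(.7, 6, 3.471)

means30 = difference_dist(30)
p_means = means30.cdf(5)
pct70_means = means30.inv_cdf(0.70)

# Option 1 vs Option 2 for two samples of 12 (headache-remedy example).
df_conservative = min(12 - 1, 12 - 1)
df_welch = welch_df(8.7, 12, 7.5, 12)

print(f"sigma of difference: {single.stdev:.3f}")
print(f"one man, one woman:  P = {p_single:.4f}, 70th pct = {pct70_single:.2f}")
print(f"means of 30 each:    P = {p_means:.3f}, 70th pct = {pct70_means:.3f}")
print(f"df: conservative = {df_conservative}, Welch = {df_welch:.2f}")
```

The same `difference_dist` call covers both versions of the question because averaging n observations only divides each variance by n before the variances are added.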
Confidence Interval for the Difference of Two Means
CI = statistic ± (critical value)(SD of statistic)
$(\bar{x}_1 - \bar{x}_2) \pm t^*_{df}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
The square-root term is the standard error (standard deviation) of the statistic.

Two competing headache remedies claim to give fast-acting relief. An experiment was performed to compare the mean lengths of time required for bodily absorption of brand A and brand B. Absorption time is normally distributed. Twelve people were randomly selected and given a dosage of brand A. Another 12 were randomly selected and given an equal dosage of brand B. The length of time in minutes for the drugs to reach a specified level in the blood was recorded. The results follow:

          mean   SD    n
Brand A   20.1   8.7   12
Brand B   18.9   7.5   12

a) Describe the sampling distribution of the differences in the mean speed of absorption (A – B).
Normal, centered at $\mu_A - \mu_B$, with standard error $s = \sqrt{\dfrac{8.7^2}{12} + \dfrac{7.5^2}{12}} = 3.316$

b) Construct a 95% confidence interval for the difference in mean lengths of time (A – B) required for bodily absorption of each brand.
Conditions:
• 2 independent randomly assigned treatments
• Populations are normal

$(\bar{x}_1 - \bar{x}_2) \pm t^*\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
*Think “Price is Right”: Closest df without going over
df = 21.53 (from calculator)

$(20.1 - 18.9) \pm 2.080\sqrt{\dfrac{8.7^2}{12} + \dfrac{7.5^2}{12}} = (-5.685, 8.085)$
The interval shown is the calculator's, which uses df = 21.53. The table value t* = 2.080 (df = 21) gives the slightly wider (-5.697, 8.097).

We are 95% confident that the true difference in mean absorption time (A minus B) is between -5.685 minutes and 8.085 minutes. If we made lots of intervals this way, 95% of them would contain the true difference in means.

A Subtle Distinction
• Matched pairs: “mean difference”
• Two-sample inference: “difference in means”

Hypothesis Statements
H0: μ1 – μ2 = 0 (equivalently, μ1 = μ2)
Ha: μ1 – μ2 < 0 (μ1 < μ2)
Ha: μ1 – μ2 > 0 (μ1 > μ2)
Ha: μ1 – μ2 ≠ 0 (μ1 ≠ μ2)
Be sure to define BOTH μ1 and μ2!

Test Statistic
$t_{df} = \dfrac{\text{statistic} - \text{parameter}}{\text{SD of statistic}} = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
Since we assume H0 is true, the $(\mu_1 - \mu_2)$ part equals 0, so we can leave it out.

c) Is there sufficient evidence that the two brands differ in the speed at which they enter the bloodstream?
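The confidence interval in part (b) can be reproduced from the summary statistics alone. The sketch below is one possible way to do it, not the calculator's actual algorithm: it sticks to the Python standard library, so the t critical value comes from a numerically integrated t density plus bisection (a library such as SciPy would normally do this step), and the function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def t_quantile(p, df):
    """Invert t_cdf by bisection; this sketch assumes 0.5 < p < 1."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def welch_interval(m1, s1, n1, m2, s2, n2, level=0.95):
    """Two-sample t interval for mu1 - mu2 with Welch-Satterthwaite df."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    t_star = t_quantile(1 - (1 - level) / 2, df)
    diff = m1 - m2
    return diff - t_star * se, diff + t_star * se, se, df, t_star


# Brand A vs Brand B summary statistics from the table above.
low, high, se, df, t_star = welch_interval(20.1, 8.7, 12, 18.9, 7.5, 12)
print(f"SE = {se:.3f}, df = {df:.2f}, t* = {t_star:.3f}")
print(f"95% CI for (A - B): ({low:.3f}, {high:.3f})")
```

Because the interval contains 0, it is consistent with the two brands having the same mean absorption time.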
Conditions:
• 2 independent randomly assigned treatments
• Populations are normal

H0: μA = μB
Ha: μA ≠ μB
Where μA and μB are the true mean absorption times

$t_{21.53} = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{20.1 - 18.9}{\sqrt{\dfrac{8.7^2}{12} + \dfrac{7.5^2}{12}}} = .362$

p-value = .7210, α = .05
Since p-value > α, we fail to reject H0. There is not sufficient evidence to suggest these drugs differ in their absorption time.

Pooling
• Used for two populations with the same variance (σ²)
• Pooling = Averaging the two s² to estimate σ²
• We almost never pool for means, since we don't know σ

Robustness
• Two-sample procedures: more robust than one-sample procedures
• Most robust with equal sample sizes (but not necessary!)

A modification has been made to the process for producing a certain type of film. Since the modification costs extra, it will be incorporated only if sample data indicate that the modification decreases the true average development time. At a significance level of 10%, should the company incorporate the modification?

Original   8.6  5.1  4.5  5.4  6.3  6.6  5.7  8.5
Modified   5.5  4.0  3.8  6.0  5.8  4.9  7.0  5.7

Conditions:
• 2 independent SRS's of film
• Normal prob. plots are linear, so the sampling dist.'s are approx. normal

H0: μO = μM
Ha: μO > μM
Where μO and μM are the true mean developing times for original and modified film

$t_{12.55} = \dfrac{\bar{x}_O - \bar{x}_M}{\sqrt{\dfrac{s_O^2}{n_O} + \dfrac{s_M^2}{n_M}}} = \dfrac{6.3375 - 5.3375}{\sqrt{\dfrac{1.5146^2}{8} + \dfrac{1.0636^2}{8}}} = 1.53$

p-value = .076, α = .1
Since p-value < α, we reject H0. There is sufficient evidence that the modification decreases the true mean development time, so the company should incorporate the modification.
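Both hypothesis tests above can be rechecked with a few lines of code. The following is a sketch under stated assumptions rather than the calculator's 2-SampTTest routine: it uses only the Python standard library, approximates the t-distribution tail area by numerical integration, and the helper names (`t_pdf`, `t_cdf`, `welch_t`) are invented for this example.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def welch_t(m1, s1, n1, m2, s2, n2):
    """Unpooled two-sample t statistic and Welch-Satterthwaite df."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Film example: raw development times from the slide.
original = [8.6, 5.1, 4.5, 5.4, 6.3, 6.6, 5.7, 8.5]
modified = [5.5, 4.0, 3.8, 6.0, 5.8, 4.9, 7.0, 5.7]
t_film, df_film = welch_t(mean(original), stdev(original), len(original),
                          mean(modified), stdev(modified), len(modified))
p_film = 1 - t_cdf(t_film, df_film)          # one-sided: Ha is mu_O > mu_M

# Headache-remedy example: summary statistics only.
t_drug, df_drug = welch_t(20.1, 8.7, 12, 18.9, 7.5, 12)
p_drug = 2 * (1 - t_cdf(abs(t_drug), df_drug))  # two-sided: Ha is mu_A != mu_B

print(f"film:  t = {t_film:.2f}, df = {df_film:.2f}, p = {p_film:.3f}")
print(f"drugs: t = {t_drug:.3f}, df = {df_drug:.2f}, p = {p_drug:.4f}")
```

The only difference between the two tests in code is the tail: the film test uses one tail because Ha is directional, while the drug test doubles the tail area because Ha is two-sided.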