10.1: 2-Proportion Situations
3.9.2017

• 2-proportion confidence intervals
• 2-proportion significance tests
• WE ARE LOOKING AT THE DIFFERENCE BETWEEN TWO PROPORTIONS
  – What is the plausible range for the difference between the two? (confidence interval)
  – Are the two different from each other? (significance test)

2 Proportions
• So now we have 2 samples
  – 1 from one population
  – 1 from another population
• We want to know what the difference between the two populations is
  – NOT what the difference between the samples is
  – We are just using the samples to draw conclusions about the populations

Shape
• Difference between two proportions (p1-p2)
• Normality conditions

Center
• If the new distribution is defined as the difference between the two proportions (p1-p2), then the mean of the distribution is p1-p2
• Since we don’t know p1 and p2, we estimate the center to be p-hat1 - p-hat2

Spread
• The standard deviation is our usual measure of spread
  – St dev = sqrt( p1(1-p1)/n1 + p2(1-p2)/n2 )
• This is on the AP formula sheet

Clarification
• We have these two proportions
• To make conclusions about the difference between them, we are effectively creating a new variable/distribution that is defined by one minus the other
• Once we do that, we have a single sampling distribution that allows us to draw conclusions
• So it behaves much the same way as in chapters 8 and 9

Scenario 1
• We know (or think we know) the true population proportions
• The question may then ask us, assuming that those values are true, to find the probability of getting a certain result
  – See next slide for example
• Find the probability of getting a difference of .1 or less from the two samples
  – Z=(value-mean)/(st. dev)
  – Z=(.1-.2)/(st. dev)
  – St. dev=.0579
  – (-.1)/(.0579)=-1.727
  – Normalcdf(-555555,-1.727,0,1)=.0427
  – .0427 or about a 4% chance
• Does this give us reason to doubt the counselors’ reports?

Scenario 2
• We want a confidence interval for the difference between the two proportions
• We don’t know the true proportions, so we use sample data to create the confidence interval
• Remember:
  – (Point estimate) ± (Critical value)(St dev)
  – (p-hat1 - p-hat2) ± (critical value) × sqrt( p-hat1(1-p-hat1)/n1 + p-hat2(1-p-hat2)/n2 )
• First, let’s check our conditions:
  – Random? YES
  – Normal? YES
  – Independent? YES: one person’s response doesn’t affect another’s
• Now, let’s construct our 95% confidence interval
  – “difference between teens and adults”
  – Teens minus adults
• Point estimate: (.73-.47)=.26
• Critical value: 1.96 (or 2)
• St. dev: .0189
• .26 ± (1.96)(.0189)
• .26 ± .037, or .223 to .297

You try
• In a random sample of 50 at-bats, one major league baseball player reached base 19 times. Another player (also in a sample of 50 at-bats) reached base 14 times. Find a 95% confidence interval to represent the difference in their on-base rates.
• Point estimate: (.38-.28)=.1
• Critical value: 1.96
• St. dev: .0935
• .1 ± (1.96)(.0935)
• .1 ± .183
• -.083 to .283

Now different confidence level
• In a random sample of 50 at-bats, one major league baseball player reached base 19 times. Another player (also in a sample of 50 at-bats) reached base 14 times. Find an 84% confidence interval to represent the difference in their on-base rates.
• Point estimate: (.38-.28)=.1
• Critical value: 1.405 (use invnorm)
• St. dev: .0935
• .1 ± (1.405)(.0935)
• .1 ± .131
• -.031 to .231

Scenario 3
• We want to perform a significance test to see if the two proportions are different from each other
  – We don’t know the population proportions. If we did, there would be no need for a significance test
• Null hypothesis is that there is no difference
  – Or that the difference equals 0
• Alternative hypothesis will be 1 of 3 choices
  – The difference is >0
  – The difference is <0
  – The difference ≠ 0
• We are going to use standard deviation when we calculate the test statistic
• For our test statistic (Z) we use the same procedure as always: (value-mean)/(st dev)
• For the standard deviation, we use the same formula as before, but we plug in the pooled proportion (Pc) for both p1 and p2:
  – St dev = sqrt( Pc(1-Pc)/n1 + Pc(1-Pc)/n2 )
• H0: p1-p2=0
• Ha: p1-p2≠0
• Z=(.2375-.1733)/(st. dev)
• Pooled prob: (45/230)=.1957
• St dev=.0549
• Z=(.0642)/(.0549)=1.169
• Normalcdf(1.169,BIG,0,1)=.121
• .121 x 2 = .242
• Fail to reject at a .05 significance level
• Cannot conclude that there is a difference

You try
• H0: p1-p2=0
• Ha: p1-p2<0
• P-hat1: .027
• P-hat2: .041
• N1: 2051
• N2: 2030
• Pc: .034
• Z=(.027-.041)/(st dev)
• St dev: .0057
• Z=(-.014)/.0057=-2.456
• Normalcdf(SMALL,-2.456,0,1)=.007
• Reject null
• Conclude that the heart attack rate is lower for those who take the drug compared to the placebo
• We can also do this on our calculator
  – 2-PropZTest
  – X1: 56
  – N1: 2051
  – X2: 84
  – N2: 2030

You try
• Out of a sample of 3000 people in city #1, 1567 say that they will vote for the Republican candidate. Out of a sample of 3700 people in city #2, 1801 say that they will vote for the Republican candidate. Is there a statistically significant difference between the preferences of the two cities at the .01 significance level?
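
The confidence interval and significance test steps above can also be checked outside the calculator. What follows is a minimal sketch in Python, not part of the original slides: the function names two_prop_interval and two_prop_z_test are hypothetical, and the standard library's NormalDist stands in for the calculator's normalcdf and invnorm. It assumes the Random, Normal, and Independent conditions have already been checked. Results may differ slightly from the hand calculations, which round intermediate values.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, st dev 1


def two_prop_interval(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 (hypothetical helper, unpooled st dev)."""
    p1, p2 = x1 / n1, x2 / n2
    st_dev = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Critical value: area of `confidence` in the middle, like invnorm.
    z_star = STD_NORMAL.inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * st_dev
    diff = p1 - p2
    return diff - margin, diff + margin


def two_prop_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Z and P-value for H0: p1 - p2 = 0 (hypothetical helper, pooled st dev)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    st_dev = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / st_dev  # (value - mean) / (st dev), with mean 0 under H0
    if alternative == "less":
        p_value = STD_NORMAL.cdf(z)
    elif alternative == "greater":
        p_value = 1 - STD_NORMAL.cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z, p_value


# Baseball example: 19 of 50 vs. 14 of 50, at 95% and 84% confidence.
for level in (0.95, 0.84):
    low, high = two_prop_interval(19, 50, 14, 50, confidence=level)
    print(f"{level:.0%} interval: {low:.3f} to {high:.3f}")

# Drug vs. placebo example: Ha is p1 - p2 < 0.
z, p_value = two_prop_z_test(56, 2051, 84, 2030, alternative="less")
print(f"Z = {z:.3f}, P-value = {p_value:.4f}")
```

The same test function can be pointed at the two-city voting question by calling two_prop_z_test(1567, 3000, 1801, 3700, alternative="two-sided") and comparing the P-value to .01.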