Class Seven
Turn In:
Chapter 18: 32, 34, 36
Chapter 19: 26, 34, 44
Quiz 3
For Class Eight:
Chapter 20: 18, 20, 24
Chapter 22: 34, 36
Read Chapters 23 & 24
Complete Final Exam
Objectives for Class Seven
• Compute confidence intervals and perform significance
tests for proportions for two sample problems.
• Construct and interpret two way tables of counts of
categorical variables.
• Discuss ways of representing and analyzing categorical
data.
Conditions for Two Sample Proportions
• when the samples are large, the distribution of p̂1 – p̂2 is
approximately Normal
– large sample confidence interval: use this interval only when
the populations are at least 10 times as large as the samples
and the counts of successes and failures are each 10 or more in
both samples
– plus four confidence interval: use this interval whenever both
samples have 5 or more observations
– significance test: use this test when the populations are at least
10 times as large as the samples and the counts of successes
and failures are each 5 or more in both samples
• the mean of p̂1 – p̂2 is p1 – p2, i.e. the difference between
sample proportions is an unbiased estimator of the
difference between population proportions.
• the standard deviation of the difference is:
$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$
Large Sample Confidence Interval
• an approximate confidence interval for p1 – p2 is:
$$(\hat{p}_1 - \hat{p}_2) \pm z^* \, SE$$
where
$$SE = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$
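The large-sample interval above can be sketched in Python as follows. This is a minimal illustration, not code from the course: the function name `large_sample_ci`, its parameter names, and the sample counts are all made up, and z* is taken from the standard library's `NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_ci(x1, n1, x2, n2, confidence=0.95):
    """Large-sample confidence interval for p1 - p2.

    x1, x2 are success counts and n1, n2 are sample sizes
    (hypothetical names chosen for this sketch).
    """
    # The slides require 10 or more successes and failures in both samples.
    for count in (x1, n1 - x1, x2, n2 - x2):
        if count < 10:
            raise ValueError("need at least 10 successes and 10 failures per sample")
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # z* is the standard Normal critical value for the chosen confidence level.
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se


# Made-up data: 60 successes out of 100 versus 45 out of 100.
low, high = large_sample_ci(60, 100, 45, 100)
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

The sketch does not check the condition that the populations are at least 10 times as large as the samples, because that cannot be verified from the counts alone.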
Plus Four Confidence Interval
• a more accurate interval can be found using the plus four
method where we add one success and one failure to each
sample.
$$\tilde{p}_1 = \frac{\text{count of successes in the first sample} + 1}{\text{count of observations in the first sample} + 2}$$
$$\tilde{p}_2 = \frac{\text{count of successes in the second sample} + 1}{\text{count of observations in the second sample} + 2}$$
$$SE = \sqrt{\frac{\tilde{p}_1(1 - \tilde{p}_1)}{n_1 + 2} + \frac{\tilde{p}_2(1 - \tilde{p}_2)}{n_2 + 2}}$$
$$(\tilde{p}_1 - \tilde{p}_2) \pm z^* \sqrt{\frac{\tilde{p}_1(1 - \tilde{p}_1)}{n_1 + 2} + \frac{\tilde{p}_2(1 - \tilde{p}_2)}{n_2 + 2}}$$
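As a rough illustration of the plus four method, the sketch below adds one success and one failure to each sample before computing the interval. The function name `plus_four_ci` and the small sample counts are invented for the example.

```python
from math import sqrt
from statistics import NormalDist


def plus_four_ci(x1, n1, x2, n2, confidence=0.95):
    """Plus four confidence interval for p1 - p2 (illustrative sketch)."""
    # Add one success and one failure to each sample.
    p1_tilde = (x1 + 1) / (n1 + 2)
    p2_tilde = (x2 + 1) / (n2 + 2)
    se = sqrt(
        p1_tilde * (1 - p1_tilde) / (n1 + 2)
        + p2_tilde * (1 - p2_tilde) / (n2 + 2)
    )
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = p1_tilde - p2_tilde
    return diff - z_star * se, diff + z_star * se


# Made-up small samples: 7 successes out of 12 versus 3 out of 10.
low, high = plus_four_ci(7, 12, 3, 10)
print(f"95% plus four CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

With samples this small the large-sample interval would not be appropriate, which is the situation the plus four interval is meant for.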
Significance Tests for Comparing Proportions
• the null hypothesis says that there is no difference in the
two populations: H0: p1 = p2. The alternative hypothesis
states what kind of difference we expect.
• the test uses the pooled sample proportion:
$$\hat{p} = \frac{\text{count of successes in both samples combined}}{\text{count of observations in both samples combined}}$$
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
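The pooled z test can be sketched as below. The function name `two_proportion_z_test` and the data are hypothetical, and the p value shown is for a two-sided alternative; a one-sided alternative would use only one tail.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, two-sided p value) for H0: p1 = p2 (illustrative sketch)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Pooled proportion: successes in both samples over observations in both.
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    # Two-sided p value: area in both tails beyond |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up data: 60 successes out of 100 versus 45 out of 100.
z, p_value = two_proportion_z_test(60, 100, 45, 100)
print(f"z = {z:.3f}, two-sided p value = {p_value:.4f}")
```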
Two Way Tables
• two way tables compare categorical variables.
– each row should be totaled
– each column should be totaled
– the table should be totaled
– each cell can be represented as a percentage of its row
total, its column total, or the total of the table
– percentages for each row should add to 100% (given rounding
error)
– percentages for each column should total 100% (given
rounding error)
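The totals and percentages described above can be computed as in this small sketch. The 2 x 2 table of counts is made up, and the variable names are chosen only for illustration.

```python
# Made-up two way table of counts: rows are groups, columns are outcomes.
table = [
    [20, 30],
    [40, 10],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
table_total = sum(row_totals)

# Each cell as a percentage of its row total (each row adds to 100%).
row_percents = [
    [100 * cell / total for cell in row]
    for row, total in zip(table, row_totals)
]

# Each cell as a percentage of its column total (each column adds to 100%).
col_percents = [
    [100 * cell / total for cell, total in zip(row, col_totals)]
    for row in table
]

print("row totals:", row_totals)
print("column totals:", col_totals)
print("table total:", table_total)
print("row percents:", row_percents)
print("column percents:", col_percents)
```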
Multiple Comparisons
• separate tests do not yield an overall conclusion with a
single p value that represents the probability that two
distributions differ by chance
• multiple comparisons entail:
– an overall test to see if there is good evidence of any difference
among the parameters that we want to compare
– a detailed follow-up analysis to decide which parameters differ
and to estimate how large the differences are
Expected Counts
• if the observed counts are far from the expected counts,
that is evidence against the null hypothesis that there is no
relationship between the two categorical variables
$$\text{expected count} = \frac{(\text{row total})(\text{column total})}{\text{table total}}$$
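A short sketch of the expected count formula, applied to every cell of a two way table. The function name `expected_counts` and the table of observed counts are made up for the example.

```python
def expected_counts(observed):
    """Expected count for each cell: (row total)(column total) / table total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    table_total = sum(row_totals)
    return [
        [r * c / table_total for c in col_totals]
        for r in row_totals
    ]


# Made-up observed counts.
observed = [
    [20, 30],
    [40, 10],
]
for row in expected_counts(observed):
    print(row)
```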
Chi-Square Test
• the statistical test that tells us whether the observed
relationship between two categorical variables is
statistically significant compares the observed and
expected counts
• the chi square statistic is a measure of how far the
observed counts in a two way table are from the expected
counts, given by the formula
$$X^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$$
– large values of X² are evidence against H0 because they say that
the observed counts are far from what we would expect if H0
were true
– the chi square test is one sided
– small values of X² are not evidence against H0
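The chi square statistic can be computed directly from a table of observed counts, as in this sketch. The function name `chi_square_statistic` and the data are hypothetical; the expected counts are computed as (row total)(column total) / table total.

```python
def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    table_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / table_total
            statistic += (obs - expected) ** 2 / expected
    return statistic


# Made-up observed counts.
observed = [
    [20, 30],
    [40, 10],
]
x_squared = chi_square_statistic(observed)
print(f"X^2 = {x_squared:.4f}")
```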
When the Chi Square Test is Significant
• compare appropriate percents: which cells have the largest
or smallest conditional percents?
• compare observed and expected cell counts: which cells
have more or fewer observations than we would expect if
H0 were true?
• look at the terms in the chi square statistic: which cells
contribute the most to the value of X²?
Chi Square Distributions
• take only positive values
• are skewed right
• specific chi square distributions are specified by their
degrees of freedom found by taking (# of rows – 1)(# of
columns – 1)
– use critical values from the chi square distribution (Table E)
for the degrees of freedom
– the p value is the area to the right of X² under the density
curve of this chi square distribution
• the mean of any chi square distribution is equal to the
degrees of freedom
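In place of a table lookup, the right-tail area can be computed numerically. The sketch below uses only the standard library and is one possible approach, not the course's method: it relies on the fact that the chi square right-tail area is a regularized upper incomplete gamma function Q(df/2, x/2), starting from Q(1/2, x) = erfc(sqrt(x)) or Q(1, x) = exp(-x) and applying the recurrence Q(a + 1, x) = Q(a, x) + x^a e^(-x) / Γ(a + 1). The function name `chi_square_p_value` is made up.

```python
from math import erfc, exp, gamma, sqrt


def chi_square_p_value(x_squared, df):
    """Area to the right of x_squared under the chi square density.

    df must be a positive integer; x_squared must be non-negative.
    """
    if df < 1 or x_squared < 0:
        raise ValueError("df must be >= 1 and x_squared must be >= 0")
    half = x_squared / 2
    if df % 2 == 0:
        tail, a = exp(-half), 1.0          # Q(1, x) = exp(-x)
    else:
        tail, a = erfc(sqrt(half)), 0.5    # Q(1/2, x) = erfc(sqrt(x))
    # Step a up by 1 until 2a equals df, adding one term per step.
    while 2 * a < df:
        tail += half ** a * exp(-half) / gamma(a + 1)
        a += 1
    return tail


def degrees_of_freedom(n_rows, n_cols):
    """(# of rows - 1)(# of columns - 1) for a two way table."""
    return (n_rows - 1) * (n_cols - 1)


# Made-up example: X^2 = 7.2 from a table with 3 rows and 2 columns.
df = degrees_of_freedom(3, 2)
print(f"df = {df}, p value = {chi_square_p_value(7.2, df):.4f}")
```

In practice a statistics package would normally be used for this tail area; the hand-rolled version is shown only to keep the example self-contained.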
Uses of the Chi Square Test
• Use the chi square test to test the hypothesis H0 that there
is no relationship between two categorical variables when
you have a two way table from one of these situations:
– independent SRSs from each of several populations, with each
individual classified according to one categorical variable (the
other variable says which sample the individual comes from).
– a single SRS with each individual classified according to both
of two categorical variables
• You can safely use the chi square test with critical values
from the chi square distribution when no more than 20% of
the expected counts are less than 5 and all individual
expected counts are 1 or greater.
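The cell-count rule above can be checked mechanically. This sketch assumes that the "individual counts" in the rule are expected counts; the function name `chi_square_conditions_ok` and the tables are invented for illustration.

```python
def chi_square_conditions_ok(expected):
    """Check the cell-count rule for using chi square critical values.

    Assumption: both parts of the rule refer to expected counts.
    """
    cells = [count for row in expected for count in row]
    # No more than 20% of the expected counts may be below 5.
    share_below_five = sum(1 for count in cells if count < 5) / len(cells)
    # Every expected count must be at least 1.
    return share_below_five <= 0.20 and min(cells) >= 1


# Made-up tables of expected counts.
print(chi_square_conditions_ok([[30.0, 20.0], [30.0, 20.0]]))  # all cells large
print(chi_square_conditions_ok([[4.0, 20.0], [30.0, 20.0]]))   # 25% of cells below 5
```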
Chi Square and z Tests
• If the rows of a table are r groups and the columns are
successes and failures, the chi square test with r – 1
degrees of freedom compares the r proportions of successes.
When r = 2, it compares two proportions just as we did in
Chapter 19.
– for r = 2, the chi square test and the two-sided z test always agree
– the X² statistic is the z statistic squared
– the p values for X² and for the two-sided z test are equal
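The agreement between the two tests can be checked numerically for a table with two groups. The sketch below uses made-up counts and compares the pooled z statistic with the chi square statistic on the same data. It assumes a two-sided z test, and it uses the identity that the right-tail area of a chi square distribution with 1 degree of freedom is erfc(sqrt(X²/2)).

```python
from math import erfc, sqrt
from statistics import NormalDist

# Made-up data: successes and sample sizes for two groups.
s1, n1 = 60, 100
s2, n2 = 45, 100

# Two-sided z test with the pooled proportion.
pooled = (s1 + s2) / (n1 + n2)
z = (s1 / n1 - s2 / n2) / sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
p_z = 2 * (1 - NormalDist().cdf(abs(z)))

# Chi square test on the same data as a 2 x 2 table (successes, failures).
observed = [[s1, n1 - s1], [s2, n2 - s2]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)
chi_sq = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / total) ** 2
    / (row_totals[i] * col_totals[j] / total)
    for i in range(2)
    for j in range(2)
)
p_chi = erfc(sqrt(chi_sq / 2))  # right-tail area for 1 degree of freedom

print(f"z^2 = {z ** 2:.6f}, X^2 = {chi_sq:.6f}")
print(f"p (z, two-sided) = {p_z:.6f}, p (chi square) = {p_chi:.6f}")
```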
The Chi Square Test for Goodness of Fit
• A categorical variable has k possible outcomes with
probabilities p1, p2, p3, ..., pk.
• pi is the probability of the ith outcome
• we have n independent observations from this categorical
variable
• To test H0: p1 = p10, p2 = p20, ..., pk = pk0 use the chi square
statistic
$$X^2 = \sum_{i=1}^{k} \frac{(\text{count of outcome } i - n p_{i0})^2}{n p_{i0}}$$
• the p value is the area to the right of X² under the density
curve of the chi square distribution with k – 1 degrees of
freedom
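A minimal sketch of the goodness of fit statistic follows. The function name `goodness_of_fit_statistic`, the die-rolling data, and the hypothesized probabilities are all invented for the example. The p value would then be the area to the right of the statistic under the chi square distribution with k – 1 degrees of freedom.

```python
def goodness_of_fit_statistic(counts, null_probs):
    """Return (X^2, degrees of freedom) for H0: p_i = null_probs[i]."""
    if len(counts) != len(null_probs):
        raise ValueError("counts and null_probs must have the same length")
    if abs(sum(null_probs) - 1) > 1e-9:
        raise ValueError("null probabilities must sum to 1")
    n = sum(counts)
    statistic = sum(
        (count - n * p0) ** 2 / (n * p0)
        for count, p0 in zip(counts, null_probs)
    )
    return statistic, len(counts) - 1


# Made-up data: 60 rolls of a die, tested against equal probabilities.
counts = [8, 12, 9, 11, 6, 14]
null_probs = [1 / 6] * 6
x_squared, df = goodness_of_fit_statistic(counts, null_probs)
print(f"X^2 = {x_squared:.3f} with {df} degrees of freedom")
```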
Next Week Class Eight
To Be Completed Before Class Eight:
Chapter 20: 18, 20, 24
Chapter 22: 34, 36
Read Chapters 23 & 24
Complete Final Exam