Section 4.2: How to Look for Differences
Cross-Tabulations

College student binge drinkers
experienced many personal and social
problems, the researchers said. For
instance, 90% of those who binged more
than once a week said they experienced a
hangover and 63% said they did
something they regretted. --- Newsday,
12/94
Making Sense of the Data

Were binge drinkers more likely to do
things they regretted than students who
did not binge drink?

Defn: Binge drinking – consuming five or more drinks in a row for men, or four or more in a row for women.
Wechsler Study on American College Students

                              Nonbinge     Infrequent        Frequent
                              drinkers     binge drinkers    binge drinkers
                              (n=6894)     (n=4090)          (n=3291)
Did something you regret      14%          37%               63%
Missed a class                8%           30%               61%
Got behind in schoolwork      6%           21%               46%
Cross-Tab Tables

Categories of one measurement are listed
on the top and the categories of the other
measurement are listed on the side. Also,
list the number of people that appear in
each cell.
Wechsler Cross Tab

                              Nonbinge     Infrequent Binge    Frequent Binge
Kept up with schoolwork       6,480        3,231               1,777
Got behind with schoolwork    414          859                 1,514
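
As a quick check, the counts in this cross-tab reproduce the "got behind in schoolwork" percentages quoted in the earlier table (6%, 21%, 46%). The following is a minimal Python sketch; the dictionary layout and variable names are illustrative assumptions, not part of the original slides.

```python
# Counts from the Wechsler cross-tab; the dict layout is an illustrative choice.
kept_up = {"Nonbinge": 6480, "Infrequent Binge": 3231, "Frequent Binge": 1777}
got_behind = {"Nonbinge": 414, "Infrequent Binge": 859, "Frequent Binge": 1514}

# Percentage of each drinking group that got behind in schoolwork.
pct_behind = {}
for group in kept_up:
    column_total = kept_up[group] + got_behind[group]
    pct_behind[group] = 100 * got_behind[group] / column_total

for group, pct in pct_behind.items():
    print(f"{group}: {pct:.0f}% got behind in schoolwork")
```

Dividing each cell by its own column total is what lets groups of different sizes be compared directly.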
Marginal Totals and Proportions

Cross Tabs will often show the totals
(proportions) of each column in the bottom
margin under the column and the total
(proportions) of each row in the margin to
the right of the row.
Cross-Tab with Column Marginals

                              Nonbinge     Infrequent Binge    Frequent Binge
Kept up with schoolwork       6,480        3,231               1,777
Got behind with schoolwork    414          859                 1,514
Column Count                  6,894        4,090               3,291
Column %                      48%          29%                 23%
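
The marginal counts and percentages can be recomputed from the cell counts. This is a small sketch in Python, assuming the table is stored as a dictionary of rows; the names `table`, `column_counts`, and `row_counts` are hypothetical and chosen only for illustration.

```python
# Cell counts from the Wechsler cross-tab (rows: schoolwork, columns: drinking group).
groups = ["Nonbinge", "Infrequent Binge", "Frequent Binge"]
table = {
    "Kept up with schoolwork": [6480, 3231, 1777],
    "Got behind with schoolwork": [414, 859, 1514],
}

# Column marginals: sum down each column.
column_counts = [sum(col) for col in zip(*table.values())]
# Row marginals: sum across each row.
row_counts = {label: sum(cells) for label, cells in table.items()}
grand_total = sum(column_counts)

# Marginal proportions are each marginal count divided by the grand total.
column_pcts = [100 * c / grand_total for c in column_counts]
row_pcts = {label: 100 * c / grand_total for label, c in row_counts.items()}

for g, c, p in zip(groups, column_counts, column_pcts):
    print(f"{g}: {c:,} ({p:.0f}%)")
for label, c in row_counts.items():
    print(f"{label}: {c:,} ({row_pcts[label]:.0f}%)")
```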
Correlation


A cross-tab is a useful way of checking for
correlation between two categorical variables.
Two variables are correlated if knowing an
observation’s value for one of the measurements
helps you to better estimate the likelihoods of
that observation’s value on the other
measurement.
Correlation – Another View

If the counts you observed match the counts you would expect if there were no correlation, then there is no correlation.
Computing Expected Counts

The expected count for a cell is the count you would see if the overall row proportion applied equally to every column; that is, if there were no correlation between the two variables in the cross-tab. Equivalently, expected count = (row total × column total) / grand total.
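
To make the idea concrete, here is a minimal Python sketch that computes expected counts for the Wechsler schoolwork cross-tab as (row total × column total) / grand total. The list-of-lists layout and the variable names are assumptions made for illustration.

```python
# Observed counts; columns are Nonbinge, Infrequent Binge, Frequent Binge.
observed = [
    [6480, 3231, 1777],  # kept up with schoolwork
    [414, 859, 1514],    # got behind with schoolwork
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under "no correlation": row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Show observed next to expected for every cell.
for obs_row, exp_row in zip(observed, expected):
    print([f"{o} vs {e:.1f}" for o, e in zip(obs_row, exp_row)])
```

The expected counts keep the same row and column totals as the observed table; only the way the totals are spread across the cells changes.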
Correlation in the Sample vs. Population

Even though the expected counts may show a correlation in the sample data, how does one show there is correlation in the population?
Perform a hypothesis test.
Step 1: State the Null Hypothesis

H0: No correlation in the population data.

Remember…. H0 is always the hypothesis
of “no difference”. In this instance “no
difference” means no difference between
expected and observed counts – or no
correlation.
Step 2: Collect Sample Data

Use the Wechsler data.

Is the sample data statistically significant? That
is, assuming the null hypothesis is correct (i.e.
there is no correlation between the two
variables), how likely is it to get the sample data
that we observed?
Step 3: Compute the Appropriate Test Statistic

In testing a claim about one sample proportion, the test statistic is the z-test statistic.

Since we have several sample proportions, we’ll need a new test statistic: the chi-squared test statistic.

To compute the test statistic:
Calculate the expected counts for each cell.
We’ll actually calculate the chi-squared test statistic tomorrow.
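
The slides defer the actual calculation, so the following is only a preview sketch. It assumes the standard Pearson form of the chi-squared statistic, which sums (observed − expected)² / expected over every cell, and applies it to the Wechsler schoolwork cross-tab. Variable names are illustrative.

```python
# Observed counts; columns are Nonbinge, Infrequent Binge, Frequent Binge.
observed = [
    [6480, 3231, 1777],  # kept up with schoolwork
    [414, 859, 1514],    # got behind with schoolwork
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected counts under the null hypothesis of no correlation.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
chi_squared = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom for a cross-tab: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"chi-squared = {chi_squared:.1f} with {df} degrees of freedom")
```

The further the observed counts sit from the expected counts, the larger the statistic becomes.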

Step 4: Find the p-value

A p-value means just what it did earlier….
Assuming the null hypothesis is true (that
is, there is no correlation between the two
variables), it is the probability of getting
sample data as extreme or more extreme
than we observed.
Or more simply….
If there is no correlation (association) between the two variables, could the differences between the observed and expected cell counts be attributed to chance?
With this test we’ll use the TI-83 to find the p-value.
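
The slides use the TI-83 for this step. As an alternative illustration only, the Python sketch below finds the p-value for the Wechsler schoolwork cross-tab. It relies on one assumption-specific shortcut: this table has 2 degrees of freedom, and for 2 degrees of freedom the chi-squared upper-tail probability is exactly exp(−x/2). The helper name `chi2_upper_tail_df2` is hypothetical, and the shortcut does not apply to tables with other degrees of freedom.

```python
import math

def chi2_upper_tail_df2(statistic):
    """Upper-tail probability of a chi-squared variable with 2 degrees of freedom."""
    # With 2 degrees of freedom the chi-squared distribution is exponential
    # with mean 2, so P(X >= x) = exp(-x / 2). Valid for df = 2 only.
    return math.exp(-statistic / 2)

# Observed counts; columns are Nonbinge, Infrequent Binge, Frequent Binge.
observed = [
    [6480, 3231, 1777],  # kept up with schoolwork
    [414, 859, 1514],    # got behind with schoolwork
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Pearson chi-squared statistic, with expected = row total * column total / grand total.
chi_squared = sum(
    (o - r * c / grand_total) ** 2 / (r * c / grand_total)
    for row, r in zip(observed, row_totals)
    for o, c in zip(row, col_totals)
)

# The statistic is very large here, so the p-value is tiny and may
# underflow to 0.0 in floating point.
p_value = chi2_upper_tail_df2(chi_squared)
print(f"chi-squared = {chi_squared:.1f}, p-value = {p_value:.3g}")
```

A p-value this small says that differences as large as those observed would be extremely unlikely if there were no correlation in the population.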

Step 5: Reject or Retain the Null Hypothesis
Step 6: State Your Conclusions

There is (is not) statistical evidence to suggest a correlation between the two variables…