Tests of Significance and
Measures of Association
Definitions

Test of Significance – Given a random sample drawn from a population, a
test of significance is a formal test evaluating the probability that an event
or statistical result based on the sample could have happened by random
chance.

With most tests of significance we look for low p-values. The lower the p-value, the lower the probability that the event or result could have happened by random chance. More precisely, the p-value is the probability of drawing a sample whose result is at least as extreme as ours, assuming the null hypothesis is true.
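As an illustration, here is a minimal Python simulation (not part of the original slides) of that interpretation: when the null hypothesis is true, the p-value is simply the share of samples whose test statistic is at least as extreme as the one we observed.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 50

    # One observed sample; the null hypothesis says the population mean is 0.
    observed = rng.normal(loc=0.5, scale=1.0, size=n)
    t_obs = observed.mean() / (observed.std(ddof=1) / np.sqrt(n))

    # Draw many samples under the null (true mean 0) and count how often the
    # statistic is at least as extreme as the observed one.
    t_null = np.empty(10_000)
    for i in range(t_null.size):
        s = rng.normal(loc=0.0, scale=1.0, size=n)
        t_null[i] = s.mean() / (s.std(ddof=1) / np.sqrt(n))

    p_value = np.mean(np.abs(t_null) >= abs(t_obs))
    print(f"simulated two-sided p-value: {p_value:.4f}")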
Null and Alternative Hypotheses

Null hypothesis – Often labeled H0. One perspective: an estimated result
is due to random sampling error. A statement that the estimated
parameter or parameter difference reflects nothing more than random
variation, i.e., that its true value in the population is zero.

Null hypothesis – Another perspective is that the estimated result is
consistent with some externally imposed value, say due to theory or prior
research.

Alternative hypothesis – Often labeled Ha. Your estimated result. If we are
to have confidence in an estimated result, then we need evidence that the
result is unlikely to have arisen from random chance alone.

In inferring from a sample to a population, we can be wrong in either of
two ways.

False rejection error. Falsely rejecting a true null hypothesis. This type of
error is usually called Type I Error. Here the researcher concludes there is
a relationship in the population, when there is actually none.

False acceptance error. Falsely accepting a false null hypothesis. This type of
error is usually called Type II Error. Here the researcher concludes there is
no relationship in the population, when there actually is one.

Which type of error is worse (if you don’t want to mislead science)? Type I
error is worse. Why? A Type I error plants a false finding in the literature,
whereas a Type II error merely leaves a real relationship undetected for
later research to uncover.

Consider the following graph as a way of organizing your thoughts on Type
I and II error.
Comparing the Means of Two
Groups

Consider the variables from the National Election Studies that record
respondents’ feeling-thermometer ratings of the Democratic party.

Men and women differ in their feelings about the Democratic party.
However, in order to test the significance of these differences we need the
standard error of the mean differences. If the variances of the two samples
are the same, then we can simply calculate these from the standard errors
for men and women.

Square the two estimates of the standard error for the samples of
men and women. Sum the squared standard errors. Take the square root of
the sum of the squared standard errors.

That is,

SE(difference in means) = sqrt(SE_men^2 + SE_women^2)
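A minimal sketch of this calculation in Python; the thermometer scores below are hypothetical stand-ins, not the NES data.

    import numpy as np

    # Hypothetical feeling-thermometer scores for the two groups.
    men = np.array([55.0, 60.0, 48.0, 70.0, 52.0, 65.0])
    women = np.array([62.0, 71.0, 58.0, 75.0, 66.0, 69.0])

    # Standard error of each group mean: s / sqrt(n).
    se_men = men.std(ddof=1) / np.sqrt(len(men))
    se_women = women.std(ddof=1) / np.sqrt(len(women))

    # Square the SEs, sum them, take the square root of the sum.
    se_diff = np.sqrt(se_men**2 + se_women**2)
    print(f"mean difference: {women.mean() - men.mean():.2f}")
    print(f"SE of the difference: {se_diff:.2f}")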
Two Approaches to Hypothesis
Testing

Confidence Interval Approach- Here we use the standard error to
determine the range of plausible mean differences in the population; if
that range excludes zero, the difference is statistically significant.
Or

P-value approach- Here we determine the exact probability of obtaining
the observed sample difference, given that the null hypothesis is true.

Look up the appropriate p-value in the t-distribution, which depends on
the degrees of freedom. Last week we estimated one parameter, so the
degrees of freedom was N-1. Here we are estimating two parameters, so
the degrees of freedom is N-2.
Alternatively, let STATA compute the p-value. It returns a p-value of 0.0005.
In other words, if the true mean difference in the population were zero,
there would be only a 0.0005 chance of drawing a sample with a difference
as large as the one observed. Said differently, only 5 in 10,000 samples
drawn from such a population would show a difference this large.
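The same test in Python, again with hypothetical data in place of the NES sample; scipy's two-sample t-test assumes equal variances by default, matching the setup above, and uses N-2 degrees of freedom.

    import numpy as np
    from scipy import stats

    # Hypothetical thermometer scores for the two groups.
    men = np.array([55.0, 60.0, 48.0, 70.0, 52.0, 65.0])
    women = np.array([62.0, 71.0, 58.0, 75.0, 66.0, 69.0])

    # Two-sample t-test of equal means; df = n1 + n2 - 2.
    t_stat, p_value = stats.ttest_ind(men, women)
    print(f"t = {t_stat:.3f}, p = {p_value:.4f}")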
Comparing Sample Proportions

The standard error of a difference in proportions is easy to compute.
Let p1 and p2 be the proportions for two groups with sample sizes n1 and
n2. Then, let q1=(1-p1) and q2=(1-p2) be the complements of these two
proportions. Then, the standard error of the difference in proportions is
given by

SE(p1 - p2) = sqrt(p1*q1/n1 + p2*q2/n2)
Example

Is this a statistically significant difference? Use the rule of thumb that if
the estimated coefficient is roughly twice the standard error, the difference
is statistically significant.

We observe an estimated coefficient (0.093) about 3 times as large as its
standard error (0.031).
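A short Python sketch of the whole calculation; the proportions and sample sizes below are hypothetical, chosen to roughly reproduce the slide's difference of 0.093 and standard error of 0.031 (the slide itself does not report p1, p2, n1, or n2).

    import numpy as np

    # Hypothetical inputs; only the difference (0.093) and its SE (0.031)
    # appear on the slide.
    p1, n1 = 0.543, 500
    p2, n2 = 0.450, 500
    q1, q2 = 1 - p1, 1 - p2

    # Standard error of the difference in proportions.
    se_diff = np.sqrt(p1 * q1 / n1 + p2 * q2 / n2)
    diff = p1 - p2

    # Rule of thumb: significant if the estimate is at least twice its SE.
    print(f"difference = {diff:.3f}, SE = {se_diff:.3f}, "
          f"ratio = {diff / se_diff:.1f}")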
Chi-square Test of Significance

A Chi-square test evaluates whether the observed distribution of cases
across the cells of a table deviates from what one would expect if the null
hypothesis were correct.

The Chi-square test is commonly used when one is conducting a
cross-tabular analysis.

The test is based on the difference between the observed frequencies in a
table and the frequencies expected if there is no relationship; a cell’s
expected frequency is its row total times its column total, divided by the
total number of cases.
Example:

Summing the numbers in bold in the table, we have a Chi-squared statistic of
15.2. We compare this statistic to a table of Chi-squared critical values with
degrees of freedom equal to (r-1)(c-1), where r and c are the numbers of
rows and columns.
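In Python, scipy can run the whole test; the observed table below is hypothetical, since the slide's table (with its bolded cell contributions) is not reproduced in this transcript.

    import numpy as np
    from scipy import stats

    # Hypothetical observed frequencies (rows: groups, columns: responses).
    observed = np.array([[30, 20, 10],
                         [20, 25, 35]])

    # chi2_contingency computes the expected counts (row total * column
    # total / N), the Chi-squared statistic, and its p-value with
    # df = (r - 1)(c - 1).
    chi2, p, dof, expected = stats.chi2_contingency(observed)
    print(f"chi2 = {chi2:.1f}, df = {dof}, p = {p:.4f}")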
Measures of Association

Statisticians have developed a number of measures of association for
evaluating the strength of relationships.

PRE- Proportional Reduction in Error. PRE measures give a number that
ranges from 0 to 1, where 0 represents no association and 1 represents a
perfect association.

Lambda and Somers’ d are two different measures of association,
commonly used in evaluating relationships in a cross-tabular analysis.
Here there are 226+358=584 errors in prediction when using gender to
predict the outcome. Calculate lambda = (E1 - E2) / E1, where E1 is the
number of errors made ignoring the independent variable and E2 = 584 is
the number made using it.
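A generic lambda computation in Python; the cross-tab below is hypothetical, since the slide's own gender table is not reproduced here.

    import numpy as np

    # Hypothetical cross-tab: rows = categories of the independent variable
    # (e.g. gender), columns = categories of the dependent variable.
    table = np.array([[150, 100],
                      [ 90, 160]])

    # E1: errors from predicting the modal column for everyone, ignoring rows.
    e1 = table.sum() - table.sum(axis=0).max()

    # E2: errors from predicting the modal column within each row.
    e2 = sum(row.sum() - row.max() for row in table)

    # Lambda: the proportional reduction in prediction error.
    lam = (e1 - e2) / e1
    print(f"E1 = {e1}, E2 = {e2}, lambda = {lam:.2f}")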

Somers’ dyx- An alternative measure of association, which counts
concordant and discordant pairs.

C(low, frequent) = 7*5 + 7*6 = 77; C(low, occasional) = 5*6 = 30.
Thus, there are 77 + 30 = 107 pairs that are concordant with there being a
positive relationship. Work diagonally and down.

D(high, frequent) = 3*5 + 3*4 = 27; D(high, occasional) = 5*4 = 20.
Thus, there are 27 + 20 = 47 pairs that are discordant with there being a
positive relationship. Work diagonally and up.

There are 107 - 47 = 60 more pairs that fit a positive than a negative
relationship. However, we need the number of tied pairs to convert this
count into a measure of association from 0 to 1.

T(low) = 7*5 + 7*4 + 5*4 = 83; T(high) = 3*5 + 3*6 + 5*6 = 63; T = 83 + 63 = 146.

Somers’ dyx = (107 - 47) / (107 + 47 + 146) = 60/300 = 0.20
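The products above imply a 3x2 table, with rows ordered frequent, occasional, rare and columns low, high; that reconstruction is an inference from the slide's arithmetic, not the original table. A Python sketch that reproduces the calculation from it:

    import numpy as np

    # Table inferred from the products above (rows: frequent, occasional,
    # rare; columns: low, high); an assumption, not the slide's own table.
    table = np.array([[7, 3],
                      [5, 5],
                      [4, 6]])

    n_rows, n_cols = table.shape
    C = D = 0
    for i in range(n_rows):
        for j in range(n_cols):
            # Concordant pairs: cells below and to the right (diagonally down).
            C += table[i, j] * table[i + 1:, j + 1:].sum()
            # Discordant pairs: cells below and to the left (diagonally up).
            D += table[i, j] * table[i + 1:, :j].sum()

    # Ties: pairs in the same column but in different cells.
    T = sum(col.sum() ** 2 - (col ** 2).sum() for col in table.T) // 2
    print(f"C = {C}, D = {D}, T = {T}")                  # C = 107, D = 47, T = 146
    print(f"Somers' dyx = {(C - D) / (C + D + T):.2f}")  # 0.20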