Inferential statistics by example
Maarten Buis
Monday 2 January 2005
Two statistics courses
• Descriptive Statistics (McCall, part 1)
• Inferential Statistics (McCall, part 2 and 3)
Course Material
• McCall: Fundamental Statistics for
Behavioral Sciences.
• SPSS (available from Surfspot.nl)
• Lectures: 2 x a week
• computer labs: 1 x a week.
• course website
Setup of lectures
• Recap of material assumed to be known
• New Material
• Student Recap
How to pass this course
• Read assigned portions of McCall before
each lecture
• Do the exercises
• Do the computer lab assignments, and
hand them in before Tuesday 17:00!
• come to the computer lab
• come to the lectures
• ask questions: during class or to the
course mailing list
What is inference?
• Drawing general conclusions from partial
information
• Based on your observations some
conclusions are more plausible than
others.
• Compare with logic
Sources of uncertainty in inference
• Sample
• Measurement
• Model
• Typos when typing the data into SPSS
• Inference, as discussed here, assumes
that random sampling error is by far the
most dominant source of uncertainty.
How is inference done?
• If the probability of observing the data when the
null hypothesis is true is very small, then either we
have drawn a very weird sample or the null
hypothesis is false. (Ronald Fisher)
• We use a “good” procedure to choose between
two hypotheses, whereby “good” means that you
draw the right conclusion in 95% of the times
you use that procedure. (Jerzy Neyman and
Egon Pearson)
PrdV
• New populist party, wanted to participate
in the next election if 41% of the Dutch
population thought that “the PrdV would be
an asset to Dutch politics”.
• This was asked to a sample of 2,598
people between, and on 16 December
only 31% agreed.
• Peter R. de Vries decided not to
participate in the next election.
The Inference Problem
• The 31% who approve are 31% of the
people in the sample.
• Peter R. de Vries doesn’t care about what
people in the sample think, he cares about
what all the people in the Netherlands
think.
• Could it be that he has drawn a “weird”
sample, and that in the Netherlands 41%
or more really think he would be an asset
to Dutch politics?
Two hypotheses
• H0: 41% or more support PrdV
• HA: less than 41% support PrdV
A thought experiment (1)
• If support for PrdV in the Netherlands is 41%
and we draw 100 random samples of 2598
persons, then we get 100 estimates of the
support for PrdV, some of them a bit too high,
some of them a bit too low.
• We would expect about 5 of these samples to show a
support for PrdV of 39% or less.
• If we find a support for PrdV of 39% or less and
reject H0, then we have followed a procedure
that, when H0 is true, leads to the right decision
95% of the times we use it.
What does that 39% mean?
• We propose the following procedure: if we
find a support for PrdV of less than x%,
then we reject H0.
• We choose x in such a way that the
probability of rejecting H0 when we
shouldn’t is only 5%
• The reason for mistakenly rejecting H0 is
drawing a ‘weird’ sample.
Where does that 39% come from?
• If H0 is true, then we draw a sample from a
population in which the support for PrdV is 41%
• We can let the computer draw many (100,000)
samples and calculate the mean in each
sample.
• 5,000, or 5%, of these samples have a mean of
39% or less.
• So if we reject H0 when we find a support of 39%
or less, then the probability of making a mistake
is 5%
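The simulation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the lecture: the function name and the seed are made up here, and it draws only 2,000 samples instead of 100,000 so that it finishes quickly.

```python
import random

P_NULL = 0.41      # support for PrdV if H0 is (just) true
N = 2598           # sample size from the slides
N_SAMPLES = 2000   # assumption: far fewer than the 100,000 on the slide, for speed


def simulate_sample_proportions(p, n, n_samples, seed=1):
    """Draw n_samples random samples of size n; return the share of supporters in each."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(n_samples):
        # Each person supports PrdV with probability p.
        supporters = sum(rng.random() < p for _ in range(n))
        proportions.append(supporters / n)
    return proportions


proportions = sorted(simulate_sample_proportions(P_NULL, N, N_SAMPLES))

# The cutoff is the value below which about 5% of the simulated samples fall.
cutoff = proportions[int(0.05 * N_SAMPLES)]
print(f"5% of the simulated samples show a support below about {cutoff:.3f}")
```

With more simulated samples the cutoff becomes more stable; it should land a little above .39, which the slides round to 39%.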
Sampling distribution of support for PrdV
[Figure: horizontal axis "% support for PrdV", from .36 to .46; vertical axis from 0 to 10,000]
Where did that 39% come from?
• If we draw many random samples, and compute
the mean in each sample, then these means will be
approximately normally distributed with a mean of .41
and a standard deviation of $SD / \sqrt{N}$.
• Remember that the sample size is 2598, and
the SD of a proportion is $\sqrt{p (1 - p)}$, so the
standard deviation of the distribution of means
is $\sqrt{.41 \times (1 - .41)} / \sqrt{2598} \approx .0096$
• 5% of the samples have a support for PrdV of
less than 39%
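The same numbers can be checked without simulation. The sketch below uses `statistics.NormalDist` from the Python standard library and relies on the normal approximation from this slide; the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

P_NULL = 0.41   # support for PrdV under H0
N = 2598        # sample size

sd = sqrt(P_NULL * (1 - P_NULL))   # SD of a proportion
standard_error = sd / sqrt(N)      # SD of the distribution of means

# Normal approximation of the sampling distribution under H0.
sampling_distribution = NormalDist(mu=P_NULL, sigma=standard_error)

# Value below which 5% of the sample means are expected to fall.
cutoff = sampling_distribution.inv_cdf(0.05)

print(f"standard error: {standard_error:.4f}")
print(f"5% cutoff:      {cutoff:.4f}")
```

The cutoff comes out at about 39.4%, which is the "39%" used in the slides after rounding.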
Neyman Pearson hypothesis
testing
• This procedure is the Neyman Pearson
hypothesis testing approach
• Note that it tells us something about the quality of
the procedure we use to make a decision,
not about the strength of evidence against
H0
Thought experiment (2)
• If H0 is true, then the probability of
drawing a sample of size 2598 with a
support for PrdV of 31% or less is
1.041 × 10^-25.
• This is so small that we think it is safe to
reject H0.
Where did that 1.041 × 10^-25 come from?
• Of the 100,000 samples that were drawn from
the population assuming H0 is true, none had a
mean of less than .31
• So the probability of drawing this or a more
extreme sample when H0 is true is less than
1/100,000.
• Remember that if H0 is true, the distribution of
means obtained from many samples is normal
with a mean of .41 and a standard deviation of
.0096
• The proportion of samples with a mean less than
.31 is 1.041 × 10^-25
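A tail probability this small is easy to lose to rounding, so the sketch below computes it with `math.erfc` instead of going through `1 - something`. The helper name is made up for this example. Using the rounded standard deviation of .0096 should give a value close to the 1.041 × 10^-25 on the slide; the unrounded standard error gives a somewhat larger value of the same order of magnitude.

```python
from math import erfc, sqrt


def lower_tail_probability(x, mean, sd):
    """P(X <= x) for a normal distribution, accurate even far out in the tail."""
    z = (x - mean) / sd
    return 0.5 * erfc(-z / sqrt(2))


# Rounded standard deviation, as used on the slide.
p_rounded = lower_tail_probability(0.31, mean=0.41, sd=0.0096)

# Unrounded standard error for p = .41 and N = 2598.
exact_se = sqrt(0.41 * (1 - 0.41) / 2598)
p_unrounded = lower_tail_probability(0.31, mean=0.41, sd=exact_se)

print(f"with sd = .0096:       {p_rounded:.3e}")
print(f"with unrounded sd:     {p_unrounded:.3e}")
```

Either way the conclusion is the same: a sample with 31% support is practically impossible if H0 is true.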
Fisher hypothesis testing
• This procedure is Fisher hypothesis
testing.
• Note that it gives us a measure of
evidence against H0, but it does not give
us an indication of how likely we are to
make the wrong decision.
Fisher vs. Neyman Pearson
• You will draw the same conclusion
whichever method you use.
• However, it really helps to choose one
approach when writing your results down.
Limits to inference
• More importantly, both assume random
sampling, and we almost never have that.
• Testing is more helpful to determine
whether the data is ‘screaming’ or
‘whispering’ at us.
• Knowing the reasoning behind statistical
inference will help you determine the
weight you should assign to conclusions
derived from statistical tests.
Terminology (1)
• The distribution of means obtained from different
samples is the sampling distribution of the
mean.
• The standard deviation of the sampling
distribution is the standard error.
• The proportion of samples that wrongly reject H0
(when H0 is true) is the significance level, or α,
or the Type I error rate.
• The proportion of samples that wrongly fail to reject
H0 (when H0 is false) is the Type II error rate, or β.
• The proportion of samples that will rightly reject H0 is
the power.
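To make these terms concrete, the sketch below computes α, β and the power for the PrdV decision rule. The power and β depend on what the true support is, which the slides do not specify; the value of 38% used here is purely hypothetical, and the helper name is made up for this example.

```python
from math import sqrt
from statistics import NormalDist

N = 2598
P_NULL = 0.41   # support under H0
P_TRUE = 0.38   # hypothetical true support, NOT from the slides


def sampling_distribution(p, n):
    """Normal approximation of the sampling distribution of a proportion."""
    return NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))


# Decision rule: reject H0 if the sample support is below the 5% cutoff.
cutoff = sampling_distribution(P_NULL, N).inv_cdf(0.05)

# Type I error rate: probability of rejecting when H0 is true.
alpha = sampling_distribution(P_NULL, N).cdf(cutoff)

# Power: probability of rejecting when the true support is P_TRUE.
power = sampling_distribution(P_TRUE, N).cdf(cutoff)

# Type II error rate: probability of failing to reject when H0 is false.
beta = 1 - power

print(f"alpha = {alpha:.3f}, beta = {beta:.3f}, power = {power:.3f}")
```

Moving the hypothetical true support closer to 41% lowers the power, because samples from the two populations become harder to tell apart.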
Terminology (2)
• The probability of observing this or more extreme
data, given that H0 is true, is the p-value.
• The maximum p-value that will cause you to
reject H0 is also the level of significance.
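The link between the p-value and the level of significance can be illustrated by applying both decision rules to the same data. In this sketch the function names are made up, the normal approximation is assumed, and the second observed value (40%) is a hypothetical result added for contrast.

```python
from math import erfc, sqrt
from statistics import NormalDist

P_NULL, N, ALPHA = 0.41, 2598, 0.05
SE = sqrt(P_NULL * (1 - P_NULL) / N)

# Neyman Pearson: fix the cutoff in advance so that alpha is 5%.
CUTOFF = NormalDist(mu=P_NULL, sigma=SE).inv_cdf(ALPHA)


def p_value(observed):
    """Probability of this or a lower support if H0 is true."""
    z = (observed - P_NULL) / SE
    return 0.5 * erfc(-z / sqrt(2))


def reject_by_cutoff(observed):
    return observed <= CUTOFF


def reject_by_p_value(observed):
    return p_value(observed) <= ALPHA


# .31 is the poll result; .40 is a hypothetical result for comparison.
for observed in (0.31, 0.40):
    print(observed, reject_by_cutoff(observed), reject_by_p_value(observed))
```

Both rules reject H0 for the observed 31% and neither rejects it for the hypothetical 40%.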
What to do before Wednesday?
• Read Chapter 8
• Do exercises of chapter 8