Topics for Today
More on Confidence Intervals
Stat203
Fall 2011 – Week 6 Lecture 3
Page 1 of 28
Example
NAEP quantitative scores
The NAEP is a broad-scope survey conducted
on a _____________ of students in grades 4, 8
and 12 in the United States.
http://en.wikipedia.org/wiki/National_Assessment_of_Educational_Progress
It’s used by lawmakers and educators to form
policy and priorities, such as comparing states:
[ source http://www.schoolinfosystem.org/archives/naep_state.gif ]
and tracking the reading performance of
students (scores are from the ‘reading’
component of the survey)
[ source: http://www.balancedreading.com/2005NAEP.gif ]
… but there are many components:
The NAEP mathematics section tests basic
arithmetic skills and gives a ‘quantitative
score’.
Scores on the test range from 0 to ___.
A person who scores 233 can add the
amounts of two cheques appearing on a bank
deposit slip.
Someone scoring 325 can determine the price
of a meal from a menu.
A person scoring ___ can transform a price in
cents per ounce into dollars per pound.
In a recent year, ___ U.S. men 21 to 25 years
of age were in the NAEP sample.
Their mean quantitative score was x̄ = ___.
These ___ men are a _____________ from
the population of all young men.
On the basis of this sample, what can we infer
about the mean score μ in the population of all
9.5 million young men of these ages in the
U.S.?
The ____________________ tells us the
sample mean x̄ from a random sample will be
close to the population mean μ.
x̄ = 272 so μ is somewhere around 272
But how _____?
To determine how close it is, we need to
remember the _____________________ of x̄,
and use it to construct a confidence interval.
Sampling distribution of x̄:
- _____________________: the mean x̄ of
840 scores has a distribution that is
approximately ______.
- The mean of this normal sampling
distribution is the same as the mean μ of
the entire population.
- The __________________ of x̄ from a
sample of 840 is σ/√840, where σ is the
standard deviation of the distribution of
individual scores.
Suppose σ = __. The standard deviation of x̄
is 60/√840 = ___
If we choose many samples of size 840 and
find the means and display their distribution,
it would look more like a ______ distribution
with mean μ and standard deviation = 2.1.
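The arithmetic on this slide can be checked with a short Python sketch. It computes σ/√n with the slide's values (σ = 60, n = 840) and then simulates repeated sampling. The population mean of 272 and the normal shape of the individual scores are assumptions made only so the simulation has something to draw from; the real population mean is unknown.

```python
import math
import random
import statistics

SIGMA = 60.0   # population standard deviation used on the slide
N = 840        # sample size from the NAEP example
MU = 272.0     # hypothetical "true" mean, assumed only for the simulation

# Standard deviation of the sample mean: sigma / sqrt(n)
standard_error = SIGMA / math.sqrt(N)

def simulate_sample_means(num_samples, rng):
    """Draw num_samples samples of size N and return the mean of each."""
    return [
        statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(N))
        for _ in range(num_samples)
    ]

rng = random.Random(203)  # fixed seed so the run is repeatable
means = simulate_sample_means(500, rng)

print(f"sigma / sqrt(n)           = {standard_error:.2f}")
print(f"sd of 500 simulated means = {statistics.stdev(means):.2f}")
```

The two printed numbers should be close to each other, which is what the sampling distribution predicts.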
Statistical Confidence
The 68-95-99.7 rule says that in 95% of all
samples, the mean x̄ will be within _ standard
deviations of the _______________ score μ.
So in 95% of the _______ the population
parameter μ is between
x̄ - 4.2 and x̄ + 4.2
Our sample gave x̄ = ___
Statistically speaking we are 95% confident
that the unknown mean (μ) score lies between
x̄ - 4.2 = 272 - 4.2 = ______
and
x̄ + 4.2 = 272 + 4.2 = _____
95% confidence means the method produces an
interval that captures μ in 95% of all samples.
x̄ ± 4.2 ________________
is called a 95% confidence interval for μ.
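As a quick check of the arithmetic, the interval can be computed in a few lines of Python. This is only a sketch using the slide's rounded values: a standard deviation of 2.1 for x̄ and the "2 standard deviations" reading of the 68-95-99.7 rule.

```python
x_bar = 272.0            # sample mean from the slide
sd_of_mean = 2.1         # standard deviation of the sample mean, rounded as on the slide
margin = 2 * sd_of_mean  # "2 standard deviations" from the 68-95-99.7 rule

lower = x_bar - margin
upper = x_bar + margin
print(f"95% confidence interval: {lower:.1f} to {upper:.1f}")
```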
Recognize there are 2 possibilities:
1. The interval between 267.8 and 276.2
________ the true population parameter μ.
2. Our sample was one of the few samples
for which x̄ is __________ 4.2 points of μ.
Only __ of the samples will give such
results.
[ source http://www.southalabama.edu/coe/bset/johnson/lectures/lec16_files/image006.jpg ]
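The two possibilities can be illustrated with a small simulation. The sketch below assumes a "true" mean of 272 (in practice μ is unknown) and draws each sample mean directly from a normal sampling distribution with standard deviation 2.1, then counts how often the interval x̄ ± 4.2 contains μ. The function name and the seed are illustrative choices.

```python
import random

MU = 272.0          # hypothetical true population mean (unknown in practice)
SD_OF_MEAN = 2.1    # standard deviation of the sample mean, from the slides
MARGIN = 4.2        # margin of error used in the lecture

def coverage(num_samples, rng):
    """Fraction of simulated intervals (sample mean ± MARGIN) that contain MU."""
    hits = 0
    for _ in range(num_samples):
        x_bar = rng.gauss(MU, SD_OF_MEAN)   # one simulated sample mean
        if x_bar - MARGIN <= MU <= x_bar + MARGIN:
            hits += 1
    return hits / num_samples

rng = random.Random(2011)  # fixed seed so the run is repeatable
rate = coverage(10_000, rng)
print(f"intervals containing mu: {rate:.1%}")
```

Roughly 95 out of every 100 simulated intervals should contain μ; the remainder are the unlucky samples described in possibility 2.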
Confidence Interval
Estimate ± _______________
Estimate: Our guess for the unknown
parameter. The estimate in our case is the
statistic x̄.
Margin of Error: Measures the _________ of
our estimate, based on the variability of the
estimate.
Confidence Level: ___________ that the
interval will _______ the true parameter. In
repeated sampling we would expect 95% of
the intervals to contain the true population
parameter μ.
Margin of Error
A __% confidence interval for the population
mean μ has a margin of error
____ * (standard deviation of x̄)
= 1.96 * σ/√n
The margin of error for a __% confidence
interval for the mean μ is given by:
_____ * σ/√n
The margin of error for a __% confidence
interval for the mean μ is given by:
____ * σ/√n
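The multiplier in front of σ/√n is a critical value of the standard normal distribution. The sketch below looks it up with Python's standard library for a few common confidence levels. The levels 90%, 95% and 99% are chosen here for illustration, and the helper names `z_star` and `margin_of_error` are hypothetical; σ = 60 and n = 840 are reused from the NAEP example.

```python
import math
from statistics import NormalDist

def z_star(confidence):
    """Critical value z* with central area `confidence` under the standard normal curve."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def margin_of_error(confidence, sigma, n):
    """z* times sigma / sqrt(n)."""
    return z_star(confidence) * sigma / math.sqrt(n)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {z_star(level):.3f}, "
          f"margin = {margin_of_error(level, sigma=60, n=840):.2f}")
```

Higher confidence needs a larger multiplier, so the interval gets wider for the same data.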
Example: Interpreting a CI
A poll on voting preferences for candidates
interviewed 1025 people randomly selected in
the Vancouver area. The poll found that 46%
of the people surveyed said they preferred the
Liberal party.
a) The poll announced a margin of error of
±3 percentage points with a 95%
confidence level. What is the 95%
confidence interval for the percent of all
people who will vote Liberal in the
upcoming election?
b) What are your conclusions?
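A minimal sketch of part a) in Python: the interval is simply the estimate plus or minus the announced margin. The cross-check at the end uses the usual large-sample formula for a proportion, which is not stated in this lecture and is included only as an assumption to show that a margin of about 3 points is plausible for 1025 respondents.

```python
import math

n = 1025                # people interviewed
p_hat = 0.46            # sample proportion preferring the Liberal party
reported_margin = 0.03  # margin of error announced by the poll

lower = p_hat - reported_margin
upper = p_hat + reported_margin
print(f"95% CI: {lower:.0%} to {upper:.0%}")

# Cross-check (standard large-sample formula, not from the lecture):
# margin is roughly 1.96 * sqrt(p_hat * (1 - p_hat) / n)
approx_margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
print(f"approximate margin: {approx_margin:.3f}")
```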
Example: Analyzing chemical data
A manufacturer of chemical products analyses
a sample from each batch of a product to
verify the concentration of a particular
ingredient. The chemical analysis is not very
accurate.
Repeated measurements on the same batch
give different results and are approximately
normally distributed.
The analysis procedure has no bias and the
population mean μ is the true concentration of
the sample. The standard deviation of this
distribution is known to be σ = 0.0068
grams/litre.
The lab analysed each sample three times and
reported the average reading.
Three analyses of one sample give the
following concentrations:
0.8403
0.8363
0.8447
a) Construct a 95% confidence interval for the
true concentration μ.
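One way to work part a) is sketched below in Python, using the margin of error 1.96 * σ/√n from the earlier slide with the known σ = 0.0068 and n = 3 readings. The variable names are illustrative.

```python
import math
import statistics

readings = [0.8403, 0.8363, 0.8447]  # grams/litre, from the slide
sigma = 0.0068                       # known standard deviation of one analysis
z = 1.96                             # 95% multiplier from the margin-of-error slide

x_bar = statistics.fmean(readings)
margin = z * sigma / math.sqrt(len(readings))
lower = x_bar - margin
upper = x_bar + margin

print(f"mean = {x_bar:.4f}, margin = {margin:.4f}")
print(f"95% CI: {lower:.4f} to {upper:.4f}")
```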
b) Management asks the lab to produce
results that are accurate to within ± 0.001 with
a 95% confidence. How many samples should
be taken to comply with this request?
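For part b), setting the margin of error 1.96 * σ/√n to at most 0.001 and solving for n gives n ≥ (1.96 * σ / 0.001)². The sketch below does that calculation and rounds up, since the number of measurements must be a whole number. The variable names are illustrative.

```python
import math

sigma = 0.0068   # known standard deviation of one analysis
z = 1.96         # 95% multiplier
target = 0.001   # requested margin of error

# Solve z * sigma / sqrt(n) <= target for n, then round up.
n_required = math.ceil((z * sigma / target) ** 2)
print(f"measurements needed: {n_required}")

# Check: the margin at n_required meets the target, one fewer does not.
print(f"margin at n = {n_required}: {z * sigma / math.sqrt(n_required):.6f}")
print(f"margin at n = {n_required - 1}: {z * sigma / math.sqrt(n_required - 1):.6f}")
```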
In reality, σ is usually UNKNOWN
Up to now, all the examples have given you
the standard deviation from the __________
(σ), but this is rarely (if ever) known. Instead,
we will have to use the standard deviation
from the ______ (s)
Remember, the standard deviations σ and s
measure the same thing, just one is measured on
the __________ and the other is measured on
the ______.
However, when the population standard
deviation is unknown, we ____ to use the
sample standard deviation.
This doesn’t come for free.
Since we’re estimating something else from
the sample (instead of just ‘knowing’ it), our
confidence intervals get _____.
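The cost of estimating the standard deviation can be seen in a simulation. The sketch below assumes hypothetical true values (μ = 0.84, σ = 0.0068, loosely borrowed from the chemical example) and samples of size 3. It builds intervals with the multiplier 1.96 twice: once with the known σ and once with the sample standard deviation s plugged in, and reports how often each kind of interval contains μ.

```python
import math
import random
import statistics

MU, SIGMA = 0.84, 0.0068   # hypothetical true values, used only to generate data
N = 3                      # three readings per sample, as in the chemical example
Z = 1.96                   # 95% multiplier from the normal distribution

def coverage(use_sample_sd, trials, rng):
    """Fraction of intervals (mean ± Z * sd / sqrt(N)) that contain MU."""
    hits = 0
    for _ in range(trials):
        data = [rng.gauss(MU, SIGMA) for _ in range(N)]
        x_bar = statistics.fmean(data)
        sd = statistics.stdev(data) if use_sample_sd else SIGMA
        margin = Z * sd / math.sqrt(N)
        if abs(x_bar - MU) <= margin:
            hits += 1
    return hits / trials

rng = random.Random(6)  # fixed seed so the run is repeatable
known = coverage(False, 20_000, rng)
estimated = coverage(True, 20_000, rng)
print(f"coverage with known sigma: {known:.1%}")
print(f"coverage with s and 1.96 : {estimated:.1%}")
```

With s in place of σ and the same multiplier, noticeably fewer than 95% of the intervals capture μ, so the multiplier has to change to keep the stated confidence level.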
The t-distribution
The t-distribution has a shape similar to the
normal distribution.
(t in red, normal in blue)
[source http://www.mechanical-writings.com/img/gt/confidence-interval-t-distribution/T_distribution_1df.png ]
Unlike the normal distribution (which depends
only on µ and σ), the t-distribution depends on
something called the degrees of freedom.
[source http://02.edu-cdn.com/files/static/mcgrawhillprof/9780071621885/ESTIMATION_AND_CONFIDENCE_INTERVALS_07.GIF ]
or … you can check out this applet:
http://www.stat.tamu.edu/~jhardin/applets/signed/T.html
Look at the panel on the left side and change
the degrees of freedom (slider at the top of the
page) to see how the t-distribution becomes
more ‘normal’ as the degrees of freedom
increase.
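If the applet is unavailable, the same behaviour can be seen numerically. The sketch below evaluates the standard formula for the t density (written here with `math.lgamma` to avoid overflow for large degrees of freedom) at the centre and in the tail, and compares it with the standard normal density. The function name `t_pdf` and the chosen degrees of freedom are illustrative.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

normal = NormalDist()  # standard normal: mean 0, standard deviation 1
for df in (1, 2, 5, 30, 1000):
    print(f"df={df:>4}: t density at 0 = {t_pdf(0, df):.4f}, "
          f"at 3 = {t_pdf(3, df):.4f}")
print(f"normal : density at 0 = {normal.pdf(0):.4f}, at 3 = {normal.pdf(3):.4f}")
```

For small degrees of freedom the t curve is lower in the middle and heavier in the tails; as the degrees of freedom grow, both values approach those of the normal curve.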
Our margin of error will use the t-distribution
instead of the z.
More on this … next time.
Today’s Topics
Confidence Intervals
- Constructing and interpreting confidence
intervals
- t-statistic is used if σ is unknown
Reading for next lecture
No New Reading