10-2
Estimating a Population Mean
(σ Unknown)
Confidence Intervals Involving Z
Using the Calculator
The t distributions
When we substitute the standard error of xbar, s/√n,
for its standard deviation, σ/√n, the distribution of
the resulting statistic, t, is not Normal.
We call it the t distribution.
The t distributions
There is a different t-distribution for each sample
size n.
We specify a t distribution by giving its degrees of
freedom, which are equal to n-1.
We will write the t distribution with k degrees of
freedom as t(k) for short.
We also will refer to the standard Normal
distribution as the z-distribution.
Comparing t and z distributions
Y1= normalpdf(x)
Y2= tpdf(x,2) (DISTR menu)
Window X[-3,3] Y[-0.1,0.4]
Y2= tpdf(x,9)
Y2= tpdf(x,30)
Comparing t and z distributions
Compare the shape,
center, and spread of
the t-distribution with
the z-distribution.
As the degrees of freedom k increase, the t(k)
density curve approaches the N(0,1) curve ever
more closely. As the sample size increases, s
estimates σ ever more closely.
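The calculator comparison above can also be sketched in Python. This is a minimal illustration using only the standard library; the function names normal_pdf and t_pdf are chosen here to mirror the calculator commands and are not part of the original slides. It evaluates both densities at the center (x = 0) and in the tail (x = 3) for the same degrees of freedom graphed above.

```python
import math

def normal_pdf(x):
    """Density of the standard Normal (z) distribution at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def t_pdf(x, k):
    """Density of the t distribution with k degrees of freedom at x."""
    # lgamma avoids overflow when k is large
    log_coef = math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
    coef = math.exp(log_coef) / math.sqrt(k * math.pi)
    return coef * (1 + x * x / k) ** (-(k + 1) / 2)

# Same degrees of freedom as the calculator graphs: 2, 9, 30
for k in (2, 9, 30):
    print(f"t({k}): center {t_pdf(0, k):.4f}, tail at x=3 {t_pdf(3, k):.4f}")
print(f"z: center {normal_pdf(0):.4f}, tail at x=3 {normal_pdf(3):.4f}")
```

The t curves are lower at the center and higher in the tails than the z curve, and the gap shrinks as the degrees of freedom grow.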
Finding t with Table C
Suppose you want to construct a 95% confidence
interval for the mean mu of a population based on
a SRS of size n=12. What critical value t* should
you use?
If you have a TI-84+, you
can use invT((1+C)/2, df)
to find t*.
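For readers without a TI-84+, the same critical value can be approximated in Python. The sketch below is not the calculator's algorithm: it is a simple stand-in that integrates the t density numerically (Simpson's rule) and then searches by bisection. The names t_pdf, t_cdf, and inv_t are chosen for this illustration only.

```python
import math

def t_pdf(x, k):
    """Density of the t distribution with k degrees of freedom at x."""
    log_coef = math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
    coef = math.exp(log_coef) / math.sqrt(k * math.pi)
    return coef * (1 + x * x / k) ** (-(k + 1) / 2)

def t_cdf(x, k, steps=2000):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0, k) + t_pdf(x, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, k)
    return 0.5 + total * h / 3

def inv_t(area, df):
    """Value with the given area to its left (0.5 <= area < 1), by bisection."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < area:  # widen the bracket until it contains the answer
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

C = 0.95   # confidence level
n = 12     # sample size from the question above
t_star = inv_t((1 + C) / 2, n - 1)
print(round(t_star, 3))  # about 2.201 for df = 11
```

The argument (1+C)/2 is the area to the left of t*: the middle area C plus the lower tail of size (1-C)/2.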
One sample t interval for mu
Recall the inference tool-box
One sample t interval for mu
Environmentalists, government officials, and
vehicle manufacturers are all interested in
studying the auto exhaust emissions produced by
motor vehicles. The table gives the nitrogen oxide
(NOX) levels for a random sample of light-duty
engines of the same type.
One sample t interval for mu
Construct a 95% confidence interval for the mean
amount of NOX emitted by light-duty engines of
this type.
One sample t interval for mu
Step 1: Parameter
Step 2: Conditions
Step 3: Calculations
$\bar{x} \pm t^{*} \frac{s}{\sqrt{n}}$
Step 4: Interpretation
Remember the three C's! Conclusion, Connection, Context
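The calculation step (xbar plus or minus t* times s over the square root of n) can be sketched in Python as follows. The NOX table is not reproduced in this transcript, so the sample values below are hypothetical placeholders, and the function name one_sample_t_interval is chosen for this illustration. The critical value is passed in, as it would be after looking it up in Table C or with invT.

```python
import math
import statistics

def one_sample_t_interval(data, t_star):
    """Return (low, high) for xbar +/- t* * s / sqrt(n)."""
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    margin = t_star * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical measurements, NOT the NOX data from the slide
sample = [1.28, 1.17, 1.16, 1.08, 0.60, 1.32,
          1.24, 0.71, 0.49, 1.38, 1.20, 0.78]

# n = 12, so df = 11 and t* = 2.201 for 95% confidence
low, high = one_sample_t_interval(sample, t_star=2.201)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

The code only covers Step 3. The parameter, the conditions, and the interpretation in context still have to be written out by hand.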
One sample t interval for mu
Note: When the actual df does not appear in
Table C, use the greatest df available that is less
than your desired df.
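The rule in this note amounts to rounding the degrees of freedom down to the nearest row of the table. A small sketch, assuming Table C lists df 1 through 30 and then 40, 50, 60, 80, 100, and 1000 (check this list against your own copy of the table); the name table_df is hypothetical.

```python
import bisect

# Assumed df rows of Table C; verify against the table you are using
TABLE_C_DF = list(range(1, 31)) + [40, 50, 60, 80, 100, 1000]

def table_df(desired_df):
    """Return the greatest tabled df that does not exceed desired_df."""
    i = bisect.bisect_right(TABLE_C_DF, desired_df)
    if i == 0:
        raise ValueError("df must be at least 1")
    return TABLE_C_DF[i - 1]

print(table_df(11))  # 11 (appears in the table, so use it directly)
print(table_df(45))  # 40 (45 is not listed, so drop down to 40)
```

Rounding down gives a slightly larger t* and therefore a slightly wider, more conservative interval.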
Recall matched pairs design...
Matched pairs is a form of block design in which
just two treatments are compared.
Subjects are matched in pairs and each treatment
is given to one subject in each pair.
or...
each subject receives both treatments in some
randomized order
Is it a matched pairs design?
When you have two sets of data, ask yourself if
there is something that links the values in pairs
and, therefore, prevents them from being
independent. If so, a one-sample procedure on the
paired differences is appropriate.
Inference procedures for two samples assume
that the samples are selected independently of
each other. This assumption does not hold when
the same subjects are measured twice.
Too many numbers: what do I do?
Notice that paired t procedures are also useful for
before-and-after observations on the same
subjects.
Warning: I probably shouldn’t even
show you the next slide.
Don’t even think about writing it down…
it’s on page 651.
The parameter µ in a paired t procedure
is...



The mean difference in the responses to the
two treatments within matched pairs of subjects
in the entire population (when subjects are
matched in pairs), or...
The mean difference in response to the two
treatments for individuals in the population
(when the same subject receives both
treatments), or...
The mean difference between before-and-after
measurements for all individuals in the
population (for before-and-after observations on
the same individuals).
The parameter µ in a paired t procedure
is...

Okay, so it's the mean of the differences for the
entire population.
Paired t procedures
Example 10.10 pg 651
Construct and interpret a 90% confidence interval
for the mean change in depression score.
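A paired t interval is a one-sample t interval computed on the within-pair differences. The depression scores from Example 10.10 are not included in this transcript, so the sketch below uses hypothetical before-and-after values; the function name paired_t_interval is also an assumption made for illustration.

```python
import math
import statistics

def paired_t_interval(before, after, t_star):
    """One-sample t interval on the differences (after - before)."""
    if len(before) != len(after):
        raise ValueError("paired data must have equal lengths")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    dbar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # standard deviation of the differences
    margin = t_star * s_d / math.sqrt(n)
    return dbar - margin, dbar + margin

# Hypothetical scores for 5 subjects, NOT the data from Example 10.10
before = [10, 12, 9, 14, 11]
after = [8, 11, 9, 10, 9]

# n = 5 pairs, so df = 4 and t* = 2.132 for 90% confidence
low, high = paired_t_interval(before, after, t_star=2.132)
print(f"90% CI for mean change: ({low:.3f}, {high:.3f})")
```

Note that the degrees of freedom come from the number of pairs, not the total number of measurements.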
Randomization
Random Selection of individuals for a statistical
study allows us to generalize the results of that
study to a larger population.
Random Assignment of treatments to subjects in
an experiment lets us investigate whether there is
evidence of a treatment effect (cause and effect).
That is, it lets us compare results of different
treatments.
The t-procedures are not robust against outliers,
because xbar and s are not resistant to outliers.
One sample t interval for mu
Without the outlier, the interval is much narrower
and centered differently (1.165, 1.421). Can we
really be 95% confident in either interval?
No, since the outlier suggests that the population
may not be Normal.
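To see why xbar and s are not resistant, compare summary statistics with and without a single extreme value. The data below are hypothetical and are not the values behind the interval quoted above; the helper name summarize is chosen for this sketch.

```python
import statistics

def summarize(data):
    """Return (mean, sample standard deviation, median) of the data."""
    return (statistics.mean(data),
            statistics.stdev(data),
            statistics.median(data))

# Hypothetical measurements, with one extreme value added for comparison
clean = [1.2, 1.3, 1.1, 1.4, 1.3, 1.2, 1.5, 1.3]
with_outlier = clean + [4.0]

for label, data in (("without outlier", clean),
                    ("with outlier", with_outlier)):
    mean, s, median = summarize(data)
    print(f"{label}: xbar={mean:.3f}, s={s:.3f}, median={median:.3f}")
```

The median barely moves, while xbar shifts and s grows several times larger, which both recenters and widens any t interval built from them.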
Robust Procedures
T procedures are not robust against outliers, but
they are quite robust against non-Normality of the
population when there are no outliers, especially
when the distribution is roughly symmetric.
Robust Procedures
Larger samples improve the accuracy of critical
values from the t distribution when the population
is not Normal.
For most purposes, you can safely use the one-sample t procedures when n ≥ 15 unless an
outlier or some strong skewness is present.
Why can’t I use a z procedure if n is large?
Because σ is unknown!
Can we use t?
Given the percent of each state's residents who
are at least 65 years of age, can or should we use
t to approximate the mean of these percents?
Hint: This is a population, not a sample.
Can we use t?
Given the time of the first lightning strike each day in a
mountain region of Colorado, can or should we use t
procedures to draw conclusions about the mean time of a
day's first lightning strike with complete confidence?
Hint: n = 70 and the distribution is what shape?
Can we use t?
Given the distribution of word lengths in Shakespeare's
plays?
Hint: n is unknown.