The t-Distribution
Estimating a Population Mean
t-Distribution
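Sample sizes are often small, and the standard
deviation of the population is usually unknown.
In that situation statisticians rely on the
distribution of the t statistic (also known as
the t score):
$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$
The corresponding confidence interval for the
population mean is:
$\bar{x} \pm t^* \dfrac{s}{\sqrt{n}}$
where $\bar{x}$ is the sample mean, s is the sample
standard deviation, n is the sample size, and t* is
the critical value of the t distribution.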
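The t statistic and the interval built from it can be sketched in a few lines of Python. This is only an illustration: the function names, the sample values, and the hypothesized mean are assumptions made up for the example, and t* is taken from a standard t table rather than computed.

```python
import math
import statistics

def t_statistic(sample, mu0):
    """t = (x_bar - mu0) / (s / sqrt(n)) for a hypothesized mean mu0."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return (x_bar - mu0) / (s / math.sqrt(n))

def t_interval(x_bar, s, n, t_star):
    """Return (lower, upper) for x_bar +/- t_star * s / sqrt(n)."""
    margin = t_star * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample, for illustration only
sample = [4.8, 5.1, 5.4, 4.9, 5.3]
t = t_statistic(sample, mu0=5.0)

# 2.776 is the table value of t* for 95% confidence with 4 degrees of freedom
lower, upper = t_interval(statistics.mean(sample), statistics.stdev(sample),
                          len(sample), t_star=2.776)
print(f"t = {t:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```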
Degrees of freedom
There are actually many different t distributions. The
particular form of the t distribution is determined by its
degrees of freedom. The degrees of freedom refers to
the number of independent observations in a set of
data.
df = n-1
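One way to see why df = n - 1 when estimating a mean: the deviations from the sample mean always sum to zero, so once n - 1 of them are known the last one is fixed. A short Python sketch with made-up numbers (the data are hypothetical):

```python
import statistics

sample = [2.0, 4.0, 9.0]                   # hypothetical data, n = 3
x_bar = statistics.mean(sample)            # 5.0
deviations = [x - x_bar for x in sample]   # [-3.0, -1.0, 4.0]

# The deviations sum to zero, so the last one is determined by the others.
last_from_others = -sum(deviations[:-1])

df = len(sample) - 1                       # only n - 1 deviations are free to vary
print(deviations, last_from_others, df)
```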
When to Use the
t-Distribution
The t distribution can be used when any of the
following conditions holds:
• The population distribution is normal.
• The distribution of the sample data is symmetric,
unimodal, without outliers, and the sample size is
15 or less.
• The distribution of the sample data is moderately
skewed, unimodal, without outliers, and the sample
size is between 16 and 40.
• The sample size is greater than 40, without outliers.
The t distribution should not be used with small
samples from populations that are not
approximately normal.
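The rule of thumb above can be written as a small decision function. This is a sketch, not a standard routine: the function name, its parameters, and the shape labels are assumptions chosen for the example, and the shape is assumed to describe a unimodal sample.

```python
def t_procedure_reasonable(n, shape=None, has_outliers=False,
                           population_normal=False):
    """Apply the slide's rule of thumb for using the t distribution.

    shape: "symmetric" or "moderately skewed" (unimodal data assumed);
           any other value, e.g. "strongly skewed", counts as neither.
    """
    if population_normal:
        return True                      # condition 1: normal population
    if has_outliers:
        return False                     # every other condition requires no outliers
    if n > 40:
        return True                      # condition 4: large sample
    if n <= 15:
        return shape == "symmetric"      # condition 2: small, symmetric sample
    # 16 <= n <= 40 (condition 3). Assumption: a symmetric sample is at
    # least as acceptable here as a moderately skewed one.
    return shape in ("symmetric", "moderately skewed")

print(t_procedure_reasonable(12, "symmetric"))
print(t_procedure_reasonable(12, "moderately skewed"))
print(t_procedure_reasonable(30, "moderately skewed"))
```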
Auto pollution
Constructing a one-sample t-interval for μ
Environmentalists, government
officials, and vehicle manufacturers
are all interested in studying the auto
exhaust emissions produced by motor
vehicles. The major pollutants in auto
exhaust from gasoline engines are
hydrocarbons, carbon monoxide, and
nitrogen oxides.
Amount of nitrogen oxides (NOX) emitted by
light-duty engines (grams/mile)
1.28  1.24  0.95  1.31  1.31  0.51  2.27  1.47  1.17
0.71  2.20  1.80  1.45  1.49  1.87  1.06  1.16  0.49
1.78  1.15  1.22  1.33  2.94  2.01  1.08  1.38  1.83
0.97  1.32  0.86  1.16  0.60  1.20  1.26  1.12  1.47
0.57  1.45  1.32  0.78  1.73  0.72  1.44  1.79  1.51
Construct a 95% confidence interval for the mean amount of
NOX emitted by light-duty engines of this type.
$\bar{x} = 1.329$, s = 0.484, n = 45, df = 44, t* = 2.021
$\bar{x} \pm t^* \dfrac{s}{\sqrt{n}}$
t confidence interval: (1.185, 1.473)
We are 95% confident that the true
mean level of nitrogen oxides
emitted by this type of light-duty
engine is between 1.185 and 1.473
grams/mile.
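The arithmetic behind this interval can be retraced in Python from the summary values on the slide. This is a rough check rather than the original calculation. The value t* = 2.021 is the usual table entry for df = 40, a common conservative choice when df = 44 is not tabulated. Plugging in the rounded summary values gives roughly (1.183, 1.475), which agrees with the reported interval to about two decimal places. The small difference may come from rounding and from the exact t* used.

```python
import math

# Summary values as reported on the slide
x_bar, s, n = 1.329, 0.484, 45
t_star = 2.021                          # table value used on the slide

standard_error = s / math.sqrt(n)       # estimated standard error of the mean
margin = t_star * standard_error        # margin of error
lower, upper = x_bar - margin, x_bar + margin

print(f"SE = {standard_error:.4f}, margin = {margin:.4f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```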
The Student t-distribution
• The t-distribution is a family of distributions indexed by a
"degrees of freedom" parameter (a different distribution for
each sample size).
• The degrees of freedom is the number of sample values that can
vary after certain restrictions have been imposed on all data values.
• It is bell shaped but has greater variability (a wider spread)
than N(0,1).
• As the sample size n gets larger, the Student t-distribution gets
closer to the normal distribution. When the sample size is small,
the degrees of freedom is small and there is more variability (i.e.,
a wider spread).
• When the sample size is large, the degrees of freedom is large,
there is less variability, and the distribution is closer to N(0,1).
• The mean is 0 (for df > 1).
• The standard deviation is greater than 1 (for df > 2 it equals
√(df/(df − 2))).
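The points in this list can be checked numerically. The sketch below evaluates the t density from its standard formula using only Python's math module and compares its tail height and standard deviation with N(0,1). The helper names are assumptions made for this example.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def normal_pdf(x):
    """Density of the standard normal distribution N(0,1)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def t_sd(df):
    """Standard deviation of the t distribution (defined for df > 2)."""
    return math.sqrt(df / (df - 2))

# Tail height at x = 3 and standard deviation for increasing df
for df in (3, 10, 30, 100):
    print(f"df={df:>3}: pdf(3)={t_pdf(3.0, df):.5f}, sd={t_sd(df):.4f}")
print(f"N(0,1): pdf(3)={normal_pdf(3.0):.5f}, sd=1.0000")
```

The tail height and the standard deviation both shrink toward the normal values as df grows, which is the "wider spread for small samples" behavior described in the list.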
Density curves for t distribution
Standard Error of point estimate
The standard error of a point estimate is a common term for
the standard deviation of the point estimate.
Standard error of $\bar{y}$: $\sigma / \sqrt{n}$ (based on known $\sigma$)
Clarification of terminology
An estimator is a rule for computing a quantity from a
sample that is to be used to estimate a model
parameter.
An estimate is the value that the rule gives when the
data are taken.
The distribution of the estimator is called its sampling
distribution.
The standard deviation of the sampling distribution of
an estimator is called the standard error of the
estimator.
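These terms can be made concrete with a small simulation. In the sketch below the estimator is the sample mean, one call to it produces an estimate, repeated calls trace out its sampling distribution, and the standard deviation of those repeated estimates approximates the standard error. The population values (mean 10, standard deviation 2), the sample size, and the number of repetitions are hypothetical choices for the example.

```python
import random
import statistics

random.seed(0)                      # fixed seed so the run is repeatable
mu, sigma, n = 10.0, 2.0, 25        # hypothetical population and sample size

def sample_mean(n):
    """The estimator: a rule that turns a sample of size n into one number."""
    data = [random.gauss(mu, sigma) for _ in range(n)]
    return statistics.mean(data)

one_estimate = sample_mean(n)                       # an estimate from one sample
estimates = [sample_mean(n) for _ in range(5000)]   # approximate sampling distribution

simulated_se = statistics.stdev(estimates)          # spread of the estimates
theoretical_se = sigma / n ** 0.5                   # sigma / sqrt(n) = 0.4

print(f"one estimate: {one_estimate:.3f}")
print(f"simulated SE: {simulated_se:.3f}, theoretical SE: {theoretical_se:.3f}")
```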
Properties of the t-distributions:
1. The t-curve corresponding to any fixed number of
degrees of freedom (df) is bell shaped, symmetric, and
centered at 0.
2. Each t-curve is more spread out than the z-curve
(standard normal curve).
3. As the df increase, the spread of the corresponding
t-curve decreases.
4. As the number of df increases, the t-curves get closer
and closer to the z-curve.
5. When estimating a single mean, df = n – 1.
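Properties 3 and 4 show up clearly in the 95% critical values: t* shrinks toward the z value of about 1.96 as df grows. The sketch below finds t* by integrating the t density numerically (Simpson's rule) and bisecting on the result. The helper names and the numerical method are assumptions made for illustration. In practice a table or a statistics library would supply these values.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution (lgamma avoids overflow for large df)."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=1000):
    """P(T <= x) for x >= 0: Simpson's rule on [0, x] plus 0.5 by symmetry."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(df, confidence=0.95):
    """Critical value t* with central area `confidence`, found by bisection."""
    target = 1 - (1 - confidence) / 2    # e.g. 0.975 for 95% confidence
    lo, hi = 0.0, 20.0                   # bracket wide enough for df >= 1 at 95%
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for df in (1, 5, 15, 44, 100, 1000):
    print(f"df={df:>4}: t* = {t_critical(df):.3f}")
```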
Estimating a population mean