Section 10.1 ~ t Distribution for Inferences about a Mean
Introduction to Probability and Statistics
Ms. Young
Objective

After this section you will understand when it
is appropriate to use the t distribution rather
than the normal distribution for constructing
confidence intervals or conducting hypothesis
tests for population means, and know how to
make proper use of the t distribution.
t Distribution for Inferences about a Mean

When dealing with confidence intervals (Ch. 8) and hypothesis
testing (Ch. 9), we worked with samples that were large enough to
assume a normal distribution, which allowed us to use standard
scores (z-scores) to find the probabilities of certain values occurring.


Recall that in order to find the z-score, the population standard
deviation is needed.
In real applications, the population standard deviation is typically
not available, which means that in order to find the confidence
interval or conduct the hypothesis test we would have to estimate it
using the sample standard deviation.

Substituting the sample standard deviation for the population value
adds uncertainty that the normal distribution does not account for, so
many statisticians instead use what is known as a t distribution (or
Student's t distribution) in place of the normal distribution.

As long as the sample size is at least 30 or the population is
normally distributed, a t distribution can be used to find a confidence
interval and/or conduct a hypothesis test.



The t distribution is similar in shape and symmetry to the normal distribution.
It accounts for the greater variability that is expected with small samples.
Note ~ when you know the population standard deviation and the
sample size is greater than 30 or the population is normally
distributed, the normal distribution is best to use.

[Figure: the standard normal distribution compared with two t
distributions, for sample sizes n = 3 and n = 12]

The curves are very similar in shape, and as the sample size
increases, the t distribution gets closer and closer to the normal
distribution.
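The comparison can also be checked numerically. The sketch below is only an illustration: it evaluates the standard formula for the t density alongside the standard normal density, using n - 1 degrees of freedom for the two sample sizes mentioned (n = 3 and n = 12). The function names `t_pdf` and `normal_pdf` are hypothetical helpers, not part of the course materials.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


# Compare the curves at the center (x = 0) and out in the tail (x = 3).
# Sample sizes n = 3 and n = 12 correspond to 2 and 11 degrees of freedom.
for x in (0.0, 3.0):
    print(f"x = {x}: t(n=3) = {t_pdf(x, 2):.4f}, "
          f"t(n=12) = {t_pdf(x, 11):.4f}, normal = {normal_pdf(x):.4f}")
```

At the center the t curves sit slightly below the normal curve, and in the tail they sit above it, with the n = 12 curve closer to the normal curve in both places.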
Confidence Intervals Using the t Distribution



When determining a confidence interval using a t distribution, we
use t values rather than z-scores to determine significance.
A t value is the number of standard deviations a value falls from
the mean on a t distribution.
Recall that to write a confidence interval, you must first calculate
the margin of error.

The formula for the margin of error using a t distribution is:

$E = t \cdot \dfrac{s}{\sqrt{n}}$

t = t value, found by looking up the value that corresponds to the
appropriate number of degrees of freedom (Table 10.1 on p. 412);
degrees of freedom for the t distribution = n - 1
n = sample size
s = standard deviation of the sample
Critical Values of t

The table lists the degrees of freedom in the first column and the
critical t values in columns 2 and 3:

Column 2: use for a 97.5% confidence level for a one-tailed test, or
for a 95% confidence level for a two-tailed test (or confidence interval)
Column 3: use for a 95% confidence level for a one-tailed test, or
for a 90% confidence level for a two-tailed test (or confidence interval)

Recall that the standard form for a confidence interval when
dealing with means is:

$\bar{x} - E < \mu < \bar{x} + E$

Example 1 ~ Diastolic Blood Pressure

Here are five measures of diastolic blood pressure from randomly
selected adult men: 78, 54, 81, 68, 66. These five values result in
these sample statistics: n = 5, $\bar{x} = 69.4$, and s = 10.7. Using this
sample, construct the 95% confidence interval estimate of the mean
diastolic blood pressure level for the population of all men.


Note ~ we are using the t distribution because the population standard
deviation is not known and it is reasonable to assume that blood pressure
levels are normally distributed.
Before finding the margin of error, we must first find the t value from
the table that corresponds to 4 degrees of freedom (the sample size
is 5, so the degrees of freedom are 5 - 1 = 4).

For the 95% confidence level, 4 degrees of freedom corresponds to a t value
of t = 2.776.
Note ~ for confidence intervals, we use the t values for the "area in two tails"
because the interval extends a margin of error both below and above the
sample mean.
Now that we know that t = 2.776, we can find the margin of error:
$E = t \cdot \dfrac{s}{\sqrt{n}} = 2.776 \cdot \dfrac{10.7}{\sqrt{5}} \approx 13.3$

To construct the confidence interval, subtract the margin of error from the
sample mean ($\bar{x}$) and add it to the sample mean:

$\bar{x} - E < \mu < \bar{x} + E$
$69.4 - 13.3 < \mu < 69.4 + 13.3$
$56.1 < \mu < 82.7$

Based on the five sample measurements, we can be 95% confident that the true
mean diastolic blood pressure for adult men is between 56.1 and 82.7.
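The arithmetic in Example 1 can be reproduced with a short script. This is a minimal sketch, assuming the t value 2.776 is still read from the table by hand; the function name `t_confidence_interval` is a hypothetical helper. It works from the unrounded sample standard deviation, which gives the same interval to one decimal place.

```python
import math
import statistics


def t_confidence_interval(data, t_value):
    """Return (lower, upper) for the mean, given a t value taken from the table."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    margin = t_value * s / math.sqrt(n)  # E = t * s / sqrt(n)
    return mean - margin, mean + margin


readings = [78, 54, 81, 68, 66]
# 2.776 is the table value for 4 degrees of freedom at the 95% level (two tails).
lower, upper = t_confidence_interval(readings, 2.776)
print(f"{lower:.1f} < mu < {upper:.1f}")  # prints "56.1 < mu < 82.7"
```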
Hypothesis Tests Using the t Distribution

When a t distribution is used to conduct a hypothesis test, the t
value plays the role that the z-score played when we worked with
the normal distribution.
Recall that we determined statistical significance by comparing the
z-score to critical values or by using the z-score to determine the P-value.
Use the following formula to calculate the t value:

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$

where $\mu$ is the population mean claimed by the null hypothesis.
This t value is then compared to the "Critical Values of t" table to
determine significance.

Note ~ a P-value can also be calculated, but this is usually done with the
aid of statistical software, so we will not be calculating P-values
using a t distribution in this course.
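As a rough illustration of the kind of calculation statistical software performs, the sketch below approximates a right-tailed P-value by numerically integrating the t density with Simpson's rule. The names `t_pdf` and `right_tail_p_value` and the `steps` parameter are illustrative assumptions, and real software uses more refined methods.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def right_tail_p_value(t, df, steps=2000):
    """Approximate the area under the t curve to the right of t."""
    if t < 0:
        # The curve is symmetric about 0, so reuse the positive case.
        return 1 - right_tail_p_value(-t, df, steps)
    # Simpson's rule on [0, t]; steps must be even.
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_from_zero = total * h / 3
    # Half of the total area lies to the right of 0.
    return 0.5 - area_from_zero


print(right_tail_p_value(1.833, 9))  # approximately 0.05
```

With 9 degrees of freedom, a t value of 1.833 leaves an area of about 0.05 in the right tail, which matches the use of column 3 for a one-tailed test at the 95% confidence level. For a two-tailed test the area would be doubled.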

Once you calculate t, you can decide whether or not to reject the
null hypothesis by using the following criteria:

Right-tailed test: reject the null hypothesis if the t value that you
found is ≥ the t value from the table (that corresponds to the
appropriate degrees of freedom).
Use column 2 as a comparison if you want a 97.5% confidence level and
column 3 if you want a 95% confidence level.

Left-tailed test: reject the null hypothesis if the t value that you
found is ≤ the negative of the t value from the table (that corresponds
to the appropriate degrees of freedom).
Use column 2 as a comparison if you want a 97.5% confidence level and
column 3 if you want a 95% confidence level.

Two-tailed test: reject the null hypothesis if the absolute value of
the t value that you found is ≥ the t value from the table (that
corresponds to the appropriate degrees of freedom).
Use column 2 as a comparison if you want a 95% confidence level and
column 3 if you want a 90% confidence level.
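The three criteria can be written as a single decision function. This is a sketch only: the function name `reject_null`, its `tail` argument, and the t values in the usage lines are hypothetical, and the table value must still be looked up by hand for the right column and degrees of freedom.

```python
def reject_null(t_value, table_value, tail):
    """Apply the rejection rule for a 'right', 'left', or 'two' tailed test.

    t_value is the t computed from the sample; table_value is the positive
    critical value read from the table.
    """
    if tail == "right":
        return t_value >= table_value
    if tail == "left":
        return t_value <= -table_value
    if tail == "two":
        return abs(t_value) >= table_value
    raise ValueError("tail must be 'right', 'left', or 'two'")


# Hypothetical t values, compared against made-up table values.
print(reject_null(2.5, 1.8, "right"))  # True: 2.5 >= 1.8
print(reject_null(-2.0, 2.3, "two"))   # False: |-2.0| < 2.3
print(reject_null(-2.0, 1.8, "left"))  # True: -2.0 <= -1.8
```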
Example 2 ~ Right Tailed Hypothesis Test for a Mean
Listed below are ten randomly selected IQ scores of statistics students:
111 115 118 100 106 108 110 105 113 109
Using methods from Chapter 4, you can confirm that these data have the following
sample statistics: n = 10, $\bar{x} = 109.5$, and s = 5.2. Using a 0.05 significance level,
test the claim that statistics students have a mean IQ score greater than 100,
which is the mean IQ score of the general population.

Step 1:
$H_0: \mu = 100$
$H_a: \mu > 100$
Step 2:
Sample size: n = 10
Sample mean: $\bar{x} = 109.5$
Standard deviation of the sample: s = 5.2


Step 3:

Calculate the t value:

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} = \dfrac{109.5 - 100}{5.2 / \sqrt{10}} \approx 5.777$

Since this is a one-tailed test, the t value that we will be comparing against
is found in the 3rd column of the table, in the row for 9 degrees of
freedom (10 - 1); it is 1.833.

Since this is a right-tailed test, the result is statistically significant if the
t value that we found is greater than or equal to the table value of 1.833.
5.777 is greater than 1.833, so this is statistically significant at the 0.05 level.
Step 4:

Since this is statistically significant at the 0.05 level, we have enough
evidence to reject the null hypothesis and support the claim that
the mean IQ score of statistics students is greater than 100.
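The t value in Example 2 can be checked with a few lines of code. This is a minimal sketch with a hypothetical helper name, `t_statistic`. Note that computing from the raw scores uses the unrounded sample standard deviation (about 5.19 rather than 5.2), so the t value comes out slightly higher (about 5.79) than the 5.777 obtained from the rounded statistics; the conclusion is the same either way.

```python
import math
import statistics


def t_statistic(sample_mean, s, n, mu_null):
    """t = (sample mean - mean claimed by the null hypothesis) / (s / sqrt(n))."""
    return (sample_mean - mu_null) / (s / math.sqrt(n))


# Using the rounded sample statistics given in the example.
t_rounded = t_statistic(109.5, 5.2, 10, 100)
print(round(t_rounded, 3))  # 5.777

# Using the raw scores, with the unrounded sample standard deviation.
scores = [111, 115, 118, 100, 106, 108, 110, 105, 113, 109]
t_raw = t_statistic(statistics.mean(scores), statistics.stdev(scores),
                    len(scores), 100)

# Right-tailed test: compare with the table value for 9 degrees of freedom.
print(t_rounded >= 1.833, t_raw >= 1.833)  # True True
```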
Example 3 ~ Two Tailed Hypothesis Test for a Mean
Using the same data as in Example 2 and the same significance level of 0.05, test the
claim that the mean IQ score of statistics students is equal to 100.

Step 1:
$H_0: \mu = 100$
$H_a: \mu \neq 100$
Step 2:
Sample size: n = 10
Sample mean: $\bar{x} = 109.5$
Standard deviation of the sample: s = 5.2
Step 3:
$t \approx 5.777$ (the same t value as in Example 2)

Since this is a two-tailed test, we look at column 2 for a 0.05 significance level.
There are 9 degrees of freedom, so the t value in the table is 2.262.
Because this is a two-tailed test, the result is statistically significant at the
0.05 level if the absolute value of our t value (5.777) is greater than or equal to 2.262.

Step 4:
Since the absolute value of the t value that we found (5.777) is greater than
2.262, the result is statistically significant at the 0.05 level, and we
therefore reject the null hypothesis that the mean score is equal to 100.
In other words, there is sufficient evidence to support the alternative
hypothesis that the mean IQ score is not equal to 100.