Class Handout #3 (Sections 1.8, 1.9)
Definitions
$z_{\text{AREA}}$: the z-score above which lies an area under the normal curve equal to the
subscript AREA
1. Find each of the z-scores listed by using Table 2 of the Statistical Tables.
$z_{0.05} = 1.645$
$z_{0.025} = 1.960$
$z_{0.005} = 2.576$
$z_{0.01} = 2.326$
2. When a measurement is randomly selected from a population having a normal
distribution, what is the probability that the z-score for this measurement will be
less than $z_{0.10}$?
An area of 0.10 lies above $z_{0.10}$, so the probability is 1 – 0.10 = 0.90.
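The table lookups in Exercises 1 and 2 can be cross-checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `z_upper` is an illustrative assumption, not something from the handout.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
STD_NORMAL = NormalDist()

def z_upper(area: float) -> float:
    """Return the z-score with `area` of the normal curve lying above it."""
    return STD_NORMAL.inv_cdf(1.0 - area)

# Exercise 1: the four z-scores looked up in the table.
for area in (0.05, 0.025, 0.005, 0.01):
    print(f"z_{area} = {z_upper(area):.3f}")

# Exercise 2: probability that a z-score is less than z_0.10.
p_below = STD_NORMAL.cdf(z_upper(0.10))
print(f"P(Z < z_0.10) = {p_below:.2f}")
```

Rounded to three decimal places, the computed values should agree with the table entries above.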
3. For each of the normal probability distribution curves, find the indicated areas
under the curve.
[Figures: five normal curves, each marked with a central area and two equal tail areas.]
Between $-z_{0.05} = -1.645$ and $z_{0.05} = +1.645$: central area = 0.90, each tail area = 0.05
Between $-z_{0.025} = -1.960$ and $z_{0.025} = +1.960$: central area = 0.95, each tail area = 0.025
Between $-z_{0.005} = -2.576$ and $z_{0.005} = +2.576$: central area = 0.99, each tail area = 0.005
Between $-z_{0.01} = -2.326$ and $z_{0.01} = +2.326$: central area = 0.98, each tail area = 0.01
Between $-z_{\alpha/2}$ and $z_{\alpha/2}$: central area = $1 - \alpha$, each tail area = $\alpha/2$
Definitions
point estimation: the use of the value of a statistic to estimate a parameter
Examples are
(1) using $\bar{x}$ to estimate $\mu$,
(2) using $s$ to estimate $\sigma$.
If the mean of the sampling distribution for a statistic is equal to the
parameter being estimated, then the statistic is called unbiased;
otherwise the statistic is called biased.
With random sampling, $\bar{x}$ is an unbiased estimator of $\mu$.
interval estimation: the use of an interval of values (often based on a statistic and a
standard error) to estimate a parameter
confidence interval: an interval estimate together with a corresponding probability;
the probability represents the chance that the interval actually
contains the parameter being estimated and is called the
confidence level or confidence coefficient.
The most commonly chosen confidence levels are 90%, 95%, and 99%.
role of the Central Limit Theorem in finding a confidence interval for $\mu$
The Central Limit Theorem tells us that when a random sample of size n is taken
from a population that has a normal distribution with mean $\mu$ and standard deviation
$\sigma$, then the sampling distribution of $\bar{x}$ has a normal distribution with mean $\mu$
and standard deviation $\sigma/\sqrt{n}$.
This remains approximately true when the population does not
have a normal distribution, as long as the sample size n is sufficiently large.
The standard deviation $\sigma/\sqrt{n}$ is called the standard error of estimate or standard error of the mean.
There is a 95% chance that $\bar{x}$ will be within 2 (more precisely, 1.960)
standard errors of $\mu$.
There is a $(1 - \alpha)100\%$ chance that $\bar{x}$ will be within $z_{\alpha/2}$ standard errors of $\mu$.
That is, we can be $(1 - \alpha)100\%$ confident that population mean $\mu$ will lie
between $\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n}$ and $\bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$.
Not knowing the value for $\sigma$, we estimate the standard error with $s/\sqrt{n}$.
With this estimated standard error of the mean, we must use a t distribution in
place of a normal or z distribution.
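The claim about the sampling distribution of the sample mean can be illustrated with a small simulation. This is only a sketch under stated assumptions: the exponential population (mean 1, standard deviation 1), the sample size 30, the number of repetitions, and the seed are all illustrative choices, not values from the handout.

```python
import math
import random
import statistics

random.seed(214)  # arbitrary seed so the run is repeatable

n = 30            # sample size (illustrative choice)
reps = 20_000     # number of simulated samples (illustrative choice)

# Exponential population with rate 1: skewed, with mu = 1 and sigma = 1.
mu, sigma = 1.0, 1.0

# Draw many random samples and record each sample mean.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

standard_error = sigma / math.sqrt(n)
print("mean of sample means:", round(statistics.fmean(sample_means), 3))
print("sd of sample means:  ", round(statistics.stdev(sample_means), 3))
print("sigma / sqrt(n):     ", round(standard_error, 3))
```

Even though the population is far from normal, the mean of the simulated sample means should land close to the population mean, and their standard deviation close to the standard error.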
Student’s t distribution: a distribution based on sample standard deviation $s$ similar to
the way the standard normal distribution is based on
population standard deviation $\sigma$
The t distributions
(1) depend on degrees of freedom df (= n –1 for the one sample t test statistic);
(2) are symmetric and bell-shaped but flatter than the standard normal distribution;
(3) become more like the standard normal distribution as df increase.
Table 3 of the Statistical Tables displays values from various t distributions.
The concept of “degrees of freedom” is not easy to explain completely, but one
intuitive explanation is to think of “degrees of freedom” as representing the
“number of pieces of data observed” minus “the number of parameters being
estimated”. When using a confidence interval to estimate a population mean $\mu$, we
observed n pieces of data (i.e., the n measurements in the selected random sample),
and we are estimating 1 parameter (i.e., the population mean $\mu$).
$t_{\text{AREA}}$: the t-score above which lies an area under a t curve equal to the subscript AREA;
if the corresponding degrees of freedom (df) is not clear, the t-score can be
represented as $t_{df;\,\text{AREA}}$
4. Use Table 3 of the Statistical Tables to obtain each of the following:
t distribution with df = 1: $t_{0.10} = 3.078$
t distribution with df = 8: $t_{0.05} = 1.860$
t distribution with df = 1: $t_{0.025} = 12.706$
t distribution with df = 2: $t_{0.025} = 4.303$
t distribution with df = 3: $t_{0.025} = 3.182$
t distribution with df = 15: $t_{0.025} = 2.131$
t distribution with df = 30: $t_{0.025} = 2.042$
t distribution with df = $\infty$: $t_{0.025} = 1.960 = z_{0.025}$
The t-scores in the $\infty$ row are exactly the same as z-scores.
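The entries read from Table 3 can be reproduced approximately without a table. Python's standard library has no t distribution, so the sketch below integrates the t density with Simpson's rule and finds the critical value by bisection. The helper names (`t_pdf`, `upper_tail_area`, `t_upper`) and the search bound of 100 are assumptions made for this illustration; a statistics package would normally be used instead.

```python
import math

def t_pdf(t: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(t * t / df))

def upper_tail_area(t: float, df: int, steps: int = 1000) -> float:
    """Area above t (t > 0): 0.5 minus Simpson's-rule area on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_upper(area: float, df: int) -> float:
    """Return the t-score with `area` of the t curve lying above it."""
    lo, hi = 0.0, 100.0      # assumes the answer is below 100
    for _ in range(40):      # bisection on the upper-tail area
        mid = (lo + hi) / 2
        if upper_tail_area(mid, df) > area:
            lo = mid         # too much area above mid: move right
        else:
            hi = mid
    return (lo + hi) / 2

cases = [(1, 0.10), (8, 0.05), (1, 0.025), (2, 0.025),
         (3, 0.025), (15, 0.025), (30, 0.025)]
for df, area in cases:
    print(f"df = {df:2d}: t_{area} = {t_upper(area, df):.3f}")
```

As df grows, the computed value of $t_{0.025}$ shrinks toward 1.960, which matches the remark about the $\infty$ row.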
confidence interval for $\mu$
We can be $(1 - \alpha)100\%$ confident that
the population mean is between
$\bar{x} - t_{\alpha/2}\, s/\sqrt{n}$ and $\bar{x} + t_{\alpha/2}\, s/\sqrt{n}$.
For a given sample size, increasing the confidence level increases
the confidence interval length, and decreasing the confidence level decreases
the confidence interval length.
For a given confidence level, increasing the sample size tends to decrease
the confidence interval length, and decreasing the sample size tends to increase
the confidence interval length.
5. Forbes magazine published data on the best small firms in 1993 (Forbes,
November 8, 1993, "America's Best Small Companies"); these were firms with
annual sales of more than $5 million and less than $350 million. The ages (in
years) of the chief executive officer (CEO) for the first 20 firms listed are as
follows:
53 43 33 45 46 55 41 55 36 45
55 50 49 47 69 51 48 62 45 37
(This data is stored in the worksheet CEO_Data of the Excel file M214_Data.)
(a) Treating these 20 ages as a random sample of ages from the population of ages
of chief executive officers for small companies, find a 95% confidence interval
for the mean age of chief executive officers for small companies.


n = 20, $\bar{x} = 48.25$, $s = 8.6382$
(These statistics can all be verified by using the Excel spreadsheet named Summary_Statistics.)
$1 - \alpha = 0.95$, $\alpha/2 = 0.025$, df = 19, $t_{0.025} = 2.093$
$48.25 - (2.093)(8.6382/\sqrt{20})$ , $48.25 + (2.093)(8.6382/\sqrt{20})$
44.207 , 52.293
We can be 95% confident that the mean age of chief executive officers for small
companies is between 44.207 and 52.293 years.
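The arithmetic in part (a) can be checked with a short script. This is a sketch rather than the course's Excel workflow: the variable names are illustrative, and the critical value 2.093 is taken from the worked solution above instead of being computed.

```python
import math
import statistics

# CEO ages for the 20 firms listed in Exercise 5.
ages = [53, 43, 33, 45, 46, 55, 41, 55, 36, 45,
        55, 50, 49, 47, 69, 51, 48, 62, 45, 37]

n = len(ages)
x_bar = statistics.mean(ages)
s = statistics.stdev(ages)          # sample standard deviation (divides by n - 1)

t_crit = 2.093                      # t_0.025 with df = 19, from the worked solution
margin = t_crit * s / math.sqrt(n)  # critical value times estimated standard error
lower, upper = x_bar - margin, x_bar + margin

print(f"n = {n}, mean = {x_bar}, s = {s:.4f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

Replacing `t_crit` with the table value for another confidence level gives the intervals discussed in parts (c) and (d).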
(b) What must we assume in order for the confidence interval in part (a) to be appropriate?
We assume that either the ages are normally distributed, at least approximately,
or the sample size 20 is sufficiently large so that the sampling distribution of $\bar{x}$
is approximately normal.
(c) How would the confidence interval in part (a) have been different if a 90%
confidence level were chosen?
The 90% confidence interval would have shorter length than the
95% confidence interval in part (a).
(d) How would the confidence interval in part (a) have been different if a 99%
confidence level were chosen?
The 99% confidence interval would have longer length than the
95% confidence interval in part (a).
(e) How would the confidence interval in part (a) have been different if the sample
size were 40 instead of 20?
A 95% confidence interval based on a sample size of 40 would tend to have
shorter length than a 95% confidence interval based on a sample size of 20.
(f) If we are willing to assume that the ages are normally distributed (at least
approximately), how could we estimate an interval between which lie 95% of
the ages of chief executive officers for small companies?
We know that about 95% of the ages are within 2 (or more precisely 1.96)
standard deviations of the mean. If we estimate the population mean and
standard deviation with the sample mean and standard deviation, then we
estimate that 95% of the ages of chief executive officers for small companies
lie between
48.25 – (2)(8.6382) and 48.25 + (2)(8.6382) , that is, 30.974 and 65.526 years.
(This type of interval can be called a prediction interval; notice how much
wider this interval is than the confidence interval in part (a).)
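The comparison in part (f) can be made concrete with a few lines of arithmetic. This sketch simply reuses the summary statistics and the confidence interval endpoints reported in part (a); the variable names are illustrative.

```python
x_bar, s = 48.25, 8.6382  # summary statistics from part (a)

# Interval estimated to contain about 95% of individual ages: mean +/- 2 sd.
low_ages, high_ages = x_bar - 2 * s, x_bar + 2 * s

# 95% confidence interval for the mean from part (a), for comparison.
ci_low, ci_high = 44.207, 52.293

ages_length = high_ages - low_ages
ci_length = ci_high - ci_low
print(f"about 95% of ages: ({low_ages:.3f}, {high_ages:.3f}), length {ages_length:.3f}")
print(f"95% CI for mean:   ({ci_low}, {ci_high}), length {ci_length:.3f}")
```

The first interval describes individual ages while the second describes the mean, which is why the first is several times longer.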
After considering how to estimate a mean with a confidence interval, we now
consider how to perform a hypothesis test about a mean.
A hypothesis test is used when we have some hypothesized value for the mean
prior to any data collection.
Return to the definitions in Class Handout #3:
hypothesis testing: an inferential statistical analysis used to decide which of two
competing hypotheses should be believed (analogous to a court trial)
Confidence intervals are a method of inferential statistics used when no hypothesized
value about a parameter to be estimated exists prior to any data analysis; however, when
such a hypothesized value exists, hypothesis testing is a popular method of inferential
statistics to decide if a statistically significant difference exists. (A hypothesis test
can also tell us whether or not a relationship is statistically significant.)
null hypothesis (H0): a statement assumed to be true at the outset of a hypothesis test;
often, a statement that a parameter is equal to a specific
hypothesized value (comparable to “innocence” in a court trial)
alternative (research) hypothesis (H1): a statement for which sufficient evidence is
required before it will be believed; often, a
statement that the parameter is not equal to the
hypothesized value (comparable to “guilt” in a
court trial)
Now let us go to Class Exercise 6(a).
6. It is believed that the mean right hand grip strength of men between 20 and 40
years of age in the USA is 86.3 lbs. It is now of interest to perform a hypothesis
test concerning the mean grip strength of men between 20 and 40 years of age in
the country of Techavia.
(a) If we are looking for evidence that the mean grip strength in Techavia is
different from 86.3 lbs., state the null and alternative hypotheses for the
hypothesis test.
H0: $\mu = 86.3$ (The mean grip strength is 86.3 lbs.)
H1: $\mu \ne 86.3$ (The mean grip strength is different from 86.3 lbs.)
one-sided hypothesis test: a test designed to identify a difference from a hypothesized
value in only one direction
two-sided hypothesis test: a test designed to identify a difference from a hypothesized
value in either direction
Even though hypothesis tests may be one-sided or two-sided,
confidence intervals are generally two-sided (except for rare occasions).
(b) Is the hypothesis test one-sided or two-sided?
Since we are looking for evidence that the population mean is different from the
hypothesized value 86.3 in either direction, the test is two-sided.
(c) Describe what it would mean to make a Type I error in this hypothesis test and
what it would mean to make a Type II error in this hypothesis test.
Now look at the definitions for Type I and Type II error.
Type I error: believing H1 (the alternative hypothesis) when in reality H0 (the null
hypothesis) is true (in a court trial, saying that the defendant is guilty
when the defendant is really innocent)
Type II error: believing H0 (the null hypothesis) when in reality H1 (the alternative
hypothesis) is true (in a court trial, saying that the defendant is innocent
when the defendant is really guilty)
Making a Type I error means the mean grip strength is actually 86.3 lbs., but we
mistakenly conclude that it is different from 86.3 lbs.
Making a Type II error means the mean grip strength is actually different from
86.3 lbs., but we mistakenly conclude that it is equal to 86.3 lbs.
test statistic: a statistic which is used to decide whether to believe H0 or to believe H1
It is the test statistic which provides us with
evidence to make our decision in a hypothesis test.
Now let us go to Class Exercise 6(d).
(d) Suppose we plan to measure each right hand grip strength in a random sample of
16 men from Techavia. If we assume that either the grip strengths are normally
distributed or the sample size 16 is sufficiently large so that the sampling
distribution of $\bar{x}$ is approximately normal, what test statistic would be
appropriate for us to use to decide whether to believe H0 or to believe H1?
If H0 were true, then
$t = (\bar{x} - 86.3)/(s/\sqrt{16})$
would be the t-score for $\bar{x}$, where df = 15,
and we expect this t-score to be within the bounds of random variation.
If H0 were not true, then we would expect the t-score to be outside the bounds
of random variation.
Consequently, we can use this t-score as a test statistic to decide whether to
believe H0 or to believe H1, but we need to choose specific bounds for what
should be considered random variation.
significance level ($\alpha$): the highest probability of making a Type I error that we are
willing to tolerate, commonly chosen to be 0.10, 0.05, or 0.01
With a given sample size n, the probability of making a Type II
error increases as we decrease $\alpha$ (the probability of making a
Type I error).
rejection (critical) region: a set of test statistic values which lead to rejecting H0 in
favor of H1
(e) Find the rejection region for the hypothesis test if
(i) a 0.05 significance level were chosen.
$\alpha = 0.05$, $\alpha/2 = 0.025$, $1 - \alpha = 0.95$
t distribution with df = 15: $-t_{0.025} = -2.131$ and $t_{0.025} = 2.131$
The rejection region is defined to be all test statistic values t > 2.131 or t < –2.131.
(ii) a 0.01 significance level were chosen.
$\alpha = 0.01$, $\alpha/2 = 0.005$, $1 - \alpha = 0.99$
t distribution with df = 15: $-t_{0.005} = -2.947$ and $t_{0.005} = 2.947$
The rejection region is defined to be all test statistic values t > 2.947 or t < –2.947.
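The link between the rejection region and the significance level can be illustrated by simulation: when H0 is true, the test statistic should fall in the 0.05-level rejection region about 5% of the time. This is a sketch under stated assumptions: the normal population, its standard deviation of 7.8 lbs., the number of repetitions, and the seed are illustrative choices, not part of the exercise.

```python
import math
import random
import statistics

random.seed(214)                # arbitrary seed so the run is repeatable

mu_0, sigma, n = 86.3, 7.8, 16  # sigma = 7.8 is an assumed population value
t_crit = 2.131                  # t_0.025 with df = 15, from part (e)(i)
reps = 10_000                   # number of simulated samples (illustrative)

rejections = 0
for _ in range(reps):
    # Sample from a population for which H0 is true.
    sample = [random.gauss(mu_0, sigma) for _ in range(n)]
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t = (statistics.fmean(sample) - mu_0) / standard_error
    if abs(t) > t_crit:         # t falls in the two-sided rejection region
        rejections += 1

type_1_rate = rejections / reps
print(f"estimated Type I error rate: {type_1_rate:.3f}")
```

Using 2.947 for `t_crit` instead should bring the estimated rate down to roughly 0.01, at the cost of a higher chance of a Type II error.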
(f) Suppose we actually measure each right hand grip strength in a random sample
of 16 men from Techavia, and we find that $\bar{x} = 91.0$ lbs. and $s = 7.8$ lbs. Find
the test statistic value, and find the p-value for the hypothesis test.
The observed test statistic value is
$t$ (or $t_{15}$) $= (\bar{x} - 86.3)/(s/\sqrt{16}) = (91.0 - 86.3)/(7.8/\sqrt{16}) = 2.410$
Note that this observed test statistic
provides us with sufficient evidence against H0 (that is, t = 2.410 is in the
rejection region) with $\alpha = 0.05$, but
does not provide us with sufficient evidence against H0 (that is, t = 2.410
is not in the rejection region) with $\alpha = 0.01$.
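The calculation and the two decisions in part (f) can be reproduced as follows. This is a minimal sketch; the critical values are the table values found in part (e), and the variable names are illustrative.

```python
import math

mu_0 = 86.3                  # hypothesized mean under H0 (lbs.)
x_bar, s, n = 91.0, 7.8, 16  # sample results from part (f)

# One-sample t statistic: distance from mu_0 in estimated standard errors.
t_obs = (x_bar - mu_0) / (s / math.sqrt(n))
print(f"observed t = {t_obs:.3f} with df = {n - 1}")

# Two-sided critical values for df = 15, from part (e).
critical_values = {0.05: 2.131, 0.01: 2.947}
decisions = {alpha: abs(t_obs) > t_crit
             for alpha, t_crit in critical_values.items()}

for alpha, reject in decisions.items():
    verdict = "reject H0" if reject else "do not reject H0"
    print(f"alpha = {alpha}: {verdict}")
```

The same data lead to different decisions at the two significance levels because the stricter level demands a more extreme test statistic.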
Next class, we shall define and calculate the p-value.
Before submitting Homework #3, check some of the answers (if you haven’t
done so already) from the link on the course schedule:
http://srv2.lycoming.edu/~sprgene/M214/Schedule214.htm