Notes - Section 7 – 1

Notes – Chapter 23
Inference for the Mean of a Population when σ is Unknown
The sample standard deviation s_x is used to estimate the population standard
deviation σ. The standard deviation of x̄ is estimated by
$$\frac{s_x}{\sqrt{n}}$$
This quantity is called the standard error of the sample mean x̄.
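As a small illustration of the standard error, the sketch below plugs in the summary statistics from Example 1 later in these notes (s_x = 0.7 seconds, n = 100 rats). The function name `standard_error` is a hypothetical helper, not part of the original notes.

```python
import math


def standard_error(s_x: float, n: int) -> float:
    """Estimated standard deviation of the sample mean: s_x / sqrt(n)."""
    if n < 2:
        raise ValueError("need at least two observations to estimate s_x")
    return s_x / math.sqrt(n)


# Summary statistics taken from Example 1: s_x = 0.7 seconds, n = 100.
se = standard_error(0.7, 100)
print(f"standard error = {se:.3f} seconds")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.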
Instead of the z (normal) distribution that we use with proportions, we use the t-distribution.
We use the t-distribution because the population standard deviation σ is unknown, and using
s_x as an estimate of σ adds variability that we need to account for.
The density curves of the t-distributions are symmetric about 0 and bell-shaped.
The spread is a bit greater than that of the standard normal distribution because of the
extra variability. As the degrees of freedom increase (that is, as the sample size increases),
the t-distribution curve approaches the standard normal curve.
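One way to see the t curve approaching the normal curve is to watch the 95% critical value t* shrink toward z* as the degrees of freedom grow. The sketch below uses only the Python standard library, which has no built-in t distribution, so it integrates the t density numerically (Simpson's rule) and finds t* by bisection. The helper names `t_pdf`, `t_cdf`, and `t_critical` are hypothetical, and the numerical approach is an assumption of this sketch rather than anything the notes prescribe. In practice you would read t* from a table or calculator.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: int) -> float:
    """Density of the t distribution with df degrees of freedom at x."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x: float, df: int, steps: int = 1000) -> float:
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def t_critical(conf_level: float, df: int) -> float:
    """t* with the middle conf_level of the t(df) curve inside [-t*, t*]."""
    target = 1 - (1 - conf_level) / 2
    lo, hi = 0.0, 50.0  # assumes t* < 50, true for common levels when df >= 2
    for _ in range(50):  # bisection on the cumulative probability
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


z_star = NormalDist().inv_cdf(0.975)
for df in (2, 5, 15, 30, 100, 1000):
    print(f"df = {df:4d}: t* = {t_critical(0.95, df):.3f}")
print(f"standard normal: z* = {z_star:.3f}")
```

The printed t* values decrease as df increases and stay above z*, which matches the heavier tails of the t curves.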
Conditions:
* Data are from an SRS of size n from the population of interest
* Sample size < 10% of population size
* Sample is approx. normal with no outliers, or n is large (see guidelines below)

n < 15: Use t-procedures if the data is
close to normal. If severe skewness
or outliers are present, do not use t.

15 ≤ n < 30 : The t-procedures can be used except in the presence of
outliers

large n: The t-procedures can be used even for clearly skewed distributions
when the sample is large (n ≥ 30), by the CLT.
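The three sample-size guidelines above can be written as a small decision helper. This is only a sketch of the rule of thumb as stated in these notes. The function name and its boolean inputs are hypothetical, and each input is a judgment you would make by looking at a plot of the data.

```python
def t_procedure_guideline(n: int, close_to_normal: bool,
                          severe_skew: bool, outliers: bool) -> bool:
    """Return True if the rule of thumb in these notes allows t-procedures."""
    if n < 15:
        # Small samples: data must look close to normal, no severe skew or outliers.
        return close_to_normal and not (severe_skew or outliers)
    if n < 30:
        # Moderate samples: usable unless outliers are present.
        return not outliers
    # Large samples (n >= 30): usable even for clearly skewed data, by the CLT.
    return True


print(t_procedure_guideline(12, close_to_normal=True, severe_skew=False, outliers=False))
print(t_procedure_guideline(21, close_to_normal=False, severe_skew=False, outliers=True))
print(t_procedure_guideline(40, close_to_normal=False, severe_skew=True, outliers=False))
```

The helper only covers the shape condition. The SRS and 10% conditions still have to be checked separately.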
The One-Sample t-Procedure:
A level C confidence interval for μ is:
$$\bar{x} \pm t^* \frac{s_x}{\sqrt{n}}$$
where t* is the critical value from the t distribution with n – 1 degrees of freedom.
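To make the interval formula concrete, here is a minimal sketch using the summary statistics from Example 4, part A, later in these notes (x̄ = 28.4 mpg, s_x = 3.1 mpg, n = 40, 90% confidence). The function name `t_interval` is hypothetical. The critical value t* = 1.685 for 39 degrees of freedom is an assumed table lookup; substitute whatever value your own table or calculator gives.

```python
import math


def t_interval(x_bar: float, s_x: float, n: int, t_star: float) -> tuple[float, float]:
    """Level C interval for the mean: x_bar +/- t* * s_x / sqrt(n)."""
    margin = t_star * s_x / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Example 4 summary statistics; t* = 1.685 is an assumed table value
# for 90% confidence with df = 40 - 1 = 39.
low, high = t_interval(x_bar=28.4, s_x=3.1, n=40, t_star=1.685)
print(f"90% CI for mean mpg: ({low:.2f}, {high:.2f})")
```

Passing t* in as an argument keeps the sketch free of any statistics library and mirrors how the interval is usually built by hand.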
Your two key phrases when making statements:
Phrase 1: Interprets the confidence level:
Saying that we are “95% confident” means that if many samples were taken and an interval
were constructed from each one in this manner, we would expect approximately 95% of them to
contain the true population mean of ____ (context).
Phrase 2: Interprets a single confidence interval:
We are #% confident that the true population mean of ____(context) lies in the
interval ……
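Phrase 1 can be checked with a small simulation: draw many samples from a population whose mean is known, build a 95% t interval from each, and count how often the interval captures the true mean. Everything in this sketch is an assumption made for illustration: the normal population with μ = 50 and σ = 10, the sample size n = 20, the table value t* = 2.093 for 19 degrees of freedom, and the function name `coverage_rate`.

```python
import math
import random
import statistics


def coverage_rate(mu: float, sigma: float, n: int, t_star: float,
                  trials: int, seed: int = 0) -> float:
    """Fraction of simulated t intervals that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        margin = t_star * statistics.stdev(sample) / math.sqrt(n)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


# Hypothetical population (mu = 50, sigma = 10); t* = 2.093 is an assumed
# table value for 95% confidence with df = 20 - 1 = 19.
rate = coverage_rate(mu=50.0, sigma=10.0, n=20, t_star=2.093, trials=2000)
print(f"share of intervals containing mu: {rate:.3f}")
```

The share should land near 0.95. Any single interval either contains μ or it does not, which is why the 95% describes the method rather than one particular interval.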
To test the hypothesis Ho: μ = μ₀ against one of the alternatives
Ha: μ > μ₀
Ha: μ < μ₀
Ha: μ ≠ μ₀
Calculate the test statistic t and the p-value.
These have the same meanings as they did for proportions. We draw the same
conclusions based on p-values and alpha.
The one-sample t statistic
$$t = \frac{\bar{x} - \mu_0}{s_x / \sqrt{n}}$$
has the t distribution with n – 1 degrees of freedom.*
*There is a different t distribution for each sample size.
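As a quick check of the test statistic, the sketch below uses the summary statistics from Example 2 later in these notes (x̄ = 8.1 false alarms, μ₀ = 10.4, s_x = 3.4, n = 21 days). The function name `one_sample_t` is a hypothetical helper.

```python
import math


def one_sample_t(x_bar: float, mu_0: float, s_x: float, n: int) -> float:
    """One-sample t statistic: (x_bar - mu_0) / (s_x / sqrt(n))."""
    return (x_bar - mu_0) / (s_x / math.sqrt(n))


# Example 2 summary statistics: 21 sampled days, last year's mean was 10.4.
t = one_sample_t(x_bar=8.1, mu_0=10.4, s_x=3.4, n=21)
df = 21 - 1
print(f"t = {t:.2f} with {df} degrees of freedom")
```

The sign of t follows the direction of x̄ relative to μ₀, so a sample mean below the hypothesized value gives a negative statistic.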
Common Steps to all Significance Tests:
1) State Ho and Ha.
2) Specify the significance level, α.
3) Identify the correct test and check its conditions.
4) Calculate the value of the test statistic.
5) Find the P-value for the observed data.
(If the P-value is less than or equal to α, the test
result is “statistically significant at level α.”)
6) Answer the question in context.
Remember your two possible conclusions:
If p-value < α,
With a p-value of ___ < α = ____, we can reject the null and can support
_____ (the alternative in context).
If p-value > α,
With a p-value of ___ > α = ____, we fail to reject the null and we cannot
support _____ (the alternative in context).
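The steps and the two conclusion templates can be strung together in a short sketch. It uses the summary statistics from Example 5 later in these notes (x̄ = 123.6, μ₀ = 120, s_x = 10, n = 50, two-sided alternative). The significance level α = 0.05 is an assumption, since the example does not state one. The helper names and the numerical integration of the t density (Simpson's rule, standard library only) are also assumptions of this sketch. A calculator or table would normally supply the P-value.

```python
import math


def t_cdf(x: float, df: int, steps: int = 1000) -> float:
    """P(T <= x) for the t(df) distribution via Simpson's rule and symmetry."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))

    def pdf(u: float) -> float:
        return math.exp(log_const - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def p_value(t: float, df: int, alternative: str) -> float:
    """P-value for Ha: mu < mu_0 ('less'), mu > mu_0 ('greater'), or 'two-sided'."""
    if alternative == "less":
        return t_cdf(t, df)
    if alternative == "greater":
        return 1 - t_cdf(t, df)
    if alternative == "two-sided":
        return 2 * (1 - t_cdf(abs(t), df))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")


# Example 5 summary statistics; alpha = 0.05 is an assumed significance level.
x_bar, mu_0, s_x, n, alpha = 123.6, 120.0, 10.0, 50, 0.05
t = (x_bar - mu_0) / (s_x / math.sqrt(n))
p = p_value(t, n - 1, "two-sided")
decision = "reject Ho" if p <= alpha else "fail to reject Ho"
print(f"t = {t:.3f}, df = {n - 1}, p-value = {p:.4f}: {decision} at alpha = {alpha}")
```

The final sentence of a written answer would still need the context filled in, as in the templates above.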
Remember that x̄ and s_x are not resistant to outliers, which means that the t-procedures
aren’t resistant to them either when n is small.
The t-procedures are quite robust against skewness when the sample size is large.
Except in the case of small samples, the SRS condition from the population of
interest is more important than the population being normal.
Examples
Example 1: The effect of alcohol on the nervous system has been the subject of considerable
research. Suppose a researcher is testing the effect of alcohol on response time by injecting 100 rats with a
unit dose of alcohol, then subjecting each to some kind of stimulus and recording the response
time. She finds that the injected rats had a mean response time of 1.5 seconds and a standard
deviation of 0.7 seconds. Construct a 95% confidence interval that represents the mean response
time for rats injected with 1 unit of alcohol.
Example 2: Last year the number of false fire alarms in a large city averaged 10.4 a day. In an
effort to reduce this number, the fire department conducted a safety program in the city’s
schools. Six months after completion of the program, a sample of 21 days had a mean of 8.1
false alarms and a standard deviation of 3.4. Does it appear that the fire department’s program
was successful?
Example 3: What is normal body temperature? A paper published in the Journal of the American
Medical Association presented evidence that normal body temperature may be less than 98.6°F.
The paper presented data for 18 randomly selected individuals:
98.2 98.4 98.0 97.8 99.7 99.2
99.0 98.2 98.6 98.6 97.4 97.1
98.2 97.6 97.2 97.8 98.4 98.5
Would this data give good evidence that the mean body temperature is really lower than 98.6°F?
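When raw data are given instead of summary statistics, the first step is to compute x̄ and s_x from the list. The sketch below does that for the 18 temperatures in Example 3 and then forms the one-sample t statistic against μ₀ = 98.6. It is a starting point only: the conditions still need checking (with n = 18, look for outliers), and the P-value and conclusion are left to the reader. The variable names are hypothetical.

```python
import math
import statistics

# The 18 body temperatures listed in Example 3 (degrees Fahrenheit).
temps = [98.2, 98.4, 98.0, 97.8, 99.7, 99.2,
         99.0, 98.2, 98.6, 98.6, 97.4, 97.1,
         98.2, 97.6, 97.2, 97.8, 98.4, 98.5]

mu_0 = 98.6                      # hypothesized mean under Ho
n = len(temps)
x_bar = statistics.fmean(temps)  # sample mean
s_x = statistics.stdev(temps)    # sample standard deviation (divides by n - 1)
t = (x_bar - mu_0) / (s_x / math.sqrt(n))

print(f"n = {n}, x_bar = {x_bar:.3f}, s_x = {s_x:.3f}")
print(f"t = {t:.3f} with {n - 1} degrees of freedom")
```

Note that `statistics.stdev` uses the n – 1 divisor, which is the sample standard deviation the t-procedures call for.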
Example 4: The Ford Motor Company claims that the Ford Ranger pick-up truck gets highway
mileage of 29 mpg. We take an SRS of 40 Ford Rangers and get a mean mpg of 28.4 with a
standard deviation of 3.1 mpg.
A) Give a 90% confidence interval for the true mean mpg for the Ford Ranger.
B) Explain what it means to be 90% confident in any interval.
C) Interpret your confidence interval from part A.
D) Is the sample mean of 28.4 good evidence that the Ford Ranger has a highway mpg lower
than the 29 mpg that Ford Motor Company claims?
Example 5: The mean yield of corn in the United States is about 120 bushels per acre. A survey
of 50 farmers this year gives a sample mean yield of x̄ = 123.6 bushels per acre with a
standard deviation of s_x = 10 bushels per acre. We want to know whether this is good evidence
that the national mean this year is not 120 bushels per acre. Assume that the farmers surveyed are
an SRS from the population of all commercial corn growers.
Are you convinced that the population mean is not 120 bushels per acre?
Example 6: A new blood pressure drug is advertised to reduce a patient’s blood pressure an
average of 10 units after a week of medication. Blood pressure reductions were recorded for 37
patients after treatment with the drug for 1 week. The patients had a mean reduction in blood
pressure of 8.7 units with a standard deviation of 5.1 units. Is there evidence to dispute the
advertised claim from the drug’s manufacturer?
Example 7: A pharmaceutical manufacturer does a chemical analysis to check the potency of its
products. The standard release potency for cephalothin crystals is 910 ppm. An SRS of 16 lots
gives the following potency data:
897 914 913 906 916 918 905 921
918 906 895 893 908 906 907 901
You want to know if the cephalothin crystals have lost potency during shipping and storage.
Example 8: Ben & Jerry (the ice cream guys) are thinking about branching out into the area of
frozen gourmet pies. The pies will come in a variety of flavors and be quite delicious, but they will also
be fairly expensive relative to other pies (like Mrs. Smith’s, etc.). Before they begin this
venture, they decide to do a market survey and ask, “What would you be willing to pay for a Ben &
Jerry’s pie?” They have decided that without a mean response of at least $7.00, they will not go
into the pie business. State the null and alternative hypotheses.
In the context of this situation, what would a Type I error be and what are the consequences?
In the context of this situation, what would a Type II error be and what are the consequences?