Download chapter 6 - Web4students

6.1 – Overview
Inferential Statistics: We use sample data to make generalizations, inferences, or
predictions about a population.
In chapter 6: We will use sample data to estimate the value of a population
proportion and population mean.
We will also present methods for determining the sample size necessary to estimate
those parameters with desired accuracies.
In chapter 7: We will use sample data to test some claims, or hypotheses, about a
population.
Read the Chapter Problem on page 297.
6.2 – Estimating a Population Proportion
In this section, we are going to estimate the proportion of all adult Minnesotans who
oppose the photo-cop legislation. (See Chapter Problem)
We use the sample of 829 surveyed adults and consider the sample proportion of
51% as the best point estimate of the population proportion.
Since the true population proportion probably is not exactly 51%, instead of using a
single value .51, we may use a range of values (or interval) that is likely to contain
the true value of the population proportion. This is called a confidence interval.
A confidence interval is associated with a degree of confidence.
The degree of confidence tells us the percentage of times that the confidence
interval actually should contain the population parameter (e.g. proportion or mean),
assuming that the estimation process is repeated a large number of times.
We will be working under the following three assumptions:
Assumptions
1. The sample is a simple random sample (SRS).
2. The conditions for a binomial distribution are satisfied by the sample. That is:
there is a fixed number of trials, the trials are independent, there are two categories
of outcomes, and the probabilities remain constant for each trial. A “trial” would be
the examination of each sample element to see which of the two possibilities it is.
3. The normal distribution can be used to approximate the distribution of sample
proportions because np ≥ 5 and nq ≥ 5 are both satisfied (q = 1 – p).
Notation for Proportions
p is the population proportion (percent of those who have the
"quality" under discussion)
$\hat{p}$ (read "p-hat") is the sample proportion (percent of those in the
sample who have the "quality" under discussion): $\hat{p} = \dfrac{x}{n}$
x is the # of successes (# of those who have the "quality" under
discussion) in a sample of size n.
$\hat{q}$ (read "q-hat") is the percent of those in the sample who do not have
the "quality" under discussion: $\hat{q} = 1 - \hat{p}$
Some Definitions
Point Estimate: a single value (or point) used to approximate a population
parameter.
Best Point Estimate of Population Proportion p:
Use $\hat{p} = \dfrac{x}{n}$ as the best point estimate.
Standard Deviation of the Distribution of Sample Proportions:
$\sigma_{\hat{p}} = \sqrt{\dfrac{pq}{n}}$
The confidence interval (or interval estimate) reveals how good the point estimate
is: it is a range (or interval) of values used to estimate the true value of a
population parameter.
(A confidence interval is associated with a degree of confidence, which is a measure
of how certain we are that our interval contains the population parameter.)
Degree of Confidence (or confidence level, or confidence coefficient): the
probability or the proportion of times that the confidence interval should contain the
true value of the population parameter, assuming that the estimation process is
repeated a large number of times.
Degree of confidence = 1 – α (α is the complement of the confidence level)
The most common choices for confidence level are:
90% (α = .10), 95% (α = .05), 99% (α = .01)
Example: Here is an example of a confidence interval based on the sample
data of 829 surveyed adult Minnesotans, 51% of whom are opposed to use of the
photo-cop:
The best point estimate of p is .51.
The 0.95 (or 95%) confidence interval estimate of the population proportion is
0.476 < p < 0.544
Interpreting a Confidence Interval
In our example, we are 95% confident that the interval from 0.476 to 0.544 actually
does contain the true value of p. This means that if we were to select many different
samples of size 829 and construct the corresponding confidence intervals, 95% of
them should actually contain the value of the population proportion. (Note that the
level of 95% refers to the success rate of the process being used to estimate the
proportion, and it does not refer to the population proportion itself.)
Note: It is incorrect to say that there is a 95% chance that the true population
proportion will fall between 0.476 and 0.544. (Why? Because p is a constant, not a
random variable. p has already occurred, we just don't know what it is.)
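The "success rate of the process" reading can be checked with a small simulation. The sketch below is only an illustration: it assumes the true proportion is exactly 0.51 (in practice we never know it), draws many samples of size 829, builds a 95% interval from each using the margin of error formula given later in this section, and counts how often the interval contains the true p. The function name and the number of repetitions are our own choices.

```python
import random
from statistics import NormalDist

def coverage_rate(true_p, n, confidence, repetitions, seed=0):
    """Fraction of simulated confidence intervals that contain true_p."""
    rng = random.Random(seed)
    # Critical value z_{alpha/2} for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(repetitions):
        # Draw one sample of size n and count the successes.
        x = sum(rng.random() < true_p for _ in range(n))
        p_hat = x / n
        margin = z * (p_hat * (1 - p_hat) / n) ** 0.5
        if p_hat - margin < true_p < p_hat + margin:
            hits += 1
    return hits / repetitions

# Assumed true proportion 0.51; this value is hypothetical, chosen to match the example.
rate = coverage_rate(true_p=0.51, n=829, confidence=0.95, repetitions=1000)
print(f"Share of intervals containing p: {rate:.3f}")
```

The printed share should land close to 0.95. Each individual interval either contains p or it does not; the 95% describes the long-run behavior of the method.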
(Read more on page 302)
Critical Values:
• $z_{\alpha/2}$ is the positive value separating an area of α/2 in the right tail of the
standard normal distribution.
• $-z_{\alpha/2}$ separates an area of α/2 in the left tail.
Example: Find the critical values for the indicated confidence levels. Use Table A-2.
Confidence Level    α         α/2       $z_{\alpha/2}$
90%                 ______    ______    ______
95%                 ______    ______    ______
99%                 ______    ______    ______
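Table lookups can be checked with software. The following is a minimal sketch using Python's standard library; `z_critical` is a name we made up for illustration. Software values can differ from Table A-2 in the last digit because the table is rounded.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Positive critical value z_{alpha/2} with area alpha/2 in the right tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    alpha = 1 - level
    print(f"{level:.0%}: alpha = {alpha:.2f}, alpha/2 = {alpha / 2:.3f}, "
          f"z = {z_critical(level):.3f}")
```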
Margin of Error (E) (or maximum error of the estimate): maximum likely
difference between the observed sample proportion $\hat{p}$ and the true value of the
population proportion p. It is calculated by multiplying the critical value and the
standard deviation of the distribution of sample proportions (with $\hat{p}$ and $\hat{q}$
standing in for the unknown p and q):
$E = z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
*** Do # 14, p. 312
*** Do # 16, p 312
Confidence Interval for the Population Proportion p:
$\hat{p} - E < p < \hat{p} + E$
where
$E = z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
Other ways of expressing the confidence intervals:
$\hat{p} \pm E$
or
$(\hat{p} - E,\ \hat{p} + E)$
Round-Off Rule for Confidence Interval Estimates of p
3 significant digits
Procedure for Constructing a Confidence Interval for p
1) Verify that the assumptions are satisfied.
2) Find the critical value $z_{\alpha/2}$.
3) Evaluate the margin of error E.
4) Find $\hat{p} \pm E$.
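The four steps above can be sketched in a few lines of code. This is an illustration, not the textbook's own method: the function name is ours, and the check in step 1 uses the sample proportion in place of the unknown p. The inputs are the photo-cop figures from these notes (n = 829, sample proportion 0.51, 95% confidence).

```python
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence):
    """Confidence interval (lower, upper) for a population proportion."""
    q_hat = 1 - p_hat
    # Step 1: check the normal-approximation condition, using p_hat for p (an assumption).
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("np >= 5 and nq >= 5 are not both satisfied")
    # Step 2: critical value z_{alpha/2}.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Step 3: margin of error E.
    margin = z * (p_hat * q_hat / n) ** 0.5
    # Step 4: p_hat - E and p_hat + E.
    return p_hat - margin, p_hat + margin

lower, upper = proportion_interval(p_hat=0.51, n=829, confidence=0.95)
print(f"{lower:.3f} < p < {upper:.3f}")
```

Rounded to three significant digits, the result agrees with the interval 0.476 < p < 0.544 quoted earlier in the notes.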
*** Do # 18, p. 312
Finding the Point Estimate and the Margin of Error from a Confidence Interval
Point estimate of p (middle of the interval):
$\hat{p} = \dfrac{(\text{upper confidence interval limit}) + (\text{lower confidence interval limit})}{2}$
Margin of error E (1/2 the length of the interval):
$E = \dfrac{(\text{upper confidence interval limit}) - (\text{lower confidence interval limit})}{2}$
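As a quick check of these two formulas, the sketch below recovers the point estimate and the margin of error from the interval 0.476 < p < 0.544 used earlier in the notes. The function name is ours.

```python
def point_estimate_and_margin(lower, upper):
    """Recover the point estimate and the margin of error E from interval limits."""
    point_estimate = (upper + lower) / 2  # middle of the interval
    margin = (upper - lower) / 2          # half the length of the interval
    return point_estimate, margin

p_hat, margin = point_estimate_and_margin(0.476, 0.544)
print(f"p-hat = {p_hat:.3f}, E = {margin:.3f}")
```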
*** Do # 10, p. 312
*** Do # 6, p. 312
*** Do # 8, p. 312
Determining Sample Size:
If we have an approximate idea of what p is: $n = \dfrac{(z_{\alpha/2})^2\,\hat{p}\hat{q}}{E^2}$
If no estimate of p is known: $n = \dfrac{0.25\,(z_{\alpha/2})^2}{E^2}$
Round-Off Rule for Sample Size, n: Use the computed size if it is a whole
number. If it is not a whole number, round it up to the next higher whole number.
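Both sample-size formulas and the round-up rule fit in one small function. This is a sketch with made-up inputs: the margin of error of 0.03 (three percentage points) is hypothetical, and the function name is ours.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence, p_hat=None):
    """Sample size needed to estimate p within the given margin of error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # With no prior estimate of p, use 0.25, the largest possible value of p*q.
    pq = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    # Round up to the next whole number, per the round-off rule.
    return math.ceil(z ** 2 * pq / margin ** 2)

# Hypothetical target: 95% confidence, margin of error 0.03.
print(sample_size_for_proportion(0.03, 0.95))              # no estimate of p
print(sample_size_for_proportion(0.03, 0.95, p_hat=0.51))  # prior estimate of p
```

Using 0.25 when p is unknown gives the most cautious (largest) sample size, since p·q can never exceed 0.25.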
*** Do # 22, p. 312
*** Do # 24, p. 312
*** Do # 26, p. 313
*** Do # 28, p. 313
*** Do # 36, p. 315
Using the TI-83 to Construct Confidence Intervals for p:
STAT>>TESTS choose A:1-PropZInt.
6.3 – Estimating a Population Mean: σ Known
In this section, we will again be working with confidence intervals and sample size
determination, but here our objective is to estimate a population mean, μ.
Assumptions:
1. The sample is a simple random sample. (All samples of the same size
have an equal chance of being selected.)
2. The value of the population standard deviation σ is known.
3. Either or both of these conditions is satisfied:
i) The population is normally distributed, or
ii) n > 30 (The sample has more than 30 values)
The best point estimate of the population mean is $\bar{x}$.
*** Example: For the sample of 106 body temperatures (midnight on day 2) given in
Data Set 4 in Appendix B, the mean $\bar{x}$ is 98.20°F. This is the best point
estimate of the population mean μ of all body temperatures.
Again, as in the previous section, to fine-tune our estimate, we may use a
confidence interval which is a range (or interval) of values that is likely to contain
the true value of the population mean.
Margin of Error (E) (or maximum error of the estimate): maximum likely difference
between the observed sample mean $\bar{x}$ and the true value of the population mean μ.
It is calculated by multiplying the critical value and the standard deviation of the
sample means:
$E = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$
*** Do # 6, p. 327
*** Do #7, p. 327
Confidence Interval Estimate of the Population Mean μ (With σ known)
$\bar{x} - E < \mu < \bar{x} + E$
or $\bar{x} \pm E$
or $(\bar{x} - E,\ \bar{x} + E)$
The two values $\bar{x} - E$ and $\bar{x} + E$ are called confidence interval limits.
Procedure for Constructing a Confidence Interval for μ (with Known σ)
1. Verify that the required assumptions are satisfied. (We have a simple
random sample, σ is known, and either the population appears to be
normally distributed or n > 30.)
2. Find the critical value $z_{\alpha/2}$.
3. Evaluate the margin of error $E = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$.
4. Then, using E and the sample mean, the confidence interval is:
$\bar{x} - E < \mu < \bar{x} + E$
Round-off Rule for Confidence Intervals used to Estimate μ:
a) If original data is given: use one more decimal place than original values.
b) If you are given summary statistics from a data set, use the same number
of decimal places used for the sample mean.
*** Example: For the sample of 106 body temperatures (midnight on day 2) given in
Data Set 4 in Appendix B, where $\bar{x}$ is 98.20°F, construct the 98% confidence
interval estimate of the mean body temperature for all healthy adults.
Assume that the sample is a simple random sample and that it is known that
σ = 0.62°F.
Interpret the Results
We are ____%__ confident that the interval from __________ to _________ actually
does contain the true value of the population mean μ. This means that if we were to
select many different samples of the same size (106) and construct the
corresponding confidence intervals, in the long run ______%____ of them would
actually contain the value of μ.
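Hand calculations for this example can be checked with software. The sketch below is one way to do it, assuming σ = 0.62°F is known as stated; the function name is ours. Software uses a slightly more precise critical value than the table, so the margin of error may differ from a hand calculation in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for a population mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value z_{alpha/2}
    margin = z * sigma / sqrt(n)                         # margin of error E
    return x_bar - margin, x_bar + margin

# Body temperature example from the notes: n = 106, x-bar = 98.20, sigma = 0.62, 98% level.
lower, upper = z_interval(x_bar=98.20, sigma=0.62, n=106, confidence=0.98)
# Two decimal places, matching the sample mean (round-off rule b).
print(f"{lower:.2f} < mu < {upper:.2f}")
```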
*** Do #12, p. 328
*** Do #22, pg. 328
Finding the Point Estimate and the Margin of Error from a Confidence Interval
(Similar to the process done with sample proportions in section 6.2 of these notes)
(i.e., point estimate is middle of interval and margin of error is ½ the length of the
interval)
*** Do # 17 – 20, p. 328
Determining Sample Size Required to Estimate μ
$n = \left(\dfrac{z_{\alpha/2}\,\sigma}{E}\right)^2$, rounded up to the nearest whole number
What if σ is not known?
1) Use the range rule of thumb:
σ ≈ range/4
2) We often use s from a pilot test (n > 30)
3) Estimate the value of σ by using the results of some other study that was done
earlier.
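The sample-size formula for μ can be sketched as follows. The inputs are hypothetical (σ = 15 and a desired margin of error of 2 units are made-up values for illustration), and the function name is ours. In practice σ would come from one of the three options listed above.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence):
    """Sample size needed to estimate mu within the given margin of error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value z_{alpha/2}
    # Always round up, never down, so the margin of error is not exceeded.
    return math.ceil((z * sigma / margin) ** 2)

# Hypothetical inputs: sigma = 15, E = 2, 95% confidence.
print(sample_size_for_mean(sigma=15, margin=2, confidence=0.95))
```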
*** Do # 14, p. 328
*** Do #28, pg. 329
*** Do #30, pg. 329
Using the TI-83 to Construct a Confidence Interval for Estimating μ
STAT>>TESTS choose 7:ZInterval
*** Do #24, pg. 328
6.4 – Estimating a Population Mean: σ Not Known
In section 6.3, we constructed confidence intervals for the mean of a population
whose standard deviation σ was known. This assumption is not very realistic. The
methods of this section are realistic and practical and do not include a requirement
that σ is known. The usual procedure is to collect sample data, compute the
statistics n, $\bar{x}$, and s, and use them to construct the confidence interval.
Assumptions
1. The sample is a simple random sample.
2. Either the sample is from a normally distributed population or n > 30.
The sample mean $\bar{x}$ is the best point estimate of the population mean μ.
If the above two conditions are satisfied, we find the confidence interval using
the Student t distribution instead of the normal distribution.
Student t Distribution
If the distribution of a population is essentially normal (approximately bell shaped),
then the distribution of
$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$
is essentially a Student t distribution for all samples of size n. The Student t
distribution (or t distribution) is used to find the critical values $t_{\alpha/2}$.
In a Student t distribution, the critical values $t_{\alpha/2}$ are found in Table A-3 by locating
the degrees of freedom and the area of one or two tails.
The number of degrees of freedom for a collection of sample data is the
number of sample values that can vary after certain restrictions have been imposed
on all data values.
degrees of freedom = df = (n – 1) (one less than the sample size)
Reading the Student t Tables (Two-tailed)
Suppose the conditions are satisfied to use the Student t distribution. Use Table A-3
to find the critical value $t_{\alpha/2}$ for the given sample size and degree of confidence:
n     Degree of confidence    $t_{\alpha/2}$
20    95%                     ______
27    90%                     ______
16    99%                     ______
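Table A-3 lookups can also be checked numerically. Python's standard library has no built-in t distribution, so the sketch below integrates the t density with Simpson's rule and finds the critical value by bisection. This is a rough numerical illustration of what a critical value means, not how the table was produced; the function names, step count, and search range (0 to 100) are our own choices.

```python
import math

def t_cdf(t, df, steps=1000):
    """P(T <= t) for t >= 0, by Simpson's rule applied to the t density on [0, t]."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    # The density is symmetric about 0, so the area to the left of 0 is 0.5.
    return 0.5 + total * h / 3

def t_critical(n, confidence):
    """Critical value t_{alpha/2} with df = n - 1, found by bisection."""
    df = n - 1
    target = 1 - (1 - confidence) / 2  # area to the left of t_{alpha/2}
    low, high = 0.0, 100.0
    for _ in range(40):
        mid = (low + high) / 2
        if t_cdf(mid, df) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2

for n, level in ((20, 0.95), (27, 0.90), (16, 0.99)):
    print(f"n = {n}, {level:.0%}: df = {n - 1}, t = {t_critical(n, level):.3f}")
```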
Properties of the Student t Distribution
1. Different for different sample sizes. See figure below.
2. Same general symmetric bell shape as the standard normal distribution,
but t curves are lower in the center and higher in the tails.
3. Has mean of t = 0.
4. Standard deviation varies with the sample size, but it is greater than 1.
5. As n gets larger, the Student t distribution gets closer to the standard
normal distribution.
[Figure 6-5: Student t Distributions for n = 3 and n = 12. Copyright © 2004 Pearson Education, Inc.]
Conditions for Using the Student t Distribution
1. σ is unknown, (if σ is known we use the methods of 6.3);
2. The parent population has a distribution that is essentially normal; or
3. If the parent population is not normally distributed, then n > 30.
Notes:
1. Criteria for deciding whether the population is normally distributed:
Population need not be exactly normal, but it should appear to be
somewhat symmetric with one mode and no outliers.
To assess normality, use the last graph type in STAT PLOT (the normal
probability plot) with the data list. If the points lie close to a straight line
without significant outliers, the distribution of the data is approximately normal.
2. Sample size n > 30:
This is a commonly used guideline, but sample sizes of 15 to 30 are
adequate if the population appears to have a distribution that is not
far from being normal and there are no outliers. For some populations
with distributions that are extremely far from normal, the sample size
might need to be larger than 50 or even 100.
Choose the Appropriate Distribution
[Figure 6-6: Using the Normal and t Distribution]
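The choice between the two distributions can be written as a short decision rule. The sketch below is based only on the assumptions listed in sections 6.3 and 6.4 of these notes, not on the figure itself. The function name and return labels are ours, and the "neither" branch simply flags that the methods in these notes do not apply.

```python
def choose_distribution(sigma_known, population_normal, n):
    """Pick the distribution for a confidence interval for a mean.

    Returns "z" (normal), "t" (Student t), or "neither".
    """
    # Both methods require a normally distributed population or n > 30.
    if not (population_normal or n > 30):
        return "neither"
    # Section 6.3 applies when sigma is known, section 6.4 when it is not.
    return "z" if sigma_known else "t"

print(choose_distribution(sigma_known=True, population_normal=False, n=106))
print(choose_distribution(sigma_known=False, population_normal=True, n=16))
print(choose_distribution(sigma_known=False, population_normal=False, n=12))
```

Keep in mind that n > 30 is a guideline rather than a hard cutoff, as the notes above point out.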
*** Do #2, pg. 343
*** Do #4, pg. 343
*** Do #6, pg. 343
Margin of Error for the Estimate of μ (With σ Not Known)
$E = t_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$, where $t_{\alpha/2}$ has n – 1 degrees of freedom
Confidence Interval for the Estimate of μ (With σ Not Known)
$\bar{x} - E < \mu < \bar{x} + E$
where
$E = t_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$
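Once the critical value has been read from Table A-3, the interval takes only a couple of lines to compute. In the sketch below the data are hypothetical (n = 16, sample mean 50.0, s = 4.0, all made up for illustration), and the critical value 2.947 is the Table A-3 entry for 15 degrees of freedom at 99% confidence. The function name is ours.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """Confidence interval for mu when sigma is not known.

    t_crit is the critical value t_{alpha/2} for n - 1 degrees of freedom,
    supplied by the caller (for example, from Table A-3).
    """
    margin = t_crit * s / sqrt(n)  # margin of error E
    return x_bar - margin, x_bar + margin

# Hypothetical sample: n = 16, x-bar = 50.0, s = 4.0; 99% confidence, df = 15.
lower, upper = t_interval(x_bar=50.0, s=4.0, n=16, t_crit=2.947)
print(f"{lower:.3f} < mu < {upper:.3f}")
```

Here s/√n = 4.0/4 = 1, so the margin of error equals the critical value itself, which makes the arithmetic easy to verify by hand.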
*** Do #12, pg. 343
*** Do #14, pg. 344
Using the TI-83 to Construct a Confidence Interval for Estimating μ
STAT>>TESTS choose 8:TInterval
*** Do #18, pg. 344
*** Do #20, pg. 345