TPS4e Ch8-8.1

+
“Statisticians use a confidence interval to describe
the amount of uncertainty associated with a sample
estimate of a population parameter.”
+ REFLECTION…
In Chapter 7 they GAVE us either the population mean
or the mean of the sampling distribution.

If we had the mean of the sampling distribution we could predict
the population mean.

In real life, you seldom have this information

So….how do you estimate your population mean?!
Point Estimates vs. Interval Estimates
+
We don’t KNOW the mean of the sampling distribution (which
would be equal to our population mean), so we
don’t KNOW where our statistic is RELATIVE to
the true population parameter.
+ Making connections…
In Chapter 2 we learned about the Empirical rule:
68% of values lie within ±1σ of the mean, 95% of values lie
within ±2σ of the mean, and 99.7% of values lie within ±3σ of the
mean.
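As a quick check of the Empirical rule, the sketch below computes the area under a standard Normal curve within 1, 2, and 3 standard deviations of the mean. It assumes an exactly Normal distribution; the helper name `area_within` is made up for this illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Proportion of a Normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k}σ: {area_within(k):.1%}")
```

The printed proportions round to the familiar 68%, 95%, and 99.7%.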

In Chapter 4, we learned if we randomly select the sample, we
should be able to generalize our results to the population of interest.

In Chapter 7 we learned that if we take multiple samples, each may
vary by a certain amount, but if we take all possible
samples of the same size, we can construct a sampling
distribution.
Confidence Intervals: The Basics
Definition:
A point estimator is a statistic that provides an
estimate of a population parameter. The value of that
statistic from a sample is called a point estimate.
Ideally, a point estimate is our “best guess” at the
value of an unknown parameter.
+

WHAT IS THE POINT ESTIMATOR?

WHAT IS THE POINT ESTIMATE?

CAN YOU IDENTIFY EACH IN AN
APPLICATION?

LET’S TRY THE FOLLOWING …
+
What proportion p of U.S. high school students
smoke? A 2007 survey questioned a random
sample of 14,041 students in grades 9-12. Of
these, 2,808 said they had smoked cigarettes at
least one day in the past month.
+
The math department wants to know what
proportion of students own a graphing calculator,
so they take a random sample of 100 students
and find that 28 own a graphing calculator.
+
The makers of a new golf ball want to estimate the
median distance the new balls will travel when hit by a
mechanical driver. They select an SRS of 10 balls and
measure the distance travelled after being hit by a
mechanical driver.
The following are the distances in yards:
285 286 284 285 282 284 287 290 288 285
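One way to work the three applications above is sketched here. In each case the point estimator is the sample statistic (the sample proportion p̂ for the first two, the sample median for the third), and the point estimate is the number it produces. The variable names are illustrative only.

```python
from statistics import median

# Smoking survey: point estimator is the sample proportion p-hat
p_hat_smoke = 2808 / 14041

# Graphing calculators: point estimator is again the sample proportion p-hat
p_hat_calculator = 28 / 100

# Golf balls: point estimator is the sample median distance (yards)
distances = [285, 286, 284, 285, 282, 284, 287, 290, 288, 285]
median_distance = median(distances)

print(f"p-hat (smoking):    {p_hat_smoke:.3f}")
print(f"p-hat (calculator): {p_hat_calculator:.2f}")
print(f"sample median:      {median_distance} yards")
```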
+
BIG IDEA

We are estimating the mean of a sampling distribution,
and transitively the population parameter, by using a
sample’s statistic [mean/proportion]!
An interval estimate is defined by two numbers,
between which a population parameter is said to lie.
For example, (a, b) is an interval estimate that states
the population mean is between a and b.
+
INTERVAL ESTIMATE
+
CONFIDENCE INTERVALS: THE BASICS
The confidence level describes the uncertainty associated with a
sampling method.
Suppose we used the same sampling method to select different
samples and to compute a different interval estimate for each
sample.
Some interval estimates would include the true population parameter
and some would not.
A 90% confidence level means that we would expect 90% of the
interval estimates to include the population parameter.
A 95% confidence level means that 95% of the intervals would
include the parameter.
+
Example:

You give the SAT test to an SRS of 500 high school seniors in
California. The sample mean x̄ for the math portion is 461. The standard
deviation of the population is σ = 100.
What is the standard deviation of the sample mean x̄?
Would the sample mean vary if we took many samples of
500 seniors from the same population?
+
The 68-95-99.7 rule says that in 95% of all samples, the sample mean x̄ will be
within 2 standard deviations of the population mean μ, where the standard
deviation of x̄ is σ/√n = 100/√500 ≈ 4.5.
So the mean x̄ of 500 SAT Math scores will be within 4.5*2 = 9 points of the
unknown μ in 95% of all samples.
+
• So in 95% of all samples, the unknown μ lies between x̄ - 9 and x̄ + 9
• Or x̄ ± 9
• Or (x̄ - 9, x̄ + 9)
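The arithmetic for the SAT example can be sketched as follows, using the values given on the slide (n = 500, x̄ = 461, σ = 100) and the rule-of-thumb multiplier 2 from the 68-95-99.7 rule. The slide rounds the standard deviation of x̄ to 4.5; the code keeps full precision, so its interval is slightly narrower than x̄ ± 9.

```python
import math

sigma = 100   # population standard deviation (given)
n = 500       # sample size
x_bar = 461   # sample mean

# Standard deviation of the sampling distribution of x-bar
sd_of_mean = sigma / math.sqrt(n)

# 68-95-99.7 rule: about 95% of sample means fall within 2 of these SDs of mu
margin = 2 * sd_of_mean
interval = (x_bar - margin, x_bar + margin)

print(f"SD of x-bar: {sd_of_mean:.2f}")
print(f"margin:      {margin:.2f}")
print(f"interval:    ({interval[0]:.2f}, {interval[1]:.2f})")
```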
+
INTERPRETING CONFIDENCE LEVEL AND CONFIDENCE INTERVALS
CONFIDENCE LEVEL: To say that we are 95% confident is shorthand for
“95% of all possible samples of a given size from this population will
result in an interval that captures the unknown parameter.”
CONFIDENCE INTERVAL: To interpret a C% confidence interval for an
unknown parameter, say:
“We are C% confident that the interval from _____ to _____ captures
the actual value of the [population parameter in context].”
+
CONFIDENCE INTERVALS
Interpreting Confidence Levels and Confidence Intervals
The confidence level tells us how likely it is that the method we
are using will produce an interval that captures the population
parameter if we use it many times.
The confidence level does not tell us the chance that a particular
confidence interval captures the population parameter.
Instead, the confidence interval gives us a set of plausible values for
the parameter.
We interpret confidence levels and confidence intervals in much the
same way whether we are estimating a population mean, proportion,
or some other parameter.
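The "95% of all possible samples" interpretation can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is taken to be Normal with a "true" mean of 461 and σ = 100 (in practice μ is unknown, which is the whole point), and the function name, number of trials, and seed are arbitrary choices for the illustration.

```python
import random
import statistics

def coverage_rate(mu=461.0, sigma=100.0, n=500, trials=1000, seed=1):
    """Fraction of intervals x-bar ± 2·σ/√n that capture the true mean mu."""
    rng = random.Random(seed)
    margin = 2 * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n and compute its mean
        sample_mean = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        if sample_mean - margin <= mu <= sample_mean + margin:
            hits += 1
    return hits / trials

rate = coverage_rate()
print(f"Share of intervals that captured mu: {rate:.1%}")
```

Each individual interval either captures μ or it does not; the confidence level describes the long-run share of intervals that do.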
+

To express a confidence interval, you need three pieces of information.
•Confidence level
•A confidence level C gives the overall success rate of the method: in
repeated samples, the interval captures the true parameter value C% of the
time.
•C is usually 90%, 95% or 99% but can be any %.
•Statistic
•x̄ or p̂
•Margin of error
•Critical value (z score) * Standard deviation of the statistic
+
CONFIDENCE INTERVAL DATA REQUIREMENTS
+
We usually choose a confidence level of 90% or higher because we want to be
quite sure of our conclusions. The most common confidence level is 95%.
The confidence interval for estimating a population parameter has the form
statistic ± (critical value) • (standard deviation of statistic)
where the statistic we use is the point estimator for the parameter.
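The general form statistic ± (critical value) • (standard deviation of statistic) can be sketched in a few lines. The sketch assumes the sampling distribution is approximately Normal, so the critical value is a z score taken from the standard Normal curve; the function names are illustrative, and the SAT numbers (x̄ = 461, σ = 100, n = 500) are reused as the example.

```python
from statistics import NormalDist

def critical_value(confidence_level):
    """z* that leaves the central `confidence_level` area under the standard Normal curve."""
    tail_area = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail_area)

def confidence_interval(statistic, confidence_level, sd_of_statistic):
    """Return (low, high) = statistic ± critical value × SD of the statistic."""
    margin = critical_value(confidence_level) * sd_of_statistic
    return statistic - margin, statistic + margin

low, high = confidence_interval(461, 0.95, 100 / 500 ** 0.5)
print(f"z* for 95%: {critical_value(0.95):.3f}")
print(f"95% CI for mu: ({low:.1f}, {high:.1f})")
```

The exact 95% critical value is about 1.96, slightly less than the 2 used with the 68-95-99.7 rule, so this interval is a little narrower than x̄ ± 9.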
Properties of Confidence Intervals:
• The “margin of error” is the (critical value) • (standard deviation of statistic).
• The user chooses the confidence level, and the margin of error follows
from this choice.
• The critical value depends on the confidence level and the sampling
distribution of the statistic.
• Greater confidence requires a larger critical value.
• The standard deviation of the statistic depends on the sample size n.
The margin of error gets smaller when:
• The confidence level decreases
• The sample size n increases
Confidence Intervals: The Basics
Calculating a Confidence Interval
+
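How the margin of error responds to the confidence level and the sample size can be seen numerically. The sketch below assumes a Normal sampling distribution for a sample mean with known σ = 100 (as in the SAT example); the function name and the particular levels and sample sizes are chosen only for illustration.

```python
from statistics import NormalDist

def margin_of_error(confidence_level, sigma, n):
    """Critical value z* times the standard deviation of x-bar (σ/√n)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return z_star * sigma / n ** 0.5

# Lower confidence level -> smaller margin (n fixed at 500)
for level in (0.99, 0.95, 0.90):
    print(f"C = {level:.0%}, n = 500:  margin = {margin_of_error(level, 100, 500):.2f}")

# Larger sample size -> smaller margin (C fixed at 95%)
for n in (125, 500, 2000):
    print(f"C = 95%, n = {n:>4}: margin = {margin_of_error(0.95, 100, n):.2f}")
```

Because the standard deviation of x̄ is σ/√n, quadrupling the sample size cuts the margin of error in half.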
Before calculating a confidence interval for µ or p there are
three important conditions to check:
1) Random:
The data should come from a well-designed random sample
or randomized experiment.
2) Normal:
The sampling distribution of the statistic is approximately Normal.
For means: If the population distribution is Normal, so is the sampling
distribution of x̄. If the population distribution is not Normal, then the
CLT tells us the sampling distribution will be approximately Normal if n ≥ 30.
For proportions: We can use Normal approximation to the sampling
distribution as long as np ≥ 10 and n(1 – p) ≥ 10.
3) Independent:
Individual observations are independent. When sampling
without replacement, the sample size n should be no more than
10% of the population size N (the 10% condition) to use our
formula for the standard deviation of the statistic.
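The three conditions can be written as a simple checklist. This is a sketch, not a complete procedure: whether the data are random has to be judged from how they were collected, so it is passed in as a yes/no value. For proportions the code uses p̂ in place of the unknown p, which is an assumption beyond the slide. The function names and the population size of 1,500 students are hypothetical.

```python
def check_proportion_conditions(n, p_hat, population_size, random_sample):
    """Random, Normal, and Independent checks for a proportion (p-hat stands in for p)."""
    return {
        "random": random_sample,
        "normal": n * p_hat >= 10 and n * (1 - p_hat) >= 10,
        "independent": n <= 0.10 * population_size,  # the 10% condition
    }

def check_mean_conditions(n, population_is_normal, population_size, random_sample):
    """Random, Normal, and Independent checks for a mean."""
    return {
        "random": random_sample,
        "normal": population_is_normal or n >= 30,   # CLT rule of thumb
        "independent": n <= 0.10 * population_size,
    }

# Graphing-calculator example: n = 100, p-hat = 0.28, hypothetical school of 1,500
calculator_check = check_proportion_conditions(100, 0.28, 1500, random_sample=True)
print(calculator_check)
```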
+
Conditions