Generalizing a Sample’s Findings to Its Population and
Testing Hypotheses About Percents and Means
Statistics Versus Parameters
• Statistics: values that are computed
from information provided by a sample
• Parameters: values that are computed
from a complete census which are
considered to be precise and valid
measures of the population
• Parameters represent “what we wish
to know” about a population. Statistics
are used to estimate population
parameters.
Ch 16
The Concepts of Inference and
Statistical Inference
• Inference: drawing a conclusion
based on some evidence
• Statistical inference: a set of
procedures in which the sample size
and sample statistics are used to
make estimates of population
parameters
How to Calculate Sample Error (Accuracy)
$$\text{error} = \pm z \sqrt{\frac{pq}{n}}$$
Where z = 1.96 (95%) or 2.58 (99%)
Sample Size and Accuracy
[Figure: accuracy (sample error, 0% to 16%) plotted against sample size (50 to 2,000)]
Accuracy Levels for Different Sample Sizes
• At 95% (z = 1.96), by the “p” you found in your sample

n        p=50%     p=70%     p=90%
10       ±31.0%    ±28.4%    ±18.6%
100      ±9.8%     ±9.0%     ±5.9%
250      ±6.2%     ±5.7%     ±3.7%
500      ±4.4%     ±4.0%     ±2.6%
1,000    ±3.1%     ±2.8%     ±1.9%

95% Confidence interval: p ± 1.96 times sp
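The accuracy figures above can be reproduced with a few lines of Python. This is a minimal sketch, assuming p is entered as a percentage (0 to 100) and q = 100 - p; the function name `sample_error` is illustrative and not from the text.

```python
import math

def sample_error(p, n, z=1.96):
    """Return the +/- sample error in percentage points.

    p: sample percentage (0-100); n: sample size;
    z: 1.96 for 95% confidence, 2.58 for 99%.
    """
    q = 100.0 - p
    return z * math.sqrt(p * q / n)

# Rebuild the accuracy table for the sample sizes and percentages shown.
for n in (10, 100, 250, 500, 1000):
    row = [f"±{sample_error(p, n):.1f}%" for p in (50, 70, 90)]
    print(f"{n:>5}  " + "  ".join(row))
```

Accuracy is worst at p = 50% because that is where the product pq is largest.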
Parameter Estimation
• Parameter estimation: the process of
using sample information to compute
an interval that describes the range of
values that a parameter, such as the
population mean or population
percentage, is likely to take on
Parameter Estimation
• Parameter estimation involves three
values:
1. Sample statistic (mean or percentage
generated from sample data)
2. Standard error (square root of a measure
of variability divided by sample size; there
is one formula for the standard error of the
mean and another for the standard error of
the percentage)
3. Confidence interval (gives us a range
within which a sample statistic will fall
if we were to repeat the study many
times over)
Parameter Estimation
• Statistics are generated from sample data
and are used to estimate population
parameters.
• The sample statistic may be either a
percentage, e.g., 12% of the respondents
stated they were “very likely” to patronize a
new, upscale restaurant, OR
• The sample statistic may be a mean, e.g.,
the average amount spent per month in
restaurants is $185.00
Parameter Estimation
• Standard error: while there are two
formulas, one for a percentage and
the other for a mean, both formulas
have a measure of variability divided
by sample size. Given the sample
size, the more variability, the greater
the standard error.
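To see how variability drives the standard error at a fixed sample size, here is a small sketch. It assumes the usual forms (standard deviation over the square root of n for a mean, square root of pq/n for a percentage); the standard deviations of 50 and 100 and the sample size of 400 are made-up illustration values.

```python
import math

def standard_error_of_mean(s, n):
    """s: sample standard deviation; n: sample size."""
    return s / math.sqrt(n)

def standard_error_of_percentage(p, n):
    """p: sample percentage (0-100); q is taken as 100 - p."""
    return math.sqrt(p * (100.0 - p) / n)

n = 400  # hypothetical sample size, held fixed

# Doubling the standard deviation doubles the standard error of the mean.
print(standard_error_of_mean(50, n))         # 2.5
print(standard_error_of_mean(100, n))        # 5.0

# A 50/50 split is more variable than a 90/10 split.
print(standard_error_of_percentage(90, n))   # 1.5
print(standard_error_of_percentage(50, n))   # 2.5
```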
Parameter Estimation
• The lower the standard error, the
more precisely our sample statistic
will represent the population
parameter. Researchers have an
opportunity for predetermining
standard error when they calculate
the sample size required to
accurately estimate a parameter.
Recall Chapter 13 on sample size.
Standard Error of the Mean
$$s_{\bar{x}} = \frac{s}{\sqrt{n}}$$
Standard Error of the Percentage
$$s_p = \sqrt{\frac{pq}{n}}$$
Parameter Estimation
• Confidence intervals: the degree of
accuracy desired by the researcher
and stipulated as a level of
confidence in the form of a
percentage
• Most commonly used level of
confidence: 95%; corresponding to
1.96 standard errors
Parameter Estimation
• What does this mean? It means that
we can say that if we did our study
over 100 times, we can determine a
range within which the sample
statistic will fall 95 times out of 100
(95% level of confidence). This gives
us confidence that the real population
value falls within this range.
How do I interpret the confidence interval?
• Theoretical notion
• Take many, many, many samples
• Plot the p’s
• 95% will fall in confidence interval (p ± z times sp)
[Figure: distribution of sample p’s with the central 95% marked and 2.5% in each tail]
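The "many, many samples" idea can be tried out in a simulation. The sketch below assumes a population in which exactly 41.3% hold the attribute, draws 2,000 simple random samples of 400, and counts how many sample percentages land inside the interval built around the population value. The population percentage, number of samples, seed, and function name are all illustration choices.

```python
import math
import random

def share_within_interval(true_p=41.3, n=400, z=1.96, samples=2000, seed=1):
    """Fraction of simulated sample percentages falling within
    true_p +/- z * s_p, where s_p = sqrt(pq / n)."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    prob = true_p / 100.0
    s_p = math.sqrt(true_p * (100.0 - true_p) / n)
    lower, upper = true_p - z * s_p, true_p + z * s_p

    inside = 0
    for _ in range(samples):
        # Draw one sample of n respondents and compute its percentage.
        hits = sum(1 for _ in range(n) if rng.random() < prob)
        sample_p = 100.0 * hits / n
        if lower <= sample_p <= upper:
            inside += 1
    return inside / samples

print(f"{share_within_interval():.3f}")
```

The printed share should come out close to 0.95, with some wobble from the finite number of simulated samples.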
Parameter Estimation
• Five steps involved in computing
confidence intervals for a mean or
percentage:
1. Determine the sample statistic
2. Determine the variability in the
sample for that statistic
3. Identify the sample size
4. Decide on the level of confidence
5. Perform the computations to
determine the upper and lower
boundaries of the confidence
interval range
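The five steps can be sketched as one small function. This is an illustration under assumptions: the statistics (12% "very likely" and a $185 mean) echo the earlier examples, but the sample size of 400 and the standard deviation of $50 are hypothetical, and `confidence_interval` is not a name from the text.

```python
import math

def confidence_interval(statistic, variability, n, z=1.96):
    """Return (lower, upper) boundaries of the confidence interval.

    statistic:   step 1, the sample mean or percentage
    variability: step 2, s**2 for a mean, or p*q for a percentage
    n:           step 3, the sample size
    z:           step 4, 1.96 for 95% confidence, 2.58 for 99%
    """
    standard_error = math.sqrt(variability / n)
    margin = z * standard_error        # step 5: do the computations
    return statistic - margin, statistic + margin

n = 400  # hypothetical sample size

# Percentage: p = 12, q = 88.
pct_low, pct_high = confidence_interval(12, 12 * 88, n)
print(f"{pct_low:.1f}% to {pct_high:.1f}%")

# Mean: $185 with an assumed sample standard deviation of $50.
mean_low, mean_high = confidence_interval(185, 50 ** 2, n)
print(f"${mean_low:.2f} to ${mean_high:.2f}")
```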
Parameter Estimation Using
SPSS: Estimating a Percentage
• Run FREQUENCIES (on
RADPROG) and you find that 41.3%
listen to “Rock” music.
• So, set p=41.3 and then q=58.7,
n=400, 95%=1.96, calculate Sp.
• The answer is 36.5%-46.1%
• We are 95% confident that the true %
of the population that listens to
“Rock” falls between 36.5% and
46.1%. (See p. 464).
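For readers who want to check the arithmetic by hand, the calculation works out as follows, using only the values given above (p = 41.3, q = 58.7, n = 400, z = 1.96) and rounding at each step:

```latex
\begin{aligned}
s_p &= \sqrt{\frac{pq}{n}} = \sqrt{\frac{41.3 \times 58.7}{400}} = \sqrt{6.0608} \approx 2.46 \\
1.96 \times s_p &\approx 4.8 \\
p \pm 1.96\, s_p &= 41.3 \pm 4.8 \;\Rightarrow\; 36.5\% \text{ to } 46.1\%
\end{aligned}
```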
How to Compute a Confidence Interval for a Percent
• Determine the confidence interval using
  • Sample size (n)
  • 95% level of confidence (z=1.96)
  • p=?%; q=100%-?%
$$s_p = \sqrt{\frac{pq}{n}}$$
$$\text{Lower boundary} = p - z\sqrt{\frac{pq}{n}} \qquad \text{Upper boundary} = p + z\sqrt{\frac{pq}{n}}$$
Estimating a Population
Percentage with SPSS
• How do we interpret the
results?
– Our best estimate of the
population percentage
that prefers “Rock” radio
is 41.3 percent, and we
are 95 percent confident
that the true population
value is between 36.5
and 46.1 percent.
Parameter Estimation Using
SPSS: Estimating a Mean
• SPSS will calculate a confidence
interval around a mean sample
statistic.
• From the Hobbit’s Choice data
assume
– We want to know how much those
who stated “very likely” to patronize
an upscale restaurant spend in
restaurants per month. (See p.
465.)
Parameter Estimation Using
SPSS: Estimating a Mean
• We must first use DATA, SELECT
CASES to select LIKELY=5.
• Then we run ANALYZE, COMPARE
MEANS, ONE SAMPLE T-TEST.
• Note: You should only run this test
when you have interval or ratio data.
Parameter Estimation Using
SPSS: Estimating a Percentage
• Estimating a Percentage: SPSS will
not calculate a confidence interval for a
percentage. You must run FREQUENCIES to
get your sample statistic and n size. Then use
the formula p ± 1.96 sp.
• AN EXAMPLE: We want to estimate
the percentage of the population that
listens to “Rock” radio.
Estimating a Population
Percentage with SPSS
• Suppose we wish to know how
accurately the sample statistic
estimates the percent listening to
“Rock” music.
– Our “best estimate” of the population
percentage is that 41.3% prefer “Rock”
music stations (n=400). We run
FREQUENCIES to learn this.
– But how accurate is this estimate of the
true population percentage preferring
rock stations?
Estimating a Population Mean
with SPSS
• How do we interpret the results?
– My best estimate is that those “very
likely” to patronize an upscale
restaurant in the future presently spend
$281 per month in restaurants.
In addition, I am 95% confident that the
true population value falls between $267
and $297 (95% confidence interval).
Therefore, Jeff Dean can be 95%
confident that the second criterion for
the forecasting model “passes” the test.
Hypothesis Testing
• Hypothesis: an expectation of what
the population parameter value is
• Hypothesis testing: a statistical
procedure used to “accept” or “reject”
the hypothesis based on sample
information
• Intuitive hypothesis testing: when
someone uses something he or she
has observed to see if it agrees with
or refutes his or her belief about that
topic
Hypothesis Testing
• Statistical hypothesis testing:
– Begin with a statement about what
you believe exists in the population
– Draw a random sample and
determine the sample statistic
– Compare the statistic to the
hypothesized parameter
– Decide whether the sample
supports the original hypothesis
– If the sample does not support the
hypothesis, revise the hypothesis
to be consistent with the sample’s
statistic
What is a Statistical Hypothesis?
• A hypothesis is what someone
expects (or hypothesizes) the
population percent or the average
to be.
• If your hypothesis is correct, it will
fall in the confidence interval
(known as supported).
• If your hypothesis is incorrect, it will
fall outside the confidence interval
(known as not supported).
How a Hypothesis Test Works
• Sample ----- Test hypothesis ----- Population
• Exact amount ----- Uses sample error
• Percent ----- Test against Ho
• Average ----- Test against Ho
How to Test Statistical Hypothesis
[Figure: normal curve with the central 95% between -1.96 and +1.96 and 2.5% in each tail]
Testing a Hypothesis of a Mean
• Example in Text: Rex Reigen
hypothesizes that college interns
make $2,800 in commissions. A
survey shows $2,750. Does the
survey sample statistic support or fail
to support Rex’s hypothesis? (p. 472)
• Since the computed z of -1.43 falls between -1.96 and
+1.96, we ACCEPT the hypothesis.
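How to Test Statistical Hypothesis
$$z = \frac{p - \pi_H}{s_p} = \frac{p - \pi_H}{\sqrt{\frac{pq}{n}}}$$
$$z = \frac{\bar{x} - \mu_H}{s_{\bar{x}}} = \frac{\bar{x} - \mu_H}{\frac{s}{\sqrt{n}}}$$
[Figure: normal curve; the central 95% between -1.96 and +1.96 is labeled “Supported”, and the 2.5% in each tail is labeled “Not Supported”]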
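The two z formulas can be sketched in Python as below. The intern figures ($2,800 hypothesized, $2,750 observed) come from the example, but the slides do not give the standard deviation or sample size; the values s = 350 and n = 100 are assumptions chosen only so the sketch has something to run on. Function names are illustrative.

```python
import math

def z_for_mean(sample_mean, hypothesized_mean, s, n):
    """z = (x-bar - mu_H) / (s / sqrt(n))."""
    return (sample_mean - hypothesized_mean) / (s / math.sqrt(n))

def z_for_percentage(p, hypothesized_p, n):
    """z = (p - pi_H) / sqrt(pq / n), with p in percent and q = 100 - p."""
    return (p - hypothesized_p) / math.sqrt(p * (100.0 - p) / n)

def is_supported(z, critical=1.96):
    """Non-directional test: supported when z lies within +/- critical."""
    return -critical <= z <= critical

# s=350 and n=100 are ASSUMED values, not taken from the slides.
z = z_for_mean(2750, 2800, s=350, n=100)
print(round(z, 2), is_supported(z))
```

With these assumed inputs z is about -1.43, inside the ±1.96 band, so the hypothesis is supported.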
• Our sample mean of $2,750 falls within
the range where 95% of sample means
would fall if the population parameter
were $2,800. Therefore, we
accept Rex’s hypothesis.
Hypothesis Testing
• Non-Directional hypotheses:
hypotheses that do not indicate the
direction (greater than or less than) of
a hypothesized value
Hypothesis Testing
• Directional hypotheses: hypotheses
that indicate the direction in which
you believe the population parameter
falls relative to some target mean or
percentage
Using SPSS to Test Hypotheses
About a Percentage
• SPSS cannot test hypotheses about
percentages; you must use the
formula. See p. 475
Using SPSS to Test Hypotheses
About a Mean
• In the Hobbit’s Choice Case we want
to test that those stating “very likely” to
patronize an upscale restaurant are
willing to pay an average of $18 per
entrée.
• DATA, SELECT CASES, Likely=5
• ANALYZE, COMPARE MEANS, ONE
SAMPLE T TEST
• ENTER 18 AS TEST VALUE
• Note: z value is reported as t in output.
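Outside SPSS, the same statistic can be computed from raw data. The sketch below is not the Hobbit’s Choice data: the ten entrée prices are invented for illustration, and `one_sample_t` is a hypothetical helper built on the standard library.

```python
import math
import statistics

def one_sample_t(values, test_value):
    """t = (sample mean - test value) / (s / sqrt(n))."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)       # sample standard deviation (n - 1)
    return (mean - test_value) / (s / math.sqrt(n))

# HYPOTHETICAL prices respondents say they would pay per entree.
prices = [16, 18, 20, 22, 19, 17, 21, 23, 18, 20]

t = one_sample_t(prices, 18)           # 18 is the test value
print(round(t, 2))
```

With a sample this small the t critical value is larger than 1.96, so treating t as z is only reasonable for large samples.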
What if We Used a Directional
Hypothesis?
• Those stating “very likely” to
patronize an upscale restaurant are
willing to pay more than an average
of $18 per entrée.
• Is the sign (- or +) in the
hypothesized direction? For “more
than” hypotheses it should be +; if
not, reject.
What if We Used a Directional
Hypothesis?
• Since we are working with a
direction, we are only concerned with
one side of the normal distribution.
Therefore, we need to adjust the
critical values. We would accept this
hypothesis if the z value computed is
greater than +1.64 (95%).
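The directional rule can be written as a short check. This is a sketch: the "more than" branch follows the rule stated above, while the "less than" branch is an assumption that mirrors it on the other side of the distribution.

```python
def directional_supported(z, direction, critical=1.64):
    """Directional (one-sided) test at the 95% level.

    direction: "more" for a "more than" hypothesis,
               "less" for a "less than" hypothesis (mirrored by assumption).
    """
    if direction == "more":
        # Sign must be positive AND z must exceed the critical value.
        return z > critical
    if direction == "less":
        return z < -critical
    raise ValueError("direction must be 'more' or 'less'")

print(directional_supported(2.0, "more"))    # True
print(directional_supported(1.0, "more"))    # False: right sign, too small
print(directional_supported(-2.0, "more"))   # False: wrong sign
```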