Sampling and Statistical
Analysis for Decision Making
A. A. Elimam
College of Business
San Francisco State University
Chapter Topics
• Sampling: Design and Methods
• Estimation:
• Confidence Interval Estimation for the Mean ($\sigma$ Known)
• Confidence Interval Estimation for the Mean ($\sigma$ Unknown)
• Confidence Interval Estimation for the Proportion
Chapter Topics
• The Situation of Finite Populations
• Student’s t distribution
• Sample Size Estimation
• Hypothesis Testing
• Significance Levels
• ANOVA
Statistical Sampling
• Sampling: Valuable tool
• Population:
• Too large to deal with effectively or practically
• Impossible or too expensive to obtain all data
• Collect sample data to draw conclusions
about unknown population
Sample design
• Representative Samples of the population
• Sampling Plan: Approach to obtain samples
• Sampling Plan: States
• Objectives
• Target population
• Population frame
• Method of sampling
• Data collection procedure
• Statistical analysis tools
Objectives
• Estimate population parameters such as a
mean, proportion or standard deviation
• Identify if significant difference exists
between two populations
Population Frame
• List of all members of the target population
Sampling Methods
• Subjective Sampling:
• Judgment: select the sample (best customers)
• Convenience: ease of sampling
• Probabilistic Sampling:
• Simple Random Sampling
• With Replacement
• Without Replacement
Sampling Methods
• Systematic Sampling:
• Selects items periodically from population.
• First item randomly selected - may produce bias
• Example: pick one sample every 7 days
• Stratified Sampling:
• Populations divided into natural strata
• Allocates proper proportion of samples to each stratum
• Each stratum weighted by its size; the cost or significance of
certain strata might suggest a different allocation
• Example: sampling of political districts - wards
Sampling Methods
• Cluster Sampling:
• Populations divided into clusters then random sample each
• Items within each cluster become members of the sample
• Example: segment customers for each geographical location
• Sampling Using Excel:
• Population listed in spreadsheet
• Periodic
• Random
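The sampling methods listed above can also be sketched outside a spreadsheet. The following is a minimal illustration using only Python's standard library; the function names, the population of 100 numbered items, and the two strata are hypothetical and chosen only for demonstration.

```python
import random

def simple_random_sample(population, n, replace=False, seed=None):
    """Simple random sampling, with or without replacement."""
    rng = random.Random(seed)
    if replace:
        return [rng.choice(population) for _ in range(n)]
    return rng.sample(population, n)

def systematic_sample(population, step, seed=None):
    """Pick a random starting item, then every `step`-th item after it."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, n, seed=None):
    """Allocate n across strata in proportion to stratum size (an assumption;
    cost or significance could justify a different allocation)."""
    rng = random.Random(seed)
    total = sum(len(items) for items in strata.values())
    sample = {}
    for name, items in strata.items():
        k = round(n * len(items) / total)
        sample[name] = rng.sample(items, k)
    return sample

# Hypothetical population: items numbered 1..100
population = list(range(1, 101))
srs = simple_random_sample(population, 10, seed=1)
periodic = systematic_sample(population, 7, seed=1)

# Hypothetical strata of sizes 60 and 40
strata = {"north": list(range(60)), "south": list(range(60, 100))}
by_stratum = stratified_sample(strata, 10, seed=1)
```

With proportional allocation, a sample of 10 takes 6 items from the larger stratum and 4 from the smaller one.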
Sampling Methods: Selection
• Systematic Sampling:
• Population is large – considerable effort to randomly select
• Stratified Sampling:
• Items in each stratum homogeneous - Low variances
• Relatively smaller sample size than simple random sampling
• Cluster Sampling:
• Items in each cluster are heterogeneous
• Clusters are representative of the entire Population
• Requires larger sample
Sampling Errors
• Sample does not represent target population
(e. g. selecting inappropriate sampling method)
• Inherent error: a sample is only a subset of the population
• Depends on size of Sample relative to population
• Accuracy of estimates
• Trade-off: cost/time versus accuracy
Sampling From Finite Populations
• Finite populations are usually sampled without replacement
• Statistical theory assumes samples are selected with replacement
• When n < .05 N, the difference is insignificant
• Otherwise a correction factor is needed
• Standard error of the mean (with the finite population correction):
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
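As a rough sketch of the rule above, the correction factor can be applied only when the sample is at least 5% of the population. The function name and the numbers in the usage lines are hypothetical.

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the mean; applies the finite population
    correction sqrt((N - n) / (N - 1)) when n is at least 5% of N."""
    se = sigma / math.sqrt(n)
    if N is not None and n / N >= 0.05:
        se *= math.sqrt((N - n) / (N - 1))
    return se

# Hypothetical values: sigma = 10, n = 25
se_infinite = standard_error(10, 25)            # no population size given
se_small_pop = standard_error(10, 25, N=100)    # n/N = .25, correction applied
se_large_pop = standard_error(10, 25, N=10000)  # n/N = .0025, no correction
```

The correction always shrinks the standard error, because sampling a large share of a finite population leaves less uncertainty about its mean.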
Statistical Analysis of Sample Data
• Estimation of population parameters (PP)
• Development of confidence intervals for PP
• Probability that the interval correctly estimates
true population parameter
• Means to compare alternative decisions/process
(comparing transmission production processes)
• Hypothesis testing: validate differences among PP
Estimation Process
[Diagram: a population whose mean, $\mu$, is unknown; a random sample drawn from it has mean $\bar{X}$ = 50; conclusion: "I am 95% confident that $\mu$ is between 40 and 60."]
Population Parameters Estimated
Population Parameter    Symbol        Point Estimate
Mean                    $\mu$         $\bar{X}$
Proportion              $p$           $p_s$
Variance                $\sigma^2$    $s^2$
Std. Dev.               $\sigma$      $s$
Confidence Interval Estimation
• Provides Range of Values

Based on Observations from Sample
• Gives Information about Closeness
to Unknown Population Parameter
• Stated in terms of Probability
Never 100% Sure
Elements of Confidence Interval Estimation
A probability that the population parameter falls somewhere within the interval.
[Diagram: the confidence interval runs from the lower confidence limit to the upper confidence limit, with the sample statistic between them.]
Example of Confidence Interval Estimation
Example: 90 % CI for the mean is 10 ± 2.
Point Estimate = 10
Margin of Error = 2
CI = [8,12]
Level of Confidence = 1 - $\alpha$ = 0.9
Probability that true PP is not in this CI = $\alpha$ = 0.1
Confidence Limits for Population Mean
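Parameter = Statistic ± Its Error
$\mu = \bar{X} \pm \text{Error}$
$\text{Error} = \bar{X} - \mu$
$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\text{Error}}{\sigma_{\bar{X}}}$
$\text{Error} = Z \sigma_{\bar{X}}$
$\mu = \bar{X} \pm Z \sigma_{\bar{X}}$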
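Confidence Intervals
$\bar{X} \pm Z \sigma_{\bar{X}} = \bar{X} \pm Z \frac{\sigma}{\sqrt{n}}$
[Figure: sampling distribution of $\bar{X}$, centered at $\mu$]
• $\mu \pm 1.645\,\sigma_{\bar{X}}$ contains 90% of samples
• $\mu \pm 1.96\,\sigma_{\bar{X}}$ contains 95% of samples
• $\mu \pm 2.58\,\sigma_{\bar{X}}$ contains 99% of samples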
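The multipliers 1.645, 1.96, and 2.58 are the standard normal values that leave $\alpha/2$ in each tail. A small sketch using Python's standard library `statistics.NormalDist` recovers them; the helper name `z_critical` is an illustrative choice.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Z value that leaves alpha/2 in each tail of the standard normal."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
# 90%: z = 1.645
# 95%: z = 1.960
# 99%: z = 2.576
```

The 99% value is 2.576 to three decimals, which the slide rounds to 2.58.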
Level of Confidence
• Probability that the unknown population parameter falls within the interval
• Denoted (1 - $\alpha$) × 100% = level of confidence, e.g. 90%, 95%, 99%
• $\alpha$ is the probability that the parameter is not within the interval
Intervals & Level of Confidence
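[Figure: sampling distribution of the mean, centered at $\mu_{\bar{X}} = \mu$ with standard error $\sigma_{\bar{X}}$; the central area is $1 - \alpha$, with $\alpha/2$ in each tail]
• Intervals extend from $\bar{X} - Z\sigma_{\bar{X}}$ to $\bar{X} + Z\sigma_{\bar{X}}$
• (1 - $\alpha$) × 100% of intervals contain $\mu$; $\alpha$ × 100% do not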
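The claim that a fixed share of intervals contains the true mean can be checked by simulation. The sketch below draws repeated samples from a normal population with known standard deviation and counts how often the interval covers the mean; the population values, trial count, and seed are arbitrary assumptions.

```python
import random
from statistics import NormalDist, mean

def coverage(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated intervals X-bar +/- Z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        x_bar = mean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials

# Hypothetical population: mu = 50, sigma = 8; samples of n = 25
rate = coverage(mu=50, sigma=8, n=25, confidence=0.95, trials=2000)
print(f"Share of intervals containing mu: {rate:.3f}")
```

The printed share should land close to 0.95, with some run-to-run variation if the seed is changed.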
Factors Affecting Interval Width
• Data variation, measured by $\sigma$
• Sample size: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
• Level of confidence, (1 - $\alpha$)
Intervals extend from $\bar{X} - Z\sigma_{\bar{X}}$ to $\bar{X} + Z\sigma_{\bar{X}}$
Confidence Interval Estimates
[Diagram: confidence intervals are covered for the mean ($\sigma$ known and $\sigma$ unknown), for the proportion, and for finite populations]
Confidence Intervals ($\sigma$ Known)
• Assumptions
• Population standard deviation is known
• Population is normally distributed
• If not normal, use large samples
• Confidence Interval Estimate
$\bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
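A minimal sketch of the known-standard-deviation interval follows. The function name and the input values (sample mean 50, known standard deviation 8, n = 25) are hypothetical, chosen only to show the calculation.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical inputs: X-bar = 50, sigma = 8 (treated as known), n = 25
low, high = z_interval(50, 8, 25, confidence=0.95)
print(f"{low:.2f} <= mu <= {high:.2f}")
```

Raising the confidence level or the standard deviation widens the interval; raising n narrows it.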
Confidence Intervals ($\sigma$ Unknown)
• Assumptions
• Population standard deviation is unknown
• Population must be normally distributed
• Use Student's t distribution
• Confidence Interval Estimate
$\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}}$
Student's t Distribution
• Shape similar to the Normal distribution
• Different t distributions based on df
• Has a larger variance than the Normal
• Larger sample size: t approaches the Normal
• At n = 120, virtually the same
• For a normal population with $\sigma$ estimated by S, the standardized sample mean follows Student's t for any sample size
• For unknown $\sigma$ and when in doubt, use t
Student’s t Distribution
[Figure: the standard normal (Z) curve and t curves with df = 13 and df = 5, all centered at 0. The t curves are bell-shaped and symmetric, with "fatter" tails than the normal.]
Degrees of Freedom (df)
• Number of observations that are free to vary after the sample mean has been calculated
• Example: mean of 3 numbers is 2
X1 = 1 (or any number)
X2 = 2 (or any number)
X3 = 3 (cannot vary)
Mean = 2
degrees of freedom = n - 1 = 3 - 1 = 2
Student’s t Table
Assume: n = 3, df = n - 1 = 2, $\alpha$ = .10, $\alpha$/2 = .05

        Upper Tail Area
df      .25      .10      .05
1       1.000    3.078    6.314
2       0.817    1.886    2.920
3       0.765    1.638    2.353

t value with upper tail area .05 and df = 2: t = 2.920
Example: Interval Estimation $\sigma$ Unknown
A random sample of n = 25 has $\bar{X}$ = 50 and S = 8. Set up a 95% confidence interval estimate for $\mu$.
$\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}}$
$50 - 2.0639 \frac{8}{\sqrt{25}} \le \mu \le 50 + 2.0639 \frac{8}{\sqrt{25}}$
$46.70 \le \mu \le 53.30$
Example: Tracway Transmission
Sample of n = 30, $\bar{X}$ = 289.6, S = 45.4. Find a 99% CI for $\mu$, the mean of each transmission system process. Therefore $\alpha$ = .01 and $\alpha$/2 = .005.
$t_{\alpha/2,\,n-1} = t_{.005,\,29} = 2.7564$
$\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}}$
$289.6 - 2.7564 \frac{45.4}{\sqrt{30}} \le \mu \le 289.6 + 2.7564 \frac{45.4}{\sqrt{30}}$
$266.75 \le \mu \le 312.45$
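Both t-interval examples above can be reproduced with a few lines of Python. This sketch takes the critical value from the t table as an input (2.0639 for df = 24 and 2.7564 for df = 29, as quoted in the slides) rather than computing it, since the standard library has no t distribution; the function name is an illustrative choice.

```python
import math

def t_interval(x_bar, s, n, t_crit):
    """Confidence interval for the mean using a tabled t critical value."""
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# n = 25, X-bar = 50, S = 8, 95% confidence, t(.025, 24) = 2.0639
low1, high1 = t_interval(50, 8, 25, 2.0639)
print(f"{low1:.2f} <= mu <= {high1:.2f}")   # 46.70 <= mu <= 53.30

# Tracway: n = 30, X-bar = 289.6, S = 45.4, 99% confidence, t(.005, 29) = 2.7564
low2, high2 = t_interval(289.6, 45.4, 30, 2.7564)
print(f"{low2:.2f} <= mu <= {high2:.2f}")   # 266.75 <= mu <= 312.45
```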
Estimation for Finite Populations
• Assumptions
• Sample is large relative to population
• n / N > .05
• Use finite population correction factor
• Confidence Interval (Mean, $\sigma$ Unknown)
$\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
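To see how much the correction factor narrows an interval, the sketch below reuses the earlier sample values (n = 25, sample mean 50, S = 8, tabled t = 2.0639) and adds a hypothetical population size N = 200, which is not in the slides and serves only as an illustration.

```python
import math

def t_interval_finite(x_bar, s, n, N, t_crit):
    """t interval for the mean with the finite population correction."""
    fpc = math.sqrt((N - n) / (N - 1))
    margin = t_crit * s / math.sqrt(n) * fpc
    return x_bar - margin, x_bar + margin

# Hypothetical N = 200, so n / N = .125 > .05 and the correction applies
low, high = t_interval_finite(50, 8, 25, 200, 2.0639)
print(f"{low:.2f} <= mu <= {high:.2f}")
```

Under these assumed numbers the interval is somewhat narrower than the uncorrected one of roughly 46.70 to 53.30.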
Confidence Interval Estimate Proportion
• Assumptions
• Two categorical outcomes
• Population follows binomial distribution
• Normal approximation can be used if n·p ≥ 5 and n·(1 - p) ≥ 5
• Confidence Interval Estimate
$p_s - Z_{\alpha/2} \sqrt{\frac{p_s (1 - p_s)}{n}} \le p \le p_s + Z_{\alpha/2} \sqrt{\frac{p_s (1 - p_s)}{n}}$
Example: Estimating Proportion
A random sample of 1000 Voters showed
51% voted for Candidate A. Set up a 90%
confidence interval estimate for p.
$p_s - Z_{\alpha/2} \sqrt{\frac{p_s (1 - p_s)}{n}} \le p \le p_s + Z_{\alpha/2} \sqrt{\frac{p_s (1 - p_s)}{n}}$
$.51 - 1.645 \sqrt{\frac{.51 (1 - .51)}{1000}} \le p \le .51 + 1.645 \sqrt{\frac{.51 (1 - .51)}{1000}}$
$.484 \le p \le .536$
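The voter example can be checked with a short Python sketch. The function name is an illustrative choice, and the check on n·p and n·(1 - p) uses the sample proportion in place of the unknown p, which is an assumption of this sketch.

```python
import math
from statistics import NormalDist

def proportion_interval(p_s, n, confidence):
    """Normal-approximation confidence interval for a proportion."""
    # Assumption: test the approximation conditions using the sample proportion
    if n * p_s < 5 or n * (1 - p_s) < 5:
        raise ValueError("normal approximation conditions not met")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_s * (1 - p_s) / n)
    return p_s - margin, p_s + margin

# 1000 voters, 51% for Candidate A, 90% confidence
low, high = proportion_interval(0.51, 1000, 0.90)
print(f"{low:.3f} <= p <= {high:.3f}")   # 0.484 <= p <= 0.536
```

Because the interval includes .50, this sample alone does not settle whether Candidate A holds a majority.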
Sample Size
• Too big: requires too many resources
• Too small: won't do the job
Example: Sample Size for Mean
What sample size is needed to be 90% confident of being correct within ± 5? A pilot study suggested that the standard deviation is 45.
$n = \frac{Z^2 \sigma^2}{\text{Error}^2} = \frac{1.645^2 \cdot 45^2}{5^2} = 219.2 \approx 220$ (round up)
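The same sample-size calculation can be written as a small Python helper. The function name is illustrative, and the Z value is computed rather than read from a table, so it differs from 1.645 only in later decimals.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, error, confidence):
    """Smallest n such that Z * sigma / sqrt(n) <= error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / error) ** 2)

n_needed = sample_size_for_mean(sigma=45, error=5, confidence=0.90)
print(n_needed)   # 220
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target.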
Example: Sample Size for Proportion
What sample size is needed to be within ± 5% (.05) with 90% confidence? Out of a population of 1,000, we randomly selected 100, of which 30 were defective.
$n = \frac{Z^2 p (1 - p)}{\text{Error}^2} = \frac{1.645^2 (.30)(.70)}{.05^2} = 227.3 \approx 228$ (round up)
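A matching sketch for the proportion case follows, using the pilot estimate of 30 defectives out of 100. The function name is illustrative, and no finite population correction is applied, in line with the calculation above.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(p, error, confidence):
    """Smallest n such that Z * sqrt(p * (1 - p) / n) <= error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / error ** 2)

n_defects = sample_size_for_proportion(p=0.30, error=0.05, confidence=0.90)
print(n_defects)   # 228
```

When no pilot estimate is available, p = 0.5 gives the largest (most conservative) sample size, since p(1 - p) peaks there.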
Hypothesis Testing
• Draw inferences about two contrasting propositions (hypotheses)
• Determine whether two means are equal:
1. Formulate the hypothesis to test
2. Select a level of significance
3. Determine a decision rule as a basis for the conclusion
4. Collect data and calculate a test statistic
5. Apply the decision rule to draw a conclusion
Hypothesis Formulation
• Null hypothesis: H0 representing status quo
• Alternative hypothesis: H1
• Assumes that H0 is true
• Sample evidence is obtained to determine
whether H1 is more likely to be true
Significance Level
Test result    H0 True         H0 False
Accept H0      Correct         Type II Error
Reject H0      Type I Error    Correct
Probability of making Type I error = $\alpha$ = level of significance
Confidence coefficient = 1 - $\alpha$
Probability of making Type II error = $\beta$
Power of the test = 1 - $\beta$
Decision Rules
• Sampling Distribution: Normal or t distribution
• Rejection Region
• Non Rejection Region
• Two-tailed test: $\alpha$/2 in each tail
• One-tailed test: $\alpha$ in one tail
• P-Values
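As one possible illustration of a two-tailed decision rule and its p-value, the sketch below tests whether two means are equal using a normal sampling distribution. It assumes both population standard deviations are known, and the function name and all input numbers are hypothetical rather than taken from the slides.

```python
import math
from statistics import NormalDist

def two_sample_z_test(mean1, mean2, sigma1, sigma2, n1, n2, alpha=0.05):
    """Two-tailed test of H0: mu1 = mu2, assuming known population sigmas."""
    se = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    z = (mean1 - mean2) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)       # alpha/2 in each tail
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    reject = abs(z) > z_crit                           # rejection region
    return z, p_value, reject

# Hypothetical data: sample means 52 and 50, sigma = 8 for both, n = 40 each
z, p_value, reject = two_sample_z_test(52, 50, 8, 8, 40, 40, alpha=0.05)
print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject}")
```

Comparing the p-value with $\alpha$ gives the same decision as comparing the test statistic with the critical value.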
Hypothesis Testing: Cases
• Two-Sample Means
• F-Test for Variances
• Proportions
• ANOVA: Differences of several means
• Chi-square for independence
Chapter Summary
• Sampling: Design and Methods
• Estimation:
• Confidence Interval Estimation for Mean ($\sigma$ Known)
• Confidence Interval Estimation for Mean ($\sigma$ Unknown)
• Confidence Interval Estimation for Proportion
Chapter Summary
• Finite Populations
• Student’s t distribution
• Sample Size Estimation
• Hypothesis Testing
• Significance Levels: Type I/II errors
• ANOVA