Sampling and Statistical Analysis for Decision Making
A. A. Elimam
College of Business
San Francisco State University

Chapter Topics
• Sampling: Design and Methods
• Estimation:
  • Confidence Interval Estimation for the Mean (σ Known)
  • Confidence Interval Estimation for the Mean (σ Unknown)
  • Confidence Interval Estimation for the Proportion

Chapter Topics
• The Situation of Finite Populations
• Student’s t distribution
• Sample Size Estimation
• Hypothesis Testing
• Significance Levels
• ANOVA

Statistical Sampling
• Sampling: Valuable tool
• Population:
  • Too large to deal with effectively or practically
  • Impossible or too expensive to obtain all data
• Collect sample data to draw conclusions about unknown population

Sample design
• Representative Samples of the population
• Sampling Plan: Approach to obtain samples
• Sampling Plan: States
  • Objectives
  • Target population
  • Population frame
  • Method of sampling
  • Data collection procedure
  • Statistical analysis tools

Objectives
• Estimate population parameters such as a mean, proportion or standard deviation
• Identify if significant difference exists between two populations

Population Frame
• List of all members of the target population

Sampling Methods
• Subjective Sampling:
  • Judgment: select the sample (best customers)
  • Convenience: ease of sampling
• Probabilistic Sampling:
  • Simple Random Sampling
    • With Replacement
    • Without Replacement

Sampling Methods
• Systematic Sampling:
  • Selects items periodically from population.
  • First item randomly selected - may produce bias
  • Example: pick one sample every 7 days
• Stratified Sampling:
  • Populations divided into natural strata
  • Allocates proper proportion of samples to each stratum
  • Each stratum weighted by its size; cost or significance of certain strata might suggest different allocation
  • Example: sampling of political districts - wards

Sampling Methods
• Cluster Sampling:
  • Populations divided into clusters, then randomly sampled
  • Items within each cluster become members of the sample
  • Example: segment customers for each geographical location
• Sampling Using Excel:
  • Population listed in spreadsheet
  • Periodic
  • Random

Sampling Methods: Selection
• Systematic Sampling:
  • Population is large: considerable effort to randomly select
• Stratified Sampling:
  • Items in each stratum homogeneous - Low variances
  • Relatively smaller sample size than simple random sampling
• Cluster Sampling:
  • Items in each cluster are heterogeneous
  • Clusters are representative of the entire Population
  • Requires larger sample

Sampling Errors
• Sample does not represent target population (e.g.
  selecting inappropriate sampling method)
• Inherent error: samples are only a subset of the population
• Depends on size of Sample relative to population
• Accuracy of estimates
• Trade-off: cost/time versus accuracy

Sampling From Finite Populations
• Finite without replacement (R)
• Statistical theory assumes: samples selected with R
• When n < .05 N, the difference is insignificant
• Otherwise need a correction factor
• Standard error of the mean

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$

Statistical Analysis of Sample Data
• Estimation of population parameters (PP)
• Development of confidence intervals for PP
• Probability that the interval correctly estimates true population parameter
• Means to compare alternative decisions/process (comparing transmission production processes)
• Hypothesis testing: validate differences among PP

Estimation Process
• Population: mean, μ, is unknown
• Random sample: mean X̄ = 50
• Conclusion: "I am 95% confident that μ is between 40 & 60."

Population Parameters Estimated

  Quantity      Population Parameter   Point Estimate
  Mean          μ                      X̄
  Proportion    p                      p_s
  Variance      σ²                     s²
  Std. Dev.     σ                      s

Confidence Interval Estimation
• Provides Range of Values
  • Based on Observations from Sample
• Gives Information about Closeness to Unknown Population Parameter
• Stated in terms of Probability
  • Never 100% Sure

Elements of Confidence Interval Estimation
• A Probability That the Population Parameter Falls Somewhere Within the Interval.
[Figure: a confidence interval around the sample statistic, running from the lower confidence limit to the upper confidence limit]

Example of Confidence Interval Estimation
Example: 90% CI for the mean is 10 ± 2.
• Point Estimate = 10
• Margin of Error = 2
• CI = [8, 12]
• Level of Confidence = 1 - α = 0.9
• Probability that true PP is not in this CI = 0.1

Confidence Limits for Population Mean
Parameter = Statistic ± Its Error

$$\mu = \bar{X} \pm \text{Error}$$

$$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\text{Error}}{\sigma_{\bar{X}}}$$

$$\text{Error} = Z\,\sigma_{\bar{X}}$$

$$\mu = \bar{X} \pm Z\,\sigma_{\bar{X}}$$

Confidence Intervals

$$\bar{X} \pm Z\,\sigma_{\bar{X}} = \bar{X} \pm Z\,\frac{\sigma}{\sqrt{n}}$$

[Figure: sampling distribution of X̄ centered at μ, with standard error σ_X̄ = σ/√n]
• 90% of samples: X̄ lies between μ - 1.645 σ_X̄ and μ + 1.645 σ_X̄
• 95% of samples: X̄ lies between μ - 1.96 σ_X̄ and μ + 1.96 σ_X̄
• 99% of samples: X̄ lies between μ - 2.58 σ_X̄ and μ + 2.58 σ_X̄

Level of Confidence
• Probability that the unknown population parameter falls within the interval
• Denoted 100(1 - α)% = level of confidence, e.g. 90%, 95%, 99%
  • α Is Probability That the Parameter Is Not Within the Interval

Intervals & Level of Confidence
[Figure: sampling distribution of the mean, centered at μ_X̄ = μ with standard error σ_X̄; central area 1 - α, area α/2 in each tail]
• Intervals Extend from X̄ - Z σ_X̄ to X̄ + Z σ_X̄
• 100(1 - α)% of Intervals Contain μ. 100α% Do Not.

Factors Affecting Interval Width
• Data Variation: measured by σ
• Sample Size: σ_X̄ = σ_X / √n
• Level of Confidence: (1 - α)
• Intervals Extend from X̄ - Z σ_X̄ to X̄ + Z σ_X̄

Confidence Interval Estimates
• Confidence Intervals
  • Mean
    • σ Known
    • σ Unknown
  • Proportion
  • Finite Population

Confidence Intervals (σ Known)
• Assumptions
  • Population Standard Deviation is Known
  • Population is Normally Distributed
  • If Not Normal, use large samples
• Confidence Interval Estimate

$$\bar{X} - Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

Confidence Intervals (σ Unknown)
• Assumptions
  • Population Standard Deviation is Unknown
  • Population Must Be Normally Distributed
• Use Student’s t Distribution
• Confidence Interval Estimate

$$\bar{X} - t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}}$$

Student’s t Distribution
• Shape similar to Normal Distribution
• Different t distributions based on df
• Has a larger variance than Normal
• Larger Sample size: t approaches Normal
• At n = 120 - virtually the same
• For any sample size, the standardized sample mean (X̄ - μ)/(S/√n) follows Student’s t when the population is normal
• For unknown σ and when in doubt use t

Student’s t Distribution
[Figure: standard normal curve (Z) compared with t (df = 13) and t (df = 5), all centered at 0]
• Bell-Shaped
• Symmetric
• ‘Fatter’ Tails

Degrees of Freedom (df)
• Number of Observations that Are Free to Vary After Sample Mean Has Been Calculated
• Example
  • Mean of 3 Numbers Is 2
    X1 = 1 (or Any Number)
    X2 = 2 (or Any Number)
    X3 = 3 (Cannot Vary)
    Mean = 2
  • degrees of freedom = n - 1 = 3 - 1 = 2

Student’s t Table
Assume: n = 3, df = n - 1 = 2, α = .10, α/2 = .05

        Upper Tail Area
  df    .25      .10      .05
  1     1.000    3.078    6.314
  2     0.817    1.886    2.920
  3     0.765    1.638    2.353

The table body holds t values: for df = 2 and an upper tail area of .05, t = 2.920.

Example: Interval Estimation σ Unknown
A random sample of n = 25 has X̄ = 50 and S = 8. Set up a 95% confidence interval estimate for μ.

$$\bar{X} - t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}}$$

$$50 - 2.0639\,\frac{8}{\sqrt{25}} \le \mu \le 50 + 2.0639\,\frac{8}{\sqrt{25}}$$

$$46.70 \le \mu \le 53.30$$

Example: Tracway Transmission
Sample of n = 30 with X̄ = 289.6 and S = 45.4. Find a 99% CI for μ, the mean of each transmission system process.
Therefore α = .01 and α/2 = .005, so

$$t_{\alpha/2,\,n-1} = t_{.005,\,29} = 2.7564$$

$$\bar{X} - t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}}$$

$$289.6 - 2.7564\,\frac{45.4}{\sqrt{30}} \le \mu \le 289.6 + 2.7564\,\frac{45.4}{\sqrt{30}}$$

$$266.75 \le \mu \le 312.45$$

Estimation for Finite Populations
• Assumptions
  • Sample Is Large Relative to Population
    • n / N > .05
• Use Finite Population Correction Factor
• Confidence Interval (Mean, σ_X Unknown)

$$\bar{X} - t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$$

Confidence Interval Estimate Proportion
• Assumptions
  • Two Categorical Outcomes
  • Population Follows Binomial Distribution
  • Normal Approximation Can Be Used
    • n·p ≥ 5 & n·(1 - p) ≥ 5
• Confidence Interval Estimate

$$p_s - Z_{\alpha/2}\sqrt{\frac{p_s(1 - p_s)}{n}} \le p \le p_s + Z_{\alpha/2}\sqrt{\frac{p_s(1 - p_s)}{n}}$$

Example: Estimating Proportion
A random sample of 1000 Voters showed 51% voted for Candidate A. Set up a 90% confidence interval estimate for p.

$$.51 - 1.645\sqrt{\frac{.51(1 - .51)}{1000}} \le p \le .51 + 1.645\sqrt{\frac{.51(1 - .51)}{1000}}$$

$$.484 \le p \le .536$$

Sample Size
• Too Big: Requires too many resources
• Too Small: Won’t do the job

Example: Sample Size for Mean
What sample size is needed to be 90% confident of being correct within ± 5? A pilot study suggested that the standard deviation is 45.

$$n = \frac{Z^2 \sigma^2}{\text{Error}^2} = \frac{1.645^2 \cdot 45^2}{5^2} = 219.2 \approx 220 \quad \text{(Round Up)}$$

Example: Sample Size for Proportion
What sample size is needed to be within ± 5% with 90% confidence? Out of a population of 1,000, we randomly selected 100 of which 30 were defective.

$$n = \frac{Z^2\, p(1 - p)}{\text{Error}^2} = \frac{1.645^2 (.30)(.70)}{.05^2} = 227.3 \approx 228 \quad \text{(Round Up)}$$

Hypothesis Testing
• Draw inferences about two contrasting propositions (hypotheses)
• Determine whether two means are equal:
  1. Formulate the hypothesis to test
  2. Select a level of significance
  3. Determine a decision rule as a base to conclusion
  4. Collect data and calculate a test statistic
  5. Apply the decision rule to draw conclusion

Hypothesis Formulation
• Null hypothesis: H0 representing status quo
• Alternative hypothesis: H1
• Assumes that H0 is true
• Sample evidence is obtained to determine whether H1 is more likely to be true

Significance Level

  Test      H0 True             H0 False
  Accept    Correct decision    Type II Error
  Reject    Type I Error        Correct decision

• Probability of making Type I error = α = level of significance
• Confidence Coefficient = 1 - α
• Probability of making Type II error = β
• Power of the test = 1 - β

Decision Rules
• Sampling Distribution: Normal or t distribution
• Rejection Region
• Non Rejection Region
• Two-tailed test, α/2
• One-tailed test, α
• P-Values

Hypothesis Testing: Cases
• Two-Sample Means
• F-Test for Variances
• Proportions
• ANOVA: Differences of several means
• Chi-square for independence

Chapter Summary
• Sampling: Design and Methods
• Estimation:
  • Confidence Interval Estimation for Mean (σ Known)
  • Confidence Interval Estimation for Mean (σ Unknown)
  • Confidence Interval Estimation for Proportion

Chapter Summary
• Finite Populations
• Student’s t distribution
• Sample Size Estimation
• Hypothesis Testing
• Significance Levels: Type I/II errors
• ANOVA
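
The interval-estimation examples in this chapter can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function names (`z_critical`, `mean_interval`, `proportion_interval`) are hypothetical. It assumes the t critical value is read from a t table, as the slides do, because the Python standard library has no t quantile function; the Z value comes from `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided standard normal critical value Z_{alpha/2}."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def mean_interval(xbar, s, n, critical, population_size=None):
    """Interval xbar +/- critical * s / sqrt(n).

    `critical` is Z_{alpha/2} (sigma known) or t_{alpha/2, n-1} taken
    from a table (sigma unknown). If `population_size` is given, the
    finite population correction sqrt((N - n) / (N - 1)) is applied.
    """
    std_error = s / math.sqrt(n)
    if population_size is not None:
        std_error *= math.sqrt((population_size - n) / (population_size - 1))
    margin = critical * std_error
    return xbar - margin, xbar + margin


def proportion_interval(p_s, n, confidence):
    """Normal-approximation interval for a proportion."""
    margin = z_critical(confidence) * math.sqrt(p_s * (1.0 - p_s) / n)
    return p_s - margin, p_s + margin


if __name__ == "__main__":
    # n = 25, mean 50, S = 8, t_{.025,24} = 2.0639 (table value from the slides)
    print(mean_interval(50, 8, 25, 2.0639))
    # Tracway: n = 30, mean 289.6, S = 45.4, t_{.005,29} = 2.7564
    print(mean_interval(289.6, 45.4, 30, 2.7564))
    # 1000 voters, 51% for Candidate A, 90% confidence
    print(proportion_interval(0.51, 1000, 0.90))
```

Passing `population_size` narrows the interval, which matches the role of the correction factor when n / N > .05.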
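
The two sample-size examples follow the same pattern: solve the margin-of-error formula for n and round up. A short Python sketch under that assumption is given below; the function names are hypothetical, and the Z value is computed with `statistics.NormalDist` instead of the rounded table value 1.645, which does not change the rounded-up answers here.

```python
import math
from statistics import NormalDist


def _z(confidence):
    """Two-sided standard normal critical value Z_{alpha/2}."""
    return NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)


def sample_size_for_mean(sigma, error, confidence):
    """Smallest n with Z * sigma / sqrt(n) <= error."""
    return math.ceil((_z(confidence) * sigma / error) ** 2)


def sample_size_for_proportion(p, error, confidence):
    """Smallest n with Z * sqrt(p * (1 - p) / n) <= error."""
    return math.ceil(_z(confidence) ** 2 * p * (1.0 - p) / error ** 2)


if __name__ == "__main__":
    # Pilot standard deviation 45, margin of error 5, 90% confidence
    print(sample_size_for_mean(45, 5, 0.90))
    # Pilot proportion 30/100 = .30, margin of error .05, 90% confidence
    print(sample_size_for_proportion(0.30, 0.05, 0.90))
```

Rounding up with `math.ceil` is deliberate: rounding down would give a margin of error slightly larger than the one requested.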
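
The probabilistic sampling methods listed at the start of the chapter (simple random, systematic, stratified) can also be sketched in a few lines. This Python example is illustrative only and is not from the slides, which use Excel; the function names are hypothetical, and the stratified version assumes proportional allocation by stratum size.

```python
import random


def simple_random_sample(population, n, rng, replacement=False):
    """Draw n items at random, with or without replacement."""
    if replacement:
        return [rng.choice(population) for _ in range(n)]
    return rng.sample(population, n)


def systematic_sample(population, n, rng):
    """Random start, then every k-th item, with k = len(population) // n."""
    k = len(population) // n
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]


def stratified_sample(strata, n, rng):
    """Proportional allocation: each stratum contributes by its size.

    Rounding the shares means the total can differ slightly from n
    when the stratum sizes do not divide evenly.
    """
    total = sum(len(items) for items in strata.values())
    sample = {}
    for name, items in strata.items():
        share = round(n * len(items) / total)
        sample[name] = rng.sample(items, share)
    return sample


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same draw
    population = list(range(100))
    print(simple_random_sample(population, 10, rng))
    print(systematic_sample(population, 10, rng))
    strata = {"ward_a": list(range(60)), "ward_b": list(range(60, 100))}
    print(stratified_sample(strata, 10, rng))
```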