Sampling and Statistical Analysis for Decision Making
A. A. Elimam
College of Business
San Francisco State University

Chapter Topics
• Sampling: Design and Methods
• Estimation:
  • Confidence Interval Estimation for the Mean (σ Known)
  • Confidence Interval Estimation for the Mean (σ Unknown)
  • Confidence Interval Estimation for the Proportion
• The Situation of Finite Populations
• Student's t distribution
• Sample Size Estimation
• Hypothesis Testing
• Significance Levels
• ANOVA

Statistical Sampling
• Sampling: Valuable tool
• Population:
  • Too large to deal with effectively or practically
  • Impossible or too expensive to obtain all data
• Collect sample data to draw conclusions about the unknown population

Sample Design
• Representative samples of the population
• Sampling plan: approach to obtain samples
• Sampling plan states:
  • Objectives
  • Target population
  • Population frame
  • Method of sampling
  • Data collection procedure
  • Statistical analysis tools

Objectives
• Estimate population parameters such as a mean, proportion or standard deviation
• Identify whether a significant difference exists between two populations

Population Frame
• List of all members of the target population

Sampling Methods
• Subjective Sampling:
  • Judgment: select the sample (best customers)
  • Convenience: ease of sampling
• Probabilistic Sampling:
  • Simple Random Sampling
    • With replacement
    • Without replacement
• Systematic Sampling:
  • Selects items periodically from population
  • First item randomly selected; may produce bias
  • Example: pick one sample every 7 days
• Stratified Sampling:
  • Population divided into natural strata
  • Allocates proper proportion of samples to each stratum
  • Each stratum weighted by its size; cost or significance of certain strata might suggest a different allocation
  • Example: sampling of political districts (wards)
• Cluster Sampling:
  • Population divided into clusters, then clusters are randomly sampled
  • Items within each selected cluster become members of the sample
  • Example: segment customers for each geographical location
• Sampling Using Excel:
  • Population listed in spreadsheet
  • Periodic
  • Random

Sampling Methods: Selection
• Systematic Sampling:
  • Population is large; considerable effort to randomly select
• Stratified Sampling:
  • Items in each stratum homogeneous (low variances)
  • Relatively smaller sample size than simple random sampling
• Cluster Sampling:
  • Items in each cluster are heterogeneous
  • Clusters are representative of the entire population
  • Requires larger sample

Sampling Errors
• Sample does not represent target population (e.g.
  selecting an inappropriate sampling method)
• Inherent error: a sample is only a subset of the population
• Depends on size of sample relative to population
• Accuracy of estimates
• Trade-off: cost/time versus accuracy

Sampling From Finite Populations
• Finite population, sampled without replacement (R)
• Statistical theory assumes samples are selected with R
• When n < .05 N, the difference is insignificant
• Otherwise a correction factor is needed
• Standard error of the mean:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$

Statistical Analysis of Sample Data
• Estimation of population parameters (PP)
• Development of confidence intervals for PP
  • Probability that the interval correctly estimates the true population parameter
• Means to compare alternative decisions/processes (e.g. comparing transmission production processes)
• Hypothesis testing: validate differences among PP

Estimation Process
[Figure: the population mean μ is unknown; a random sample gives sample mean X̄ = 50; conclusion: "I am 95% confident that μ is between 40 & 60."]

Population Parameters Estimated
Population Parameter    Point Estimate
Mean, μ                 X̄
Proportion, p           p_s
Variance, σ²            S²
Std. Dev., σ            S

Confidence Interval Estimation
• Provides a range of values based on observations from a sample
• Gives information about closeness to the unknown population parameter
• Stated in terms of probability: never 100% sure

Elements of Confidence Interval Estimation
• A probability that the population parameter falls somewhere within the interval
[Figure: a confidence interval around the sample statistic, running from the lower confidence limit to the upper confidence limit]

Example of Confidence Interval Estimation
Example: 90% CI for the mean is 10 ± 2.
• Point Estimate = 10
• Margin of Error = 2
• CI = [8, 12]
• Level of Confidence = 1 − α = 0.9
• Probability that the true PP is not in this CI = α = 0.1

Confidence Limits for Population Mean
Parameter = Statistic ± Its Error

$$\mu = \bar{X} \pm \text{Error}$$
$$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\text{Error}}{\sigma_{\bar{X}}}$$
$$\text{Error} = Z \sigma_{\bar{X}}$$
$$\mu = \bar{X} \pm Z \sigma_{\bar{X}}$$

Confidence Intervals
The interval is X̄ ± Zσ_X̄, where

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

• 90% of sample means fall between μ − 1.645σ_X̄ and μ + 1.645σ_X̄
• 95% of sample means fall between μ − 1.96σ_X̄ and μ + 1.96σ_X̄
• 99% of sample means fall between μ − 2.58σ_X̄ and μ + 2.58σ_X̄

Level of Confidence
• Probability that the unknown population parameter falls within the interval
• Denoted 100(1 − α)% = level of confidence, e.g. 90%, 95%, 99%
• α is the probability that the parameter is not within the interval

Intervals & Level of Confidence
[Figure: sampling distribution of the mean, centered at μ_X̄ = μ with standard deviation σ_X̄; central area 1 − α, area α/2 in each tail]
• Intervals extend from X̄ − Zσ_X̄ to X̄ + Zσ_X̄
• 100(1 − α)% of intervals contain μ; 100α% do not

Factors Affecting Interval Width
• Data variation, measured by σ
• Sample size: σ_X̄ = σ/√n
• Level of confidence (1 − α)

Confidence Interval Estimates
[Diagram: confidence intervals are covered for the mean (σ known, σ unknown), the proportion, and finite populations]

Confidence Intervals (σ Known)
• Assumptions
  • Population standard deviation is known
  • Population is normally distributed
  • If not normal, use large samples
• Confidence interval estimate:

$$\bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

Confidence Intervals (σ Unknown)
• Assumptions
  • Population standard deviation is unknown
  • Population must be normally distributed
• Use Student's t distribution
• Confidence interval estimate:

$$\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}}$$

Student's t Distribution
• Shape similar to the Normal distribution
• Different t distributions based on df
• Has a larger variance than the Normal
• Larger sample size: t approaches the Normal
• At n = 120, virtually the same
• For any sample size, when σ is unknown and the population is normal, the standardized sample mean (X̄ − μ)/(S/√n) follows Student's t with n − 1 df
• For unknown σ, and when in doubt, use t
[Figure: standard normal (Z) and t (df = 13, df = 5) densities centered at 0; all bell-shaped and symmetric; the t curves have "fatter" tails]

Degrees of Freedom (df)
• Number of observations that are free to vary after the sample mean has been calculated
• Example: mean of 3 numbers is 2
  • X1 = 1 (or any number)
  • X2 = 2 (or any number)
  • X3 = 3 (cannot vary)
  • Mean = 2
• Degrees of freedom = n − 1 = 3 − 1 = 2

Student's t Table
Assume: n = 3, df = n − 1 = 2, α = .10, α/2 = .05

        Upper Tail Area
df      .25      .10      .05
1       1.000    3.078    6.314
2       0.817    1.886    2.920
3       0.765    1.638    2.353

For df = 2 and upper tail area .05, the t value is 2.920.

Example: Interval Estimation, σ Unknown
A random sample of n = 25 has X̄ = 50 and S = 8. Set up a 95% confidence interval estimate for μ.

$$\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}}$$
$$50 - 2.0639 \frac{8}{\sqrt{25}} \le \mu \le 50 + 2.0639 \frac{8}{\sqrt{25}}$$
$$46.70 \le \mu \le 53.30$$

Example: Tracway Transmission
Sample of n = 30 with X̄ = 289.6 and S = 45.4. Find a 99% CI for μ, the mean of each transmission system process. Therefore α = .01 and α/2 = .005.

$$t_{\alpha/2,\,n-1} = t_{.005,\,29} = 2.7564$$
$$\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}}$$
$$289.6 - 2.7564 \frac{45.4}{\sqrt{30}} \le \mu \le 289.6 + 2.7564 \frac{45.4}{\sqrt{30}}$$
$$266.75 \le \mu \le 312.45$$

Estimation for Finite Populations
• Assumptions
  • Sample is large relative to population: n / N > .05
• Use the finite population correction factor
• Confidence interval (mean, σ unknown):

$$\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$

Confidence Interval Estimate: Proportion
• Assumptions
  • Two categorical outcomes
  • Population follows binomial distribution
  • Normal approximation can be used when n·p ≥ 5 and n·(1 − p) ≥ 5
• Confidence interval estimate:

$$p_s - Z_{\alpha/2} \sqrt{\frac{p_s (1 - p_s)}{n}} \le p \le p_s + Z_{\alpha/2} \sqrt{\frac{p_s (1 - p_s)}{n}}$$

Example: Estimating Proportion
A random sample of 1000 voters showed 51% voted for Candidate A. Set up a 90% confidence interval estimate for p.

$$.51 - 1.645 \sqrt{\frac{.51 (1 - .51)}{1000}} \le p \le .51 + 1.645 \sqrt{\frac{.51 (1 - .51)}{1000}}$$
$$.484 \le p \le .536$$

Sample Size
• Too big: requires too many resources
• Too small: won't do the job

Example: Sample Size for Mean
What sample size is needed to be 90% confident of being correct within ± 5? A pilot study suggested that the standard deviation is 45.

$$n = \frac{Z^2 \sigma^2}{\text{Error}^2} = \frac{1.645^2 \cdot 45^2}{5^2} = 219.2 \approx 220 \quad \text{(round up)}$$

Example: Sample Size for Proportion
What sample size is needed to be within ± 5% with 90% confidence?
Out of a population of 1,000, we randomly selected 100, of which 30 were defective.

$$n = \frac{Z^2 p (1 - p)}{\text{Error}^2} = \frac{1.645^2 (.30)(.70)}{.05^2} = 227.3 \approx 228 \quad \text{(round up)}$$

Hypothesis Testing
• Draw inferences about two contrasting propositions (hypotheses)
• Determine whether two means are equal:
  1. Formulate the hypothesis to test
  2. Select a level of significance
  3. Determine a decision rule as a basis for the conclusion
  4. Collect data and calculate a test statistic
  5. Apply the decision rule to draw a conclusion

Hypothesis Formulation
• Null hypothesis: H0, representing the status quo
• Alternative hypothesis: H1
• The test assumes that H0 is true
• Sample evidence is obtained to determine whether H1 is more likely to be true

Significance Level

Test          H0 True             H0 False
Accept H0     Correct decision    Type II Error
Reject H0     Type I Error        Correct decision

• Probability of making a Type I error = α = level of significance
• Confidence coefficient = 1 − α
• Probability of making a Type II error = β
• Power of the test = 1 − β

Decision Rules
• Sampling distribution: Normal or t distribution
• Rejection region
• Non-rejection region
• Two-tailed test: α/2 in each tail
• One-tailed test: α in one tail
• P-values

Hypothesis Testing: Cases
• Two-Sample Means
• F-Test for Variances
• Proportions
• ANOVA: Differences of several means
• Chi-square for independence

Chapter Summary
• Sampling: Design and Methods
• Estimation:
  • Confidence Interval Estimation for Mean (σ Known)
  • Confidence Interval Estimation for Mean (σ Unknown)
  • Confidence Interval Estimation for Proportion
• Finite Populations
• Student's t distribution
• Sample Size Estimation
• Hypothesis Testing
• Significance Levels: Type I/II errors
• ANOVA
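The interval and sample-size formulas from this chapter can be checked with a short script. What follows is only a minimal sketch, not part of the original slides: the function names are hypothetical, the Z value is computed with Python's standard-library `statistics.NormalDist` (so it is 1.6449 rather than the rounded 1.645 used in the slides), and the t critical value is assumed to be looked up in a t table and passed in by the caller.

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided standard normal critical value Z_{alpha/2}."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def mean_interval(xbar, sd, n, crit, population_size=None):
    """Return (lower, upper) for xbar +/- crit * sd / sqrt(n).

    crit is Z_{alpha/2} when sigma is known, or t_{alpha/2, n-1}
    (taken from a t table by the caller) when sd is the sample S.
    If population_size (N) is given, the finite population correction
    sqrt((N - n) / (N - 1)) is applied to the standard error.
    """
    se = sd / math.sqrt(n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    margin = crit * se
    return xbar - margin, xbar + margin


def proportion_interval(ps, n, confidence):
    """Normal-approximation interval for a proportion."""
    margin = z_critical(confidence) * math.sqrt(ps * (1.0 - ps) / n)
    return ps - margin, ps + margin


def sample_size_mean(sigma, error, confidence):
    """n = Z^2 * sigma^2 / error^2, rounded up."""
    return math.ceil((z_critical(confidence) * sigma / error) ** 2)


def sample_size_proportion(p, error, confidence):
    """n = Z^2 * p * (1 - p) / error^2, rounded up."""
    z = z_critical(confidence)
    return math.ceil(z * z * p * (1.0 - p) / error ** 2)


if __name__ == "__main__":
    # t values 2.0639 and 2.7564 are the table values quoted in the examples.
    print(mean_interval(50, 8, 25, crit=2.0639))
    print(mean_interval(289.6, 45.4, 30, crit=2.7564))
    print(proportion_interval(0.51, 1000, 0.90))
    print(sample_size_mean(45, 5, 0.90))
    print(sample_size_proportion(0.30, 0.05, 0.90))
```

Run as a script, this prints the intervals for the n = 25 example, the Tracway example and the voter example, followed by the two sample sizes. Passing `population_size` narrows the mean interval, which matches the role of the finite population correction factor when n / N > .05.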