Introduction to Business Statistics, 6e
Kvanli, Pavur, Keeling
Chapter 7 – Statistical Inference and Sampling
Slides prepared by Jeff Heyl, Lincoln University
Thomson/South-Western Learning™
©2003 South-Western/Thomson

Simple Random Sampling
All items in the population have the same probability of being selected.
Finite population: to be sure that a simple random sample is obtained from a finite population, the items should be numbered from 1 to N.
Nearly all statistical procedures require that a random sample is obtained.

Estimation
The population consists of every item of interest.
The population mean is µ and is generally not known.
The sample is randomly drawn from the population.
Sample values should be selected randomly, one at a time, from the population.

Random Sampling and Estimation
Figure 7.1: A sample (mean = $\bar{X}$) is drawn from the population (mean = µ); $\bar{X}$ estimates µ.

Distribution for Everglo Bulb Lifetime
Figure 7.2: Distribution of X with µ = 400 and σ = 50; axis marked at 300, 350, 400, 450, and 500.

Sample Means
Figure 7.3

Excel Histogram
Figure 7.4: Frequency histogram. Class limits run from "377 and under 384" to "426 and under 433" in classes of width 7.

Distribution of $\bar{X}$
Mean of the probability distribution for $\bar{X}$: $\mu_{\bar{X}} = \mu$
Standard error of $\bar{X}$ = standard deviation of the probability distribution for $\bar{X}$: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$

Normal Curves
Population: mean = µ, standard deviation = σ.
Random sample: mean = $\bar{X}$, standard deviation = s.
Figure 7.5: Upper curve: X = a value from this population, with σ = 50; assumes the individual observations follow a normal distribution. Lower curve: $\bar{X}$ follows a normal distribution centered at $\mu_{\bar{X}} = 400$ with standard deviation $\sigma_{\bar{X}} = \sigma / \sqrt{n} = 50 / \sqrt{10}$.

Central Limit Theorem
When obtaining large samples (n > 30) from any population, the sample mean $\bar{X}$ will follow an approximate normal distribution.
What this means is that if you draw a large random sample from a population, the distribution of $\bar{X}$ will be approximately normal with mean µ and standard deviation (standard error) $\sigma_{\bar{X}} = \sigma / \sqrt{n}$.

Distribution of $\bar{X}$
Figure 7.6: Population with σ = 50 and µ = 400. In every panel $\mu_{\bar{X}} = 400$:
n = 10: $\sigma_{\bar{X}} = 50 / \sqrt{10} = 15.81$
n = 20: $\sigma_{\bar{X}} = 50 / \sqrt{20} = 11.18$
n = 50: $\sigma_{\bar{X}} = 50 / \sqrt{50} = 7.07$
n = 100: $\sigma_{\bar{X}} = 50 / \sqrt{100} = 5$

Distribution of $\bar{X}$
Mean: $\mu_{\bar{X}} = \mu$
Standard deviation (standard error): $\sigma_{\bar{X}} = \sigma / \sqrt{n}$

Assembly Time
Figure 7.7: Distribution of X = assembly time, with µ = 20 and σ = 3; the shaded area is P(X > 22). Axis marked at 14, 17, 20, 22, 23, and 26.

Assembly Time
Figure 7.8: Distribution of $\bar{X}$ with $\sigma_{\bar{X}} = .77$; the shaded area is $P(\bar{X} > 22)$. Axis marks include 19.23, $\mu_{\bar{X}}$, 20.77, and 22.

Assembly Time
Figure 7.9: Distribution of $\bar{X}$ with $\mu_{\bar{X}} = 20$; the shaded area is $P(19 < \bar{X} < 21)$.

Central Limit Theorem
Figure 7.10: Uniform population with a = 50 and b = 150:
$\mu = (a + b)/2 = 100$
$\sigma = (b - a)/\sqrt{12} = 28.87$
Panels show the distribution of $\bar{X}$ for n = 2, n = 5, and n = 30.
By the CLT, $\mu_{\bar{X}} = \mu = 100$ and $\sigma_{\bar{X}} = \sigma / \sqrt{n} = 28.87 / \sqrt{30} = 5.27$.

Central Limit Theorem
Figure 7.11: Exponential population with µ = 100 = σ.
Panels show the distribution of $\bar{X}$ for n = 2, n = 5, and n = 30.
By the CLT, $\mu_{\bar{X}} = \mu = 100$ and $\sigma_{\bar{X}} = \sigma / \sqrt{n} = 100 / \sqrt{30} = 18.26$.

Central Limit Theorem
Figure 7.12: U-shaped population. Panels show the distribution of $\bar{X}$ for n = 2, n = 5, and n = 30.

Sampling Without Replacement
Mean: $\mu_{\bar{X}} = \mu$
Standard deviation (standard error): $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} \cdot \sqrt{\dfrac{N - n}{N - 1}}$

Distribution of Sample Mean
Figure 7.13: $\bar{X}$ = average income of 45 female managers.
$\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}} = \dfrac{8500}{\sqrt{45}} \sqrt{\dfrac{350 - 45}{350 - 1}} = (1267.11)(.935) = \$1184.75$
$\mu_{\bar{X}} = 48{,}000$
Observed value of $\bar{X}$ = $43,900

Confidence Intervals
Figure 7.14: Distribution of $\bar{X}$ with σ known and µ unknown (µ = ?).

Confidence Intervals
Figure 7.15: $\bar{X}$ = average of 25 assembly times, with $\sigma_{\bar{X}} = 3 / \sqrt{25} = .6$ minute and $\mu_{\bar{X}} = 20$; the shaded area is $P(\bar{X} > 20) = .5$.

Confidence Intervals
Figure 7.16: Standard normal curve. Area = .475 between -1.96 and 0, and area = .475 between 0 and 1.96; total area = .95.

Confidence for the Mean of a Normal Population (σ known)
$Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
$P(-1.96 \le Z \le 1.96) = .95$
$P\left(-1.96 \le \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}} \le 1.96\right) = .95$
$P\left(\bar{X} - 1.96 \dfrac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96 \dfrac{\sigma}{\sqrt{n}}\right) = .95$

Confidence for the Mean of a Normal Population (σ known)
$(1 - \alpha) \cdot 100\%$ confidence interval: $\bar{x} - Z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$ to $\bar{x} + Z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$
E = margin of error = $Z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$

Confidence for the Mean of a Normal Population (σ known)
Figure 7.17: Standard normal curve with upper-tail areas: area = .1 to the right of 1.28, area = .05 to the right of 1.645, and area = .025 to the right of 1.96.

Excel Screens
Figure 7.18

Excel Screens
Figure 7.19

Excel Screens
Figure 7.20

Confidence for the Mean of a Normal Population (σ unknown)
Student's t distribution
Population variance unknown
Degrees of freedom = n - 1
Confidence interval: $\bar{x} - t_{\alpha/2,\, n-1} \dfrac{s}{\sqrt{n}}$ to $\bar{x} + t_{\alpha/2,\, n-1} \dfrac{s}{\sqrt{n}}$

Student's t Distribution
Figure 7.21: Standard normal curve (Z) compared with the t curve with 20 df and the t curve with 10 df.

Confidence Interval
Figure 7.22

Confidence Interval
Figure 7.23

Selecting Necessary Sample Size (σ known)
Sample size is based on the level of accuracy required for the application.
Maximum error: E
Used to determine the necessary sample size to provide the specified level of accuracy
Specified in advance

Selecting Necessary Sample Size (σ known)
$E = Z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$
$n = \left(\dfrac{Z_{\alpha/2} \cdot \sigma}{E}\right)^2$

Selecting Necessary Sample Size (σ unknown)
To obtain a rough approximation of σ, ask someone who is familiar with the data to be collected:
1. What do you think will be the highest value in the sample (H)?
2. What will be the lowest value (L)?

Selecting Necessary Sample Size (σ unknown)
$s \approx \dfrac{H - L}{4}$
$n = \left(\dfrac{Z_{\alpha/2} \cdot s}{E}\right)^2$

Other Sampling Procedures
Population: the collection of all items about which we are interested
Sampling unit: a collection of elements selected from the population
Cluster: a sampling unit that is a group of elements from the population, such as all adults in a particular city block
Sampling frame: a list of population elements

Other Sampling Procedures
Strata: nonoverlapping subpopulations
Sampling design: specifies the manner in which the sampling units are to be selected

Simple Random Sampling
Population mean: µ
Estimator: $\bar{X} = \dfrac{\sum x}{n}$
Estimated standard error of $\bar{X}$: $s_{\bar{X}} = \dfrac{s}{\sqrt{n}} \cdot \sqrt{\dfrac{N - n}{N - 1}}$
Approximate confidence interval: $\bar{X} \pm Z_{\alpha/2}\, s_{\bar{X}}$

Systematic Sampling
The sampling frame consists of N records.
The sample of n is obtained by sampling every kth record, where k is an integer approximately equal to N/n.
The sampling frame should be ordered randomly.

Stratified Sampling
Stratified sampling obtains more information due to the homogeneous nature of each stratum.
Stratified sampling obtains a cross section of the entire population.
Obtain a mean within each stratum as well as an estimate of µ.

Stratified Sampling
Use the following notation:
$n_i$ = sample size in stratum i
$N_i$ = number of elements in stratum i
$N$ = total population size = $\sum N_i$
$n$ = total sample size = $\sum n_i$
$\bar{X}_i$ = sample mean in stratum i
$s_i$ = sample standard deviation in stratum i

Stratified Sampling
Population mean: µ
Estimator: $\bar{X}_{st} = \dfrac{\sum N_i \bar{X}_i}{N}$
Estimated standard error of $\bar{X}_{st}$: $s_{\bar{X}_{st}} = \sqrt{\sum \left(\dfrac{N_i}{N}\right)^2 \left(\dfrac{N_i - n_i}{N_i}\right) \dfrac{s_i^2}{n_i}}$
Approximate confidence interval: $\bar{X}_{st} \pm Z_{\alpha/2}\, s_{\bar{X}_{st}}$

Cluster Sampling
Single-stage cluster sampling: randomly select a set of clusters for sampling. Include all elements in the cluster in your sample.
Two-stage cluster sampling: randomly select a set of clusters for sampling. Randomly select elements from each sampled cluster.

Cluster Sampling
Population mean: µ
Estimator: $\bar{X}_c = \dfrac{\sum T_i}{\sum n_i}$
Estimated standard error of $\bar{X}_c$: $s_{\bar{X}_c} = \sqrt{\left(\dfrac{M - m}{m M \bar{N}^2}\right) \dfrac{\sum (T_i - \bar{X}_c n_i)^2}{m - 1}}$
Approximate confidence interval: $\bar{X}_c \pm Z_{\alpha/2}\, s_{\bar{X}_c}$

Confidence Interval
Figure 7.25: Constructing a confidence interval for a population mean.
σ known: use Table A-4 (Z).
σ unknown: use Table A-5 (t). Can use Table A-4 (Z) to obtain an approximate confidence interval if n > 30.
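
Worked check of the chapter's formulas
The standard-error, confidence-interval, and sample-size formulas from the slides can be checked numerically. The following is a minimal Python sketch, not part of the original deck. It uses only the standard library, and the function names (standard_error, prob_mean_exceeds, z_interval, sample_size) are hypothetical names chosen for illustration. The inputs marked "hypothetical" in the comments are made-up values, not numbers from the slides.

```python
import math
from statistics import NormalDist


def standard_error(sigma, n, N=None):
    """Standard error of the sample mean: sigma / sqrt(n).

    If a finite population size N is given (sampling without replacement),
    the finite population correction sqrt((N - n) / (N - 1)) is applied.
    """
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se


def prob_mean_exceeds(x, mu, sigma, n):
    """P(sample mean > x), treating the sample mean as normal (CLT)."""
    return 1 - NormalDist(mu, standard_error(sigma, n)).cdf(x)


def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z_{alpha/2}
    margin = z * standard_error(sigma, n)               # E, the margin of error
    return xbar - margin, xbar + margin


def sample_size(sigma, E, confidence=0.95):
    """Smallest n with margin of error at most E: n = (Z * sigma / E)^2, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / E) ** 2)


if __name__ == "__main__":
    # Figure 7.6: sigma = 50 and n = 10 give a standard error of about 15.81.
    print(round(standard_error(50, 10), 2))

    # Figure 7.13: sigma = 8500, n = 45, N = 350. The unrounded result is
    # about 1184.5; the slide's $1184.75 comes from rounding the correction
    # factor to .935 before multiplying.
    print(round(standard_error(8500, 45, N=350), 2))

    # Figure 7.15: mu = 20, sigma = 3, n = 25, so P(mean > 20) = 0.5.
    print(prob_mean_exceeds(20, 20, 3, 25))

    # Hypothetical observed mean of 21 with the same sigma and n.
    print(z_interval(21, 3, 25))

    # Hypothetical target: margin of error of 0.5 minute with sigma = 3.
    print(sample_size(3, 0.5))
```

When σ is unknown, the same sample_size function can be called with the rough estimate s = (H - L)/4 in place of σ. A t-based interval would need a t quantile, which the Python standard library does not provide, so it is left out of this sketch.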