Sampling Distribution & Confidence Interval
Ardavan Asef-Vaziri, Jan. 2016

Sampling and Confidence Interval

"How can it be that mathematics, being after all a product of human thought independent of experience, is so admirably adapted to the objects of reality?" (Albert Einstein)

Some parts of these slides were prepared based on:
  Managing Business Process Flows, Anupindi et al., 2012, Pearson.
  Essentials of Modern Business Statistics, Anderson et al., 2012, Cengage.

Before coming to class, please watch the following 3 repository lectures on YouTube:
  Mean and Variance of Sample Mean: https://www.youtube.com/watch?v=7mYDHbrLEQo
  The Sampling Distribution of the Sample Mean: https://www.youtube.com/watch?v=q50GpTdFYyI
  Confidence Interval: https://www.youtube.com/watch?v=lwpobQmUTd8

The link to the Excel file, Sampling and Confidence Interval:
  http://www.csun.edu/~aa2035/CourseBase/S-Sampling-CI/ArdiCh78.xlsx

x: \( (\mu, \sigma) \), y = 2x: \( \mu_y = 2\mu \), \( \sigma_y = 2\sigma \)

Past data on a specific stock show that its annual return has a mean of 0.05 and a standard deviation of 0.05. Therefore, if we invest $1, the return on our investment after one year will have a mean of $0.05 and a standard deviation of $0.05. Using simulation in Excel, show the mean and standard deviation of the return on a $10,000 investment and on a $20,000 investment.

A random variable x with mean \( \mu \) and standard deviation \( \sigma \) is multiplied by 2 to generate the random variable y = 2x.
  x: \( (\mu, \sigma) \)   y: (?, ?)

$10,000 or $20,000 in One Stock

A probability is between 0 and 1. rand() is between 0 and 1. Therefore, it is valid to treat a rand() value as a random probability.
Accordingly, =NORM.S.INV(rand()) will provide us with a random z. Suppose it is z = 0.441475. Then
  \( x = \mu + z\sigma = 5\% + 0.441475(5\%) = 5\% + 2.21\% = 7.21\% \)
By the same line of reasoning,
  =NORM.INV(probability, µ, σ)
  =NORM.INV(probability, 5%, 5%)
will directly generate a random x from \( N(\mu, \sigma) \).

10,000 and 20,000 in One Stock

Trial      1      2      3      4      5      6      7      8      9     10     11  ...    997    998    999   1000
1      0.029  0.032  0.054  0.030  0.070  0.116  0.064  0.056  0.068  0.042  0.112  ...  0.081 -0.011  0.021 -0.023
10000    112    837    389   -545    411    946    824   1043   1133    122   -267  ...    600    556   -114    364
20000   -683    841    850   1717   1067    986   1946   2384   1919    389   1762  ...    684   2058   1361   1046

  =10,000*NORM.INV(rand(), 5%, 5%)
  =20,000*NORM.INV(rand(), 5%, 5%)

Statistic       10000      20000    Ratio (20000 / 10000)
Min=            -1293      -2595
Max=             2708       4259
Mean=             490        964    1.97
Variance       250278    1119770    4.47
StdDev=           500       1058    2.12
CV               1.02       1.10
Mean/StdDev      0.98     0.9110

x: \( (\mu, \sigma) \), y = 2x: \( \mu_y = 2\mu \), \( \sigma_y = 2\sigma \)

A random variable x: \( (\mu, \sigma) \). A random variable y = 2x.
  \( \mu_y = 2\mu \),  \( \sigma_y = 2\sigma \)
A random variable x: \( (\mu, \sigma) \). A random variable y = nx.
  \( \mu_y = n\mu \),  \( \sigma_y = n\sigma \)

ROP; Variable L, Fixed R

Demand is fixed at 50 units per day. The time from placing an order until receiving it is referred to as the Lead Time. Suppose the average lead time is 5 days and the standard deviation of lead time is 1 day. At what level of inventory should we place an order such that the service level is 90% (the probability that demand during the lead time exceeds inventory is 10%)? This point is known as the Reorder Point (ROP). The difference between ROP and average demand during lead time is referred to as Safety Stock.
What is the average demand during the lead time? What is the standard deviation of demand during the lead time?
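
The two questions above can be explored by simulation before working them out by formula. The following is a minimal Python sketch, not part of the original Excel workbook: it assumes lead time is Normal with mean 5 and standard deviation 1, multiplies each simulated lead time by the fixed demand of 50, and compares the result with the scaling rule y = nx. The function name `simulate_ltd`, the trial count, and the seed are illustrative choices.

```python
import random
from statistics import NormalDist, mean, stdev

R = 50                  # fixed demand per day (from the slide)
MU_L, SIGMA_L = 5, 1    # lead time mean and std dev in days (from the slide)

def simulate_ltd(trials=20000, seed=1):
    """Simulate demand during lead time as R times a Normal lead time (assumption)."""
    rng = random.Random(seed)
    return [R * rng.gauss(MU_L, SIGMA_L) for _ in range(trials)]

ltd = simulate_ltd()
ltd_mean = mean(ltd)     # should be close to R * MU_L
ltd_sd = stdev(ltd)      # should be close to R * SIGMA_L

# Empirical 90th percentile of simulated demand during lead time
rop_sim = sorted(ltd)[int(0.9 * len(ltd))]

# Same quantity from the scaling rule y = n x, the analogue of Excel's NORM.INV
rop_formula = NormalDist(R * MU_L, R * SIGMA_L).inv_cdf(0.9)

print(round(ltd_mean, 1), round(ltd_sd, 1), round(rop_sim, 1), round(rop_formula, 1))
```

The simulated mean and standard deviation should land near 250 and 50, and the two reorder-point values should agree to within simulation noise.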

μ and σ of the Lead Time and Fixed Demand per Period

  x: \( (\mu, \sigma) \)   y: \( (n\mu, n\sigma) \)
[Figure: periods 1 to 5]

μ and σ of L and Fixed R

If lead time is variable and demand is fixed:
  L: Lead time
  \( \bar{L} \): Average lead time
  \( \sigma_L \): Standard deviation of lead time
  R: Demand per period
  LTD: Average demand during lead time, \( LTD = \bar{L} \times R \)
  \( \sigma_{LTD} \): Standard deviation of demand during lead time, \( \sigma_{LTD} = R\,\sigma_L \)

μ and σ of L and Fixed R

If lead time is variable and demand is fixed:
  L: Lead time
  \( \bar{L} \): Average lead time = 5 days
  \( \sigma_L \): Standard deviation of lead time = 1 day
  R: Demand per period = 50 per day
  LTD: Average demand during lead time, \( LTD = 5 \times 50 = 250 \)
  \( \sigma_{LTD} \): Standard deviation of demand during lead time, \( \sigma_{LTD} = R\,\sigma_L = 50(1) = 50 \)
  =NORM.INV(0.9,250,50) = 314.1

x: \( (\mu, \sigma) \), y = x1 + x2: \( \mu_y = 2\mu \), \( \sigma_y^2 = 2\sigma^2 \), \( \sigma_y = \sqrt{2}\,\sigma \)

A random variable x has mean \( \mu \) and standard deviation \( \sigma \). A random variable y is equal to the sum of 2 random variables x.
  y = x1 + x2   x: \( (\mu, \sigma) \)   y: (?, ?)
  Mean(y) = \( \mu_y \) = Mean(x1) + Mean(x2) = \( \mu + \mu = 2\mu \)
  VAR(y) = \( \sigma_y^2 \) = VAR(x1) + VAR(x2) = \( \sigma^2 + \sigma^2 = 2\sigma^2 \)
  StdDev(y) = \( \sigma_y = \sqrt{2}\,\sigma \)
Using simulation in Excel, show the mean and standard deviation of investing $20,000 in one stock versus investing $10,000 in each of two stocks. Suppose past data show that the returns of all these stocks have a mean of 0.05 and a standard deviation of 0.05.
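
The simulation exercise above can also be sketched outside Excel. The Python below is an illustrative stand-in for the spreadsheet, under the assumption that both stocks have independent Normal returns with mean 0.05 and standard deviation 0.05; `simulate`, the trial count, and the seed are hypothetical names and settings.

```python
import random
from statistics import mean, stdev

MU, SIGMA = 0.05, 0.05   # assumed mean and std dev of each stock's return

def simulate(trials=20000, seed=2):
    """Return simulated dollar returns for one $20,000 stock and for two $10,000 stocks."""
    rng = random.Random(seed)
    # y = 2x: the same random return applied to twice the money
    one_stock = [20000 * rng.gauss(MU, SIGMA) for _ in range(trials)]
    # y = x1 + x2: two independent draws, $10,000 each
    two_stocks = [10000 * rng.gauss(MU, SIGMA) + 10000 * rng.gauss(MU, SIGMA)
                  for _ in range(trials)]
    return one_stock, two_stocks

one_stock, two_stocks = simulate()
print("one stock : mean", round(mean(one_stock)), "stdev", round(stdev(one_stock)))
print("two stocks: mean", round(mean(two_stocks)), "stdev", round(stdev(two_stocks)))
```

Both means should come out near 1,000. The standard deviation should be near 2σ(10,000) = 1,000 for one stock and near √2 σ(10,000), about 707, for two stocks.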

20,000 One Stock or 20,000 in Two Stocks

Trial         1     2     3     4     5     6     7     8     9    10    11  ...   997   998   999  1000
20000      2786  2085  2718    43   372   221  1406   328  2003  1826  -156  ...   377  1058  -189   961
10000       545   782  -178   626  1048  1162   424   874  1094   734  -109  ...   747  1061   597   -48
10000      -153  -125   501   601   723   567  1411  1455   676   218  -315  ...   730   531   972   390
2(10,000)   393   657   324  1227  1771  1729  1835  2328  1770   951  -424  ...  1477  1592  1569   342

Statistic       20000    2(10000)    Ratio (2(10000) / 20000)
Min=            -2136       -1384
Max=             3993        3291
Mean=            1001        1021    1.02
Variance       983323      515268    0.52
StdDev=           992         718    0.72
CV               0.99        0.70
Mean/StdDev    1.0091      1.4220

20,000 One Stock or 20,000 in Two Stocks
[Chart: simulated returns for $20,000 in one stock and for two $10,000 investments; vertical axis from -4000 to 4000]

x: \( (\mu, \sigma) \), y = x1 + x2: \( \mu_y = 2\mu \), \( \sigma_y^2 = 2\sigma^2 \), \( \sigma_y = \sqrt{2}\,\sigma \)

A random variable x has mean \( \mu \) and standard deviation \( \sigma \). A random variable y is equal to the sum of 2 random variables x.
  y = x1 + x2   x: \( (\mu, \sigma) \)   y: (?, ?)
  \( \mu_y = 2\mu \),  \( \sigma_y^2 = 2\sigma^2 \),  \( \sigma_y = \sqrt{2}\,\sigma \)
  y = x1 + x2 + x3 + ... + xn   x: \( (\mu, \sigma) \)   y: (?, ?)
  \( \mu_y = n\mu \),  \( \sigma_y^2 = n\sigma^2 \),  \( \sigma_y = \sqrt{n}\,\sigma \)

100,000 in One Stock or 10,000 in Each of 10 Stocks

Suppose there are 10 stocks and, with high probability, they all have Normally distributed returns with a mean of 5% and a standard deviation of 5%. These stocks are your only options and no more information is available. You have to invest $100,000. What do you do?
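
One way to reason about this question is to compare the standard deviation of the total return under each choice. The sketch below is an illustration rather than the slides' spreadsheet: it assumes the ten stocks are independent (the slide does not state this) and Normal with mean 5% and standard deviation 5%. `analytic_sd` applies the rule \( \sigma_y = \sqrt{n}\,\sigma \) to n equal positions, and `simulate` checks it by random draws; both names are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev

MU, SIGMA = 0.05, 0.05   # return mean and std dev (from the slide)
TOTAL = 100_000          # total amount invested (from the slide)

def analytic_sd(n_stocks):
    """Std dev of total return when TOTAL is split equally over n independent stocks."""
    per_stock = TOTAL / n_stocks
    return per_stock * SIGMA * sqrt(n_stocks)

def simulate(n_stocks, trials=20000, seed=4):
    """Simulated total dollar return for an equal split over n independent stocks."""
    rng = random.Random(seed)
    per_stock = TOTAL / n_stocks
    return [sum(per_stock * rng.gauss(MU, SIGMA) for _ in range(n_stocks))
            for _ in range(trials)]

one = simulate(1)
ten = simulate(10)
for label, data, n in (("1 x 100,000", one, 1), ("10 x 10,000", ten, 10)):
    print(label, "mean", round(mean(data)), "stdev", round(stdev(data)),
          "analytic stdev", round(analytic_sd(n)))
```

Under these assumptions the expected return is the same in both cases (about 5,000), while the standard deviation drops from about 5,000 to about 1,581 when the money is spread over ten stocks.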
[Spreadsheet excerpt: nine simulated trials. The ten individual $10,000 rows could not be recovered from the extracted layout; the first and last rows are shown.]

100000 (one stock)     6268  -607  6237  18040  8487  7579  9032   839  2152
100000 (ten stocks)    5340  2090  6859   4442  1421  2658  8290  3677  4459

100,000 vs. 10(10,000) investment in N(5%, 5%)

Investment=        1.00      100000    10(10000)    100000 / 1.00    100000 / 10(10000)
Min=              -0.09      -11950           31
Max=               0.21       25974         9388
Mean=              0.05        4746         4976            94721                  0.95
Variance           0.00    26057541      2409193      11623364914                 10.82
StdDev=            0.05        5105         1552           107812                  3.29
CV                 0.94        1.08         0.31
Mean/StdDev        1.06      0.9298       3.2060

Risk Aversion Individual
[Chart: simulated returns for the two investment choices; vertical axis from -15000 to 20000]

Problem Game: The News Vendor Problem

Daily demand for your merchandise has a mean of 20 and a standard deviation of 5. The sales price is $100 per unit of product. You have decided to close this business line in 60 days. Your supplier has also decided to close this line immediately, but has agreed to provide your last order at a cost of $60 per unit. Any unsold product will be disposed of at a cost of $10 per unit. How many units do you order?
  LTD = R × L = 20 × 60 = 1200.
Should we order 1200 units, or more, or less? It depends on our service level.
  Underage cost = Cu = p - c = 100 - 60 = 40.
  Overage cost = Co = 60 - 0 + 10 = 70.
  SL* = Cu/(Cu+Co) = 40/(40+70) = 0.3636.
Due to the high overage cost, SL* < 50%.
  z(0.3636) = ?
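
The open question z(0.3636) = ? can be answered with the standard normal inverse, the counterpart of Excel's NORM.S.INV. The sketch below uses Python's `statistics.NormalDist`; the variable names are illustrative, and the prices and costs are the ones stated on the slide.

```python
from statistics import NormalDist

price, cost, disposal = 100, 60, 10   # values from the slide

cu = price - cost        # underage cost: margin lost per unit of unmet demand
co = cost + disposal     # overage cost: purchase cost plus disposal per unsold unit
sl = cu / (cu + co)      # optimal service level (critical ratio)

# z value whose cumulative probability equals the service level
z = NormalDist().inv_cdf(sl)

print(f"SL* = {sl:.4f}, z = {z:.4f}")
```

Because SL* is below 50%, z is negative (roughly -0.35), which means the optimal order is below the mean demand.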

μ and σ of Demand per Period and Fixed L

  x: \( (\mu, \sigma) \)   y: \( (n\mu, \sqrt{n}\,\sigma) \)
[Figure: periods 1 to 5]

μ and σ of Demand per Period and Fixed L

If demand is variable and lead time is fixed:
  L: Lead time
  R: Demand per period (per day, week, month)
  \( \bar{R} \): Average demand per period (day, week, month)
  \( \sigma_R \): Standard deviation of demand (per period)
  LTD: Average demand during lead time, \( LTD = L \times \bar{R} \)
  \( \sigma_{LTD} \): Standard deviation of demand during lead time, \( \sigma_{LTD} = \sqrt{L}\,\sigma_R \)

μ and σ of Demand per Period and Fixed L

If demand is variable and lead time is fixed:
  L: Lead time = 5 days
  R: Demand per day
  \( \bar{R} \): Average daily demand = 50
  \( \sigma_R \): Standard deviation of daily demand = 10
  LTD: Average demand during lead time, \( LTD = L \times \bar{R} = 5 \times 50 = 250 \)
  \( \sigma_{LTD} \): Standard deviation of demand during lead time, \( \sigma_{LTD} = \sqrt{L}\,\sigma_R = \sqrt{5}\,(10) = 22.4 \)

Now It Is Transformed

The problem originally was: if average demand per day is 50 units, the standard deviation of demand is 10 per day, and lead time is 5 days, compute ROP at a 90% service level. Compute safety stock.
We transformed it to: the average demand during the lead time is 250 and the standard deviation of demand during the lead time is 22.4. Compute ROP at a 90% service level. Compute safety stock.
  =NORM.INV(0.9,250,22.4) = 278.7 ≈ 279

Comparing the Two Problems
[Figure: the two problems side by side, each over periods 1 to 5]

The News Vendor Problem: Example

Daily demand for your merchandise has a mean of 20 and a standard deviation of 5. The sales price is $100 per unit of product. You have decided to close this business line in 64 days.
Your supplier has also decided to close this line immediately, but has agreed to provide your last order at a cost of $60 per unit. Any unsold product will be disposed of at a cost of $10 per unit. How many units do you order?
  Underage cost = Cu = p - c = 100 - 60 = 40.
  Overage cost = Co = 60 + 10 = 70.
  SL* = Cu/(Cu+Co) = 40/(40+70) = 0.3636.
Due to the high overage cost, SL* < 50%.

The News Vendor Problem: Extended

  \( LTD = \bar{R} \times L = 20 \times 64 = 1280 \)
  \( \sigma_{LTD} = \sqrt{L}\,\sigma_R = \sqrt{64}\,(5) = 40 \)
  SL* = 0.3636
  The optimal \( Q = LTD + z\,\sigma_{LTD} \)
  =NORM.INV(probability, mean, standard_dev)
  =NORM.INV(0.3636,1280,40) = 1266.0459

Sampling

Element. Any unit of data defined for processing is a data element; for example, ACCOUNT NUMBER, NAME, ADDRESS and CITY.
Population. A population is a collection of all the elements of interest.
Sample. A sample is a subset of the population. It contains only a portion of the population.
Frame. A sampling frame is the source material or device from which a sample is drawn. It is a list of all those within a population who can be sampled, and may include individuals, households or institutions.
The sample results provide estimates of the values of the population characteristics. With proper sampling methods, the sample results can provide "good" estimates of the population characteristics.

Sampling from a Finite Population

St. Andrew's College received 900 applications for admission in the upcoming year from prospective students. The applicants were numbered from 1 to 900 as their applications arrived. The Director of Admissions would like to select a simple random sample of 30 applicants.
Generate rand() in a column next to the names. Then sort by the rand column. Select the top 30 names.
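
The rand-and-sort procedure above translates directly into code. This is a small Python sketch of the same idea, with applicant numbers standing in for names; `simple_random_sample` and the seed are illustrative, not part of the slides.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Attach a random key to every element, sort by the key, keep the first k."""
    rng = random.Random(seed)
    keyed = [(rng.random(), item) for item in population]   # the "rand() column"
    keyed.sort()                                            # sort by the random key
    return [item for _, item in keyed[:k]]                  # top k rows

applicants = list(range(1, 901))          # applicants numbered 1 to 900
sample = simple_random_sample(applicants, 30, seed=3)
print(sorted(sample))
```

Python's standard library offers `random.sample(applicants, 30)` as a shortcut for the same task; the explicit version is shown only because it mirrors the spreadsheet steps.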
Sometimes we want to select a sample, but find it is not possible to obtain a list of all elements in the population. As a result, we cannot construct a frame for the population, and we cannot use the random number selection procedure. Most often this situation occurs in infinite population cases.

Sampling from an Infinite Population

Populations are often generated by an ongoing process where there is no upper limit on the number of units that can be generated. Examples of ongoing processes, with infinite populations, are:
  parts being manufactured on a production line
  transactions occurring at a bank
  telephone calls arriving at a technical help desk
  customers entering a store
These are objects.
A random sample from an infinite population is a sample selected such that the following conditions are satisfied:
  Each element selected comes from the population of interest.
  Each element is selected independently.

x: \( (\mu, \sigma) \), y = (x1 + x2)/2: \( \mu_y = \mu \), \( \sigma_y^2 = \sigma^2/2 \), \( \sigma_y = \sigma/\sqrt{2} \)

A random variable x has mean \( \mu \) and standard deviation \( \sigma \). A random variable y is equal to the average of 2 random variables x.
  y = (x1 + x2)/2   x: \( (\mu, \sigma) \)   y: (?, ?)
If it were x1 + x2, then x1 + x2: \( (2\mu, \sqrt{2}\,\sigma) \)
Since it is (x1 + x2)/2, or (1/2)(x1 + x2): \( \tfrac{1}{2}(2\mu, \sqrt{2}\,\sigma) \)
That is: \( \mu, (\sqrt{2}/2)\,\sigma \)
That is: \( \mu, \sqrt{2}/(\sqrt{2}\sqrt{2})\,\sigma \)
y = (x1 + x2)/2: \( (\mu, \sigma/\sqrt{2}) \)

20 Samples of Size 25

[Spreadsheet excerpt: simulated samples of size 25 with the mean and standard deviation of each sample, plus histograms of x (horizontal axis from 10 to 184) and of Xbar(25) (horizontal axis from 78 to 120). The cell layout was lost in extraction; the summary values are:]
  Population Mean     100.0
  Population StdDev    35.0
  Mean (n=25)          99.6
  StdDev (n=25)         6.1

Sampling from an Infinite Population

A random variable x1: \( (\mu, \sigma) \)
A random variable x2: \( (\mu, \sigma) \)
...
A random variable xn: \( (\mu, \sigma) \)
A random variable y = x1 + x2 + ... + xn: (?, ?)
  Mean(y) = Mean(x1) + Mean(x2) + ... + Mean(xn)
  \( \mu(y) = \mu(x_1) + \mu(x_2) + \dots + \mu(x_n) \)
  \( \mu(y) = n\,\mu(x) \)
  Var(y) = Var(x1) + Var(x2) + ... + Var(xn)
  \( \sigma^2(y) = \sigma^2 + \sigma^2 + \dots + \sigma^2 \)
  \( \sigma^2(y) = n\,\sigma^2(x) \)
  \( \sigma(y) = \sqrt{n}\,\sigma(x) \)

Sampling from an Infinite Population

A random variable y = SUMx: \( (n\mu, \sqrt{n}\,\sigma) \)
A random variable SUMx/n = \( \bar{X} \): (?, ?)
A random variable \( \bar{X} \) = SUMx/n = (1/n) SUMx
  Mean(\( \bar{X} \)) = \( n\mu/n = \mu \)
  StdDev(\( \bar{X} \)) = \( (1/n)\sqrt{n}\,\sigma = \sigma/\sqrt{n} \)
  x: \( (\mu, \sigma) \)   \( \bar{X} \): \( (\mu, \sigma/\sqrt{n}) \)

Standard Deviation of \( \bar{X} \) for n=25 and n=50

[Spreadsheet excerpt: the same simulated samples of size 25, combined into samples of size 50, plus histograms of x, Xbar(25) and Xbar(50). The cell layout was lost in extraction; the summary values are:]
  Population Mean     100.0
  Population StdDev    35.0
  Mean (n=25)          99.6
  StdDev (n=25)         6.1
  Mean (n=50)          99.6
  StdDev (n=50)         2.9

Sampling from an Infinite Population
CENTRAL LIMIT
THEOREM: In selecting random samples of size n from a population, the sampling distribution of the sample mean \( \bar{X} \) can be approximated by a normal distribution as the sample size becomes large.

Interval Estimation vs Point Estimation

\( \bar{X} \) provides a point estimate for \( \mu \). Because a point estimate is based on just one sample, we cannot expect it to be equal to the corresponding population parameter. Indeed, each sample will have a different \( \bar{X} \), and none of the \( \bar{X} \)s need be equal to \( \mu \). But they are all unbiased estimates of \( \mu \). Unbiased means their expected value is equal to \( \mu \).
  Point estimate: \( \mu = \bar{x} \)
  Interval estimate, with a certain confidence (probability): \( \bar{x} - a \le \mu \le \bar{x} + a \)

\( \bar{X} \) Distribution: (1-α) Probability
[Chart: distribution of \( \bar{X} \) with central probability 1-α; \( \sigma/\sqrt{n} \) = 35/sqrt(25) = 7; horizontal axis from 78 to 120]

Relationship between μ and \( \bar{x} \)

With probability 1-α (here 1-α = 0.95), the sample mean falls within \( z_{1-\alpha/2} \) standard errors of \( \mu \):
  \( \mu - z_{1-\alpha/2}\,\sigma_{\bar X} \le \bar{X} \le \mu + z_{1-\alpha/2}\,\sigma_{\bar X} \)
Rearranging each inequality for \( \mu \):
  \( \bar{X} - z_{1-\alpha/2}\,\sigma_{\bar X} \le \mu \le \bar{X} + z_{1-\alpha/2}\,\sigma_{\bar X} \)
For 1-α = 0.95, \( z_{1-0.05/2} = z_{0.975} = 1.96 \), so
  \( \bar{X} - 1.96\,\sigma_{\bar X} \le \mu \le \bar{X} + 1.96\,\sigma_{\bar X} \)

\( \bar{X} \) Distribution
[Chart: distribution of \( \bar{X} \); \( \sigma/\sqrt{n} \) = 35/sqrt(25) = 7; horizontal axis from 78 to 120]

Interval Estimation of a Population Mean: σ is known

A mail-order company wants to know the quality of its product from the point of view of its customers. Each month a sample of 100 customers is interviewed.
Sample size: n = 100. They measure quality on a scale of 0 to 100. Data from previous months indicate that the standard deviation of the quality measure is \( \sigma = 25 \). We want to find an interval estimate for \( \mu \). This month's interview scores are given below. Find the interval estimate for \( \mu \) with 95% confidence.

102  86  97  56 140 106  96  74 132 105 116  92  81  76  75 106 115  73  77 122
 79 111  92  89  73  77 102  91  68 102  70 112  80 108  87  47  89  67 111 121
 94 115  89  89 125  84  83  88  79  73  80 111  86  82  37  63 109 102  50  77
 83  54 139  32 116  70  70  37  62 101  68 125  73 123  37  62  49 131  50  80
 87  84  92  29  91 122 126  86  64  25  75 109  78  53  98  56 160  57  68  55

  Xbar          78.69
  StdDevXbar    2.5
  Z0.975        1.96
  UCL           83.58991
  LCL           73.79009

Example: National Discount, Inc.

National Discount has 260 retail outlets throughout the U.S. National evaluates each potential location for a new retail outlet in part on the mean annual income of the individuals in the marketing area of the new location.
The purpose of this example is to show how sampling can be used to develop an interval estimate of the mean annual income for individuals in a potential marketing area for National Discount.
Based on similar annual income surveys, the standard deviation of annual incomes in the entire population is considered known, with \( \sigma \) = $5,000. We will use a sample size of n = 64.
Question. There is a 0.95 probability that the value of a sample mean for National Discount will provide a sampling error of $????? or less.

Example: National Discount, Inc.

  \( \sigma_{\bar X} = \sigma/\sqrt{n} = 5000/\sqrt{64} = 625 \)
  \( z_{0.975} = 1.959964 \)
  \( z_{0.975}\,\sigma_{\bar X} = 1.96(625) = 1225 \)
Answer. There is a 0.95 probability that the value of a sample mean for National Discount will provide a sampling error of $1,225 or less.

Example: National Discount, Inc.
Question. National's management team wants an estimate of the population mean such that there is a 0.95 probability that the sampling error is $500 or less. How large a sample size (n) is needed to meet the required precision? Recall that \( \sigma \) = 5,000.
  \( z_{0.975}\,\sigma_{\bar X} = 1.96\,\sigma_{\bar X} \le 500 \)
  \( \sigma_{\bar X} \le 500/1.96 = 255.1 \)
  \( \sigma_{\bar X} = \sigma/\sqrt{n} = 5000/\sqrt{n} \le 255.1 \)
  \( \sqrt{n} \ge 5000/255.1 = 19.6 \)
  \( n \ge 19.6^2 = 384.16 \), so n ≥ 385

2 Ways to Compute CI in Excel: σ is known

Using Z (NORM.S.INV)
  Xbar=                      41100
  SigmaX=                    4500
  CI=                        0.95
  n=                         22
  Sigma(Xbar)=               959.40
  Upper Probability          0.975
  0.95 CI: Upper z(0.975)    1.96
  0.95 CI: Lower z(0.025)    -1.96
  +Side                      1880
  -Side                      -1880
  UCL                        42980
  LCL                        39220

Using X (NORM.INV)
  Xbar=                      41100
  SigmaX=                    4500
  CI=                        0.95
  n=                         40
  Sigma(Xbar)=               711.5124735
  Upper Probability          0.975
  Lower Probability          0.025
  UCL                        42495
  LCL                        39705

Third Way to Compute CI in Excel: σ is known

Using CONFIDENCE.NORM
  Xbar=                      41100
  SigmaX=                    4500
  CI=                        0.95
  n=                         34
  Margin                     1513
  UCL                        42613
  LCL                        39587

Interval Estimation of a Population Mean: σ is unknown

Instead of the population standard deviation \( \sigma \), we have the sample standard deviation s.
Instead of the normal distribution, we have the t distribution.
The t distribution is a family of similar probability distributions.
A specific t distribution depends on a parameter known as the degrees of freedom.
As the number of degrees of freedom increases, the difference between the t distribution and the standard normal probability distribution becomes smaller and smaller.
A t distribution with more degrees of freedom has less dispersion.
The mean of the t distribution is zero.

z and t: n=2, 10, 20
[Charts: standard normal density (N:fx) and t density (T-fx); horizontal axis from -3.00 to 3.00]

Interval Estimation of a Population Mean: σ Unknown

The interval estimate is given by:
  \( \bar{x} \pm t_{\alpha/2}\, s/\sqrt{n} \)
The confidence level is 1 - α.
\( t_{\alpha/2} \) is the t value providing an area of α/2 in the upper tail of a t distribution with n - 1 degrees of freedom.
s is the sample standard deviation.

Example: Apartment Rents

A reporter for a student newspaper is writing an article on the cost of off-campus housing. A sample of 64 one-bedroom units within a half-mile of campus is given below. Provide a 95% confidence interval estimate of the mean rent per month for the population of one-bedroom units within a half-mile of campus.

1150  630  950 1110 1130  860 1810 1060
 910  370 1220  790 1410 1230 1210  600
 770 1120  900  930 1040  990  910 1050
1100  800 1220 1070  860 1020  810 1200
1000 1130  900 1370 1110  990 1150 1220
1030 1120  520  990  760  900 1320  750
 780 1010  790  670 1190 1420  930 1160
1430 1180  760 1120 1060  850  710  870

  Xbar          1006.563
  StdDev        241.36
  StdDevXbar    30.17
  t0.025        2.00
  UCL           1066.852
  LCL           946.2726

Sampling from an Infinite Population

A population has a mean of 300 and a standard deviation of 60. Given a random sample of size 100, what is the probability that the sample mean will be within ±9 of the population mean?
  \( -9 \le \bar{X} - \mu \le 9 \)
  \( |\bar{X} - \mu| \le 9 \)
  \( |\bar{X} - \mu|/\sigma_{\bar X} \le 9/\sigma_{\bar X} \)
  \( \sigma_{\bar X} = \sigma/\sqrt{100} = 60/10 = 6 \)
  z = 9/6 = 1.5
  =NORM.S.DIST(1.5,1) - NORM.S.DIST(-1.5,1) = 0.933193 - 0.066807 = 86.6%

Sampling from an Infinite Population

What sample size do we need so that, with 90% confidence, the margin of error is within ±9 of the mean?
  Margin of Error = \( z(CL)\,\sigma/\sqrt{n} \)
  z(90%) = NORM.S.INV(0.95) = 1.644854
  Margin of Error = \( 1.644854 \times 60/\sqrt{n} = 98.7/\sqrt{n} \)
  \( 98.7/\sqrt{n} \le 9 \)
  \( 98.7/9 \le \sqrt{n} \)
  \( 10.966 \le \sqrt{n} \)
  \( 121 \le n \)
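
The last two calculations can be reproduced outside Excel. The Python sketch below uses `statistics.NormalDist` in place of NORM.S.DIST and NORM.S.INV; the helper names `prob_within` and `required_n` are illustrative and assume a known population standard deviation, as in the slides.

```python
from math import ceil, sqrt
from statistics import NormalDist

def prob_within(margin, sigma, n):
    """Probability that the sample mean is within +/- margin of the population mean."""
    se = sigma / sqrt(n)          # standard deviation of the sample mean
    z = margin / se
    std = NormalDist()            # standard normal
    return std.cdf(z) - std.cdf(-z)

def required_n(margin, sigma, confidence):
    """Smallest n whose margin of error z * sigma / sqrt(n) does not exceed margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)

print(round(prob_within(9, 60, 100), 4))   # population sd 60, sample size 100
print(required_n(9, 60, 0.90))             # 90% confidence, margin of 9
print(required_n(500, 5000, 0.95))         # National Discount: margin of $500
```

The same `required_n` helper applies to the National Discount question, since both problems solve \( z\,\sigma/\sqrt{n} \le \) margin for n.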