CS 1538: Introduction to Simulation
Homework 5

Introduction

In this assignment, you will perform both input modeling and output analysis. When answering the following questions, please show all work (equations, explanations, etc.), not just the final answer. List any software you use and explain what you used it for.

Input Modeling

1. The following data are generated randomly from a Normal distribution. Compute the maximum-likelihood estimators for µ and σ².

   -0.3182  -0.0895   0.3708   2.0838   0.3755   0.1913
   -1.8758   1.1832  -0.0495   0.6597   1.3174   2.8889
    0.0492  -2.0548   0.7597   0.7215  -0.5848   0.2236
    0.2184  -0.8680   0.0155  -1.2310   1.0263  -3.3413
    0.3406   1.7381   0.1327   1.4639  -0.5454  -1.4124

SOLUTION:
Since this is the Normal distribution:
µ = sample average = 0.11298
σ² = sample variance = 1.643
Sample average and variance calculated using Excel. (The sample variance uses the n - 1 divisor; the strict maximum-likelihood estimate divides by n instead, giving (29/30) × 1.643 ≈ 1.59.)

2. The highway between Atlanta, Georgia and Athens, Georgia has a high incidence of accidents along its 100 kilometers. Public safety officers say that the occurrence of accidents along the highway is randomly (uniformly) distributed, but the news media says otherwise. The Georgia Department of Public Safety published records for the month of September. These records indicated the point at which 30 accidents involving an injury or death occurred, as shown below (the data points representing the distance in kilometers from the city limits of Atlanta). Use the Kolmogorov-Smirnov test to discover whether the distribution of location of accidents is uniformly distributed. Use the level of significance α = 0.05.

   88.3  91.7  98.8  32.4  20.6  76.6
   40.7  67.3  90.1  87.8  73.1  73.2
   36.3   7.0  17.2  69.8  21.6  27.3
   27.3  45.2  23.7  62.6   6.0  87.6
   36.8  23.3  97.4  99.7  45.3  87.2

SOLUTION:
Null hypothesis: The data follows a uniform distribution.
Alternative hypothesis: The data does not follow a uniform distribution.
D = 0.172
Dα = 0.240 (critical value for n = 30, α = 0.05)
Since D < Dα, we do not reject the null hypothesis. The data are consistent with accidents being uniformly distributed along the highway.

3.
The time required for 50 different employees to compute and record the number of hours worked during the week was measured, with the following results in minutes. Use the chi-square test to test the hypothesis that these service times are exponentially distributed. Use six intervals. Use the level of significance α = 0.05.

   Employee       1     2     3     4     5     6     7     8     9    10    11    12    13
   Time (min)  1.88  0.54  1.90  0.15  0.02  2.81  1.50  0.53  2.62  2.67  3.53  0.53  1.80

   Employee      14    15    16    17    18    19    20    21    22    23    24    25    26
   Time (min)  0.79  0.21  0.80  0.26  0.63  0.36  2.03  1.42  1.28  0.82  2.16  0.05  0.04

   Employee      27    28    29    30    31    32    33    34    35    36    37    38    39
   Time (min)  1.49  0.66  2.03  1.00  0.39  0.34  0.01  0.10  1.10  0.24  0.26  0.45  0.17

   Employee      40    41    42    43    44    45    46    47    48    49    50
   Time (min)  4.29  0.80  5.50  4.91  0.35  0.36  0.90  1.03  1.73  0.38  0.48

SOLUTION:
Null hypothesis: The data follows an exponential distribution.
Alternative hypothesis: The data does not follow an exponential distribution.
Sample mean = 1.206
Estimated λ = 1/1.206 = 0.829
Since we want 6 intervals, each interval will have 1/6 (16.67%) of the probability mass.

   F(x)     Interval for x     Observed freq   Expected freq   (O-E)²/E
   0.1667   [0, 0.220)               8            8.3333         0.013
   0.3333   [0.220, 0.489)          11            8.3333         0.853
   0.5      [0.489, 0.836)           9            8.3333         0.053
   0.6667   [0.836, 1.325)           5            8.3333         1.333
   0.8333   [1.325, 2.161)          10            8.3333         0.333
   1.000    [2.161, ∞)               7            8.3333         0.213

The interval endpoints can be calculated using the inverse CDF, F⁻¹(p) = -ln(1-p)/λ, where p is the cumulative probability in the F(x) column. So, for the first row, the upper endpoint is -ln(1-0.1667)/0.829 = 0.2199.
Since we created intervals with equal probability mass, the expected frequency is the same for every interval. It can be calculated as 1/6 × (sample size) = 1/6 × 50 = 8.3333.
C = the sum of the last column = 2.8
Degrees of freedom = k - s - 1 = 6 - 1 - 1 = 4, where
   k = number of bins
   s = number of estimated parameters
Critical value = χ²(4, 0.05) = 9.488
Since C < χ²(4, 0.05), we do not reject the null hypothesis. The service times are consistent with an exponential distribution with service rate 0.829 per minute.

4.
At a small store, you record the service time (in minutes) for 30 transactions (shown below). How are these service times distributed? Develop and test a suitable model. Use one of the goodness-of-fit tests to decide. Use the level of significance α = 0.05.

    4.6093   2.4541   2.7272   2.6083   9.5841   3.8583
    1.1305   8.7227   4.8375   1.7347   5.2921   1.1191
    2.1489   0.2878   2.9482  11.8326   2.9973   3.0065
    2.6055   6.7692   3.3745   4.1410   2.3380   1.6949
   12.1024  14.3956   2.0066   2.6579   0.3578   4.1612

SOLUTION:
Since we're dealing with service times, it's good to start with the exponential distribution.
Null hypothesis: The service times follow an exponential distribution.
Alternative hypothesis: The service times do not follow an exponential distribution.
Sample mean = 4.28346
Estimated λ = 1/4.28346 = 0.233
I'll use the chi-square goodness-of-fit test with 5 intervals. Thus, each interval will have 20% of the probability mass, and the expected frequency in each interval is 0.2 × 30 = 6.

   F(x)   Interval for x       Observed freq   Expected freq   (O-E)²/E
   0.2    [0, 0.9558)                2               6          2.666667
   0.4    [0.9558, 2.1881)           6               6          0
   0.6    [2.1881, 3.9249)          11               6          4.166667
   0.8    [3.9249, 6.894)            6               6          0
   1      [6.894, ∞)                 5               6          0.166667

C = 7
Degrees of freedom = k - s - 1 = 5 - 1 - 1 = 3
Critical value = χ²(3, 0.05) = 7.815
Since C < χ²(3, 0.05), we do not reject the null hypothesis. The service times are consistent with an exponential distribution with service rate 0.233 per minute.

Output Analysis

1. The small store from #4 above desires their service time to be faster, closer to 2.5 minutes. You implement a simulation of their store. Over 10 runs of 30 customers each, you record the average service time (in minutes):

   Run           1       2       3       4       5       6       7       8       9      10
   Time (min)  6.4678  5.6306  5.3717  6.9439  3.7322  5.2842  5.6135  3.9969  6.1131  4.2907

The store owner has thoughts on how to improve service times. You implement these thoughts in the simulated system and rerun the simulation for 10 runs of 30 customers each. You do your best to keep the same random numbers in run i this time as run i had in the first set of runs. The average service times recorded were:

   Run           1       2       3       4       5       6       7       8       9      10
   Time (min)  3.1879  2.9015  2.7428  3.9970  3.3992  3.4287  3.5636  4.7058  3.1817  3.1958

Is there a difference in service times?
Construct the appropriate 95% confidence interval to decide.

SOLUTION:
Since the runs of the two simulations used the same random numbers, we can use the paired-t approach. To do this, take the difference between run i of the first simulation and run i of the second, then compute a confidence interval for that set of differences.

   Run           1       2       3       4       5       6       7       8       9      10
   Difference  3.2799  2.7291  2.6289  2.9469  0.333   1.8555  2.0499  -0.709  2.9314  1.0949

Sample mean = 1.914
Sample variance = 1.691
Standard error = sqrt(1.691/10) = 0.411
t(α/2, n-1) = t(0.025, 9) = 2.262
Confidence interval: 1.914 ± 2.262 × 0.411 = (0.984, 2.844)
Since the confidence interval does not contain zero, we are 95% confident that there is a difference between the two systems. Since the mean difference is positive, service times in the original system are longer than in the improved system.

2. The store owner is curious if her improvements bring the service time close enough to 2.5 minutes. Using the data above for the improvements, construct a 95% confidence interval to decide.

SOLUTION:
The confidence interval for the simulation of the improvement is:
Sample mean = 3.430
Sample variance = 0.322
Standard error = sqrt(0.322/10) = 0.179
t(α/2, n-1) = t(0.025, 9) = 2.262
Confidence interval: (3.024, 3.836)
Since the confidence interval does not contain 2.5, we are 95% confident that the improvement will not bring the mean service time down to 2.5 minutes.

3. The store owner wants to know the 95% confidence interval on the service times from your original simulation run (before the improvements). After seeing the range, she is disappointed that it is too large. She wants the confidence interval to be within 15 seconds (0.25 minutes) of the mean. How many simulation runs are needed to have a confidence interval that is 30 seconds wide?

SOLUTION:
n ≥ (z(α/2) × S / ε)² = (1.960 × sqrt(0.322) / 0.25)² ≈ 19.79, so n = 20
At least 20 runs are needed. Since 10 have already been run, we need 10 more.
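
The two t-based confidence intervals in Output Analysis problems 1 and 2 can be cross-checked with a few lines of Python. This is only a sketch, not the method used in the solutions (which cite Excel). It assumes the ten paired differences and the ten improved-system averages given above, and it takes the critical value t(0.025, 9) = 2.262 from the solutions rather than computing it. The helper name t_confidence_interval is made up for this illustration.

```python
import math
import statistics


def t_confidence_interval(samples, t_crit):
    """Return (mean, lower, upper) for a t-based confidence interval.

    t_crit is the tabulated critical value t(alpha/2, n-1). It is passed in
    rather than computed so that only the standard library is needed.
    """
    n = len(samples)
    mean = statistics.mean(samples)
    # statistics.variance uses the n-1 divisor (sample variance).
    std_err = math.sqrt(statistics.variance(samples) / n)
    half_width = t_crit * std_err
    return mean, mean - half_width, mean + half_width


T_CRIT = 2.262  # t(0.025, 9), as quoted in the solutions

# Problem 1: differences (original minus improved), one per paired run.
differences = [3.2799, 2.7291, 2.6289, 2.9469, 0.333,
               1.8555, 2.0499, -0.709, 2.9314, 1.0949]
diff_mean, diff_lo, diff_hi = t_confidence_interval(differences, T_CRIT)

# Problem 2: average service times from the improved system.
improved = [3.1879, 3.4287, 2.9015, 3.5636, 2.7428,
            4.7058, 3.9970, 3.1817, 3.3992, 3.1958]
imp_mean, imp_lo, imp_hi = t_confidence_interval(improved, T_CRIT)

print(f"Paired difference: mean={diff_mean:.3f}, CI=({diff_lo:.3f}, {diff_hi:.3f})")
print(f"Improved system:   mean={imp_mean:.3f}, CI=({imp_lo:.3f}, {imp_hi:.3f})")
```

Run as a script, this prints both intervals. The first excludes zero and the second excludes 2.5, in line with the conclusions drawn in the solutions.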