Part I – Probability, the Normal Distribution, and the Binomial Distribution

a. Recall the Jimmy Rhodes Auto Supply problem from the Lecture #0 review notes.

Question 1: What is the probability of stock-outs occurring during the 2-day lead time using the theoretical model?

The probability of a stock-out is the probability that demand during the 2-day lead time exceeds the quantity on hand (20 gallons), i.e., P(X > 20). (Source: Professor Schroeder’s Lecture #0, page 20.) With demand distributed N(15, 6), the corresponding Z-score for 20 gallons is

Z = (X − µ)/σ = (20 − 15)/6 ≈ 0.83

Then P(X > 20) = P(Z > 0.83) = 0.2033 = 20.33%.

Question 2: What % of the simulated demands exceeds 20, i.e., what is the “probability” of stocking out based on the simulated data?

From the random simulation (see the Appendix A table, Columns B and C), 9 of the 40 simulated demands are greater than 20. Therefore, the percentage of simulated demands that exceed 20 is

9/40 = 0.225 = 22.5%

Based on the simulated data, the “probability” of stocking out is 22.5%.

Question 3: Are your answers to questions 1 and 2 the same? Comment/Explain.

The answers to Questions 1 and 2 are different. In Question 1, the two-day demand from the past record (the population) was assumed to follow a normal distribution, and we computed the probability of a stock-out during the 2-day lead time from that population model. In Question 2, we computed the percentage of stock-outs from a sample of only 40 simulated demands. A probability computed from a population model and a percentage computed from a sample usually differ. (Remember, probability is a long-run percentage, i.e., the percentage as determined from an “infinitely” large number of trials.)

Question 4: Use both the theoretical distribution, N(15, 6), and the simulated data to calculate the middle 95% of demand values. Show your work.

Using the theoretical distribution: The middle 95% of a normal distribution lies within about 2 (more precisely, 1.96) standard deviations of the mean.
That is, P(−1.96 ≤ Z ≤ 1.96) = 0.95. Since X = µ + Zσ,

X1 = 15 + (−1.96)(6) = 3.24 gallons
X2 = 15 + (1.96)(6) = 26.76 gallons

Therefore, the middle 95% of demand values lie between 3.24 and 26.76 gallons.

Using the simulated data: The middle 95% leaves 2.5% in each tail, so we take the 2.5th and 97.5th percentiles of the simulated motor oil demand, as shown below:

One Variable Summary    Motor Oil Demand (Data Set #1)
Mean                    14
Std. Dev.               8
2.50%                   0
97.50%                  27

The simulation shows that the middle 95% of demand values lie between 0 and 27 gallons.

You could, of course, have counted from your simulated data. The 2.5th percentile is in position (41)(.025) = 1.025 ≈ 1 in the data set. That means, find the smallest data value in your simulated demands. Likewise, the 97.5th percentile is in position (41)(.975) = 39.975 ≈ 40. That means, find the largest data value in the simulated demands.

b. Binomial distributions. Let’s simulate some observations from a Binomial distribution with probability of success equal to 0.25.

Question 5: If we simulate 40 random observations, how many successes do we expect to see?

n = 40
p = 0.25
E = n*p = 40*0.25 = 10

We expect to see 10 successes.

Question 6: How many successes are there in your simulated data? Is your answer the same as your answer to Question 5? Comment/Explain. What proportion (%) of successes is this?

There are 14 successes in the simulation (see the Appendix A table, Column D), which is different from the answer to Question 5. The expected value (from Question 5) is the average number of successes in 40 observations, assuming the theoretical probability of success (0.25). Because our simulation includes only 40 trials, the observed proportion of successes has not stabilized at the theoretical probability. The simulation contains 14/40*100% = 35% successes, which is higher than the given 25% probability of success.
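The normal-model answers (Questions 1 and 4) and the expected count (Question 5) can be checked with a short Python sketch. It assumes only the N(15, 6) model and the Binomial(n = 40, p = 0.25) setup stated above; the function names, the seed, and the choice of 1000 repeated runs are illustrative assumptions, not part of the assignment. Because the code uses the unrounded Z-score, it gives about 0.202 for P(X > 20) rather than the table value 0.2033 obtained with Z rounded to 0.83.

```python
import random
from statistics import NormalDist

# Theoretical model from the text: 2-day demand ~ N(mean 15, sd 6) gallons.
DEMAND = NormalDist(mu=15, sigma=6)
ON_HAND = 20  # gallons on hand during the lead time


def stockout_probability(on_hand=ON_HAND):
    """P(X > on_hand) under the normal model, using the unrounded Z-score."""
    return 1.0 - DEMAND.cdf(on_hand)


def middle_interval(coverage=0.95):
    """Demand values that bracket the central `coverage` share of the model."""
    tail = (1.0 - coverage) / 2.0
    return DEMAND.inv_cdf(tail), DEMAND.inv_cdf(1.0 - tail)


def simulated_success_counts(n=40, p=0.25, runs=1000, seed=0):
    """Repeat the n-trial binomial simulation `runs` times; return each count.

    The seed and number of runs are arbitrary illustrative choices.
    """
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) for _ in range(runs)]


if __name__ == "__main__":
    print(f"P(X > 20)  = {stockout_probability():.4f}")
    low, high = middle_interval()
    print(f"middle 95% = {low:.2f} to {high:.2f} gallons")
    counts = simulated_success_counts()
    print(f"mean count = {sum(counts) / len(counts):.2f} (expected 40 * 0.25 = 10)")
    print(f"counts range from {min(counts)} to {max(counts)}")
```

Across many repeated runs the counts scatter around 10, which is why a single run of 40 trials can produce a value such as 14 without contradicting the expected value.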
Question 7: What is the exact theoretical probability of observing fewer than 2 successes in 40 trials if the probability of success is 0.25? What is the approximate theoretical probability of observing fewer than 2 successes in 40 trials when the probability of success is 0.25 using the Normal approximation to the Binomial? Explain how you could use simulation to answer such probability questions.

Let X = the number of successes observed. The exact theoretical probability of observing fewer than 2 successes in 40 trials is:

P(X < 2) = P(X = 0) + P(X = 1)
         = 40C0 (0.25)^0 (1 − 0.25)^40 + 40C1 (0.25)^1 (1 − 0.25)^(40−1)
         = 1.441 × 10^−4 ≈ 0

The theoretical expected value and standard deviation are:

µ = np = 40(0.25) = 10
σ = √(np(1 − p)) = √(40 × 0.25 × 0.75) ≈ 2.74

Using the normal approximation to the Binomial with the continuity correction,

Z = ((X − 0.5) − µ)/σ = (1.5 − 10)/2.74 ≈ −3.10

P(Z < −3.10) = 0.0010

The approximate theoretical probability of observing fewer than 2 successes is 0.0010.

To determine the simulated % of times that fewer than 2 successes are observed, we would run multiple simulations of the Binomial distribution, each with n = 40 and p = 0.25. For each simulated set of 40 trials, count the number of successes. Then use the counts from all the simulations to find the % of simulations in which the count of successes was less than 2.

Appendix A
Excel Worksheet for Part I

Row   Motor Oil Demand (B)   Demand>20? 1=Yes, 0=No (C)   Binomial Data, P=0.25 (D)
2     11                     0                            1
3     21                     1                            1
4     10                     0                            0
5     21                     1                            0
6     18                     0                            1
7     3                      0                            0
8     25                     1                            0
9     13                     0                            1
10    10                     0                            1
11    0                      0                            1
12    31                     1                            0
13    24                     1                            0
14    12                     0                            0
15    17                     0                            0
16    24                     1                            1
17    19                     0                            0
18    17                     0                            0
19    6                      0                            1
20    15                     0                            1
21    11                     0                            1
22    17                     0                            0
23    6                      0                            0
24    13                     0                            0
25    9                      0                            0
26    2                      0                            0
27    10                     0                            1
28    1                      0                            1
29    21                     1                            1
30    19                     0                            0
31    13                     0                            0
32    27                     1                            0
33    16                     0                            0
34    2                      0                            0
35    17                     0                            0
36    14                     0                            0
37    24                     1                            0
38    10                     0                            1
39    18                     0                            0
40    8                      0                            0
41    18                     0                            0
42    Count                  9                            14

Part II – Parameter versus Statistic, and the distribution of X versus the distribution of X̄

Question 8: Our sample size is only 15. Why is the small sample size not a problem?
Define the theoretical distribution of X and the sampling distribution of X̄.

The small sample size is not a problem because the original population is Normal. The theoretical distribution of X is the normal distribution with a mean of 100 and a standard deviation of 5. The sampling distribution of X̄ is also the normal distribution with a mean of 100, but it has a standard error of

σ_X̄ = σ/√n = 5/√15 ≈ 1.291

Question 9: Define Parameter. Define Statistic.

The one variable summary for X (Data Set #1) and X̄ (Data Set #2) is shown below:

One Variable Summary   X Sample 1 (Data Set #1)   Sample Avg (Data Set #2)
Mean                   100.84                     99.77
Std. Dev.              4.39                       1.32
Skewness               -0.28                      0.25
Median                 101.86                     99.60
Minimum                94.10                      97.65
Maximum                106.83                     102.69
Count                  15.00                      50.00
1st Quartile           96.03                      98.76
3rd Quartile           103.35                     100.66

A parameter is a numerical summary measure that describes the population, i.e., the set of ALL possible items of interest. A statistic is a numerical summary measure that describes a sample, i.e., a subset of items selected from the population.

Question 10: Use the results of the one variable summary for X and the definition of the theoretical distribution of X to illustrate the difference between a parameter and a statistic.

The parameters of the original variable, X (i.e., of the population), are µ = 100 and σ = 5. In the above “One Variable Summary” table for Data Set #1 (Sample 1), the summary numbers are statistics because they are based on only 1 sample of 15 observations. Their values differ from the parameter values: the sample average, x̄, is 100.84, and the sample standard deviation, s, is 4.39.

Question 11: Define σ, s, and σ_X̄. Use your results to illustrate the difference between them.

σ is the standard deviation of the population. s is the standard deviation of the sample. σ_X̄ is the standard deviation of the sample average, X̄. From our previous discussion, σ is 5, s is 4.39, and σ_X̄ is 1.291.
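The distinction between σ, s, and σ_X̄ can also be seen by rerunning the experiment in code. The following Python sketch assumes the setup described above (a N(100, 5) population, one sample of 15 values, and 50 sample averages); the function names and the seed are hypothetical, and the simulated numbers will not reproduce the 4.39 and 1.32 from the summary table, which came from a different random draw.

```python
import math
import random
import statistics

MU, SIGMA = 100, 5   # population parameters stated in the text
N, REPS = 15, 50     # sample size and number of sample averages


def draw_sample(rng, n=N):
    """One simulated sample of n values from the N(MU, SIGMA) population."""
    return [rng.gauss(MU, SIGMA) for _ in range(n)]


def summarize(seed=1):
    """Return sigma, one sample's s, the theoretical sigma_xbar, and the
    standard deviation of REPS simulated sample averages.

    The seed is an arbitrary choice made only so the run is repeatable.
    """
    rng = random.Random(seed)
    first_sample = draw_sample(rng)
    averages = [statistics.mean(draw_sample(rng)) for _ in range(REPS)]
    return {
        "sigma": SIGMA,                           # parameter (population)
        "s": statistics.stdev(first_sample),      # statistic (one sample)
        "sigma_xbar": SIGMA / math.sqrt(N),       # standard error, 5/sqrt(15)
        "sd_of_averages": statistics.stdev(averages),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name:>15}: {value:.3f}")
```

In a typical run, s lands near 5 and the standard deviation of the 50 averages lands near 1.291, mirroring the 4.39 and 1.32 reported in the summary table.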
Question 12: What is the theoretical probability that X is less than 97.5? What % of the values in your first sample is less than 97.5? Comment on the disparity.

Since the random variable X follows a normal distribution with µ = 100 and σ = 5, the theoretical probability that X is less than 97.5 is

P(X < 97.5) = P(Z < (97.5 − 100)/5) = P(Z < −0.50) = 0.3085 = 30.85%

In the first sample, 4 out of 15 simulated values are less than 97.5. That is, 26.7% of the values are less than 97.5. The theoretical probability and the actual percentage from the sample are fairly close. There is usually a difference between the theoretical probability and the percentage computed from a set of sample data. Again, theoretical probability assumes an “infinite” number of observations; we have only 15 in a single sample.

Question 13: What is the theoretical probability that X̄ is less than 97.5? What % of the 50 average values is less than 97.5? Again, comment on the disparity.

The theoretical probability that X̄ is less than 97.5 is 0.0262, as calculated below.

σ_X̄ = σ/√n = 5/√15 = 1.291

P(X̄ < 97.5) = P(Z < (97.5 − 100)/1.291) = P(Z < −1.94) = 0.0262 = 2.62%

None of the 50 average values in my simulations is less than 97.5. Therefore, 0% of the 50 average values are less than 97.5. The disparity, again, is accounted for by the relatively small number (50) of repetitions.

Histograms: [Figure: histograms of X and X̄]

Question 14: Explain why the probability found in question 13 is so much smaller than the probability found in question 12.

The probability found in question 13 is much smaller than the probability found in question 12 because the standard deviation used in question 13 is much smaller than the one used in question 12. For the same deviation from the mean (100 − 97.5 = 2.5), the smaller standard deviation in question 13 (1.291 vs. 5 in question 12) makes the absolute value of the Z-score much larger. This means far fewer values of X̄ fall below 97.5 in question 13.
Therefore, the probability found in question 13 is much smaller than the probability found in question 12.
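The comparison in Questions 12 to 14 can be reproduced numerically. This Python sketch assumes only the values given above (µ = 100, σ = 5, n = 15, cutoff 97.5); the function names are illustrative. It uses unrounded Z-scores, so the result for X̄ comes out near 0.0264 rather than the table value 0.0262 obtained with Z rounded to −1.94.

```python
import math
from statistics import NormalDist

MU, SIGMA, N, CUTOFF = 100, 5, 15, 97.5  # values stated in the text


def z_score(cutoff, sd):
    """Number of standard deviations the cutoff lies from the mean."""
    return (cutoff - MU) / sd


def prob_below(cutoff, sd):
    """P(value < cutoff) for a normal distribution with mean MU and given sd."""
    return NormalDist(mu=MU, sigma=sd).cdf(cutoff)


def compare():
    """Return (Z, probability) for a single X and for the average of N values."""
    sd_single = SIGMA                 # sigma, used in Question 12
    sd_mean = SIGMA / math.sqrt(N)    # sigma_xbar, used in Question 13
    return {
        "X": (z_score(CUTOFF, sd_single), prob_below(CUTOFF, sd_single)),
        "X-bar": (z_score(CUTOFF, sd_mean), prob_below(CUTOFF, sd_mean)),
    }


if __name__ == "__main__":
    for name, (z, p) in compare().items():
        print(f"{name:>5}: Z = {z:6.3f}, P(< {CUTOFF}) = {p:.4f}")
```

The same 2.5-unit deviation is half a standard deviation for a single X but almost two standard errors for X̄, which is the whole reason the second probability is so much smaller.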