Name:_______________________________

EGR 252 – APPLICATIONS OF ENGINEERING MATH
EXERCISE 3
Due: March 16, 2006

Part I: Central limit theorem

1. Download the file “exercise3.xls” from the course website. Use this file as a guide as you complete this portion of the exercise.
2. Open a new file for your work. Be sure your name and other identifying information are in the upper left corner of the worksheet.
3. Use the random number generator function to create 500 normally distributed random numbers with a mean of 5 and a standard deviation of 2.887, beginning in cell A6, as follows:
   a. Choose “Data Analysis” from the “Tools” menu.
   b. Choose “Random Number Generation” from the “Data Analysis” menu. (Note that if “Random Number Generation” is not an option on the “Data Analysis” menu, you will need to add in the Analysis ToolPak from the “Add-Ins” option on the “Tools” menu.)
   c. Complete the “Random Number Generation” dialog box as shown below. Be sure to choose “Normal” in the distribution drop-down menu and to set the “Output Range” to $A$6. Select “OK”.
4. Create a histogram of your 500 data points. (Note: these should NOT be the same numbers you find in the example file.) Comment on the shape of the histogram. Is it “classically” normal in shape? How does it differ from the theoretical “bell-shaped curve”?
5. Go to sheet 2 in the same workbook. Using “Random Number Generation” as before, create a set of 500 uniformly distributed random numbers between 0 and 10. Be sure to set the output range to a cell in column A on this sheet.
   a. What is the mean of a uniformly distributed variable that can take on any value between 0 and 10? What is the standard deviation?
   b. What are the mean and standard deviation of the 500 random numbers you created using the random number generator in Excel? Comment on the difference, if any, between these values and your answers in part a.
6. Create a histogram of the 500 uniformly distributed data points. Comment on the shape of the histogram.
7. Find the average of samples of 5 from the 500 data points as shown in the example file. Create a histogram of those 100 sample means. Comment on the shape of this histogram.
8. Find the average of samples of 10 from the 500 data points. Create a histogram of these 50 means. Comment on the shape of this histogram.
9. Create a table of the theoretical and actual means and standard deviations of the original 500 data points, the samples of 5, and the samples of 10. Comment on any similarities and differences you see.

Part II: Estimation and Confidence Intervals

Work with your data set from Exercise 1 to complete the following:

1. Calculate and report the appropriate statistics (i.e., the mean, median, mode, variance, and/or standard deviation) for your data set.
2. Construct a histogram of your data (be sure to have 7-15 bins). Do you have reason to conclude that your data is from a population that is normally distributed? (Note: whether you do or not, proceed from here on as if it is.)
3. Calculate the value that is halfway between the 2nd and 3rd largest values in your data set. Based on the sample mean and variance you calculated in step 1, and assuming a normal distribution, calculate the probability that X is below that number.
4. Calculate the value that is halfway between the 2nd and 3rd smallest values in your data set. Calculate the probability that X is above that number. What, if anything, does this suggest about the symmetry of your data set?
5. Calculate an appropriate 95% confidence interval for your data.
6. Use the “Random Number Generation” utility to create an “imaginary” data set of 50 normally distributed points with the same standard deviation and a mean that is ½ standard deviation above the mean of your data set (i.e., x̄2 = x̄1 + 0.5s).
7. Calculate an appropriate confidence interval for the difference in the means of your real and imaginary data sets. What can you conclude about the difference?
8. Construct a histogram of the imaginary data set. Is the normal distribution apparent in this histogram?

REPORT: Submit a written report on Thursday, March 16. Use the following format:

a. Introduction – a brief summary of the purpose of this exercise. (Note: if you are not sure, ask me!)
b. Method – briefly summarize what you did (do NOT copy the entire procedure from this document).
c. Results – narrative, graphs, and tables describing the results you achieved. Be sure you answer all questions in this assignment.
d. Conclusions – what did you learn about the central limit theorem, estimation, and confidence intervals from this exercise? What general statements can you make about the difference between theoretical results and “real” data?
e. Appendix – a printout of your Excel worksheet.
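
The comparison of theoretical and actual values asked for in Part I can also be cross-checked outside Excel. The Python sketch below is only an illustration, not part of the assigned procedure. It assumes the standard results that a uniform variable on [a, b] has mean (a + b)/2 and standard deviation (b - a)/√12, and that the mean of n independent values has standard deviation σ/√n. The helper names (uniform_theory, sample_means) and the seed are hypothetical choices made for this example, and the random numbers will not match those produced by Excel.

```python
import math
import random
import statistics


def uniform_theory(low, high):
    """Theoretical mean and standard deviation of a Uniform(low, high) variable."""
    mean = (low + high) / 2
    sd = (high - low) / math.sqrt(12)
    return mean, sd


def sample_means(data, size):
    """Average consecutive, non-overlapping groups of `size` values."""
    return [
        statistics.mean(data[i:i + size])
        for i in range(0, len(data) - size + 1, size)
    ]


# Arbitrary seed, used only so the run is repeatable (an assumption of this sketch).
rng = random.Random(252)
data = [rng.uniform(0, 10) for _ in range(500)]

mu, sigma = uniform_theory(0, 10)

# Table of theoretical vs. actual values for the raw data and the sample means.
print(f"{'data':<14}{'count':>6}{'mu':>9}{'sigma':>9}{'mean':>9}{'sd':>9}")
for label, size in [("original 500", 1), ("samples of 5", 5), ("samples of 10", 10)]:
    means = sample_means(data, size)
    theory_sd = sigma / math.sqrt(size)  # standard deviation of a mean of `size` values
    print(
        f"{label:<14}{len(means):>6}{mu:>9.3f}{theory_sd:>9.3f}"
        f"{statistics.mean(means):>9.3f}{statistics.stdev(means):>9.3f}"
    )
```

The theoretical standard deviation of the uniform variable, 10/√12 ≈ 2.887, is the same value used for the normal data in step 3 of Part I, so the two data sets share a mean and a spread and differ only in shape.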
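
The probability and confidence interval calculations in the estimation part of the exercise can be sketched in the same way. This is a hedged illustration under stated assumptions, not the required method: it uses z-based intervals from the standard normal distribution, which is reasonable for samples of about 50 points, whereas a small sample would normally call for a t critical value instead. The data set named real is a randomly generated placeholder standing in for the Exercise 1 data, and all function names are hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist


def prob_below(x, mean, sd):
    """P(X < x) for a normal variable with the given mean and standard deviation."""
    return NormalDist(mean, sd).cdf(x)


def z_interval(data, confidence=0.95):
    """z-based confidence interval for the mean (assumes a reasonably large sample)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * statistics.stdev(data) / math.sqrt(len(data))
    center = statistics.mean(data)
    return center - half_width, center + half_width


def z_interval_diff(a, b, confidence=0.95):
    """z-based confidence interval for mean(a) - mean(b), independent samples."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = statistics.mean(a) - statistics.mean(b)
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return diff - z * se, diff + z * se


# Placeholder data only: the real Exercise 1 data set is not part of this handout.
rng = random.Random(2006)
real = [rng.gauss(20, 4) for _ in range(50)]
x_bar = statistics.mean(real)
s = statistics.stdev(real)

# "Imaginary" data: same standard deviation, mean shifted up by half a standard deviation.
imaginary = [rng.gauss(x_bar + 0.5 * s, s) for _ in range(50)]

ordered = sorted(real)
upper_cut = (ordered[-2] + ordered[-3]) / 2  # halfway between 2nd and 3rd largest
lower_cut = (ordered[1] + ordered[2]) / 2    # halfway between 2nd and 3rd smallest

p_below = prob_below(upper_cut, x_bar, s)
p_above = 1 - prob_below(lower_cut, x_bar, s)

print(f"P(X < {upper_cut:.2f}) = {p_below:.4f}")
print(f"P(X > {lower_cut:.2f}) = {p_above:.4f}")
print("95% CI for the mean:", z_interval(real))
print("95% CI for the difference in means:", z_interval_diff(imaginary, real))
```

If the interval for the difference in means excludes zero, the data support a real difference between the two means at the chosen confidence level. If it includes zero, the half standard deviation shift cannot be distinguished from sampling variation with samples of this size.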