MAS113 Fundamentals of Statistics I
Practical 6

Using simulation to investigate the Normal approximation to the binomial distribution

For a specified value of p we can compare the grouped histogram of simulated binomial data with the normal curve, to show that the distribution of the binomial tends to that of a normal random variable as the parameter n increases.

Use Calc → Random Data → Binomial to simulate 1000 rows of data from a binomial distribution with Number of trials n = 10 and probability of success p = 0.2, and store the values in C1, so that C1 holds the simulations for X. To see a histogram with the normal approximation, use Stat → Basic Statistics → Display Descriptive Statistics..., select the column of binomial values in C1 and, in Graphs..., tick Histogram of data, with normal curve, then OK, OK. Look at the histogram.

Now simulate 1000 values from the binomial distribution, still with p = 0.2, but use n = 50. Look at the histograms again. As n gets larger the histogram should look closer to the normal curve. Repeat with p = 0.5. Erase the data using Data → Erase variables and close the graphs.

Illustrating the Central Limit Theorem via simulations

We know from the lectures that if we take the sample mean of n independent random variables with finite mean and variance, then as n increases the distribution of the sample mean becomes more like a normal distribution. To investigate this we generate 1000 values from an Exp(1) random variable in each of columns c1-c16 (you can do this in one step by entering c1-c16 as the columns for storing the random data). You can obtain 1000 simulations of the sample mean of 2 observations using Calc → Row Statistics: tick 'mean' and enter c1-c2 in Input Variables. Store the values in C17. Repeat with means based on columns c1-c4, c1-c8 and c1-c16, storing the results in c18, c19 and c20 respectively.
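For readers without Minitab, the column layout described above can be mimicked in Python. This is only a rough sketch using the standard library; the seed, the variable names and the `skewness` helper are illustrative assumptions, not part of the practical. For the mean of k independent Exp(1) values, theory gives mean 1, standard deviation 1/sqrt(k) and skewness 2/sqrt(k), so the skewness should shrink towards 0 as k grows.

```python
import random
import statistics

random.seed(113)  # arbitrary seed (an assumption) so the run is reproducible

N_ROWS = 1000  # simulated rows, as in the practical
N_COLS = 16    # analogue of columns c1-c16

# Each row holds 16 independent Exp(1) draws.
rows = [[random.expovariate(1.0) for _ in range(N_COLS)] for _ in range(N_ROWS)]


def skewness(xs):
    """Coefficient of skewness from central moments: m3 / m2**1.5."""
    m = statistics.fmean(xs)
    m2 = statistics.fmean((x - m) ** 2 for x in xs)
    m3 = statistics.fmean((x - m) ** 3 for x in xs)
    return m3 / m2 ** 1.5


# Row means over the first k columns play the role of c17-c20.
summary = {}
for k in (2, 4, 8, 16):
    means = [statistics.fmean(row[:k]) for row in rows]
    summary[k] = (statistics.fmean(means), statistics.stdev(means), skewness(means))
    print(f"k={k:2d}  mean={summary[k][0]:.3f}  "
          f"sd={summary[k][1]:.3f}  skewness={summary[k][2]:.3f}")
```

The printed skewness is a numerical counterpart to the visual comparison with the normal curve: a symmetric, normal-looking histogram corresponds to a skewness near 0.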
Use the histogram with the normal curve (as for the binomial) to see how well the distribution of the sample mean of 2, 4, 8 and 16 independent exponentials is approximated by the normal distribution.

Using simulation to investigate the joint distribution of the sample mean and variance when the sample is from a normal distribution

We obtain 1000 simulations of values of $Z_1, \ldots, Z_9$, which are an independent sample of size 9 from the $N(0, 1)$ distribution. We then compute the corresponding simulations for $U = \sqrt{9}\,\bar{Z}$ and $V = 8S^2 = \sum_{j=1}^{9} (Z_j - \bar{Z})^2$, and demonstrate that $U$ and $V$ are independent with $U \sim N(0, 1)$ and $V \sim \chi^2_8$.

Use Calc → Random Data → Normal to simulate 9 columns each of 1000 rows of data from a normal distribution with mean zero and standard deviation one, and store the values in C1-C9. Each row of columns C1-C9 then holds a simulation of values of $Z_1, \ldots, Z_9$. Now use Calc → Row Statistics and tick the box for the mean to store the corresponding simulated values of the sample mean $\bar{Z}$ in C10. Repeat, but tick the box for standard deviation to store simulated values of the sample standard deviation $S = \sqrt{S^2}$ in C11. Now use the calculator so that C12=3*C10 and C13=8*(C11**2) store the values of $U = 3\bar{Z}$ and $V = 8S^2$ for the simulations. Next use Calc → Random Data → Chi-squared to store 1000 simulated values from $\chi^2_8$ in column C14 for comparison with the simulations for $V$.

We first look at independence of $U$ and $V$. Do a scatterplot. What should this look like if $U$ and $V$ are independent? Calculate the sample coefficient of correlation (exclude the p-value). Copy these into your report and comment on the results.

Next we look to see if the distribution of $U$ is $N(0, 1)$. Simulations for $U$ are in column C12. Use Stat → Basic Statistics → Display Descriptive Statistics to display only the sample size, mean, standard deviation and coefficient of skewness for the data in C12 and to plot the histogram with normal curve.
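The construction of U and V and the correlation check described above can be sketched in Python as follows. This is a minimal illustration using the standard library only; the seed, the variable names and the hand-written `correlation` helper are assumptions made for the example. A sample correlation near 0 is consistent with independence but does not prove it, which is why the practical also asks for a scatterplot.

```python
import math
import random
import statistics

random.seed(2024)  # arbitrary seed (an assumption) for reproducibility

N_ROWS, N = 1000, 9

u_vals, v_vals = [], []
for _ in range(N_ROWS):
    z = [random.gauss(0.0, 1.0) for _ in range(N)]  # one row of C1-C9
    zbar = statistics.fmean(z)                      # analogue of C10
    s = statistics.stdev(z)                         # analogue of C11
    u_vals.append(math.sqrt(N) * zbar)              # C12 = 3*C10
    v_vals.append((N - 1) * s ** 2)                 # C13 = 8*(C11**2)


def correlation(xs, ys):
    """Pearson sample correlation coefficient."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = correlation(u_vals, v_vals)
u_mean, u_sd = statistics.fmean(u_vals), statistics.stdev(u_vals)
print(f"corr(U, V) = {r:.3f}")
print(f"U: n={len(u_vals)}  mean={u_mean:.3f}  sd={u_sd:.3f}")
```

If U is N(0, 1), the printed mean should be close to 0 and the standard deviation close to 1.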
Also look at a test for normality of the data in C12 using Stat → Basic Statistics → Normality Test to get a probability plot and a test for normality (use the Kolmogorov-Smirnov test). The hypothesis being tested is that the data are from a normal distribution. The probability plot is of the observed ordered values against the expectation of the ordered values (obtained if the normality hypothesis is correct), which should be very close to the straight line if $U$ has a normal distribution. Copy the results into your report and comment.

Finally we look to see if the distribution of $V$ is $\chi^2_8$. Display descriptive statistics for C13 and C14, showing only the sample size, mean, standard deviation and coefficient of skewness. Note that a $\chi^2_n$ distribution has $\mu = n$, $\sigma^2 = 2n$ and $\sqrt{\beta_1} = \sqrt{8/n}$, so the measures from your simulations should be close to these (with $n = 8$). Also plot histograms on the same panel for columns C13 and C14. Use these to compare simulations of $V$ in C13 with simulations from $\chi^2_8$ in C14. Copy the results into your report and comment.
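The comparison of descriptive statistics with the theoretical chi-squared values can be sketched in Python as below. It is a hedged illustration, not the practical's method: the seed, names and `describe` helper are assumptions, the chi-squared draws use the standard fact that a chi-squared variable with n degrees of freedom is a Gamma variable with shape n/2 and scale 2, and the skewness is computed from central moments, so it may differ slightly from the value Minitab reports.

```python
import math
import random
import statistics

random.seed(8)  # arbitrary seed (an assumption) for reproducibility

N_ROWS, DF = 1000, 8


def describe(xs):
    """Return (mean, standard deviation, coefficient of skewness)."""
    m = statistics.fmean(xs)
    m2 = statistics.fmean((x - m) ** 2 for x in xs)
    m3 = statistics.fmean((x - m) ** 3 for x in xs)
    return m, statistics.stdev(xs), m3 / m2 ** 1.5


# Analogue of C13: V = 8 * S^2 from samples of nine N(0, 1) values.
v_vals = []
for _ in range(N_ROWS):
    z = [random.gauss(0.0, 1.0) for _ in range(DF + 1)]
    v_vals.append(DF * statistics.variance(z))

# Analogue of C14: direct chi-squared(8) draws via Gamma(shape 4, scale 2).
chi_vals = [random.gammavariate(DF / 2, 2.0) for _ in range(N_ROWS)]

# Theoretical mean n, standard deviation sqrt(2n) and skewness sqrt(8/n).
theory = (DF, math.sqrt(2 * DF), math.sqrt(8 / DF))
v_stats = describe(v_vals)
chi_stats = describe(chi_vals)

for label, stats in (("theory", theory), ("V (C13)", v_stats), ("chi2 (C14)", chi_stats)):
    print(f"{label:11s} mean={stats[0]:.3f}  sd={stats[1]:.3f}  skewness={stats[2]:.3f}")
```

With n = 8 the theoretical row is mean 8, standard deviation 4 and skewness 1, and both simulated rows should be close to it.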