Report Writing

A report should be self-explanatory: it should be capable of being read and understood without reference to the original project description. Thus, for each question, it should contain all of the following:
(a) a statement of the problem;
(b) a full and careful description of how it was investigated;
(c) all relevant results, including graphical and numerical analyses; variables should be carefully defined, and figures and tables should be properly labelled, described and referenced;
(d) relevant analysis, discussion, and conclusions.

It should be written in the third person.
NOT: I think the Central Limit Theorem is true for this example because I see that the graph is normal.
INSTEAD: It can be clearly seen that the graph displays a normal distribution, confirming that the Central Limit Theorem holds.

The Central Limit Theorem

Let X1, X2, …, Xn be independent, identically distributed random variables with mean µ and variance σ². Let
S = X1 + X2 + … + Xn
Then elementary probability theory tells us that E(S) = nµ and var(S) = nσ². The Central Limit Theorem (CLT) further states that, provided n is not too small, S has an approximately normal distribution with the above mean nµ and variance nσ². In other words,
S approx ~ N(nµ, nσ²)
The approximation improves as n increases.

We will use R to demonstrate the CLT. Let X1, X2, …, X6 come from the uniform distribution U(0,1), whose density is 1 on [0,1] and 0 elsewhere. For any uniform distribution on [A,B], the mean µ is equal to (A + B)/2 and the variance σ² is equal to (B - A)²/12. So for our distribution, µ = 1/2 and σ² = 1/12.

The CLT therefore states that S should have an approximately normal distribution with mean nµ (i.e. 6 × 0.5 = 3) and variance nσ² (i.e. 6 × 1/12 = 0.5). This gives a standard deviation of 0.7071. In other words,
S approx ~ N(3, 0.7071²)

Generate 10 000 results in each of six vectors for the uniform distribution on [0,1] in R.
> x1=runif(10000)
> x2=runif(10000)
> x3=runif(10000)
> x4=runif(10000)
> x5=runif(10000)
> x6=runif(10000)

Let S = X1 + X2 + … + X6

> s=x1+x2+x3+x4+x5+x6
> hist(s,nclass=20)

Consider the mean and standard deviation of S.

> mean(s)
[1] 3.002503
> sd(s)
[1] 0.7070773

This agrees with our earlier calculations.

One way of examining whether the distribution is approximately normal is to produce a normal Q-Q plot. This is a plot of the sorted values of the vector s (the "data") against what is in effect an idealised sample of the same size from the N(0,1) distribution. If the CLT holds, i.e. if S is approximately normal, then the plot should show an approximate straight line with intercept equal to the mean of S (here 3) and slope equal to the standard deviation of S (here 0.707).

> qqnorm(s)

Reading the slope from the plot gives (4.4 - 1.8)/4 = 0.65, i.e. 0.7 to 1 DP.

From these plots it seems that agreement with the normal distribution is very good, despite the fact that we have only taken n = 6, i.e. the convergence is very rapid.
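The same demonstration can be written more compactly. The following is a minimal sketch, not part of the original session: the names n_terms, n_sims, mu_s and sd_s, and the seed, are assumptions introduced here for illustration. It draws all the uniforms into one matrix, sums across each row to obtain s, and overlays on the Q-Q plot the straight line that the CLT predicts (intercept 3, slope 0.7071), so that the agreement can be judged by eye rather than by reading points off the axes.

```r
# Sketch only: a compact re-run of the simulation described above.
set.seed(1)        # assumed seed, used only to make the run reproducible
n_terms <- 6       # number of uniforms summed (n in the text)
n_sims  <- 10000   # number of simulated values of S

# Each row holds one draw of X1, ..., X6; the row sums are the simulated S.
x <- matrix(runif(n_sims * n_terms), nrow = n_sims, ncol = n_terms)
s <- rowSums(x)

# Theoretical values for a sum of n U(0,1) variables: mean n/2, variance n/12.
mu_s <- n_terms * 0.5
sd_s <- sqrt(n_terms / 12)

# Compare the sample values with the theoretical ones.
print(c(sample_mean = mean(s), theory_mean = mu_s))
print(c(sample_sd = sd(s), theory_sd = sd_s))

# Q-Q plot with the line predicted by the CLT:
# intercept = mean of S, slope = standard deviation of S.
qqnorm(s)
abline(a = mu_s, b = sd_s, col = "red")
```

Changing n_terms (for example to 2 or 30) and re-running gives a quick way to see how closely the points follow the line for smaller or larger n.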