Application: Confidence Intervals for the Mean

Suppose that the random variables $Y_1, Y_2, \ldots, Y_n$ model independent observations from a distribution with mean $\mu$ and variance $\sigma^2$. Then

$$\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$$

is the sample mean. By the CLT, for large $n$, approximately

$$\bar{Y} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$

This is because the sum $\sum_{i=1}^{n} Y_i$ has mean $n\mu$ and variance $n\sigma^2$, so dividing by $n$ gives mean $n\mu/n = \mu$ and variance $n\sigma^2/n^2 = \sigma^2/n$ (for means).

Recall from Statistics 2 that, if $\sigma^2$ is estimated by the sample variance $s^2$, an approximate confidence interval for $\mu$ is given by:

$$\left(\bar{y} - z\frac{s}{\sqrt{n}},\ \bar{y} + z\frac{s}{\sqrt{n}}\right)$$

Here $\bar{y}$ is the observed sample mean, and $z$ is the standard normal quantile determined by the level of confidence required. So for 95% confidence an approximate interval for $\mu$ is given by:

$$\left(\bar{y} - 2\frac{s}{\sqrt{n}},\ \bar{y} + 2\frac{s}{\sqrt{n}}\right)$$

The value 2 is approximate; an accurate value can be obtained from tables or by using the qnorm function in R.

    > qnorm(0.975)
    [1] 1.959964
    > qnorm(0.995)
    [1] 2.575829
    > qnorm(0.025)
    [1] -1.959964

Thus in R, an approximate 95% confidence interval for the mean $\mu$ is given by

    > mean(y)+c(-1,1)*qnorm(0.975)*sqrt(var(y)/length(y))

where y is the vector of observations. A more accurate confidence interval, allowing for the fact that $s^2$ is only an estimate of $\sigma^2$, is given by the function t.test.

Example

The R vector abbey in the package MASS gives 31 determinations of nickel content (μg g⁻¹) in a Canadian syenite rock. We check whether the data are reasonably modelled by an exponential distribution.

There is no predefined function in R to construct an exponential Q-Q plot, so we have to work from first principles. qexp(ppoints(31)) gives the theoretical quantile values at 31 probability points, and sort(abbey) gives the sorted experimental values. Plotting the sorted values against the theoretical quantiles produces a Q-Q plot, with the axes labelled accordingly.

If we ignore the highest observation, the data appear to be reasonably compatible with an exponential distribution with mean around 12.5, that is, with rate 1/12.5 = 0.08, written Exp(0.08).
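The Q-Q construction described in the example can be sketched in R as follows. This is a minimal sketch, not necessarily the command used in the original notes: the variable names theo and obs and the axis labels are assumptions, and it relies on the MASS package being installed. Because qexp is used with its default rate of 1, points from an exponential sample should lie close to a straight line through the origin whose slope is roughly the mean.

```r
# Exponential Q-Q plot for the abbey data, built from first principles.
# Names (theo, obs) and axis labels are illustrative assumptions.
library(MASS)                     # provides the abbey vector

n    <- length(abbey)             # number of observations (31)
theo <- qexp(ppoints(n))          # theoretical Exp(1) quantiles at n probability points
obs  <- sort(abbey)               # sorted experimental values

# Sorted data against theoretical quantiles, with labelled axes
plot(theo, obs,
     xlab = "Theoretical exponential quantiles",
     ylab = "Sorted nickel content")

# Sample mean with the highest observation left out, for comparison
# with the slope suggested by the plot
mean_without_max <- mean(obs[-n])
mean_without_max
```

Leaving out the largest value before taking the mean matches the example's remark about ignoring the highest observation, which would otherwise dominate both the plot and the estimate.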