Survey

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Survey

Document related concepts

Transcript

Normal, Binomial, Poisson Distribution/CLT/Normality Checking (R Textbook Section 3.5) Four fundamental items can be calculated for a statistical distribution: Density or point probability (starting with d) Cumulated probability, distribution function (starting with p) Quantiles (starting with q) Random numbers (starting with r) In addition, norm represents normal distribution, binom represents binomial distribution and pois represents Poisson distribution. For example, dnorm, pnorm, qnorm, and rnorm calculate density, probability, quantile and random number from a normal distribution, respectively. Example: a) Let X be a normal random variable with given mean and sd. Then, Probability that X is less than or equal to x is pnorm(x,mean,sd) 100pth percentile of X is qnorm(p,mean,sd) b) Let X be a binomial random variable with n trials and success rate p. Then, Probability that X is less than or equal to x is pbinom(x,n,p) 100pth percentile of X is qbinom(p,n,p) Probability that X is equal to x is dbinom(x,n,p) c) Let X be a Poisson random variable with location parameter ld. Then, Probability that X is less than or equal to x is ppois(x,ld) 100pth percentile of X is qpois(p,ld) Probability that X is equal to x is dpois(x,ld) Central Limit Theorem(CLT) Consider an experiment of flipping an unbalanced coin (with .3 chance to get head up). Each time the result is either X =0 (tail) or 1 (head). a) What is the distribution of X? Is it bell shaped (normal)? b) Repeat the experiment 40 times. Let X-bar be the mean of the 40 results. By CLT, will it be approximately normal? c) Simulate the distribution of X-bar. Here is the R code: # simulate 5000 sums of 40 flipping results xbar<-rbinom(5000,40,.3) # mean is the sum over trial size xbar<-xbar/40 # draw the histogram of the simulated data hist(xbar) Exercise: Redo b) and c) with only 10 times of coin flipping. 1 Normality Checking By graphs: If the data does sampled from a normal distribution, its normal Q-Q plot will be more or less a straight line. By tests: later Example: a) Based on the data in Table 3.1 again, check if it is reasonable to assume the math scores (of all third to fifth graders) follow a normal curve. R Code: math<-tb3dot1$Math qqnorm(math,main="normal Q-Q Plot of math score") qqline(math,col=2) b) Simulate a random sample of 20 from norm(0,1) and check the normality of the simulated data by the normal Q-Q plot. What is the result? R Code: data<-rnorm(20,0,1) qqnorm(data) qqline(data,col=2) 2