The Sampling Distribution of $\bar{X}$

Alan T. Arnholt
Department of Mathematical Sciences
Appalachian State University
[email protected]
Spring 2006 R Notes

Copyright © 2006 Alan T. Arnholt

Outline

The Sampling Distribution of $\bar{X}$
    Sampling Distribution of a Sample Statistic
The R Script

Sampling Distribution of a Sample Statistic

The sampling distribution of a sample statistic is the probability distribution associated with the various values that the statistic could assume in repeated sampling.

Sampling Distribution of $\bar{X}$ when Sampling from a Normally Distributed Population

Let $\bar{X}$ be the mean of a sample of size $n$ from a normally distributed population that has mean $\mu$ and standard deviation $\sigma$. For all sample sizes $n$, the sampling distribution of $\bar{X}$:

1. Is exactly normally distributed.
2. Is centered at $\mu_{\bar{X}} = \mu$, the mean of the population.
3. Has a standard deviation $\sigma_{\bar{X}} = \sigma/\sqrt{n}$, where $\sigma$ is the standard deviation of the population.
4. In other words, if $X \sim N(\mu, \sigma)$, then $\bar{X} \sim N(\mu_{\bar{X}} = \mu, \sigma_{\bar{X}} = \sigma/\sqrt{n})$.

Central Limit Theorem

Let $\bar{X}$ be the mean of a sample of size $n$ from a population with an unknown distribution. When $n$ is relatively large, the sampling distribution of $\bar{X}$ is approximately normally distributed. The approximation becomes better as the sample size increases.

Let $\bar{X}$ be the mean of a sample of size $n$ from a distribution with mean $\mu$ and standard deviation $\sigma$. For sufficiently large sample sizes $n$, the sampling distribution of $\bar{X}$:

1. Is approximately normally distributed.
2. Is centered at $\mu_{\bar{X}} = \mu$, the mean of the population.
3. Has a standard deviation $\sigma_{\bar{X}} = \sigma/\sqrt{n}$, where $\sigma$ is the standard deviation of the population.
4. In other words, if $X \sim (\mu, \sigma)$, that is, $X$ has some distribution with mean $\mu$ and standard deviation $\sigma$, then, provided $n$ is sufficiently large, $\bar{X}$ is approximately $N(\mu_{\bar{X}} = \mu, \sigma_{\bar{X}} = \sigma/\sqrt{n})$.

What value of n is sufficiently large?

To see how the shape of the sampling distribution is affected by the shape of the population and the sample size, we will simulate the sampling distribution of $\bar{X}$ for three different sample sizes (5, 10, and 30) from three different distributions. Each simulation uses 20000 realizations of $\bar{X}$, one computed from each of 20000 different random samples.
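A simulation of this kind can be sketched in a few lines of R. This is only an illustration under stated assumptions: the helper name sim.xbar, the seed, and the three populations used here (N(50, 15), U(0, 1), and an exponential with rate 1) are choices made for the example, not necessarily the ones behind the slides.

```r
# Sketch: simulate the sampling distribution of the sample mean.
set.seed(13)              # arbitrary seed, only so the run is repeatable
m <- 20000                # number of simulated samples per setting
sizes <- c(5, 10, 30)     # the three sample sizes

# Hypothetical helper: returns m sample means, each computed from a
# sample of size n drawn with the generator rdist (rnorm, runif, rexp, ...).
sim.xbar <- function(rdist, n, m, ...) {
  rowMeans(matrix(rdist(m * n, ...), nrow = m))
}

# One list per population, one element per sample size
xbar.norm <- lapply(sizes, function(n) sim.xbar(rnorm, n, m, mean = 50, sd = 15))
xbar.unif <- lapply(sizes, function(n) sim.xbar(runif, n, m, min = 0, max = 1))
xbar.exp  <- lapply(sizes, function(n) sim.xbar(rexp, n, m, rate = 1))

# Exponential(1) has sigma = 1, so sigma/sqrt(n) = 1/sqrt(n)
sd.sim    <- sapply(xbar.exp, sd)
sd.theory <- 1 / sqrt(sizes)
print(round(rbind(simulated = sd.sim, theory = sd.theory), 3))

# Density histograms for the skewed (exponential) population
op <- par(mfrow = c(1, 3))
for (i in seq_along(sizes)) {
  hist(xbar.exp[[i]], freq = FALSE, main = paste("n =", sizes[i]),
       xlab = "sample mean")
}
par(op)
```

Replacing xbar.exp with xbar.norm or xbar.unif in the plotting loop shows the other two populations; the skewed population is where the effect of increasing n is easiest to see.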
Reproduction of Figure 4.7, page 203 BSDA

[Figure: panels headed X~N(50,15), X~U(0,1), and X~Lnorm(1,1), each followed by plots of x5, x10, and x30.]

Reproduction of Figure 4.7, page 204 BSDA

[Figure: panels titled Normal, Uniform, and Exponential comparing X5, X10, and X30, followed by plots of Sample Quantiles for x30 under the same three titles.]

Problem 4.26 BSDA

Simulate 2000 random samples of size 16 from a normally distributed population with a mean of 30 and a standard deviation of 8.

a. Determine the mean of each of the 2000 samples.
b. Construct a histogram of the 2000 sample means. Does it appear to be normally distributed?
c. Construct a normal probability plot of the 2000 sample means. Is normality plausible?
d. Compute descriptive statistics of the 2000 sample means.
e. What is the mean of the 2000 sample means? Is it close to the population mean? Should it be?
f. What is the standard deviation of the 2000 sample means? Is it close to the population standard deviation? Should it be?
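For parts e and f, the normal-population results stated earlier give the targets to compare against, using $\mu = 30$, $\sigma = 8$, and $n = 16$ from the problem:

```latex
\mu_{\bar{X}} = \mu = 30, \qquad
\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{8}{\sqrt{16}} = 2
```

So the mean of the sample means should be close to 30, while their standard deviation should be close to 2 rather than to the population standard deviation of 8.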
Solution Problem 4.26 BSDA

To solve problem 4.26 we will make use of the following commands: rnorm(), matrix(), hist(), apply(), lines(), qqnorm(), qqline(), shapiro.test(), summary(), mean(), and sd(). If you do not remember the arguments for the functions, it is a good idea to review them by typing ?function.

We start by creating a 2000 × 16 matrix of values selected at random from a normal distribution with $\mu = 30$ and $\sigma = 8$.

    > m <- 2000 # Number of samples (shapiro.test() accepts at most 5000 values)
    > n <- 16 # size of each sample
    > mu <- 30
    > sigma <- 8
    > sigma.xbar <- sigma/sqrt(n)
    > rnv <- rnorm(m*n,mu,sigma) # m samples of size n
    > rnvm <- matrix(rnv,nrow=m) # m x n matrix, one sample per row

Part a.

    > samplemeans <- apply(rnvm,1,mean)

Part b.

    > hist(samplemeans) # plain hist
    > hist(samplemeans,prob=T,ylim=c(0,.25)) # density hist
    > xs <- seq((mu-4*sigma.xbar),(mu+4*sigma.xbar),length=800)
    > ys <- dnorm(xs,mu,sigma.xbar)
    > lines(xs,ys,type="l") # superimpose normal

Histograms for Part b

[Figure: two histograms of samplemeans, one on the Frequency scale and one on the Density scale.]

Part c.

    > qqnorm(samplemeans)
    > qqline(samplemeans)
    > shapiro.test(samplemeans)

            Shapiro-Wilk normality test

    data:  samplemeans
    W = 0.9995, p-value = 0.8871

QQ Plot for Part c

[Figure: Normal Q-Q Plot of samplemeans, Sample Quantiles against Theoretical Quantiles.]

Code and output for parts d, e, and f.

    > # d.
    > summary(samplemeans)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      23.47   28.54   29.98   29.94   31.33   36.81
    > # e.
    > mean(samplemeans)
    [1] 29.94127
    > # f.
    > sd(samplemeans)
    [1] 2.025663

Fancy Code for Part b

If a line of code does not make sense, please ask me to explain more!

    > par(col.main="blue",pty="s")
    > hist(samplemeans,prob=T,col="blue",breaks="scott",
    +   xlab=expression(bar(X)[16]),
    +   main=expression(paste("Simulated Sampling Distribution of ", bar(X))))
    > lines(xs,ys,type="l",lwd=2,col="red") # superimpose normal
    > Alpha <- round(mean(samplemeans),5)
    > Beta <- round(sd(samplemeans),5)
    > text(23,.18,bquote(hat(mu)[bar(X)]==.(Alpha)),pos=4,col="blue",cex=1)
    > text(23,.16,bquote(hat(sigma)[bar(X)]==.(Beta)),pos=4,col="blue",cex=1)
    > text(34,.18,bquote(mu[bar(X)]==.(mu)),pos=4,col="red",cex=1)
    > text(34,.16,bquote(sigma[bar(X)]==.(sigma.xbar)),pos=4,col="red",cex=1)
    > par(col.main="black",pty="m")

Fancy Histogram for Part b

[Figure: "Simulated Sampling Distribution of $\bar{X}$", a density histogram of $\bar{X}_{16}$ annotated with $\hat{\mu}_{\bar{X}} = 29.94127$, $\hat{\sigma}_{\bar{X}} = 2.02566$, $\mu_{\bar{X}} = 30$, and $\sigma_{\bar{X}} = 2$.]

Link to the R Script

• Go to my web page: Script for Central Limit Theorem
• Homework: problems 4.17-4.29
• See me if you need help!