The Sampling Distribution of X̄
Alan T. Arnholt
Department of Mathematical Sciences
Appalachian State University
[email protected]
Spring 2006 R Notes
Copyright © 2006 Alan T. Arnholt
Outline
The Sampling Distribution of X̄
Sampling Distribution of a Sample Statistic
The R Script
Sampling Distribution of a Sample Statistic
The sampling distribution of a sample statistic is the
probability distribution associated with the various values that the
statistic could assume in repeated sampling.
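As a small illustration of this definition, the sketch below enumerates every ordered sample of size 2 drawn with replacement from a tiny made-up population and tabulates the resulting distribution of the sample mean. The population values 2, 4, 6 and the object names are assumptions chosen only for illustration.

```r
# Hypothetical population; each value is equally likely to be drawn
pop <- c(2, 4, 6)

# All 3^2 = 9 equally likely ordered samples of size 2 (with replacement)
samples <- expand.grid(x1 = pop, x2 = pop)

# The statistic of interest: the mean of each sample
xbar <- rowMeans(samples)

# Sampling distribution of the mean: each possible value with its probability
samp.dist <- table(xbar) / nrow(samples)
print(samp.dist)
```

The table lists the values the statistic can take together with their probabilities, which is what the definition calls the sampling distribution.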
Sampling Distribution of X̄ when Sampling from a
Normally Distributed Population
Let X̄ be the mean of a sample of size n from a normally
distributed population that has mean µ and standard deviation σ.
For all sample sizes n, the sampling distribution of X̄:
1. Is exactly normally distributed.
2. Is centered at µX̄ = µ, the mean of the population.
3. Has a standard deviation σX̄ = σ/√n, where σ is the standard
deviation of the population.
4. In other words, if X ∼ N(µ, σ), then
X̄ ∼ N(µX̄ = µ, σX̄ = σ/√n).
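Results 2 and 3 follow from the rules for the mean and variance of a sum. A brief derivation, assuming the observations X_1, ..., X_n are independent draws from the population (the usual meaning of a random sample, stated here as an assumption):

```latex
\begin{align*}
\mu_{\bar X} &= E\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right]
  = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{n\mu}{n} = \mu, \\
\sigma_{\bar X}^{2} &= \operatorname{Var}\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right]
  = \frac{1}{n^{2}}\sum_{i=1}^{n} \operatorname{Var}[X_i]
  = \frac{n\sigma^{2}}{n^{2}} = \frac{\sigma^{2}}{n},
\qquad\text{so}\qquad
\sigma_{\bar X} = \frac{\sigma}{\sqrt{n}}.
\end{align*}
```

Neither step uses normality, so the same center and spread hold for any population with mean µ and standard deviation σ.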
Central Limit Theorem
Let X̄ be the mean of a sample of size n from a population with
an unknown distribution. When n is relatively large, the sampling
distribution of X̄ is approximately normally distributed. The
approximation becomes better as the sample size increases.
Let X̄ be the mean of a sample of size n from a distribution with
mean µ and standard deviation σ. For sufficiently large sample
sizes n, the sampling distribution of X̄:
1. Is approximately normally distributed.
2. Is centered at µX̄ = µ, the mean of the population.
3. Has a standard deviation σX̄ = σ/√n, where σ is the standard
deviation of the population.
4. In other words, if X ∼ (µ, σ), that is, X has mean µ and standard
deviation σ, then provided n is sufficiently large,
X̄ is approximately N(µX̄ = µ, σX̄ = σ/√n).
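One way to see the approximation at work is to compare a simulated probability with the normal approximation. The sketch below assumes an exponential population with rate 1 (so µ = 1 and σ = 1), a sample size of 40, and the cutoff 1.2; all of these, along with the seed and object names, are illustrative choices and not taken from the slides.

```r
set.seed(13)                       # arbitrary seed, for reproducibility
m <- 20000                         # number of simulated samples
n <- 40                            # size of each sample (illustrative)
mu <- 1; sigma <- 1                # mean and sd of an Exp(1) population

# m samples of size n, one per row; then one mean per sample
xbar <- rowMeans(matrix(rexp(m * n, rate = 1), nrow = m))

sim.prob <- mean(xbar <= 1.2)                  # simulated P(Xbar <= 1.2)
clt.prob <- pnorm(1.2, mu, sigma / sqrt(n))    # normal approximation from the CLT
c(simulated = sim.prob, approximate = clt.prob)
```

The two numbers should be close but not identical, since the exponential population is skewed and n = 40 is only moderately large.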
What value of n is sufficiently large?
To see how the shape of the sampling distribution is affected by
the shape of the population and the sample size, we will simulate
the sampling distribution of X̄ based on three different sample
sizes (5, 10, and 30) from three different distributions. Each
simulation is based on 20000 realizations of X̄, one from each of
20000 different random samples.
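A simulation along these lines might look like the following sketch. The populations U(0,1), lognormal(1,1) and N(50,15) follow the panel titles of the figure; the helper name sim.xbar, the seed, and the choice of the standard deviation of the simulated means as a summary are assumptions made for illustration.

```r
set.seed(4)                                   # arbitrary seed
m <- 20000                                    # realizations of the sample mean

# Hypothetical helper: m sample means, each from a sample of size n,
# where rdist(k) returns k random draws from the population
sim.xbar <- function(rdist, n) rowMeans(matrix(rdist(m * n), nrow = m))

# The three populations, each wrapped as a function of the number of draws
pops <- list(unif  = function(k) runif(k, 0, 1),
             lnorm = function(k) rlnorm(k, 1, 1),
             norm  = function(k) rnorm(k, 50, 15))

# Standard deviation of the simulated means for n = 5, 10, 30
# (rows: sample size, columns: population)
sd.xbar <- sapply(pops, function(rdist)
  sapply(c(n5 = 5, n10 = 10, n30 = 30), function(n) sd(sim.xbar(rdist, n))))
round(sd.xbar, 3)
```

Calling hist() on the output of sim.xbar() for a given population and sample size gives a picture of the corresponding simulated sampling distribution.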
Reproduction of Figure 4.7, page 203 BSDA
[Figure: simulated sampling distributions of X̄ for three populations, X~U(0,1), X~Lnorm(1,1), and X~N(50,15), with panels labeled x5, x10, and x30.]
Reproduction of Figure 4.7, page 204 BSDA
[Figure: panels titled Uniform, Exponential, and Normal, each showing X5, X10, and X30; the lower panels have Sample Quantiles on the vertical axis.]
Problem 4.26 BSDA
Simulate 2000 random samples of size 16 from a normally
distributed population with a mean of 30 and a standard deviation
of 8.
a. Determine the mean of each of the 2000 samples.
b. Construct a histogram of the 2000 sample means. Does it
appear to be normally distributed?
c. Construct a normal probability plot of the 2000 sample means.
Is normality plausible?
d. Compute descriptive statistics of the 2000 sample means.
e. What is the mean of the 2000 sample means? Is it close to the
population mean? Should it be?
f. What is the standard deviation of the 2000 sample means? Is it
close to the population standard deviation? Should it be?
Solution Problem 4.26 BSDA
To solve problem 4.26 we will make use of the following
commands: rnorm(), matrix(), hist(), apply(), lines(),
qqnorm(), qqline(), shapiro.test(), summary(), mean(),
and sd(). If you do not remember the arguments for the
functions, it is a good idea to review them by typing ?function.
We start by creating a 2000 × 16 matrix of values selected at
random from a normal distribution with µ = 30 and σ = 8.
> m <- 2000                      # number of samples
> n <- 16                        # size of each sample
> mu <- 30
> sigma <- 8
> sigma.xbar <- sigma/sqrt(n)
> rnv <- rnorm(m*n,mu,sigma)     # m samples of size n
> rnvm <- matrix(rnv,nrow=m)     # m x n matrix
Solution Problem 4.26 BSDA
Part a.
> samplemeans <- apply(rnvm,1,mean)
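For row means specifically, base R also provides rowMeans(), which should give the same values as the apply() call. A small self-contained check on a freshly simulated matrix (the seed and the object names rnvm.demo, by.apply and by.rowMeans are illustrative assumptions):

```r
set.seed(26)                                               # arbitrary seed
rnvm.demo <- matrix(rnorm(2000 * 16, 30, 8), nrow = 2000)  # 2000 samples of size 16

by.apply    <- apply(rnvm.demo, 1, mean)   # mean of each row via apply()
by.rowMeans <- rowMeans(rnvm.demo)         # same computation, vectorized

all.equal(by.apply, by.rowMeans)           # compares the two up to rounding error
length(by.rowMeans)                        # one mean per sample
```

Either form works here; apply() generalizes to other statistics such as median or sd, while rowMeans() is limited to means.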
Part b.
> hist(samplemeans)                                          # plain hist
> hist(samplemeans,prob=TRUE,ylim=c(0,.25))                  # density hist
> xs <- seq((mu-4*sigma.xbar),(mu+4*sigma.xbar),length=800)
> ys <- dnorm(xs,mu,sigma.xbar)
> lines(xs,ys,type="l")                                      # superimpose normal
Histograms for Part b
[Figure: two histograms, each titled "Histogram of samplemeans" with samplemeans on the horizontal axis; one uses the Frequency scale and the other the Density scale.]
Solution Problem 4.26 BSDA
Part c.
> qqnorm(samplemeans)
> qqline(samplemeans)
> shapiro.test(samplemeans)
        Shapiro-Wilk normality test

data:  samplemeans
W = 0.9995, p-value = 0.8871
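A large p-value such as the 0.8871 above gives no evidence against normality. The sketch below shows how the pieces of the test result can be pulled out of the returned object, using simulated stand-in data (the seed and the names x, st and too.big are assumptions). It also demonstrates that shapiro.test() only accepts between 3 and 5000 observations.

```r
set.seed(424)                        # arbitrary seed
x <- rnorm(2000, 30, 2)              # stand-in for the 2000 sample means

st <- shapiro.test(x)                # an object of class "htest"
st$statistic                         # the W statistic
st$p.value                           # p-value of the normality test

# Inputs with more than 5000 values make shapiro.test() signal an error
too.big <- tryCatch(shapiro.test(rnorm(5001)), error = function(e) "error")
too.big
```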
QQ Plot for Part c
[Figure: Normal Q−Q Plot of the sample means; horizontal axis: Theoretical Quantiles, vertical axis: Sample Quantiles.]
Solution Problem 4.26 BSDA
Code and output for parts d, e, and f.
> # d.
> summary(samplemeans)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  23.47   28.54   29.98   29.94   31.33   36.81
> # e.
> mean(samplemeans)
[1] 29.94127
> # f.
> sd(samplemeans)
[1] 2.025663
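For parts e and f, the theory from the first section gives the values the simulation should be close to. Using µ = 30, σ = 8 and n = 16 from the problem statement:

```latex
\[
\mu_{\bar X} = \mu = 30,
\qquad
\sigma_{\bar X} = \frac{\sigma}{\sqrt{n}} = \frac{8}{\sqrt{16}} = 2 .
\]
```

The simulated mean 29.94127 is close to 30, as it should be. The simulated standard deviation 2.025663 is close to 2 and not to the population standard deviation 8, which is also what the theory predicts.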
Fancy Code for Part b
If a line of code does not make sense, please ask me to explain more!
> par(col.main="blue",pty="s")
> hist(samplemeans,prob=TRUE,col="blue",breaks="scott",
+ xlab=expression(bar(X)[16]),
+ main=expression(paste("Simulated Sampling Distribution of ", bar(X))))
> lines(xs,ys,type="l",lwd=2,col="red") # superimpose normal
> Alpha <- round(mean(samplemeans),5)
> Beta <- round(sd(samplemeans),5)
> text(23,.18,bquote(hat(mu)[bar(X)]==.(Alpha)),pos=4,col="blue",cex=1)
> text(23,.16,bquote(hat(sigma)[bar(X)]==.(Beta)),pos=4,col="blue",cex=1)
> text(34,.18,bquote(mu[bar(X)]==.(mu)),pos=4,col="red",cex=1)
> text(34,.16,bquote(sigma[bar(X)]==.(sigma.xbar)),pos=4,col="red",cex=1)
> par(col.main="black",pty="m")
Fancy Histogram for Part b
[Figure: density histogram titled "Simulated Sampling Distribution of X̄"; horizontal axis: X̄16, vertical axis: Density. Annotations: µ̂X̄ = 29.94127, σ̂X̄ = 2.02566, µX̄ = 30, σX̄ = 2.]
Link to the R Script
• Go to my web page: Script for Central Limit Theorem
• Homework: problems 4.17-4.29
• See me if you need help!