Lesson 9 - 3
Introduction to the
Practice of Statistics
Objectives
• Define statistics and statistical thinking
• Understand the process of statistics
• Distinguish between qualitative and quantitative
variables
• Distinguish between discrete and continuous
variables
Vocabulary
• Central Limit Theorem – the larger the sample size,
the closer the sampling distribution of the sample
mean comes to a Normal distribution, whatever the
shape of the underlying distribution
• Standard error of the mean – standard deviation of
the sampling distribution of x-bar
Sample Mean, x̄
The behavior of x̄ in repeated sampling is much
like that of the sample proportion, p-hat.
• Sample mean x̄ is an unbiased estimator of
the population mean μ
• Spread is less than that of X. Standard
deviation of x̄ is smaller than that of X by a
factor of 1/√n
Sample Spread of x̄
If the random variable X has a normal
distribution with a mean of 20 and a standard
deviation of 12:
– If we choose samples of size n = 4, then the
sample mean will have a normal distribution with
a mean of 20 and a standard deviation of 12/√4 = 6
– If we choose samples of size n = 9, then the
sample mean will have a normal distribution with
a mean of 20 and a standard deviation of 12/√9 = 4
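The two cases above are the rule σx̄ = σ/√n applied to σ = 12. A minimal Python sketch of that arithmetic follows; the helper name standard_error is chosen here for illustration and is not part of the lesson.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)

# Population standard deviation from the slide: sigma = 12
for n in (4, 9):
    print(f"n = {n}: standard deviation of x-bar = {standard_error(12, n)}")
```

Quadrupling the sample size only halves the spread, because n enters through its square root.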
Example 1
The height of all 3-year-old females is approximately
normally distributed with μ = 38.72 inches and σ = 3.17
inches. Compute the probability that a simple random
sample of size n = 10 results in a sample mean greater
than 40 inches.
P(x̄ > 40)
μ = 38.72   σ = 3.17   n = 10
σx̄ = σ/√n = 3.17/√10 = 1.00244
Z = (x̄ – μ)/σx̄ = (40 – 38.72)/1.00244 = 1.28/1.00244 = 1.277
normalcdf(1.277,E99) = 0.1008
normalcdf(40,E99,38.72,1.002) = 0.1007
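The normalcdf calculation for Example 1 can be reproduced without a calculator. The sketch below assumes Python 3.8 or later and uses NormalDist from the standard library; the function name prob_mean_greater is hypothetical and introduced only for this illustration.

```python
import math
from statistics import NormalDist

def prob_mean_greater(threshold, mu, sigma, n):
    """P(x-bar > threshold) when x-bar is Normal(mu, sigma / sqrt(n))."""
    sampling_dist = NormalDist(mu, sigma / math.sqrt(n))
    return 1.0 - sampling_dist.cdf(threshold)

# Example 1: heights of 3-year-old females, sample of size 10
p = prob_mean_greater(40, mu=38.72, sigma=3.17, n=10)
print(f"P(x-bar > 40) = {p:.4f}")
```

Passing the standard error as the second argument of NormalDist plays the same role as the fourth argument of normalcdf on the calculator.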
Example 2
We’ve been told that the average weight of giraffes is
2400 pounds with a standard deviation of 300 pounds.
We’ve measured 50 giraffes and found that the sample
mean was 2600 pounds. Is our data consistent with
what we’ve been told?
P(x̄ > 2600)
μ = 2400   σ = 300   n = 50
σx̄ = σ/√n = 300/√50 = 42.4264
Z = (x̄ – μ)/σx̄ = (2600 – 2400)/42.4264 = 200/42.4264 = 4.714
normalcdf(4.714,E99) ≈ 0.0000012
normalcdf(2600,E99,2400,42.4264) ≈ 0.0000012
A sample mean this far above 2400 would occur by chance
only about once in a million samples, so our data are not
consistent with what we’ve been told.
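The tail probability in Example 2 is very small, so it is worth checking with a function that computes the upper tail directly instead of subtracting from 1. This is a sketch using math.erfc from the Python standard library; the helper name upper_tail is an assumption of this illustration.

```python
import math

def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Example 2: giraffe weights, mu = 2400, sigma = 300, n = 50
sigma_xbar = 300 / math.sqrt(50)
z = (2600 - 2400) / sigma_xbar
print(f"z = {z:.3f}, P(x-bar > 2600) = {upper_tail(z):.2e}")
```

The result is on the order of one in a million, which is why a sample mean of 2600 pounds is hard to reconcile with a true mean of 2400 pounds.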
Example 3
Young women’s height is distributed as N(64.5, 2.5).
What is the probability that a randomly selected young
woman is taller than 66.5 inches?
P(x > 66.5)
μ = 64.5   σ = 2.5   n = 1
σx = σ/√n = 2.5/√1 = 2.5 (a single observation, so the spread is that of the population)
Z = (x – μ)/σx = (66.5 – 64.5)/2.5 = 2/2.5 = 0.80
normalcdf(0.80,E99) = 1 – 0.7881 = 0.2119
normalcdf(66.5,E99,64.5,2.5) = 0.2119
Example 4
Young women’s height is distributed as N(64.5, 2.5).
What is the probability that the mean height of an SRS of
10 young women is greater than 66.5 inches?
P(x̄ > 66.5)
μ = 64.5   σ = 2.5   n = 10
σx̄ = σ/√n = 2.5/√10 = 0.79
Z = (x̄ – μ)/σx̄ = (66.5 – 64.5)/0.79 = 2/0.79 = 2.53
normalcdf(2.53,E99) = 1 – 0.9943 = 0.0057
normalcdf(66.5,E99,64.5,2.5/√10) = 0.0057
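Examples 3 and 4 ask the same question about one woman and about the mean of ten women. A short sketch, assuming Python 3.8 or later, puts the two side by side; the dictionary name probs is illustrative only.

```python
import math
from statistics import NormalDist

mu, sigma, threshold = 64.5, 2.5, 66.5

# n = 1 is a single individual; n = 10 is the mean of an SRS of ten
probs = {}
for n in (1, 10):
    dist = NormalDist(mu, sigma / math.sqrt(n))
    probs[n] = 1.0 - dist.cdf(threshold)
    print(f"n = {n:2d}: P(mean height > {threshold}) = {probs[n]:.4f}")
```

The only thing that changes between the two runs is the standard deviation passed to NormalDist, which shrinks from σ to σ/√10.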
Central Limit Theorem
Regardless of the shape of the population, the sampling distribution of
x-bar becomes approximately normal as the sample size n increases.
Caution: the theorem describes only the shape of the distribution. The mean of
x̄ is μ and its standard deviation is σ/√n for every sample size.
[Figure: Population Distribution and Random Samples Drawn from Population, with the sample means x̄ marked]
Central Limit Theorem in Action
[Figure: sampling distributions of x̄ for n = 1, n = 2, n = 10 and n = 25]
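The effect described on this slide can be seen in a small simulation. The sketch below is an illustration, not part of the original lesson: it assumes an exponential population with mean 1 (strongly right-skewed), draws repeated samples for the four sample sizes on the slide, and reports the center, spread and skewness of the resulting sample means. The function names, the number of trials and the seed are all arbitrary choices.

```python
import math
import random
import statistics

def simulate_sample_means(n, trials=10_000, seed=0):
    """Draw `trials` samples of size n from an exponential population
    with mean 1 and return the list of sample means."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n
            for _ in range(trials)]

def skewness(values):
    """Sample skewness; close to 0 for a symmetric shape such as the normal."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

for n in (1, 2, 10, 25):
    means = simulate_sample_means(n)
    print(f"n = {n:2d}: mean = {statistics.fmean(means):.3f}, "
          f"sd = {statistics.stdev(means):.3f} "
          f"(sigma/sqrt(n) = {1 / math.sqrt(n):.3f}), "
          f"skewness = {skewness(means):.2f}")
```

The mean stays near 1 and the spread tracks σ/√n at every sample size, while the skewness falls toward 0 as n grows. Only the shape depends on the Central Limit Theorem.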
Example 5
The time a technician requires to perform preventive
maintenance on an air conditioning unit is governed by
the exponential distribution (similar to curve a from “in
Action” slide). The mean time is μ = 1 hour and σ = 1
hour. Your company has a contract to maintain 70 of
these units in an apartment building. In budgeting your
technician’s time, should you allow an average of 1.1
hours or 1.25 hours for each unit?
P(x̄ > 1.1) vs P(x̄ > 1.25)
μ = 1   σ = 1   n = 70
σx̄ = σ/√n = 1/√70 = 0.120
Z = (x̄ – μ)/σx̄ = (1.1 – 1)/0.12 = 0.1/0.12 = 0.83
normalcdf(0.83,E99) = 1 – 0.7967 = 0.2033
Example 5 cont
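P(x̄ > 1.25)
μ = 1   σ = 1   n = 70
σx̄ = σ/√n = 1/√70 = 0.120
Z = (x̄ – μ)/σx̄ = (1.25 – 1)/0.12 = 0.25/0.12 = 2.083
normalcdf(2.083,E99) = 1 – 0.9814 = 0.0186 (0.0182 with the unrounded σx̄ = 0.1195)
Allowing 1.25 hours per unit leaves only about a 2% chance
that the average time runs over, compared with about 20%
for 1.1 hours, so 1.25 hours is the safer budget.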
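Because the population in Example 5 is exponential, the normal calculation is only an approximation, even with n = 70. The sketch below compares it with a direct simulation. It is a rough check under stated assumptions (service times drawn with random.expovariate at rate 1, 10,000 simulated buildings, a fixed seed), and the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

N_UNITS = 70  # air conditioning units in the contract

def normal_tail(threshold):
    """Central Limit Theorem approximation to P(x-bar > threshold)."""
    sampling_dist = NormalDist(1.0, 1.0 / math.sqrt(N_UNITS))
    return 1.0 - sampling_dist.cdf(threshold)

def simulated_tail(threshold, trials=10_000, seed=1):
    """Fraction of simulated contracts whose average time exceeds threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        avg = sum(rng.expovariate(1.0) for _ in range(N_UNITS)) / N_UNITS
        if avg > threshold:
            hits += 1
    return hits / trials

for t in (1.1, 1.25):
    print(f"allow {t} h: normal approx = {normal_tail(t):.4f}, "
          f"simulation = {simulated_tail(t):.4f}")
```

The two columns should agree roughly but not exactly, since the sampling distribution keeps a little of the population's right skew at n = 70.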
Summary of Distribution of x̄
Shape, Center and Spread of Population → Distribution of the Sample Means
• Population is normal with mean μ and standard deviation σ
– Shape: regardless of sample size n, the distribution of x̄ is normal
– Center: μx̄ = μ
– Spread: σx̄ = σ/√n
• Population is not normal, with mean μ and standard deviation σ
– Shape: as sample size n increases, the distribution of x̄ becomes approximately normal
– Center: μx̄ = μ
– Spread: σx̄ = σ/√n
Summary and Homework
• Summary
– Take an SRS and use the sample mean x̄ to
estimate the unknown parameter μ
– x̄ is an unbiased estimator of μ
– Increase in sample size decreases the standard
deviation of x̄ (by a factor of 1/√n)
– If the population is normal, then so is x̄
– Central Limit Theorem: for large n, the sampling
distribution of x̄ is approximately normal for any
population (with a finite σ)
• Homework
– Day 1: pg 595-6; 9.31-4
– Day 2: pg 601-4; 9.35, 36, 38, 42-44