Normal Distribution
Normal Distribution Curve
A normal distribution curve is a
symmetrical, bell-shaped curve
defined by the mean and standard
deviation of a data set.
The normal curve is a probability
distribution with a total area under
the curve of 1.
Standard Normal
Distribution
The mean of the data in a standard
normal distribution is 0 and the
standard deviation is 1.
A standard normal distribution is the
set of all z-scores.
Variance
Variance is the average
squared deviation from the
mean of a set of data. It is
used to find the standard
deviation.
Variance
1. Find the mean of the data.
Hint: the mean is the average, so add up the
values and divide by the number of items.
2. Subtract the mean from each value. The
result is called the deviation from the mean.
3. Square each deviation from the mean.
4. Find the sum of the squares.
5. Divide the total by the number of items.
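The five steps above can be sketched in Python roughly as follows. The function name `population_variance` and the data in `sample_values` are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def population_variance(data):
    """Population variance: the average squared deviation from the mean."""
    n = len(data)
    mean = sum(data) / n                      # step 1: find the mean
    deviations = [x - mean for x in data]     # step 2: deviation from the mean
    squares = [d ** 2 for d in deviations]    # step 3: square each deviation
    total = sum(squares)                      # step 4: sum of the squares
    return total / n                          # step 5: divide by the number of items


# Hypothetical data for illustration only (not from the slides).
sample_values = [2, 4, 4, 4, 5, 5, 7, 9]
variance = population_variance(sample_values)
print(variance)
```

Dividing by the number of items, as step 5 does, gives the population variance.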
Variance Formula
The variance formula uses sigma
notation, in which $\Sigma$ represents the
sum of all the items to its right:
$$\sigma^2 = \frac{\sum (x - \mu)^2}{n}$$
The mean is represented by $\mu$ and n is
the number of items.
Standard Deviation
Standard Deviation shows the
variation in data. If the data is close
together, the standard deviation will
be small. If the data is spread out, the
standard deviation will be large.
Standard Deviation is often denoted
by the lowercase Greek letter sigma,
$\sigma$.
The bell curve which represents a
normal distribution of data shows
what standard deviation represents.
One standard deviation away from the mean ($\mu$) in
either direction on the horizontal axis accounts for
around 68 percent of the data. Two standard
deviations away from the mean account for roughly
95 percent of the data, and three standard deviations
account for about 99.7 percent of the data.
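These percentages can be checked numerically. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 2))
```

The printed values come out close to 68, 95, and 99.7 percent for one, two, and three standard deviations.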
Standard Deviation Formula
The standard deviation formula can be
represented using Sigma Notation:
$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{n}}$$
Notice the standard deviation formula
is the square root of the variance.
Find the variance and
standard deviation
The math test scores of five students
are: 92,88,80,68 and 52.
1) Find the mean: (92+88+80+68+52)/5 = 76.
2) Find the deviation from the mean:
92-76=16
88-76=12
80-76=4
68-76= -8
52-76= -24
3) Square the deviation from the mean:
$(16)^2 = 256$
$(12)^2 = 144$
$(4)^2 = 16$
$(-8)^2 = 64$
$(-24)^2 = 576$
4) Find the sum of the squares of the
deviation from the mean:
256+144+16+64+576= 1056
5) Divide by the number of data
items to find the variance:
1056/5 = 211.2
6) Find the square root of the
variance: $\sqrt{211.2} \approx 14.53$
Thus the standard deviation of
the test scores is 14.53.
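The worked example can be double-checked with Python's standard `statistics` module. This is only a sketch: `pvariance` and `pstdev` divide by the number of items, as the slides do, while `statistics.variance` divides by n - 1 and is shown here only to flag the difference. The variable names are illustrative.

```python
import statistics

class_a = [92, 88, 80, 68, 52]

mean = statistics.mean(class_a)
variance = statistics.pvariance(class_a)  # population variance: divides by n
std_dev = statistics.pstdev(class_a)      # square root of the population variance

print(mean, variance, round(std_dev, 2))

# Sample variance divides by n - 1 instead, a different convention from the slides.
sample_variance = statistics.variance(class_a)
print(sample_variance)
```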
Standard Deviation
A different math class took the
same test with these five test
scores: 92,92,92,52,52.
Find the standard deviation for
this class.
Solve:
1) Find the mean: (92+92+92+52+52)/5 = 76
2) Find the deviation from the mean:
92-76=16
92-76=16
92-76=16
52-76= -24
52-76= -24
3) Square the deviation from the mean:
$(16)^2 = 256$
$(16)^2 = 256$
$(16)^2 = 256$
$(-24)^2 = 576$
$(-24)^2 = 576$
4) Find the sum of the squares:
256+256+256+576+576= 1920
5) Divide the sum of the squares
by the number of items :
1920/5 = 384 (the variance)
6) Find the square root of the variance:
$\sqrt{384} \approx 19.6$
Thus the standard deviation of the
second set of test scores is 19.6.
Analyzing the data:
Consider both sets of scores. Both
classes have the same mean, 76.
However, the two classes do not have the
same scores. Thus we use the standard
deviation to show the variation in the
scores. With a standard deviation of
14.53 for the first class and 19.6 for the
second class, what does this tell us?
Analyzing the data:
Class A: 92,88,80,68,52
Class B: 92,92,92,52,52
With a standard deviation of 14.53
for the first class and 19.6 for the
second class, the scores from the
second class are more spread
out than the scores in the first
class.
z-scores
When a set of data values is
normally distributed, we can
standardize each score by converting
it into a z-score.
z-scores make it easier to
compare data values measured
on different scales.
z-scores
A z-score reflects how many
standard deviations above or below
the mean a raw score is.
The z-score is positive if the data
value lies above the mean and
negative if the data value lies below
the mean.
z-score formula
$$z = \frac{x - \mu}{\sigma}$$
Where x represents an element
of the data set, the mean is
represented by $\mu$ and standard
deviation by $\sigma$.
Analyzing the data
Suppose SAT scores among college
students are normally distributed with a
mean of 500 and a standard deviation of
100. If a student scores a 700, what
would be her z-score?
$$z = \frac{700 - 500}{100} = 2$$
Her z-score would be 2 which
means her score is two standard
deviations above the mean.
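A z-score calculation like this one can be written as a small Python function. The name `z_score` and its parameter names are hypothetical; the numbers are the SAT values from the example.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev


sat_z = z_score(700, mean=500, std_dev=100)
print(sat_z)
```

A score below the mean, such as 300 with the same mean and standard deviation, would give a negative z-score.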
Analyzing the data:
Class A: 92,88,80,68,52
Class B: 92,92,92,52,52
Class C: 77,76,76,76,75
Estimate the standard deviation for Class C.
a) Standard deviation will be less than 14.53.
b) Standard deviation will be greater than 19.6.
c) Standard deviation will be between 14.53
and 19.6.
d) Cannot make an estimate of the standard
deviation.
Answer: A
The scores in class C have the same
mean of 76 as the other two classes.
However, the scores in Class C are all
much closer to the mean than the other
classes so the standard deviation will be
smaller than for the other classes.
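The estimate can be confirmed by computing the standard deviation of all three classes. The sketch below uses `statistics.pstdev`, which divides by the number of items as the slides do; the dictionary layout and variable names are illustrative only.

```python
import statistics

classes = {
    "A": [92, 88, 80, 68, 52],
    "B": [92, 92, 92, 52, 52],
    "C": [77, 76, 76, 76, 75],
}

# Population standard deviation for each class.
spread = {name: statistics.pstdev(scores) for name, scores in classes.items()}

for name, scores in classes.items():
    print(name, statistics.mean(scores), round(spread[name], 2))
```

All three classes share a mean of 76, while the standard deviation for Class C comes out well under 1.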
Summary:
As we have seen, standard deviation
measures the dispersion of data.
The greater the value of the
standard deviation, the further the
data tend to be dispersed from the
mean.
Analyzing the data
• A set of math test scores has a mean of
70 and a standard deviation of 8.
• A set of English test scores has a mean
of 74 and a standard deviation of 16.
For which test would a score of 78
have a higher standing?
To solve: Find the z-score for each test.
math z-score = $\frac{78 - 70}{8} = 1$
English z-score = $\frac{78 - 74}{16} = 0.25$
The math score would have the higher standing
since it is 1 standard deviation above the mean
while the English score is only 0.25 standard
deviations above the mean.
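The same comparison can be sketched in Python by computing one z-score per test and picking the larger. The dictionary `tests` and the other variable names are hypothetical; the means and standard deviations are the ones given in the problem.

```python
# Each subject maps to (mean, standard deviation) as given in the problem.
tests = {"math": (70, 8), "English": (74, 16)}
score = 78

# z = (x - mean) / standard deviation, computed for each test.
z_scores = {subject: (score - mean) / sd for subject, (mean, sd) in tests.items()}

# The test with the larger z-score is the one where 78 stands higher.
best = max(z_scores, key=z_scores.get)
print(z_scores, best)
```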
Analyzing the data
What will be the miles per gallon for a
Toyota Camry when the average mpg
is 23, it has a z value of 1.5 and a
standard deviation of 2?
Using the formula for z-scores: $z = \frac{x - \mu}{\sigma}$
$1.5 = \frac{x - 23}{2}$
$3 = x - 23$
$x = 26$
The Toyota Camry would be expected to
get 26 miles per gallon.
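Solving the z-score formula for x gives x = mean + z × standard deviation. A minimal Python sketch of that rearrangement follows; the function name `raw_score` is hypothetical.

```python
def raw_score(z, mean, std_dev):
    """Invert z = (x - mean) / std_dev to recover the raw value x."""
    return mean + z * std_dev


camry_mpg = raw_score(1.5, mean=23, std_dev=2)
print(camry_mpg)
```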
Normal Distribution Probability
With a graphing calculator, we can
calculate the probability of normally
distributed data falling between
two specific values using the mean
and standard deviation of the data.
Normal Distribution Probability
Example:
A Calculus exam is given to 500
students. The scores have a
normal distribution with a mean of
78 and a standard deviation of 5.
What percent of the students
have scores between 82 and 90?
TI 83/84 directions:
a. Press [2nd][VARS](DISTR) [2] (normalcdf)
b. Press [82] [,] [90] [,] [78] [,] [5] [)][Enter]
normalcdf(82,90,78,5)
.2036578048
There is a 20.37%
probability that a student
scored between 82 and 90
on the Calculus exam.
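For readers without a TI calculator, a rough Python equivalent of the calculator's normalcdf can be built on `statistics.NormalDist` (Python 3.8 or later). The Python function below is a hypothetical helper that borrows the calculator's name and argument order; it is not part of any library.

```python
from statistics import NormalDist


def normalcdf(lower, upper, mean, std_dev):
    """Probability that a normal variable falls between lower and upper."""
    dist = NormalDist(mu=mean, sigma=std_dev)
    return dist.cdf(upper) - dist.cdf(lower)


p = normalcdf(82, 90, 78, 5)
print(round(p, 4))
```

The result agrees with the calculator to about seven decimal places; small differences in the last digits come from the numerical methods used.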
Normal Distribution Probability
Extension: A Calculus exam is given to 500
students. The scores have a normal
distribution with a mean of 78 and a
standard deviation of 5.
How many students have scores
between 82 and 90?
Using the probability previously found:
500 * .2037 = 101.85
There are about 102 students who scored
between 82 and 90 on the Calculus exam.
Normal Distribution Probability
Practice:
A Calculus exam is given to 500
students. The scores have a normal
distribution with a mean of 78 and a
standard deviation of 5. What
percent of the students have scores
above 70?
Hint: Use 1E99 for the upper limit;
press [2nd][,] on the TI to enter E.
Normal Distribution Probability
Practice: A Calculus exam is given to 500 students. The
scores have a normal distribution with a mean
of 78 and a standard deviation of 5. How many
students have scores above 70?
TI 84:
normalcdf(70,1E99,78,5)
.9452007106
Normal C.D prob=0.9452
500 * .9452 = 472.6
About 473 students have a score above 70 on
the Calculus exam.
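In Python the upper tail does not need a stand-in like 1E99: the probability above a value is 1 minus the cumulative probability up to it. A minimal sketch, assuming `statistics.NormalDist` (Python 3.8 or later) and illustrative variable names:

```python
from statistics import NormalDist

exam = NormalDist(mu=78, sigma=5)

# P(score > 70) is the complement of P(score <= 70).
p_above_70 = 1 - exam.cdf(70)

# Expected number of students out of 500 scoring above 70.
students = 500 * p_above_70
print(round(p_above_70, 4), round(students))
```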
Normal Distribution Probability
Practice:
Find the probability of scoring
below a 1400 on the SAT if the
scores are normally distributed with a
mean of 1500 and a standard
deviation of 200.
Hint: Use -1E99 for the lower limit;
press [2nd][,] on the TI to enter E.
TI 84:
normalcdf(-1E99,1400,1500,200)
.3085375322
There is a 30.85% probability that a student
will score below a 1400 on the SAT.
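A lower-tail probability like this one can also be computed directly from the error function, since the normal cumulative probability is 0.5 × (1 + erf(z / √2)), where z is the z-score. The sketch below uses only Python's `math` module; the function name `normal_cdf` is hypothetical.

```python
import math


def normal_cdf(x, mean, std_dev):
    """P(X <= x) for a normal distribution, via the error function."""
    z = (x - mean) / std_dev
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


p_below_1400 = normal_cdf(1400, mean=1500, std_dev=200)
print(round(p_below_1400, 4))
```

Here the z-score is -0.5, so the result is the area under the standard normal curve to the left of -0.5.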