The Standard Deviation
as a Ruler and
the Normal Model






The results of a 50-point test look bad, and I
know that I made the test too hard compared
to the test from last semester. So I decide to
add 5 points to each student’s score.
Ann: 40 pts
Bob: 25 pts
Carol: 37 pts
Don: 14 pts
Eva: 14 pts



Adding (or subtracting) a constant to every
data value adds (or subtracts) the same
constant to all measures of position: center,
percentiles, maximum, and minimum.
The distribution’s shape and its measures of
spread (range, IQR, standard deviation)
remain unchanged.
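
A quick numerical check of the shifting rule, using the five test scores listed above. This is only a sketch: the variable names are illustrative, and the sample standard deviation from Python's `statistics` module is assumed to be the measure of spread in question.

```python
import statistics

# Raw scores from the 50-point test (Ann, Bob, Carol, Don, Eva)
scores = [40, 25, 37, 14, 14]
# Add the same constant, 5 points, to every score
curved = [s + 5 for s in scores]

# Measures of position move by the constant
shift_in_mean = statistics.mean(curved) - statistics.mean(scores)
shift_in_median = statistics.median(curved) - statistics.median(scores)

# Measures of spread do not move
sd_before = statistics.stdev(scores)
sd_after = statistics.stdev(curved)
range_before = max(scores) - min(scores)
range_after = max(curved) - min(curved)

print("shift in mean:", shift_in_mean)
print("shift in median:", shift_in_median)
print("SD before and after:", sd_before, sd_after)
print("range before and after:", range_before, range_after)
```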

The following histograms show a shift from
men’s actual weights to kilograms above
recommended weight:






After I “curve” the scores, I realize that my syllabus
indicates that each test is worth 150 points.
How can I fix it to align with the policy listed
on the syllabus?
Student Ann: 45 pts  50 pts
Student Bob: 30 pts  35 pts
Student Carol: 42 pts  47 pts
Student Don: 19 pts  24 pts
Student Eva: 19 pts  24 pts

When we multiply (or divide) all the data
values by any constant, all measures of
position (such as the mean, median, and
percentiles) and measures of spread (such as
the range, the IQR, and the standard
deviation) are multiplied (or divided) by that
same constant.
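
The rescaling rule can be checked the same way. The sketch below assumes that one way to turn a 50-point test into a 150-point test is to multiply every curved score by 150/50 = 3; that factor is an assumption made for illustration, not necessarily the fix intended on the slide.

```python
import statistics

# Curved scores (original scores plus 5 points)
curved = [45, 30, 42, 19, 19]
# Assumed rescaling factor: 150-point scale divided by 50-point scale
factor = 150 / 50
rescaled = [factor * s for s in curved]

# Position and spread are both multiplied by the factor
mean_ratio = statistics.mean(rescaled) / statistics.mean(curved)
sd_ratio = statistics.stdev(rescaled) / statistics.stdev(curved)
range_ratio = (max(rescaled) - min(rescaled)) / (max(curved) - min(curved))

print("mean ratio:", mean_ratio)
print("SD ratio:", sd_ratio)
print("range ratio:", range_ratio)
```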

The men’s weight data set measured weights
in kilograms. If we want to think about these
weights in pounds, we would rescale the
data:



Scores on the ACT college entrance exam in a
recent year were roughly normal, with mean
21.2 and standard deviation 4.8. John scores
27 on the ACT.
Scores on the SAT Reasoning college entrance
exam in the same year were roughly normal,
with mean 1511 and standard deviation 194.
Susan scores 1718 on the SAT.
Who has a higher score? Which student is
considered a better student?



The trick in comparing very different-looking
values is to use standard deviations as our rulers.
The standard deviation tells us how the whole
collection of values varies, so it’s a natural ruler
for comparing an individual to a group.
As the most common measure of variation, the
standard deviation plays a crucial role in how we
look at data.

We compare individual data values to their
mean, relative to their standard deviation
using the following formula:

$$ z = \frac{y - \bar{y}}{s} $$


We call the resulting values standardized
values, or z-scores.

Standardized values have no units.

z-scores measure the distance of each data
value from the mean in standard deviations.
A negative z-score tells us that the data
value is below the mean, while a positive
z-score tells us that the data value is above the
mean.


Standardized values have been converted from
their original units to the standard statistical
unit of standard deviations from the mean.
Thus, we can compare values that are
measured on different scales, with different
units, or from different populations.
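
As an illustration of comparing values from different scales, the sketch below applies the z-score formula to the ACT and SAT example given earlier. The helper name `z_score` is illustrative only.

```python
def z_score(value, mean, sd):
    """Return how many standard deviations `value` lies from `mean`."""
    return (value - mean) / sd

# John: ACT score of 27, with ACT mean 21.2 and SD 4.8
z_john = z_score(27, 21.2, 4.8)
# Susan: SAT score of 1718, with SAT mean 1511 and SD 194
z_susan = z_score(1718, 1511, 194)

print("John's z-score:", round(z_john, 2))
print("Susan's z-score:", round(z_susan, 2))
```

On this measure, John's score is about 1.21 standard deviations above the ACT mean and Susan's is about 1.07 standard deviations above the SAT mean, so John's score is higher relative to his group.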

Standardizing data into z-scores shifts the
data by subtracting the mean and rescales
the values by dividing by their standard
deviation (SD).
◦ Standardizing into z-scores does not change the
shape of the distribution.
◦ Standardizing into z-scores changes the center by
making the mean 0.
◦ Standardizing into z-scores changes the spread by
making the standard deviation 1.
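
These properties can be verified on a small data set. The sketch below standardizes the five raw test scores from the earlier example. It assumes the sample standard deviation is used, and it checks shape only in the limited sense that the ordering of the values is preserved.

```python
import statistics

scores = [40, 25, 37, 14, 14]
mean = statistics.mean(scores)
sd = statistics.stdev(scores)

# Shift by the mean, then rescale by the standard deviation
z = [(s - mean) / sd for s in scores]

z_mean = statistics.mean(z)
z_sd = statistics.stdev(z)

# Ranks of the scores before and after standardizing
order_before = sorted(range(len(scores)), key=lambda i: scores[i])
order_after = sorted(range(len(z)), key=lambda i: z[i])

print("mean of z-scores:", round(z_mean, 10))
print("SD of z-scores:", round(z_sd, 10))
print("ordering preserved:", order_before == order_after)
```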



A useful family of models for unimodal,
symmetric distributions, usually represented by
a bell-shaped curve, is called the Normal
model.
Biological measures (like height, weight, heart
rate, head circumference, etc.) are often
normally distributed.
We write N(μ,σ) to represent a Normal model
with a mean of μ and a standard deviation of σ.


Summaries of data, like the sample mean and
standard deviation, are written with Latin
letters. Such summaries of data are called
statistics.
When we standardize Normal data using the
model’s parameters μ and σ, we still call
the standardized value a z-score, and we write

$$ z = \frac{y - \mu}{\sigma} $$

Once we have standardized, we need only one
model: N(0,1), which is called the standard
Normal model (or the standard Normal
distribution).


Normal models give us an idea of how extreme a
value is by telling us how likely it is to find one
that far from the mean.
It turns out that in a Normal model:
◦ about 68% of the values fall within 1 SD of the mean
◦ about 95% of the values fall within 2 SD of the mean
◦ about 99.7% of the values fall within 3 SD of the mean.
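
The three percentages above can be reproduced from the standard Normal model. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

# Standard Normal model N(0, 1)
standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of values within k standard deviations of the mean
within = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, proportion in within.items():
    print(f"within {k} SD: {proportion:.4f}")
```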



A newborn baby weighs 8.5 lbs. Is that normal?
When we use the Normal model, we are
assuming the distribution is Normal.
We cannot check this assumption in practice,
so when we have the actual data, we make a
histogram of the distribution and check the
Nearly Normal Condition:
◦ Is the shape of the data’s distribution unimodal?
◦ Is it bell-shaped?
◦ Is it roughly symmetric?


When a data value doesn’t fall exactly 1, 2, or
3 standard deviations from the mean, we can
look it up in a table of Normal percentiles.
Table Z in Appendix D provides us with
Normal percentiles, but many calculators and
statistics computer packages provide these as
well.


Table Z is the standard Normal table. We have to
convert our data to z-scores before using the table.
The figure shows us how to find the area to the left
when we have a z-score of 1.80:
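
As a cross-check on the table lookup, software can compute the same area. The sketch below assumes Python's `statistics.NormalDist` as a stand-in for Table Z.

```python
from statistics import NormalDist

# Area to the left of z = 1.80 under the standard Normal curve
area_left = NormalDist().cdf(1.80)

print(f"area to the left of z = 1.80: {area_left:.4f}")
```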


Sometimes we start with areas and need to
find the corresponding z-score or even the
original data value.
Example: What z-score represents the first
quartile in a Normal model?



Look in Table Z for an area of 0.2500.
The exact area is not there, but 0.2514 is
pretty close.
This figure is associated with z = -0.67, so
the first quartile is 0.67 standard deviations
below the mean.
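
The reverse lookup can also be done in software. This sketch uses the inverse CDF from Python's `statistics.NormalDist` instead of scanning Table Z, and then confirms the table entry quoted above.

```python
from statistics import NormalDist

standard_normal = NormalDist()

# z-score with 25% of the area to its left (the first quartile)
z_q1 = standard_normal.inv_cdf(0.25)

# Area to the left of the rounded table value z = -0.67
area_at_table_z = standard_normal.cdf(-0.67)

print(f"z for the first quartile: {z_q1:.2f}")
print(f"area to the left of z = -0.67: {area_at_table_z:.4f}")
```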





The distribution of scores on tests such as the SAT
college entrance examination is close to normal.
Scores on each of the three sections (math, critical
reading, writing) of the SAT are adjusted so that the
mean score is about 500 and the standard deviation
is about 100.
What percent of scores fall between 200 and 800?
What percent of scores are above 700?
How high must a student score to fall in the top 25%?
What proportion of SAT scores are above 640?
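
One way to check answers to these questions is with the N(500, 100) model in software. The sketch below assumes the model holds exactly, so its results can differ slightly from values read off Table Z or from the 68-95-99.7 rule.

```python
from statistics import NormalDist

# Assumed model for a single SAT section: N(500, 100)
sat = NormalDist(mu=500, sigma=100)

# Proportion of scores between 200 and 800 (within 3 SD of the mean)
between_200_800 = sat.cdf(800) - sat.cdf(200)

# Proportion of scores above 700 (more than 2 SD above the mean)
above_700 = 1 - sat.cdf(700)

# Score that cuts off the top 25% (the 75th percentile)
top_25_cutoff = sat.inv_cdf(0.75)

# Proportion of scores above 640 (z = 1.4)
above_640 = 1 - sat.cdf(640)

print(f"between 200 and 800: {between_200_800:.4f}")
print(f"above 700: {above_700:.4f}")
print(f"cutoff for top 25%: {top_25_cutoff:.0f}")
print(f"above 640: {above_640:.4f}")
```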


The scale of scores on an IQ test is
approximately normal with mean 100 and
standard deviation 15. The organization
MENSA, which calls itself “the high IQ
society,” requires an IQ score of 130 or
higher for membership. What percent of
adults would qualify for membership?
a) 95%
b) 5%
c) 2.5%
d) 17%
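
A sketch of the reasoning for this question: standardize the cutoff of 130, then find the area above it. The 68-95-99.7 rule gives an approximate answer, and the N(100, 15) model, assumed to hold exactly, gives a slightly more precise one.

```python
from statistics import NormalDist

# z-score of the MENSA cutoff under N(100, 15)
z_cutoff = (130 - 100) / 15

# 68-95-99.7 rule: about 95% lie within 2 SD, so half of the
# remaining 5% lies above 2 SD
rule_of_thumb = (1 - 0.95) / 2

# Area above the cutoff from the Normal model itself
model_value = 1 - NormalDist(mu=100, sigma=15).cdf(130)

print("z-score of the cutoff:", z_cutoff)
print(f"68-95-99.7 rule: {rule_of_thumb:.3f}")
print(f"Normal model: {model_value:.4f}")
```

Both approaches point to roughly 2.5% of adults, which matches choice c.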



Scores on the ACT college entrance exam in a
recent year were roughly normal, with mean
21.2 and standard deviation 4.8. John scores
27 on the ACT.
Scores on the SAT Reasoning college entrance
exam in the same year were roughly normal,
with mean 1511 and standard deviation 194.
Susan scores 1718 on the SAT.
Who has a higher score?
Pages 147-152
Problems 3, 5, 9, 11, 15, 19, 27, 29, 33, 35,
37, 39, 41, 43, 49, 53, 57.