Download standard normal distribu

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Central limit theorem wikipedia , lookup

Transcript
The Standard Normal Distribution
Now we consider a special normal distribution N(0,1), called the "standard
normal distribution," with a mean of 0 and a standard deviation of 1. This will be
considered the "standard" and values from other normal distributions can be
standardized by the following formula:
This standardized value is often called a z-score (pronounced “zee-score” but
also pronounced “zet-score”). There are many reasons to standardize, the most
frequent, of which, is to eliminate the effect of scale. In other words, sometimes
we are interested in how far an observation falls from a mean in sterms of
standard deviations that are not integers. We can compare standardized scores
readily, even when they come from different normal distributions.
If a variable has a normal distribution with a mean and standard deviation, then
the standardized variable
has a standard normal distribution.
Let me go through a couple of examples to show you how a z-score can be used
to determine relative standing (i.e., percentiles)
First, let’s look at a basic problem. Suppose that the SAT Math scores for all
students in the United States that same year was 486 with a standard deviation
of 22. What percent of the scores were less than 508?
Recall that in a normal distribution, the 68-95-99.7 (or Empirical) rule tells us
approximately what percentage of the observations lie within one, two or three
standard deviations from the mean. If the mean is 486, then a score of 508 will
be one standard deviation (or 34%) above the mean. Since the mean marks the
50th percentile, it can be seen that 84% of the observations were a score of 508
or lower. If you didn’t see where 84% came from, just add 50% (the area under
the curve from 486 and left) and 34% (the area under the curve between 486 and
508).
What if the question was asking for something a little different? What percentage
of the scores are less than 516? Notice that 516 lies between 508 and 530 so it
is between one and two standard deviations above the mean. But can we be
more precise? The answer is yes, if we use z-scores.
Look at the beginning of this document and look at the formula for z-scores. If I
substitute numbers in for the symbols I can get a z-score and from there I can
find the percent of observations that lie around that score. Look at the math I
provide below:
z
x


516  486 30

 1.36
22
22
(always round z-scores to two decimal places)
So, this means that on the x-axis of the normal distribution a score of 516 lies
approximately 1.36 standard deviations above the mean of 486
But how does this translate to percentiles. In a normal curve, percentages are
found by measuring the area under the normal curve between two numbers. In
this case, we are interested in the percentage of observations that lie below 516
so we are looking at the percentage of observations between the lowest
observation and 516. In the case of the SAT, the lowest observation attainable (I
think) is 200 so we are looking at the percent of observations between 200 and
516. The picture below gives an illustration. The blue shaded area is what we are
interested in
So, to find this area all we need to do is find the z-score in Table A located on the
first page of your textbook and then cross reference it with the percentage
provided. To do this, look at the first column in the table. That is the z-score
column. The table covers both sides of the page and our z-score is positive, so if
you haven’t discovered this yet you need to be on the back side of the page! 
Our z-score is 1.36. So, we go down the column to find the “1.3”. Once we are
there we move right seven columns to place ourselves in the “.06” column. Now
we are located at 1.36 standard deviations. There is a number in this location. It
is 0.9131. This means that 91.31% of the area under the curve is between the
lowest score and our z-score of 1.36.
What if we changed the question to ask “what percentage of the scores are
greater than 5.16”? Our process is the same as far as finding z-scores and a
location in Table A. But now we are interested in percentage of scores between
516 and the top score. See the curve below for a picture
You learned in the first part of this module that the total area under a density
curve is 1. So, the percent of scores greater than 516 is 1-0.9131 = 0.0869. So,
approximately 8.69% of the scores are greater than 516.
One more question. What percent of the scores are between 476 and 516? Look
at the picture below for an illustration
We already know that the area between the lowest score and 516 is 0.9131. We
can also find the area between the lowest score and 476 by:
z
x


476  486  10

  0.45
22
22
(remember to round to two decimal places)
This time go to the front page of Table A and find the z-score of – 0.45 and the
number that is in that location (like we did in the example above). You should get
0.3264.
To get the answer to our question, subtract out the two. So, the percentage of
scores between 476 and 516 is 0.9131 – 0.3264 = 0.5867 or about 58.67%