2.2 Normal Distributions (Bell Curve)
In many natural processes, random variation conforms to a particular probability distribution known as the normal distribution, which is one of the most commonly observed probability distributions. The normal curve was first used in the 1700’s by French mathematicians and in the early 1800’s by the German mathematician and physicist Karl Gauss. The curve is known as the Gaussian distribution and is also sometimes called a bell curve.
Normal curves
• Curves that are symmetric, single-peaked, and bell-shaped. They are used to describe normal distributions.
• The mean is at the center of the curve.
• The standard deviation controls the spread of the curve.
• The bigger the St Dev, the wider the curve.
• There are roughly 6 widths of standard deviation in a normal curve, 3 on one side of center and 3 on the other side.
• Normal curves all have the same overall shape described by mean (μ) and standard deviation (σ).
Properties
Gaussian functions arise by applying the exponential function to a general quadratic function. The Gaussian functions are thus those functions whose logarithm is a quadratic function.
For a Gaussian function of the form $f(x) = a\,e^{-(x-b)^2/(2c^2)}$, the parameter c is related to the full width at half maximum (FWHM) of the peak according to $\mathrm{FWHM} = 2\sqrt{2\ln 2}\,c \approx 2.355\,c$ [1].
Alternatively, the parameter c can be interpreted by saying that the two inflection points of the function occur at x = b − c and x = b + c.
The full width at tenth of maximum (FWTM) for a Gaussian could be of interest and is $\mathrm{FWTM} = 2\sqrt{2\ln 10}\,c \approx 4.29193\,c$ [2].
Empirical Rule (68/95/99.7 Rule)
68% of observations are within 1 σ of μ (approximately; the actual value is .6827)
95% of observations are within 2 σ of μ
99.7% of observations are within 3 σ of μ
Questions about “area”, “percent”, and “relative frequency” are all answered with this rule.
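The three percentages in the rule are rounded. As a quick check, the sketch below computes the area within 1, 2, and 3 standard deviations of the mean using Python's standard-library `statistics.NormalDist`. The variable names are illustrative and not part of the handout.

```python
from statistics import NormalDist

# Standard normal distribution N(0, 1).
std_normal = NormalDist(mu=0, sigma=1)

# Area within k standard deviations of the mean: P(-k < Z < k).
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} standard deviation(s): {area:.4f}")
```

The first value agrees with the ".6827" noted in the rule; the other two are close to, but not exactly, 95% and 99.7%.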
[Figure: normal curve with the horizontal axis marked at 1, 2, and 3 standard deviations on each side of the mean]
EXAMPLE 1: The distribution of the heights of women is normal with a mean of 64.5 and a standard deviation of 2.5. What percent of women are in the following ranges?
1) P(x < 64.5) =
2) P(x < 69.5) =
3) P(x > 62) =
4) P(x > 57) =
5) P(57 < x < 67) =
6) P(59.5 < x < 67) =
Notation: N(μ, σ); the example above is N(64.5, 2.5)
FYI: Gaussian functions are analytic, and their limit as x → ∞ is 0 (for the above case of d = 0). Gaussian functions are among those functions that are elementary but lack elementary antiderivatives; the integral of the Gaussian function is the error function. Nonetheless, their improper integrals over the whole real line can be evaluated exactly, using the Gaussian integral $\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}$, and one obtains $\int_{-\infty}^{\infty} a\,e^{-(x-b)^2/(2c^2)}\,dx = a\,c\,\sqrt{2\pi}$. This integral is 1 if and only if a = 1/(c√(2π)), and in this case the Gaussian is the probability density function of a normally distributed random variable with expected value μ = b and variance σ² = c²: $g(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/(2\sigma^2)}$. These Gaussians are plotted in the accompanying figure.
Homework p 137 23 – 26
What if the area you are interested in is not 1 or 2 standard deviations away from the mean? (reality…)
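Every boundary in Example 1 sits a whole number of standard deviations from the mean, so the 68/95/99.7 rule applies directly. As a hedged cross-check, the sketch below computes the same six areas from the exact normal curve with `statistics.NormalDist`. The helper `area` is a hypothetical name introduced here; the results differ slightly from the rule because the rule's percentages are rounded.

```python
from statistics import NormalDist

# Heights of women from Example 1: N(64.5, 2.5).
heights = NormalDist(mu=64.5, sigma=2.5)

def area(lower=None, upper=None):
    """P(lower < x < upper); None means no bound on that side."""
    low = 0.0 if lower is None else heights.cdf(lower)
    high = 1.0 if upper is None else heights.cdf(upper)
    return high - low

example_1 = {
    "P(x < 64.5)": area(upper=64.5),
    "P(x < 69.5)": area(upper=69.5),
    "P(x > 62)": area(lower=62),
    "P(x > 57)": area(lower=57),
    "P(57 < x < 67)": area(57, 67),
    "P(59.5 < x < 67)": area(59.5, 67),
}

for label, p in example_1.items():
    print(f"{label} = {p:.4f}")
```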
Standard Normal Distribution (z)
The conversion z = (x − μ)/σ changes any normal distribution into the standard normal distribution.
The standard normal distribution is N(0, 1); use Table A to find its areas.
z-score: how many standard deviations away from the mean a score is, and in what direction.
For sample data, what would the z-score formula look like?
z =
EXAMPLE: Stat test scores are: 92, 91, 85, 77, 79, 88, 99, 69, 73, 84
If you scored ____, how did you do relative to the class?
a) 91:
b) 88:
c) 73:
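One way to work this example is to treat the ten scores as sample data, so the sample mean and sample standard deviation take the place of μ and σ. That reading is an assumption about what the exercise intends. The sketch below computes the three z-scores that way; `z_score` is a hypothetical helper name.

```python
from statistics import mean, stdev

scores = [92, 91, 85, 77, 79, 88, 99, 69, 73, 84]

x_bar = mean(scores)   # sample mean
s = stdev(scores)      # sample standard deviation (divides by n - 1)

def z_score(x):
    """Number of sample standard deviations that x lies from the sample mean."""
    return (x - x_bar) / s

z_scores = {x: z_score(x) for x in (91, 88, 73)}

print(f"mean = {x_bar:.1f}, s = {s:.2f}")
for x, z in z_scores.items():
    print(f"score {x}: z = {z:+.2f}")
```

A positive z means the score is above the class mean and a negative z means it is below.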
EXAMPLE: A student took a math test and got an 80. He took a Latin test and got a 90. If the math scores had a mean of 70
with a standard deviation of 8 and Latin had a mean of 95 with a standard deviation of 3, in which class did he do relatively better?
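Raw scores from different tests cannot be compared directly, but z-scores can. The sketch below standardizes both scores using the means and standard deviations given in the example; the function and variable names are illustrative.

```python
def z_score(x, mu, sigma):
    """Standardize x against a distribution with mean mu and standard deviation sigma."""
    return (x - mu) / sigma

z_math = z_score(80, mu=70, sigma=8)    # above the math mean
z_latin = z_score(90, mu=95, sigma=3)   # below the Latin mean

better = "math" if z_math > z_latin else "Latin"
print(f"math z = {z_math:.2f}, Latin z = {z_latin:.2f}, relatively better in {better}")
```

Although 90 is the higher raw score, it falls below the Latin mean, while 80 is more than one standard deviation above the math mean.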
Homework p 118 1 - 4
To find the approximate probability for a test score from the example above, the z-score needs to be looked up in Table A
or found with the calculator.
Example using Table A:
What proportion of all young women are taller than 68 inches, given that the distribution of heights for all young women
follows N(64.5, 2.5)?
Step one – State: P(x > 68) on N(64.5, 2.5)
Draw and label a normal curve
Step two – standardize x and label picture with z-score
z=
Step three – find the probability by using Table A, and the fact that the total area is equal to 1.
Step Four: Write a conclusion:
The proportion of young women that are_______________ than _____ inches is approximately ____________.
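The four steps can be mirrored in a few lines of Python. This is a sketch, with `NormalDist().cdf` standing in for a Table A lookup; the table gives area to the left of z, so the area to the right is found by subtracting from 1.

```python
from statistics import NormalDist

mu, sigma = 64.5, 2.5      # heights of young women, N(64.5, 2.5)
x = 68

# Step two: standardize x.
z = (x - mu) / sigma

# Step three: Table A gives the area to the LEFT of z; total area is 1.
area_below = NormalDist().cdf(z)
area_above = 1 - area_below

print(f"z = {z:.2f}, area below = {area_below:.4f}, area above = {area_above:.4f}")
```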
Use Table A to convert the following z-scores to probability. Draw a picture!!
1) P(z < 2.3) =
2) P(z < -1.52) =
3) P(z > -0.43) =
4) P(z > 3.1) =
5) P(-1.52 < z < 2.3) =
6) P(-3 < z < 3) =
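These six lookups follow three patterns: "less than" reads the table directly, "greater than" subtracts from 1, and "between" subtracts two table values. The sketch below applies those patterns, with `phi` as a hypothetical stand-in for Table A. Table values are rounded to four places, so the last digit may differ slightly.

```python
from statistics import NormalDist

phi = NormalDist().cdf   # stand-in for Table A: area to the left of z

answers = {
    "P(z < 2.3)": phi(2.3),
    "P(z < -1.52)": phi(-1.52),
    "P(z > -0.43)": 1 - phi(-0.43),
    "P(z > 3.1)": 1 - phi(3.1),
    "P(-1.52 < z < 2.3)": phi(2.3) - phi(-1.52),
    "P(-3 < z < 3)": phi(3) - phi(-3),
}

for label, p in answers.items():
    print(f"{label} = {p:.4f}")
```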
Example of whole process:
A man’s wife is pregnant and due in 100 days. The time until the birth is approximately normally distributed with a mean of
100 days and a standard deviation of 8 days. The man will return from a business trip in 85 days and must leave on
another business trip in 107 days.
What is the probability that the birth will occur before his second trip?
1. Told births follow an approximately normal distribution.
2. Want:
3. Compute:
Now have:
107
Table gives:
Or
on calculator: normalcdf ( –10000,107,100,8) gives _____________
4. There is about an ________ chance that the baby will be born before the second
business trip.
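The table route and the calculator route can be compared side by side. In this sketch `p_table` uses the unrounded z-score, whereas a printed Table A requires rounding z to two decimals first, and `p_calc` mirrors the `normalcdf(–10000, 107, 100, 8)` call shown above. The variable names are illustrative.

```python
from statistics import NormalDist

births = NormalDist(mu=100, sigma=8)   # days until the birth

# Step 3: standardize the cutoff of 107 days.
z = (107 - 100) / 8

# Table route: area to the left of z on N(0, 1).
p_table = NormalDist().cdf(z)

# Calculator route: area between a very small lower bound and 107 on N(100, 8).
p_calc = births.cdf(107) - births.cdf(-10000)

print(f"z = {z}, P(X < 107) = {p_table:.4f} (calculator route: {p_calc:.4f})")
```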
EXAMPLE #3:
For 14-year-old boys, cholesterol levels are approximately N(170, 30).
a) What percent of boys have a level of 240 or more?
b) What percent of boys are between 170 and 240?
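Both parts depend on the same cutoff of 240. Because 170 is the mean, exactly half the area lies below it, which makes part (b) a subtraction. The sketch below is one way to check the arithmetic; the names are illustrative.

```python
from statistics import NormalDist

cholesterol = NormalDist(mu=170, sigma=30)

z_240 = (240 - 170) / 30   # about 2.33 standard deviations above the mean

# a) 240 or more: area to the right of 240.
p_high = 1 - cholesterol.cdf(240)

# b) between 170 and 240: area left of 240 minus area left of the mean (0.5).
p_between = cholesterol.cdf(240) - cholesterol.cdf(170)

print(f"z = {z_240:.2f}")
print(f"a) {p_high:.2%} of boys have a level of 240 or more")
print(f"b) {p_between:.2%} of boys are between 170 and 240")
```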
Normal Distribution Calculations
Process:
1) Normality: the table is only for normal (or at least approximately normal) distributions.
2) State the problem in terms of x and draw the curve. Label it with µ, σ, and x.
3) Standardize with a new graph (turn x into z).
4) Use Table A or the calculator: normalcdf(lowerbound, upperbound, µ, σ)
5) Answer the question (remember: if the distribution is only approximately normal, you have an approximate
probability).
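Steps 3 and 4 describe two routes to the same area: standardize and use the z scale, or give the calculator µ and σ directly. The sketch below defines a hypothetical Python helper named after the calculator's `normalcdf` command (it is not a library function) and confirms that the two routes agree. The bounds 62 and 67 on N(64.5, 2.5) are illustrative values.

```python
from statistics import NormalDist

def normalcdf(lowerbound, upperbound, mu=0.0, sigma=1.0):
    """Area under the N(mu, sigma) curve between the two bounds.

    Hypothetical helper that imitates the calculator command of the same name.
    """
    dist = NormalDist(mu, sigma)
    return dist.cdf(upperbound) - dist.cdf(lowerbound)

mu, sigma = 64.5, 2.5
low_x, high_x = 62, 67

# Step 3: standardize both bounds.
low_z = (low_x - mu) / sigma
high_z = (high_x - mu) / sigma

# Step 4: the same area on the z scale and on the original x scale.
area_via_z = normalcdf(low_z, high_z)
area_direct = normalcdf(low_x, high_x, mu, sigma)

print(f"z from {low_z} to {high_z}: {area_via_z:.4f}; direct: {area_direct:.4f}")
```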
HW p 118 1 – 4, p 121 6 – 8 (not 7d)
Finding a Data Value from a z-score:
Z=
x=
He is able to cancel his second business trip, and his boss tells him that he can return home from his first trip so that there
is a ________ chance that he will make it back for the birth. When must he return home?
1. Told that distribution of births is approximately normal
Hint: use the table backwards (on the calculator, use invNorm(area, mean, SD)).
2. We are given the probability and we want the raw score (day to return). First, remember that if there is a __________ chance
that he will make it on time, then there is a _______ chance that he will not (table gives only values “less than”).
Probability statement: P(X < ? ) = _________
Use the table in reverse—find a z-score that gives .01 as the probability.
z      .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
-2.4  .0082  .0080  .0078  .0075  .0073  .0071  .0069  .0068  .0066  .0064
-2.3  .0107  .0104  .0102  .0099  .0096  .0094  .0091  .0089  .0087  .0084
-2.2  .0139  .0136  .0132  .0129  .0125  .0122  .0119  .0116  .0113  .0110
Search for the probability value that is closest to __________ and find __________ and
__________. Since __________ is closer to _________, use this value.
The corresponding z-score is -2.33. Now find the x that produces this z.
3. Have: -2.33 = (x - 100)/8
x =
or on calculator: invNorm(                    )
4. He must return from his business trip in _____days.
Note: All four steps must still be shown. The calculator is only replacing the z-calculation.
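Working backwards from an area to a data value can be sketched with `NormalDist.inv_cdf`, which plays the role of the calculator's invNorm. The left-tail area of .01 and the table z-score of -2.33 are taken from the steps above; the variable names are illustrative. The table answer and the exact answer differ slightly because the table z-score is rounded to two decimals.

```python
from statistics import NormalDist

births = NormalDist(mu=100, sigma=8)   # days until the birth
left_tail_area = 0.01                  # P(X < ?) from the probability statement

# Table route: use the z-score read from the table, then solve z = (x - mu)/sigma for x.
z_table = -2.33
x_from_table = 100 + z_table * 8

# Calculator route: the inverse of the cumulative area, like invNorm(area, mean, SD).
z_exact = NormalDist().inv_cdf(left_tail_area)
x_exact = births.inv_cdf(left_tail_area)

print(f"table: z = {z_table}, x = {x_from_table:.2f} days")
print(f"exact: z = {z_exact:.4f}, x = {x_exact:.2f} days")
```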
EXAMPLE #4:
SAT-V scores are approximately N(505, 110).
1) How high must a student score to be at the 30th percentile?
2) How high must a student score to get in the top 10%?
3) What scores contain the middle 50% of scores?
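Each part is an inverse lookup once it is restated as an area to the left: the 30th percentile has .30 to its left, the top 10% begins where .90 lies to the left, and the middle 50% runs from the 25th to the 75th percentile. The sketch below uses `NormalDist.inv_cdf` in place of the table or invNorm; the names are illustrative.

```python
from statistics import NormalDist

sat_v = NormalDist(mu=505, sigma=110)

# 1) 30th percentile: 30% of scores fall below this value.
p30 = sat_v.inv_cdf(0.30)

# 2) Top 10%: 90% of scores fall below this value.
top10_cutoff = sat_v.inv_cdf(0.90)

# 3) Middle 50%: from the 25th percentile to the 75th percentile.
middle_low = sat_v.inv_cdf(0.25)
middle_high = sat_v.inv_cdf(0.75)

print(f"30th percentile: {p30:.1f}")
print(f"top 10% cutoff: {top10_cutoff:.1f}")
print(f"middle 50%: {middle_low:.1f} to {middle_high:.1f}")
```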
HW p 142 29 – 30, p 147 31 – 36 (not 32b)
Assessing Normality
The normal table (and normalcdf) _________________________________________________ because if the distribution is not
approximately normal, the probabilities will be wrong. Sometimes we are told we have a normal distribution. Sometimes we are
given data and can use histograms (dotplots or stemplots) to check for normality. It is often easier to use normal probability plots and
look for linearity—________________________________________________________________________________________.
Normal probability plots give a visual way to determine if a distribution is approximately normal. These plots are produced by
doing the following:
1. The data are arranged from smallest to largest.
2. The percentile of each data value is determined.
3. From these percentiles, normal calculations are done to determine their corresponding z-scores.
4. Each z-score is plotted against its corresponding data value.
If the distribution is close to normal, the plotted points will lie ____________________________________.
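The four steps can be sketched as a small function. The percentile convention `(i - 0.5) / n` is an assumption made here; software packages use slightly different formulas, and the handout does not say which one it intends. The function name and the sample data (the ten test scores from the earlier example) are for illustration only.

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Return (data value, z-score) pairs for a normal probability plot."""
    ordered = sorted(data)                       # step 1: smallest to largest
    n = len(ordered)
    std_normal = NormalDist()
    points = []
    for i, value in enumerate(ordered, start=1):
        percentile = (i - 0.5) / n               # step 2: assumed convention
        z = std_normal.inv_cdf(percentile)       # step 3: percentile -> z-score
        points.append((value, z))                # step 4: pair z with its data value
    return points

points = normal_plot_points([92, 91, 85, 77, 79, 88, 99, 69, 73, 84])
for value, z in points:
    print(f"{value:>3}  z = {z:+.3f}")
```

Plotting these pairs (for example with a spreadsheet or the calculator) gives the normal probability plot.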
Systematic deviations from a line indicate a non-normal distribution. In the first example below, candy bar weights, an
approximate normal distribution is shown.
[Figure: Weights of Mounds Candy Bars]
Computer output of a normal probability plot shows lines as boundaries—if the data falls within the lines, it is approximately
normal.
In this example, the histogram and the normal probability plot both show that this data is not approximately normal.
Assessing Normality
Method 1:
1) Make a histogram or stemplot to check for big outliers, skews, gaps, etc.
2) Calculate x̄ ± s, x̄ ± 2s, and x̄ ± 3s, then use the 68/95/99.7 rule to see whether the data are approximately normal.
Method 2:
1) Make a normal probability plot (also called a normal quantile plot). You have a plot of z vs. x.
2) If the plot is close to a line, the distribution is close to normal.
Using the calculator: STATPLOT, bottom right graph
right skew: largest observations are above a line drawn through the body of the data.
left skew: smallest are below the line.
EXAMPLE #5:
Is the following data normally distributed? Use both methods to check:
550 557 542 561 553 547 488 562 563 507
529 534 526 544 546 555 534 530 536 579
575 529 510 568 558 527 585 565 539 550
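Both methods can be sketched numerically for these 30 values. Method 1 counts how many observations fall within 1, 2, and 3 sample standard deviations of the mean and compares them with 68/95/99.7. Method 2 builds the normal probability plot points and, as a stand-in for judging linearity by eye, computes their correlation. The percentile convention `(i - 0.5) / n` and the use of correlation are assumptions made here, not part of the handout's procedure.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

data = [550, 557, 542, 561, 553, 547, 488, 562, 563, 507,
        529, 534, 526, 544, 546, 555, 534, 530, 536, 579,
        575, 529, 510, 568, 558, 527, 585, 565, 539, 550]

n = len(data)
x_bar = mean(data)
s = stdev(data)

# Method 1: count observations within k standard deviations of the mean.
counts = {k: sum(abs(x - x_bar) <= k * s for x in data) for k in (1, 2, 3)}
for k, count in counts.items():
    print(f"within {k}s: {count}/{n} = {count / n:.1%}")

# Method 2: normal probability plot points (sorted value paired with z-score).
ordered = sorted(data)
z_scores = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

def pearson_r(xs, ys):
    """Correlation coefficient, used here as a rough measure of linearity."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

r = pearson_r(ordered, z_scores)
print(f"mean = {x_bar:.2f}, s = {s:.2f}, plot correlation r = {r:.3f}")
```

Percentages near 68/95/99.7 and a correlation close to 1 both point toward approximate normality, though with only 30 values the picture should still be checked by eye.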