Lecture 2 (Section 1.3 of the textbook)
Normal distribution
Density curve
Density curves
Normal distributions
The 68-95-99.7 rule
The standard Normal distribution
Normal distribution calculations
Normal quantile plots
Proportion of students with vocabulary score ≤ 6:
Density curve (graph of y=f(x))
is always on or above the horizontal axis,
has area exactly 1 underneath it.
Mathematically: f(x) ≥ 0 for all x, and ∫ f(x) dx = 1 over the whole real line.
Normal density curve
Median – the equal-areas point.
Mean – the balance point of the density curve.
A right skewed density curve
Mean is the balance point of the density curve.
Notation: µ.
Parameters of the distribution vs statistics of the data
We are interested in:
µ – mean of the idealized distribution (i.e. of the
density curve)
σ – standard deviation of the idealized distribution
We observe/calculate:
x̄ – mean of the actual observations (sample mean)
s – standard deviation of the actual observations
(sample standard deviation)
Normal distributions with different σ and µ.
Formula for Normal density:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$
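As a rough numerical check of the Normal density formula, the sketch below evaluates f(x) and approximates the area under the curve with a midpoint rule. The function names, the parameter values (µ = 10, σ = 3) and the integration settings are illustrative assumptions, not part of the lecture.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density f(x) with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def area_under_curve(mu, sigma, width=8.0, steps=20000):
    """Midpoint-rule area over [mu - width*sigma, mu + width*sigma]."""
    lo = mu - width * sigma
    h = 2 * width * sigma / steps
    return h * sum(normal_pdf(lo + (i + 0.5) * h, mu, sigma) for i in range(steps))

# Illustrative parameters (assumed for this check only).
mu, sigma = 10.0, 3.0
peak = normal_pdf(mu, mu, sigma)          # height of the curve at the mean
total_area = area_under_curve(mu, sigma)  # should be very close to 1
print(f"peak height = {peak:.4f}, area = {total_area:.6f}")
```

The curve is highest at x = µ, and the computed area should agree with 1 up to numerical error, whatever values of µ and σ are used.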
The 68-95-99.7 rule for Normal data
Approximately 68% of the observations fall
within σ from µ.
Approximately 95% of the observations fall
within 2σ from µ.
Approximately 99.7% of the observations fall
within 3σ from µ.
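The three percentages in the rule can be checked against the standard Normal distribution. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `within` is an assumption made for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal: mean 0, standard deviation 1

def within(k):
    """Proportion of a Normal distribution within k standard deviations of the mean."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} sigma: {within(k):.4f}")
# within 1 sigma: 0.6827
# within 2 sigma: 0.9545
# within 3 sigma: 0.9973
```

Because standardizing removes µ and σ, the same proportions hold for every Normal distribution.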
Example
Heights of young women aged 18 to 24.
Normal with µ = 64.5, σ=2.5.
Find ranges for the middle 99.7%, 95%, 68% of
the population:
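One way to work this example is to apply the 68-95-99.7 rule directly: the middle 68%, 95% and (approximately) 99.7% of heights lie within 1, 2 and 3 standard deviations of the mean. The sketch below does that arithmetic; the variable names are illustrative.

```python
mu, sigma = 64.5, 2.5  # heights of young women, in inches

# k standard deviations on either side of the mean, paired with the rule's percentage
ranges = {pct: (mu - k * sigma, mu + k * sigma)
          for k, pct in ((1, 68), (2, 95), (3, 99.7))}

for pct, (low, high) in ranges.items():
    print(f"middle {pct}%: {low} to {high} inches")
# middle 68%: 62.0 to 67.0 inches
# middle 95%: 59.5 to 69.5 inches
# middle 99.7%: 57.0 to 72.0 inches
```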
Finding probabilities for Normal data
Standardizing
z-score – standardized value of x
(z=how many standard deviations from the mean)
z = (x − µ)/σ
Example: Compute the standardized scores for
young women 70 inches tall; 60 inches tall.
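The two standardized scores can be computed as follows, using µ = 64.5 and σ = 2.5 from the heights example. The function name `z_score` is an assumption for illustration.

```python
def z_score(x, mu, sigma):
    """Standardized value: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

mu, sigma = 64.5, 2.5  # heights of young women, in inches
z_70 = z_score(70, mu, sigma)
z_60 = z_score(60, mu, sigma)
print(f"70 inches: z = {z_70:.1f}")  # 70 inches: z = 2.2
print(f"60 inches: z = {z_60:.1f}")  # 60 inches: z = -1.8
```

A positive z means the height is above the mean; a negative z means it is below.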
The standardized values for any distribution always
have mean 0 and standard deviation 1.
If the original distribution (X) is Normal, then the
standardized values have Normal distribution (Z)
with mean 0 and standard deviation 1:
(X) N(µ, σ) --standardization--> N(0,1) (Z)
Standardization: Z=(X-µ)/σ i.e. X=µ+σZ.
Probabilities for N(0,1) are given in statistical tables.
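In place of a printed table, the cumulative proportion P(Z < z) can be computed from the error function, using the standard identity Φ(z) = ½(1 + erf(z/√2)). This is a sketch with Python's `math.erf`; the function name `phi` is an assumption, and the values of z are chosen only for illustration.

```python
import math

def phi(z):
    """Standard Normal cumulative proportion P(Z < z), as a table would list it."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(f"P(Z < 0)     = {phi(0.0):.4f}")
print(f"P(Z < 1.96)  = {phi(1.96):.4f}")
print(f"P(Z > 1.96)  = {1.0 - phi(1.96):.4f}")   # area to the right of z
print(f"P(Z < -1.96) = {phi(-1.96):.4f}")        # equal to the line above, by symmetry
```

For a general N(µ, σ) variable X, standardize first: P(X < x) = Φ((x − µ)/σ).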
Examples:
What is the proportion of these young women
who are less than 70 inches tall?
What proportion of observations of a standard
Normal variable Z take values:
less than 2.2?
greater than -2.05?
In 2000 the scores of students taking the SAT
were approximately Normal with mean 1019
and standard deviation 209. What percent of
all students had SAT scores:
at least 820? (= limit for Division I athletes to
compete in their first college year)
between 720 and 820? (= partial qualifiers)
Inverse reading in Normal table
What is z such that P(Z<z)=0.95?
What is z such that P(Z>z)=0.01?
The first z is the 95th percentile, z95, of N(0,1).
The second z is the 99th percentile, z99.
Procedure for percentiles of N(µ,σ):
xp = µ + σ·zp
Calculate the 95th and 99th percentiles of the
distribution of SAT scores.
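The examples and the percentile procedure above can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library in place of a printed table, so the answers may differ slightly from table-based ones that round z to two decimals. All variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()                # standard Normal N(0, 1)
heights = NormalDist(64.5, 2.5) # heights of young women, in inches
sat = NormalDist(1019, 209)     # SAT scores from the example

# Proportions (forward reading of the table)
p_height_lt_70 = heights.cdf(70)          # same as P(Z < 2.2)
p_z_lt_2_2 = Z.cdf(2.2)
p_z_gt_m2_05 = 1 - Z.cdf(-2.05)
p_sat_ge_820 = 1 - sat.cdf(820)
p_sat_720_820 = sat.cdf(820) - sat.cdf(720)

# Percentiles (inverse reading of the table), then x_p = mu + sigma * z_p
z95 = Z.inv_cdf(0.95)
z99 = Z.inv_cdf(0.99)                     # P(Z > z) = 0.01 means P(Z < z) = 0.99
sat_95 = sat.mean + sat.stdev * z95
sat_99 = sat.mean + sat.stdev * z99

print(f"P(height < 70)      = {p_height_lt_70:.4f}")
print(f"P(Z < 2.2)          = {p_z_lt_2_2:.4f}")
print(f"P(Z > -2.05)        = {p_z_gt_m2_05:.4f}")
print(f"P(SAT >= 820)       = {p_sat_ge_820:.4f}")
print(f"P(720 < SAT < 820)  = {p_sat_720_820:.4f}")
print(f"z95 = {z95:.3f}, z99 = {z99:.3f}")
print(f"SAT 95th percentile = {sat_95:.0f}, 99th = {sat_99:.0f}")
```

Note that the heights question and "less than 2.2" give the same proportion, because 70 inches standardizes to z = 2.2.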
Example (continued)
How high must a student score in order to be
in the top 20 % of all students taking SAT?
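Being in the top 20% means scoring at or above the 80th percentile, so this question is an inverse calculation: find z with P(Z < z) = 0.80, then convert back with x = µ + σz. A minimal sketch, again using `statistics.NormalDist` rather than a printed table (names are illustrative):

```python
from statistics import NormalDist

mu, sigma = 1019, 209            # SAT scores from the example

z80 = NormalDist().inv_cdf(0.80) # z with 80% of the area to its left
cutoff = mu + sigma * z80        # x = mu + sigma * z

print(f"z80 = {z80:.3f}")
print(f"score needed for the top 20%: about {cutoff:.0f}")
```

A table lookup would typically give z ≈ 0.84 and therefore a cutoff within a point or so of the value computed here.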
Summary: in Normal calculations use
z=(x-µ)/σ or x=µ+σz.
Normal quantile plots
If the plotted points lie close to a straight line
then the distribution of data is close to Normal.
Construction: use software or
arrange the data from smallest to largest and record
corresponding percentiles (with respect to the data),
find z-scores for these percentiles (for example, the z-score
for the 5th percentile is z=-1.645),
plot each data point against the corresponding z.
Interpretation: tails of the distribution etc...
Newcomb’s data
Newcomb’s data without outliers.
Supermarket spending data: skewness (heavy right tail).
IQ scores of seventh-grade students (shorter both tails).
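The manual construction of a Normal quantile plot can be sketched as follows. The lecture does not say how the percentile of each data point is assigned, so the formula (i − 0.5)/n used here is an assumption (software packages use slightly different conventions), and the sample values are hypothetical.

```python
from statistics import NormalDist

def normal_quantile_points(data):
    """Return (z, x) pairs: each sorted data value paired with the z-score of its percentile."""
    xs = sorted(data)                    # step 1: arrange from smallest to largest
    n = len(xs)
    std_normal = NormalDist()
    points = []
    for i, x in enumerate(xs, start=1):
        p = (i - 0.5) / n                # assumed percentile of the i-th smallest value
        z = std_normal.inv_cdf(p)        # step 2: z-score for that percentile
        points.append((z, x))            # step 3: plot x against z
    return points

# Hypothetical sample, for illustration only.
sample = [64.0, 61.0, 66.0, 62.5, 70.0, 63.0, 65.0, 67.5, 64.5]
points = normal_quantile_points(sample)
for z, x in points:
    print(f"z = {z:6.3f}   x = {x}")
```

Plotting x against z with any plotting tool then gives the quantile plot; points close to a straight line suggest the data are close to Normal.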