Download 2.2a Notes File - Northwest ISD Moodle

AP Statistics
Notes 2.2a
Exploring Quantitative Data
1. Always plot your data: make a graph, usually a dotplot, stemplot, or histogram.
2. Look for the overall pattern (shape, center, spread) and for striking departures such as outliers.
3. Calculate numerical summaries to briefly describe center and spread.
4. Sometimes the overall pattern of a large number of observations is so regular that we can describe it
by a smooth curve.
Density curve:
A density curve describes the overall pattern of a distribution. The area under the curve and above any range
of values is the proportion of all observations that fall in that range.
Displays the overall pattern (shape) of a distribution.
Has an area of exactly 1 sq. unit underneath it.
Is on or above the horizontal axis.
A histogram becomes a density curve if the scale is adjusted so that the
total area of the bars is 1 sq. unit.
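As a rough illustration of the rescaling described above, the sketch below divides each bar's count by (number of observations × bar width) so that the bar areas sum to 1. The data values, the bin edges, and the helper name density_heights are made up for illustration and are not part of the original notes.

```python
def density_heights(data, edges):
    """Return bar heights scaled so that the total bar area is 1."""
    n = len(data)
    heights = []
    for i, (left, right) in enumerate(zip(edges[:-1], edges[1:])):
        last = i == len(edges) - 2
        # Count observations in [left, right); the last bar also includes its right edge.
        count = sum(1 for x in data if left <= x < right or (last and x == right))
        heights.append(count / (n * (right - left)))
    return heights


data = [1.2, 1.9, 2.4, 2.5, 3.1, 3.3, 3.8, 4.6]  # hypothetical observations
edges = [1, 2, 3, 4, 5]                           # hypothetical bin edges

heights = density_heights(data, edges)
total_area = sum(h * (r - l) for h, l, r in zip(heights, edges[:-1], edges[1:]))
print(heights)
print(total_area)
```

Each bar's area (height × width) is then the proportion of observations in that bar, which is the property a density curve idealizes.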
Density curves, like distributions, come in many shapes. A density curve is
often a good description of the overall pattern of a distribution. Outliers,
which are departures from the overall pattern, are not described by the
curve. No set of real data is exactly described by a density curve. The
curve is an approximation that is easy to use and accurate enough for practical use.
Because a density curve is an idealized description of a distribution of data, we distinguish between the mean
and standard deviation of the density curve and the mean x̄ and standard deviation s_x computed from the
actual observations. The usual notation for the mean of a density curve is µ (the Greek letter mu). We write
the standard deviation of a density curve as σ (the Greek letter sigma).
The median of a density curve is the point that divides the area under the curve into halves.
The mean of a density curve is the "balance point" of the curve. (Think of a teeter-totter.)
Example
1. Sketch a density curve of a uniform distribution. The curve takes the constant value 1 over the interval
from 0 to 1 and is zero outside that interval. This means that the data described by this distribution
take values that are uniformly spread between 0 and 1. Use areas under this density curve to answer the
following questions.
The height of the density curve must be 1 so that the area under the curve is 1.
[Figure: uniform density curve of height 1 over the interval from 0 to 1]
A. What percent of the observations lie above 0.8?
The area under the curve between 0.8
and 1 is 0.2, or 20%.
B. What percent of the observations lie below 0.6?
The area under the curve between 0
and 0.6 is 0.6 or 60%.
C. What percent of the observations lie between 0.25 and 0.75?
The area under the curve between
0.25 and 0.75 is 0.5 or 50%.
D. What is the mean µ of this distribution?
Since the distribution is symmetric, the mean is 0.5.
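The areas in Example 1 are areas of rectangles with height 1, so each is just a width. A minimal sketch that computes parts A through D follows; the function name uniform_area is an illustrative choice, not something from the notes.

```python
def uniform_area(a, b, low=0.0, high=1.0):
    """Area under the uniform density on [low, high] between a and b."""
    height = 1.0 / (high - low)      # height that makes the total area 1
    left = max(a, low)               # clip the interval to where the density is nonzero
    right = min(b, high)
    return max(0.0, right - left) * height


above_08 = uniform_area(0.8, 1.0)        # part A: above 0.8
below_06 = uniform_area(0.0, 0.6)        # part B: below 0.6
between = uniform_area(0.25, 0.75)       # part C: between 0.25 and 0.75
mean = (0.0 + 1.0) / 2                   # part D: balance point of a symmetric curve

print(round(above_08, 2), round(below_06, 2), round(between, 2), mean)
```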
2. A line segment can be considered a density “curve,” as shown in the
figure. A “broken line” graph can also be considered a density curve.
A. Verify that the graph is a valid density curve.
To verify that the density curve is valid, calculate the area under the
curve. If the area is 1, the density curve is valid.
Divide the region under the curve into two parts: the trapezoid over 0 < x < 0.4
and the rectangle over 0.4 < x < 0.8.
The area of the trapezoid is 0.5(2 + 1)(0.4) = 0.6
The area of the rectangle is 1(0.4) = 0.4
The total area is 0.6 + 0.4 = 1, making the density curve valid.
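The same check can be sketched in code. The figure is not reproduced here, so the vertices below are an assumption read off the worked areas in this example: height 2 at x = 0, height 1 at x = 0.4, and height 1 at x = 0.8. The names POINTS, height, and area are illustrative.

```python
# Assumed vertices of the broken-line density (inferred from the worked areas).
POINTS = [(0.0, 2.0), (0.4, 1.0), (0.8, 1.0)]


def height(x):
    """Height of the density at x, by linear interpolation between vertices."""
    for (x0, y0), (x1, y1) in zip(POINTS[:-1], POINTS[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return 0.0  # the density is zero outside [0, 0.8]


def area(a, b):
    """Area under the density between a and b, one trapezoid per straight segment."""
    total = 0.0
    for (x0, _), (x1, _) in zip(POINTS[:-1], POINTS[1:]):
        lo, hi = max(a, x0), min(b, x1)
        if lo < hi:
            total += 0.5 * (height(lo) + height(hi)) * (hi - lo)
    return total


print(round(area(0.0, 0.8), 4))  # total area; a valid density curve gives 1
print(round(area(0.0, 0.2), 4))  # proportion of observations between 0 and 0.2
```

Because every piece of the graph is a straight line, the trapezoid formula is exact here, not an approximation.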
For each of the following, use areas under this density curve to find the
proportion of observations within the given interval:
B. 0.6 ≤ X ≤ 0.8
The area is 1(0.2) = 0.2, therefore the proportion of observations is 0.2.
C. 0 ≤ X ≤ 0.4
The area is 0.5(2 + 1)(0.4) = 0.6, therefore the proportion of
observations is 0.6.
D. 0 ≤ X ≤ 0.2
The area is 0.5(2 + 1.5)(0.2) = 0.35, therefore the proportion of observations is 0.35.
E. The median of this density curve is a point between X = 0.2 and X = 0.4. Explain why.
The median is the x-value such that the area to the left is 0.5 and the area to the right is 0.5. Since the
area to the left of 0.2 is 0.35 (less than 0.5) and the area to the right of 0.4 is 0.4 (also less than 0.5),
the point with area 0.5 on each side must lie between the two.
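To see where the median actually falls, one can search between 0.2 and 0.4 for the point with area 0.5 to its left. This sketch assumes the sloped part of the graph is the straight line from height 2 at x = 0 to height 1 at x = 0.4 (consistent with the areas computed above); the function names are illustrative, and bisection is only one convenient way to solve for the point.

```python
import math


def area_left(x):
    """Area under the density to the left of x, valid for 0 <= x <= 0.4.

    Assumes the line falls from height 2 at x = 0 to height 1 at x = 0.4.
    """
    top = 2.0 - 2.5 * x              # height of the line at x
    return 0.5 * (2.0 + top) * x     # trapezoid with parallel sides 2 and top


def median(lo=0.2, hi=0.4, tol=1e-10):
    """Bisection search for the x with area_left(x) = 0.5."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if area_left(mid) < 0.5:
            lo = mid                 # not enough area yet: move right
        else:
            hi = mid                 # too much area: move left
    return (lo + hi) / 2


m = median()
print(round(m, 4))  # 0.3101
```

Under that assumption the median is about 0.31, which is indeed between 0.2 and 0.4.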
Normal Distributions
One particularly important class of density curves is the Normal curves. The distributions they describe are
called Normal distributions. Any particular Normal distribution is completely specified by two numbers: its
mean and standard deviation. The mean of a Normal distribution is at the center of the symmetric Normal
curve. The standard deviation is the distance from the center to the inflection points on either side. We
abbreviate the Normal distribution with mean µ and standard deviation σ as N(µ, σ).
Normal distributions are important for three reasons:
1. They are good descriptions for some distributions of real data (e.g., scores on tests taken by many
people, such as SAT and IQ tests; repeated careful measurements of the same quantity; and characteristics of
biological populations).
2. They are good approximations to the results of many kinds of chance outcomes, like the number of
heads in many tosses of a fair coin.
3. They are used in many statistical inference procedures.
The standard Normal distribution is the Normal distribution with mean 0 and standard deviation 1. If a
variable x has any Normal distribution N(µ, σ) with mean µ and standard deviation σ, then the standardized
variable z = (x − µ)/σ has the standard Normal distribution.
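Standardizing is a single subtraction and division. The short sketch below uses the N(69, 2.5) distribution of men's heights from Example 3 as sample input; the function name standardize is an illustrative choice.

```python
def standardize(x, mu, sigma):
    """Return the z-score: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma


z_74 = standardize(74, 69, 2.5)      # 2 standard deviations above the mean
z_665 = standardize(66.5, 69, 2.5)   # 1 standard deviation below the mean
print(z_74, z_665)  # 2.0 -1.0
```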
In any Normal distribution, approximately
68% of the observations fall within 1 standard deviation of the
mean.
95% of the observations fall within 2 standard deviations of
the mean.
99.7% of the observations fall within 3 standard deviations of
the mean.
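These three percentages are rounded values. As a quick check, the sketch below computes the area within k standard deviations of the mean using the standard Normal cumulative proportion, written in terms of Python's math.erf. The helper name normal_cdf is an illustrative choice, not notation from the notes.

```python
import math


def normal_cdf(z):
    """Proportion of a standard Normal distribution at or below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


within = {k: normal_cdf(k) - normal_cdf(-k) for k in (1, 2, 3)}
for k, p in within.items():
    print(k, round(p, 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```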
A normal distribution curve has two points where curvature changes. These are called points of inflection, and
they are located 1 standard deviation on either side of the mean.
An observation's percentile is the percent of the distribution that is at or to the left of the observation. If,
for instance, you have a test score at the 90th percentile, then only 10% of the test-takers scored
higher than you did.
Example
3. The distribution of heights of adult American men is approximately normal with mean 69 inches and
standard deviation 2.5 inches. Draw a normal curve on which this mean and standard deviation are correctly
located.
A. What percent of men are taller than 74 inches?
B. Between what heights do the middle 95% of men fall?
C. What percent of men are shorter than 66.5 inches?
D. A height of 71.5 inches corresponds to what percentile of adult male American heights?
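The notes leave these four parts unanswered. One way to check work on them is the sketch below, which assumes heights are exactly N(69, 2.5) and uses the Normal cumulative proportion via math.erf. The names MU, SIGMA, and normal_cdf are illustrative. The exact values differ slightly from the rounded 68-95-99.7 figures (for example, about 2.3% versus 2.5% for part A).

```python
import math

MU, SIGMA = 69.0, 2.5  # mean and standard deviation of heights, in inches


def normal_cdf(x, mu=MU, sigma=SIGMA):
    """Proportion of an N(mu, sigma) distribution at or below x."""
    z = (x - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


taller_than_74 = 1.0 - normal_cdf(74.0)          # part A: 74 is 2 SDs above the mean
middle_95 = (MU - 2 * SIGMA, MU + 2 * SIGMA)     # part B: within 2 SDs, by the 95 part of the rule
shorter_than_66_5 = normal_cdf(66.5)             # part C: 66.5 is 1 SD below the mean
percentile_71_5 = 100.0 * normal_cdf(71.5)       # part D: 71.5 is 1 SD above the mean

print(round(100 * taller_than_74, 2))     # 2.28 (percent)
print(middle_95)                          # (64.0, 74.0)
print(round(100 * shorter_than_66_5, 2))  # 15.87 (percent)
print(round(percentile_71_5, 2))          # 84.13 (about the 84th percentile)
```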