2.1 Describing Location in a Distribution
See p.85 for example at top of the page.
Measuring Position: Percentiles
Definition: Percentile
The p-th percentile of a distribution is the value with p percent of the observations less than it.
See Mr. Pryor's first test example.
79 81 80 77 73 83 74 93 78 80 75
67 73 77 83 86 90 79 85 83 89
84 82 77 72
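The definition above can be checked by direct counting. The following is a minimal sketch, not the book's procedure: the function name percentile_of is made up for illustration, and the score 86 is chosen only because it appears in the list.

```python
# Mr. Pryor's first test scores, as listed above (25 students).
scores = [79, 81, 80, 77, 73, 83, 74, 93, 78, 80, 75,
          67, 73, 77, 83, 86, 90, 79, 85, 83, 89,
          84, 82, 77, 72]

def percentile_of(value, data):
    """Percent of the observations in data that are strictly less than value."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)

# 21 of the 25 scores are below 86, so 86 is at the 84th percentile.
print(percentile_of(86, scores))  # 84.0
```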
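Alternate Definition: Percentile
1. Percentiles are the summary measures that divide a ranked data set into 100 equal parts.
2. The approximate value of the k-th percentile, denoted pk, is
pk = the value of the (k*n)/100-th term in a ranked data set,
where n is the number of observations. If (k*n)/100 is not a whole number, round it to a whole-number position.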
Ex. Find the 20th percentile of the data set
24 28 33 33 37 39 47 51 59
n = 9, so (k*n)/100 = (20*9)/100 = 180/100 = 1.8, which rounds to 2.
So we want the 2nd observation => p20 = 28
Ex. Find the 65th percentile of the data set
3 5 6 6 7 9 9 10 11 12 14 15
k = 65 and n = 12, so (k*n)/100 = (65*12)/100 = 780/100 = 7.8, which rounds to 8.
So we want the 8th observation => p65 = 10
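The two worked examples can be reproduced with a short sketch. It assumes the position (k*n)/100 is rounded up to the next whole number, which matches both examples (1.8 becomes 2 and 7.8 becomes 8). The function name kth_percentile is hypothetical, and as noted elsewhere in these notes, other reasonable conventions exist.

```python
import math

def kth_percentile(data, k):
    """Approximate k-th percentile: the value at position (k*n)/100
    in the ranked data, with the position rounded up (an assumption)."""
    ranked = sorted(data)
    position = math.ceil(k * len(ranked) / 100)  # 1-based position
    position = max(position, 1)                  # guard against position 0
    return ranked[position - 1]

print(kth_percentile([24, 28, 33, 33, 37, 39, 47, 51, 59], 20))           # 28
print(kth_percentile([3, 5, 6, 6, 7, 9, 9, 10, 11, 12, 14, 15], 65))      # 10
```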
The book notes (p. 86) that mathematicians and statisticians calculate and interpret percentiles in several ways; there is no universal method for determining them.
We will continue to do what the book suggests. The AP test readers will accept any reasonable definition, and the MC answers will be correct by any reasonable method.
CUMULATIVE RELATIVE FREQUENCY GRAPHS
These are created with percentiles.
1. A graph that shows the running total, as a percent of all observations, of values that fall below the upper boundary of each class.
2. The plot is always non-decreasing.
See pages 86-88 for the example for this topic.
[Graph: cumulative relative frequency of the age of the first 44 presidents when elected; vertical axis runs from 0% to 100%]
For the 40-44 class, we plot the point at (45, 4.5%), since age 45 would be at the 4.5th percentile.
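The points of a cumulative relative frequency graph can be built up class by class. The sketch below is only an illustration: the presidents' ages are not reproduced in these notes, so it uses Mr. Pryor's test scores instead, and the class width of 5 (65-69, 70-74, and so on) is an assumption.

```python
# Mr. Pryor's first test scores (from earlier in these notes).
scores = [79, 81, 80, 77, 73, 83, 74, 93, 78, 80, 75,
          67, 73, 77, 83, 86, 90, 79, 85, 83, 89,
          84, 82, 77, 72]

class_width = 5                             # assumed class width
lower_bounds = range(65, 95, class_width)   # classes 65-69, 70-74, ..., 90-94

cumulative = 0
points = []  # (upper boundary of class, cumulative percent below it)
for low in lower_bounds:
    upper = low + class_width
    cumulative += sum(1 for x in scores if low <= x < upper)
    points.append((upper, 100 * cumulative / len(scores)))

for upper, percent in points:
    print(upper, percent)
```

Each point is plotted at the upper boundary of its class, and the percents never decrease because counts are only ever added.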
Age at Inauguration - Example
What can we learn? p87
p88 Age of U.S. Presidents
Do CYU on p89
Note: The 1st Quartile would be at the 25th Percentile
The 2nd Quartile would be at the 50th Percentile
The 3rd Quartile would be at the 75th Percentile
MEASURING POSITION: z-Scores
Converting observations from their original values to standard deviation units is known as standardizing.
Definition: Standardized value (z-score)
If x is an observation from a distribution that has known mean and standard deviation, the standardized value of x is z = (x - mean)/(standard deviation)
A standardized value is often called a z-score.
A z-score tells us how many standard deviations from the mean an observation falls, and in what direction!
(+) z-score = lies to the right of the mean
(-) z-score = lies to the left of the mean
Notation: z = (x - μ)/σ (population) or z = (x - x̄)/s (sample)
In other words: z = (Observation - Mean)/(Stand. Deviation)
Mr. Pryor's First Test, Again p90
79 81 80 77 73 83 74 93 78 80 75 67
73 77 83 86 90 79 85 83 89 84 82
77 72
Jenny Takes Another Test
z = (82-76)/4 = 1.50
Her 82 in chemistry was 1.5 standard deviations above the mean score for the class.
CYU on p91
Example: Find the z-score of 33
Data: 24 28 33 33 37 39 47 51 59
Mean = 39
Standard Deviation = 11.34681
z = (33-39)/(11.34681) = -0.53
33 lies 0.53 standard deviations below the mean.
Ex. 2 Find the z-score of 12
Data: 3 5 6 6 7 9 9 10 11 12 14 15
Mean = 8.916667 St. Deviation = 3.679386
z = (12-8.916667)/(3.679386) = 0.84
12 lies 0.84 standard deviations above the mean.
See extra examples on p18 of Black notebook
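The z-score calculations in these notes can be checked with a few lines of Python. This is a sketch under one assumption: the standard deviations quoted in the examples are sample standard deviations (dividing by n - 1), which is what statistics.stdev computes and what reproduces the values 11.34681 and 3.679386. The function names are made up for illustration.

```python
from statistics import mean, stdev

def z_score(x, center, spread):
    """Standardized value: (observation - mean) / standard deviation."""
    return (x - center) / spread

def z_from_data(x, data):
    """z-score of x using the mean and sample standard deviation of data."""
    return z_score(x, mean(data), stdev(data))

data1 = [24, 28, 33, 33, 37, 39, 47, 51, 59]
data2 = [3, 5, 6, 6, 7, 9, 9, 10, 11, 12, 14, 15]

print(round(z_from_data(33, data1), 2))  # -0.53
print(round(z_from_data(12, data2), 2))  # 0.84
print(z_score(82, 76, 4))                # 1.5 (Jenny's chemistry test)
```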
TRANSFORMING DATA
EFFECT OF ADDING OR SUBTRACTING A CONSTANT
Adding the same number a (either positive, zero, or negative) to each observation
1. adds a to measures of center and location (mean, median, quartiles, percentiles),
but
2. does not change the shape of the distribution or measures of spread (range, IQR, standard deviation).
EFFECT OF MULTIPLYING (OR DIVIDING) BY A CONSTANT
Multiplying (or dividing) each observation by the same number b (positive, negative, or zero; b cannot be zero when dividing)
1. multiplies (divides) measures of center and location (mean, median, quartiles, percentiles) by b,
2. multiplies (divides) measures of spread (range, IQR, standard deviation) by |b| (the absolute value of b),
but
3. does not change the shape of the distribution.
Estimating Room Width p93
8 9 10 10 10 10 10 10 10 11 11 11 11 12 12 13 13
14 14 14 15 15 15 15 15 15 15 15 16 16 16 17 17 17 17 18 18
20 22 25 27 35 38 40
Example: Effect of Subtracting by a Constant
Example: Effect of multiplying by a constant
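Both effects can be seen by transforming the room width guesses listed above and comparing summaries. The sketch below makes several assumptions that are not stated in these notes: the guesses are taken to be in meters, the constant subtracted is a hypothetical 13, and the multiplier 3.28 is the approximate number of feet in a meter.

```python
from statistics import mean, median, stdev

guesses = [8, 9, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 12, 12, 13, 13,
           14, 14, 14, 15, 15, 15, 15, 15, 15, 15, 15, 16, 16, 16,
           17, 17, 17, 17, 18, 18, 20, 22, 25, 27, 35, 38, 40]

def summary(data):
    """Measures of center (mean, median) and spread (sd, range)."""
    return {"mean": mean(data), "median": median(data),
            "sd": stdev(data), "range": max(data) - min(data)}

a = 13     # hypothetical constant to subtract
b = 3.28   # approximate feet per meter

shifted = [x - a for x in guesses]   # subtracting a constant
scaled = [x * b for x in guesses]    # multiplying by a constant

print(summary(guesses))
print(summary(shifted))  # center moves down by a; sd and range unchanged
print(summary(scaled))   # center and spread are all multiplied by b
```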
Example page 96 and 97
Too Cool at the Cabin?
F = (9/5)C + 32
a) Mean: (9/5)(8.43) + 32 = 47.17 degrees F
b) Standard deviation: (9/5)(2.27) = 4.09 degrees F
c) (9/5)(11) + 32 = 51.8 degrees F
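The arithmetic above can be reproduced in a short sketch. The point it illustrates is that the mean goes through the full conversion F = (9/5)C + 32, while the standard deviation is only multiplied by 9/5, because adding 32 does not change spread. The variable and function names are made up for illustration.

```python
def c_to_f(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return 9 / 5 * celsius + 32

mean_c, sd_c = 8.43, 2.27   # summary values in degrees C, from the example

mean_f = c_to_f(mean_c)     # multiply by 9/5, then add 32
sd_f = 9 / 5 * sd_c         # multiply by 9/5 only; adding 32 has no effect

print(round(mean_f, 2))       # 47.17
print(round(sd_f, 2))         # 4.09
print(round(c_to_f(11), 1))   # 51.8
```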
DENSITY CURVES
RECALL:
We learned some basic steps to exploring quantitative data.
1. Always plot your data: make a graph, usually a dotplot, stemplot or histogram.
2. Look for the overall pattern (shape, center, spread) and for striking departures such as outliers.
3. Calculate a numerical summary to briefly describe center and spread.
Let's add one more step!
4. Sometimes the overall pattern of a large number of observations is so regular that we can describe it by a smooth curve.
Example: Seventh-Grade Vocabulary Scores p100
From histogram to density curve
Definition: Density Curve
A density curve is a curve that
1. is always on or above the horizontal axis, and
2. has area exactly 1 underneath it.
A density curve describes the overall pattern of a distribution. The area under the curve and above any interval of values on the horizontal axis is the proportion of all observations that fall in that interval.
No set of real data is exactly described by a density curve. The curve is an approximation that is easy to use and accurate enough for practical use.
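The two defining properties, and the reading of area as a proportion, can be checked numerically. The sketch below uses a hypothetical density curve, f(x) = 2(1 - x) on the interval from 0 to 1, chosen only because its areas are easy to verify by hand; it is not an example from the book. Areas are approximated with the midpoint rule.

```python
def density(x):
    """Hypothetical density curve: f(x) = 2(1 - x) on [0, 1], 0 elsewhere."""
    return 2 * (1 - x) if 0 <= x <= 1 else 0.0

def area(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the area under f between a and b."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

total = area(density, 0, 1)         # should be 1 for a density curve
proportion = area(density, 0, 0.5)  # proportion of observations in [0, 0.5]

print(round(total, 4))       # 1.0
print(round(proportion, 4))  # 0.75
```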
Describing Density Curves
1. The median of a density curve is the "equal-areas point", the point with half the area under the curve to its left and the remaining half of the area to its right.
2. The mean of the density curve is the "balance point" of the distribution. (i.e. The mean is the point at which the curve would balance if made of solid material.)
a) If the density curve is symmetric then the mean and median are the same and they lie at the center of the curve:
b) If the curve is skewed then the mean is pulled away from the median in the direction of the long tail.
Right Skew
Left Skew
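The "equal-areas point" and "balance point" can both be approximated numerically. This sketch assumes a hypothetical right-skewed density curve, f(x) = 2(1 - x) on the interval from 0 to 1, which is not from the book. The median is found by bisection on the area to the left, and the mean as the area under x times f(x).

```python
def density(x):
    """Hypothetical right-skewed density: f(x) = 2(1 - x) on [0, 1]."""
    return 2 * (1 - x) if 0 <= x <= 1 else 0.0

def area(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the area under f between a and b."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

def median_of(f, low, high, tol=1e-6):
    """Equal-areas point: bisect until the area to the left is about 0.5."""
    a, b = low, high
    while b - a > tol:
        mid = (a + b) / 2
        if area(f, low, mid) < 0.5:
            a = mid
        else:
            b = mid
    return (a + b) / 2

def mean_of(f, low, high):
    """Balance point: the area under x * f(x)."""
    return area(lambda x: x * f(x), low, high)

med = median_of(density, 0, 1)
avg = mean_of(density, 0, 1)
print(round(med, 3), round(avg, 3))  # 0.293 0.333
```

For this curve the long tail is on the right, and the mean (about 0.333) lies to the right of the median (about 0.293), as statement b) describes.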
CYU on page 103
Homework p 105
1, 5, 9-15 odd
Homework p107
19-23 odd, omit 17, 31, 33-38 all