1.2 Describing Distributions with Numbers
AP Statistics
Fuel Economy for 2004 car models (miles per gallon)

2-seater cars                   Compact cars
Model           City   Hwy      Model           City   Hwy
Acura NSX         17    24      Ashton Mar        12    19
Audi              20    28      Audi TT           21    29
BMW Z4            20    28      BMW 325           19    27
Cadillac XLR      17    25      BMW 330           19    28
Corvette          18    25      BMW M3            16    23
Miata             22    28      Jaguar XK8        18    26
Viper             12    20      Jaguar XKR        16    23
Ferrari 360       11    16      Lexus SC          18    23
Ferrari M         10    16      Mini Cooper       25    32
Honda I           60    66      Mits Eclipse      23    31
Thunderbird       17    23      Porsche 911       14    22
Lotus             15    22      Mits Spyder       20    29

Mean and Median

Mean: The Average

Median: The Middle
Mean highway mileage for two-seaters:

\[
\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{518}{21} \approx 24.7
\]
Caution!!!
The mean is sensitive to the influence of a few
extreme observations. It is not a resistant
measure of center.
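
To see what "not resistant" means in practice, here is a minimal Python sketch that compares the mean and the median with and without one extreme observation. It assumes the 21 two-seaters behind the total of 518 are the 20 highway mileages listed in the quartile example plus the Honda entry from the table (66 mpg highway); the variable names are illustrative only.

```python
from statistics import mean, median

# Highway mileages (mpg) of the 20 two-seaters from the quartile example.
hwy_20 = [13, 15, 16, 16, 17, 19, 20, 22, 23, 23,
          23, 24, 25, 25, 26, 28, 28, 28, 29, 32]

# Assumption: the 21st car is the Honda entry from the table (66 mpg highway).
hwy_21 = hwy_20 + [66]

mean_with = mean(hwy_21)          # 518 / 21
mean_without = mean(hwy_20)       # 452 / 20
median_with = median(hwy_21)      # middle value of 21 observations
median_without = median(hwy_20)   # average of the two middle values

print(f"mean:   {mean_with:.1f} with the extreme value, {mean_without:.1f} without")
print(f"median: {median_with} with the extreme value, {median_without} without")
```

Dropping the single extreme value moves the mean by about two miles per gallon, while the median does not move at all.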
Measuring Spread
The simplest useful numerical description of a
distribution includes a description of both the
center and the spread.
• Range: the simplest measure of spread. The
highest value minus the lowest value.
• The pth percentile of a distribution is the value
such that p percent of the observations fall at or below it.
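
As a rough illustration of both ideas, the sketch below computes the range of the 20 two-seater highway mileages used in the quartile example and the percent of observations at or below a chosen value. The helper name `percent_at_or_below` is hypothetical, and counting ties as "at or below" is an assumed convention; some texts count only strictly smaller values.

```python
# Highway mileages (mpg) of the 20 two-seaters from the quartile example.
hwy = [13, 15, 16, 16, 17, 19, 20, 22, 23, 23,
       23, 24, 25, 25, 26, 28, 28, 28, 29, 32]

# Range: highest value minus lowest value.
value_range = max(hwy) - min(hwy)


def percent_at_or_below(data, value):
    """Percent of observations less than or equal to `value` (assumed convention)."""
    return 100 * sum(x <= value for x in data) / len(data)


pct_23 = percent_at_or_below(hwy, 23)

print("range:", value_range)
print("percent at or below 23 mpg:", pct_23)
```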

The Quartiles
Calculating Quartiles
The highway mileages of 20 two-seaters, arranged in
numerical order, are:
13 15 16 16 17 19 20 22 23 23 |23 24 25 25 26 28 28 28 29 32
The median is marked by |.
The 1st Quartile (Q1) is the median of the 10 numbers to the left of
the median.
The 3rd Quartile (Q3) is the median of the 10 numbers to the right
of the median.
If there is an odd number of observations, the median is the
middle number and it is excluded from the calculations of Q1 and
Q3.
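
The rule above can be sketched in Python as follows. The function name `quartiles` is hypothetical, and it implements only the "median of each half" method described here; statistical software often uses other interpolation rules and may give slightly different quartiles.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]                               # values left of the median
    upper = xs[half + 1:] if n % 2 else xs[half:]   # skip the median itself when n is odd
    return median(lower), median(xs), median(upper)


# Highway mileages (mpg) of the 20 two-seaters.
hwy = [13, 15, 16, 16, 17, 19, 20, 22, 23, 23,
       23, 24, 25, 25, 26, 28, 28, 28, 29, 32]

q1, m, q3 = quartiles(hwy)
print("Q1 =", q1, " median =", m, " Q3 =", q3)
```

For these 20 values, Q1 is the average of 17 and 19, and Q3 is the average of 26 and 28.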
The Five–Number Summary and
Boxplots
[Figure: boxplots of the highway and city gas mileages for
cars classified as two-seaters and minicompacts.]
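
A boxplot is drawn from the five-number summary: minimum, Q1, median, Q3, maximum. The sketch below computes it for the 20 two-seater highway mileages, assuming the median-of-halves quartile rule from the previous slide; `five_number_summary` is a hypothetical helper name, not a library function.

```python
from statistics import median


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    upper = xs[half + 1:] if n % 2 else xs[half:]   # drop the median when n is odd
    return xs[0], median(lower), median(xs), median(upper), xs[-1]


# Highway mileages (mpg) of the 20 two-seaters.
hwy = [13, 15, 16, 16, 17, 19, 20, 22, 23, 23,
       23, 24, 25, 25, 26, 28, 28, 28, 29, 32]

summary = five_number_summary(hwy)
print("min, Q1, median, Q3, max =", summary)
```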
Outliers
The interquartile range is IQR = Q3 – Q1.
Lower Bound: Q1 – 1.5(IQR)
• If an observation falls below the lower bound, it is
considered an outlier.
Upper Bound: Q3 + 1.5(IQR)
• If an observation falls above the upper bound, it is
considered an outlier.
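
Here is a minimal sketch of the 1.5(IQR) rule in Python, with IQR taken as Q3 minus Q1. It assumes the data are the 20 two-seater highway mileages plus the Honda entry from the table (66 mpg), and it uses the median-of-halves quartile rule; `find_outliers` is a hypothetical helper name.

```python
from statistics import median


def find_outliers(data):
    """Return (lower_bound, upper_bound, outliers) under the 1.5 * IQR rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    q1 = median(xs[:half])
    q3 = median(xs[half + 1:] if n % 2 else xs[half:])  # median excluded when n is odd
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    return lower, upper, [x for x in xs if x < lower or x > upper]


# Assumption: 20 two-seater highway mileages plus the Honda entry (66 mpg).
hwy_21 = [13, 15, 16, 16, 17, 19, 20, 22, 23, 23,
          23, 24, 25, 25, 26, 28, 28, 28, 29, 32, 66]

lower_bound, upper_bound, outliers = find_outliers(hwy_21)
print("bounds:", lower_bound, upper_bound)
print("outliers:", outliers)
```

With these 21 values the only observation outside the bounds is 66.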
Measuring Spread: Variance and
Standard Deviation
Some FAQs
• Why do we square the deviations?
  - If we did not, the positive and negative deviations would
    cancel and always add to zero.

• Why do we emphasize the standard deviation rather
  than the variance?
  - The standard deviation measures spread about the mean
    in the original scale (the same units as the data).

• Why do we average by dividing by n-1 rather than n
  in calculating the variance?
  - Because the sum of the deviations is always zero, the last
    deviation can be found once we know the other n-1. The
    number n-1 is called the degrees of freedom.
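
The three answers can be checked numerically. The sketch below, using the 20 two-seater highway mileages, shows that the raw deviations sum to zero (up to floating-point rounding), then computes the variance with the n-1 divisor and the standard deviation as its square root. The variable names are illustrative; `statistics.variance` and `statistics.stdev` are the standard-library sample versions, used here only as a cross-check.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Highway mileages (mpg) of the 20 two-seaters.
hwy = [13, 15, 16, 16, 17, 19, 20, 22, 23, 23,
       23, 24, 25, 25, 26, 28, 28, 28, 29, 32]

n = len(hwy)
x_bar = mean(hwy)

# Deviations from the mean: positives and negatives cancel.
deviations = [x - x_bar for x in hwy]
sum_dev = sum(deviations)          # zero, up to floating-point rounding

# Variance: sum of squared deviations divided by n - 1 (degrees of freedom).
s2 = sum(d * d for d in deviations) / (n - 1)

# Standard deviation: back in the original units (mpg).
s = sqrt(s2)

print(f"sum of deviations = {sum_dev:.10f}")
print(f"variance s^2 = {s2:.2f}, standard deviation s = {s:.2f}")
print(f"library check: {variance(hwy):.2f}, {stdev(hwy):.2f}")
```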
Choosing Measures of Center and
Spread