Chapter 2: Describing Distributions with Numbers
Describing Data – A brief description of a distribution should include its shape, center,
and spread
Shape – based on inspection of histogram or stemplot
Center – Mean and median
Spread – Quartiles, standard deviation
Mean – ordinary arithmetic average
$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$ or $\bar{x} = \frac{1}{n} \sum x_i$
Example: A student receives test scores of 78, 82, 85, 88, and 94. Find the mean.
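The arithmetic for this example can be checked with a short Python sketch. The function name `mean` is chosen here for illustration and is not part of the notes.

```python
def mean(values):
    """Arithmetic average: sum of the observations divided by their count."""
    return sum(values) / len(values)

scores = [78, 82, 85, 88, 94]
print(mean(scores))  # 427 / 5 = 85.4
```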
Median (M) – midpoint of a distribution – half the observations are smaller and half are larger

Arrange all observations from smallest to largest

If n is odd, M is the center observation, located at position (n+1)/2 in the ordered list

If n is even, M is the mean of the two center observations
Example: A student receives test scores of 78, 82, 85, 88, and 94. Find M.
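A minimal Python sketch of the median rule above (sort, then take the center value or the mean of the two center values). The function name `median` is an illustrative choice.

```python
def median(values):
    """Median by the sort-and-pick-the-center rule."""
    ordered = sorted(values)          # arrange from smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd n: the single center observation
        return ordered[mid]
    # even n: mean of the two center observations
    return (ordered[mid - 1] + ordered[mid]) / 2

scores = [78, 82, 85, 88, 94]
print(median(scores))  # n = 5, so M is the 3rd value: 85
```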
Comparing the mean and median
Example: An instructor gives students the option of using either the mean or the
median for computing their final grade. Which option would you choose if you
typically received:
A. Two good scores and one poor score
B. Two poor scores and one good score
C. One poor, one fair, and one good score, approximately evenly spaced
Example: The Census Bureau reports that in 2001 the median income of American
households was $42,228 and the mean income was $58,208. Explain this
discrepancy.
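One way to see why the two measures can disagree is to compute both on a small skewed data set. The incomes below are made-up numbers (in thousands of dollars) chosen only for illustration; they are not Census data.

```python
import statistics

# Hypothetical household incomes in thousands of dollars (invented values).
# One very large value creates a long right tail.
incomes = [20, 30, 40, 50, 300]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print("mean:", mean_income)
print("median:", median_income)
```

The single large income pulls the mean well above the median, while the median stays at the middle observation.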
Quartiles: Measuring Spread

Mark out the middle half of the data

To calculate quartiles:
o Find M
o Q1 is the median of the lower half
o Q3 is the median of the upper half
Example: Find the quartiles of the following data set:
6 19 20 21 26 27 28 30 31 32 34 36 38 40 45 50
Example: Find the quartiles of the following data set, which has an odd number of observations:
6 19 20 21 26 27 28 30 31 32 34 36 38 40 45 50 52
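The quartile steps above can be sketched in Python as follows. One assumption is made that the notes do not state: when n is odd, the median itself is left out of both halves. Other conventions exist, and software packages may give slightly different quartiles. The function names are illustrative.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, M, Q3) using the median-of-halves method.

    Assumption: for odd n the median is excluded from both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)

even_set = [6, 19, 20, 21, 26, 27, 28, 30, 31, 32, 34, 36, 38, 40, 45, 50]
odd_set = even_set + [52]

print(quartiles(even_set))  # (23.5, 30.5, 37.0)
print(quartiles(odd_set))   # (23.5, 31, 39.0)
```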
The Five-Number Summary & Boxplots
Min   Q1   M   Q3   Max
Boxplot

Central box spans the first and third quartiles

Line in box marks M

Lines extend from the box out to Min and Max
Example: Give the 5-Number Summary and a Boxplot for the Babe Ruth data
22 25 34 35 41 41 46 46 46 47 49 54 54 59 60
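A short Python sketch of the five-number summary for this data set. It assumes the median-of-halves rule for quartiles with the median excluded from both halves when n is odd; that convention is an assumption here, and plotting libraries may compute quartiles differently. The function names are illustrative.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def five_number_summary(values):
    """Return (Min, Q1, M, Q3, Max).

    Assumption: for odd n the median is excluded from both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]

ruth = [22, 25, 34, 35, 41, 41, 46, 46, 46, 47, 49, 54, 54, 59, 60]
print(five_number_summary(ruth))  # (22, 35, 46, 54, 60)
```

In the boxplot, the box would run from 35 to 54 with a line at 46, and the lines would extend out to 22 and 60.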
Variance - measures spread by examining how far data values are from their mean
$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$ or $s^2 = \frac{1}{n - 1} \sum (x_i - \bar{x})^2$
Standard Deviation – square root of the variance, $s = \sqrt{s^2}$
Example: (2.8 from your text) The level of various substances in the blood
influences our health. Here are measurements of the level of phosphate in the
blood of a patient, in milligrams of phosphate per deciliter of blood, made on 6
consecutive visits to a clinic:
Phosphate levels ($x_i$)    Deviations ($x_i - \bar{x}$)    Squared deviations ($(x_i - \bar{x})^2$)
5.6
5.2
4.6
4.9
5.7
6.4
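The table can be filled in, and the variance and standard deviation computed, with a short Python sketch. It follows the definition of variance with the n - 1 divisor; the variable names are illustrative.

```python
import math

levels = [5.6, 5.2, 4.6, 4.9, 5.7, 6.4]   # phosphate levels x_i (mg/dl)
n = len(levels)
x_bar = sum(levels) / n                    # mean of the observations

deviations = [x - x_bar for x in levels]   # x_i - x_bar
squared = [d ** 2 for d in deviations]     # (x_i - x_bar)^2

variance = sum(squared) / (n - 1)          # divide by n - 1, not n
std_dev = math.sqrt(variance)

# One row per observation: level, deviation, squared deviation
for x, d, sq in zip(levels, deviations, squared):
    print(f"{x:4.1f}  {d:+6.2f}  {sq:6.4f}")

print(f"mean = {x_bar:.2f}, s^2 = {variance:.3f}, s = {std_dev:.3f}")
# mean = 5.40, s^2 = 0.412, s = 0.642
```

The deviations sum to zero (up to rounding), which is a quick check on the arithmetic.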
Numerical Summaries Review

Always plot your data

Look for an overall pattern and for outliers

Calculate a numerical summary to describe center & spread