Data Description
Measures of Central Tendency
Chapter 2 showed how to organize raw
data into frequency distributions and
then present the data by using
various graphs. This chapter will
show the statistical methods that can
be used to summarize the data. The
most familiar of these methods is
finding the average.
The average speed of a car
crossing midtown Manhattan during
the day is 5.3 mph
The average number of minutes an
American father spends alone daily
with his child is 42 minutes
“Average,” when you stop and
think of it, is a funny concept.
Although it describes all of us, it
describes none of us…. While
none of us wants to be the
average American, we all want to
know about him or her.
The vast majority of people have
more than the average number of
legs.
The average American man is five
feet nine inches tall…
The average American woman is 5
feet 3.6 inches tall.
The average American is sick in
bed seven days a year…
On the average day, 24 million
people receive animal bites…
By his or her 70th birthday, the
average American will have eaten
14 steers, 1050 chickens, 3.5
lambs, and 25.2 hogs…
The word average is ambiguous,
since several different methods
can be used to obtain an
average.
Loosely stated, average means
the center of the distribution or
the most typical case.
Even if a shoe store owner
knows that the average man’s
foot is a size ten, they wouldn’t
be in business very long if the
only size they stocked was size
ten.
Measures of Central Tendency
this is where the notes
begin…
Statisticians use samples taken
from populations; however, when
populations are small, it is not
necessary to use samples since
the entire population can be used
to gain information.
Measures found by using all the
data in a population are called
parameters
Measures obtained by using data
values of samples are called
statistics
General Rounding Rule
DO NOT ROUND UNTIL THE
FINAL STEP!!!!
Mean
Arithmetic average
Mean is found by adding the values of
the data and dividing by the total
number of values.
The symbol for the mean is x̄
(pronounced: x bar)
Formula for sample Mean
\[ \bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{\sum x}{n} \]
Formula for population Mean
\[ \mu = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{\sum x}{n} \]
In statistics, Greek letters are
used to denote parameters, and
Roman letters are used to denote
statistics
In most cases the Mean is not
an actual data value
2, 3, 4, 8, 10
mean = 5.4
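The definition above can be checked with a few lines of Python. This is a minimal sketch; the function name `mean` is chosen here for illustration and is not part of the original notes.

```python
def mean(values):
    """Arithmetic mean: add the values, then divide by how many there are."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


data = [2, 3, 4, 8, 10]
result = mean(data)
print(result)            # 5.4
print(result in data)    # False: the mean is not one of the data values
```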
Rounding Rule for the Mean
The mean should be rounded to one
more decimal place than occurs in
the raw data. For example, if the
raw data are given in whole
numbers, the mean should be
rounded to nearest tenth. If the
data were given in tenths, then the
mean should be rounded to
hundredths, and so on.
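One possible way to apply this rounding rule in Python is sketched below. It assumes the raw data arrive as strings, so the number of recorded decimal places is visible, and it assumes ties are rounded half up; both choices, and the name `rounded_mean`, are illustrative assumptions rather than part of the notes.

```python
from decimal import Decimal, ROUND_HALF_UP


def rounded_mean(raw):
    """Mean rounded to one more decimal place than the raw data.

    raw: list of strings as recorded, e.g. ["2", "3"] or ["1.5", "2.0"].
    """
    values = [Decimal(s) for s in raw]
    # Decimal places used in the raw data (0 for whole numbers).
    places = max(max(-v.as_tuple().exponent, 0) for v in values)
    # Compute the exact mean first; round only at the final step.
    exact = sum(values) / len(values)
    quantum = Decimal(10) ** -(places + 1)   # 0.1, 0.01, ...
    return exact.quantize(quantum, rounding=ROUND_HALF_UP)


print(rounded_mean(["2", "3", "4", "8", "10"]))   # 5.4
print(rounded_mean(["1.5", "2.0", "2.2"]))        # 1.90
```

Using `Decimal` avoids binary floating-point surprises when rounding; it also keeps the rounding at the last step, in line with the general rounding rule.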
Finding the mean for grouped data
Abbreviation for frequency – f
Abbreviation for midpoint - xm
Finding the mean for grouped
data
First set up your headers
Fill in the data
find the midpoint of each
class
multiply the frequency by the
midpoint for each class
find the sum of column D
Divide the sum of column D
by n to get the grouped mean
Is this mean the mean of the
data?
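The grouped-data steps can be sketched in Python as follows. The class limits and frequencies here are hypothetical values made up for illustration, since the table from the slides is not reproduced in this transcript. Because each value is replaced by its class midpoint, the result is generally an approximation of the mean of the raw data, not the exact mean.

```python
def grouped_mean(classes):
    """Approximate mean from grouped data.

    classes: list of (lower, upper, frequency) tuples.
    """
    n = sum(f for _, _, f in classes)          # total number of values
    if n == 0:
        raise ValueError("no data in any class")
    # Multiply each frequency f by its class midpoint x_m, then sum.
    total = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total / n


# Hypothetical frequency distribution (not from the original slides).
classes = [(10, 20, 3), (20, 30, 5), (30, 40, 2)]
print(grouped_mean(classes))   # 24.0
```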
The Median
The symbol for median is MD
The median is the halfway point in a
data set. Before you can find this
point, the data must be arranged in
order. When the data set is ordered,
it is called a data array.
Steps in computing the
median
1. Arrange the data in order
2. Select the middle point
The number of rooms in the seven
hotels in downtown Pittsburgh is
713, 300, 618, 595, 311, 401, 292.
Find the median.
292, 300, 311, 401, 595, 618, 713
The median is 401, the middle value of the ordered data.
The number of tornadoes that have
occurred in the US over an 8-year
period follows. Find the median.
656, 684, 702, 764, 856, 1132, 1133, 1303
The median falls between 764 and 856.
Since the median falls between
two numbers, you have to do a
little math
Add the two numbers on
either side and divide by two.
\[ MD = \frac{764 + 856}{2} = 810 \]
If there is an odd number of data values,
the median is the middle data value.
If there is an even number of data values,
the median is the average of the
middle pair of data values.
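The odd and even cases can be sketched in Python as below, using the hotel and tornado data from the examples. The function name `median` is chosen for illustration.

```python
def median(values):
    """Median: order the data, then take the middle point."""
    ordered = sorted(values)            # the data array
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]             # odd count: the middle value
    # Even count: average the middle pair.
    return (ordered[mid - 1] + ordered[mid]) / 2


hotels = [713, 300, 618, 595, 311, 401, 292]
tornadoes = [684, 764, 656, 702, 856, 1133, 1132, 1303]
print(median(hotels))      # 401
print(median(tornadoes))   # 810.0
```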
The Mode
The mode is the value that
occurs most often in a data set
A data set may have one
mode, more than one mode,
or no mode at all.
The following data represent the
duration (in days) of Space Shuttle
voyages for the years 1992-94.
Find the mode
8, 9, 14, 10, 8, 9, 14, 8, 8, 7, 6, 9, 7, 8, 11, 10, 14, 11
The mode is 8
If every datum is unique (no
number repeated) the data set is
said to have no mode.
Do not say mode is zero.
This indicates that there is a
mode and that it is 0.
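A short Python sketch of these cases follows. It returns a list so that one mode, several modes, and no mode can all be represented; an empty list stands for "no mode" rather than 0. The name `modes` and the list-based return value are illustrative choices.

```python
from collections import Counter


def modes(values):
    """Return the list of modes; an empty list means no mode."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []                       # every value is unique: no mode
    return sorted(v for v, c in counts.items() if c == top)


shuttle = [8, 9, 14, 10, 8, 9, 14, 8, 8, 7, 6, 9, 7, 8, 11, 10, 14, 11]
print(modes(shuttle))            # [8]
print(modes([1, 1, 2, 2, 3]))    # [1, 2]  (more than one mode)
print(modes([1, 2, 3]))          # []      (no mode)
```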
The mode for grouped data is
the modal class.
The class with the largest
frequency
Example 3-14
mean = 20,000
median = 12,000
mode = 9,000
What is the average?
Midrange
Symbol for midrange is MR
defined as the sum of the lowest
value and the highest value in a
data set, divided by 2
a very rough estimate of the
middle
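As a minimal sketch, the midrange can be computed in Python like this. The data set is reused from the earlier mean example, and the function name `midrange` is chosen for illustration.

```python
def midrange(values):
    """Midrange: (lowest value + highest value) / 2."""
    if not values:
        raise ValueError("the midrange of an empty data set is undefined")
    return (min(values) + max(values)) / 2


print(midrange([2, 3, 4, 8, 10]))   # 6.0
```

Because only the two extreme values are used, a single very high or very low value shifts the midrange noticeably, which is why it is only a rough estimate.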
Weighted Mean
Sometimes, one must find the
mean of a data set in which not
all values are equally represented.
Consider the case of finding the
average cost of gasoline for three
taxis.
Taxi 1 buys 5 gallons @ 1.19/gal
Taxi 2 buys 8 gallons @ 1.27/gal
Taxi 3 buys 17 gallons @ 1.32/gal
What was the average cost of gasoline?
Is it just the average of the three prices,
(1.19 + 1.27 + 1.32) / 3 = 1.26?
NO!
The taxis bought different amounts, so each
price must be weighted by the number of
gallons purchased.
\[ \bar{x} = \frac{5(1.19) + 8(1.27) + 17(1.32)}{5 + 8 + 17} = \frac{38.55}{30} = 1.285 \]
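The taxi calculation can be sketched in Python as follows, with the gallons purchased acting as the weights. The function name `weighted_mean` and the variable names are illustrative, not from the notes.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of (weight * value) divided by the sum of the weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


prices = [1.19, 1.27, 1.32]    # price per gallon for taxis 1, 2, 3
gallons = [5, 8, 17]           # gallons bought by each taxi (the weights)

print(round(weighted_mean(prices, gallons), 3))   # 1.285
print(round(sum(prices) / len(prices), 3))        # 1.26, the unweighted mean
```

The weighted result is higher than the unweighted one because the taxi that bought the most gasoline also paid the highest price.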
Assignment
Exercise set 3-2
page 109
1-35