Descriptive Statistics
Measures of Center
• Essentials
• Notation
• Measures of Center
• Mean
• Median
• Mode
• Mid-range
• Example
• Which to Use: Mean vs. Median vs. Mode
• Additional Topics (not addressed)
Essentials: Measures of Center
(The great mean vs. median conundrum.)

Be able to identify the characteristics of the median,
mean and mode, and to which types of data each applies.

Be able to calculate the median, mean and mode, as
appropriate, for a set of data.

Know which measures are affected by extreme values and which are
resistant to them. What are the implications for the mean and median?
Some Notation

Σ: denotes the addition of a set of values.
X: (capital) is the variable usually used to represent the individual data values.
x_i: (small letter) represents a single value of a variable, from the first value, x_1, to the last value, x_n.
n: represents the number of data values in a sample.
N: represents the number of data values in a population.
Measures of Center

Measures of Central Tendency

Indicate where the center or most typical value of a data set
lies

Are often thought of as averages

Include the Mean, Median, Mode, and Midrange
The Mean (Arithmetic)
The Formula:
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]

The “average” of a set of data.

Is the sum of the observations divided by the number of
observations.

Is used only with quantitative data.
Population Mean vs. Sample Mean
A Sample Mean is represented by the lower case letter x with a bar above it
(called x-bar):
\[ \bar{x} = \frac{\sum x}{n} \]
A Population Mean is represented by the lower case Greek letter μ (mu):
\[ \mu = \frac{\sum x}{N} \]
Median

The middle observation in a set of data.

Divides the data such that 50% of the observations lie
below the median and 50% lie above it.

Is used only with quantitative data.

To obtain the median, the data must be placed in
increasing order.
MEDIAN: The Formula

First: Arrange the scores in increasing order.
Second: Apply the formula (n+1)/2, where n is the number of data values, to
find the position of the Median.

If there is an ODD number of scores, the middle score is the value of the Median.
e.g.: 1, 3, 6 => the position of the Median is (n+1)/2 = (3+1)/2 = 2. So, the
Median is the value in the second position of the list of values. Here the
second value is the number 3.

If there is an EVEN number of scores, the Median lies between the two middle scores.
e.g.: 1, 2, 8, 15 => the position of the Median is (n+1)/2 = (4+1)/2 = 2.5. So,
the Median is the value that lies halfway between the second and third data
values. Here that value is (2 + 8)/2 = 5.

Remember, the formula computes a position, not a data value.
Calculating a Median:

Determine the median for the following
backpack weights:

Backpack weights (lb): 10, 14, 12, 18, 32,
15, 22, 19, 23, 61.
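
The two-step procedure (sort, then apply the (n+1)/2 position rule) can be sketched in Python. This is only an illustration: the function name `median_by_position` is made up for this example and is not part of the slides, and the sketch assumes a non-empty list of numbers.

```python
def median_by_position(values):
    """Return the median using the (n + 1) / 2 position rule (hypothetical helper)."""
    ordered = sorted(values)               # First: arrange in increasing order
    n = len(ordered)
    position = (n + 1) / 2                 # Second: 1-based position of the median
    lower = ordered[int(position) - 1]     # value at (or just below) that position
    if position == int(position):          # odd n: the position is a whole number
        return lower
    upper = ordered[int(position)]         # even n: next value up
    return (lower + upper) / 2             # halfway between the two middle values


backpack_weights = [10, 14, 12, 18, 32, 15, 22, 19, 23, 61]

print(median_by_position([1, 3, 6]))         # odd example from the slide
print(median_by_position([1, 2, 8, 15]))     # even example from the slide
print(median_by_position(backpack_weights))  # position 5.5 of the sorted weights
```

For the backpack weights, n = 10, so the position is 5.5. The fifth and sixth sorted values are 18 and 19, which gives a median of 18.5 lb.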
MODE: The Formula

The most frequently occurring score in a data set.

Obtain the frequency of each value.

A Frequency Table based upon Single-Value
Grouping or a Dot Plot would display this information.

Used with both qualitative and quantitative data.

It is the only measure of center for qualitative data.

There may be more than one Mode.

If there are two modes, the data set is bimodal.

If there are more than two modes, the data set is multimodal.

If every value occurs the same number of times, then there is no
mode.
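
As a rough sketch of the rules above, the frequencies can be tallied with Python's `collections.Counter`. The function name `modes` is an assumption for this illustration. So is the treatment of a data set with only one distinct value, which the sketch reports as a single mode.

```python
from collections import Counter


def modes(values):
    """Return a list of modes; an empty list means 'no mode' (hypothetical helper)."""
    if not values:
        return []
    counts = Counter(values)                 # frequency of each value
    highest = max(counts.values())
    # Assumption: with two or more distinct values that all occur equally
    # often, the slide's rule applies and there is no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return [value for value, c in counts.items() if c == highest]


print(modes([20, 20, 21, 22]))          # one mode
print(modes([1, 1, 2, 2, 3]))           # two modes: bimodal
print(modes([1, 2, 3]))                 # same frequency everywhere: no mode
print(modes(["red", "blue", "red"]))    # works for qualitative data too
```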
Midrange

The Midrange is a measure of center of a distribution. It indicates
the value midway between the highest and lowest values in a data
set. To find the midrange:
\[ \text{Midrange} = \frac{\text{Highest Value} + \text{Lowest Value}}{2} \]
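
A minimal sketch of the midrange calculation, applied to the backpack weights from the median exercise. The function name `midrange` is chosen for illustration only.

```python
def midrange(values):
    """Value midway between the highest and lowest entries (hypothetical helper)."""
    return (max(values) + min(values)) / 2


backpack_weights = [10, 14, 12, 18, 32, 15, 22, 19, 23, 61]
print(midrange(backpack_weights))   # (61 + 10) / 2
```

Because only the two extreme values enter the calculation, the single 61 lb backpack moves the midrange to 35.5 lb, well above most of the weights.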
Example: Comparing the
Mean, Median, and Mode
Find the mean, median, and mode of the sample ages of a class
shown. Which measure of central tendency best describes a
typical entry of this data set? Are there any outliers?
Ages in a class:
20  20  20  20  20  20  21  21  21  21
22  22  22  23  23  23  23  24  24  65
Source: Larson/Farber 4th ed.
Solution: Comparing the Mean,
Median, and Mode
Ages in a class:
20  20  20  20  20  20  21  21  21  21
22  22  22  23  23  23  23  24  24  65
Mean:
\[ \bar{x} = \frac{\sum x}{n} = \frac{20 + 20 + \dots + 24 + 65}{20} = \frac{475}{20} \approx 23.8 \text{ years} \]
Median:
\[ \frac{21 + 22}{2} = 21.5 \text{ years} \]
Mode:
20 years (the entry occurring with the greatest frequency)
Source: Larson/Farber 4th ed.
Solution: Comparing the
Mean, Median, and Mode
Mean ≈ 23.8 yrs.
Median = 21.5 yrs.
Mode = 20 yrs.
• The mean takes every entry into account, but is
influenced by the outlier of 65.
• The median here was determined by taking the
middle two entries into account, and it is not
affected by the outlier.
• In this case the mode exists, but it doesn't
appear to represent a typical entry.
Source: Larson/Farber 4th ed.
Solution: Comparing the
Mean, Median, and Mode
Sometimes a graphical comparison can help you
decide which measure of central tendency best
represents a data set.
In this case, it appears that the median best describes
the data set.
Source: Larson/Farber 4th ed.
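
One way to get a quick graphical comparison is a text dot plot of the ages, printed next to the three measures. The sketch below uses Python's standard `statistics` module. The helper `text_dot_plot` is a hypothetical name introduced here, not something from the slides or the textbook.

```python
from collections import Counter
from statistics import mean, median, mode

ages = [20, 20, 20, 20, 20, 20, 21, 21, 21, 21,
        22, 22, 22, 23, 23, 23, 23, 24, 24, 65]


def text_dot_plot(values):
    """Return one line per distinct value, with one '*' per occurrence."""
    counts = Counter(values)
    return [f"{value:>3} | {'*' * counts[value]}" for value in sorted(counts)]


for line in text_dot_plot(ages):
    print(line)

print("mean  :", mean(ages))     # uses every entry, including 65
print("median:", median(ages))   # average of the 10th and 11th sorted entries
print("mode  :", mode(ages))     # most frequent entry
```

In the printed plot, the 65 sits far from the cluster of ages between 20 and 24, which is why the mean lands above almost every entry while the median stays inside the cluster.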
Mean vs. Median vs. Mode

Which is the best Measure of Center?



MEAN:

Is sensitive to the influence of extreme scores (outliers),
which will “pull” the mean away from the center.

Involves ALL data values in the calculation.
MEDIAN:

Is resistant to the influence of extreme values.

Only uses One or Two points in its calculation.
MODE:

May not be anywhere near the center of the data.

Not really aimed at finding the middle of the data.

Is the ONLY “Measure of Center” for Qualitative Data.
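
The difference between "sensitive" and "resistant" can be illustrated by recomputing the mean and median of the class ages with the outlier (65) removed. This is a sketch using the standard `statistics` module, and dropping the value 65 is done here purely for demonstration.

```python
from statistics import mean, median

ages = [20, 20, 20, 20, 20, 20, 21, 21, 21, 21,
        22, 22, 22, 23, 23, 23, 23, 24, 24, 65]
without_outlier = [age for age in ages if age != 65]

# How far each measure moves when the single extreme value is removed
mean_shift = mean(ages) - mean(without_outlier)
median_shift = median(ages) - median(without_outlier)

print(f"mean:   {mean(ages):.2f} -> {mean(without_outlier):.2f}")
print(f"median: {median(ages):.2f} -> {median(without_outlier):.2f}")
print(f"mean shift {mean_shift:.2f} vs. median shift {median_shift:.2f}")
```

Removing one entry out of twenty moves the mean by more than two years, while the median moves by only half a year.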
Additional Topics
Weighted Means

Weighted Mean – a mean computed with different scores
assigned different weights. To find the weighted mean:
\[ \bar{x} = \frac{\sum (w \cdot x)}{\sum w} \]
Weighted Example: Finding a
Weighted Mean
You are taking a class in which your grade is
determined from five sources: 50% from
your test mean, 15% from your midterm,
20% from your final exam, 10% from your
computer lab work, and 5% from your
homework. Your scores are 86 (test mean),
96 (midterm), 82 (final exam), 98 (computer
lab), and 100 (homework). What is the
weighted mean of your scores? If the
minimum average for an A is 90, did you get
an A?
Source: Larson/Farber 4th ed.
Solution: Finding a Weighted Mean
Source         Score, x   Weight, w   x∙w
Test Mean      86         0.50        86(0.50) = 43.0
Midterm        96         0.15        96(0.15) = 14.4
Final Exam     82         0.20        82(0.20) = 16.4
Computer Lab   98         0.10        98(0.10) = 9.8
Homework       100        0.05        100(0.05) = 5.0
                          Σw = 1      Σ(x∙w) = 88.6

\[ \bar{x} = \frac{\sum (x \cdot w)}{\sum w} = \frac{88.6}{1} = 88.6 \]
Your weighted mean for the course is 88.6. You did not get
an A.
Source: Larson/Farber 4th ed.
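
The same calculation can be sketched in Python. The function name `weighted_mean` is an assumption for this illustration. The scores and weights are the ones from the example.

```python
def weighted_mean(scores, weights):
    """Sum of (weight * score) divided by the sum of the weights (hypothetical helper)."""
    total = sum(w * x for x, w in zip(scores, weights))
    return total / sum(weights)


scores = [86, 96, 82, 98, 100]               # test mean, midterm, final, lab, homework
weights = [0.50, 0.15, 0.20, 0.10, 0.05]     # fractions of the course grade

course_grade = weighted_mean(scores, weights)
print(round(course_grade, 1))                # rounded to hide floating-point noise
print("A" if course_grade >= 90 else "not an A")
```

Dividing by the sum of the weights means the function also works when the weights do not add up to 1.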
Weighted Means Example

\[ \bar{x} = \frac{\sum (w \cdot x)}{\sum w} \]

Calculating a GPA.

Given the following four grades, calculate the semester GPA.

Statistics A (of course; 3 CrHrs; numeric value for an A = 4)

History B (3 CrHr; B = 3)

Physics C (3 CrHr; C = 2)

Physical Education C (1 CrHr)

The grade numeric equivalents are the x values. The credit hour
values are the weights.

Calculate the student’s GPA.
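
One possible way to work this exercise in Python is shown below. The grade-point mapping (A = 4, B = 3, C = 2) comes from the slide. The variable names and the tuple layout are assumptions made for this sketch.

```python
GRADE_POINTS = {"A": 4, "B": 3, "C": 2}   # numeric equivalents given on the slide

# (course, letter grade, credit hours)
semester = [
    ("Statistics", "A", 3),
    ("History", "B", 3),
    ("Physics", "C", 3),
    ("Physical Education", "C", 1),
]

# Grade points are the x values; credit hours are the weights.
quality_points = sum(GRADE_POINTS[grade] * hours for _, grade, hours in semester)
credit_hours = sum(hours for _, _, hours in semester)
gpa = quality_points / credit_hours

print(quality_points, credit_hours, gpa)
```

Here Σ(w∙x) = 12 + 9 + 6 + 2 = 29 and Σw = 10, so the GPA is 2.9.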
Finding a Mean From a
Frequency Table (Grouped Data)
When we view data in a frequency table, it is impossible to know the
exact values falling in a particular class, so the mean can only be
approximated. To approximate it, obtain the product of each frequency
and class midpoint (here “x”), add the products, and then divide by
the sum of the frequencies:
\[ \bar{x} = \frac{\sum (f \cdot x)}{\sum f} \]
Finding the Mean of a Frequency
Distribution
In Words, with each step followed by the same step In Symbols:
1. Find the midpoint of each class.
   \[ x = \frac{(\text{lower limit}) + (\text{upper limit})}{2} \]
2. Find the sum of the products of the midpoints and the frequencies.
   \[ \sum (x \cdot f) \]
3. Find the sum of the frequencies.
   \[ n = \sum f \]
4. Find the mean of the frequency distribution.
   \[ \bar{x} = \frac{\sum (x \cdot f)}{n} \]
Source: Larson/Farber 4th ed.
Example: Find the Mean of a
Frequency Distribution
Use the frequency distribution to approximate the mean number
of minutes that a sample of Internet subscribers spent online
during their most recent session.
Class      Midpoint   Frequency, f
7 – 18     12.5       6
19 – 30    24.5       10
31 – 42    36.5       13
43 – 54    48.5       8
55 – 66    60.5       5
67 – 78    72.5       6
79 – 90    84.5       2
Source: Larson/Farber 4th ed.
Example: Find the Mean of a Frequency
Distribution
Use the frequency distribution to approximate the mean
number of minutes that a sample of Internet subscribers
spent online during their most recent session.
Class      Midpoint, x   Frequency, f   x∙f
7 – 18     12.5          6              12.5∙6 = 75.0
19 – 30    24.5          10             24.5∙10 = 245.0
31 – 42    36.5          13             36.5∙13 = 474.5
43 – 54    48.5          8              48.5∙8 = 388.0
55 – 66    60.5          5              60.5∙5 = 302.5
67 – 78    72.5          6              72.5∙6 = 435.0
79 – 90    84.5          2              84.5∙2 = 169.0
                         n = 50         Σ(x∙f) = 2089.0

\[ \bar{x} = \frac{\sum (x \cdot f)}{n} = \frac{2089}{50} \approx 41.8 \text{ minutes} \]
Source: Larson/Farber 4th ed.
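
The four steps for a frequency distribution can be sketched in Python as follows. The function name `grouped_mean` and the `(lower, upper, frequency)` tuple layout are assumptions for this illustration. The class limits and frequencies are taken from the table above.

```python
def grouped_mean(classes):
    """Approximate the mean from (lower limit, upper limit, frequency) rows."""
    total = 0.0
    n = 0
    for lower, upper, freq in classes:
        midpoint = (lower + upper) / 2    # step 1: class midpoint x
        total += midpoint * freq          # step 2: running sum of x * f
        n += freq                         # step 3: running sum of f
    return total / n                      # step 4: mean of the distribution


minutes_online = [
    (7, 18, 6), (19, 30, 10), (31, 42, 13), (43, 54, 8),
    (55, 66, 5), (67, 78, 6), (79, 90, 2),
]

print(round(grouped_mean(minutes_online), 1))   # approximate mean in minutes
```

The result is an approximation because every value in a class is treated as if it were equal to the class midpoint.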
End of Slides