CMV6120
Mathematics
Unit 20 : Measures of Central Tendency and Dispersion
Learning Objectives
The students should be able to:

determine the mean, median and mode from ungrouped data

determine the mean, median and mode from grouped data

determine the range, inter quartile range and standard deviation.
Activities
Teacher demonstration and student hands-on exercise.
Use MS Excel spreadsheet, internal functions and data analysis to measure central tendency and
dispersion.
Reference
Suen, S.N. (1998) “Mathematics for Hong Kong 5A”; Rev. Ed.; Canotta
Unit 20 Central tendency & dispersion
Measures of Central Tendency and Dispersion
1. Measure of central tendency: mean, median and mode from grouped and
ungrouped data
For a set of data, we determine a single quantity that summarises the whole set. This quantity is
termed a measure of central tendency. The most commonly used measures are the mean, median and
mode.
1.1 Mean
For ungrouped data,
$$\text{Mean } (\bar{x}) = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$$
Example 1
Find the mean of the set of data: 3, 7, 2, 1, 7
Solution
Mean = (3 + 7 + 2 + 1 + 7) ÷ 5 = ______
For grouped data,
$$\text{Mean } (\bar{x}) = \frac{x_1 f_1 + x_2 f_2 + x_3 f_3 + \dots + x_n f_n}{f_1 + f_2 + f_3 + \dots + f_n}$$
Example 2
a) Find the mean of the set of data: 25, 36, 42, 38, 36
b) Find the mean from the set of grouped data

Class mark   10.5   30.5   50.5   70.5   90.5   110.5
Frequency    19     6      3      2      1      2

Solution
a) mean = ______
b)
x        f     xf
10.5     19    199.5
30.5     6
50.5     3
70.5     2
90.5     1
110.5    2
sum      33

mean = ______
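The grouped-mean formula can be checked with a short script. This is only a sketch: the function name `grouped_mean` is made up for illustration, and the data are the class marks and frequencies of Example 2 b).

```python
# Class marks and frequencies from Example 2 b).
class_marks = [10.5, 30.5, 50.5, 70.5, 90.5, 110.5]
frequencies = [19, 6, 3, 2, 1, 2]


def grouped_mean(xs, fs):
    """Return sum(x * f) / sum(f) for class marks xs and frequencies fs."""
    total_xf = sum(x * f for x, f in zip(xs, fs))
    return total_xf / sum(fs)


# The xf column of the working table, one entry per class.
xf_column = [x * f for x, f in zip(class_marks, frequencies)]
mean_b = grouped_mean(class_marks, frequencies)

print("xf column:", xf_column)
print("sum of f:", sum(frequencies))
print("mean:", round(mean_b, 2))
```

The first entry of `xf_column` should agree with the 199.5 already filled in on the worksheet, which is a quick way to confirm the table is being built correctly.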
Example 3
The HK Consumer Price Index B from 1996 to 2003 was as follows:

Year   Index
1996   99.7
1997   105.5
1998   108.5
1999   103.4
2000   99.4
2001   97.7
2002   94.7
2003   92.1

Calculate the average Consumer Price Index B:
a) For the first 4 years (1997 – 2000).
b) For the next 3 years (2001 – 2003).
c) For all 7 years (1997 – 2003).
d) Suppose the original data were lost, and only the 4-year and 3-year averages in a) and b)
were available. Would it still be possible to calculate the overall 7-year average? How?
Solution
a) From 1997 – 2000, n = 4.
The average price index = (105.5 + 108.5 + 103.4 + 99.4) ÷ 4 = ______
b) From 2001 – 2003, n = 3.
The average price index = (97.7 + 94.7 + 92.1) ÷ 3 = ______
c) From 1997 – 2003, n = 7.
The average price index
= (105.5 + 108.5 + 103.4 + 99.4 + 97.7 + 94.7 + 92.1) ÷ 7
= ______
d) The average price index over 7 years = ( ______ × 4 + ______ × 3 ) ÷ (4 + 3)
= ______
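Part d) of Example 3 relies on the fact that two sub-group means can be recombined if each is weighted by its group size. A minimal sketch that checks this against the direct 7-year calculation (variable names are illustrative only):

```python
import math

first_four = [105.5, 108.5, 103.4, 99.4]  # 1997 – 2000
next_three = [97.7, 94.7, 92.1]           # 2001 – 2003

mean_4 = sum(first_four) / len(first_four)
mean_3 = sum(next_three) / len(next_three)

# Direct 7-year mean, using all the original data.
mean_7 = sum(first_four + next_three) / 7

# Weighted combination, using only the two averages and the group sizes.
combined = (mean_4 * 4 + mean_3 * 3) / (4 + 3)

print("direct:", round(mean_7, 2), "combined:", round(combined, 2))
print("agree:", math.isclose(mean_7, combined))
```

Note that simply averaging the two averages, (mean_4 + mean_3) ÷ 2, would not give the same result because the two groups have different sizes.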
1.2 Median
For ungrouped data arranged in order,
Median = the middle datum, when n is odd.
Median = the mean of the two middle data, when n is even.
e.g.1 For the set of data 2, 4, 7, 9, 21, the middle datum is 7, so
median = 7
e.g.2 For the set of data 3, 5, 7, 7, the two middle data are 5 and 7, so
median = (5 + 7) ÷ 2
= 6
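The odd/even rule for ungrouped data can be written out as a small function. This is a sketch for illustration (the name `median_ungrouped` is not from the unit); it ranks the data first, since the rule only works on ordered data.

```python
def median_ungrouped(data):
    """Median of ungrouped data: middle datum (odd n) or mean of the two middle data (even n)."""
    ranked = sorted(data)          # the data must be arranged in order first
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:
        return ranked[mid]         # odd n: the single middle datum
    return (ranked[mid - 1] + ranked[mid]) / 2  # even n: mean of the two middle data


print(median_ungrouped([2, 4, 7, 9, 21]))  # e.g.1
print(median_ungrouped([3, 5, 7, 7]))      # e.g.2
```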
For grouped data,
Step 1: Draw the cumulative frequency polygon.
Step 2: The median is the datum corresponding to the middle value of the cumulative
frequency.
Example 4
a) Find the median of 2, 3, 10, 12, 999.
b) Find the median of 2, 3, 10, 12, 22, 123.
c) The cumulative frequency polygon for the maths marks of a class is given below. Find the median
mark.
[Figure: Cumulative frequency polygon for marks in maths. Vertical axis: cumulative frequency, 0 to 40; horizontal axis: marks, 9.5 to 99.5.]
Solution
a) Median = ______
b) Median = ______
c) Total frequency = 40
The rank of the median = 40/2 = ______
From the cumulative frequency polygon, median = ______
Example 5
The provisional figures on the population by age group in Hong Kong as at 9/2001 are tabulated
below. Draw a cumulative frequency polygon and determine the median age for the population.

Age group           0 – 9   10 – 19   20 – 29   30 – 39   40 – 49   50 – 59   60 – 69   ≧ 70
Population ('000)   676     885       1000      1267      1208      677       503       499

Solution
Age group x     Population ('000)   Cumulative population ('000 000)
x < 10          676                 0.676
10 ≦ x < 20     885                 1.561
20 ≦ x < 30     1000
30 ≦ x < 40     1267
40 ≦ x < 50     1208
50 ≦ x < 60     677
60 ≦ x < 70     503
70 ≦ x          499

[Figure: Less-than cumulative frequency polygon for population. Vertical axis: population ('000 000), 0 to 8.0; horizontal axis: age group, 10 to 70+.]
The rank of the median = 6.715/2 = ______
The median age = ______
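Reading the median off a cumulative frequency polygon amounts to linear interpolation along one straight segment of the polygon. The sketch below does this for the population data of Example 5. It assumes the first class starts at age 0 and it leaves the open-ended "70 and over" class without an upper boundary, which is fine as long as the median does not fall in that class.

```python
# Upper class boundaries (ages) for the seven closed classes; the last class is open-ended.
upper_bounds = [10, 20, 30, 40, 50, 60, 70]
population = [676, 885, 1000, 1267, 1208, 677, 503, 499]  # in thousands

# Build the less-than cumulative frequencies.
cumulative = []
running = 0
for p in population:
    running += p
    cumulative.append(running)

total = cumulative[-1]
rank = total / 2  # rank of the median

# Find the class in which the cumulative frequency first reaches the rank,
# then interpolate linearly between its lower and upper boundaries.
median_age = None
lower, prev_cum = 0, 0
for upper, cum in zip(upper_bounds, cumulative):
    if cum >= rank:
        median_age = lower + (upper - lower) * (rank - prev_cum) / (cum - prev_cum)
        break
    lower, prev_cum = upper, cum

print("total ('000):", total)
print("rank of median ('000):", rank)
print("median age:", None if median_age is None else round(median_age, 1))
```

A value read from a hand-drawn polygon will normally agree with this only to the accuracy of the graph.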
1.3 Mode
For ungrouped data, mode is the datum that has the highest frequency.
For grouped data, modal class is the class that has the highest frequency.
Example 6
a) Find the mode of the data:
1, 2, 2, 2, 3, 3, 9
b) Find the modal class

Class       10 - 14   15 - 19   20 - 24   25 - 29
Frequency   2         8         7         3

Solution
a) The mode is ______
b) The modal class is ______
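Finding a mode is a matter of counting. A minimal sketch using the data of Example 6; it returns a list of modes so that a tie for the highest frequency is visible rather than hidden (the variable names are illustrative).

```python
from collections import Counter

# a) Ungrouped data: the datum (or data) with the highest frequency.
data = [1, 2, 2, 2, 3, 3, 9]
counts = Counter(data)
highest = max(counts.values())
modes = [value for value, count in counts.items() if count == highest]

# b) Grouped data: the class with the highest frequency.
class_frequencies = {"10 - 14": 2, "15 - 19": 8, "20 - 24": 7, "25 - 29": 3}
modal_class = max(class_frequencies, key=class_frequencies.get)

print("mode(s):", modes)
print("modal class:", modal_class)
```

If `modes` contains more than one value, the data set has no single mode.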
Example 7
The temperatures in degrees Celsius recorded each day over a three-week period were as follows:
17, 18, 20, 21, 19, 16, 15, 18, 20, 21, 21,, 22, 21, 19, 20, 19, 17, 16, 16, 17.
Compute the mean, median and mode of these raw data by using two-degree intervals starting
with 15 – 16. Draw a cumulative frequency polygon.
Solution
Temperature (℃)   Tally   Frequency f   Class mark x   fx
15 – 16
17 – 18
19 – 20
21 – 22
23 – 24
Sum

Temperature (℃)   Cumulative frequency
< 14.5
< 16.5
< 18.5
< 20.5
< 22.5
< 24.5

[Figure: Cumulative frequency polygon for temperature. Vertical axis: frequency, 0 to 24; horizontal axis: temperature (℃), 14.5 to 24.5.]
The mean temperature = ______
The modal class of temperature is ______
The median temperature is ______
Remark
The mean seems to be the most commonly used (and often misused) quantity for measuring central
tendency. If the distribution of the data set is strongly skewed, the mean is
not a reliable measure because it is strongly affected by extreme values. In this case, the median may
be a better choice. The mode is used when there is reason to choose the most commonly occurring
data value as the representative of the whole data set.
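The effect of an extreme value on the mean can be seen with the data of Example 4 a). This sketch uses Python's standard `statistics` module; replacing 999 with 13 is a made-up variation added only for comparison.

```python
import statistics

skewed = [2, 3, 10, 12, 999]   # data of Example 4 a), with one extreme value
tamed = [2, 3, 10, 12, 13]     # hypothetical variation: extreme value replaced by 13

print("with 999:   mean =", statistics.mean(skewed), " median =", statistics.median(skewed))
print("without it: mean =", statistics.mean(tamed), " median =", statistics.median(tamed))
```

The median is the same for both lists, while the mean changes greatly, which is the behaviour the remark describes.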
2. Measure of dispersion: Range, Inter-quartile range and Standard deviation
Apart from using a measure of central tendency to summarise a set of data, we need a quantity to
measure the degree of dispersion of the set of data (so that we can determine the reliability of the set
of data). Range is a measure that is very simple to use but it provides relatively little information on
dispersion. Quartile deviation is used in association with the median whereas standard deviation
goes with the mean.
2.1 Range
For ungrouped data, the range is the difference between the largest datum and the smallest
datum.
For grouped data, the range is the difference between the highest class boundary and the lowest
boundary.
Example 8
a) Find the range of the data:
1, 2, 2, 2, 3, 3, 9
b) Find the range of the grouped data

Class       10 - 14   15 - 19   20 - 24   25 - 29
Frequency   2         8         7         3

Solution
a) The range = ______
b) The range = ______
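Both definitions of range can be sketched in a few lines using the data of Example 8. For the grouped case the code assumes that class boundaries lie halfway between adjacent class limits (so the class 10 - 14 has boundaries 9.5 and 14.5), in line with the boundaries 14.5, 16.5, ... used in Example 7.

```python
# a) Ungrouped data: largest datum minus smallest datum.
data = [1, 2, 2, 2, 3, 3, 9]
range_ungrouped = max(data) - min(data)

# b) Grouped data: highest class boundary minus lowest class boundary.
class_limits = [(10, 14), (15, 19), (20, 24), (25, 29)]
gap = class_limits[1][0] - class_limits[0][1]      # gap between adjacent class limits
lowest_boundary = class_limits[0][0] - gap / 2     # assumed boundary below the first class
highest_boundary = class_limits[-1][1] + gap / 2   # assumed boundary above the last class
range_grouped = highest_boundary - lowest_boundary

print("ungrouped range:", range_ungrouped)
print("boundaries:", lowest_boundary, "to", highest_boundary)
print("grouped range:", range_grouped)
```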
2.2 Inter quartile range
Inter quartile range = Q3 – Q1
where Q1, Q2, Q3 are called quartiles which divide the data (which have been ranked, i.e.
arranged in order) into four equal parts.
Moreover,
Q2 is the median of the whole set of data,
Q1 is the median of the lower half,
Q3 is the median of the upper half.
Quartile deviation, Q.D. = ½ (Q3 – Q1)
Example 9
Find the inter quartile range of
a) 1, 2, 3, 5, 11, 12, 13.
b) 1, 2, 3, 4, 11, 12, 13, 14.
Solution
a) inter-quartile range =
b) inter-quartile range =
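The "median of each half" definition of the quartiles can be sketched as follows. One detail is an assumption on my part: when n is odd, the middle datum is left out of both halves. Other conventions exist and can give slightly different quartiles.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, Q2, Q3), taking Q1 and Q3 as the medians of the lower and upper halves."""
    ranked = sorted(data)
    n = len(ranked)
    half = n // 2
    lower = ranked[:half]
    # Assumption: for odd n the middle datum belongs to neither half.
    upper = ranked[half + 1:] if n % 2 == 1 else ranked[half:]
    return median(lower), median(ranked), median(upper)


for label, data in [("a)", [1, 2, 3, 5, 11, 12, 13]),
                    ("b)", [1, 2, 3, 4, 11, 12, 13, 14])]:
    q1, q2, q3 = quartiles(data)
    print(label, "Q1 =", q1, "Q2 =", q2, "Q3 =", q3, "IQR =", q3 - q1)
```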
Example 10
The following frequency distribution gives the life hours of a sample of 50 light bulbs:

Life hours ('000)    Frequency
0.6 to under 0.7     2
0.7 to under 0.8     4
0.8 to under 0.9     6
0.9 to under 1.0     14
1.0 to under 1.1     13
1.1 to under 1.2     7
1.2 to under 1.3     4

Life hours up to ('000)   Cumulative frequency
0.6
0.7
0.8
0.9
1.0
1.1
1.2
1.3

Find the median and the inter-quartile range of the data.
[Figure: Less-than cumulative frequency polygon for life hours of 50 sample light bulbs. Vertical axis: frequency, 0 to 56; horizontal axis: life hours ('000), 0.6 to 1.3.]
The rank of the median = ½ × 50 = ______
The median of life hours is ______ hrs.
The rank of the upper quartile = ¾ × 50 = ______
= 38, to the nearest integer
The upper quartile Q3 is ______ hrs.
The rank of the lower quartile = ¼ × 50 = ______
= 13, to the nearest integer
The lower quartile Q1 is ______ hrs.
The inter-quartile range = Q3 – Q1
= ______
Quartile deviation = ½ (Q3 – Q1)
= ______
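The quartiles in Example 10 are read from the cumulative frequency polygon at the ranks 13, 25 and 38. The sketch below mimics that reading by interpolating linearly along the polygon; `value_at_rank` is an illustrative helper, not part of the unit, and the results are in thousands of hours.

```python
from itertools import accumulate

boundaries = [0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3]  # class boundaries ('000 hours)
frequencies = [2, 4, 6, 14, 13, 7, 4]

# Less-than cumulative frequencies at each boundary, starting from 0 at 0.6.
cumulative = [0] + list(accumulate(frequencies))


def value_at_rank(rank):
    """Read the polygon at a given rank by linear interpolation within one class."""
    for i in range(1, len(cumulative)):
        if cumulative[i] >= rank:
            lo, hi = boundaries[i - 1], boundaries[i]
            c_lo, c_hi = cumulative[i - 1], cumulative[i]
            return lo + (hi - lo) * (rank - c_lo) / (c_hi - c_lo)
    raise ValueError("rank exceeds the total frequency")


median_life = value_at_rank(25)  # rank ½ × 50
q3 = value_at_rank(38)           # rank ¾ × 50, rounded to the nearest integer as in the example
q1 = value_at_rank(13)           # rank ¼ × 50, rounded to the nearest integer as in the example

print("cumulative:", cumulative)
print("median:", round(median_life, 3), "Q1:", round(q1, 3), "Q3:", round(q3, 3))
print("IQR:", round(q3 - q1, 3), "Q.D.:", round((q3 - q1) / 2, 3))
```

Values read from a drawn graph will match these only to the precision of the graph paper.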
Example 11
Find the range, inter-quartile range and quartile deviation for the data in example 4 and example 7
respectively.
2.3 Standard deviation
For ungrouped data x1, x2, …, xn with mean x̄, the standard deviation (σ) is
$$\sigma = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n}}$$
For grouped data with class marks x1, x2, …, xn, corresponding frequencies f1, f2, …, fn, and mean x̄,
the standard deviation (σ) is
$$\sigma = \sqrt{\frac{(x_1 - \bar{x})^2 f_1 + (x_2 - \bar{x})^2 f_2 + (x_3 - \bar{x})^2 f_3 + \dots + (x_n - \bar{x})^2 f_n}{f_1 + f_2 + f_3 + \dots + f_n}}$$
Example 12
Find the standard deviation for
a) the ungrouped data 8, 9, 10, 10, 11
b) the grouped data

x   17   22   27   32   37   42   47
f   2    4    7    8    7    4    2
Solution
a) mean $\bar{x} = \frac{\sum_i x_i}{n}$ = ______
standard deviation $\sigma = \sqrt{\frac{\sum_i (x_i - \bar{x})^2}{n}}$ = ______
Calculator key-in method:
                      Model 3600         Model 3900         Model 506R
Set statistic mode    MODE 3             MODE 2             2ndF MODE 3 0
Clear memory          KAC                KAC                2ndF CA
Key-in data           8 DATA 9 DATA      8 DATA 9 DATA      8 DATA 9 DATA
                      10 DATA 10 DATA    10 DATA 10 DATA    10 DATA 10 DATA
                      11 DATA            11 DATA            11 DATA
mean                  SHIFT 1            SHIFT 4            2ndF 4
s.d.                  SHIFT 2            SHIFT 5            2ndF 6
b) mean $\bar{x} = \frac{\sum_i x_i f_i}{\sum_i f_i}$ = ______
standard deviation $\sigma = \sqrt{\frac{\sum_i (x_i - \bar{x})^2 f_i}{\sum_i f_i}}$ = ______
Calculator key-in method:
                      Model 3600       Model 3900       Model 506R
Set statistic mode    MODE 3           MODE 2           2ndF MODE 3 0
Clear memory          KAC              KAC              2ndF CA
Key-in data           17 × 2 DATA      17 × 2 DATA      17 , 2 DATA
                      22 × 4 DATA      22 × 4 DATA      22 , 4 DATA
                      27 × 7 DATA      27 × 7 DATA      27 , 7 DATA
                      32 × 8 DATA      32 × 8 DATA      32 , 8 DATA
                      37 × 7 DATA      37 × 7 DATA      37 , 7 DATA
                      42 × 4 DATA      42 × 4 DATA      42 , 4 DATA
                      47 × 2 DATA      47 × 2 DATA      47 , 2 DATA
mean                  SHIFT 1          SHIFT 4          2ndF 4
s.d.                  SHIFT 2          SHIFT 5          2ndF 6
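The calculator results for Example 12 can be cross-checked in code. This is a sketch under the unit's definition, which divides by n (or by the total frequency) rather than by n − 1; the function name `mean_and_sd` is made up for illustration.

```python
import math


def mean_and_sd(xs, fs=None):
    """Return (mean, standard deviation), dividing by the total frequency as in this unit.

    xs are the data (or class marks); fs are the frequencies, taken as 1 each if omitted.
    """
    if fs is None:
        fs = [1] * len(xs)
    n = sum(fs)
    mean = sum(x * f for x, f in zip(xs, fs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(xs, fs)) / n
    return mean, math.sqrt(variance)


# a) Ungrouped data.
mean_a, sd_a = mean_and_sd([8, 9, 10, 10, 11])

# b) Grouped data: class marks x with frequencies f.
mean_b, sd_b = mean_and_sd([17, 22, 27, 32, 37, 42, 47], [2, 4, 7, 8, 7, 4, 2])

print("a) mean =", round(mean_a, 2), " s.d. =", round(sd_a, 2))
print("b) mean =", round(mean_b, 2), " s.d. =", round(sd_b, 2))
```

If a calculator or spreadsheet gives a slightly larger value, it is probably using the n − 1 (sample) form of the formula.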
Example 13
The life hours of 50 light bulbs have the following frequency distribution. Complete the table with
class marks. Calculate the mean and standard deviation.

Life hours ('000)    Class mark   Frequency
0.6 to under 0.7     0.65         2
0.7 to under 0.8     0.75         4
0.8 to under 0.9     0.85         6
0.9 to under 1.0     0.95         14
1.0 to under 1.1     1.05         13
1.1 to under 1.2     1.15         7
1.2 to under 1.3     1.25         4

Solution
mean $\bar{x} = \frac{\sum_i x_i f_i}{\sum_i f_i}$ = ______
standard deviation $\sigma = \sqrt{\frac{\sum_i (x_i - \bar{x})^2 f_i}{\sum_i f_i}}$ = ______
Example 14
The heights of the Brazil team members at the 2002 FIFA World Cup are listed as follows:

Player          Height
Marcos          1.93
Cafu            1.76
Lucio           1.88
Roque Junior    1.86
Edmilson        1.85
Carlos          1.68
Richardino      1.76
Silva           1.85
Ronaldo         1.83
Rivaldo         1.86
Ronaldinho      1.80

Calculate the average height and the standard deviation.
Solution
Example 15
Find the mean and standard deviation for the data given below:

Age group x         5     15    25     35     45     55    65    75
Population ('000)   676   885   1000   1267   1208   677   503   499

Solution
Practice
1. The Hong Kong unemployment rate from 5/2003 to 5/2004 was as follows:

Month      Rate
5/2003     8.3
6/2003     8.6
7/2003     8.7
8/2003     8.6
9/2003     8.3
10/2003    8.0
11/2003    7.5
12/2003    7.3
1/2004     7.3
2/2004     7.2
3/2004     7.2
4/2004     7.1
5/2004     7.0

Calculate the average, median, mode and standard deviation of the unemployment rate:
a) For 5/2003 – 12/2003
b) For 1/2004 – 5/2004
c) For all 13 months.
2. Find the mean, median and mode of the following:
10, 13, 14, 14, 14, 15, 15, 16, 17, 22
3. Which student has the highest average mark?

Student   English   Chinese   Mathematics
A         78        80        59
B         63        85        71
C         55        72        95

4. The frequency distribution of the lengths of 100 leaves from a certain species of plant is
given below:

Length (mm)   Frequency
20 – 24       6
25 – 29       10
30 – 34       18
35 – 39       25
40 – 44       22
45 – 49       15
50 – 54       4

5. The following table shows the distribution of heights of 50 students:

Height (cm)   Frequency
160 – 164     8
165 – 169     12
170 – 174     14
175 – 179     7
180 – 184     6
185 – 189     3

Find the range and standard deviation of heights.
6. The mean of one set of six numbers is 9 and the mean of a second set of eight numbers is
12.5. Calculate the mean of the combined set of fourteen numbers.
7. The mean of the numbers a, b, c, d is 8 and the mean of the numbers a, b, c, d, e, f, g is 11.
What is the mean of the numbers e, f, g?
8. Find the mean and standard deviation of the 5 numbers in terms of x:
x – 5, x – 3, x – 2, x + 1, x + 4.
9. The mean of the five numbers 6, 9, 2, x, y is 5 and the standard deviation is √6. Find the
values of x and y.
Answer:
1. a) 8.16; 8.3; undefined; 0.52   b) 7.16; 7.2; 7.2; 0.11   c) 7.78; 7.5; undefined; 0.65
2) 15; 14.5; 14
3) 72; 73; 74
4) 37.4
5) 30; 7.14
6) 11
7) 15
8) x – 1; 3.16
9) (3, 5); (5, 3)