Lecture 2 – Grouped Data Calculation

Grouped Data Calculation
1. Mean, Median and Mode
2. First Quartile, Third Quartile and Interquartile Range

Measures of Central Tendency
Mean – Grouped Data
o The mean may often be confused with the median, mode or range. The
mean is the arithmetic average of a set of values, or distribution.
Example: The following table gives the frequency distribution of the number of orders received each day during the past 50 days at the office of a mail-order company. Calculate the mean.

Number of orders    f
10 – 12             4
13 – 15             12
16 – 18             20
19 – 21             14
Total               n = 50

Solution:

Number of orders    f         x     fx
10 – 12             4         11    44
13 – 15             12        14    168
16 – 18             20        17    340
19 – 21             14        20    280
Total               n = 50          Σfx = 832

x is the midpoint of the class, found by adding the class limits and dividing by 2.

$$\bar{x} = \frac{\sum fx}{n} = \frac{832}{50} = 16.64$$
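The calculation above can be checked with a short script. This is a minimal sketch, not part of the original lecture; the function name `grouped_mean` and the tuple representation of the class limits are illustrative assumptions.

```python
# Class limits and frequencies from the mail-order example.
classes = [(10, 12), (13, 15), (16, 18), (19, 21)]
freqs = [4, 12, 20, 14]


def grouped_mean(classes, freqs):
    """Mean of grouped data: sum(f * x) / n, where x is the class midpoint."""
    # Midpoint x = (lower limit + upper limit) / 2
    midpoints = [(lo + hi) / 2 for lo, hi in classes]
    n = sum(freqs)
    total_fx = sum(f * x for f, x in zip(freqs, midpoints))
    return total_fx / n


mean = grouped_mean(classes, freqs)
print(f"{mean:.2f}")  # 16.64
```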
Median and Interquartile Range – Grouped Data
o The median is the numerical value separating the higher half of a sample, a population, or a probability distribution from the lower half.

Step 1: Construct the cumulative frequency distribution.
Step 2: Identify the class that contains the median. The median class is the first class whose cumulative frequency is at least n/2.
Step 3: Find the median by using the following formula:

$$\text{Median} = L_m + \left( \frac{\frac{n}{2} - F}{f_m} \right) i$$

Where:
n = the total frequency
F = the cumulative frequency before the median class
$f_m$ = the frequency of the median class
i = the class width
$L_m$ = the lower boundary of the median class
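The three steps can be sketched in code. This is a minimal illustration under stated assumptions: the function name `grouped_median` is hypothetical, and the class limits are assumed to be integers one unit apart (such as 10 – 12, 13 – 15), so each class boundary lies 0.5 beyond its limit. The mail-order table from the mean example is used as input.

```python
from itertools import accumulate

# Mail-order table from the mean example.
classes = [(10, 12), (13, 15), (16, 18), (19, 21)]
freqs = [4, 12, 20, 14]


def grouped_median(classes, freqs):
    """Median of grouped data: L_m + ((n/2 - F) / f_m) * i."""
    n = sum(freqs)
    # Step 1: cumulative frequency distribution.
    cum = list(accumulate(freqs))
    # Step 2: the median class is the first class with cumulative frequency >= n/2.
    m = next(k for k, c in enumerate(cum) if c >= n / 2)
    lo, hi = classes[m]
    lower_boundary = lo - 0.5              # assumes limits one unit apart
    width = (hi + 0.5) - lower_boundary    # class width i
    before = cum[m - 1] if m > 0 else 0    # F: cumulative frequency before the median class
    # Step 3: apply the formula.
    return lower_boundary + (n / 2 - before) / freqs[m] * width


print(f"{grouped_median(classes, freqs):.2f}")  # 16.85
```

Here n/2 = 25 falls in the class 16 – 18 (cumulative frequency 36), so the sketch evaluates 15.5 + ((25 - 16) / 20) × 3.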
Example: Based on the grouped data below, find the median:

Time to travel to work    Frequency
1 – 10                    8
11 – 20                   14
21 – 30                   12
31 – 40                   9
41 – 50                   7
Solution:
1st Step: Construct the cumulative frequency distribution

Time to travel to work    Frequency    Cumulative Frequency
1 – 10                    8            8
11 – 20                   14           22
21 – 30                   12           34
31 – 40                   9            43
41 – 50                   7            50

So, $\frac{n}{2} = \frac{50}{2} = 25$, and the median class is the 3rd class (21 – 30), the first class whose cumulative frequency is at least 25.

F = 22, fm = 12, Lm = 20.5 and i = 10

Therefore,

$$\text{Median} = L_m + \left( \frac{\frac{n}{2} - F}{f_m} \right) i = 20.5 + \left( \frac{25 - 22}{12} \right) 10 = 23$$

Thus, 25 persons take less than 23 minutes to travel to work and another 25 persons take more than 23 minutes to travel to work.
Quartiles
o a quartile is one of three points that divide a data set into four equal groups,
each representing a fourth of the distributed sampled population.
Using the same method of calculation as for the median, we obtain the equations for Q1 and Q3 as follows:

$$Q_1 = L_{Q_1} + \left( \frac{\frac{n}{4} - F}{f_{Q_1}} \right) i \qquad Q_3 = L_{Q_3} + \left( \frac{\frac{3n}{4} - F}{f_{Q_3}} \right) i$$
Example: Based on the grouped data below, find the Interquartile Range

Time to travel to work    Frequency
1 – 10                    8
11 – 20                   14
21 – 30                   12
31 – 40                   9
41 – 50                   7
Solution:
1st Step: Construct the cumulative frequency distribution

Time to travel to work    Frequency    Cumulative Frequency
1 – 10                    8            8
11 – 20                   14           22
21 – 30                   12           34
31 – 40                   9            43
41 – 50                   7            50

2nd Step: Determine Q1 and Q3

Class Q1: $\frac{n}{4} = \frac{50}{4} = 12.5$
Class Q1 is the 2nd class.
Therefore,

$$Q_1 = L_{Q_1} + \left( \frac{\frac{n}{4} - F}{f_{Q_1}} \right) i = 10.5 + \left( \frac{12.5 - 8}{14} \right) 10 = 13.7143$$

Class Q3: $\frac{3n}{4} = \frac{3(50)}{4} = 37.5$
Class Q3 is the 4th class.
Therefore,

$$Q_3 = L_{Q_3} + \left( \frac{\frac{3n}{4} - F}{f_{Q_3}} \right) i = 30.5 + \left( \frac{37.5 - 34}{9} \right) 10 = 34.3889$$
Interquartile Range
IQR = Q3 – Q1 = 34.3889 – 13.7143 = 20.6746
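The quartile calculation can be reproduced with one helper that takes the target cumulative position (n/4 for Q1, 3n/4 for Q3). This is a minimal sketch; the name `grouped_quantile` is hypothetical, and the class limits are assumed to be integers one unit apart, so boundaries lie 0.5 beyond the limits.

```python
from itertools import accumulate

# Travel-time table from the example.
classes = [(1, 10), (11, 20), (21, 30), (31, 40), (41, 50)]
freqs = [8, 14, 12, 9, 7]


def grouped_quantile(classes, freqs, position):
    """Value at a cumulative position: L + ((position - F) / f) * i."""
    cum = list(accumulate(freqs))
    # The quantile class is the first class with cumulative frequency >= position.
    k = next(j for j, c in enumerate(cum) if c >= position)
    lo, hi = classes[k]
    lower_boundary = lo - 0.5              # assumes limits one unit apart
    width = (hi + 0.5) - lower_boundary    # class width i
    before = cum[k - 1] if k > 0 else 0    # F: cumulative frequency before the class
    return lower_boundary + (position - before) / freqs[k] * width


n = sum(freqs)
q1 = grouped_quantile(classes, freqs, n / 4)
q3 = grouped_quantile(classes, freqs, 3 * n / 4)
iqr = q3 - q1
print(f"{q1:.4f} {q3:.4f} {iqr:.4f}")  # 13.7143 34.3889 20.6746
```

Passing the position as a parameter keeps one function for both quartiles, which mirrors how the two formulas differ only in n/4 versus 3n/4.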
Mode – Grouped Data
• Mode is the value that has the highest frequency in a data set.
• For grouped data, the class mode (or modal class) is the class with the highest frequency.
• To find the mode for grouped data, use the following formula:

$$\text{Mode} = L_{mo} + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) i$$

Where:
i is the class width
Δ1 is the difference between the frequency of the class mode and the frequency of the class before the class mode
Δ2 is the difference between the frequency of the class mode and the frequency of the class after the class mode
Lmo is the lower boundary of the class mode
Calculation of Grouped Data - Mode
Example: Based on the grouped data below, find the mode

Time to travel to work    Frequency
1 – 10                    8
11 – 20                   14
21 – 30                   12
31 – 40                   9
41 – 50                   7
Solution:
Based on the table, the class mode is the 2nd class (11 – 20), so
Lmo = 10.5, Δ1 = (14 – 8) = 6, Δ2 = (14 – 12) = 2 and i = 10

$$\text{Mode} = 10.5 + \left( \frac{6}{6 + 2} \right) 10 = 18$$
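As a further illustration, the same formula can be applied in code to the mail-order table from the mean example. This is a minimal sketch with assumptions: the name `grouped_mode` is hypothetical, Δ1 uses the class before the class mode and Δ2 the class after it (as in the worked solution), a missing neighbouring class is treated as having frequency 0, and class limits are assumed to be integers one unit apart.

```python
# Mail-order table from the mean example.
classes = [(10, 12), (13, 15), (16, 18), (19, 21)]
freqs = [4, 12, 20, 14]


def grouped_mode(classes, freqs):
    """Mode of grouped data: L_mo + (d1 / (d1 + d2)) * i."""
    # Index of the class mode (first class with the highest frequency).
    m = max(range(len(freqs)), key=lambda k: freqs[k])
    prev_f = freqs[m - 1] if m > 0 else 0               # assumption: 0 if no class before
    next_f = freqs[m + 1] if m < len(freqs) - 1 else 0  # assumption: 0 if no class after
    d1 = freqs[m] - prev_f    # difference from the class before
    d2 = freqs[m] - next_f    # difference from the class after
    lo, hi = classes[m]
    lower_boundary = lo - 0.5              # assumes limits one unit apart
    width = (hi + 0.5) - lower_boundary    # class width i
    return lower_boundary + d1 / (d1 + d2) * width


print(f"{grouped_mode(classes, freqs):.2f}")  # 17.21
```

For this table the class mode is 16 – 18, so the sketch evaluates 15.5 + (8 / (8 + 6)) × 3.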


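Variance and Standard Deviation – Grouped Data

Population Variance:

$$\sigma^2 = \frac{\sum fx^2 - \frac{\left( \sum fx \right)^2}{N}}{N}$$

Variance for sample data:

$$s^2 = \frac{\sum fx^2 - \frac{\left( \sum fx \right)^2}{n}}{n - 1}$$

Standard Deviation:
Population: $\sigma = \sqrt{\sigma^2}$
Sample: $s = \sqrt{s^2}$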
o the variance is used as a measure of how far a set of numbers are spread
out from each other.
o Standard deviation is a widely used measurement of variability or
diversity used in statistics and probability theory. It shows how much variation
or "dispersion" there is from the average (mean, or expected value).
Example: Find the variance and standard deviation for the following data:

No. of orders    f
10 – 12          4
13 – 15          12
16 – 18          20
19 – 21          14
Total            n = 50
Solution:

No. of orders    f         x     fx     fx²
10 – 12          4         11    44     484
13 – 15          12        14    168    2352
16 – 18          20        17    340    5780
19 – 21          14        20    280    5600
Total            n = 50          832    14216

Variance,

$$s^2 = \frac{\sum fx^2 - \frac{\left( \sum fx \right)^2}{n}}{n - 1} = \frac{14216 - \frac{(832)^2}{50}}{50 - 1} = 7.5820$$

Standard Deviation, $s = \sqrt{s^2} = \sqrt{7.5820} = 2.75$

Thus, the standard deviation of the number of orders received at the office of this mail-order company during the past 50 days is 2.75.
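The sample variance and standard deviation above can be verified with a short script. This is a minimal sketch; the function name `grouped_sample_variance` is an illustrative assumption, and it uses the sample formula with n - 1 in the denominator.

```python
import math

# Mail-order table from the example.
classes = [(10, 12), (13, 15), (16, 18), (19, 21)]
freqs = [4, 12, 20, 14]


def grouped_sample_variance(classes, freqs):
    """Sample variance of grouped data: (sum(f*x^2) - (sum(f*x))^2 / n) / (n - 1)."""
    midpoints = [(lo + hi) / 2 for lo, hi in classes]
    n = sum(freqs)
    sum_fx = sum(f * x for f, x in zip(freqs, midpoints))        # 832
    sum_fx2 = sum(f * x ** 2 for f, x in zip(freqs, midpoints))  # 14216
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)


variance = grouped_sample_variance(classes, freqs)
std_dev = math.sqrt(variance)
print(f"{variance:.4f} {std_dev:.2f}")  # 7.5820 2.75
```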