Estimating the mean, median, and modal class for a frequency distribution
Sometimes you have access to a frequency distribution but do not have access to the
original data. Though you cannot calculate the exact mean, median, and mode, you
can estimate these measures of central tendency. For example, suppose you read an
article that includes the following frequency distribution.
Number of times shop at a mall per year    Number of respondents
> 0 but < 10                                 96
> 10 but < 20                                45
> 20 but < 30                                30
> 30 but < 40                                18
> 40 but < 50                                 8
> 50 but < 60                                 3
Total                                       200
Estimating the Mode of a Frequency Distribution
To estimate the mode, identify the class with the highest frequency. This is called
the modal class. In this example, the class with the highest frequency (96) is the first
class, > 0 but < 10. This simply means that more respondents fall in this class than in
any other class in the frequency distribution.
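As a minimal sketch of this step, the modal class can be found by locating the largest frequency. The list names and the helper `modal_class` are illustrative assumptions, not part of the original handout; the labels and counts are copied from the frequency distribution above.

```python
# Class labels and respondent counts copied from the frequency distribution.
classes = [
    "> 0 but < 10",
    "> 10 but < 20",
    "> 20 but < 30",
    "> 30 but < 40",
    "> 40 but < 50",
    "> 50 but < 60",
]
respondents = [96, 45, 30, 18, 8, 3]


def modal_class(labels, frequencies):
    """Return (label, frequency) for the class with the highest frequency.

    Hypothetical helper for illustration; if two classes tie, the first wins.
    """
    top = max(range(len(frequencies)), key=lambda k: frequencies[k])
    return labels[top], frequencies[top]


label, count = modal_class(classes, respondents)
print(label, count)  # > 0 but < 10 96
```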
Estimating the Mean of a Frequency Distribution
1. Find the midpoint of each class.
2. Then, proceed as if you are calculating a weighted average.
a. Sum the weights (number of respondents or sample size).
b. Create a column that weights the midpoint of each category by the number of
respondents (midpoint times number of respondents).
c. Sum the weighted midpoints.
d. Divide the total of the weighted midpoints by the sum of the weights (number of
respondents or sample size) to obtain the estimated mean for the frequency
distribution.
Number of times shop at a mall per year    Midpoint    Number of respondents    Weighted midpoint
> 0 but < 10                                   5                 96                     480
> 10 but < 20                                 15                 45                     675
> 20 but < 30                                 25                 30                     750
> 30 but < 40                                 35                 18                     630
> 40 but < 50                                 45                  8                     360
> 50 but < 60                                 55                  3                     165
Total                                                           200                    3060

Estimated Mean for Frequency Distribution = 3060/200 = 15.3
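The weighted-average calculation can be sketched in a few lines. The class bounds are written as (lower, upper) pairs taken from the table, and the function name `estimated_mean` is an assumption made for illustration.

```python
# Class bounds and respondent counts taken from the table.
bounds = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
respondents = [96, 45, 30, 18, 8, 3]


def estimated_mean(class_bounds, frequencies):
    """Estimate the mean of grouped data as a weighted average of midpoints."""
    # Step 1: midpoint of each class.
    midpoints = [(lower + upper) / 2 for lower, upper in class_bounds]
    # Step 2b: weight each midpoint by its class frequency.
    weighted = [m * f for m, f in zip(midpoints, frequencies)]
    # Steps 2a, 2c, 2d: total weighted midpoints divided by total frequency.
    return sum(weighted) / sum(frequencies)


print(estimated_mean(bounds, respondents))  # 15.3
```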
Important Note: This procedure assumes that the data within each class are evenly (or at
least close to evenly) distributed across that class, so that the midpoint is a fair
representative of the class. This assumption is reasonable in most (almost all) cases.
Still, if you are fairly sure that this assumption is not true (because of your
experience with the type of data or problem being considered), perhaps because the data
within some classes are likely to be skewed, then you should adjust your estimate in
accordance with what you know.
Estimating the Median of a Frequency Distribution
Number of times shop at a mall per year    Number of respondents
> 0 but < 10                                 96
> 10 but < 20                                45
> 20 but < 30                                30
> 30 but < 40                                18
> 40 but < 50                                 8
> 50 but < 60                                 3
Total                                       200
1. Identify the position of the median. With 200 values, the median lies midway
between the 100th and 101st values: (100 + 101)/2 = 100.5th value.
2. Identify the class in which the median falls. Since the first class
contains 96 values and the first two classes together contain 96 + 45 = 141
values, the median falls somewhere in the second class. Assuming that the
values are evenly distributed in the second class, use the following formula
to estimate the median for the frequency distribution.

\[
\text{Median} = L + \left( \frac{\frac{n + 1}{2} - CF_b}{F_i} \right) \times i
\]

Where
L = lower limit of the class in which the median falls (10 in this example)
n = sample size (200 in this example)
CF_b = cumulative frequency before the median class (96 in this example)
F_i = frequency of the median class (45 in this example)
i = size of the interval (10 in this example)
For the current example,

\[
\text{Median} = 10 + \left( \frac{\frac{200 + 1}{2} - 96}{45} \right) \times 10 = 10 + \frac{4.5}{45} \times 10 = 11
\]
Here is a brief conceptual explanation of the formula. If the median is the 100.5th
value in the data set, it is the 4.5th value in the median class (100.5 - 96 = 4.5).
Recalling that the median class has 45 values, the 4.5th value is 1/10th of the way
through the interval. Eleven is 1/10th of the way from 10 to 20.
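The interpolation can be sketched as follows, assuming, as the text does, that the median sits at position (n + 1)/2 and that values are spread evenly within the median class. The function name `estimated_median` and the (lower, upper) representation of the classes are illustrative assumptions.

```python
# Class bounds and respondent counts taken from the frequency distribution.
bounds = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
respondents = [96, 45, 30, 18, 8, 3]


def estimated_median(class_bounds, frequencies):
    """Estimate the median of grouped data by linear interpolation."""
    n = sum(frequencies)
    position = (n + 1) / 2           # 100.5 when n = 200
    cumulative = 0                   # cumulative frequency before the current class
    for (lower, upper), freq in zip(class_bounds, frequencies):
        if cumulative + freq >= position:
            # Fraction of the way through the median class, times the interval size.
            return lower + (position - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("frequencies must contain at least one observation")


print(estimated_median(bounds, respondents))  # 11.0
```

The loop stops at the first class whose cumulative frequency reaches the median position, which here is the second class (96 + 45 = 141).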
Important Note: This procedure assumes that the data within the median class are evenly
(or at least close to evenly) distributed. This assumption is reasonable in most (almost
all) cases. Still, if you are fairly sure that this assumption is not true (because of
your experience with the type of data or problem being considered), then you should
adjust your estimate in accordance with what you know.