Author: PRINCE FODAY
3. BASIC ANALYTICAL TOOLS IN ECONOMICS
Economic analysis in the modern sense starts with the recording and tabulation of numerical
data, followed by the process of drawing and accepting conclusions on the basis of
mathematical theory.
The first stage in any economic analysis is the collection of facts or data. The data themselves
can be of any kind as long as they can be counted. The facts collected have attributes or
characteristics. An attribute may be one that is not capable of numerical definition (e.g.
colour of eyes) or one that can be expressed in numerical terms (e.g. height in inches, salary
in pounds or dollars, marks in an examination).
The facts collected may be a continuous variable or a discrete variable. A continuous variable
can take any value within the range of its observed minimum and maximum values (e.g. the
heights or weights of school children). A discrete variable is exact in measurement and can
only take whole-number values, never a fraction more or less (e.g. the number of students
in a class).
Generally, all economic facts are subject to collection, tabulation and presentation, analysis
based on mathematical theory, and conclusion.
2.1 Tabulation
Before tabulation takes place, the information or data collected from individual
questionnaires (sets of questions meant to achieve objectives) needs to be entered on a
separate summary sheet. The totals are then transferred to the relevant columns of a prepared
table. The purpose of tabulation is to reduce the data to a few figures so that they are easier
to compare.
The Construction of Tables
The construction of a table depends on the nature of the data, which can be raw, ungrouped
or grouped.
a) Tabulation: Raw Data
Raw data are numerical facts that have been collected from field or desk research. To
illustrate the normal procedure for tabulating raw data, Table 1 shows the number of rejects
produced by a machine in each successive period of five minutes.
The smallest number of rejects is at the beginning and the largest at the end (i.e. the values
are in order of magnitude). Such an arrangement of data is called an ARRAY. The table shows
that the minimum and maximum numbers of rejects are 3 and 33. The difference between the
highest and lowest values is called the RANGE (here 33 - 3 = 30).
Table 1
 3    3    7    8    9   11   12   13   13   15
16   17   17   18   19   19   20   20   21   21
22   22   22   22   22   22   23   23   23   23
23   24   24   24   24   24   25   25   26   26
26   27   27   28   28   28   29   30   31   33
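The array and the range can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are assumptions, and the figures are those of Table 1.

```python
# Number of rejects in each five-minute period (Table 1).
rejects = [
    3, 3, 7, 8, 9, 11, 12, 13, 13, 15,
    16, 17, 17, 18, 19, 19, 20, 20, 21, 21,
    22, 22, 22, 22, 22, 22, 23, 23, 23, 23,
    23, 24, 24, 24, 24, 24, 25, 25, 26, 26,
    26, 27, 27, 28, 28, 28, 29, 30, 31, 33,
]

# An array is the data arranged in order of magnitude.
array = sorted(rejects)

smallest = array[0]
largest = array[-1]
data_range = largest - smallest  # highest value minus lowest value

print(smallest, largest, data_range)  # prints: 3 33 30
```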
b) Tabulation: Ungrouped Data
Table 2 is an ungrouped frequency distribution: each distinct number of rejects is listed once,
together with the number of times it occurs (its frequency). It is called ungrouped because
there are still many figures to absorb.
Table 2
Number of rejects   Frequency      Number of rejects   Frequency
        3               2                 20               2
        7               1                 21               2
        8               1                 22               6
        9               1                 23               5
       11               1                 24               5
       12               1                 25               2
       13               2                 26               3
       15               1                 27               2
       16               1                 28               3
       17               2                 29               1
       18               1                 30               1
       19               2                 31               1
                                          33               1
Comparison is possible to some extent with an ungrouped frequency distribution. One can
now say that 22 occurs most frequently, because it has the highest frequency.
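An ungrouped frequency distribution is a count of how often each value occurs. The sketch below uses Python's standard `collections.Counter` on the Table 1 figures. The variable names are assumptions made for illustration.

```python
from collections import Counter

# Number of rejects in each five-minute period (Table 1).
rejects = [
    3, 3, 7, 8, 9, 11, 12, 13, 13, 15,
    16, 17, 17, 18, 19, 19, 20, 20, 21, 21,
    22, 22, 22, 22, 22, 22, 23, 23, 23, 23,
    23, 24, 24, 24, 24, 24, 25, 25, 26, 26,
    26, 27, 27, 28, 28, 28, 29, 30, 31, 33,
]

# Count the occurrences of each distinct value.
frequency = Counter(rejects)

# Print the distribution in order of magnitude.
for value in sorted(frequency):
    print(value, frequency[value])

# The value with the highest frequency.
most_frequent, count = frequency.most_common(1)[0]
print("Most frequent:", most_frequent, "occurring", count, "times")
```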
c) Tabulation: Grouped Data
A grouped frequency distribution is used because an ungrouped distribution still leaves too
many figures to absorb. Instead of the frequency of each single number of rejects being shown
separately, the range is sub-divided into 'classes'. Table 3 that follows consists of 7 classes,
each with a class width (or class interval) of 4 between its lower and upper limits (e.g.
7 - 3 = 4). Thus the first class (i.e. 3-7) covers every value from 3 to 7 inclusive.
Conventionally, the number of rejects is called the independent variable. The corresponding
frequencies are referred to as the dependent variable.
Table 3
Number of rejects   Frequency
3-7                 3
8-12                4
13-17               6
18-22               13
23-27               17
28-32               6
33-37               1
How to Find the Class Width
In most cases the class width or class interval is given. However, where the class width is not
given, the following steps can be of help.
- Find the smallest and largest values from the raw data. For example, in Table 1 the
  smallest and largest values are 3 and 33;
- Find the range (i.e. 33 - 3 = 30);
- Decide the number of classes needed, say 7 as in Table 3;
- Divide the range by the number of classes (i.e. 30/7 = 4.29, which rounds to 4); and
- Conclude that the class width is 4.
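The steps above can be sketched in Python, followed by a count of how many observations fall in each class. The sketch assumes, as in Table 3, that the class width is the difference between the class limits and that each new class starts one unit above the previous upper limit. The names used are illustrative only.

```python
# Number of rejects in each five-minute period (Table 1).
rejects = [
    3, 3, 7, 8, 9, 11, 12, 13, 13, 15,
    16, 17, 17, 18, 19, 19, 20, 20, 21, 21,
    22, 22, 22, 22, 22, 22, 23, 23, 23, 23,
    23, 24, 24, 24, 24, 24, 25, 25, 26, 26,
    26, 27, 27, 28, 28, 28, 29, 30, 31, 33,
]

num_classes = 7                                # chosen by the analyst
data_range = max(rejects) - min(rejects)       # 33 - 3 = 30
class_width = round(data_range / num_classes)  # 4.29 rounds to 4

# Build inclusive class limits: 3-7, 8-12, and so on.
classes = []
lower = min(rejects)
for _ in range(num_classes):
    upper = lower + class_width
    classes.append((lower, upper))
    lower = upper + 1  # whole-number data: next class starts one unit higher

# Count the observations that fall within each class.
grouped = {
    (lo, hi): sum(1 for x in rejects if lo <= x <= hi)
    for lo, hi in classes
}

for (lo, hi), freq in grouped.items():
    print(f"{lo}-{hi}: {freq}")
```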
2.2 Presentation of Data
Presentation refers to the use of graphs and charts to explain data or information in a clear
manner. This type of communication of facts is visual: a diagram can be understood at a
glance.
Presentation of the collected data may be of little importance to the novice, but it is very
useful to the economic analyst or quantitative technician. For example, a well designed but
simple diagram showing the trend of revenue and expenditure will explain the position better
than a mass of detailed monthly figures.
Graphs
a) Histogram
A histogram comprises a series of rectangles touching one another. The independent
variable (classes) is plotted along the horizontal axis and the dependent variable
(frequencies) along the vertical axis. Where the bars have equal width, the frequency
within each class is represented by the height of the bar. If the class width varies,
the height of the bar must be adjusted, because the area of each rectangle is
proportional to the frequency in the respective class.
i) Histogram: Equal class width
Consider Table 3, which shows the number of rejects produced by a machine in each
successive period of five minutes, and draw the histogram of the data.
Table 3
Number of rejects   Frequency
3-7                 3
8-12                4
13-17               6
18-22               13
23-27               17
28-32               6
33-37               1
The grouped frequency distribution in Table 3 cannot give us touching bars, because there are
gaps between the classes (e.g. between 7 and 8), so we need to adjust the table. The
adjustment is done by subtracting 0.5 from the lower limit and adding 0.5 to the upper limit
of each class, so that the upper boundary of one class becomes the lower boundary of the next
and the scale is continuous, as shown in Table 4.
Table 4
Number of rejects   Frequency
2.5-7.5             3
7.5-12.5            4
12.5-17.5           6
17.5-22.5           13
22.5-27.5           17
27.5-32.5           6
32.5-37.5           1
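The adjustment from class limits to class boundaries is mechanical, as the following Python sketch suggests. It assumes whole-number data, so that the gap between classes is 1 and the correction is 0.5. The names are illustrative.

```python
# Class limits from Table 3.
class_limits = [(3, 7), (8, 12), (13, 17), (18, 22), (23, 27), (28, 32), (33, 37)]

# Subtract 0.5 from each lower limit and add 0.5 to each upper limit.
boundaries = [(lower - 0.5, upper + 0.5) for lower, upper in class_limits]

# The scale is continuous when each upper boundary equals the next lower boundary.
continuous = all(
    boundaries[i][1] == boundaries[i + 1][0]
    for i in range(len(boundaries) - 1)
)

print(boundaries)
print("Continuous scale:", continuous)
```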
Figure 1 is the histogram for equal class width.
ii) Histogram: unequal class width
Consider Table 5, which relates to the length of life of bad debts, and draw the
histogram.
Table 5
Working days   Number of bad debts
0-5            30
5-10           20
10-20          32
20-30          14
30-35          4
The class widths in Table 5 are unequal and must be adjusted. The class 10-20 is twice the
width of the 1st and 2nd classes. It can be divided into two classes that become the
3rd and 4th classes (i.e. 10-15 and 15-20), with the number of bad debts divided equally
between them (i.e. 32/2 = 16). The class 20-30 is dealt with in the same way
(i.e. 14/2 = 7), which results in Table 6 that follows.
Table 6
Working days   Number of bad debts
0-5            30
5-10           20
10-15          16
15-20          16
20-25          7
25-30          7
30-35          4
Figure 2 is the resulting histogram.
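The splitting of wide classes can be sketched as follows. The code assumes a standard width of 5 working days and that every class width is a whole multiple of it, as is the case in Table 5. The function and variable names are assumptions.

```python
# Classes from Table 5: ((lower, upper), number of bad debts).
classes = [((0, 5), 30), ((5, 10), 20), ((10, 20), 32), ((20, 30), 14), ((30, 35), 4)]

standard_width = 5  # width of the narrowest classes

adjusted = []
for (lower, upper), frequency in classes:
    # Number of standard-width classes that fit in this class.
    parts = (upper - lower) // standard_width
    for k in range(parts):
        new_lower = lower + k * standard_width
        new_upper = new_lower + standard_width
        # The frequency is shared equally between the parts.
        adjusted.append(((new_lower, new_upper), frequency / parts))

for (lower, upper), frequency in adjusted:
    print(f"{lower}-{upper}: {frequency}")
```

Dividing the frequency by the number of parts gives the same bar height as plotting frequency per standard class width, which keeps the area of each bar proportional to its frequency.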
b) Frequency Polygon
A frequency polygon is obtained by joining the midpoints of the tops of the rectangles of a
histogram. Figure 3 shows the frequency polygon for the histogram of the number of rejects
in each successive period of five minutes (i.e. the histogram for equal class width).
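The points joined by the frequency polygon are the class midpoints paired with the class frequencies. A minimal Python sketch using the Table 4 boundaries (names are illustrative):

```python
# Class boundaries and frequencies from Table 4.
boundaries = [(2.5, 7.5), (7.5, 12.5), (12.5, 17.5), (17.5, 22.5),
              (22.5, 27.5), (27.5, 32.5), (32.5, 37.5)]
frequencies = [3, 4, 6, 13, 17, 6, 1]

# Each point is (midpoint of the class, frequency of the class).
points = [
    ((lower + upper) / 2, frequency)
    for (lower, upper), frequency in zip(boundaries, frequencies)
]

print(points)
```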
Figure 1. Histogram: equal class width. Horizontal axis: number of rejects (class boundaries
2.5 to 37.5), scale 2 cm = 5 rejects. Vertical axis: frequency (0 to 18), scale 2 cm = 2 units.
Figure 2. Histogram: unequal class width. Horizontal axis: working days (0 to 35), scale
2 cm = 5 working days. Vertical axis: number of bad debts (0 to 30), scale 2 cm = 5 bad debts.
Figure 3. Frequency polygon. Horizontal axis: number of rejects (class boundaries 2.5 to
37.5), scale 2 cm = 5 rejects. Vertical axis: frequency (0 to 18), scale 2 cm = 2 units.
Cumulative Frequency Curve
This is sometimes called an OGIVE. The cumulative frequency curve is obtained by plotting the
cumulative frequencies (dependent variable or y-axis) against the upper class boundaries
(independent variable or x-axis). Consider the following marks of economics students in 124
in Gambia senior secondary school:
Marks    Number of Students
0-10     1
11-21    2
22-32    5
33-43    12
44-54    6
55-65    3
66-76    7
Construct the cumulative frequency curve for the frequency distribution.
Solution
The first step is to adjust the classes into class boundaries. The upper class boundaries and
the cumulative frequencies are then used to draw the OGIVE.
Marks    Number of   Marks          Marks (upper        Cumulative no.
         Students    (boundaries)   class boundaries)   of Students
0-10     1           -0.5-10.5      10.5                1
11-21    2           10.5-21.5      21.5                3
22-32    5           21.5-32.5      32.5                8
33-43    12          32.5-43.5      43.5                20
44-54    6           43.5-54.5      54.5                26
55-65    3           54.5-65.5      65.5                29
66-76    7           65.5-76.5      76.5                36
Cumulative Frequency Curve (OGIVE). Horizontal axis: marks (upper class boundaries 10.5 to
76.5). Vertical axis: cumulative number of students (0 to 40).
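The points of the ogive can be produced with a running total. The sketch below uses `itertools.accumulate` from the Python standard library. The variable names are assumptions.

```python
from itertools import accumulate

# Class limits and numbers of students from the marks table.
class_limits = [(0, 10), (11, 21), (22, 32), (33, 43), (44, 54), (55, 65), (66, 76)]
students = [1, 2, 5, 12, 6, 3, 7]

# Upper class boundary = upper class limit + 0.5.
upper_boundaries = [upper + 0.5 for _, upper in class_limits]

# Running total of the frequencies.
cumulative = list(accumulate(students))

# Points to plot: (upper class boundary, cumulative frequency).
ogive_points = list(zip(upper_boundaries, cumulative))
print(ogive_points)
```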
c) Logarithmic Scale Graphs
Logarithmic scale graphs are concerned with relative changes rather than absolute changes as
in ordinary graphs. They are ideal for a business that is growing in size. For example,
turnover and profits may be increasing each year, but the rate of increase may be falling.
Comparison between different data is best expressed by measuring relative rather than
absolute changes. The logarithmic scale graph is sometimes called a 'semi-logarithmic
graph' because only one of the two scales on the graph is logarithmic. Unlike other graphs
or charts, where all the points are measured from the origin, a logarithmic scale graph has
no origin. To show the use of ordinary graphs and logarithmic scale graphs, consider the
hypothetical data in Table 7, which shows the turnover and profits before taxation of
Standard Chartered Bank for the period 1980-1988.
Table 7
Year   Turnover (DM)   Profits before taxation
1980   1,025           100
1981   1,230           121
1982   1,472           164
1983   1,650           177
1984   1,866           185
1985   2,210           230
1986   2,463           242
1987   2,855           278
1988   3,220           313
We can plot these data on ordinary graph paper. The years should be on the horizontal axis,
and turnover and profit on the vertical axis. Figure 4 that follows illustrates an
ORDINARY GRAPH for turnover. The logarithmic scale graph can be drawn by finding the
logarithms (to base 10) of turnover and profits before taxation, as shown in Table 8.
Table 8
Year   Turnover (DM)   log      Profit before taxation (DM)   log
1980   1,025           3.0107   100                           2.0000
1981   1,230           3.0899   121                           2.0828
1982   1,472           3.1679   164                           2.2148
1983   1,650           3.2175   177                           2.2480
1984   1,866           3.2709   185                           2.2672
1985   2,210           3.3444   230                           2.3617
1986   2,463           3.3915   242                           2.3838
1987   2,855           3.4556   278                           2.4440
1988   3,220           3.5079   313                           2.4955
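The logarithms can be computed with `math.log10`. The sketch also computes the year-on-year percentage change in turnover, which is the relative change that a logarithmic scale displays. The variable names are assumptions.

```python
import math

years = list(range(1980, 1989))
turnover = [1025, 1230, 1472, 1650, 1866, 2210, 2463, 2855, 3220]  # DM
profit = [100, 121, 164, 177, 185, 230, 242, 278, 313]             # DM

# Base-10 logarithms, rounded to four decimal places.
log_turnover = [round(math.log10(value), 4) for value in turnover]
log_profit = [round(math.log10(value), 4) for value in profit]

# Relative (percentage) change in turnover from one year to the next.
growth = [
    round((later / earlier - 1) * 100, 1)
    for earlier, later in zip(turnover, turnover[1:])
]

for year, lt, lp in zip(years, log_turnover, log_profit):
    print(year, lt, lp)
print("Turnover growth (%):", growth)
```

On a logarithmic vertical scale, equal vertical distances correspond to equal percentage changes, because the difference between two logarithms is the logarithm of the ratio of the two values.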
Figure 4. Ordinary graph. Horizontal axis: years (1980 to 1988). Vertical axis: DM (1,000 to
4,000).
d) Lorenz Curve
The Lorenz curve is useful for expressing visually the inequality between two sets of data.
Before plotting the Lorenz curve, the cumulative totals for both sets of data are needed.
Furthermore, each cumulative total must be expressed as a percentage of the overall total. To
illustrate the Lorenz curve, suppose the following figures come from the report on the census
of production 1990: textile machinery and accessories.
Establishments (No.)   Net Output (D'000)
48                     1,406
42                     2,263
38                     3,699
21                     2,836
26                     3,152
16                     5,032
23                     20,385
Total 214              38,773
The first step is to find the cumulative totals for both sets of data, and thereafter to
express each of the cumulative entries as a percentage of the total, as shown in Table 9.
Table 9
Establishments   Cumulative       Cumulative   Net output   Cumulative   Cumulative
(nos.)           establishments   percentage   (D'000)      net output   percentage
(i)              (ii)             (iii)        (iv)         (v)          (vi)
48               48               22           1,406        1,406        4
42               90               42           2,263        3,669        9
38               128              60           3,699        7,368        19
21               149              70           2,836        10,204       26
26               175              82           3,152        13,356       34
16               191              89           5,032        18,388       47
23               214              100          20,385       38,773       100
Columns (iii) and (vi) are plotted in Figure 5. Column (iii) is plotted on the horizontal axis
and column (vi) on the vertical axis. The diagonal line drawn from the origin is called 'the
line of equality'. The plot shows that the actual curve lies away from the line of equality,
which indicates inequality. The further the curve is from the diagonal line, the greater the
inequality.
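The two cumulative percentage columns can be derived from the raw figures as sketched below. Percentages are rounded to whole numbers, and the function name is an assumption made for illustration.

```python
from itertools import accumulate

# Census figures: number of establishments and net output (D'000) per group.
establishments = [48, 42, 38, 21, 26, 16, 23]
net_output = [1406, 2263, 3699, 2836, 3152, 5032, 20385]


def cumulative_percentages(values):
    """Return the cumulative totals as whole-number percentages of the grand total."""
    total = sum(values)
    return [round(100 * running / total) for running in accumulate(values)]


# Horizontal axis: cumulative % of establishments.
x_points = cumulative_percentages(establishments)
# Vertical axis: cumulative % of net output.
y_points = cumulative_percentages(net_output)

for x, y in zip(x_points, y_points):
    print(x, y)
```

On the line of equality the two percentages would be equal at every point, so the gap between `x` and `y` in each printed pair gives a rough feel for the inequality.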
Figure 5. Lorenz curve. Horizontal axis: establishments, cumulative percentage (0 to 100).
Vertical axis: net output (D'000), cumulative percentage (0 to 100). The diagonal is the line
of equality.
Charts
a) Simple bar chart
A simple bar chart is used when the data need no comparison. Consider the data in Table 10
on the utilisation of milk for the production of butter, cheese, condensed milk and others.
Table 10
Milk Utilisation   1980   1981
Butter             D56    D58
Cheese             126    129
Condensed Milk     73     77
Others             56     65
Total              311    329
The simple bar chart for 1980 can be represented diagrammatically.
Simple Bar Chart (1980). One bar each for Butter, Cheese, Condensed Milk and Others.
b) Multiple Bar Charts
Multiple bar charts are appropriate for comparing between years. The performance of a firm
over the years can be visually communicated by the use of the multiple bar charts.
Considering the data in table 10, the multiple bar charts for 1980 to 1981 can be
diagrammatically illustrated.
Multiple Bar Chart. Bars for Butter, Cheese, Condensed Milk and Others, grouped by year
(1980 and 1981).
c) Component bar chart
A component bar chart is needed when one wishes to consider how the different milk
utilisation products make up the total milk utilisation. We can arrive at the component bar
chart by extending Table 10. The extension is made by finding the cumulative totals of the
milk utilisation products for 1980 and 1981, as shown in Table 11.
Table 11
Milk utilisation   1980   1980 (Cumulative)   1981   1981 (Cumulative)
Butter             D56    56                  D58    58
Cheese             126    182                 129    187
Condensed Milk     73     255                 77     264
Others             56     311                 65     329
The component bar chart can now be illustrated diagrammatically.
Component Bar Chart. One bar per year (1980 and 1981), each divided from bottom to top into
Butter, Cheese, Condensed Milk and Others.
d) Percentage component bar chart
A percentage component bar chart expresses each milk utilisation product as a percentage of
the total milk utilisation. Table 12 shows what needs to be done before drawing the diagram in
Figure 6 (the table on milk utilisation products is used for the purpose of explanation). The
percentage columns are the cumulative totals expressed as percentages of the yearly total.
Table 12
Milk Utilisation   1980   1980 (Cum.)   1980 (%)   1981   1981 (Cum.)   1981 (%)
Butter             D56    56            18         D58    58            18
Cheese             126    182           59         129    187           57
Condensed Milk     73     255           82         77     264           80
Others             56     311           100        65     329           100
Figure 6. Percentage component bar chart. One bar per year (1980 and 1981) on a vertical scale
of 0% to 100%, each divided from bottom to top into Butter, Cheese, Condensed Milk and Others.
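The cumulative and percentage columns used for the component and percentage component bar charts can be computed as sketched below. Percentages are rounded to whole numbers, and the function and variable names are assumptions.

```python
from itertools import accumulate

products = ["Butter", "Cheese", "Condensed Milk", "Others"]
milk_1980 = [56, 126, 73, 56]
milk_1981 = [58, 129, 77, 65]


def cumulative_columns(values):
    """Return (cumulative totals, cumulative totals as % of the yearly total)."""
    total = sum(values)
    cumulative = list(accumulate(values))
    percentages = [round(100 * c / total) for c in cumulative]
    return cumulative, percentages


cum_1980, pct_1980 = cumulative_columns(milk_1980)
cum_1981, pct_1981 = cumulative_columns(milk_1981)

for name, c80, p80, c81, p81 in zip(products, cum_1980, pct_1980, cum_1981, pct_1981):
    print(name, c80, p80, c81, p81)
```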
e) Pie Chart
A pie chart serves as an alternative presentation to a percentage component bar chart. The pie
chart is usually preferred when a choice has to be made between the two. To construct a pie
chart, the student needs a ruler, pencil, compass and protractor. Using the data on milk
utilisation for 1980, we can construct the pie chart in Figure 7 from Table 13. Each angle
is the product's share of the total multiplied by 360°.
Table 13
Milk Utilisation   1980 (D'000)   1980 (Angle)
Butter             56             56/311 x 360° = 64.8°
Cheese             126            126/311 x 360° = 145.9°
Condensed Milk     73             73/311 x 360° = 84.5°
Others             56             56/311 x 360° = 64.8°
Total              311            360°
Figure 7. Pie chart of milk utilisation, 1980. Sectors: Butter, Cheese, Condensed Milk, Others.
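The sector angles follow directly from each product's share of the total. A short Python sketch, with angles rounded to one decimal place (names are illustrative):

```python
# Milk utilisation for 1980 (D'000).
milk_1980 = {"Butter": 56, "Cheese": 126, "Condensed Milk": 73, "Others": 56}

total = sum(milk_1980.values())

# Angle of each sector = share of the total x 360 degrees.
angles = {
    product: round(value / total * 360, 1)
    for product, value in milk_1980.items()
}

for product, angle in angles.items():
    print(product, angle)
```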
Some Basic Statistical Measures
The concern here is with the arithmetic mean, the median and the mode, and their simple
application.
a) Arithmetic Mean
The arithmetic mean involves finding the sum of the individual items and dividing the sum by
the number of observations (or total frequencies). There are different ways of calculating the
arithmetic mean depending on the nature of the data (i.e. raw, ungrouped or grouped).
i) Arithmetic Mean: Raw data
Consider the raw data 1, 2, 3, 4, 5; the arithmetic mean is found as follows:
\[ \bar{x} = \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3 \]
The formula to use is
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n} \]
Where
\bar{x} = arithmetic mean
\sum (large sigma) = the sum of the individual items
n = number of observations (total frequencies)
(ii) Arithmetic mean: ungrouped data
Consider the ungrouped data in Table 14 that follows
Table 14
Marks   Number of Students
10      1
20      2
30      8
40      3
50      1
The arithmetic mean can be calculated as in Table 15
Table 15
Marks (x)   Number of Students (f)   f x
x1 = 10     f1 = 1                   f1 x1 = 1 x 10 = 10
x2 = 20     f2 = 2                   f2 x2 = 2 x 20 = 40
x3 = 30     f3 = 8                   f3 x3 = 8 x 30 = 240
x4 = 40     f4 = 3                   f4 x4 = 3 x 40 = 120
x5 = 50     f5 = 1                   f5 x5 = 1 x 50 = 50
Total       Σf = 15                  Σfx = 460
The formula to use is
\[ \bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n} = \frac{10 + 40 + \cdots + 50}{1 + 2 + \cdots + 1} = \frac{460}{15} = 30.666\ldots \]
Arithmetic mean = 30.67 (2 decimal places)
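The calculation for ungrouped data is a frequency-weighted average. A minimal Python sketch using the marks and numbers of students given above (the function name is an assumption):

```python
def frequency_mean(values, frequencies):
    """Arithmetic mean of values weighted by their frequencies: sum(f*x) / sum(f)."""
    total_fx = sum(f * x for x, f in zip(values, frequencies))
    total_f = sum(frequencies)
    return total_fx / total_f


marks = [10, 20, 30, 40, 50]
students = [1, 2, 8, 3, 1]

mean_mark = frequency_mean(marks, students)
print(round(mean_mark, 2))  # prints 30.67
```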
(iii) Arithmetic mean: Grouped data
Table 16 shows the age distribution of the estimated population of country A at 30th June
1980.
Table 16
Age       Number (ten thousand)
0-9       795
10-19     782
20-29     670
30-39     720
40-49     707
50-59     692
60-69     494
70-79     292
The arithmetic mean can be found as in Table 17, where each class is represented by its class
mark x (the mid-point of the class, e.g. (0 + 9)/2 = 4.5).
Table 17
Age     Number, ten thousand (f)   Class mark (x)   f x
0-9     795                        4.5              795 x 4.5 = 3577.5
10-19   782                        14.5             782 x 14.5 = 11339
20-29   670                        24.5             670 x 24.5 = 16415
30-39   720                        34.5             720 x 34.5 = 24840
40-49   707                        44.5             707 x 44.5 = 31461.5
50-59   692                        54.5             692 x 54.5 = 37714
60-69   494                        64.5             494 x 64.5 = 31863
70-79   292                        74.5             292 x 74.5 = 21754
Total   Σf = 5152                                   Σfx = 178964
\[ \bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n} = \frac{3577.5 + 11339 + \cdots + 21754}{795 + 782 + \cdots + 292} = \frac{178964}{5152} = 34.7368 \]
Arithmetic mean = 34.74 (2 decimal places)
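For grouped data the same weighted average is taken over the class marks. The sketch below recomputes the age-distribution example. The names are assumptions, and the class mark is taken as the average of the two class limits.

```python
# Age classes (class limits) and numbers in ten thousands.
age_classes = [(0, 9), (10, 19), (20, 29), (30, 39), (40, 49), (50, 59), (60, 69), (70, 79)]
numbers = [795, 782, 670, 720, 707, 692, 494, 292]

# Class mark = mid-point of the class limits, e.g. (0 + 9) / 2 = 4.5.
class_marks = [(lower + upper) / 2 for lower, upper in age_classes]

total_f = sum(numbers)
total_fx = sum(f * x for f, x in zip(numbers, class_marks))
mean_age = total_fx / total_f

print(total_f, total_fx, round(mean_age, 2))
```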
In Table 17 and the formula for grouped data:
x = class mark (e.g. 4.5 for the class 0-9)
f = frequency
Σ = summation or total
b) Median
The median divides the distribution into two equal parts; it is the middle value after
arranging the distribution in ascending or descending order.
Median: Raw data
Consider the raw data 5, 3, 1, 4 and 2. Arranging them in ascending order gives 1, 2, 3, 4
and 5, and the median is 3, because it divides the data into two equal parts. We get exactly
3 as the median because the number of observations is odd. Where the number of observations
is even, as in 1, 2, 3, 4, 5, 6, the median is the average of the two middle values:
\[ \text{Median} = \frac{3 + 4}{2} = 3.5 \]
The median is the middle item, and it can be located at the position n/2, where n is the total
number of observations. In some cases (n + 1)/2 is used instead of n/2.
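A minimal Python sketch of the median of raw data, covering both the odd and the even case described above. The function name is an assumption. Python's standard `statistics.median` gives the same results.

```python
def median_raw(values):
    """Median of raw data: middle value, or average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of observations: a single middle value exists.
        return ordered[middle]
    # Even number of observations: average the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median_raw([5, 3, 1, 4, 2]))     # prints 3
print(median_raw([1, 2, 3, 4, 5, 6]))  # prints 3.5
```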
(ii) Median: Ungrouped data
Consider the ungrouped data in Table 18
Table 18
Marks (x)   Number of students (f)
10          1
20          2
30          8
40          3
50          1
Total       Σf = N = n = 15
The median can be found by first finding the cumulative number of students and thereafter
using the formula n/2 or (n + 1)/2 to locate the median, as shown in Table 19
Table 19
Marks   Number of students   Cumulative
10      1                    1
20      2                    1 + 2 = 3
30      8                    3 + 8 = 11
40      3                    11 + 3 = 14
50      1                    14 + 1 = 15
Total   N = n = Σf = 15
Using the formula (n + 1)/2 or n/2 (n/2 preferred), we have 15/2 = the 7½th value. The 7½th
value falls under the mark 30, because the cumulative frequency first reaches 7½ in that row
(it rises from 3 to 11). Hence, the median mark is 30.
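Locating the median in an ungrouped frequency distribution means finding the first value whose cumulative frequency reaches n/2. A sketch in Python, with illustrative names:

```python
from itertools import accumulate

marks = [10, 20, 30, 40, 50]
students = [1, 2, 8, 3, 1]

n = sum(students)   # total number of observations
position = n / 2    # the 7.5th value

# Running total of the frequencies.
cumulative = list(accumulate(students))

# The median is the first mark whose cumulative frequency reaches the position.
median_mark = next(
    mark for mark, running in zip(marks, cumulative) if running >= position
)

print(cumulative, position, median_mark)
```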
(iii) Median: Grouped data
A shop that holds 100 units of an item at the start of each week is concerned that its level
of stock is too high. Table 20 shows the weekly sales of the item during the past two
years:
Table 20
No. of items sold   No. of weeks
1-20                6
21-40               20
41-60               40
61-80               38
81-100              3
The median is found by first finding the cumulative number of weeks and thereafter
using the formula n/2 to locate the median class. The upper class boundaries should be
used, and it is best to use the 'less than' type of cumulative frequency. If the
classes are like 8-10, 10-12, etc., then there is no need for adjustment. However, if the
classes are like 1-20, 21-40, etc., then an adjustment needs to be made by subtracting 0.5
from the lower limits and adding 0.5 to the upper limits (i.e. 0.5-20.5, 20.5-40.5, etc.).
Table 21 that follows shows the approach to use before the calculation.
Table 21
No. of items sold   No. of items sold          No. of weeks   No. of weeks
(adjusted)          (upper class boundaries)                  (cumulative)
0.5-20.5            Less than 20.5             6              6
20.5-40.5           Less than 40.5 (L1)        20             6 + 20 = 26 (F1)
40.5-60.5           Less than 60.5 (L2)        40             26 + 40 = 66 (F2) (median class)
60.5-80.5           Less than 80.5             38             66 + 38 = 104
80.5-100.5          Less than 100.5            3              104 + 3 = 107
TOTAL                                          n = 107
The formula to use in calculating the median is
\[ \text{Median} = L_1 + \frac{n/2 - F_1}{F_2 - F_1} \times (L_2 - L_1) \]
Where
L1 = Lower class boundary of the median class
L2 = Upper class boundary of the median class
n = Number of observations (total number of weeks)
F1 = Cumulative frequency corresponding to the lower class boundary of the median class
F2 = Cumulative frequency corresponding to the upper class boundary of the median class
The difference F2 - F1 is the frequency of the median class.
From the table,
L1 = 40.5; L2 = 60.5; F1 = 26; F2 = 66; n = 107
\[ \text{Median} = 40.5 + \frac{107/2 - 26}{66 - 26} \times (60.5 - 40.5) = 40.5 + \frac{53.5 - 26}{40} \times 20 = 40.5 + \frac{27.5}{40} \times 20 \]
\[ = 40.5 + 0.6875 \times 20 = 40.5 + 13.75 = 54.25 \]
Therefore, the median is 54.25, or about 54.3.
Median: Graphical Method
The median can be found graphically by first drawing the cumulative frequency curve and then
reading off the number of items sold at the cumulative frequency n/2 = 53.5, as illustrated
in Figure 8. The cumulative frequency curve is drawn by plotting the cumulative frequencies
against the upper class boundaries of the different classes.
Figure 8. Median: graphical method. Horizontal axis: number of items sold (upper class
boundaries 20.5 to 100.5). Vertical axis: cumulative number of weeks (0 to 120). The
cumulative frequency curve is read at n/2 to locate the median.
MEDIAN = 40.5 + 13.75 = 54.25
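The median of grouped data is a linear interpolation within the median class: the share of the class needed to reach n/2 is (n/2 - F1) divided by the class frequency (F2 - F1). The Python sketch below applies this to the weekly sales data. The variable names are assumptions.

```python
from itertools import accumulate

# Adjusted classes (class boundaries) and numbers of weeks.
boundaries = [(0.5, 20.5), (20.5, 40.5), (40.5, 60.5), (60.5, 80.5), (80.5, 100.5)]
weeks = [6, 20, 40, 38, 3]

n = sum(weeks)
half = n / 2                          # position of the median
cumulative = list(accumulate(weeks))  # 'less than' cumulative frequencies

# The median class is the first class whose cumulative frequency reaches n/2.
index = next(i for i, running in enumerate(cumulative) if running >= half)

l1, l2 = boundaries[index]                      # boundaries of the median class
f1 = cumulative[index - 1] if index > 0 else 0  # cumulative frequency at l1
f2 = cumulative[index]                          # cumulative frequency at l2

# Linear interpolation inside the median class.
median_sold = l1 + (half - f1) / (f2 - f1) * (l2 - l1)

print(l1, l2, f1, f2, median_sold)  # prints 40.5 60.5 26 66 54.25
```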
c) Mode
The mode is defined as the value or class that occurs most frequently. For example, in a set
of observations where one value occurs twice and every other value occurs once, the value
that occurs twice is the mode. It is possible for two or more values to be the mode, provided
they share the same highest number of occurrences.
i. Mode: Raw Data
The heights of five students in a class are 3.5, 5.5, 6.2, 4.5 and 5.5.
The modal height is 5.5. The answer is 5.5 because it is the value that
occurred most frequently.
ii. Mode: Ungrouped data
Use the distribution in Table 22 to find the mode
Table 22
Marks   No. of students
10      1
20      2
30      8 (mode)
40      5
50      1
The modal mark is 30 because it occurred most frequently compared to the other marks.
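Both cases can be sketched in Python. For raw data, `statistics.multimode` (available from Python 3.8) returns every value that shares the highest count, which also covers the case of more than one mode. For an ungrouped distribution the mode is the value with the largest frequency. The variable names are assumptions.

```python
from statistics import multimode

# Mode of raw data: heights of five students.
heights = [3.5, 5.5, 6.2, 4.5, 5.5]
modal_heights = multimode(heights)  # list of all most frequent values

# Mode of an ungrouped distribution: marks mapped to numbers of students.
marks_frequency = {10: 1, 20: 2, 30: 8, 40: 5, 50: 1}
modal_mark = max(marks_frequency, key=marks_frequency.get)

print(modal_heights, modal_mark)  # prints [5.5] 30
```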
iii. Mode: Grouped data
Table 23 shows the weekly income of employees in XYZ Ltd.
Table 23
Weekly income (D)   Number of employees
8-10                34
10-12               58
12-14               69   (fa)
14-16               100  (modal group; L1 = 14, L2 = 16)
16-18               95   (fb)
18-20               70
20-22               35
The mode is calculated by using the formula
\[ \text{Mode} = L_1 + \frac{f_a}{f_a + f_b} \times (L_2 - L_1) \]
Where
L1 = Lower limit of the modal group
L2 = Upper limit of the modal group
fa = Frequency in the group below the modal group
fb = Frequency in the group after the modal group
The values of the symbols are indicated in Table 23 and are as follows:
L1 = 14; L2 = 16; fa = 69; fb = 95
Substituting the values in the formula,
\[ \text{Mode} = 14 + \frac{69}{69 + 95} \times (16 - 14) = 14 + \frac{69}{164} \times 2 = 14 + 0.4207 \times 2 = 14 + 0.8415 = 14.8415 \]
Mode = 14.8 (1 decimal place)
Therefore, the mode of the grouped data is 14.8.
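The substitution above can be reproduced in Python. The first function implements the formula exactly as stated in the text. The second is not the text's method: it is the interpolation found in many statistics textbooks, based on the differences between the modal frequency and its two neighbours, and is included only for comparison. It gives a different answer for these data. Function names are assumptions.

```python
def mode_text_formula(l1, l2, fa, fb):
    """Mode as stated in the text: L1 + fa / (fa + fb) x (L2 - L1)."""
    return l1 + fa / (fa + fb) * (l2 - l1)


def mode_difference_formula(l1, l2, fa, fm, fb):
    """Common textbook alternative (not the text's formula):
    L1 + (fm - fa) / ((fm - fa) + (fm - fb)) x (L2 - L1),
    where fm is the frequency of the modal group."""
    return l1 + (fm - fa) / ((fm - fa) + (fm - fb)) * (l2 - l1)


# Values read from the weekly income table.
l1, l2 = 14, 16           # limits of the modal group
fa, fm, fb = 69, 100, 95  # frequencies below, in, and after the modal group

text_mode = mode_text_formula(l1, l2, fa, fb)
alternative_mode = mode_difference_formula(l1, l2, fa, fm, fb)

print(round(text_mode, 4))         # prints 14.8415
print(round(alternative_mode, 4))  # prints 15.7222
```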
Mode: Graphical Method
Graphically, the mode can be found by first drawing the histogram
and later on tracing the mode from the histogram, as explained in
figure 9.
Figure 9. Mode: graphical method. Horizontal axis: weekly income (8 to 22). Vertical axis:
number of employees (0 to 100).
MODE = 14 + 0.8 = 14.8