CHAPTER # 02: PRESENTATION OF DATA.
2.1 Introduction
The data collected for a statistical inquiry sometimes consist of a few fairly simple
figures which can be easily understood without any special treatment. More often,
however, the mass of raw data has no structure and cannot easily be understood or
interpreted. In order to make the data simple and easily understandable, the first task is to
condense the data in such a way that irrelevant details are removed and their significant
features stand out prominently.
The procedures adopted for this purpose are classification and tabulation.
2.2 Classification
The process of arranging data into groups/classes according to some common
characteristics is called classification.
2.3 Types of classification
One-way classification
When the data are classified according to one characteristic, the classification is called
one-way classification. For example, the blank table given below may be used to show the
number of adults in different occupations in a locality.
The number of adults in different occupations in a locality
Occupation        No. of adults

Total
Two-way classification
When the data are classified according to two characteristics, the classification is called
two-way classification. For example, adults may be classified by occupation and sex.
Three-way classification
When the data are classified according to three characteristics, the classification is called
three-way classification. For example, adults may be classified by occupation, sex and
marital status.
2.4 Tabulation
The systematic arrangement of data into rows and columns is called tabulation.
A table may be simple, double, treble or complex, depending on the number of
characteristics it presents.
2.5 Frequency Distribution
The organization of raw data into a table of classes and frequencies is known as a
frequency distribution. Each class is defined by two numbers (a lower and an upper limit),
and the frequency, denoted by f, is the no. of values which fall in a specified class.
2.6 Components of frequency distribution
1. Class limits
2. Class boundaries
3. Class mark
4. Class interval/width
Class limits
The values which describe a class are called class limits; the smaller number is the lower
class limit and the larger number is the upper class limit. For example, 15-19, 20-24 etc.
Class boundaries
The precise numbers which separate one class from another are called class boundaries. For
example, 14.5-19.5, 19.5-24.5 etc.
Class mark/Midpoint
If the sum of the lower and upper boundaries of the class is divided by 2, the value obtained
is called class mark or class midpoint.
Class interval/Width
The difference between the upper and lower class boundaries of a class is called the class
interval or width. It is denoted by h.
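The four components can be checked with a short calculation. The following is a minimal Python sketch, not part of the original notes; the function name class_components and the gap parameter are illustrative assumptions (gap is the distance between the upper limit of one class and the lower limit of the next, which is 1 for whole-number limits such as 15-19, 20-24).

```python
def class_components(lower_limit, upper_limit, gap=1):
    """Return (lower boundary, upper boundary, class mark, width) of one class.

    gap is an assumed parameter: the distance between the upper limit of
    one class and the lower limit of the next (1 for limits like 15-19, 20-24).
    """
    lower_boundary = lower_limit - gap / 2
    upper_boundary = upper_limit + gap / 2
    class_mark = (lower_boundary + upper_boundary) / 2   # midpoint
    width = upper_boundary - lower_boundary              # h
    return lower_boundary, upper_boundary, class_mark, width


print(class_components(15, 19))  # (14.5, 19.5, 17.0, 5.0)
```

For the class 15-19 this gives the boundaries 14.5-19.5 quoted above, a class mark of 17 and a width h = 5.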
2.7 Steps for constructing a grouped frequency distribution
A. Decide on the no. of classes into which the data are to be classified by the given
formula:
k = 1 + 3.3 log N
where k = no. of classes, N = total no. of observations and log is the common (base 10)
logarithm.
B. Determine the range of variation in data as:
Range= largest value−smallest value.
C. Divide the range of variation by the no. of classes to determine the class width h.
D. Start the first class at the smallest value of the data or at a convenient value just
below it.
E. Distribute the raw data into classes and determine the class frequency in each class
by listing the actual values or tally marks.
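Steps A to C can be sketched in a few lines of Python. This is only an illustration: the function name plan_classes and the sample marks are hypothetical, and rounding k and h up to the next whole number is an assumption about how fractional results are handled.

```python
import math


def plan_classes(values):
    """Return (k, range, h) for a list of observations (steps A to C)."""
    n = len(values)
    k = math.ceil(1 + 3.3 * math.log10(n))    # step A, rounded up (assumption)
    data_range = max(values) - min(values)    # step B
    h = math.ceil(data_range / k)             # step C, rounded up (assumption)
    return k, data_range, h


# Hypothetical marks of 10 students, used only to exercise the function.
marks = [12, 15, 21, 22, 25, 28, 30, 31, 34, 40]
print(plan_classes(marks))  # (5, 28, 6)
```

With N = 10 the formula gives k = 4.3, taken as 5 classes; the range is 28 and the class width 28/5 = 5.6 is taken as 6.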
Example.
The heights of 30 students measured at the time of registration are given below.
91, 89, 88, 87, 89, 91, 87, 92, 90, 98, 95, 97, 96, 100, 101, 96, 98, 99, 98, 100, 102, 99, 101, 105,
103, 107, 105, 106, 107, 112.
Make a suitable frequency distribution.
Solution.
k = 1 + 3.3 log N
  = 1 + 3.3 log(30)
  = 1 + 4.87 = 5.87
k = 6 (rounded up)
Range = largest value - smallest value
      = 112 - 87
Range = 25
h = 25/6 = 4.167
h = 5 (rounded up)
Class limits   Class boundaries   Midpoint   Tally        Frequency
86-90          85.5-90.5          88         |||| |       6
91-95          90.5-95.5          93         ||||         4
96-100         95.5-100.5         98         |||| ||||    10
101-105        100.5-105.5        103        |||| |       6
106-110        105.5-110.5        108        |||          3
111-115        110.5-115.5        113        |            1
Total                                                     30
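The class frequencies in the table above can be cross-checked by counting. The sketch below is an illustration in Python rather than part of the original solution; the function name frequency_table is hypothetical, and it assumes whole-number class limits, so each class runs from its lower limit to lower limit + h - 1.

```python
heights = [91, 89, 88, 87, 89, 91, 87, 92, 90, 98, 95, 97, 96, 100, 101,
           96, 98, 99, 98, 100, 102, 99, 101, 105, 103, 107, 105, 106,
           107, 112]


def frequency_table(values, start, width, classes):
    """Return a list of (lower limit, upper limit, frequency) rows."""
    rows = []
    for i in range(classes):
        lower = start + i * width
        upper = lower + width - 1          # whole-number class limits assumed
        frequency = sum(lower <= v <= upper for v in values)
        rows.append((lower, upper, frequency))
    return rows


table = frequency_table(heights, start=86, width=5, classes=6)
for lower, upper, frequency in table:
    print(f"{lower}-{upper}: {frequency}")
```

The printed frequencies are 6, 4, 10, 6, 3 and 1, which add up to the 30 observations.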
2.8 Relative frequency
The frequency of a class divided by the total frequency is called relative frequency.
Class boundaries   f    R.F.
85.5-90.5          6    6/30 = 0.200
90.5-95.5          4    4/30 = 0.133
95.5-100.5         10   10/30 = 0.333
100.5-105.5        6    6/30 = 0.200
105.5-110.5        3    3/30 = 0.100
110.5-115.5        1    1/30 = 0.033
Total              30   1.000
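The relative frequencies can be computed directly from the class frequencies. A minimal Python sketch, with the variable names chosen here only for illustration:

```python
frequencies = [6, 4, 10, 6, 3, 1]
total = sum(frequencies)

# Each class frequency divided by the total frequency, to three decimals.
relative = [round(f / total, 3) for f in frequencies]
print(relative)  # [0.2, 0.133, 0.333, 0.2, 0.1, 0.033]
```

The rounded values add up to 0.999 rather than 1.000; the unrounded relative frequencies always sum to exactly 1.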
2.9 Cumulative frequency
The frequency obtained by adding each successive frequency to the cumulative total of
frequencies for the preceding classes is known as cumulative frequency.
Class boundaries   f    C.F.
85.5-90.5          6    6
90.5-95.5          4    6 + 4 = 10
95.5-100.5         10   10 + 10 = 20
100.5-105.5        6    20 + 6 = 26
105.5-110.5        3    26 + 3 = 29
110.5-115.5        1    29 + 1 = 30
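The running total can be sketched in Python with itertools.accumulate from the standard library; the variable names are illustrative.

```python
from itertools import accumulate

frequencies = [6, 4, 10, 6, 3, 1]

# Each entry is the sum of its own frequency and all preceding ones.
cumulative = list(accumulate(frequencies))
print(cumulative)  # [6, 10, 20, 26, 29, 30]
```

The last cumulative frequency always equals the total number of observations, here 30.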
2.10 Grouped data
Data presented in the form of a frequency distribution are called grouped data.
2.11 Ungrouped data
Data in its original form is known as ungrouped data.
2.12 Graphical representation
The visual display of statistical data in the form of points, lines, areas and other
geometrical forms and symbols is known as graphical representation.
It takes two main forms: diagrams and graphs.
2.13 Diagrams
A diagram is a one-, two- or three-dimensional form of visual representation of data.
2.14 Types of diagrams
- Simple bar diagram
- Multiple bar diagram
- Component bar diagram
- Pictograms
- Pie diagram
Simple bar diagram
It consists of horizontal or vertical bars of equal width whose lengths are proportional to
the values they represent. For vertical bars, the values of the variable are taken on the
x-axis and the frequencies on the y-axis.
Multiple bar diagram
The extension of simple bar diagram used to represent two or more related sets of data in
the form of grouped bars.
Component bar diagram
A diagram in which each bar is divided into two or more sections proportional in size to the
component parts of a total being displayed by each bar.
Pictograms
The representation of data by means of pictures or small symbols.
A picture is worth ten thousand words.
Pie/sector diagram
A graphic device consisting of a circle divided into sectors or pie-shaped pieces whose areas
are proportional to the various parts into which the whole quantity is divided.
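Since the area of a sector is proportional to its central angle, each part is drawn with an angle equal to its share of 360 degrees. The Python sketch below illustrates this; the expenditure figures and the function name sector_angles are hypothetical and are not taken from the chapter.

```python
def sector_angles(parts):
    """Map each label to its central angle in degrees (share of 360)."""
    total = sum(parts.values())
    return {label: value * 360 / total for label, value in parts.items()}


# Hypothetical monthly expenditure, used only for illustration.
expenditure = {"Food": 500, "Clothing": 200, "House rent": 300}
print(sector_angles(expenditure))
# {'Food': 180.0, 'Clothing': 72.0, 'House rent': 108.0}
```

The angles add up to 360 degrees, so the sectors fill the whole circle.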
2.15 Graphs
A graph is the representation of data by a continuous curve, usually drawn on graph paper.
2.16 Types of graphs
- Histogram
- Frequency polygon
- Frequency curve
- Cumulative frequency curve
Histogram
A graph of a frequency distribution drawn as a set of adjacent rectangles, which gives a
visual impression of the distribution, is called a histogram. It is constructed from grouped
data by taking the class boundaries along the x-axis and the corresponding frequencies along
the y-axis.
Historigram
The graph of a time series is called a historigram.
Frequency polygon
A closed geometric figure used to display a frequency distribution graphically is called a
frequency polygon. Here, the class marks (midpoints) are taken on the x-axis while the
corresponding frequencies are taken along the y-axis.
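To close the figure, the polygon is usually brought down to the x-axis at the midpoints of two extra classes of zero frequency, one before the first class and one after the last. The Python sketch below lists the vertices for the height data of section 2.7; the function name polygon_vertices is hypothetical and equal class widths are assumed.

```python
def polygon_vertices(midpoints, frequencies, width):
    """Return the (x, y) vertices of a frequency polygon, closed on the x-axis."""
    first = (midpoints[0] - width, 0)     # extra class before the first one
    last = (midpoints[-1] + width, 0)     # extra class after the last one
    return [first, *zip(midpoints, frequencies), last]


midpoints = [88, 93, 98, 103, 108, 113]
frequencies = [6, 4, 10, 6, 3, 1]
print(polygon_vertices(midpoints, frequencies, width=5))
# [(83, 0), (88, 6), (93, 4), (98, 10), (103, 6), (108, 3), (113, 1), (118, 0)]
```

Joining these points in order with straight lines gives the frequency polygon.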
Frequency curve
When a frequency polygon or histogram, constructed over class intervals made sufficiently
small for a large no. of observations, is smoothed, it approaches a continuous curve
called a frequency curve.
Cumulative frequency polygon
A graph obtained by plotting the cumulative frequencies of a distribution against the upper
or lower class boundaries is called a cumulative frequency polygon or ogive.
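For the "less than" type, each cumulative frequency is plotted against the upper boundary of its class, and the curve starts at zero on the lower boundary of the first class. The Python sketch below lists these points for the height data of section 2.7; the variable names are illustrative.

```python
from itertools import accumulate

lower_boundary_first_class = 85.5
upper_boundaries = [90.5, 95.5, 100.5, 105.5, 110.5, 115.5]
frequencies = [6, 4, 10, 6, 3, 1]

# No observation lies below the first lower boundary, so the curve starts at 0.
points = [(lower_boundary_first_class, 0)]
points += list(zip(upper_boundaries, accumulate(frequencies)))
print(points)
# [(85.5, 0), (90.5, 6), (95.5, 10), (100.5, 20), (105.5, 26), (110.5, 29), (115.5, 30)]
```

The last point reaches the total frequency, 30, at the upper boundary of the last class.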
Assignments
Exercise
Q # 01 Construct the frequency table for the following data. Also calculate relative and
cumulative frequencies.
100, 96, 92, 88, 86, 84, 82, 80, 78, 91, 87, 83, 79, 77, 75, 73, 71, 69, 58, 56, 73, 50, 57, 55, 53,
51, 48, 46, 63, 59, 55, 51, 49, 47, 45, 41, 43, 58, 54, 50, 56, 44, 42, 40, 38, 36, 46, 53, 50, 43.
Q # 02 Draw multiple bar diagram for the following data.
Items      Clothing   House rent   Fuel   Miscellaneous
Family A   600        100          400    100
Family B   800        100          500    300
Q # 03 Draw frequency polygon for the given data.
Mid values   32   37   42   47   52   57   62   67
Frequency    3    17   28   47   54   31   14   4