Chapter - 7
INFORMATION PRESENTATION
7.1 Statistical analysis
7.2 Presentation of data
7.3 Averages
7.4 Index numbers
7.5 Dispersion from the average
May 06th, 2008
7.1 – Statistical Analysis
 Statistical analysis is a management tool used in
decision making
 Statistics is the technique for comparing
numbers and drawing conclusions from them
 Two important factors are involved
– using numbers to arrive at a solution
– comparing numbers
 A number in isolation provides very little
information
 To be able to compare numbers they must be
measured in the same way, be in the same units
and must refer to the same items
 Vital statistics: used to record population details
– as required by rulers for raising taxes and armies
 Governments use statistics to plan social services
 Within industry, statistics is used for competitive
analysis
 The numbers which the technique provides
cannot lie, but they are open to misinterpretation
(e.g. most deaths occur in bed, yet beds are not dangerous)
 Companies also generate data, mainly for their
own use, e.g. data on orders received,
advertising effectiveness and defect rates
7.2 – Presentation of data
 The items which are measured are referred to as
variables
 There can be two types of variable: discrete (counted in
whole units) and continuous (measured on a continuous scale)
 Tables
– The position of columns relative to each other affects how
easily the figures can be compared
 Pictorial Presentation
– It gets the essential facts across very quickly and creates an
impact.
– For larger amounts of data it is also a good way of indicating
trends, which may not be easy to deduce from the table
Types of Pictorial Presentation
 Pictograms
Fig 7.1
– Picture elements used in this representation can vary depending
on the items being compared, for example people for
employment
– It is best used when whole items are being compared, each
symbol representing a unit
 Bar charts
– It may consist of single bars, multiple bars or component bars
– The length of the bar shows the size of the item being compared
– The component bar chart ????
– Bar charts can be drawn in two dimensions or three dimensions
– Legends can be added to bar charts (numbers can be placed
next to bar)
– Not very good when many items are involved
Types of Pictorial Presentation
 Histograms
– Special form of bar chart in which the areas of the
rectangles (that make up the bars) represent the
relative frequency of occurrence of the item. When
the class intervals are equal, the height of the bars
is proportional to the frequency
 Pie charts
– Showing subdivisions of the whole
– Not possible to read off absolute values
Graphs
 A line relating a dependent variable (vertical axis) to an independent variable (horizontal axis)
 Graphs must be drawn honestly (the scale should not
be deceptive when two lines are drawn on the
same graph for comparison)
– Example: performance of two students, A and B, plotted on different scales:
A has 78, 76, 79, 81 & 82
B has 50, 61, 65, 68 & 72
 Types of graphs include:
– Logarithmic scale graphs
– Strata or band graphs
– Ogives
– Frequency polygons
– Lorenz curves
Types of graphs
 Logarithmic scale graphs
– The logarithmic scale graph (ratio graph) has its
gradations on a logarithmic (ratio) scale. These graphs
are used to show relative changes in data
 Strata or band graphs
– Several curves are drawn on the same paper
– Actual values are given by the difference between adjacent curves
 Ogives
– Ogives are a graphical method for representing
cumulative data, plotted as ‘more than’ or ‘less than’ curves
– Figure 7.8
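
The cumulative totals behind the two kinds of ogive can be sketched as below. This is a minimal illustration, not the construction used for Figure 7.8: the class bounds, the frequencies and the function names are all hypothetical.

```python
from itertools import accumulate

# Hypothetical grouped data: class upper bounds and the count in each class
upper_bounds = [10, 20, 30, 40, 50]
frequencies = [3, 7, 12, 6, 2]

def less_than_ogive(freqs):
    """Running total from the lowest class upward ('less than' curve)."""
    return list(accumulate(freqs))

def more_than_ogive(freqs):
    """Running total from the highest class downward ('more than' curve)."""
    return list(accumulate(reversed(freqs)))[::-1]

less_than = less_than_ogive(frequencies)
more_than = more_than_ogive(frequencies)
```

Plotting `less_than` against the class upper bounds gives the rising curve; `more_than` against the class lower bounds gives the falling one.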
Types of graphs
 Frequency polygons
– A frequency polygon may be considered to be a
graphical version of a histogram
– Derived from a histogram by joining the midpoints of
the tops of the bars (the class-interval midpoints)
Figure 7.9
– It is also conventional that the frequency polygon is
extended by half a class interval at either end
 Lorenz curves
– In statistical analysis it is often found that a small
proportion of items have the greatest influence. For
example a small percentage of the population have
the highest income in a country
– This law of inequality can be shown graphically by a
Lorenz curve, and it allows management attention to
be focused on the few critical elements that have the
greatest influence.
Fig 7.10
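
The points of a Lorenz curve can be computed as cumulative shares. The sketch below is illustrative only: the income figures and the function name are hypothetical and are not the data behind Fig 7.10.

```python
def lorenz_points(values):
    """Return (cumulative share of items, cumulative share of total) pairs,
    with the items sorted from smallest to largest."""
    ordered = sorted(values)
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for i, v in enumerate(ordered, start=1):
        running += v
        points.append((i / n, running / total))
    return points

# Hypothetical incomes: one item dominates the total
incomes = [10, 10, 20, 20, 140]
points = lorenz_points(incomes)

# Share of the total held by the top 20 % of items
top_fifth_share = 1 - points[-2][1]
```

The further the curve sags below the diagonal from (0, 0) to (1, 1), the greater the inequality; here the largest fifth of the items accounts for roughly 70 % of the total.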
7.3 – Averages
 An average is a single number, referred to as
the measure of central tendency, that represents many
numbers
 There are several types of average
– The arithmetic mean
– The median
– The mode
– The geometric mean
– The harmonic mean
Types of average
 The arithmetic mean
– The total of the values divided by the total number of items
– It is easy to calculate and takes account of all the numbers
– It is affected by extreme values
– When calculating the arithmetic mean of percentage figures, the
weighted average must be obtained
 The median
– The median is the middle figure when the figures are placed in
ascending or descending order
– For an even number of items, the arithmetic mean of the two central
numbers is taken as being the median
– It can be used even when the items cannot be expressed as a
number, e.g. colours put in order
– Not affected by extreme values; the extreme numbers can
change without having any effect on the median
– May not be representative of the result if the numbers are irregular and
widely spread
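
The points above can be checked with a short sketch. The first list reuses student A's marks from the graphs slide; the outlier and the defect-rate figures are hypothetical, as is the helper `weighted_mean`.

```python
from statistics import mean, median

def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

marks = [78, 76, 79, 81, 82]           # student A's marks
with_outlier = [78, 76, 79, 81, 182]   # hypothetical: one extreme value

mean_marks = mean(marks)               # uses every value
median_marks = median(marks)           # middle value after sorting
mean_outlier = mean(with_outlier)      # pulled upward by the extreme value
median_outlier = median(with_outlier)  # unchanged by the extreme value

# Hypothetical defect rates (%) for two batches of very different size
rates = [10, 20]
batch_sizes = [100, 900]
simple_rate = mean(rates)                         # ignores batch size
overall_rate = weighted_mean(rates, batch_sizes)  # weights each rate by batch size
```

The simple mean of the two percentages understates the overall defect rate because it treats the small batch and the large batch as equally important; the weighted average does not.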
Types of average
 The mode
– The mode is the item that appears most often
– It is not affected by extreme values
– There can be several modes (multimodal), or there
may be no mode
– Not suitable for calculations; can give large errors
when dealing with widely spread and erratic numbers
 The geometric mean
– The n items are multiplied together and the nth root of
the result is taken, where n is the number of items
– Mostly used where the value of the quantity depends
on its previous value (e.g. growth rates)
– It cannot be used if any item is zero or negative
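
A minimal sketch of the mode and the geometric mean follows. The growth factors and sample lists are hypothetical, and `geometric_mean` is an illustrative helper written for this example.

```python
from math import prod
from statistics import multimode

def geometric_mean(values):
    """nth root of the product of n strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    return prod(values) ** (1 / len(values))

# Hypothetical yearly growth factors: +10 %, +20 %, -5 %
growth = [1.10, 1.20, 0.95]
average_growth = geometric_mean(growth)

modes = multimode([3, 5, 5, 7, 7, 9])  # two values tie for most frequent
no_repeat = multimode([1, 2, 3])       # every value ties, so no useful mode
```

Applying `average_growth` three times in succession gives the same overall growth as the three individual factors, which is why the geometric mean suits quantities that depend on their previous value.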
Types of average
 The harmonic mean
– Reciprocal of the mean of the reciprocals of the
individual items
– Used for rates, for example km/hr (distance per unit time)
 if a car travels three equal distances (say 100 km)
at different speeds, then the average speed is
found as the harmonic mean
 but if the car traveled for equal intervals of time,
(say 20 minutes) then the average speed is found
as the arithmetic mean
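
The car example can be verified numerically. The three speeds below are hypothetical; the 100 km legs and 20-minute intervals follow the slide.

```python
def harmonic_mean(values):
    """Reciprocal of the mean of the reciprocals."""
    return len(values) / sum(1 / v for v in values)

speeds = [40, 50, 60]  # km/hr, hypothetical

# Case 1: three equal distances of 100 km each
leg = 100
time_taken = sum(leg / s for s in speeds)           # hours
avg_equal_distance = len(speeds) * leg / time_taken

# Case 2: three equal time intervals of 20 minutes each
interval = 20 / 60                                  # hours
distance_covered = sum(s * interval for s in speeds)
avg_equal_time = distance_covered / (len(speeds) * interval)

arithmetic = sum(speeds) / len(speeds)
harmonic = harmonic_mean(speeds)
```

In both cases the average speed is total distance divided by total time; it coincides with the harmonic mean when the distances are equal and with the arithmetic mean when the times are equal.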
7.4 – Index Numbers
 An index number is a special type of average which is
used when comparing different types of items
 It is an average of a group of items, measured over time,
to show the change in the items
 In compiling an index number it is important to decide on
(1) the type of items to be compared, (2) the number of
items and (3) the base year
 Because different types of items are involved,
a weighted average is used
 Different types of index numbers can be calculated,
depending on the type of average used (such as a
geometric index or an arithmetic index)
 Calculations can also be made using a variable base,
especially if weights are changing rapidly
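
One common way to build such an index is a weighted arithmetic mean of price relatives, scaled so that the base year equals 100. The sketch below assumes that construction; the slides do not specify a formula, and the prices, weights and function name are hypothetical.

```python
def weighted_index(base_prices, current_prices, weights):
    """Weighted arithmetic mean of price relatives, with base year = 100."""
    relatives = [cur / base for base, cur in zip(base_prices, current_prices)]
    weighted_sum = sum(r * w for r, w in zip(relatives, weights))
    return 100 * weighted_sum / sum(weights)

# Hypothetical basket of three different items
base_prices = [10.0, 20.0, 5.0]      # prices in the base year
current_prices = [12.0, 22.0, 5.0]   # prices in the current year
weights = [5, 3, 2]                  # relative importance of each item

index = weighted_index(base_prices, current_prices, weights)
base_index = weighted_index(base_prices, base_prices, weights)
```

With these figures the index comes out at about 113, i.e. the weighted basket is roughly 13 % dearer than in the base year. A geometric index would replace the weighted arithmetic mean with a weighted geometric mean of the same relatives.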
7.5 – Dispersion from the average
 A measure of the spread or dispersion gives
further information; for example, are some students
performing better than others, and if so,
by how much?
 Dispersion can be seen from tables or graphs,
but it is often required to be represented by
one or two numbers; different techniques by
which this can be done are:
– The range
– Quartile deviation
– Mean deviation
– Standard deviation
– Skewness
Types of Dispersion
 The range
– The difference between the largest and smallest numbers
being considered
 Quartile deviation
– A quartile divides the series of figures into four equal
parts as the median was seen to divide it into two
equal parts
– The interquartile range is the difference between the
first and third quartiles, and the quartile
deviation is half the interquartile range
– It is easy to calculate and is not affected by extreme
values
– However, it in effect ignores half the values in the
series and gives no indication of clustering
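
The range and quartile deviation can be compared on a small series. Quartile conventions differ between textbooks; the sketch below assumes the "inclusive" method of Python's `statistics.quantiles`, and both data series are hypothetical.

```python
from statistics import quantiles

def value_range(data):
    """Largest value minus smallest value."""
    return max(data) - min(data)

def quartile_deviation(data):
    """Half the interquartile range (Q3 - Q1)."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    return (q3 - q1) / 2

series = [2, 4, 6, 8, 10, 12, 14, 16, 18]
stretched = [2, 4, 6, 8, 10, 12, 14, 16, 180]  # same series, one extreme value

range_series = value_range(series)
range_stretched = value_range(stretched)
qd_series = quartile_deviation(series)
qd_stretched = quartile_deviation(stretched)
```

The extreme value changes the range dramatically but leaves the quartile deviation untouched, which illustrates both its robustness and the fact that it ignores the outer values of the series.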
Types of Dispersion
 Mean deviation
– The mean deviation is the average of all deviations
from the arithmetic mean, signs being ignored
– Since signs are ignored, it is not suitable for use in
further mathematical analysis
 Standard deviation
– Most frequently used measure of dispersion from the
average
– Square the deviations from the mean (so eliminating
signs), find their average, and then take the square
root of the result
– Variance is just the square of the standard deviation
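
Both measures can be computed directly from their definitions. The sketch below reuses the marks of students A and B from the graphs slide and divides by the number of items, as the definition above states (a population rather than a sample standard deviation); the function names are illustrative.

```python
from math import sqrt

def mean_deviation(data):
    """Average of the absolute deviations from the arithmetic mean."""
    m = sum(data) / len(data)
    return sum(abs(x - m) for x in data) / len(data)

def variance(data):
    """Average of the squared deviations from the arithmetic mean."""
    m = sum(data) / len(data)
    return sum((x - m) ** 2 for x in data) / len(data)

def standard_deviation(data):
    """Square root of the variance."""
    return sqrt(variance(data))

student_a = [78, 76, 79, 81, 82]
student_b = [50, 61, 65, 68, 72]

md_a, sd_a = mean_deviation(student_a), standard_deviation(student_a)
md_b, sd_b = mean_deviation(student_b), standard_deviation(student_b)
```

Student B's marks are spread much more widely than student A's on either measure, which a comparison of the two averages alone would not reveal.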
Types of Dispersion
 Skewness
– The distribution of numbers within a series usually lies
unequally on either side of the middle
– The distribution is said to be positively skewed if most
values are low with a tail towards high values
(mean > median), and negatively skewed if biased
the other way
– Skewness of the distribution gives a measure of the
deviation between the mean, median and mode
– Usually this skewness is stated in relative terms, to
make comparisons between different series easier
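
One widely used relative measure is Pearson's coefficient of skewness, 3 × (mean − median) / standard deviation. The slides do not name a specific formula, so its use here is an assumption; the two data series are hypothetical.

```python
from statistics import mean, median, pstdev

def pearson_skewness(data):
    """3 * (mean - median) / standard deviation; a unit-free measure."""
    return 3 * (mean(data) - median(data)) / pstdev(data)

low_heavy = [1, 2, 2, 3, 12]        # most values low, tail towards high values
high_heavy = [-12, -3, -2, -2, -1]  # mirror image of the series above

skew_low = pearson_skewness(low_heavy)    # mean > median, so positive
skew_high = pearson_skewness(high_heavy)  # mean < median, so negative
```

Because the result is divided by the standard deviation, series measured in different units or on different scales can be compared directly.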