DESCRIPTIVE STATISTICS
The simplest form of single-variable analysis is to count the number of cases
in each category; the resulting count is called a frequency distribution.
A frequency distribution is often not very interesting without additional
statistical manipulations. The nature of these manipulations depends on the
type of variable or, more accurately, on the level of measurement.
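As an illustration of counting cases per category, here is a minimal sketch in Python. The survey answers and the helper name `frequency_distribution` are made up for this example and do not come from the module.

```python
from collections import Counter

# Hypothetical survey answers (illustrative data only).
responses = ["yes", "no", "yes", "yes", "no", "undecided", "yes"]


def frequency_distribution(values):
    """Count how many cases fall in each category."""
    return dict(Counter(values))


freq = frequency_distribution(responses)

# Print one line per category, in alphabetical order.
for category, count in sorted(freq.items()):
    print(f"{category}: {count}")
```

For a nominal variable like this one, the counts are about all that can be reported. Ordinal or numeric variables allow the further summaries described in the following sections.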
CENTRAL TENDENCY OF DATA
Measures of central tendency are used to describe the location or
"centre" of the data.
The mean, the mode, and the median are measures of central tendency. To
decide whether the mean, the mode or the median best represents the
central tendency for a certain distribution, one must consider the strengths
and weaknesses of each measure.
The mean
(“µ” in the population, “x̄” in the sample) is obtained by adding the scores and dividing the total by the number of scores.
Advantages
• Because of its statistical properties, such as efficiency and consistency, and because the deviations from the mean always sum to zero, the mean is generally considered to be the best estimate of the central tendency.
• It is simple to calculate and comprehend.
Disadvantages
• The mean is affected by extreme values. A few extreme values can completely distort the picture it gives of the distribution.
• The mean can hide peculiarities in the shape of the distribution (e.g. it would not show that the data set is bimodal).

The mode
is the value that occurs with the greatest frequency. In a frequency distribution, the mode is the highest point.
Disadvantages
• There can be more than one mode (e.g. in a bimodal data set) or none (i.e. when no value is repeated).
• The mode is very sensitive to the size and number of class intervals in the frequency distribution, i.e. the mode could change just by changing the way one builds the frequency distribution.

The median
is the middle value when the scores are arranged in order of increasing magnitude.
Advantages
• It is less sensitive to how the data are grouped into intervals.
• Being a more robust measure, it is useful when the distribution is asymmetrical.
Example: in the ordered scores 1 3 4 7 10 13 18 23 125, the median is 10, while the single extreme value 125 pulls the mean up to about 22.7.
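The three measures can be compared on the nine example scores with a short Python sketch. The helper `find_modes` is a hypothetical name introduced here. It returns an empty list when no value is repeated, following the handout's convention that such data have no mode.

```python
import statistics
from collections import Counter

scores = [1, 3, 4, 7, 10, 13, 18, 23, 125]


def find_modes(values):
    """Return every value tied for the highest frequency.

    Returns an empty list when no value is repeated (no mode).
    """
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return [value for value, count in counts.items() if count == top]


mean_value = statistics.mean(scores)      # sum of scores / number of scores
median_value = statistics.median(scores)  # middle value of the ordered scores
modes = find_modes(scores)                # empty: every score occurs once

# Dropping the extreme value 125 shows how strongly it pulls the mean,
# while the median moves much less.
trimmed = scores[:-1]
mean_without_extreme = statistics.mean(trimmed)
median_without_extreme = statistics.median(trimmed)

print(f"mean: {mean_value:.2f}, median: {median_value}, modes: {modes}")
print(f"without 125 -> mean: {mean_without_extreme}, "
      f"median: {median_without_extreme}")
```

Calling `find_modes([1, 2, 2, 3, 3])` returns both 2 and 3, which is the bimodal case mentioned among the disadvantages of the mode.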
UNICEF M&E Training Resource
Module 6.1.1
Descriptive statistics - Page 1/2
VARIANCE IN THE DATA
Often, the evaluator may want to know how individuals within a group differ
from the average or central tendency of the group.
The range
is the difference between the highest value and the lowest value. For example, if the average age of a group of mothers using ORT were 28, a key question would be “What is the range in ages of the group?” In other words, what are the ages of the youngest and the oldest mother using ORT?
Advantages
• It is very easy to calculate.
Disadvantages
• It depends only on the highest and lowest scores, so it can be misleading.

The standard deviation
measures how far, on average, individual observations lie from the group mean (technically, it is the square root of the average squared deviation from the mean).
Advantages
• It is a very powerful measure of the variation of scores.
• Knowing the standard deviation allows us to differentiate between values that are usual and those that are unusual.
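A minimal Python sketch of both measures follows. The ages are invented for illustration and chosen so that their mean is 28, as in the ORT example. The helper `is_unusual` and its cut-off of 2 standard deviations are assumptions too, based on a common rule of thumb and not on anything the module prescribes.

```python
import statistics

# Hypothetical ages of mothers using ORT (illustrative values only).
ages = [19, 22, 25, 28, 28, 31, 34, 37]

age_range = max(ages) - min(ages)  # highest value minus lowest value
mean_age = statistics.mean(ages)

# pstdev treats the data as the whole population (divides by n);
# statistics.stdev would treat them as a sample (divides by n - 1).
std_dev = statistics.pstdev(ages)


def is_unusual(value, mean, sd, k=2):
    """Flag a value lying more than k standard deviations from the mean.

    The default k=2 is an assumed rule of thumb.
    """
    return abs(value - mean) > k * sd


print(f"range: {age_range}, mean: {mean_age}, std deviation: {std_dev:.2f}")
print("age 45 unusual?", is_unusual(45, mean_age, std_dev))
print("age 30 unusual?", is_unusual(30, mean_age, std_dev))
```

The range uses only two of the eight ages, whereas the standard deviation uses all of them. This is why the range can mislead when the youngest or oldest case is atypical.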
If the scores are distributed according to a normal distribution (i.e. the bell-shaped Gaussian curve shown in the figure), then approximately:
68% of data are within 1 standard deviation of the mean
95% are within 2 standard deviations
99.7% are within 3 standard deviations
[Figure: bell-shaped curve with the mean at the centre and the standard deviation marked]
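These percentages can be checked numerically. For a normal distribution, the share of data within k standard deviations of the mean equals erf(k / √2). The sketch below uses Python's standard `math.erf`, and the function name `share_within` is introduced only for this example.

```python
import math


def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} std deviation(s): {share_within(k):.1%}")
```

The computed shares are roughly 68.3%, 95.4% and 99.7%, which the handout rounds to 68%, 95% and 99.7%. They apply only when the data are approximately normal.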