INTRODUCTION TO STATISTICS ST.PAULS UNIVERSITY
MEASURES OF DISPERSION
Definition of dispersion
• It is the degree to which numerical data tend to spread about an average value
• It is the extent to which items are scattered around a measure of central tendency
Significance of measuring dispersion
• To determine the reliability of an average
• To serve as a basis for the control of the variability
• To compare two or more series with regard to their variability
• To facilitate the use of other statistical measures
Properties of a good measure of dispersion
It should be:
• Simple to understand
• Easy to compute
• Rigidly defined
• Based on each and every item in the distribution
• Amenable to further algebraic calculations
• Stable from sample to sample
• Not unduly affected by extreme values
Measures of dispersion
• Range
• Quartile deviation
• Mean deviation
• Standard deviation
The Range: it is the difference between the smallest value and the largest value of a series
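As a minimal sketch of this definition, the Python snippet below computes the range of a small made-up list of values. The data and the function name `value_range` are illustrative assumptions, not part of the notes.

```python
# Hypothetical sample of daily wages (illustrative values only).
wages = [62, 77, 88, 99, 100, 114]


def value_range(values):
    """Return the largest value minus the smallest value."""
    return max(values) - min(values)


print(value_range(wages))  # 114 - 62 = 52
```

Only the two extreme observations enter the calculation, so changing any of the middle values leaves the result unchanged.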
Advantages of the Range
• It is the simplest to understand and compute
• It takes the minimum time to calculate the value of the range
Limitations
• It is not based on each and every value of the distribution
• It is subject to fluctuations of considerable magnitude from sample to sample
• It cannot be computed in case of open-ended distributions
• It does not explain or indicate anything about the character of the distribution within the
two extreme observations
Uses of the range
• Quality control
• Fluctuations of prices
• Weather forecast
• Finding the difference between two values, e.g. wages earned by different employees
The standard deviation
It is the square root of the arithmetic average of the squares of the deviations measured from the
mean. It measures how much "spread" or "variability" is present in the sample. A small
standard deviation indicates a high degree of uniformity among the observations and
homogeneity of the series; a large standard deviation indicates the opposite.
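The definition can be sketched directly in Python. The two samples and the function name `std_dev` are hypothetical, chosen so that both samples have the same mean but different spread; the divisor is n, as in the definition above.

```python
import math


def std_dev(values):
    """Square root of the average squared deviation from the mean (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


# Two made-up samples, both with mean 50.
uniform = [50, 51, 49, 50, 50]
spread = [10, 90, 30, 70, 50]

print(std_dev(uniform))  # small: the observations are nearly uniform
print(std_dev(spread))   # large: the observations are widely scattered
```

Python's standard `statistics.pstdev` uses the same divisor-n definition and can serve as a cross-check.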
Ways of computing the standard deviation
Direct method
Ungrouped data
$$\sigma = \sqrt{\frac{\sum d_x^2}{n}}$$
where $\sum d_x^2$ = sum of squares of the deviations from the arithmetic mean
Grouped data
$$\sigma = \sqrt{\frac{\sum f d_x^2}{n}}$$
Indirect method
Ungrouped data
$$\sigma = \sqrt{\frac{\sum D_x^2}{n} - \left(\frac{\sum D_x}{n}\right)^2}$$
where $D_x$ are deviations from an assumed mean.
Grouped data
$$\sigma = \sqrt{\frac{\sum f D_x^2}{n} - \left(\frac{\sum f D_x}{n}\right)^2}$$
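The indirect method should give the same answer as the direct method whatever assumed mean is chosen. The sketch below checks this numerically for ungrouped data; the sample values, the assumed mean of 120 and both function names are assumptions made for illustration.

```python
import math


def std_dev_direct(values):
    """Direct method: deviations are taken from the actual arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def std_dev_assumed_mean(values, assumed_mean):
    """Indirect method: deviations are taken from an assumed mean,
    then corrected by subtracting the squared mean deviation."""
    n = len(values)
    deviations = [x - assumed_mean for x in values]
    mean_sq = sum(d * d for d in deviations) / n
    mean_dev = sum(deviations) / n
    return math.sqrt(mean_sq - mean_dev ** 2)


sample = [140, 139, 126, 114, 100]  # hypothetical observations
print(std_dev_direct(sample))
print(std_dev_assumed_mean(sample, assumed_mean=120))
```

Choosing a round assumed mean keeps the deviations small, which is the practical reason for the method in hand calculation.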
Step deviation method
$$\sigma = i \times \sqrt{\frac{\sum f D'^2_x}{n} - \left(\frac{\sum f D'_x}{n}\right)^2}$$
where $i$ = common factor and $D'_x$ = step deviations from the assumed mean.
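A minimal sketch of the step deviation method for grouped data follows. The frequency table is hypothetical, the class midpoints are assumed to represent each class, and the function names are illustrative. The direct grouped calculation is included only as a check.

```python
import math

# Hypothetical grouped data: classes 0-10, 10-20, ..., 40-50.
midpoints = [5, 15, 25, 35, 45]
freqs = [3, 7, 10, 6, 4]


def std_dev_step(midpoints, freqs, assumed_mean, i):
    """Step deviation method: D' = (x - assumed_mean) / i, with i the common factor."""
    n = sum(freqs)
    steps = [(x - assumed_mean) / i for x in midpoints]
    mean_sq = sum(f * d * d for f, d in zip(freqs, steps)) / n
    mean_dev = sum(f * d for f, d in zip(freqs, steps)) / n
    return i * math.sqrt(mean_sq - mean_dev ** 2)


def std_dev_grouped_direct(midpoints, freqs):
    """Direct grouped method, using deviations from the actual mean."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
    return math.sqrt(sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / n)


print(std_dev_step(midpoints, freqs, assumed_mean=25, i=10))
print(std_dev_grouped_direct(midpoints, freqs))
```

With the class width as the common factor, the step deviations are small whole numbers (here -2, -1, 0, 1, 2), which simplifies the arithmetic.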
Coefficient of standard deviation = $\frac{\sigma}{\text{Mean}}$
Coefficient of variation = $\frac{\sigma}{\text{Mean}} \times 100$
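Because the coefficient of variation is a ratio, it carries no units and can be used to compare series measured on different scales. The sketch below uses made-up figures for heights and weights; the numbers and the function name are assumptions.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return std_dev / mean * 100


# Hypothetical summary figures for two series in different units.
cv_height = coefficient_of_variation(std_dev=8.5, mean=170)  # centimetres
cv_weight = coefficient_of_variation(std_dev=6.5, mean=65)   # kilograms

print(cv_height, cv_weight)
```

In this made-up case the weights are relatively more variable than the heights, even though their standard deviation is numerically smaller.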
Combined arithmetic mean and combined standard deviation
Combined arithmetic mean for two sets of data with arithmetic means $\bar{x}_1$, $\bar{x}_2$ and numbers of
observations $n_1$, $n_2$ is given by
$$\bar{X} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$$
Combined standard deviation of two series is given by (with $n_1$, $n_2$ large)
$$S = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2 + n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2}}$$
where $d_1 = \bar{x}_1 - \bar{X}$ and $d_2 = \bar{x}_2 - \bar{X}$
Example
An analysis of the monthly wages paid to workers of two firms A and B belonging to the same
industry gives the following results:
                        Firm A   Firm B
No. of wage earners     586      648
Average monthly wage    52.5     47.5
Standard deviation      10       11
Compute the combined standard deviation.
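One way to work this example is to apply the combined mean and combined standard deviation formulas directly. The sketch below does so with the figures for firms A and B; the function name `combined_mean_sd` is an assumption, and the result is offered as a check rather than a model answer.

```python
import math


def combined_mean_sd(n1, mean1, s1, n2, mean2, s2):
    """Combined mean and standard deviation of two series."""
    combined_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    d1 = mean1 - combined_mean
    d2 = mean2 - combined_mean
    variance = (n1 * s1**2 + n2 * s2**2 + n1 * d1**2 + n2 * d2**2) / (n1 + n2)
    return combined_mean, math.sqrt(variance)


# Firm A: 586 earners, mean 52.5, s.d. 10. Firm B: 648 earners, mean 47.5, s.d. 11.
mean_ab, sd_ab = combined_mean_sd(586, 52.5, 10, 648, 47.5, 11)
print(round(mean_ab, 2), round(sd_ab, 2))
```

The combined standard deviation comes out at roughly 10.83, slightly above the simple weighted blend of 10 and 11, because the two firms' means differ and that difference adds to the overall spread.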
Advantages of the standard deviation
• It is rigidly defined and is based on all the observations of the series
• It is applied or used in other statistical techniques like correlation and regression analysis
and sampling theory
• It is possible to calculate the combined standard deviation of two or more groups
Disadvantages of the standard deviation
• It cannot be used for comparing the dispersion of two or more series of observations given
in different units
• It gives more weight to extreme values
Examples
1. The following marks belong to 99 students of a secondary school in Keroka Municipality
Marks       Number of students
0 – 10      10
10 – 20     ?
20 – 30     25
30 – 40     30
40 – 50     ?
50 – 60     10
On later analysis, it was discovered that two class interval frequencies were missing. The
median score was found to be 30.
Required:
i. Find the missing frequencies.
ii. Determine the modal mark of the students.
iii. Find the mean mark.
iv. Find the standard deviation.
2. The following table indicates the marks obtained by students in a statistics test.
Marks       Number of students
0 – 20      5
20 – 40     7
40 – 60     ?
60 – 80     8
80 – 100    7
The arithmetic mean for the class was 52.5 marks. You are required to determine the
value of:
i. The missing frequency
ii. The median mark
iii. The modal mark
iv. The standard deviation
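Parts (i) and (iv) of this question can be approached as sketched below. The sketch assumes that class midpoints represent each class (the usual grouped-data convention) and that `None` marks the missing 40 – 60 frequency; the variable names are illustrative.

```python
import math

midpoints = [10, 30, 50, 70, 90]
freqs = [5, 7, None, 8, 7]  # None marks the missing 40-60 frequency
given_mean = 52.5

known = [(m, f) for m, f in zip(midpoints, freqs) if f is not None]
known_n = sum(f for _, f in known)
known_sum = sum(m * f for m, f in known)
missing_mid = midpoints[freqs.index(None)]

# Solve (known_sum + missing_mid * f) / (known_n + f) = given_mean for f.
missing_f = (given_mean * known_n - known_sum) / (missing_mid - given_mean)

# Grouped standard deviation about the given mean, using the completed table.
full = [missing_f if f is None else f for f in freqs]
n = sum(full)
variance = sum(f * (m - given_mean) ** 2 for m, f in zip(midpoints, full)) / n
sd = math.sqrt(variance)

print(missing_f, sd)
```

Solving the mean equation gives a whole-number frequency, which is a useful sanity check on the arithmetic.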
3. The following are the daily wages in Ksh. of 30 workers of a flower farm in Ruiru, which
grows flowers and exports them to European countries.
140 139 126 114 100  88  62  77  99 103
108 129 144 148 134  63  69 148 132 118
142 116 123 104  95  80  85 106 123 133
The company (flower farm) gives bonuses of Sh. 10, 15, 20, 25, 30 and 35 to individuals in the
respective wage classes: exceeding 60 but not exceeding 75, exceeding 75 but not exceeding 90,
and so on up to exceeding 135 but not exceeding 150.
Required to calculate:
i. The average wage and average bonus paid by the flower farm
ii. The median wage and median bonus
iii. The modal wage and modal bonus
iv. The standard deviation for the wages and also for the bonus
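A possible starting point for parts (i) and (iv) is sketched below. It works from the 30 raw wages rather than a grouped frequency table, which is an assumption about how the question is meant to be answered. The intermediate bonus bands (90 to 105, 105 to 120, 120 to 135) are inferred from the "and so on" in the question, and all names are illustrative.

```python
import math

wages = [
    140, 139, 126, 114, 100, 88, 62, 77, 99, 103,
    108, 129, 144, 148, 134, 63, 69, 148, 132, 118,
    142, 116, 123, 104, 95, 80, 85, 106, 123, 133,
]

# (lower, upper, bonus): a wage qualifies when lower < wage <= upper.
# The three middle bands are inferred, not stated in the question.
BONUS_BANDS = [
    (60, 75, 10), (75, 90, 15), (90, 105, 20),
    (105, 120, 25), (120, 135, 30), (135, 150, 35),
]


def bonus_for(wage):
    """Return the bonus for a wage, or raise if it falls outside every band."""
    for lower, upper, bonus in BONUS_BANDS:
        if lower < wage <= upper:
            return bonus
    raise ValueError(f"wage {wage} is outside the bonus bands")


def mean(values):
    return sum(values) / len(values)


def std_dev(values):
    """Standard deviation with divisor n, as defined in these notes."""
    m = mean(values)
    return math.sqrt(sum((x - m) ** 2 for x in values) / len(values))


bonuses = [bonus_for(w) for w in wages]
print(mean(wages), std_dev(wages))
print(mean(bonuses), std_dev(bonuses))
```

If the question intends a grouped solution, the same bands can be used as class intervals and the grouped formulas applied to the resulting frequencies instead.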