Central Tendency
Quartiles and Percentiles
1. Quartiles: There are three quartiles (Q1, Q2, Q3). Q1, the first quartile, separates the lowest 25% of the sorted data; Q2, the second quartile, is the same as the median; and Q3, the third quartile, separates the lowest 75% of the sorted data.
2. Percentiles: These are numerical values of the variable that divide a set of ordered data into 100 equal parts. Each set of data has 99 percentiles.
The procedure for determining the value of the kth percentile:
1. The data must be ordered.
2. The position of the percentile is found by first calculating the value $\frac{k(n+1)}{100}$, where n is the number of observations.
3. The percentile itself is obtained by finding the corresponding value in the ordered data.
Example: Suppose we have the observations {7, 4, 3, 5, 6, 8, 10, 1}. Find the 30th and 50th percentiles.
1. Sort the values: {1, 3, 4, 5, 6, 7, 8, 10}
2. The position of the 30th percentile = $\frac{30(8+1)}{100} = 2.7$
3. The position lies between the 2nd and 3rd ordered values, so the 30th percentile is $\frac{3+4}{2} = 3.5$
4. The position of the 50th percentile = $\frac{50(8+1)}{100} = 4.5$
5. Then the 50th percentile is $\frac{5+6}{2} = 5.5$
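The percentile procedure and the worked example can be sketched in Python as follows. This is a minimal sketch, not a standard library routine: the function name `percentile_by_position` is hypothetical, it follows the convention used in the example (when the position is not a whole number, average the two neighbouring ordered values), and raising an error when the position falls outside the data is an assumption added here.

```python
import math

def percentile_by_position(data, k):
    """kth percentile via the position k(n+1)/100 on the ordered data."""
    values = sorted(data)                 # step 1: order the data
    n = len(values)
    pos = k * (n + 1) / 100               # step 2: 1-based position
    lower, upper = math.floor(pos), math.ceil(pos)
    if lower < 1 or upper > n:
        # Assumption: positions outside 1..n are rejected, not extrapolated.
        raise ValueError("position falls outside the ordered data")
    # Step 3: average the two neighbouring values
    # (the same value twice when the position is a whole number).
    return (values[lower - 1] + values[upper - 1]) / 2

observations = [7, 4, 3, 5, 6, 8, 10, 1]
print(percentile_by_position(observations, 30))  # 3.5
print(percentile_by_position(observations, 50))  # 5.5
```

Other textbooks and software interpolate between the neighbouring values instead of averaging them, so results from other tools may differ slightly.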
• Mode
The mode for ungrouped data is the value with the highest frequency.
• Example: in {1, 2, 4, 2, 2, 4} the mode is 2.
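For ungrouped data the mode can be found by counting how often each value occurs. The sketch below uses Python's `collections.Counter`; the function name `modes` is hypothetical, and returning every tied value when several share the highest frequency is a design choice made here, not something the slides specify.

```python
from collections import Counter

def modes(data):
    """Return the value(s) with the highest frequency."""
    counts = Counter(data)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]

print(modes([1, 2, 4, 2, 2, 4]))  # [2]
```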
• For grouped data
• Step (1): Determine the modal class (the class with the highest frequency).
• Step (2): Calculate D1 = the difference between the largest frequency and the frequency immediately preceding it.
• Step (3): Calculate D2 = the difference between the largest frequency and the frequency immediately following it.
• Step (4): Calculate the mode using the following formula:
$\text{mode} = L + \left(\frac{D_1}{D_1 + D_2}\right) \times C$
Where: L is the lower bound of the modal class.
C is the modal class width.
• Example: Calculate the mode from the table below.
Age      Frequency
20-25    2
25-30    14
30-35    29
35-40    43
40-45    33
45-50    9
Total    130
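• Solution: The modal class is 35-40 (the class with the highest frequency, 43).
• L = 35
• D1 = 43 - 29 = 14
• D2 = 43 - 33 = 10
• C = 5
$\text{mode} = L + \left(\frac{D_1}{D_1 + D_2}\right) \times C = 35 + \left(\frac{14}{14 + 10}\right) \times 5 = 37.92$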
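The four steps for grouped data can be sketched in Python as follows. The function name `grouped_mode` and the input layout (a list of class bounds plus a matching list of frequencies) are hypothetical. Treating a missing neighbouring class as frequency 0 and taking the first class when several tie for the highest frequency are assumptions added here.

```python
def grouped_mode(classes, freqs):
    """Mode of grouped data: L + D1 / (D1 + D2) * C."""
    # Step 1: index of the modal class (first one if there is a tie).
    i = max(range(len(freqs)), key=lambda j: freqs[j])
    lower, upper = classes[i]
    # Assumption: a missing neighbouring class counts as frequency 0.
    prev_f = freqs[i - 1] if i > 0 else 0
    next_f = freqs[i + 1] if i < len(freqs) - 1 else 0
    d1 = freqs[i] - prev_f            # step 2
    d2 = freqs[i] - next_f            # step 3
    width = upper - lower             # C, the modal class width
    return lower + d1 / (d1 + d2) * width   # step 4

ages = [(20, 25), (25, 30), (30, 35), (35, 40), (40, 45), (45, 50)]
frequencies = [2, 14, 29, 43, 33, 9]
print(round(grouped_mode(ages, frequencies), 2))  # 37.92
```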
• Advantages of mode
1. It is the appropriate measure when we want to know the most common value.
2. It is easy to understand and easy to calculate.
3. It is not affected by extreme values.
Disadvantages of Mode
1. It ignores dispersion around the mode.
2. It is unsuitable for further statistical
analysis.
3. It is affected by the most popular class when
a distribution is significantly skewed.
Measures of variation (dispersion)
• Measures of variation are important in statistics because they give information on the spread or variability of the data values.
Why is studying dispersion important?
• A measure of location, such as the mean or the median, only describes the center of the data; it doesn't tell us anything about the spread of the data.
• Example: In a hospital where each patient's pulse rate is taken three times a day, the readings for patient A are 72, 76, and 74, while those for patient B are 72, 91, and 59.
• The mean pulse rate of the two patients is the same (74), but observe the difference in variability: patient A's pulse rate is stable, whereas patient B's is not.
Range
• The range is the difference between the highest (maximum) and lowest (minimum) observations.
• Range = $X_{max} - X_{min}$
• The range can be calculated quickly, but it is not very useful because it depends only on the two extreme observations.
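As a small illustration, the range can be computed for the two patients from the pulse-rate example. This is a minimal sketch; the function name `value_range` and the variable names are hypothetical.

```python
def value_range(data):
    """Range = maximum observation - minimum observation."""
    return max(data) - min(data)

patient_a = [72, 76, 74]
patient_b = [72, 91, 59]

# Both patients have the same mean pulse rate but very different ranges.
print(sum(patient_a) / len(patient_a), value_range(patient_a))  # 74.0 4
print(sum(patient_b) / len(patient_b), value_range(patient_b))  # 74.0 32
```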
• Mean deviation
• It is the average of the absolute deviations of all observations from the mean.
• Formula
$\text{Mean Deviation} = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$
• Example: The table below shows the number of students who graduated from the national school of engineering over five years. Calculate the mean deviation.
Year of graduation    2004  2005  2006  2007  2008
No. of students       4     6     5     8     7
• Solution
1. Calculate the mean:
$\bar{x} = \frac{\sum x_i}{n} = \frac{4 + 6 + 5 + 8 + 7}{5} = \frac{30}{5} = 6$
2. Mean Deviation:
$\frac{\sum |x_i - \bar{x}|}{n} = \frac{|4-6| + |6-6| + |5-6| + |8-6| + |7-6|}{5} = \frac{6}{5} = 1.2$
Interpretation: The mean deviation of the data about the mean is equal to 1.2.
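The same calculation can be sketched in Python. The function name `mean_deviation` is hypothetical; it implements the mean of the absolute deviations from the mean.

```python
def mean_deviation(data):
    """Average absolute deviation of the observations from their mean."""
    n = len(data)
    mean = sum(data) / n
    return sum(abs(x - mean) for x in data) / n

students = [4, 6, 5, 8, 7]  # graduates per year, 2004 to 2008
print(mean_deviation(students))  # 1.2
```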
Variance
1. The variance is a measure which uses the mean as a point of reference.
2. The variance is smaller when the values are close to the mean.
3. The variance is (approximately) the average of the squared deviations of the values from the mean.
Formula:
1. For ungrouped data, population variance:
$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
2. For ungrouped data, sample variance:
$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
Standard Deviation (S)
1. It is the square root of the variance.
2. It is the most commonly used measure of variation.
3. It shows variation about the mean.
4. It is used to compare two or more data sets when their means are equal; the best one is the data set with the smallest standard deviation.
$S = \sqrt{S^2}$
• Example
Year of graduation    No. of students (x)    $(x - \bar{x})$    $(x - \bar{x})^2$
2004                  4                      -2                 4
2005                  6                      0                  0
2006                  5                      -1                 1
2007                  8                      2                  4
2008                  7                      1                  1
Total                 30                     0                  10
• Calculate the variance and the standard deviation.
• Solution: Variance
$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{10}{4} = 2.5$
• Standard Deviation: $S = \sqrt{S^2} = \sqrt{2.5} = 1.58$
• Interpretation: On average, the observations fall about 1.58 units from the mean.
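The variance and standard deviation for this example can be checked with a short Python sketch. The function name `variance` and its `sample` flag are hypothetical: `sample=True` divides by n - 1 (sample variance) and `sample=False` divides by N (population variance), matching the two formulas given earlier.

```python
import math

def variance(data, sample=True):
    """Sum of squared deviations from the mean, divided by n - 1 or N."""
    n = len(data)
    mean = sum(data) / n
    squared_deviations = sum((x - mean) ** 2 for x in data)
    return squared_deviations / (n - 1 if sample else n)

students = [4, 6, 5, 8, 7]   # graduates per year, 2004 to 2008
s2 = variance(students)      # sample variance
s = math.sqrt(s2)            # standard deviation
print(s2, round(s, 2))       # 2.5 1.58
```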