STATISTICS
CHAPTER 4
(Computing Dispersion (Spread))
4-1 Range
4-2 Quartiles, Deciles, Percentiles
4-3 Standard Deviation
4-4 Variance
SULIDAR FITRI, M.Sc
March 18, 2014
STMIK AMIKOM Yogyakarta
MEASURES OF DISPERSION

A method of analysis for measuring how far the values of a data
distribution deviate, or spread out, from its central value.
Dispersion = spread (variation)
• If all data values are equal, every value equals the mean and
there is no variation/spread in the data.
• Variation exists when some data values differ from the mean.

We will discuss the following measures of spread: range,
quartiles, variance, and standard deviation.
RANGE

One way to measure spread is to take the smallest (minimum) and
largest (maximum) values in the dataset.
Range = max - min

The range is strongly affected by outliers.
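A minimal sketch of the range and its sensitivity to outliers. The first dataset is the nine-value example used later in these slides; the extra value 100 is a hypothetical outlier added only for illustration, and the function name `data_range` is an assumption.

```python
def data_range(values):
    """Return max - min for a non-empty sequence of numbers."""
    return max(values) - min(values)


typical = [1, 3, 5, 5, 6, 7, 8, 9, 13]
with_outlier = typical + [100]  # hypothetical outlier, not from the slides

print(data_range(typical))       # 12
print(data_range(with_outlier))  # 99
```

A single extreme value changes the range from 12 to 99 even though the other nine values are unchanged.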
Quartiles
• Three numbers which divide the ordered data into four equal-sized groups.
• Q1 has 25% of the data below it.
• Q2 (median) has 50% of the data below it.
• Q3 has 75% of the data below it.
Chapter 2, BPS - 5th Ed.
Quartiles
[Figure: a uniform distribution divided into four quarters (1st Qtr to 4th Qtr) by Q1, Q2, and Q3]
Obtaining the Quartiles

Order the data.

For Q2, just find the median.

For Q1, look at the lower half of the data values,
those to the left of the median location; find the
median of this lower half.

For Q3, look at the upper half of the data values,
those to the right of the median location; find
the median of this upper half.
How to Find the Quartiles:
How to find the quartile location?
Quartile location = (median location + 1) / 2
Note: if the median location is a fractional value (as when n is even), drop the
fraction before computing the quartile location.
Example Dataset (N is odd):
1  3  5  5  6  7  8  9  13
Median location: (N + 1) / 2 = (9 + 1) / 2 = 5
Quartile location = (5 + 1) / 2 = 3
Count up 3 from the bottom and 3 down from the top for the quartiles:
Q1 = 5
Q3 = 8
IQR = 8 - 5 = 3
Example Dataset (N is even):
1  3  5  5  6  7  8  9
Median location: (N + 1) / 2 = (8 + 1) / 2 = 4.5
Median = average of 5 & 6 = 5.5
Quartile location = (4 + 1) / 2 = 2.5
Count up 2.5 from the bottom and 2.5 down from the top for the quartiles:
Q1 = average of 3 & 5 = 4
Q3 = average of 7 & 8 = 7.5
IQR = 7.5 - 4 = 3.5
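The quartile-location rule above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `value_at` and `quartiles` are hypothetical, and a location ending in .5 is taken to mean the average of the two neighbouring values, as in the even-N example.

```python
import math


def value_at(sorted_data, location):
    """Value at a 1-based location; a .5 location averages its two neighbours."""
    lower = sorted_data[math.floor(location) - 1]
    upper = sorted_data[math.ceil(location) - 1]
    return (lower + upper) / 2


def quartiles(data):
    """Return (Q1, Q2, Q3) using quartile location = (median location + 1) / 2."""
    xs = sorted(data)
    n = len(xs)
    median_loc = (n + 1) / 2
    # Drop any fraction of the median location before computing the quartile location.
    quartile_loc = (math.floor(median_loc) + 1) / 2
    q1 = value_at(xs, quartile_loc)          # count up from the bottom
    q2 = value_at(xs, median_loc)
    q3 = value_at(xs, n + 1 - quartile_loc)  # count down from the top
    return q1, q2, q3


print(quartiles([1, 3, 5, 5, 6, 7, 8, 9, 13]))  # (5.0, 6.0, 8.0)
print(quartiles([1, 3, 5, 5, 6, 7, 8, 9]))      # (4.0, 5.5, 7.5)
```

Other quartile conventions exist and can give slightly different values on the same data, so results from software packages may not match this rule exactly.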
ANOTHER FORMULA:

Weight Data: Sorted
L(M) = (53 + 1) / 2 = 27
L(Q1) = (26 + 1) / 2 = 13.5
100  101  106  106  110  110  119  120  120  123
124  125  127  128  130  130  133  135  139  140
148  150  150  152  155  157  165  165  165  170
170  170  172  175  175  180  180  180  180  185
185  185  186  187  192  194  195  203  210  212
215  220  260
Weight Data: Quartiles
Q1 = 127.5
Q2 = 165 (Median)
Q3 = 185
Weight Data: Quartiles (stemplot; stem = hundreds and tens, leaf = units)
10 | 0166
11 | 009
12 | 0034578      <- first quartile
13 | 00359
14 | 08
15 | 00257
16 | 555          <- median or second quartile
17 | 000255
18 | 000055567    <- third quartile
19 | 245
20 | 3
21 | 025
22 | 0
23 |
24 |
25 |
26 | 0
Five-Number Summary
• minimum = 100
• Q1 = 127.5
• M = 165
• Q3 = 185
• maximum = 260
Interquartile Range (IQR) = Q3 - Q1 = 57.5
IQR gives the spread of the middle 50% of the data.
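As a rough check of the five-number summary, the sketch below recomputes it from the 53 sorted weights using the "median of each half" procedure described earlier (the median itself is left out of both halves when n is odd). The function name `five_number_summary` is an assumption; `statistics.median` is from the Python standard library.

```python
from statistics import median

weights = [
    100, 101, 106, 106, 110, 110, 119, 120, 120, 123,
    124, 125, 127, 128, 130, 130, 133, 135, 139, 140,
    148, 150, 150, 152, 155, 157, 165, 165, 165, 170,
    170, 170, 172, 175, 175, 180, 180, 180, 180, 185,
    185, 185, 186, 187, 192, 194, 195, 203, 210, 212,
    215, 220, 260,
]


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max); quartiles are medians of the two halves."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]      # values to the left of the median location
    upper = xs[n - half:]  # values to the right of the median location
    return xs[0], median(lower), median(xs), median(upper), xs[-1]


minimum, q1, m, q3, maximum = five_number_summary(weights)
print(minimum, q1, m, q3, maximum)  # 100 127.5 165 185.0 260
print(q3 - q1)                      # 57.5
```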
Ex. 4
Given the sorted weights of 30 female athletes, find the three
quartiles:
94   101  105  107  108  109  110  112  113  115
119  123  124  124  124  127  130  130  135  136
136  141  148  149  150  156  160  160  162  163
(1) There are 30 values in the sample: n = 30
(2) Q1 is the value at, or right above, the value in position
(0.25)(30):
Position of Q1 = 0.25 × 30 = 7.5
Since there is no position 7.5, we round 7.5 up to the next
whole number, 8. Then Q1 is the value in the 8th position:
Q1 = 112
(3) Q2 is the value in the middle of the 30 values:
Position of Q2 = 0.5 × 30 = 15
15 is a whole number, therefore Q2 (the median value) is
half-way between the 15th and 16th values of the data set:
Q2 = (124 + 127) / 2 = 125.5
(4) Q3 is the value in position 0.75 × 30 = 22.5,
that is, the value in the 23rd position:
Q3 = 148
Percentiles
Just as there are three quartiles separating data
into four parts, there are 99 percentiles,
denoted P1, P2, . . . , P99, which partition the data into
100 groups.
• If the position of the given percentile is a whole number, the data
value that corresponds to this percentile is half-way between the
value in this position and the next value.
• If the position of the given percentile is a decimal number, round it
up to the next whole number. The data value that corresponds to this
percentile is in that position.
Finding the value of a percentile:
Ex. 5
Find the 90th percentile of the given sorted weights of the 30 female
athletes:
94   101  105  107  108  109  110  112  113  115
119  123  124  124  124  127  130  130  135  136
136  141  148  149  150  156  160  160  162  163
P90 is the value of the set that is the 90th percentile of the
set, and is therefore located at or right after position:
0.9 × 30 = 27
Since 27 is a whole number, P90 is the value that is half-way
between the 27th and the 28th values of the set:
P90 = (160 + 160) / 2 = 160
160 lb is the 90th percentile of the 30 female athlete weights,
meaning roughly 90% of the sampled athletes weigh no more than
160 lb.
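The position rule used in Ex. 4 and Ex. 5 can be sketched as a small function. This is an illustration only: the name `percentile_value` is hypothetical, and the sketch assumes an integer percentile p with 0 < p < 100 so that the position p·n/100 can be computed exactly with integer arithmetic.

```python
weights = [
    94, 101, 105, 107, 108, 109, 110, 112, 113, 115,
    119, 123, 124, 124, 124, 127, 130, 130, 135, 136,
    136, 141, 148, 149, 150, 156, 160, 160, 162, 163,
]


def percentile_value(sorted_data, p):
    """Value of the p-th percentile (integer p, 0 < p < 100) by the position rule."""
    n = len(sorted_data)
    # position = p * n / 100, split into its whole part and remainder.
    position, remainder = divmod(p * n, 100)
    if remainder == 0:
        # Whole-number position: half-way between this value and the next one.
        return (sorted_data[position - 1] + sorted_data[position]) / 2
    # Fractional position: round up, i.e. take the value in position + 1 (1-based).
    return sorted_data[position]


print(percentile_value(weights, 25))  # 112   (Q1)
print(percentile_value(weights, 50))  # 125.5 (Q2)
print(percentile_value(weights, 75))  # 148   (Q3)
print(percentile_value(weights, 90))  # 160.0 (P90)
```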
Finding the percentile of a value:
Ex. 6
What percentile is 135 in this set of values?
127  130  130  135  136  136  141  148  149  150  156
135 is a value which is higher than 3 values of the sorted set.
Total number of values in this data set is 11.
The proportion of all values of this set that are lower than
135 is then:
(3 / 11) × 100% ≈ 27%
135 lb is the 27th percentile of this data set, meaning that 27% of
values in the data set are less than 135 lb.
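The calculation in Ex. 6 amounts to counting the values strictly below the given value and dividing by the total count. A minimal sketch, with the function name `percentile_rank` chosen here for illustration:

```python
def percentile_rank(data, value):
    """Percentage of values in data that are strictly less than value."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)


values = [127, 130, 130, 135, 136, 136, 141, 148, 149, 150, 156]
rank = percentile_rank(values, 135)
print(round(rank))  # 27
```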
DECILES

Numbers that divide the data into 10 equal parts.

Thus the 10 parts are separated by 9 deciles.
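To illustrate that ten equal parts are separated by nine cut points, the sketch below uses `statistics.quantiles` from the Python standard library (Python 3.8+) on the 30 athlete weights from these slides. Note the assumption: the library interpolates between values with its own default method, so the individual deciles it returns need not match a hand calculation done with a rounding rule.

```python
from statistics import quantiles

weights = [
    94, 101, 105, 107, 108, 109, 110, 112, 113, 115,
    119, 123, 124, 124, 124, 127, 130, 130, 135, 136,
    136, 141, 148, 149, 150, 156, 160, 160, 162, 163,
]

# n=10 asks for the cut points that split the data into 10 equal-sized groups.
deciles = quantiles(weights, n=10)
print(len(deciles))  # 9
```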
MEASURES OF DEVIATION

X    d (deviation from the mean, 2)
1    1 - 2 = -1
0    0 - 2 = -2
6    6 - 2 = +4
1    1 - 2 = -1
MEAN DEVIATION

The mean deviation is computed from the absolute values of the
deviations.
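The formula itself did not survive in this transcript, so the sketch below assumes the usual definition of the mean (absolute) deviation: the sum of |X - mean| divided by the number of values. It uses the four values X = 1, 0, 6, 1 from the deviation table; the function name `mean_deviation` is an assumption.

```python
def mean_deviation(xs):
    """Average absolute deviation from the mean (assumed standard definition)."""
    mean = sum(xs) / len(xs)
    return sum(abs(x - mean) for x in xs) / len(xs)


data = [1, 0, 6, 1]          # mean is 2, deviations are -1, -2, +4, -1
print(mean_deviation(data))  # 2.0
```

Without the absolute values the deviations would sum to zero, which is why they are taken before averaging.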

Mean deviation for grouped data

STANDARD DEVIATION

The standard measure of how far the data deviate from their mean.
Population SD:
\[ \sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}} \]
Standard Deviation in a Sample:
\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]
Shortcut formula for the SD in a Sample (eliminates
the need to know the mean):
\[ s = \sqrt{\frac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}} \]
Ex. 2
Given the following data on the amount of pocket money, in dollars,
of 4 sampled individuals, find the sample SD:
(1) Add a deviation column.
(2) Add a column for the squares of the deviations.
(3) Divide the sum of the squares of the deviations by the number
of values decreased by 1, then take the square root.
x     x - mean    (x - mean)^2
1     -4          16
4     -1          1
5      0          0
10    +5          25
mean = 5          sum = 42
s = sqrt(42 / (4 - 1)) = 3.7
Pocket money amounts, in dollars, in the sample are spread $3.7,
on average, away from the mean amount of $5.
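The worked example can be checked with a short sketch that implements both the definitional formula and the shortcut formula for the sample SD. The function names are assumptions made for this illustration.

```python
import math


def sample_sd(xs):
    """Definitional formula: sqrt( sum (x - mean)^2 / (n - 1) )."""
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)  # sum of squared deviations
    return math.sqrt(ss / (n - 1))


def sample_sd_shortcut(xs):
    """Shortcut formula: sqrt( (n * sum x^2 - (sum x)^2) / (n * (n - 1)) )."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))


pocket_money = [1, 4, 5, 10]
print(round(sample_sd(pocket_money), 1))           # 3.7
print(round(sample_sd_shortcut(pocket_money), 1))  # 3.7
```

Both formulas give sqrt(14); the shortcut form only needs the running totals of x and x squared.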
Again:
Standard Deviation in a Population:
\[ \sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} \]
Standard Deviation in a Sample:
\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]
• Dividing by (n - 1) rather than n brings the sample SD closer to the
value of the population SD; the reasons are studied in
later statistics classes.
Standard deviation for grouped data

Some more definitions:
Variance is the square of the
Standard Deviation:
• In a sample: \( var = s^2 \)
• In a population: \( VAR = \sigma^2 \)
• Range is the difference between
the maximum value and the minimum
value in the set:
range = max - min
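Definitional Formulas
Population Variance \( \sigma^2 \):
\[ \sigma^2 = \frac{\sum (X - \mu)^2}{N} \]
• \( X - \mu \) = deviation
• \( (X - \mu)^2 \) = squared deviation
• \( \sum (X - \mu)^2 \) = sum of squared deviations = SS
• N = population size
• \( \frac{\sum (X - \mu)^2}{N} \) = mean squared deviation = variance = \( \sigma^2 \)
Sample Variance \( s^2 \):
\[ s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} \]
• \( X - \bar{X} \) = deviation
• \( (X - \bar{X})^2 \) = squared deviation
• \( \sum (X - \bar{X})^2 \) = sum of squared deviations = SS
• n - 1 = (sample size - 1) = "degrees of freedom"
• \( \frac{\sum (X - \bar{X})^2}{n - 1} \) = mean squared deviation = variance = \( s^2 \)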
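The terms in these definitional formulas (deviation, squared deviation, SS, degrees of freedom) can be traced step by step in a short sketch. It reuses the pocket-money values 1, 4, 5, 10 from Ex. 2; treating the same four values as a whole population is an assumption made only to show the two divisors side by side. `statistics.pvariance` and `statistics.variance` are standard-library functions used as a cross-check.

```python
import math
import statistics

data = [1, 4, 5, 10]
n = len(data)
mean = sum(data) / n                            # 5.0

deviations = [x - mean for x in data]           # [-4.0, -1.0, 0.0, 5.0]
squared_deviations = [d ** 2 for d in deviations]
ss = sum(squared_deviations)                    # sum of squared deviations = 42.0

population_variance = ss / n                    # divide by N
sample_variance = ss / (n - 1)                  # divide by degrees of freedom

print(ss)                   # 42.0
print(population_variance)  # 10.5
print(sample_variance)      # 14.0

# Cross-check against the standard library.
assert math.isclose(population_variance, statistics.pvariance(data))
assert math.isclose(sample_variance, statistics.variance(data))
```

Taking the square root of either variance gives the corresponding standard deviation.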
Any Questions?