Statistics for Managers
using Microsoft Excel
3rd Edition
Chapter 1
Introduction and Data Collection
© 2002 Prentice-Hall, Inc.

Chapter Topics
Why managers need statistics
The growth and development of modern statistics
Key definitions
Descriptive and inferential statistics
Guðmundur Ólafsson, lecturer

Chapter Topics (continued)
Why data are needed
Types and sources of data
Research design
Sampling methods
Survey errors

What Managers Need to Know About Statistics
How to present information appropriately
How to draw conclusions about a population from a sample
How to improve work processes
How to make forecasts that can be trusted

The Growth and Development of Modern Statistics
Government's need for data on individuals
The development of probability theory
The advent of computers

Key Terms
A population is the collection of all cases under consideration
A sample is the part of a population selected for study (a subset of the population, possibly the whole population)
A parameter is a summary measure computed to describe a population
A statistic is a summary measure computed to describe a sample

Population and Sample
[Diagram: a sample drawn from a population]
Statistics describe the characteristics of a sample
Parameters describe the characteristics of a population
Inferences about the population are drawn from the sample

Statistical Methods
Descriptive statistics: collecting and describing data
Inferential statistics: drawing conclusions about a population based on the analysis of samples from that population

Descriptive Statistics
Collect data, e.g., with a survey
Present data, e.g., with tables and graphs
Describe the characteristics of data, e.g., with the sample mean $\bar{X} = \frac{\sum X_i}{n}$

Inferential Statistics
Estimation, e.g., estimate the mean weight of a population from the mean weight of a sample
Hypothesis testing, e.g., test the claim that the mean weight in the population is 60 kg
Drawing conclusions and/or making decisions about a population, based on the results of a sample survey.

Why Are Data Needed?
They are used in surveys
They are necessary in research
To evaluate performance in service or production
To assess whether standard rules are being followed
To help find alternative solutions
To satisfy curiosity

Sources of Data
Primary data (data collection): observation, survey, experiments
Secondary data (data compilation): printed or electronic sources

Types of Data
Numerical data: discrete or continuous
Categorical data

Designing Surveys
Choose an appropriate mode of response
Reliable primary modes: personal interviews, telephone interviews, mail surveys
Less reliable, self-selected modes (not suitable for inferences about the population as a whole): television surveys, internet surveys, printed surveys in newspapers and magazines, product or service surveys

Designing Surveys (continued)
Identify broad categories: make a complete list of non-overlapping items that relate to the subject of the survey
Formulate precise questions: questions should be clear and unambiguous, and should use generally accepted, well-defined words
Test the survey: try it out on a small group to make it clearer and to settle its length

Designing Surveys (continued)
Write a careful cover letter
State the goal and purpose of the survey
Explain why responding matters
Guarantee the respondents' anonymity
Consider offering an incentive, for example gifts

Reasons for Sampling
Takes less time than surveying the whole population
Costs less than surveying the whole population
A sample survey is simpler to carry out than a survey of the whole population

Sampling Methods
Nonprobability samples: judgment, chunk, quota
Probability samples: simple random, systematic, stratified, cluster

Probability Samples
Items are selected according to known probabilities
Types: simple random, systematic, stratified, cluster

Simple Random Samples
Every item in the population is equally likely to end up in the sample
Selection may be with replacement or without replacement
Samples are drawn using random number tables or computer random number generators

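The two selection modes on the slide above (with and without replacement) can be sketched with Python's standard `random` module. The population of 64 numbered items and the sample size of 8 are assumptions made for illustration only.

```python
import random

# Hypothetical population: items labelled 1..64 (not taken from the slides).
population = list(range(1, 65))
rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Without replacement: each item can appear at most once in the sample.
sample_without = rng.sample(population, k=8)

# With replacement: the same item may be drawn more than once.
sample_with = rng.choices(population, k=8)

print(sample_without)
print(sample_with)
```

In both calls every item has the same chance of being picked on each draw, which is the defining property of a simple random sample.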
Systematic Samples
Decide on the sample size: n
Divide the N items into groups of k items: k = N/n
Select one item at random from the first group
Then select every k-th item after it
Example: N = 64, n = 8, k = 8
[Diagram: the first group of k items highlighted]

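The steps above can be sketched as follows. This is a minimal illustration that assumes n divides N evenly, as in the slide's example (N = 64, n = 8, k = 8); the function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(items, n, rng=None):
    """Pick one random item from the first group, then every k-th item after it."""
    rng = rng or random.Random()
    k = len(items) // n          # group size k = N / n (assumes n divides N)
    start = rng.randrange(k)     # random position inside the first group
    return [items[start + j * k] for j in range(n)]

frame = list(range(1, 65))       # N = 64 items, numbered 1..64
sample = systematic_sample(frame, n=8, rng=random.Random(1))
print(sample)
```

Only the starting point is random; once it is fixed, the rest of the sample is determined by the step k.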
Stratified Samples
The population is divided into two or more groups according to some common characteristic
A simple random sample is selected from each group
The two or more samples are then combined into one

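A minimal sketch of the procedure above, assuming an equal number of items is drawn from every group. The department names, the group sizes, and the helper `stratified_sample` are hypothetical and only serve the illustration.

```python
import random

def stratified_sample(strata, per_stratum, rng=None):
    """Draw a simple random sample from every stratum and pool the results."""
    rng = rng or random.Random()
    combined = []
    for members in strata.values():
        combined.extend(rng.sample(members, per_stratum))
    return combined

# Hypothetical strata: employees grouped by department.
strata = {
    "sales": [f"S{i}" for i in range(20)],
    "production": [f"P{i}" for i in range(30)],
    "office": [f"O{i}" for i in range(10)],
}
sample = stratified_sample(strata, per_stratum=3, rng=random.Random(7))
print(sample)
```

Because every group contributes to the pooled sample, no group can be left out by chance.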
Cluster Samples
The population is divided into groups (clusters), each representative of the population
A simple random sample is taken from each group
The samples are combined into one
[Diagram: a population divided into 4 clusters]

Advantages and Disadvantages
Simple random and systematic samples: simple to use; certain characteristics of the population may be under-represented
Stratified sample: ensures that all groups in the population are taken into account
Cluster sample: cheaper; not as precise (samples sometimes need to be larger to reach a given precision)

Evaluating the Quality of a Survey
What is the purpose?
Is the survey based on a probability sample?
Is there bias because not everyone can be reached?
Is an attempt made to follow up those who do not respond?
Are there measurement errors?
Are there errors in how the survey was carried out?

Survey Errors
Coverage error (not everyone can be reached): some are not included in the sample
Nonresponse error: follow up, ask additional questions
Sampling error: the same thing is not asked everywhere
Measurement error: bad questions!

Summary
Discussed what managers need to know about statistics
Discussed the growth and development of modern statistics
Introduced descriptive and inferential statistics
Addressed the importance of data

Summary (continued)
Defined and explained the different sources and types of data
Discussed survey design
Reviewed sampling methods
Introduced the main survey errors

Chapter 2
Presenting Data in Tables and Charts
Chapter Topics

Organizing numerical data


Tabulating and graphing Univariate numerical
data



The ordered array and stem-leaf display
Frequency distributions: tables, histograms,
polygons
Cumulative distributions: tables, the Ogive
Graphing Bivariate numerical data
Chapter Topics (continued)
Tabulating and graphing univariate categorical data: the summary table; bar and pie charts; the Pareto diagram
Tabulating and graphing bivariate categorical data: contingency tables; side by side bar charts
Graphical excellence and common errors in presenting data

Organizing Numerical Data
Numerical data: 41, 24, 32, 26, 27, 27, 30, 24, 38, 21
Ordered array: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
Stem-and-leaf display:
2 144677
3 028
4 1
Frequency distributions and cumulative distributions: tables, histograms, polygons, ogive

Organizing Numerical Data
(continued)



Data in raw form (as collected):
24, 26, 24, 21, 27, 27, 30, 41, 32, 38
Data in ordered array from smallest to largest:
21, 24, 24, 26, 27, 27, 30, 32, 38, 41
Stem-and-leaf display:
2 144677
3 028
4 1
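As a rough sketch, the stem-and-leaf display above can be rebuilt from the raw data by splitting each value into its tens digit (stem) and units digit (leaf). The function name `stem_and_leaf` is hypothetical, and the sketch assumes non-negative two-digit integers like those on the slide.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return display lines such as '2 144677' (stem = tens digit, leaves = units digits)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    return [f"{stem} {''.join(str(d) for d in leaves[stem])}" for stem in sorted(leaves)]

raw = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]   # data in raw form, from the slide
for line in stem_and_leaf(raw):
    print(line)
```

Sorting first guarantees that the leaves within each stem appear in ascending order, so the display doubles as an ordered array.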
Tabulating and Graphing Numerical Data
Numerical data: 41, 24, 32, 26, 27, 27, 30, 24, 38, 21
Ordered array: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
Stem-and-leaf display:
2 144677
3 028
4 1
Frequency distributions and cumulative distributions: tables, histograms, polygons, ogive
[Figure: thumbnail examples of an ogive and a histogram]

Tabulating Numerical Data: Frequency Distributions
Sort raw data in ascending order:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Find range: 58 - 12 = 46
Select number of classes: 5 (usually between 5 and 15)
Compute class interval (width): 10 (46/5, then round up)
Determine class boundaries (limits): 10, 20, 30, 40, 50, 60
Compute class midpoints: 15, 25, 35, 45, 55
Count observations and assign to classes

Frequency Distributions, Relative Frequency Distributions and Percentage Distributions
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class              Frequency   Relative Frequency   Percentage
10 but under 20        3              .15               15
20 but under 30        6              .30               30
30 but under 40        5              .25               25
40 but under 50        4              .20               20
50 but under 60        2              .10               10
Total                 20             1                 100

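The table above can be reproduced with a short sketch that counts the observations falling in each class "lower but under upper". The function name `frequency_table` is hypothetical; the data and class boundaries are the ones shown on the slides.

```python
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]
boundaries = [10, 20, 30, 40, 50, 60]

def frequency_table(values, bounds):
    """Count values per class [lower, upper) and derive relative and percentage columns."""
    total = len(values)
    rows = []
    for lower, upper in zip(bounds, bounds[1:]):
        freq = sum(1 for v in values if lower <= v < upper)
        rows.append((lower, upper, freq, freq / total, 100 * freq / total))
    return rows

for lower, upper, freq, rel, pct in frequency_table(data, boundaries):
    print(f"{lower} but under {upper}: {freq:2d}  {rel:.2f}  {pct:.0f}")
```

Using half-open classes means a value that sits exactly on a boundary, such as 30, is counted once, in the class that starts there.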
Graphing Numerical Data: The Histogram
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
[Figure: frequency histogram with no gaps between bars, which meet at the class boundaries. Horizontal axis: class midpoints 5, 15, 25, 35, 45, 55; vertical axis: frequency, 0 to 7. Bar heights: 0, 3, 6, 5, 4, 2.]

Graphing Numerical Data: The Frequency Polygon
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
[Figure: frequency polygon. Horizontal axis: class midpoints 5, 15, 25, 35, 45, 55; vertical axis: frequency, 0 to 7.]

Tabulating Numerical Data: Cumulative Frequency
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class              Cumulative Frequency   Cumulative % Frequency
10 but under 20             3                      15
20 but under 30             9                      45
30 but under 40            14                      70
40 but under 50            18                      90
50 but under 60            20                     100

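The cumulative columns are running totals of the class frequencies. A minimal sketch, taking the frequencies 3, 6, 5, 4, 2 from the frequency distribution of the same data:

```python
from itertools import accumulate

frequencies = [3, 6, 5, 4, 2]               # class frequencies, 10-20 up to 50-60
cumulative = list(accumulate(frequencies))  # running totals
total = cumulative[-1]
cumulative_pct = [100 * c / total for c in cumulative]

print(cumulative)
print(cumulative_pct)
```

The last cumulative frequency always equals the number of observations, so the last cumulative percentage is 100.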
Graphing Numerical Data: The Ogive (Cumulative % Polygon)
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
[Figure: ogive. Horizontal axis: class boundaries (not midpoints) 10, 20, 30, 40, 50, 60; vertical axis: cumulative %, 0 to 100.]

Graphing Bivariate Numerical Data (Scatter Plot)
[Figure: Mutual Funds Scatter Plot. Horizontal axis: net asset values, 0 to 40; vertical axis: total year-to-date return (%), 0 to 40.]

Tabulating and Graphing Categorical Data: Univariate Data
Tabulating data: the summary table
Graphing data: pie charts, bar charts, Pareto diagram

Summary Table (for an Investor's Portfolio)

Investment Category   Amount (in thousands $)   Percentage
Stocks                        46.5                 42.27
Bonds                         32                   29.09
CD                            15.5                 14.09
Savings                       16                   14.55
Total                        110                  100

Variables are categorical

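The percentage column is each amount divided by the portfolio total. A small sketch that recomputes it from the amounts shown above (variable names are illustrative):

```python
portfolio = {"Stocks": 46.5, "Bonds": 32.0, "CD": 15.5, "Savings": 16.0}  # thousands of $

total = sum(portfolio.values())
percentages = {name: round(100 * amount / total, 2)
               for name, amount in portfolio.items()}

print(total)
print(percentages)
```

Rounded to two decimals the four percentages add up to 100, matching the total row of the table.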
Graphing Categorical Data: Univariate Data
Tabulating data: the summary table
Graphing data: pie charts, bar charts, Pareto diagram
[Figure: thumbnail examples of a pie chart, a bar chart, and a Pareto diagram for the categories Stocks, Bonds, Savings, and CD]

Bar Chart (for an Investor's Portfolio)
[Figure: horizontal bar chart titled "Investor's Portfolio". Categories: Savings, CD, Bonds, Stocks; horizontal axis: amount in K$, 0 to 50.]

Pie Chart (for an Investor's Portfolio)
[Figure: pie chart of amount invested in K$. Stocks 42%, Bonds 29%, Savings 15%, CD 14%.]
Percentages are rounded to the nearest percent.

Pareto Diagram
[Figure: Pareto diagram of the investor's portfolio with categories ordered Stocks, Bonds, Savings, CD. The axis for the bar chart (0% to 45%) shows % invested in each category; the axis for the line graph (0% to 100%) shows cumulative % invested.]

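A Pareto diagram orders the categories from largest to smallest and overlays the cumulative percentage. The numbers behind such a chart can be sketched as below, using the portfolio amounts from the summary table (in thousands of dollars); the variable names are illustrative.

```python
from itertools import accumulate

portfolio = {"Stocks": 46.5, "Bonds": 32.0, "CD": 15.5, "Savings": 16.0}

# Bars: categories in descending order of amount.
ordered = sorted(portfolio.items(), key=lambda item: item[1], reverse=True)
total = sum(portfolio.values())

# Line: running total of the ordered amounts, expressed as a percentage.
running = list(accumulate(amount for _, amount in ordered))
for (name, amount), cum in zip(ordered, running):
    print(f"{name:8s} {100 * amount / total:6.2f}%  {100 * cum / total:7.2f}%")
```

Sorting puts Savings ahead of CD, which is the order shown on the slide's horizontal axis.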
Tabulating and Graphing Bivariate Categorical Data
Contingency tables: investment in thousands of dollars

Investment Category   Investor A   Investor B   Investor C   Total
Stocks                   46.5         55           27.5       129
Bonds                    32           44           19          95
CD                       15.5         20           13.5        49
Savings                  16           28            7          51
Total                   110          147           67         324

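The marginal totals of the contingency table can be checked with a short sketch. The nested-dictionary layout is just one convenient representation and is an assumption of this example; the figures are those in the table.

```python
table = {
    "Stocks":  {"A": 46.5, "B": 55.0, "C": 27.5},
    "Bonds":   {"A": 32.0, "B": 44.0, "C": 19.0},
    "CD":      {"A": 15.5, "B": 20.0, "C": 13.5},
    "Savings": {"A": 16.0, "B": 28.0, "C": 7.0},
}

# Row totals: amount per investment category across all investors.
row_totals = {category: sum(row.values()) for category, row in table.items()}

# Column totals: amount per investor across all categories.
col_totals = {inv: sum(row[inv] for row in table.values()) for inv in ("A", "B", "C")}

grand_total = sum(row_totals.values())
print(row_totals, col_totals, grand_total)
```

Row totals and column totals must add up to the same grand total, which is a quick consistency check on any contingency table.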
Tabulating and Graphing Bivariate Categorical Data
Side by side charts
[Figure: side by side bar chart titled "Comparing Investors". Categories: Savings, CD, Bonds, Stocks; one bar each for Investor A, Investor B, and Investor C; horizontal axis 0 to 60.]

Principles of Graphical Excellence





Presents data in a way that provides
substance, statistics and design
Communicates complex ideas with clarity,
precision and efficiency
Gives the largest number of ideas in the most
efficient manner
Almost always involves several dimensions
Tells the truth about the data
Errors in Presenting Data


Using “chart junk”
Failing to provide a relative
basis in comparing data
between groups

Compressing the vertical axis

Providing no zero point on the vertical axis
“Chart Junk”
Bad presentation: [Figure: decorated graphic titled "Minimum Wage" with the labels 1960: $1.00, 1970: $1.60, 1980: $3.10, 1990: $3.80]
Good presentation: [Figure: plain chart titled "Minimum Wage". Horizontal axis: 1960, 1970, 1980, 1990; vertical axis: $, 0 to 4.]

No Relative Basis
Bad presentation: [Figure: "A's received by students" plotted as frequencies for FR, SO, JR, SR]
Good presentation: [Figure: "A's received by students" plotted as percentages for FR, SO, JR, SR]
FR = Freshmen, SO = Sophomore, JR = Junior, SR = Senior

Compressing Vertical Axis
Bad presentation: [Figure: "Quarterly Sales" for Q1 to Q4 with the vertical axis ($) running from 0 to 200]
Good presentation: [Figure: "Quarterly Sales" for Q1 to Q4 with the vertical axis ($) running from 0 to 50]

No Zero Point on Vertical Axis
Bad presentation: [Figure: "Monthly Sales" for J, F, M, A, M, J with the vertical axis ($) marked 36, 39, 42, 45 and no zero point]
Good presentation: [Figure: "Monthly Sales" for J, F, M, A, M, J with the vertical axis ($) marked 0, 36, 39, 42, 45]
Graphing the first six months of sales.

Chapter Summary

Organized numerical data


Tabulated and graphed univariate numerical
data



The ordered array and stem-leaf display
Frequency distributions: tables, histograms,
polygon
Cumulative distributions: tables and the Ogive
Graphed bivariate numerical data
Chapter Summary (continued)
Tabulated and graphed univariate categorical data: the summary table; bar and pie charts; the Pareto diagram
Tabulated and graphed bivariate categorical data: contingency tables; side by side charts
Discussed graphical excellence and common errors in presenting data

Chapter 3
Numerical Descriptive Measures
Chapter Topics

Measures of central tendency

Mean, median, mode, geometric mean, midrange

Quartile

Measure of variation


Range, interquartile range, variance and standard
deviation, coefficient of variation
Shape

Symmetric, skewed, using box-and-whisker plots
Chapter Topics (continued)
Coefficient of correlation
Pitfalls in numerical descriptive measures and ethical considerations

Summary Measures
Central tendency: mean, median, mode, geometric mean
Quartile
Variation: range, variance, standard deviation, coefficient of variation

Measures of Central Tendency
Average: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$ (sample), $\mu = \frac{\sum_{i=1}^{N} X_i}{N}$ (population)
Median
Mode
Geometric mean: $\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$

Mean (Arithmetic Mean)
Mean (arithmetic mean) of data values
Sample mean (n = sample size): $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$
Population mean (N = population size): $\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$

Mean (Arithmetic Mean) (continued)
The most common measure of central tendency
Affected by extreme values (outliers)
[Figure: two dot diagrams. On an axis from 0 to 10, mean = 5; on an axis from 0 to 14, with an extreme value present, mean = 6.]

Median
Robust measure of central tendency
Not affected by extreme values
[Figure: two dot diagrams. On an axis from 0 to 10, median = 5; on an axis from 0 to 14, with an extreme value present, median = 5.]
In an ordered array, the median is the “middle” number
If n or N is odd, the median is the middle number
If n or N is even, the median is the average of the two middle numbers

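The contrast between the mean and the median can be sketched with Python's `statistics` module. The data points plotted on the slides did not survive in this transcript, so the two lists below are assumptions, chosen only because they give the same means (5 and 6) and medians (5 and 5) as the slides.

```python
from statistics import mean, median

# Hypothetical data: identical except that the largest value becomes an outlier.
without_outlier = [1, 3, 5, 7, 9]
with_outlier = [1, 3, 5, 7, 14]

print(mean(without_outlier), median(without_outlier))
print(mean(with_outlier), median(with_outlier))

# Even number of observations: the median is the average of the two middle values.
print(median([1, 3, 5, 7]))
```

Moving one observation from 9 to 14 pulls the mean up while the middle value, and therefore the median, stays where it was.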
Mode
A measure of central tendency
Value that occurs most often
Not affected by extreme values
Used for either numerical or categorical data
There may be no mode
There may be several modes
[Figure: two dot diagrams. On an axis from 0 to 14, mode = 9; on an axis from 0 to 6, no mode.]

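A minimal sketch of the three cases above (one mode, several modes, no mode). The helper `modes` and all sample values are hypothetical; "no mode" is taken here to mean that no value occurs more than once.

```python
from collections import Counter

def modes(values):
    """Return every most-frequent value; an empty list means no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:                      # every value occurs once: no mode
        return []
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 3, 5, 9, 9, 12]))       # one mode
print(modes([1, 1, 2, 2, 3]))           # several modes
print(modes([0, 1, 2, 3, 4, 5, 6]))     # no mode
print(modes(["red", "blue", "red"]))    # categorical data
```

Counting occurrences works the same way for numbers and for category labels, which is why the mode is usable for both kinds of data.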
Geometric Mean
Useful in the measure of rate of change of a variable over time
$\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$
Geometric mean rate of return
Measures the status of an investment over time
$\bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1$

Example
An investment of $100,000 declined to $50,000 at the end of year one and rebounded to $100,000 at the end of year two:
X1 = $100,000, X2 = $50,000, X3 = $100,000
Average rate of return:
$\bar{X} = \frac{(-50\%) + (100\%)}{2} = 25\%$
Geometric rate of return:
$\bar{R}_G = [(1 + (-50\%)) \times (1 + (100\%))]^{1/2} - 1 = [(0.50) \times (2)]^{1/2} - 1 = 1^{1/2} - 1 = 0\%$

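The example can be checked numerically. In this sketch the yearly returns are written as fractions (-0.50 and 1.00), and the function names are hypothetical.

```python
import math

def arithmetic_mean_return(returns):
    """Plain average of the per-period returns."""
    return sum(returns) / len(returns)

def geometric_mean_return(returns):
    """n-th root of the compounded growth factors, minus 1."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1

returns = [-0.50, 1.00]   # year one: -50%, year two: +100%
print(arithmetic_mean_return(returns))
print(geometric_mean_return(returns))
```

The investment ends where it started, so the geometric rate of 0% describes it correctly, while the arithmetic average of 25% overstates the result.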
Quartiles
Split ordered data into 4 quarters: 25% | Q1 | 25% | Q2 | 25% | Q3 | 25%
Position of i-th quartile: $(Q_i) = \frac{i(n+1)}{4}$
Data in ordered array: 11 12 13 16 16 17 18 21 22
Position of $Q_1 = \frac{1(9+1)}{4} = 2.5$, so $Q_1 = \frac{12 + 13}{2} = 12.5$
Q1 and Q3 are measures of noncentral location
Q2 = median, a measure of central tendency

Measures of Variation
Range
Interquartile range
Variance: population variance, sample variance
Standard deviation: population standard deviation, sample standard deviation
Coefficient of variation

Range
Measure of variation
Difference between the largest and the smallest observations: $\text{Range} = X_{\text{Largest}} - X_{\text{Smallest}}$
Ignores the way in which data are distributed
[Figure: two dot diagrams on an axis from 7 to 12 with differently distributed data; in both, Range = 12 - 7 = 5.]

Interquartile Range
Measure of variation
Also known as midspread: spread in the middle 50%
Difference between the first and third quartiles
Data in ordered array: 11 12 13 16 16 17 17 18 21
$\text{Interquartile Range} = Q_3 - Q_1 = 17.5 - 12.5 = 5$
Not affected by extreme values

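The quartile position rule i(n + 1)/4 and the interquartile range can be sketched as below. The slides only show a position that falls halfway between two observations (2.5); rounding any other fractional position to the nearest observation is an assumption of this sketch, and the function name `quartile` is hypothetical.

```python
def quartile(sorted_values, i):
    """Value at position i*(n+1)/4 in an ordered array (positions count from 1)."""
    n = len(sorted_values)
    position = i * (n + 1) / 4
    lower = int(position)                     # whole part of the position
    if position == lower:                     # exact position: take that observation
        return sorted_values[lower - 1]
    if position - lower == 0.5:               # halfway: average the two neighbours
        return (sorted_values[lower - 1] + sorted_values[lower]) / 2
    # Assumption: other fractional positions are rounded to the nearest observation.
    return sorted_values[round(position) - 1]

data = [11, 12, 13, 16, 16, 17, 17, 18, 21]   # ordered array from the slide
q1, q3 = quartile(data, 1), quartile(data, 3)
print(q1, q3, q3 - q1)
```

With n = 9 the positions are 2.5, 5, and 7.5, so Q1 and Q3 are each the average of two neighbouring observations and Q2 is the fifth observation, the median.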
Variance
Important measure of variation
Shows variation about the mean
Sample variance: $S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$

Standard Deviation
Most important measure of variation
Shows variation about the mean
Has the same units as the original data
Sample standard deviation: $S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
Population standard deviation: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$

Comparing Standard Deviations
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = .9258
Data C: Mean = 15.5, s = 4.57
[Figure: dot diagrams of the three data sets, each on an axis from 11 to 21.]

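The sample and population variance formulas can be sketched in a few lines. The individual observations behind the three data sets did not survive in this transcript, so the values below are assumptions: they were chosen only because they reproduce the means and standard deviations printed on the slide.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)

def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

# Hypothetical observations (assumed, not recovered from the slides).
datasets = {
    "A": [11, 12, 13, 16, 16, 17, 18, 21],
    "B": [14, 15, 15, 15, 16, 16, 16, 17],
    "C": [11, 11, 11, 12, 19, 20, 20, 20],
}
for name, values in datasets.items():
    s = math.sqrt(sample_variance(values))
    print(name, sum(values) / len(values), round(s, 4))
```

All three sets share the same mean, so the standard deviation is what separates them: B is tightly bunched around 15.5, while C is piled up at both ends.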
Coefficient of Variation
Measures relative variation
Always in percentage (%)
Shows variation relative to mean
Is used to compare two or more sets of data measured in different units
$CV = \left(\frac{S}{\bar{X}}\right) \times 100\%$

Comparing Coefficient of Variation
Stock A: average price last year = $50, standard deviation = $5
Stock B: average price last year = $100, standard deviation = $5
Coefficient of variation:
Stock A: $CV = \left(\frac{S}{\bar{X}}\right) \times 100\% = \left(\frac{\$5}{\$50}\right) \times 100\% = 10\%$
Stock B: $CV = \left(\frac{S}{\bar{X}}\right) \times 100\% = \left(\frac{\$5}{\$100}\right) \times 100\% = 5\%$

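The comparison above takes one line per stock. A minimal sketch, with a hypothetical helper name:

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return 100 * std_dev / mean

cv_a = coefficient_of_variation(5, 50)     # Stock A
cv_b = coefficient_of_variation(5, 100)    # Stock B
print(cv_a, cv_b)
```

Both stocks have the same standard deviation in dollars, but relative to its price Stock A varies twice as much as Stock B.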
Shape of a Distribution
Describes how data are distributed
Measures of shape: symmetric or skewed
Left-skewed: Mean < Median < Mode
Symmetric: Mean = Median = Mode
Right-skewed: Mode < Median < Mean

Exploratory Data Analysis
Box-and-whisker plot
Graphical display of data using the 5-number summary: X_smallest, Q1, median (Q2), Q3, X_largest
[Figure: box-and-whisker plot labeled X_smallest, Q1, median (Q2), Q3, and X_largest above an axis marked 4, 6, 8, 10, 12.]

Distribution Shape and Box-and-Whisker Plot
[Figure: box-and-whisker plots for left-skewed, symmetric, and right-skewed distributions, each marked with Q1, Q2, and Q3.]

Coefficient of Correlation
Measures the strength of the linear relationship between two quantitative variables
$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$

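A direct transcription of the correlation formula into code, as a sketch. The function name and the small data sets are hypothetical; they were picked so that the three results come out at the extremes and at zero.

```python
import math

def correlation(xs, ys):
    """Sum of cross-deviations divided by the root of the product of squared deviations."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
print(correlation(xs, [2, 4, 6, 8, 10]))   # points on a rising straight line
print(correlation(xs, [10, 8, 6, 4, 2]))   # points on a falling straight line
print(correlation(xs, [2, 1, 3, 1, 2]))    # no linear relationship
```

Because the deviations appear in both numerator and denominator, the units of X and Y cancel, which is why r is unit free.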
Features of Correlation Coefficient
Unit free
Ranges between –1 and 1
The closer to –1, the stronger the negative linear relationship
The closer to 1, the stronger the positive linear relationship
The closer to 0, the weaker the linear relationship, whether positive or negative

Scatter Plots of Data with Various Correlation Coefficients
[Figure: five scatter plots of Y against X, with r = -1, r = -.6, r = 0, r = .6, and r = 1.]

Pitfalls in
Numerical Descriptive Measures

Data analysis is objective


Should report the summary measures that best
meet the assumptions about the data set
Data interpretation is subjective

Should be done in fair, neutral and clear manner
Ethical Considerations
Numerical descriptive measures:



Should document both good and bad results
Should be presented in a fair, objective and
neutral manner
Should not use inappropriate summary
measures to distort facts
Chapter Summary

Described measures of central tendency

Mean, median, mode, geometric mean, midrange

Discussed quartile

Described measure of variation


Range, interquartile range, variance and standard
deviation, coefficient of variation
Illustrated shape of distribution

Symmetric, skewed, box-and-whisker plots
Chapter Summary (continued)
Discussed correlation coefficient
Addressed pitfalls in numerical descriptive measures and ethical considerations