Course: D0722 - Statistika dan Aplikasinya (Statistics and Its Applications)
Year: 2010
Introduction
Session 1

Learning Outcomes
By the end of this session, students are expected to be able to:
1. define scales of measurement, sample, population, data, and data collection
2. explain descriptive statistics
Complete Business Statistics, 5th edition
Aczel/Sounderpandian, McGraw-Hill/Irwin
© The McGraw-Hill Companies, Inc., 2002

Using Statistics (Two Categories)

Descriptive Statistics
• Collect
• Organize
• Summarize
• Display
• Analyze

Inferential Statistics
• Predict and forecast values of population parameters
• Test hypotheses about values of population parameters
• Make decisions

Types of Data - Two Types

Qualitative (Categorical or Nominal). Examples are:
• Color
• Gender
• Nationality

Quantitative (Measurable or Countable). Examples are:
• Temperatures
• Salaries
• Number of points scored on a 100-point exam

Scales of Measurement
• Nominal Scale - groups or classes (example: gender)
• Ordinal Scale - order matters (example: ranks)
• Interval Scale - difference or distance matters; has an arbitrary zero value (example: temperatures)
• Ratio Scale - ratio matters; has a natural zero value (example: salaries)

Samples and Populations



• A population consists of the set of all measurements in which the investigator is interested.
• A sample is a subset of the measurements selected from the population.
• A census is a complete enumeration of every item in a population.

Why Sample?

A census of a population may be:
• Impossible
• Impractical
• Too costly

12-6 Index Numbers
An index number is a number that measures the relative
change in a set of measurements over time. For example: the
Dow Jones Industrial Average (DJIA), the Consumer Price
Index (CPI), the New York Stock Exchange (NYSE) Index.
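Index number in period i:
\[ \text{Index number in period } i = 100 \times \frac{\text{Value in period } i}{\text{Value in base period}} \]
Changing the base period of an index:
\[ \text{New index value} = 100 \times \frac{\text{Old index value}}{\text{Index value of new base}} \]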
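
The two formulas can be sketched in a few lines of Python. This is only an illustration: the function names `index_number` and `rebase` are made up for this sketch, and the two prices are the 1984 and 1991 natural gas prices from the index table in these slides.

```python
def index_number(value, base_value):
    """Index number for one period: 100 * value / value in the base period."""
    return 100 * value / base_value


def rebase(old_index, index_of_new_base):
    """Re-express an existing index value relative to a new base period."""
    return 100 * old_index / index_of_new_base


# Natural gas prices taken from the index table in these slides.
price_1984 = 121
price_1991 = 187

# 1991 price expressed on the 1984 base.
index_1991_on_1984 = index_number(price_1991, price_1984)

# 1984 index value (100 on its own base) re-expressed on the 1991 base.
index_1984_on_1991 = rebase(index_number(price_1984, price_1984),
                            index_1991_on_1984)

print(round(index_1991_on_1984, 1))  # 154.5
print(round(index_1984_on_1991, 1))  # 64.7
```

Both printed values agree with the corresponding entries in the natural gas table.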
Index Numbers
Year    Price    Index (1984 base)    Index (1991 base)
1984     121          100.0                 64.7
1985     121          100.0                 64.7
1986     133          109.9                 71.1
1987     146          120.7                 78.1
1988     162          133.9                 86.6
1989     164          135.5                 87.7
1990     172          142.1                 92.0
1991     187          154.5                100.0
1992     197          162.8                105.3
1993     224          185.1                119.8
1994     255          210.7                136.4
1995     247          204.1                132.1
1996     238          196.7                127.3
1997     222          183.5                118.7

[Figure: Price and Index (1982=100) of Natural Gas Price. Line chart of Original price, Index (1984), and Index (1991) against Year.]

Summary Measures: Population Parameters and Sample Statistics

Measures of Central Tendency
• Median
• Mode
• Mean

Measures of Variability
• Range
• Interquartile range
• Variance
• Standard deviation

Other summary measures:
• Skewness
• Kurtosis

Measures of Central Tendency or Location

Median
• Middle value when sorted in order of magnitude
• 50th percentile

Mode
• Most frequently occurring value

Mean
• Average

Arithmetic Mean or Average

The mean of a set of observations is their average: the sum of the observed values divided by the number of observations.

Population mean:
\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N} \]

Sample mean:
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]
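
As a quick illustration of the three measures of central tendency, the sketch below uses Python's standard `statistics` module on the 20 sales figures from the range and interquartile range example in these slides. Treating them as a sample is an assumption made here for illustration.

```python
import statistics

# Sales figures from the range / interquartile range example in these slides.
sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]

# Mean: sum of the observed values divided by the number of observations.
mean = sum(sales) / len(sales)

# Median: middle value of the ordered data (average of the two middle
# values when the number of observations is even).
median = statistics.median(sales)

# Mode: most frequently occurring value.
mode = statistics.mode(sales)

print(mean, median, mode)
```

For this data set the mean is 15.85, and both the median and the mode are 16.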
Percentiles and Quartiles



• Given any set of numerical observations, order them according to magnitude.
• The Pth percentile in the ordered set is that value below which lie P% (P percent) of the observations in the set.
• The position of the Pth percentile is given by (n + 1)P/100, where n is the number of observations in the set.
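
A minimal sketch of the position rule (n + 1)P/100 follows. The function name `percentile` is hypothetical. Interpolating between neighbouring ordered values for a fractional position follows the quartile example in these slides, while the handling of positions that fall outside the data is an assumption of this sketch.

```python
def percentile(data, p):
    """Return the p-th percentile using the position (n + 1) * p / 100."""
    ordered = sorted(data)
    n = len(ordered)
    position = (n + 1) * p / 100   # 1-based position, may be fractional
    lower = int(position)          # rank at or just below the position
    fraction = position - lower
    if lower < 1:                  # assumption: clamp to the smallest value
        return ordered[0]
    if lower >= n:                 # assumption: clamp to the largest value
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    # Interpolate between the two neighbouring ordered observations.
    return below + fraction * (above - below)


# Sales figures from the range / interquartile range example in these slides.
sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]

print(percentile(sales, 25))  # 13.25
print(percentile(sales, 50))  # 16.0
print(percentile(sales, 75))  # 18.75
```

With n = 20 the 25th percentile sits at position 5.25 and the 75th at position 15.75, which reproduces the quartiles 13.25 and 18.75 used in the slides' example.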
Quartiles – Special Percentiles




• Quartiles are the percentage points that break down the ordered data set into quarters.
• The first quartile is the 25th percentile. It is the point below which lie 1/4 of the data.
• The second quartile is the 50th percentile. It is the point below which lie 1/2 of the data. This is also called the median.
• The third quartile is the 75th percentile. It is the point below which lie 3/4 of the data.

Measures of Variability or Dispersion

Range
• Difference between maximum and minimum values

Interquartile Range
• Difference between third and first quartile (Q3 - Q1)

Variance
• Average* of the squared deviations from the mean

Standard Deviation
• Square root of the variance

* Definitions of population variance and sample variance differ slightly.

Example - Range and Interquartile Range (Data
is used from Example )
Sales    Sorted Sales    Rank
  9           6            1    (Minimum)
  6           9            2
 12          10            3
 10          12            4
 13          13            5
 15          14            6
 16          14            7
 14          15            8
 14          16            9
 16          16           10
 17          16           11
 16          17           12
 24          17           13
 21          18           14
 22          18           15
 18          19           16
 19          20           17
 18          21           18
 20          22           19
 17          24           20    (Maximum)

Range: Maximum - Minimum = 24 - 6 = 18
First Quartile: Q1 = 13 + (.25)(1) = 13.25
Third Quartile: Q3 = 18 + (.75)(1) = 18.75
Interquartile Range: Q3 - Q1 = 18.75 - 13.25 = 5.5

See slide # 19 for the template output

Variance and Standard Deviation

Population variance:
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}}{N}, \qquad \sigma = \sqrt{\sigma^2} \]

Sample variance:
\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}, \qquad s = \sqrt{s^2} \]
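
The difference between the population and sample definitions can be sketched as follows. The function names are hypothetical, and the data are the sales figures from the range and interquartile range example in these slides. The sample variance is computed twice, once from squared deviations and once from the sums of x and x squared, to show that the two forms agree.

```python
import math


def population_variance(values):
    """Average squared deviation from the mean, dividing by N."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Squared deviations from the sample mean, dividing by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_variance_shortcut(values):
    """Same quantity computed from the sum of x and the sum of x squared."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


# Sales figures from the range / interquartile range example in these slides.
sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]

s2 = sample_variance(sales)
print(s2, sample_variance_shortcut(sales))  # the two forms should agree
print(math.sqrt(s2))                        # sample standard deviation
print(population_variance(sales))           # slightly smaller: divides by N
```

Dividing by n - 1 rather than N always makes the sample variance slightly larger than the population formula applied to the same numbers.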
Group Data and the Histogram


• Dividing data into groups or classes or intervals
• Groups should be:
  - Mutually exclusive: not overlapping, every observation is assigned to only one group
  - Exhaustive: every observation is assigned to a group
  - Equal-width (if possible): first or last group may be open-ended

Frequency Distribution

Table with two columns listing:
• Each and every group or class or interval of values
• Associated frequency of each group
  - Number of observations assigned to each group
  - Sum of frequencies is the number of observations (N for a population, n for a sample)

Class midpoint is the middle value of a group or class or interval.
Relative frequency is the proportion of total observations in each class.
• Sum of relative frequencies = 1
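
A frequency distribution with class midpoints and relative frequencies might be built as in the sketch below. The function name and the choice of four equal-width classes starting at 5 are assumptions made for illustration. The data are the sales figures from the range and interquartile range example in these slides.

```python
def frequency_distribution(data, start, width, n_classes):
    """Count observations in equal-width classes [lower, upper)."""
    counts = [0] * n_classes
    for x in data:
        k = int((x - start) // width)
        if not 0 <= k < n_classes:
            # Classes must be exhaustive: every observation needs a class.
            raise ValueError(f"{x} falls outside the classes")
        counts[k] += 1
    n = len(data)
    rows = []
    for k, frequency in enumerate(counts):
        lower = start + k * width
        upper = lower + width
        rows.append({
            "class": (lower, upper),
            "midpoint": (lower + upper) / 2,
            "frequency": frequency,
            "relative_frequency": frequency / n,
        })
    return rows


# Sales figures from the range / interquartile range example in these slides.
sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]

# Hypothetical classes: 5 to <10, 10 to <15, 15 to <20, 20 to <25.
table = frequency_distribution(sales, start=5, width=5, n_classes=4)
for row in table:
    print(row)
```

Because each observation lands in exactly one class, the frequencies sum to the number of observations and the relative frequencies sum to 1.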
Cumulative Frequency Distribution
Spending Class ($), x     Cumulative Frequency, F(x)     Cumulative Relative Frequency, F(x)/n
0 to less than 100                   30                             0.163
100 to less than 200                 68                             0.370
200 to less than 300                118                             0.641
300 to less than 400                149                             0.810
400 to less than 500                171                             0.929
500 to less than 600                184                             1.000

The cumulative frequency of each group is the sum of the frequencies of that and all preceding groups.
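
The running sum can be sketched with `itertools.accumulate`. The class frequencies below are not printed in the slide. They are derived here by differencing the cumulative frequencies in the spending table, so treat them as an assumption of this sketch.

```python
from itertools import accumulate

# Class frequencies derived by differencing the cumulative frequencies
# in the spending table (30, 68, 118, 149, 171, 184).
frequencies = [30, 38, 50, 31, 22, 13]

# Cumulative frequency: this class plus all preceding classes.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency: F(x) / n, where n is the total count.
n = cumulative[-1]
cumulative_relative = [round(f / n, 3) for f in cumulative]

print(cumulative)           # [30, 68, 118, 149, 171, 184]
print(cumulative_relative)  # [0.163, 0.37, 0.641, 0.81, 0.929, 1.0]
```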
Histogram

• A histogram is a chart made of bars of different heights.
• Widths and locations of bars correspond to widths and locations of data groupings.
• Heights of bars correspond to frequencies or relative frequencies of data groupings.

Histogram Example
[Figure: Frequency Histogram]

Skewness and Kurtosis

Skewness
– Measure of asymmetry of a frequency distribution
• Skewed to left
• Symmetric or unskewed
• Skewed to right

Kurtosis
– Measure of flatness or peakedness of a frequency
distribution
• Platykurtic (relatively flat)
• Mesokurtic (normal)
• Leptokurtic (relatively peaked)
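
The slides describe skewness and kurtosis only qualitatively. One common way to put numbers on them, assumed here and not taken from the slides, is through central moments: skewness as m3 / m2^1.5 and excess kurtosis as m4 / m2^2 - 3, where m_k is the k-th central moment. The function names and the small data sets are hypothetical.

```python
def central_moment(values, k):
    """k-th central moment: average of (x - mean) ** k."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n


def skewness(values):
    """Moment coefficient of skewness (assumed definition, see lead-in)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal curve scores about 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


# Hypothetical illustrative data sets.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]            # long tail to the right
left_skewed = [-10, -2, -1, -1, -1]        # long tail to the left
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -10, 10]

print(skewness(symmetric), skewness(right_skewed), skewness(left_skewed))
print(excess_kurtosis(symmetric), excess_kurtosis(peaked))
```

Under these definitions a negative skewness indicates a distribution skewed to the left and a positive one a distribution skewed to the right. A negative excess kurtosis corresponds to a platykurtic shape, a value near zero to mesokurtic, and a positive value to leptokurtic.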

Skewness
[Figure: distribution skewed to left]
[Figure: symmetric distribution]
[Figure: distribution skewed to right]

Kurtosis
[Figure: platykurtic - flat distribution]
[Figure: mesokurtic - not too flat and not too peaked]
[Figure: leptokurtic - peaked distribution]

Methods of Displaying Data

Pie Charts
• Categories represented as percentages of total

Bar Graphs
• Heights of rectangles represent group frequencies

Frequency Polygons
• Height of line represents frequency

Ogives
• Height of line represents cumulative frequency

Time Plots
• Represent values over time

Pie Chart
Bar Chart
[Fig. 1-11: Airline Operating Expenses and Revenues. Bar chart of Average Revenues and Average Expenses by Airline: American, Continental, Delta, Northwest, Southwest, United, USAir.]

Frequency Polygon and Ogive
[Figure: two panels plotted against Sales. Left: Relative Frequency Polygon. Right: Ogive.]

Time Plot
[Figure: Monthly Steel Production (Problem 1-46). Time plot of Millions of Tons against Month.]

Exploratory Data Analysis - EDA
Techniques to determine relationships and trends, identify outliers and influential observations, and quickly describe or summarize data sets.

Stem-and-Leaf Displays
• Quick-and-dirty listing of all observations
• Conveys some of the same information as a histogram

Box Plots
• Median
• Lower and upper quartiles
• Maximum and minimum

Example Stem-and-Leaf Display
1 | 122355567
2 | 0111222346777899
3 | 012457
4 | 11257
5 | 0236
6 | 02
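
A stem-and-leaf display like the one above can be generated in a few lines. The sketch assumes that stems are tens digits and leaves are units digits, which the slide does not state. The data values are read back from the display under that assumption, and the function name is hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return display lines with tens digits as stems and units as leaves."""
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        groups[stem].append(str(leaf))
    return [f"{stem} | {''.join(groups[stem])}" for stem in sorted(groups)]


# Values read back from the example display (assumed tens | units).
data = [11, 12, 12, 13, 15, 15, 15, 16, 17,
        20, 21, 21, 21, 22, 22, 22, 23, 24, 26, 27, 27, 27, 28, 29, 29,
        30, 31, 32, 34, 35, 37,
        41, 41, 42, 45, 47,
        50, 52, 53, 56,
        60, 62]

for line in stem_and_leaf(data):
    print(line)
```

The length of each row plays the role of a histogram bar, which is why the display conveys some of the same information as a histogram.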
Box Plot
Elements of a Box Plot
• Box: spans Q1 to Q3 (the interquartile range, IQR), with the median marked inside
• Whiskers: extend to the smallest data point not below the inner fence and to the largest data point not exceeding the inner fence
• Inner fences: Q1 - 1.5(IQR) and Q3 + 1.5(IQR)
• Outer fences: Q1 - 3(IQR) and Q3 + 3(IQR)
• Suspected outlier (*): a point between an inner fence and an outer fence
• Outlier (o): a point beyond an outer fence
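
The fences and whisker ends can be computed as in the sketch below. It reuses the (n + 1)P/100 position rule from the percentile slide, assumes at least three observations, and treats points lying exactly on a fence as not flagged, which is a choice made for this sketch. The function names are hypothetical, and the data are the sales figures from the range and interquartile range example in these slides.

```python
def quartile(ordered, p):
    """p-th percentile of sorted data via the (n + 1) * p / 100 position."""
    position = (len(ordered) + 1) * p / 100
    lower = int(position)
    fraction = position - lower
    below = ordered[lower - 1]
    above = ordered[min(lower, len(ordered) - 1)]
    return below + fraction * (above - below)


def box_plot_elements(data):
    """Quartiles, fences, whisker ends, and flagged points for a box plot."""
    ordered = sorted(data)
    q1, q3 = quartile(ordered, 25), quartile(ordered, 75)
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3 * iqr, q3 + 3 * iqr)
    within_inner = [x for x in ordered if inner[0] <= x <= inner[1]]
    return {
        "q1": q1,
        "median": quartile(ordered, 50),
        "q3": q3,
        "iqr": iqr,
        "inner_fences": inner,
        "outer_fences": outer,
        # Whiskers stop at the most extreme points inside the inner fences.
        "whiskers": (within_inner[0], within_inner[-1]),
        # Between an inner and an outer fence: suspected outliers.
        "suspected_outliers": [x for x in ordered
                               if outer[0] <= x < inner[0]
                               or inner[1] < x <= outer[1]],
        # Beyond an outer fence: outliers.
        "outliers": [x for x in ordered if x < outer[0] or x > outer[1]],
    }


# Sales figures from the range / interquartile range example in these slides.
sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]

for name, value in box_plot_elements(sales).items():
    print(name, value)
```

For the sales data the inner fences are 5.0 and 27.0 and the outer fences are -3.25 and 35.25, so no point is flagged and the whiskers run from 6 to 24.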
Example: Box Plot

Summary
• Scales of measurement: nominal, ordinal, interval, ratio
• Data presentation: frequency histogram
• Index numbers
• Descriptive statistics: measures of central tendency and dispersion