Course: D0722 - Statistics and Its Applications
Year: 2010

Introduction
Meeting 1

Learning Outcomes
• By the end of this meeting, students are expected to be able to:
  1. define scales of measurement, sample, population, data, and data collection
  2. explain descriptive statistics

Using Statistics (Two Categories)
• Descriptive Statistics
  – Collect
  – Organize
  – Summarize
  – Display
  – Analyze
• Inferential Statistics
  – Predict and forecast values of population parameters
  – Test hypotheses about values of population parameters
  – Make decisions

McGraw-Hill/Irwin, Aczel/Sounderpandian, Complete Business Statistics, 5th edition, © The McGraw-Hill Companies, Inc., 2002

Types of Data - Two Types
• Qualitative: categorical or nominal. Examples are:
  – Color
  – Gender
  – Nationality
• Quantitative: measurable or countable. Examples are:
  – Temperatures
  – Salaries
  – Number of points scored on a 100-point exam

Scales of Measurement
• Nominal Scale - groups or classes
  – Gender
• Ordinal Scale - order matters
  – Ranks
• Interval Scale - difference or distance matters; has an arbitrary zero value
  – Temperatures
• Ratio Scale - ratio matters; has a natural zero value
  – Salaries

Samples and Populations
• A population consists of the set of all measurements in which the investigator is interested.
• A sample is a subset of the measurements selected from the population.
• A census is a complete enumeration of every item in a population.

Why Sample?
Census of a population may be:
• Impossible
• Impractical
• Too costly

12-6 Index Numbers
An index number is a number that measures the relative change in a set of measurements over time. For example: the Dow Jones Industrial Average (DJIA), the Consumer Price Index (CPI), the New York Stock Exchange (NYSE) Index.

Index number in period i:
$$\text{Index number in period } i = 100 \times \frac{\text{Value in period } i}{\text{Value in base period}}$$

Changing the base period of an index:
$$\text{New index value} = 100 \times \frac{\text{Old index value}}{\text{Index value of new base}}$$

Index Numbers

Year   Price   Index (1984 base)   Index (1991 base)
1984   121     100.0                64.7
1985   121     100.0                64.7
1986   133     109.9                71.1
1987   146     120.7                78.1
1988   162     133.9                86.6
1989   164     135.5                87.7
1990   172     142.1                92.0
1991   187     154.5               100.0
1992   197     162.8               105.3
1993   224     185.1               119.8
1994   255     210.7               136.4
1995   247     204.1               132.1
1996   238     196.7               127.3
1997   222     183.5               118.7

[Figure: Price and Index (1982=100) of Natural Gas; series: Price, Original Index (1984), Index (1991); x-axis: Year]

Summary Measures: Population Parameters, Sample Statistics
• Measures of Central Tendency
  – Median
  – Mode
  – Mean
• Measures of Variability
  – Range
  – Interquartile range
  – Variance
  – Standard Deviation
• Other summary measures:
  – Skewness
  – Kurtosis

Measures of Central Tendency or Location
• Median
  – Middle value when sorted in order of magnitude
  – 50th percentile
• Mode
  – Most frequently occurring value
• Mean
  – Average

Arithmetic Mean or Average
The mean of a set of observations is their average: the sum of the observed values divided by the number of observations.

Population Mean:
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$

Sample Mean:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Percentiles and Quartiles
• Given any set of numerical observations, order them according to magnitude.
• The Pth percentile in the ordered set is that value below which lie P% (P percent) of the observations in the set.
• The position of the Pth percentile is given by (n + 1)P/100, where n is the number of observations in the set.

Quartiles – Special Percentiles
• Quartiles are the percentage points that break down the ordered data set into quarters.
• The first quartile is the 25th percentile. It is the point below which lie 1/4 of the data.
• The second quartile is the 50th percentile. It is the point below which lie 1/2 of the data. This is also called the median.
• The third quartile is the 75th percentile. It is the point below which lie 3/4 of the data.

Measures of Variability or Dispersion
• Range
  – Difference between maximum and minimum values
• Interquartile Range
  – Difference between third and first quartile (Q3 - Q1)
• Variance
  – Average* of the squared deviations from the mean
• Standard Deviation
  – Square root of the variance
* Definitions of population variance and sample variance differ slightly.
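As a minimal sketch of the definitions above, the following Python applies the (n + 1)P/100 position rule, interpolating linearly between neighbouring ordered values, and then computes the range, interquartile range, variance, and standard deviation. The function names `percentile` and `variance` are illustrative assumptions, and the 20 sales values are the ones used in the range and interquartile range example of these slides.

```python
import math


def percentile(values, p):
    """Pth percentile using the (n + 1)P/100 position rule with linear interpolation."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) * p / 100      # 1-based position in the ordered set
    lower = math.floor(position)      # rank at or just below that position
    fraction = position - lower
    # Positions outside 1..n fall back to the extreme values (an assumption).
    if lower < 1:
        return float(ordered[0])
    if lower >= n:
        return float(ordered[-1])
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)


def variance(values, sample=True):
    """Average squared deviation from the mean; divisor is n - 1 for a sample, N for a population."""
    mean = sum(values) / len(values)
    squared_deviations = sum((x - mean) ** 2 for x in values)
    divisor = len(values) - 1 if sample else len(values)
    return squared_deviations / divisor


sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]

q1 = percentile(sales, 25)            # 13.25, as in the slide example
q3 = percentile(sales, 75)            # 18.75, as in the slide example
iqr = q3 - q1                         # 5.5
data_range = max(sales) - min(sales)  # 24 - 6 = 18
sample_var = variance(sales)
population_var = variance(sales, sample=False)
sample_sd = math.sqrt(sample_var)

print(q1, q3, iqr, data_range)
print(sample_var, population_var, sample_sd)
```

The `sample` flag is where the two variance definitions differ: the sample variance divides by n - 1, the population variance by N.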
Example - Range and Interquartile Range
(Data is used from Example )

Sales: 9 6 12 10 13 15 16 14 14 16 17 16 24 21 22 18 19 18 20 17

Rank   Sorted Sales
 1      6    Minimum
 2      9
 3     10
 4     12
 5     13
             First Quartile: Q1 = 13 + (.25)(1) = 13.25
 6     14
 7     14
 8     15
 9     16
10     16
11     16
12     17
13     17
14     18
15     18
             Third Quartile: Q3 = 18 + (.75)(1) = 18.75
16     19
17     20
18     21
19     22
20     24    Maximum

Range: Maximum - Minimum = 24 - 6 = 18
Interquartile Range: Q3 - Q1 = 18.75 - 13.25 = 5.5
See slide # 19 for the template output

Variance and Standard Deviation

Population Variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}}{N}, \qquad \sigma = \sqrt{\sigma^2}$$

Sample Variance:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}, \qquad s = \sqrt{s^2}$$

Group Data and the Histogram
• Dividing data into groups or classes or intervals
• Groups should be:
  – Mutually exclusive: not overlapping, every observation is assigned to only one group
  – Exhaustive: every observation is assigned to a group
  – Equal-width (if possible): first or last group may be open-ended

Frequency Distribution
• Table with two columns listing:
  – Each and every group or class or interval of values
  – Associated frequency of each group
    · Number of observations assigned to each group
    · Sum of frequencies is number of observations: N for population, n for sample
• Class midpoint is the middle value of a group or class or interval
• Relative frequency is the proportion of total observations in each class
  – Sum of relative frequencies = 1

Cumulative Frequency Distribution

x: Spending Class ($)    F(x): Cumulative Frequency    F(x)/n: Cumulative Relative Frequency
0 to less than 100        30                            0.163
100 to less than 200      68                            0.370
200 to less than 300     118                            0.641
300 to less than 400     149                            0.810
400 to less than 500     171                            0.929
500 to less than 600     184                            1.000

The cumulative frequency of each group is the sum of the frequencies of that and all preceding groups.

Histogram
• A histogram is a chart made of bars of different heights.
  – Widths and locations of bars correspond to widths and locations of data groupings
  – Heights of bars correspond to frequencies or relative frequencies of data groupings

Histogram Example
[Figure: Frequency Histogram]

Skewness and Kurtosis
• Skewness – Measure of asymmetry of a frequency distribution
  – Skewed to left
  – Symmetric or unskewed
  – Skewed to right
• Kurtosis – Measure of flatness or peakedness of a frequency distribution
  – Platykurtic (relatively flat)
  – Mesokurtic (normal)
  – Leptokurtic (relatively peaked)

Skewness
[Figures: Skewed to left; Symmetric; Skewed to right]

Kurtosis
[Figures: Platykurtic - flat distribution; Mesokurtic - not too flat and not too peaked; Leptokurtic - peaked distribution]

Methods of Displaying Data
• Pie Charts
  – Categories represented as percentages of total
• Bar Graphs
  – Heights of rectangles represent group frequencies
• Frequency Polygons
  – Height of line represents frequency
• Ogives
  – Height of line represents cumulative frequency
• Time Plots
  – Represents values over time
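The heights of an ogive are cumulative frequencies, which can be sketched in a few lines of Python. This is an illustration under stated assumptions: the class frequencies 30, 38, 50, 31, 22, and 13 are not listed in the slides; they are obtained here by differencing the cumulative column of the spending table.

```python
from itertools import accumulate

# Upper class limits ($) of the spending classes.
upper_limits = [100, 200, 300, 400, 500, 600]
# Assumed class frequencies: successive differences of the cumulative column.
frequencies = [30, 38, 50, 31, 22, 13]

n = sum(frequencies)                          # total number of observations
cumulative = list(accumulate(frequencies))    # F(x): running sum of frequencies
cumulative_relative = [round(f / n, 3) for f in cumulative]  # F(x)/n

for limit, cum, rel in zip(upper_limits, cumulative, cumulative_relative):
    print(f"less than {limit}: F(x) = {cum}, F(x)/n = {rel:.3f}")
```

Plotting `cumulative_relative` against `upper_limits` would give the ogive; plotting `frequencies` against class midpoints would give the frequency polygon.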
Pie Chart
[Figure: Pie Chart]

Bar Chart
[Figure 1-11: Airline Operating Expenses and Revenues; bars: Average Revenues, Average Expenses; x-axis: Airline (American, Continental, Delta, Northwest, Southwest, United, USAir)]

Frequency Polygon and Ogive
[Figure: Relative Frequency Polygon and Ogive; x-axis: Sales]

Time Plot
[Figure: Monthly Steel Production (Problem 1-46); y-axis: Millions of Tons; x-axis: Month]

Exploratory Data Analysis - EDA
Techniques to determine relationships and trends, identify outliers and influential observations, and quickly describe or summarize data sets.
• Stem-and-Leaf Displays
  – Quick-and-dirty listing of all observations
  – Conveys some of the same information as a histogram
• Box Plots
  – Median
  – Lower and upper quartiles
  – Maximum and minimum

Example Stem-and-Leaf Display

1 | 122355567
2 | 0111222346777899
3 | 012457
4 | 11257
5 | 0236
6 | 02

Box Plot
Elements of a Box Plot
• Box: Q1, Median, Q3; the length of the box is the Interquartile Range
• Inner Fences: Q1 - 1.5(IQR) and Q3 + 1.5(IQR)
• Outer Fences: Q1 - 3(IQR) and Q3 + 3(IQR)
• Whiskers: smallest data point not below inner fence; largest data point not exceeding inner fence
• Marked points: Suspected outlier, Outlier

Example: Box Plot
[Figure: Box Plot]

Summary
• Scales of measurement: nominal, ordinal, interval, ratio
• Data presentation: frequency histogram
• Index numbers
• Descriptive statistics: measures of central tendency and dispersion
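The fences named on the Box Plot slide, Q1 - 1.5(IQR), Q3 + 1.5(IQR), Q1 - 3(IQR), and Q3 + 3(IQR), can be sketched as follows. This is an illustration only: it reuses Q1 = 13.25 and Q3 = 18.75 from the sales example, the function names are hypothetical, and the classification rule (suspected outlier between the inner and outer fences, outlier beyond the outer fences) is an assumed common convention that the slides do not spell out.

```python
def box_plot_fences(q1, q3):
    """Inner and outer fences of a box plot from the first and third quartiles."""
    iqr = q3 - q1
    return {
        "inner": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
        "outer": (q1 - 3 * iqr, q3 + 3 * iqr),
    }


def classify(value, fences):
    """Label a data point relative to the fences (assumed convention, see lead-in)."""
    inner_low, inner_high = fences["inner"]
    outer_low, outer_high = fences["outer"]
    if value < outer_low or value > outer_high:
        return "outlier"
    if value < inner_low or value > inner_high:
        return "suspected outlier"
    return "within fences"


# Quartiles taken from the sales example: Q1 = 13.25, Q3 = 18.75.
fences = box_plot_fences(13.25, 18.75)
print(fences)

# The largest sales value, 24, lies inside the inner fences.
print(classify(24, fences))
```

With these quartiles, every sales value from 6 to 24 falls inside the inner fences, so the box plot of that data set would show no marked points.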