Displaying & Summarizing Data (Lesson – 02A)
From Data to Information
Dr. C. Ertuna

Displaying & Summarizing Data
Consumer Price Index: U.S. city average; 1967=100

Month      CPI
Jan-1999   492.3
Jul-1999   499.2
Jan-2000   505.8
Jul-2000   517.5
Dec-2000   521.1

What do these data tell us about CPI?

Displaying & Summarizing Data
• Raw data need to be converted into useful managerial information.
• Visual displays and statistical summaries are means of converting raw data into insightful information.
• That information can then be interpreted for decision-making purposes.

Visual Display of Data
[Figure: line chart titled "Consumer Price Index"; horizontal axis Jan-99 to Jul-00, vertical axis 470 to 530]

Displaying & Summarizing Data
• CPI increased steadily over the last two years.
• Over the last 6 months, however, that increase slowed down (from +11.7 points between Jan-2000 and Jul-2000 to +3.6 points between Jul-2000 and Dec-2000).

Graphical Display Methods
• Charts (column, bar, line, pie, area) and scatter diagrams make it easier to gain insights about the data (visual interpretation of data).
• They also provide excellent communication vehicles.
• The drawback is that the data can be distorted by manipulating the scale of the chart.

Creating a Column Chart
Profits (in thousands of dollars)

          1996    1997    1998    1999     2000
Tractor   1,808   2,674   3,974   11,802   18,089
Mower     1,246   1,204   1,226   981      2,549

1. Enter data in the worksheet
2. Select Chart Wizard
3. Select Clustered Column Chart (1:1)
4. Select Data Range (including Labels)
5. Select “Series Rows”
6. Select Finish

[Figure: clustered column chart of Tractor and Mower profits by year, 1996 to 2000]
Data: St-CE-Ch02-x1-Examples-Slide

Statistical Summaries
• Descriptive statistics such as
  – Measures of central tendency (mean, median, mode, midrange, etc.)
  – Measures of dispersion (range, variance, standard deviation, etc.)
  – Frequency distributions
  – Histograms
• Statistical relationships (such as correlation)
Both provide an effective way of obtaining meaningful information from data.

Descriptive Statistics
• Descriptive statistics need to be computed for both the sample and the population.
• “Population parameters” is the name for the descriptive statistics of a population. Greek letters represent population parameters.
• “Sample statistics” is the name for the descriptive statistics of a sample. Roman letters represent sample statistics.

Measure of Central Tendency
Blood Pressure (worksheet columns B and C, rows 4 to 9)

Row   NA    Others
4     175   128
5     162   117
6     159   152
7     193   138
8     148   97
9     151   115

Blood pressures for North American (NA) and other managers are given in the table. Which group has the higher blood pressure? To answer this question, we first need to measure the central tendency of each group.
Data: St-CE-Ch02-x1-Examples-Slide 13

Descriptive Measures of C.T.
Mean
  Computation method: sum of the values divided by the number of values
  Data level: ratio, interval
  Pros/cons:
  • Numerical center of the data
  • Sum of the deviations from the mean is zero
  • Sensitive to extreme values

Median
  Computation method: middle value of the data after sorting
  Data level: ratio, interval, ordinal
  Pros/cons:
  • Not sensitive to extreme values
  • Computed only from the central values
  • Does not use information from all the data

Mode
  Computation method: value(s) that occur most frequently in the data
  Data level: ratio, interval, ordinal, nominal
  Pros/cons:
  • May not reflect the center
  • May not exist
  • Might have multiple modes

Descriptive Measures of C.T.

Measure   Excel command      Symbol
Mean      =Average(Range)    μ (population), x̄ (sample)
Median    =Median(Range)
Mode      =Mode(Range)

Example: Measure of C. T. (cont.)
Blood Pressure (worksheet columns B and C, rows 4 to 9)

Row   NA    Others
4     175   128
5     162   117
6     159   152
7     193   138
8     148   97
9     151   115

Blood pressures for NA and other managers are given in the table.
1. Compare mean, median, mode and midrange.
2. Explain the meaning of the results.
3. Evaluate the implications.
Data: St-CE-Ch02-x1-Examples-Slide 13

Example: Measure of C. T. (cont.)
Formulas for the NA column of the same worksheet:

Mean =     =AVERAGE(B4:B23)
Median =   =MEDIAN(B4:B23)
Mode =     =MODE(B4:B23)
Range =    =MAX(B4:B23)-MIN(B4:B23)

Data: St-CE-Ch02-x1-Examples-Slide 13

Example: Measure of C. T. (cont.)
Results for the same blood pressure worksheet:

             NA       Others
Mean =       158.15   119.50
Median =     160.50   116.00
Mode =       148.00   138.00
Midrange =   135.50   120.50
Range =      115.00   63.00

Data: St-CE-Ch02-x1-Examples-Slide 13

Checking on the Extreme Value
• Before we can rely on the mean as a “good” measure of central tendency, we need to test for extreme values.
• One such test is the 3-sigma rule.
• The standard deviation of the series is computed and multiplied by 3. The result is (a) added to the mean to find the Upper Limit for extreme value detection and (b) subtracted from the mean to find the Lower Limit.

Example: Checking on Extreme Value
Blood Pressure (worksheet rows 4 to 10)

Row   NA    Others
4     175   128
5     162   117
6     159   152
7     193   138
8     148   97
9     151   115
10    78    105

              NA       Others
Mean =        158.15   119.50
Std Dev =     23.87    17.83
3 Std Dev =   71.60    53.49
Upper L =     229.75   172.99
Lower L =     86.55    66.01
Extreme V =   1        0

Example: Measure of C. T. (cont.)

             NA       Others
Mean =       158.15   119.50
Median =     160.50   116.00
Mode =       148.00   138.00
Range =      115.00   63.00

Blood pressure statistics for NA and other managers are given in the table.
1. Compare mean, median, mode.
2. Explain the meaning of the results.
3. Evaluate the implications.

Note: Comparison of the Means
• Strictly speaking, to compare the means of two series we need to run a hypothesis test (a two-sample test of means, the t-test).
• We will learn how to run hypothesis tests later.
• For now we will make a simple face-value comparison, knowing that a true comparison requires a formal test.

Comparison: Measure of C. T. (cont.)
• The mean, median, and mode of the NA managers are all greater than those of the “Other” managers.
• The median is slightly higher than the mean for the NA managers, and the opposite is true for the others.

Meaning: Measure of C. T. (cont.)
1. The mean values suggest that the blood pressures of the NA managers are higher than those of the “Other” managers.
2. The midrange can be interpreted as a rough measure of how much extreme values distort the mean. It suggests that the real mean for NA is underestimated and that the mean for “Others” is fairly accurate.

Meaning: Measure of C. T. (cont.)
3. At least half of the NA managers have blood pressure above the mean (the median, 160.50, exceeds the mean, 158.15). The median values support point 2 for NA and suggest that the mean for “Others” may overestimate the central tendency.
4. The mode values show that the most frequently observed value is below the mean for NA and above it for “Others.”

Evaluation: Measure of C. T. (cont.)
• High blood pressure is an indicator of stress and strain.
• The results suggest that the company's North American managers are under much more stress than its managers in other parts of the world.
• If corrective measures are not taken, then errors, loss of managerial talent, etc. may occur in NA.

Next Lesson (Lesson – 02B)
Measure of Dispersion
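
Supplementary sketch: the measures of central tendency and the 3-sigma check from this lesson can also be reproduced outside Excel. The Python below is only an illustration under stated assumptions. The function names (central_tendency, three_sigma_limits, extreme_values) are hypothetical and not part of the course material. The mean and standard deviation are the rounded NA figures printed on the slide, and the list of readings holds only the seven NA values visible on the slide, not the full worksheet column.

```python
from statistics import mean, median, multimode


def central_tendency(values):
    """Summarize a series with the measures used in this lesson."""
    lo, hi = min(values), max(values)
    return {
        "mean": mean(values),
        "median": median(values),
        "modes": multimode(values),  # every value tied for most frequent
        "midrange": (lo + hi) / 2,
        "range": hi - lo,
    }


def three_sigma_limits(center, std_dev):
    """Return (lower, upper): the mean minus and plus 3 standard deviations."""
    spread = 3 * std_dev
    return center - spread, center + spread


def extreme_values(values, lower, upper):
    """List the values that fall outside the band from lower to upper."""
    return [v for v in values if v < lower or v > upper]


# Rounded summary figures for the NA group, copied from the slide.
NA_MEAN, NA_STD_DEV = 158.15, 23.87

# Assumption: only the seven NA readings visible on the slide,
# not the complete series used to compute the figures above.
na_visible = [175, 162, 159, 193, 148, 151, 78]

lower, upper = three_sigma_limits(NA_MEAN, NA_STD_DEV)
print(f"3-sigma limits: {lower:.2f} to {upper:.2f}")
print("Outside the limits:", extreme_values(na_visible, lower, upper))
print(central_tendency(na_visible))
```

Because the inputs are the rounded slide values, the computed limits can differ from the slide's Upper L and Lower L in the last decimal place. The summary printed for the seven visible readings is not expected to match the slide's mean, median, or mode, which were computed from the full column.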