Basic Statistics
Six Sigma Foundations
Continuous Improvement Training
Six Sigma Simplicity

Key Learning Points

Simple statistics can:
- Increase your understanding of process behavior
- Help identify improvement opportunities for 6S

Statistics

Common statistics:
- Miles per gallon (liter); mpg (mpl)
- Median home prices
- Consumer price index
- Inflation rate
- Stock market average
- Airline on-time arrival rate

Statistics are computed using data. Statistics summarize the data and help us to predict future performance.

Basic Statistics

Basic statistics:
- Serve as a means to analyze data collected in the Measure phase.
- Allow us to numerically describe the data that characterizes our process’ Xs and Ys.
- Use past process and performance data to make inferences about the future.
- Serve as a foundation for advanced statistical problem-solving methodologies.
- Are a concept that creates a universal language based on numerical facts rather than intuition.

Data Visualization

Before any statistical tools are applied, visually display and look at your data. A histogram allows us to look at how the data is distributed across our Y scale of measure.

[Figure: histogram, "Number of Wins for National Football League Teams (1998)". X-axis: Number of Games Won (0 to 15); Y-axis: Number of Teams (0 to 5). Annotation: five teams won eight games. Source: AOLSports]

Building a Histogram

The following data came from our bicycle test facility: stopping distances required to bring a 150 lb weight to a complete stop with the rear brake applied from a 10 mph cruising speed.

Trial (sample #)   Stop Distance (Feet)
 1                 14
 2                  6
 3                 13
 4                  7
 5                 10
 6                 10
 7                 11
 8                  9
 9                 11
10                  9
11                 11
12                  9
13                 10
14                 10
15                 10

[Figure: blank histogram grid with labeled Y-axis and X-axis; the X-axis runs from 6 to 14 feet]

Measures of Central Tendency

In addition to counting occurrences and graphing the results, we can describe processes in terms of central tendency and dispersion.
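The counting and graphing step for the bicycle stopping distances can be sketched in a few lines of Python. This is only an illustration, not part of the original training material: the names `stop_distances`, `tally`, and `text_histogram` are assumptions made for this example, and the text bars stand in for a drawn histogram.

```python
from collections import Counter

# Stop distances in feet for trials 1 to 15, copied from the table above.
stop_distances = [14, 6, 13, 7, 10, 10, 11, 9, 11, 9, 11, 9, 10, 10, 10]


def tally(values):
    """Count how often each whole-foot value occurs across the observed range."""
    counts = Counter(values)
    # Include values with zero occurrences so gaps in the histogram stay visible.
    return {x: counts.get(x, 0) for x in range(min(values), max(values) + 1)}


def text_histogram(values):
    """Render one row per distance, with one '#' per trial at that distance."""
    rows = [f"{x:>2} ft | {'#' * count}" for x, count in tally(values).items()]
    return "\n".join(rows)


print(text_histogram(stop_distances))
```

Each `#` corresponds to one trial, so the tallest row marks the most common stopping distance.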
Measures of Central Tendency

Mean (μ, Xbar): the arithmetic average of a set of values
- Uses the quantitative value of each data point
- Is strongly influenced by extreme values

Median (M): the number that reflects the middle of a set of values
- Is the 50th percentile
- Is identified as the middle number after all the values are sorted from high to low
- Is not affected by extreme values

Mode: the most frequently occurring value in a data set

Central Tendency Exercise

Determine the mean, median, and mode for the bicycle stopping distances used to create the histograms.

Mean = ________ Median = ________ Mode = ________

Stop distances (feet) for trials 1 to 15: 14, 6, 13, 7, 10, 10, 11, 9, 11, 9, 11, 9, 10, 10, 10

Mean, Median, Mode

[Figure: three frequency histograms labeled Positive Skew, Normal, and Negative Skew, each with the positions of the mode, median, and mean marked]

Measures of Dispersion

Range (R): the difference between the highest and lowest values

$R = x_{max} - x_{min}$

Sample Variance ($s^2$): the average squared distance of each point from the average (Xbar)

$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n - 1}$

Sample Standard Deviation (s): the square root of the variance

$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

Example of Measures of Dispersion

[Figure: histogram, "Number of Wins for National Football League Teams (1998)". Y-axis: Frequency. Source: AOLSports]

Xbar = 8
Range = 12
s2 = 11.72
s = 3.42

Dispersion Exercise

Find measures of dispersion for the stopping distance data. Fill in the table at the right.

Range (R) = ________
Variance (s2) = ________
Std Dev (s) = ________

Population vs. Sample (Certainty vs. Uncertainty)

A sample is just a subset of all possible values.

[Figure: a Sample drawn from a Population]

Since the sample does not contain all the possible values, there is some uncertainty about the population. Hence any statistics, such as mean and standard deviation, are just estimates of the true population parameters.
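As a rough check on the two exercises above, the sketch below computes the central tendency and dispersion measures for the 15 stopping distances. It is an illustration only: the variable names are assumptions made for this example, `sample_variance` is a hand-written version of the n - 1 formula, and the population figure (dividing by N) treats the 15 trials as if they were the whole population purely for comparison.

```python
import math
import statistics

# Stop distances in feet for trials 1 to 15 (bicycle test facility data).
stop_distances = [14, 6, 13, 7, 10, 10, 11, 9, 11, 9, 11, 9, 10, 10, 10]


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    xbar = sum(values) / n
    return sum((x - xbar) ** 2 for x in values) / (n - 1)


# Measures of central tendency
mean = statistics.mean(stop_distances)
median = statistics.median(stop_distances)
mode = statistics.mode(stop_distances)

# Measures of dispersion (sample formulas)
value_range = max(stop_distances) - min(stop_distances)
s2 = sample_variance(stop_distances)
s = math.sqrt(s2)

# Population formula for comparison: divides by N instead of n - 1.
sigma = statistics.pstdev(stop_distances)

print(f"mean={mean} median={median} mode={mode}")
print(f"range={value_range} s2={s2:.2f} s={s:.2f} sigma={sigma:.2f}")
```

Dividing by n - 1 rather than N makes the sample standard deviation slightly larger than the population figure, which reflects the extra uncertainty of working from a subset.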
Symbols

                     Sample      Population
Size                 n           N
Mean                 $\bar{x}$   $\mu$
Standard deviation   s           $\sigma$

(n = # of samples; s is little "s")

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$, $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$

$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$, $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$

The Normal Curve

In 80 to 90% of problems worked, data will follow a normal bell curve or can be transformed to look like a normal curve. This curve is described by the Xbar and s "statistics." The area under this curve is 1 or 100%. For the normal curve, mean = median = mode.

[Figure: normal curve with Xbar and s marked]

Normal Bell Curve Properties

Histograms (bar charts) are developed from samples. Sample statistics (Xbar and s) are calculated from representatives of the population. From the histogram and sample statistics, we form a curve that represents the population from which these samples were drawn.

- Xbar ± 1sd: 68.26% of the data falls within 1 standard deviation from the mean
- Xbar ± 3sd: 99.73% of the data falls within 3 standard deviations from the mean
- Xbar ± 6sd: 99.9999998% of the data falls within 6 standard deviations from the mean

Other Data Distributions

[Figure: four frequency histograms labeled Normal, Uniform, Log Normal, and Exponential]

Normal Curve Exercise

Here is a histogram of the bike stopping distance data (Xbar = 10, s = 2).

- Does the histogram appear normal?
- Draw vertical lines at 1sd, 2sd, 4sd
- Discuss

[Figure: histogram of the bike stopping distance data. X-axis: 1 to 20; Y-axis: Frequency, 0 to 5]
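The coverage percentages quoted for the normal curve can be reproduced from the error function, and the same idea gives a quick numerical companion to the exercise. The sketch below is an illustration under stated assumptions: `normal_coverage` and `fraction_within` are names invented for this example, and the values Xbar = 10 and s = 2 are taken directly from the exercise.

```python
import math


def normal_coverage(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


def fraction_within(values, center, sd, k):
    """Fraction of observed values lying within k standard deviations of center."""
    inside = [x for x in values if abs(x - center) <= k * sd]
    return len(inside) / len(values)


# Stop distances in feet for trials 1 to 15, with the sample statistics
# quoted in the exercise (Xbar = 10, s = 2).
stop_distances = [14, 6, 13, 7, 10, 10, 11, 9, 11, 9, 11, 9, 10, 10, 10]
xbar, s = 10, 2

for k in (1, 2, 3, 6):
    theory = normal_coverage(k)
    observed = fraction_within(stop_distances, xbar, s, k)
    print(f"within {k} sd: normal curve {theory:.9%}, bike data {observed:.1%}")
```

The computed normal-curve values agree with the percentages quoted above up to rounding in the last digit. With only 15 trials, the observed fractions should be expected to match the theoretical curve only roughly.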