Statistics 800: Quantitative Business Analysis for Decision Making
Measures of Locations and Variability

Lecture Outline
A. Discussion of Problems From Chapter 3
B. Population and Sample
C. Averages (Mean, Median, Mode)
D. Box Plot; Cumulative Distribution Curves
E. Variance and Standard Deviation; Coefficient of Variation

403.4

Normal Bell Curve
[Figure: normal bell curve; density (0.0 to 0.4) plotted against X from -5 to 5]

Bell Curves Cont'd
[Figure: normal bell curve; density (0.0 to 0.4) plotted against X from -10 to 10]

Population and Sample
• A portion of the population is called a sample
• Sample aspects are called statistics
• Population aspects are called parameters
• Population mean is denoted by $\mu$
• Population standard deviation is denoted by $\sigma$

Notations
• Population mean is denoted by $\mu$
• Sample mean is denoted by $\bar{X}$
• Population standard deviation is denoted by $\sigma$
• Sample standard deviation is denoted by $S$

Notations
• Capital letters (X, Y, Z, etc.) denote variables
• Lower case letters ($x_1, x_2, x_3$, etc.) denote observations for X
• n = random sample size

Averages (Mean, Median, Mode)
• Mean of $x_1, x_2, \ldots, x_{300}$ for sample size n = 300
• T = total: $T = \sum x_i$
• Mean: $\bar{x} = T / 300$

Mean (cont'd)
• Formula for computing weighted mean: $\bar{x} = \sum w_i x_i$, where the weights $w_i$ sum to 1

Median & Mode
• Median: the value such that 50% of the numbers in the data set are less than it and 50% are greater
  – The median need not be a number in the data set
• Mode: the most frequently occurring number in the data set

Ranks and Percentiles
• Ranks expressed as percentages are called percentiles of a data set
• 0th percentile, 25th percentile (lower quartile), 50th percentile, 75th percentile (upper quartile), 100th percentile

Box Plot
• Five number summary:
  – smallest value (0th percentile)
  – lower quartile (25th percentile)
  – median (50th percentile)
  – upper quartile (75th percentile)
  – largest value (100th percentile)
• A box plot is a graphical representation of the five number summary

Two Box Plots
[Figure: box plot of a data set with median 3 and average 4.35; X from 0 to 30]
The distribution is skewed towards extreme values.
[Figure: box plot of a data set with median 3 and average 3; X from 0 to 5]
The distribution is symmetric around the average.

Cumulative Distribution Curves
• Plot data against their cumulative percentages
• Easy to estimate any percentile value
[Figure: cumulative distribution curve of a data set symmetric around average 3; cumulative percent (0 to 100) against X from 0.0 to 5.0]

One More Cumulative Distribution Curve
[Figure: cumulative distribution curve of a skewed data set with median 3 and average 4.35; cumulative percent (0 to 100) against X from 0 to 30]

Standard Deviation and Coefficient of Variation
• Sampling variation or sampling error: the extent to which repeated samples may differ from each other
• Reliability of an estimate is measured by its sampling error
• Standard deviation of a sample quantifies the extent to which its values vary from their mean

Notation and Formula for Standard Deviation
• s = sample standard deviation
• Formula: $s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$

An Empirical Rule
• About 68% of the population is inside the one S.D. interval around the mean: $\mu \pm \sigma$
• About 95% of the population is inside the two S.D. interval around the mean: $\mu \pm 2\sigma$
• About 99.7% of the population is inside the three S.D. interval around the mean: $\mu \pm 3\sigma$

Coefficient of Variation
• CV is a relative measure of variability; it is the standard deviation divided by the mean
• Useful when the variation is better understood as a percentage
• Population CV $= \sigma / \mu \times 100\%$, and sample CV $= s / \bar{x} \times 100\%$

Effects of Adding to or Rescaling Data
------------------------------------------------------------------------------------
                                  Original   Add d           Multiply by c   Multiply then add
------------------------------------------------------------------------------------
Variable                          X          X + d           cX              cX + d
Average (median, percentiles)     $\bar{X}$  $\bar{X} + d$   $c\bar{X}$      $c\bar{X} + d$
Standard deviation                s          s               cs              cs
------------------------------------------------------------------------------------
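
The effects of adding a constant to the data or rescaling it, described above, can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module. The sample `x`, the shift `d`, and the scale factor `c` are hypothetical values chosen only for illustration (they do not come from the lecture), and `c` is assumed to be positive. The helper names `summarize` and `cv_percent` are also illustrative.

```python
import statistics

# Hypothetical sample, chosen only for illustration (not from the lecture).
x = [1.0, 2.0, 3.0, 3.0, 4.0, 5.0, 10.0]
c, d = 2.0, 5.0  # hypothetical scale factor (assumed positive) and shift


def summarize(values):
    """Return (mean, median, sample standard deviation) of a list of numbers."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.stdev(values),  # sample S.D.: divides by n - 1
    )


def cv_percent(values):
    """Sample coefficient of variation, s / x-bar, as a percentage."""
    return statistics.stdev(values) / statistics.mean(values) * 100


mean_x, median_x, s_x = summarize(x)

shifted = [xi + d for xi in x]      # X + d
scaled = [c * xi for xi in x]       # cX
both = [c * xi + d for xi in x]     # cX + d

for label, values in [("X", x), ("X + d", shifted), ("cX", scaled), ("cX + d", both)]:
    m, med, s = summarize(values)
    print(f"{label:8s} mean={m:.3f} median={med:.3f} s={s:.3f} CV={cv_percent(values):.1f}%")
```

In the printed rows, the mean and median move with both the shift and the scale factor, while the standard deviation responds only to the scale factor. The coefficient of variation is unchanged by multiplying by a positive `c`, but it does change when `d` is added, because the mean shifts while the standard deviation does not.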