Statistics 800: Quantitative Business Analysis for Decision Making
Measures of Location and Variability

Lecture Outline
A. Discussion of Problems From Chapter 3
B. Population and Sample
C. Averages (Mean, Median, Mode)
D. Box Plot; Cumulative Distribution Curves
E. Variance and Standard Deviation; Coefficient of Variation

403.4

Normal Bell Curve
[Figure: normal bell curve; density (0.0 to 0.4) plotted against X from -5 to 5.]

Bell Curves Cont'd
[Figure: normal bell curve; density (0.0 to 0.4) plotted against X from -10 to 10.]

Population and Sample
• A portion of the population is called a sample
• Sample aspects are called statistics
• Population aspects are called parameters
• Population mean is denoted by µ
• Population standard deviation is denoted by σ

Notations
Population mean is denoted by µ
Sample mean is denoted by $\bar{X}$
Population standard deviation is denoted by σ
Sample standard deviation is denoted by S

Notations
Capital letters (X, Y, Z, etc.) denote variables
Lower case letters (x1, x2, x3, etc.) denote observations for X, so a computed sample mean is written $\bar{x}$ and a computed sample standard deviation is written s
n = random sample size

Averages (Mean, Median, Mode)
Mean of x1, x2, ..., x300 for sample size n = 300:
T = total = $\sum x_i$, Mean = $\bar{x} = T / 300$

Mean (cont'd)
Formula for computing the weighted mean, with weights $w_i$ that sum to 1:
$\bar{x} = \sum w_i x_i$

Median & Mode
Median – has the property that 50% of the numbers in the data set are less, and 50% are greater in magnitude
– The median need not be a number in the data set
Mode – most frequently occurring number in the data set

Ranks and Percentiles
Ranks expressed as percentages are called percentiles. Percentiles of a data set are:
– 0th percentile
– 25th percentile (lower quartile)
– 50th percentile
– 75th percentile (upper quartile)
– 100th percentile

Box Plot
Five number summary:
– smallest value (0th percentile)
– lower quartile (25th percentile)
– median (50th percentile)
– upper quartile (75th percentile)
– largest value (100th percentile)
Box Plot is a graphical representation of the five number summary

Two Box Plots
[Figure: box plot of a data set with median 3 and average 4.35; X axis from 0 to 30.]
The distribution is skewed towards extreme values
[Figure: box plot of a data set with median 3 and average 3; X axis from 0 to 5.]
The distribution is symmetric around the average

Cumulative Distribution Curves
Plots data against their percentages
Easy to estimate any percentile value
[Figure: cumulative distribution curve of a data set symmetric around average 3; cumulative percent (0 to 100) plotted against X from 0.0 to 5.0.]

One More Cumulative Distribution Curve
[Figure: cumulative distribution curve of a skewed data set with median 3 and average 4.35; cumulative percent (0 to 100) plotted against X from 0 to 30.]

Standard Deviation and Coefficient of Variation
Sampling Variation or Sampling Error: extent to which repeated samples may differ from each other
Reliability of an estimate is measured by its sampling error
Standard deviation of a sample quantifies the extent to which its values vary from their mean

Notation and Formula for Standard Deviation
s = sample standard deviation
Formula: $s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$

An Empirical Rule
For bell-shaped (approximately normal) populations:
About 68% of the population is inside the one S.D. interval around the mean: $\mu \pm \sigma$
About 95% of the population is inside the two S.D. interval around the mean: $\mu \pm 2\sigma$
About 99.7% of the population is inside the three S.D. interval around the mean: $\mu \pm 3\sigma$

Coefficient of Variation
CV is a relative measure of variability; it is the standard deviation divided by the mean
Useful when the variation is better understood as a percentage
Population CV = $(\sigma / \mu) \times 100\%$, and sample CV = $(s / \bar{x}) \times 100\%$

Effects of Adding to or Rescaling Data
------------------------------------------------------------------------------------------
                                 Original    Add d           Multiply by c   Multiply then Add
------------------------------------------------------------------------------------------
Variable                         X           X + d           cX              cX + d
Average (median, percentiles)    $\bar{X}$   $\bar{X}$ + d   c$\bar{X}$      c$\bar{X}$ + d
Standard deviation               s           s               cs              cs
------------------------------------------------------------------------------------------
The standard deviation entries take c > 0; for negative c the standard deviation is |c|s.
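
The measures in this lecture can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the ten observations, the weights, and the constants c and d are all made-up illustrations, and the quartiles use the "inclusive" convention of the standard library's statistics.quantiles, which may differ slightly from the convention used in the course textbook.

```python
import math
import statistics

# Hypothetical sample of n = 10 observations (illustrative only).
x = [1, 2, 2, 3, 3, 3, 4, 4, 5, 13]
n = len(x)

# Averages: mean = T / n, median, mode.
total = sum(x)                      # T
mean = total / n                    # x-bar
median = statistics.median(x)
mode = statistics.mode(x)

# Weighted mean with hypothetical weights that sum to 1.
values = [80, 90]
weights = [0.25, 0.75]
weighted_mean = sum(w * v for w, v in zip(weights, values))

# Five number summary (quartile convention is an assumption, see lead-in).
q1, q2, q3 = statistics.quantiles(x, n=4, method="inclusive")
five_number = (min(x), q1, q2, q3, max(x))

# Sample standard deviation with the n - 1 divisor, and sample CV in percent.
s = math.sqrt(sum((xi - mean) ** 2 for xi in x) / (n - 1))
cv = s / mean * 100

# Effects of adding d and multiplying by c (hypothetical constants, c > 0).
c, d = 2.0, 10.0
shifted = [xi + d for xi in x]          # X + d
rescaled = [c * xi + d for xi in x]     # cX + d
s_shifted = statistics.stdev(shifted)   # expected to equal s
mean_rescaled = statistics.mean(rescaled)  # expected to equal c * mean + d
s_rescaled = statistics.stdev(rescaled)    # expected to equal c * s

print("mean, median, mode:", mean, median, mode)
print("weighted mean:", weighted_mean)
print("five number summary:", five_number)
print("s and CV (%):", round(s, 3), round(cv, 1))
print("rescaled mean and s:", mean_rescaled, round(s_rescaled, 3))
```

In this made-up sample the single large value pulls the mean above the median, the same pattern as the skewed box plot in the lecture. Adding d leaves the standard deviation unchanged, while multiplying by c scales both the mean and the standard deviation.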