MEASURES OF CENTRAL TENDENCY

A NOTATION FOR ADDING A PATTERNED SEQUENCE

$\sum$ is the SUMMATION SYMBOL (uppercase Greek letter "sigma"). It tells us to: "compute all the values of the expression next to it, then add!" For example,

$\sum X = X_1 + X_2 + X_3 + X_4 + \cdots$ (all values of X)

EXAMPLE. Suppose we have the following values for the variables X and Y.

X : 12   4  10   6   8   8
Y :  5   9   9   8  10   8

A. $\sum Y = 5 + 9 + 9 + 8 + 10 + 8 = 49$

B. $\sum XY = (12)(5) + (4)(9) + (10)(9) + (6)(8) + (8)(10) + (8)(8) = 378$

C. $\sum XY^2 = (12)(5^2) + (4)(9^2) + (10)(9^2) + (6)(8^2) + (8)(10^2) + (8)(8^2) = 3130$

D. $\sum X^2Y^3 = (12^2)(5^3) + (4^2)(9^3) + (10^2)(9^3) + (6^2)(8^3) + (8^2)(10^3) + (8^2)(8^3) = 217764$

THE MODE OF A DATA SET

TAKE NOTE! IMPORTANT REMINDER! The formulas given here apply to sample data only. From here onwards, the data we will be dealing with are (mostly) sample data, and the formulas are sample measurements. As for population data and formulas, our goal in Stat is actually to provide good estimates of population measurements using the sample data.

If a data set has certain values occurring more than once, the most frequently occurring among them is called the MODE. If the data set has only one such value, it is described as UNIMODAL; if it has two, BIMODAL; and if it has more than two, MULTIMODAL. A data set may have no modal value (i.e., if each data value is distinct). And it's fine!

EXAMPLE. Find the mode.
X : 2.3  4.8  9.0  5.4  6.8  3.5  4.8  5.4  7.2
MODE = 5.4, 4.8 (The data set X is bimodal)

EXAMPLE. Find the mode.
Y : 4.2  1.4  3.8  4.7  6.3  5.5  7.8  6.5  9.2  8.3
MODE = N.A. (The data set Y has no mode)

THE MEDIAN OF A DATA SET

The MEDIAN of a data set is the data value in the middle position when the data set is arranged in order. The MEDIAN divides the data set into two parts of equal size.
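As a brief aside before locating the median, the summation and mode examples above can be checked with a short Python sketch. The function name `modes` and the variable names are illustrative choices, not part of the original notes, and returning an empty list for "no mode" is an assumption made here to mirror the "N.A." answer.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s), or [] when no value repeats."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # Every value is distinct, so the data set has no mode.
        return []
    return sorted(v for v, c in counts.items() if c == top)

# Data from the summation example.
X = [12, 4, 10, 6, 8, 8]
Y = [5, 9, 9, 8, 10, 8]

sum_y = sum(Y)                                     # 49
sum_xy = sum(x * y for x, y in zip(X, Y))          # 378
sum_xy2 = sum(x * y**2 for x, y in zip(X, Y))      # 3130
sum_x2y3 = sum(x**2 * y**3 for x, y in zip(X, Y))  # 217764

# Data from the two mode examples.
print(modes([2.3, 4.8, 9.0, 5.4, 6.8, 3.5, 4.8, 5.4, 7.2]))       # bimodal
print(modes([4.2, 1.4, 3.8, 4.7, 6.3, 5.5, 7.8, 6.5, 9.2, 8.3]))  # no mode
```

Each sum is built term by term from the paired values, which is exactly what the summation symbol asks for: compute the expression for every pair, then add.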
For a data set with N elements, the location of the MEDIAN is given by the following formula:

MEDIAN LOCATION $= \frac{N + 1}{2}$

EXAMPLE. Find the median.
X : 2.3  4.8  9.0  5.4  6.8  3.5  4.8  5.4  7.2
Arrange: X : 2.3  3.5  4.8  4.8  5.4  5.4  6.8  7.2  9.0
Compute: MEDIAN LOCATION = (9+1)/2 = 5th
Locate: MEDIAN = 5.4

TAKE NOTE! If the calculated median location is not a whole number (ends in .5), the median is the midpoint of the two values on either side of the calculated location.

EXAMPLE. Find the median.
X : 4.2  1.4  3.8  4.7  6.3  5.5  7.8  6.5  9.2  8.3
Arrange: X : 1.4  3.8  4.2  4.7  5.5  6.3  6.5  7.8  8.3  9.2
Compute: MEDIAN LOCATION = (10+1)/2 = 5.5th
Locate: MEDIAN = (5.5+6.3)/2 = 5.9

THE MEAN OF A DATA SET

The MEAN of a data set is the value that you can expect to find at the center, based on the pattern of values in the data set. (Hence, it is also called the EXPECTED VALUE.) The formula for the mean of a data set is:

MEAN $\bar{X} = \frac{\sum X}{N}$

where N is the number of data values in the set and the X's are the individual data values.

EXAMPLE. Find the mean for the sample data.
X : 2.3  4.8  9.0  5.4  6.8  3.5  4.8  5.4  7.2

MEAN $\bar{X} = \frac{\sum X}{N} = \frac{2.3 + 4.8 + 9.0 + 5.4 + 6.8 + 3.5 + 4.8 + 5.4 + 7.2}{9} = 5.4666\ldots \approx 5.47$

(Always round off results to two decimal places!)

EXAMPLE. Find the mean for the sample data.
X : 4.2  1.4  3.8  4.7  6.3  5.5  7.8  6.5  9.2  8.3

MEAN $\bar{X} = \frac{\sum X}{N} = \frac{4.2 + 1.4 + 3.8 + 4.7 + 6.3 + 5.5 + 7.8 + 6.5 + 9.2 + 8.3}{10} = 5.77$

MEASURES OF VARIABILITY

EXAMPLE. Which of the following data sets has values that are more "scattered" (or "far apart")?
Data Set #3: 10 12 13 13 15 17 19 19 20
Data Set #4: 10 15 19 23 27 32 36 40 45
By simple observation, Data Set #4 is more "scattered".

EXAMPLE. Which of the following data sets has values that are more "scattered" (or "far apart")?
Data Set #1: 2.3 4.8 9.0 5.4 6.8 3.5 4.8 5.4 7.2
Data Set #2: 4.2 1.4 3.8 4.7 6.3 5.5 7.8 6.5 9.2 8.3
I don't know! I can't decide anymore! When simple observation is not enough, we need a numerical measure of scatter.
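Before moving on to variability, the median-location rule and the mean formula above can be sketched in Python. This is a minimal illustration that assumes a non-empty list of numbers; the function names `median` and `mean` are chosen here for readability and do not come from the notes.

```python
def median(values):
    """Median via the (N + 1) / 2 location rule; assumes a non-empty list."""
    ordered = sorted(values)          # Arrange
    n = len(ordered)
    location = (n + 1) / 2            # Compute (1-based position)
    lower = ordered[int(location) - 1]
    if location == int(location):
        # Whole-number location: the median is the value at that position.
        return lower
    # Location ends in .5: take the midpoint of the two neighbouring values.
    upper = ordered[int(location)]
    return (lower + upper) / 2

def mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)

set_1 = [2.3, 4.8, 9.0, 5.4, 6.8, 3.5, 4.8, 5.4, 7.2]
set_2 = [4.2, 1.4, 3.8, 4.7, 6.3, 5.5, 7.8, 6.5, 9.2, 8.3]

print(median(set_1), round(mean(set_1), 2))
print(round(median(set_2), 2), round(mean(set_2), 2))
```

Rounding happens only when printing, so the two-decimal rule is applied to the final result rather than to intermediate sums.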
THE VARIANCE AND STANDARD DEVIATION OF A DATA SET

The VARIANCE and STANDARD DEVIATION of a data set measure how scattered the data values are around the mean value. The formulas for the variance and standard deviation of a data set are:

VARIANCE $s^2 = \frac{N\left(\sum X^2\right) - \left(\sum X\right)^2}{N(N - 1)}$

STANDARD DEVIATION $s = \sqrt{\text{VARIANCE}}$

EXAMPLE. Find the variance and standard deviation of the following sample data.
X : 2.3  4.8  9.0  5.4  6.8  3.5  4.8  5.4  7.2

   X       X^2
  2.3      5.29
  4.8     23.04
  9.0     81.00
  5.4     29.16
  6.8     46.24
  3.5     12.25
  4.8     23.04
  5.4     29.16
  7.2     51.84
 ----    ------
 49.2    301.02

VARIANCE $s^2 = \frac{N\left(\sum X^2\right) - \left(\sum X\right)^2}{N(N - 1)} = \frac{9(301.02) - (49.2)^2}{9(9 - 1)} = \frac{288.54}{72} \approx 4.01$

STANDARD DEVIATION $s = \sqrt{4.01} \approx 2.00$

EXAMPLE. Find the variance and standard deviation of the following sample data.
X : 4.2  1.4  3.8  4.7  6.3  5.5  7.8  6.5  9.2  8.3

   X       X^2
  4.2     17.64
  1.4      1.96
  3.8     14.44
  4.7     22.09
  6.3     39.69
  5.5     30.25
  7.8     60.84
  6.5     42.25
  9.2     84.64
  8.3     68.89
 ----    ------
 57.7    382.69

VARIANCE $s^2 = \frac{N\left(\sum X^2\right) - \left(\sum X\right)^2}{N(N - 1)} = \frac{10(382.69) - (57.7)^2}{10(10 - 1)} = \frac{497.61}{90} \approx 5.53$

STANDARD DEVIATION $s = \sqrt{5.53} \approx 2.35$

MEASURES OF RELATIVE POSITION

THE Z-SCORE OR THE STANDARD SCORE

The Z-SCORE or STANDARD SCORE of a data value measures how far this data value is from the MEAN of the data set, in units of the STANDARD DEVIATION of the data set.

Z-SCORE $Z = \frac{X - \bar{X}}{s}$

Given the data set: 49.5, 57, 59, ... with mean $\bar{X} = 52$ and standard deviation s = 2.5

[Figure: number line centered at the mean 52 and marked off in steps of s = 2.5, showing the data values 49.5 and 57]

Data value 49.5 is one s.d. away to the left of the mean. So Z = -1.
Data value 57 is two s.d.'s away to the right of the mean. So Z = +2.

EXAMPLE. We have the following sample data.
X : 2.3  4.8  9.0  5.4  6.8  3.5  4.8  5.4  7.2
We have also calculated: MEAN $\bar{X} = 5.47$ and STANDARD DEVIATION s = 2.00

X = 2.3: $Z = \frac{2.3 - 5.47}{2.00} = -1.585$
X = 5.4: $Z = \frac{5.4 - 5.47}{2.00} = -0.035$
X = 9.0: $Z = \frac{9.0 - 5.47}{2.00} = 1.765$

TAKE NOTE!
If z<0 (z is negative), it means that the data value lies on the left side of the MEAN.
If z>0 (z is positive), it means that the data value lies on the right side of the MEAN.
If z=0 (z is zero), it means that the data value is equal to the MEAN.

ALSO! If we convert an entire data set into z-scores, we will come up with a new data set (typically small values, mostly between -3 and 3) with MEAN = 0 and STANDARD DEVIATION = 1.

RANKING DATA VALUES BY PERCENTILES
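Looking back at the variance, standard deviation, and z-score formulas in the preceding sections, the following Python sketch applies them to the two sample data sets. The function names are illustrative assumptions, and the z-scores here are computed from the unrounded mean and standard deviation, so they can differ in the last decimal place from values worked out by hand with rounded inputs.

```python
import math

def sample_variance(values):
    """Computational formula: [N * sum(X^2) - (sum X)^2] / [N * (N - 1)]."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

def sample_std(values):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(sample_variance(values))

def z_scores(values):
    """Convert every data value to Z = (X - mean) / s."""
    m = sum(values) / len(values)
    s = sample_std(values)
    return [(x - m) / s for x in values]

set_1 = [2.3, 4.8, 9.0, 5.4, 6.8, 3.5, 4.8, 5.4, 7.2]
set_2 = [4.2, 1.4, 3.8, 4.7, 6.3, 5.5, 7.8, 6.5, 9.2, 8.3]

for data in (set_1, set_2):
    print(round(sample_variance(data), 2), round(sample_std(data), 2))

# Standardizing a whole data set gives mean 0 and standard deviation 1
# (up to floating-point error).
z = z_scores(set_1)
print(abs(sum(z) / len(z)) < 1e-9, abs(sample_std(z) - 1) < 1e-9)
```

The last two checks illustrate the "ALSO!" remark: once every value is converted to a z-score, the new data set is centered at 0 with a spread of 1.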