MAT112 Sections 11.1 – 11.3 Grouped Data

11.1

Bar Graphs – use vertical and horizontal axes and either vertical or horizontal bars to represent data. (Used to give an easy visual comparison of data.)

Broken Line Graphs – constructed by connecting the midpoints of the tops of the bars of a bar graph. (Useful to emphasize change over time.) The line can be used to more accurately estimate future values.

Pie Chart – used to show percentages of several categories dividing a whole. The angle of each circle sector is proportional to the percentage that its category is of the whole circle (360°).

[Figures: bar graph and broken line graph, each titled "Hours of Sleep in One Night" (vertical axis: Hours; horizontal axis: Sydnie, Jack, Taylor, Brad); pie chart titled "Sydnie's Day" with two sectors, Asleep and Awake (the percent labels shown are 33% and 67%).]

Frequency Table – table used to show large groups of data and their frequency distribution.

Data Range – obtained by subtracting the smallest data value from the largest data value.

Class Interval – equal-size intervals that break up a data range into groups. To determine what class size to use, divide the data range by the number of intervals desired. Then pick a slightly larger number that is easy to use and keeps data values from falling on the border of two different intervals. Boundaries with an extra decimal place may also be used to help avoid this.

Class Frequency – the number of data values that are in the class interval.

Frequency Distribution – the set of all classes listed with their frequencies (usually shown in a frequency table).

Relative Frequency – the percent of data that falls in the class interval (the class frequency divided by the total number of data values), written as a decimal.

Probability Distribution – the set of relative frequencies, used when considering the probability associated with a class. The probability of a data value landing in a certain class is the relative frequency of that class.

Example: The number of calls from motorists per day for roadside service was recorded for the month of December 2003.
The results were as follows:

28 86 122 80 217 90 130 120 120 70 104 82 97 109 75 81 40 68 145 174 187 194 113 111 90 75 123 140 100 170 120

We can use this data to record how often data occurs in each class interval. The data range is 217 - 28 = 189. If we want 6 classes, we divide 189 by 6: 189/6 = 31.5. Next, find a slightly larger number to use as the class size, chosen so that no data value falls on the border between two intervals. A class size of 32 works, starting at a value of 25, since the smallest value is 28.

Class Interval   Tally          Frequency   Relative Frequency
25 - 57          II             2           0.065
57 - 89          IIII III       8           0.26
89 - 121         IIII IIII I    11          0.35
121 - 153        IIII           5           0.16
153 - 185        II             2           0.065
185 - 217        III            3           0.1
Total                           31          1.00

The relative frequency shows the percent of data that falls in the class interval, written as a decimal. Notice that the sum of all the relative frequencies is equal to 1. The tallies record how many data values fall within a certain class. (In the Tally column, each group written IIII stands for a bundle of five marks.)

Histogram – similar to a bar graph, but with no space between the bars. Class boundaries are along the x-axis and frequencies are on the y-axis. Histograms are used to create a visual representation of large groups of data.

Frequency Polygon – the broken line graph that is associated with a histogram. Remember that this uses the midpoint of each bar on the histogram.

[Figures: "Histogram" (horizontal ticks at 25, 57, 89, 121, 153, 185, 217) and "Frequency Polygon" (horizontal ticks at 9, 41, 73, 105, 137, 169, 201, 233); in both, the horizontal axis is labeled Class Boundaries and the vertical axis is labeled Frequency.]

11.2

(*Note: For Mean, Median and Mode of ungrouped data, see the handout titled Measures of Central Tendency & Dispersion.)

Grouped Data – a data set or frequency table that does not show each individual data entry. The data is only shown grouped into interval classes.

Mean for Grouped Data – the Greek letter $\mu$ ("mu") is used as the symbol for a population mean, and the symbol $\bar{x}$ ("x-bar") is used to represent the mean of a sample of data values.
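As a check on the frequency table in 11.1, the class frequencies and relative frequencies can be recomputed from the raw data. The following Python sketch is only an illustration: the function name `frequency_table` and its parameters are hypothetical, and it assumes that a value belongs to a class when it is at least the lower boundary and less than the upper boundary, with the last class also including its upper boundary (so that 217 is counted).

```python
# Daily call counts from the roadside-service example (31 values).
calls = [28, 86, 122, 80, 217, 90, 130, 120, 120, 70, 104, 82, 97, 109, 75,
         81, 40, 68, 145, 174, 187, 194, 113, 111, 90, 75, 123, 140, 100,
         170, 120]


def frequency_table(data, start, width, classes):
    """Return one (lower, upper, frequency, relative frequency) row per class.

    Assumption: lower <= value < upper, except that the last class
    also includes its upper boundary.
    """
    n = len(data)
    rows = []
    for k in range(classes):
        lower = start + k * width
        upper = lower + width
        is_last = k == classes - 1
        freq = sum(
            1 for v in data
            if lower <= v < upper or (is_last and v == upper)
        )
        rows.append((lower, upper, freq, freq / n))
    return rows


data_range = max(calls) - min(calls)   # largest value minus smallest value
table = frequency_table(calls, start=25, width=32, classes=6)

for lower, upper, freq, rel in table:
    print(f"{lower:>3} - {upper:<3}  {freq:>2}  {rel:.3f}")
```

The boundary rule only matters when a data value lands exactly on a class boundary, which the choice of class size 32 starting at 25 is meant to avoid for every boundary except the final one.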
To determine the mean of grouped data, take the midpoint $x_i$ of each class and multiply it by its class frequency $f_i$. Then add all of these products together and divide the sum by the total number of data values $n$.

$\bar{x} = \frac{\sum x_i f_i}{n}$

Example: For the data presented earlier:

Class Interval   Midpoint   Frequency   Product
25 - 57          41         2           82
57 - 89          73         8           584
89 - 121         105        11          1155
121 - 153        137        5           685
153 - 185        169        2           338
185 - 217        201        3           603
Totals                      31          3447

Mean = $\bar{x} = \frac{3447}{31} \approx 111.19$

Median for Grouped Data – the number in the middle of the data. Half of the area of the histogram should be on each side of this value. To determine the median:
1. Determine the area of the entire histogram. To do this, calculate the area of each rectangle (A = l · w) and add all the areas together to get the total area.
2. Divide the total area by two. Use this number to locate the class interval that contains the median by adding up the areas of the rectangles from left to right until the running total exceeds this value.
3. Set up an equation in which the areas of the rectangles to the left of the median add up to the value from step 2.
4. Solve for x. Use this value to determine the median.

Example: From the previously given data:
Note: The width of each rectangle is 32 since that is the width of each class interval. The length of each is the frequency of each interval.
1. Total area = 2(32) + 8(32) + 11(32) + 5(32) + 2(32) + 3(32) = 31(32) = 992
2. 992/2 = 496 (half the area). The first three areas total 64 + 256 + 352 = 672, which is greater than 496 (the first two total only 320), so the median is in the third interval: 89 – 121.
3. The area needs to equal 496, where x is the width of the part of the third rectangle to the left of the median: 2(32) + 8(32) + x(11) = 496
4. 320 + 11x = 496, so x = 16. Median = 89 + 16 = 105

[Figure: the first three histogram rectangles (lengths 2, 8 and 11; width 32; boundaries 25, 57, 89, 121), with the median marked at 89 + 16 = 105.]

11.3

(*Note: For Range, Variance and Standard Deviation of ungrouped data, see the handout titled Measures of Central Tendency & Dispersion.)

Range for Frequency Distribution – the difference between the upper boundary of the highest class and the lower boundary of the lowest class.

Standard Deviation for Grouped Data – uses the midpoint ($x_i$) of each class and the class's frequency ($f_i$) to determine the standard deviation.

Formulas:
Sample Standard Deviation, $s$: $s = \sqrt{\frac{\sum (x_i - \bar{x})^2 f_i}{n - 1}}$
Population Standard Deviation, $\sigma$: $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2 f_i}{n}}$

1. Find the mean ($\bar{x}$, or $\mu$ for a population) of the grouped data. (Described previously in this handout.)
2. Take the midpoint ($x_i$) of each class and subtract the mean from each $x_i$.
3. Square each value from Step 2.
4. Multiply each number from Step 3 by its class interval's frequency ($f_i$).
5. Add all of the results from Step 4 together.
6. Divide this total by the total number of data points (n) for a population standard deviation, or by the total number minus one (n - 1) for a sample standard deviation.
7. The standard deviation is the square root of this number.

Example: Using the data given previously, find the population standard deviation.

1. The mean ($\mu$) was approximately 111.

Midpoint of Interval ($x_i$)   Deviation ($x_i - \mu$)   Values squared     Squared value × frequency
41                             41 - 111 = -70            (-70)^2 = 4900     4900·2 = 9800
73                             73 - 111 = -38            (-38)^2 = 1444     1444·8 = 11552
105                            105 - 111 = -6            (-6)^2 = 36        36·11 = 396
137                            137 - 111 = 26            (26)^2 = 676       676·5 = 3380
169                            169 - 111 = 58            (58)^2 = 3364      3364·2 = 6728
201                            201 - 111 = 90            (90)^2 = 8100      8100·3 = 24300
Total                                                                       56156

6 & 7. $\sigma = \sqrt{\frac{56156}{31}} \approx \sqrt{1811.48} \approx 42.56$

Therefore, 42.56 is the population standard deviation for this grouped data.
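
The grouped mean, median and standard deviation examples in this handout can be reproduced with a short Python sketch. This is only an illustration under stated assumptions: the function names `grouped_mean`, `grouped_median` and `grouped_std` are hypothetical, the median uses the histogram-area method described in 11.2, and the standard deviation is evaluated with the rounded mean of 111 to match the worked example.

```python
import math

# Class boundaries and frequencies from the frequency table in 11.1.
boundaries = [25, 57, 89, 121, 153, 185, 217]
freqs = [2, 8, 11, 5, 2, 3]

# Midpoint of each class: 41, 73, 105, 137, 169, 201.
midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]


def grouped_mean(midpoints, freqs):
    """Sum of midpoint * frequency, divided by the number of data values."""
    return sum(x * f for x, f in zip(midpoints, freqs)) / sum(freqs)


def grouped_median(boundaries, freqs):
    """Value that splits the total histogram area in half."""
    widths = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]
    half_area = sum(f * w for f, w in zip(freqs, widths)) / 2
    area_so_far = 0
    for lo, f, w in zip(boundaries, freqs, widths):
        if area_so_far + f * w >= half_area:
            # Solve area_so_far + f * x = half_area for x,
            # then measure x from the lower boundary of this class.
            return lo + (half_area - area_so_far) / f
        area_so_far += f * w
    raise ValueError("no data")


def grouped_std(midpoints, freqs, mean, sample=False):
    """Grouped standard deviation: divide by n (population) or n - 1 (sample)."""
    n = sum(freqs)
    total = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs))
    return math.sqrt(total / (n - 1 if sample else n))


mean = grouped_mean(midpoints, freqs)
median = grouped_median(boundaries, freqs)
sigma = grouped_std(midpoints, freqs, mean=111)          # rounded mean, as in the example
s = grouped_std(midpoints, freqs, mean=111, sample=True)

print(f"mean = {mean:.2f}, median = {median:.0f}, sigma = {sigma:.2f}")
```

Passing the unrounded mean to `grouped_std` instead of 111 changes the population result only beyond the second decimal place for this data.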