Download Chapter 12: Discrete Math: Statistics

Chapter Twelve: Discrete Mathematics: Statistics

Section One: Measures of Central Tendency

We can describe a set of numbers by using a measure of its center. There are three measures of central tendency.

Mean – The arithmetic average. Add all the numbers and divide by how many numbers there are. We denote the mean with \(\bar{x}\).
Median – The middle number. Place the numbers in order and find the middle number. If the set has an even number of values, the median is the average of the two middle numbers.
Mode – The number that occurs most often. A set can have more than one mode, or no mode at all.

EX1: Find the measures of central tendency of the following set of numbers: 25, 33, 26, 30, 28, 28, 29.

EX2: Last month, 6 houses were sold in Centerville. The sale prices were as follows: $140,000, $165,000, $150,000, $300,000, $146,000, and $125,000. Find the mean, median, and mode of the sale prices. Explain which measure best represents the typical sale price of a house in Centerville last month.

Sometimes we need a way to keep track of how often each value occurs in our data. A frequency table is a good way to do this.

EX3: Complete the frequency table for the data below on the number of books read by 30 students last month. Then use the frequency table to find the mean.

4, 0, 1, 2, 6, 0, 1, 2, 0, 4, 0, 2, 2, 1, 1, 3, 1, 0, 3, 2, 0, 1, 2, 2, 4, 1, 1, 1, 0, 4

Books   Tally   Frequency
0
1
2
3
4
5
6
Total

When there are many different values, we can use a grouped frequency table. In a grouped frequency table, the values are grouped into classes that contain an equal range of values. We can only estimate the mean from a grouped frequency table. We do so by using the midpoint (the average of the endpoints) of each class to represent every value in that class, weighted by the class frequency.

EX4: The grouped frequency table lists the numbers of CDs bought by 50 students last year. Estimate the mean number of CDs bought by these students last year.

# of CDs   Frequency
0          4
1-5        14
6-10       9
11-15      10
16-20      8
21-25      1
26-30      2
31-35      1
36-40      1

Section Two: Stem-and-Leaf Plots, Histograms, and Circle Graphs

A stem-and-leaf plot is a way to arrange a set of data.
We can see how spread out the data is by looking at the plot. The stems and leaves can be arranged in many different ways: tens and ones, whole and decimal parts, etc. Each stem-and-leaf plot should have a key explaining the nature of the stems and leaves. Below is an example of a stem-and-leaf plot for the following data.

32 32 34 43 38 12 17 53 22 51 27 34 43 44 44 8 30 30 31 40 34 37 38 38 78 50 26 54 28 29

Stem | Leaf
0    | 8
1    | 2, 7
2    | 2, 6, 7, 8, 9
3    | 0, 0, 1, 2, 2, 4, 4, 4, 7, 8, 8, 8
4    | 0, 3, 3, 4, 4
5    | 0, 1, 3, 4
6    |
7    | 8

Key: 1 | 2 = 12

EX1: A bakery collected the following data about the number of loaves of fresh bread sold on each of 24 business days. Make a stem-and-leaf plot of the data. Find the median and mode(s) of the data. How can the owner use the stem-and-leaf plot to plan how much bread to make?

53 49 27 48 60 52 44 38 47 52 82 46 55 31 39 54 51 47 50 45 50 61 43 64

A histogram is a bar graph that gives the frequency of different ranges of data. The horizontal axis is divided into equal ranges, and the height of each bar represents that range's frequency.

EX2: Make a frequency table and a histogram of the following set of numbers:

0.2, 1.3, 5.4, 4.3, 2.2, 4.3, 4.6, 3.5, 5.1, 4.8, 1.5, 3.7, 5.4, 4.0, 4.2, 5.2

A relative frequency table shows how often a category occurs relative to the entire set of data. The relative frequency can be written as a percentage or a decimal. We can take this data and create a relative frequency histogram.

EX3: Make a relative frequency table and a relative frequency histogram of the following data.

0.2, 1.3, 5.4, 4.3, 2.2, 4.3, 4.6, 3.5, 5.1, 4.8, 1.5, 3.7, 5.4, 4.0, 4.2, 5.2

A circle graph is another way of showing relative frequency. Each sector of the circle represents a percentage of the whole set of data. Keep in mind that there are 360 degrees in a circle, so the central angle of each sector is that same percentage of 360 degrees. For example, a category that makes up 25% of the data gets a sector of 90 degrees.

EX4: The table shows the distribution by region of the resident population of the United States in 1996. Make a circle graph of the data. Find the probability that a randomly chosen resident of the United States in 1996 was not a resident of the South.

Region      Population (millions)
Northeast   51.6
Midwest     62.1
South       93.1
West        58.5

Section Three: Box-and-Whisker Plots

A box-and-whisker plot is a way to display data to see how it is dispersed. It divides the data into four groups that each contain about 25% of the data; the values that separate the groups are called quartiles. We know that the median of a set of data divides the data into two equal groups. If we then find the median of each of these two groups, we have found the first quartile (lower quartile) and the third quartile (upper quartile). The median and these two quartiles, along with the maximum and minimum values, make up the five-point summary of the data set that we use to create a box-and-whisker plot.

As we have learned, the distance from the minimum to the maximum is the range of the data. The distance between the first and third quartiles is known as the interquartile range (IQR).

Sometimes values are so extreme that they could skew our averages of a set of data. We call these points outliers because they lie far from the other points of the data. We determine whether a point is an outlier by using the IQR. A value more than 1.5 times the IQR below the first quartile, or more than 1.5 times the IQR above the third quartile, is an outlier.

EX1: The data shows the ages of 43 Florida Marlins baseball players at the 1997 World Series. Find the quartiles, range, and interquartile range for the data. Create a box-and-whisker plot. Identify any outliers.

29 25 31 19 34 22 32 34 20 31 35 27 35 22 38 28 25 26 21 22 27 26 21 32 25 25 23 35 32 27 28 35 25 30 22 23 38 25 36 30 26 34 26

EX2: The mean monthly temperatures for L.A. and Chicago are listed in the table. Construct a box-and-whisker plot for each city and compare them.
Monthly Mean Temperatures (°F)

Month   Los Angeles   Chicago
Jan     55.9          21.0
Feb     57.0          25.5
Mar     58.3          37.0
Apr     60.8          48.6
May     63.3          58.8
Jun     66.7          68.5
Jul     70.9          73.0
Aug     71.8          71.6
Sep     70.5          64.4
Oct     66.6          52.7
Nov     62.1          39.9
Dec     57.6          26.6

Section Four: Measures of Dispersion

The range and interquartile range that we studied in previous lessons are two examples of measures of dispersion. They measure how spread out the data in the set is. However, neither of these is very reliable, because each depends on only two values of the set. A better way of measuring dispersion is with the mean deviation. It gives the average distance between each number in the set and the mean of the set.

Mean deviation = average of the absolute differences between the mean and each data point, or

\[ \text{mean deviation} = \frac{1}{n} \sum_{i=1}^{n} \lvert x_i - \bar{x} \rvert \]

EX1: The table lists a student's test scores in two subjects this year. Find the range and the mean deviation for the test scores. Then describe what these measures indicate about each set of test scores.

History   Math
85        82
91        92
96        100
85        77
93        84

Two more measures of dispersion are variance and standard deviation.

\[ \text{variance} = \sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]

\[ \text{standard deviation} = \sigma = \sqrt{\sigma^2} \]

EX2: Find the variance and standard deviation for the test scores in EX1.

EX3: In a survey, 30 people were asked to rank a new soda on a scale of 1 to 10. The results are shown in the table. Use a graphing calculator to find the variance and standard deviation for the set of data.

5  7  10  8  7  7  8  8  10  9
9  8  7   9  8  6  9  9  8   7
8  7  6   10 7  10 8  8  10  9

Section Five: Binomial Distribution

A binomial experiment must meet the following criteria:
1. Each trial of the experiment has only two possible outcomes: success or failure.
2. The trials are independent of one another, and the probabilities of success and failure add up to one.
To find the probability of a given number of successes in a binomial experiment, we can use the following method:

P = (number of combinations) × (success prob)^(number of successes) × (failure prob)^(number of failures)

In symbols, with n trials, r successes, success probability p, and failure probability q = 1 - p:

\[ P = {}_{n}C_{r} \, p^{r} \, q^{\,n-r} \]

EX1: Suppose that the probability that a VCR manufactured in a certain factory is defective is 2%. What is the probability that 2 VCRs are defective in a shipment of 12 VCRs from this factory?

As discussed in previous lessons, to find the probability of an "or" statement involving mutually exclusive events, we can simply add the probabilities of the individual events.

EX2: A landscaping plan specifies that 10 trees of a certain type are to be planted in front of a building. When this type of tree is planted in autumn, the probability that it will survive the winter is 85%. What is the probability that no fewer than 8 of the 10 trees will survive the winter if planted in the autumn?

EX3: Create a relative frequency histogram of the scenario from EX2 displaying the probability that 0, 1, 2, 3, …, 9, 10 trees will live.

Section Six: Normal Distributions

In a normal distribution, data varies randomly from the mean, creating a bell-shaped pattern that is symmetric about the mean when graphed. If we connect the tops of the bars in a histogram of such data, the result approximates a normal curve. We define a normal curve by its mean and standard deviation. Because of its shape, we sometimes call the distribution a bell curve. Normal distributions model many real-life scenarios such as IQ scores, adult heights, and baby birth weights.

A normal curve with a mean of 0 and a standard deviation of 1 is called a standard normal curve. Since a normal curve is derived from a relative frequency histogram, we can find different probabilities by finding areas under the standard normal curve. The total area under the curve is 1 (100% probability). The table below gives the area under the curve between the mean and a given number of standard deviations (SD) from the mean.
SD    Area
0     0.0000
0.2   0.0793
0.4   0.1554
0.6   0.2257
0.8   0.2881
1.0   0.3413
1.2   0.3849
1.4   0.4192
1.6   0.4452
1.8   0.4641
2.0   0.4772

A summary of the properties of normal distributions is shown in the chart below.

EX1: Approximate each probability by using the area table for a standard normal curve. (Each ? marks a relation symbol that is missing from the source.)
a. P(x ? 1.2)
b. P(0.4 ? x ? 1.8)

EX2: On a cereal box, the weight is listed as 12 ounces. Among all boxes of cereal of this type, however, the actual weights approximate a normal distribution with a mean of 12 ounces and a standard deviation of 0.1 ounce. What is the probability that the contents of a randomly selected box of cereal of this type are within 0.1 ounce of 12 ounces? How many boxes of cereal out of 5000 can the manufacturer expect to be within this range?

A z-score tells how far a value is from the data's mean in terms of standard deviations. For example, assume a set of data has a mean of 50 and a standard deviation of 10. A data value of 70 would have a z-score of 2, since it is 2 SDs above the mean. A data value of 20 would have a z-score of -3, since it is 3 SDs below the mean. We can find a z-score with the formula

\[ z = \frac{x - \bar{x}}{\sigma} \]

EX3: A manufacturer finds that the life expectancy of its best-selling model of clothes dryer is 11 years and the standard deviation is 1.5 years. Find the z-scores for 10 years and 12 years. What is the probability that this model dryer will last from 10 years to 12 years?
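
The z-score formula and the area table from this section can be combined in a few lines of Python. The following is only a minimal sketch, not part of the original handout: the function names (`z_score`, `signed_area`, `area_between`) are made up for illustration, the dictionary simply copies the area table above, and the lookup only works for the SD values that table lists. It relies on the symmetry of the normal curve about the mean.

```python
# Areas under the standard normal curve between the mean and z standard
# deviations, copied from the area table in this section.
AREA_FROM_MEAN = {
    0.0: 0.0000, 0.2: 0.0793, 0.4: 0.1554, 0.6: 0.2257, 0.8: 0.2881,
    1.0: 0.3413, 1.2: 0.3849, 1.4: 0.4192, 1.6: 0.4452, 1.8: 0.4641,
    2.0: 0.4772,
}


def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


def signed_area(z):
    """Table area for |z|, counted as negative when z is below the mean."""
    key = round(abs(z), 1)
    # Only accept z values that actually appear in the table (hypothetical
    # restriction of this sketch; a full z-table would cover more values).
    if key not in AREA_FROM_MEAN or abs(key - abs(z)) > 1e-9:
        raise ValueError(f"z = {z} is not listed in the area table")
    area = AREA_FROM_MEAN[key]
    return area if z >= 0 else -area


def area_between(z_low, z_high):
    """Approximate P(z_low <= z <= z_high) using the table and symmetry."""
    return signed_area(z_high) - signed_area(z_low)


# The text's example: mean 50, standard deviation 10.
print(z_score(70, 50, 10))                # 2.0
print(z_score(20, 50, 10))                # -3.0
# Probability of falling within 2 standard deviations of the mean.
print(round(area_between(-2.0, 2.0), 4))  # 0.9544
```

For a value that is not normally distributed around 0 with SD 1, first convert each endpoint with `z_score`, then pass the two z-scores to `area_between`. A z-score that the table does not list, such as -3, raises an error in this sketch instead of guessing an area.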
