Download Part 1: The Bell Curve

Part 1: The Bell Curve

Clarification of Symbols: When reading different articles or texts, you may see different symbols used to represent the same quantities. You may see x̄ or μ used for the mean and s or σ used for the standard deviation*. The reason for the different symbols is that x̄ and s are used when dealing with samples, and μ and σ are used when dealing with entire populations. Don't worry too much right now about the distinction, because it will be discussed later.

*Standard deviation: This term comes up this week and will be discussed in great detail throughout the rest of the course.

Read about the Bell Curve and the Empirical Rule at the site below.

http://www.oswego.edu/~srp/stats/6895997.html

The key points of this section are:

1) The normal bell curve is symmetrical; that is, the left side is a mirror image of the right side.
2) The data in the normal bell curve are distributed according to a definite pattern.
3) The distribution of additional data collected from the same population is predictable.

The other key word here is normal. What is normal? In terms of data, if you collect enough data randomly from a population, the data will fall into a predictable distribution. This is the key to solving problems based on mathematical models. When data is not distributed according to a definite pattern, it is a signal that something is wrong or different with one or several data values. When you look at the next section on standard deviations, you will see how the data is distributed under normal circumstances.

I have included two links to web sites involving the bell curve. Many web sites involving standard deviations deal with psychology and social issues. The two sites I have given are closely related to what you are studying now.
1) Description of the Bell Curve, http://wwwstat.stanford.edu/~naras/jsm/NormalDensity/NormalDensity.html
At this site, read about the normal distribution, the standard deviation, and the empirical rule, and observe the bell curve. Make sure to read the examples at the bottom of the page. A formula for normal density is given and may be a little hard to read. Don't worry about the formula, because we will not be using it at this time.

2) Application of the Bell Curve, http://www.tsoft.com/~deano/articles/BellCurve.html
This is a site that deals with basketball statistics and shows a few applications of the bell curve.

Part 2: Dividing the bell curve into standard deviations

In the previous lesson you learned that the normal bell curve is a predictable distribution of data. How is that data distributed? To make life a little easier, statisticians use a measure called the standard deviation (s for samples and σ for populations). The standard deviation is a calculated distance from the mean. We will actually calculate this value next week. For right now it will be given, and you just have to use it in a formula.

One Standard Deviation (1S)

Since the normal bell curve is symmetrical, you can go both to the right of the mean and to the left of the mean on the curve. Lines are drawn on the curve at equal distances from the mean, and the distance from one line to the next is one standard deviation. If you go one line to the right of the mean, this is plus one standard deviation. If you go one line to the left of the mean, this is minus one standard deviation. The whole area from minus one standard deviation to plus one standard deviation contains 68% of the data. In other words, 68% of the data lies within one standard deviation of the mean.

Two Standard Deviations (2S)

The next two areas, between one and two standard deviations from the mean, contain 27% (13.5% on each side) of the data. This area is called the second standard deviation. If you combine the first and second standard deviations, the total data contained in these areas is 95%.
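The 68% and 95% figures above are rounded. As a quick check, here is a minimal Python sketch, assuming a perfectly normal distribution. The function name fraction_within is made up for this illustration. It relies on the fact that the share of a normal distribution lying within k standard deviations of the mean equals erf(k / √2), where erf is the error function from Python's standard math module.

```python
import math

def fraction_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

# Check the first and second standard deviations.
for k in (1, 2):
    print(f"within {k} standard deviation(s) of the mean: {fraction_within(k):.1%}")
```

Running it prints values close to 68.3% and 95.4%, which this lesson rounds to 68% and 95%. The same function can be applied to the third standard deviation discussed next.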
Three Standard Deviations (3S)

The third standard deviation contains 4.7% (2.35% on each side) of the data. If you combine the data contained in all three standard deviations, it totals 99.7%. This means that only 0.3% of the data will be outside the third standard deviation. While this may seem to be pretty good, companies are looking at 6 standard deviations. Why? Because if they consider 3 standard deviations to be good enough, that means 3 out of 1,000 items could be bad, 30 out of 10,000 could be bad, 300 out of 100,000 could be bad, and 3,000 out of 1,000,000 could be bad. In a time of large-scale production, that is just not good enough. However, in order to keep things simple, you will only have to deal with 3 standard deviations.

The exercise that you will do today consists of two parts.

Exercise Part 1: Given the mean (μ) and standard deviation (σ), calculate where the three standard deviation lines would appear on a bell curve.

Example: Given that μ = 15 and σ = 1.2
The middle line is at 15, the mean
The first line above the mean is at μ + 1σ = 15 + 1.2 = 16.2
The second line above the mean is at μ + 2σ = 15 + 2.4 = 17.4
The third line above the mean is at μ + 3σ = 15 + 3.6 = 18.6
The first line below the mean is at μ - 1σ = 15 - 1.2 = 13.8
The second line below the mean is at μ - 2σ = 15 - 2.4 = 12.6
The third line below the mean is at μ - 3σ = 15 - 3.6 = 11.4

1) μ = 5.8 and σ = 0.3
2) μ = 9.24 and σ = 1.32
3) μ = 62.3 and σ = 0.4
4) μ = 101.24 and σ = 2.51
5) μ = 34 and σ = 0.5
6) μ = 20 and σ = 1

Exercise Part 2: Given the number of data values, calculate the number of data values that will fall within one, two, and three standard deviations of the mean. Assume the data is distributed normally.
Example: For 200 data values
Within μ ± 1σ, 200 x 68% = 136 of the 200 values will lie
Within μ ± 2σ, 200 x 95% = 190 of the 200 values will lie
Within μ ± 3σ, 200 x 99.7% = 199.4, or about 199 of the 200 values will lie

1) 35 values
2) 280 values
3) 88 values
4) 570 values
5) 2009 values
6) 10964 values

Part 3: The Z-score

Once you have information about the mean and the standard deviation, you have a pretty good idea about where individual data values may lie on the bell curve. A calculated value that gives you a more exact idea of where a data value is on the curve is the z-score, sometimes referred to as the z-value. The z-score measures the exact number of standard deviations between the data value and the mean. The formula is listed below:

z-score = (y - μ) / σ

Where:
y = the data value
μ = the mean
σ = the standard deviation

If you take a good look at this formula, you should be aware of 3 things:

1. The mean has a z-score of 0.
2. A data value less than the mean has a negative z-score.
3. A data value greater than the mean has a positive z-score.

To see an example of a z-score calculation go to: http://www.stat.tamu.edu/stat30x/notes/node34.html

Assignment: This is a real short one. Calculate the z-score for each.

1) μ = 5.8, σ = 0.3 and y = 3.1
2) μ = 9.24, σ = 1.32 and y = 10.98
3) μ = 62.3, σ = 0.4 and y = 62.7
4) μ = 101.24, σ = 2.51 and y = 98.24
5) μ = 34, σ = 0.5 and y = 32.8
6) μ = 20, σ = 1 and y = 22.3

Part 4: Assessment

1) Multiple Choice: What information is not available in a box-and-whisker plot?
a) Minimum value
b) Mean
c) Median
d) Maximum value

2) Multiple Choice: Which of the following values can be calculated from the information given on a line plot?
a) Mean
b) Median
c) Mode
d) All of the above

3) Multiple Choice: For the data set {10, 12, 13, 13, 15, 16, 16, 16, 20, 21} the interquartile range is:
a) 3
b) 11
c) 15.5
d) Does not exist

4) Multiple Choice: Consider the data set {20, 25, 27, 30, 33, 36}. If the 20 was changed to a 17, which of the following could be done so that the mean doesn't change?
a) Change the 25 to 22
b) Change the 27 to 30
c) Change the 33 to 30
d) Change the 36 to 33

5) Multiple Choice: In a data set of 1500 values, with a mean of 20 and a standard deviation of 3, how many values lie between 14 and 26?
a) 12
b) 1020
c) 1425
d) 1496

6) Multiple Choice: What is true about the data in the following frequency table?

Score    Frequency
80       2
82       4
86       3
90       5

a) The mean is greater than the median
b) The median is greater than the mean
c) The mean and median are equal
d) There is no mode for the data

7) Multiple Choice: Which measure will be most affected by an outlier?
a) Q1
b) Q3
c) Interquartile range
d) Range

8) Multiple Answer: Which of the following are true concerning the z-score? Select all correct answers.
a) All z-scores are positive
b) A data value greater than the mean has a positive z-score
c) The z-score is the number that appears most
d) The mean has a z-score of 0

9) Multiple Answer: Choose all the statements that are true about a line plot for the data set {15, 12, 18, 12, 16, 18, 19, 18, 12}
a) There will be 3 marks above the 12
b) There will be one mark above the 15
c) There will be one mark above the 17
d) It should start at 12 and end at 19

10) Ordering: For a data set with normal distribution, order the following from least to greatest:
a) Q3
b) A value with a z-score of 1
c) A value with a z-score of 2
d) The mean

Essay or Short Response:

Refer to the stem and leaf plot below for questions 11 and 12. Below are the attendance totals for performances of a series of concerts.

21 | 6 8
22 | 1 5 7 9
23 | 2 2 7
24 | 4 6 6
25 | 0

11) How many concerts were there?
11)________

12) What was the range in attendance?
12)________

Key: 22 | 3 means 223

13) Refer to the histogram of weekly restaurant sales. Would you consider the data as normal? Try to answer in 2 sentences or fewer.

[Histogram: Weekly Restaurant Sales. Vertical axis: Number of Weeks (1 to 6). Horizontal axis: Total Sales in Thousands of Dollars, in intervals 50 to 59, 60 to 69, 70 to 79, 80 to 89, 90 to 99.]

14) Ken had grades of 95, 90, 90, 100 and 55 on his tests this marking period. Which measure of central tendency, the mean or the median, is a better indicator of his ability? Explain why.

15) Alberto received grades of 85, 97, 95, and 89 on his first four tests. What must he get on his fifth test, exactly, in order to average a 93?

16) Of 500 high school students whose mean height is 67.8 inches, 150 were girls. If the mean height of the girls was 63.0 inches, what is the mean height of the boys?

17) The two stem and leaf plots below represent the final grades for Ms. Sloan's Algebra II class and Mr. Clark's Pre-calculus class. The two classes are the top two math classes in the school this year. The class with the highest average will win a first place prize as best math class. If another class has a better median grade than the class with the best average, that class will share the first place prize.

Ms. Sloan's Class
5 | 1
6 | 8
7 | 1 4 8 8
8 | 0 0 3 8 8 9
9 | 0 3 3 3 8

Mr. Clark's Class
6 | 9
7 | 2 6 8 9
8 | 1 2 2 2 5 8 9
9 | 0 3 3 3 8

a) Calculate the mean and median for Ms. Sloan's class.
b) Calculate the mean and median for Mr. Clark's class.
c) Which class wins first prize, or will they both share it? Explain.
d) Before the announcement of first prize took place, the student who received a 51 in Ms. Sloan's class dropped out of school. Would removing the 51 from Ms. Sloan's class change what should be announced? Explain.

18) Which grade is better: a 78 on a test whose mean is 72 and standard deviation is 6.5, or an 83 on a test whose mean is 77 and standard deviation is 8.4? Justify your answer.
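For checking your own hand calculations, the three calculations practiced in this lesson (the standard deviation lines, the expected number of values, and the z-score) can be sketched in Python. This is only an illustration under stated assumptions: the function names are made up for this sketch, the data is assumed to be normal, and the percentages are the rounded 68%, 95%, and 99.7% figures used in this lesson. The values shown are the ones from the worked examples (μ = 15, σ = 1.2, and 200 data values).

```python
# Rounded Empirical Rule percentages used in this lesson (an assumption of
# this sketch; the exact values are slightly different).
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}

def sd_lines(mean, sd):
    """Positions of the lines from mean - 3 sd to mean + 3 sd (2 decimal places)."""
    return {k: round(mean + k * sd, 2) for k in range(-3, 4)}

def expected_counts(n):
    """Number of the n values expected within 1, 2, and 3 standard deviations."""
    return {k: round(n * share) for k, share in EMPIRICAL_RULE.items()}

def z_score(y, mean, sd):
    """Number of standard deviations between the data value y and the mean."""
    return (y - mean) / sd

if __name__ == "__main__":
    print(sd_lines(15, 1.2))                  # lines for the worked example
    print(expected_counts(200))               # counts for 200 data values
    print(round(z_score(16.2, 15, 1.2), 2))   # 16.2 is one line above the mean
```

Note that sd_lines rounds to two decimal places to match the values in this lesson; a standard deviation given to more decimal places would need a different rounding.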
