Unit 7 Statistics and Probability: Patterns in Scatter Plots

Introduction

In Grades 6 and 7, students manipulated univariate data. In Grade 8, they will extend their knowledge and skills to bivariate data. In this unit, students will construct and interpret scatter plots. They will describe patterns of association between two quantities as positive or negative and linear or nonlinear. Students will assess whether the pattern of association is strong or weak, and they will identify clusters and outliers in the data.

Terminology. Scatter plots can also be called scattergraphs, plots, or graphs. We use the term “scatter plots.” In the AP Book, students will see the terms “horizontal axis” and “vertical axis.” In the lessons, you should also use the terms “x-axis” for the horizontal axis and “y-axis” for the vertical axis so that students are familiar with both ways of describing the axes of a scatter plot.

Grid paper. We recommend that students use grid paper and that you have a background grid on your board. If students do not have grid paper, you will need to have lots of it available (e.g., from BLM 1 cm Grid Paper on p. I-1). If you do not have a background grid on your board, you will need to project a transparency of a grid onto the board so that you can write over the grid and erase the board without erasing the grid.

Scatter plots. Several scatter plots to be used during the lessons in this unit are provided on BLMs. You may reproduce the BLMs on a transparency to be projected onto the board. You may also distribute a photocopy to individual students or pairs.

Teacher’s Guide for AP Book 8.1, Unit 7 Statistics and Probability

G8-1 Drawing Scatter Plots
Pages 203–204
Standards: 8.SP.A.1
Goals: Students will draw scatter plots to represent bivariate data.
Prior Knowledge Required:
Can plot points on a coordinate grid and identify coordinates of points
Can write ordered pairs from a table
Knows that a coordinate plane has two axes (horizontal and vertical) that are labeled and have a scale

Vocabulary: axis, bivariate data, break in the axis, coordinate plane, horizontal axis, ordered pair, plot, scale, scatter plot, variable, vertical axis, x-axis, y-axis

Introduce bivariate data. Write on the board:

Bivariate data

ASK: What does the prefix bi- mean? (two) PROMPT: Give students examples of words with the prefix bi-, such as bicycle, bilingual, and bifocal. ASK: What word is variate similar to? (vary, variable) SAY: So bivariate data is data that has two variables. For example, a list of students’ ages and heights has two pieces of information for each student: age and height. With bivariate data, we can compare two quantities and look for relationships between them.

Introduce scatter plots. SAY: Six students recorded their age and the number of hours of sleep they usually get on a school night. Draw on the board:

                                    Mike   Hanna   Zara   Will   Liz   Glen
Age (years)                         14     14      13     13     13    14
Hours of Sleep on a School Night    8      7       9      8      6     10

SAY: We could draw a bar graph to show how many students there are of each age, or how many students sleep 6 to 8 hours and how many sleep 9 to 11 hours. But to show both variables (age and hours of sleep) on the same graph, we need to draw a scatter plot. A scatter plot shows bivariate data on a coordinate plane.

Draw on the board:

[Scatter plot: Hours of Sleep by Age. Horizontal axis: Age (years), marked 0, 13, 14, with a break. Vertical axis: Sleeping Time (hours), marked 0, 6, 7, 8, 9, 10, with a break.]

ASK: Why do you think this is called a scatter plot? (the points seem to be scattered over the graph) SAY: Scatter plots are similar to coordinate planes: there is the horizontal x-axis for one variable, the vertical y-axis for the other variable, and points that represent the data. Point out the break symbols on the two axes.
SAY: The data is only for 13- and 14-year-olds, so we “break” the x-axis and start counting at 13 years instead of 0 years. Similarly, we “break” the y-axis to start counting at 6 hours. Draw on the board:

[Scatter plot: Hours of Sleep by Age, drawn without breaks. Horizontal axis: Age (years), 0 to 14. Vertical axis: Sleeping Time (hours), 0 to 10.]

(MP.5) SAY: This is what the scatter plot would look like without the breaks. Point out how crowded the numbers on the axes are, how close together the points are, and how much white space there is on the scatter plot. SAY: By breaking the axes, we can zoom in on the part of the graph that has the data. This makes a scatter plot that is visually appealing and easy to understand.

Refer back to the original scatter plot. SAY: Each dot on the scatter plot shows the data for one student. We can identify the data points by their coordinates. Remind students that coordinates are listed with the x-coordinate first and the y-coordinate second. ASK: What variable is on the x-axis? (age) On the y-axis? (sleeping time) SAY: So the ordered pairs for this scatter plot will be given as (age, sleeping time). For example, the point (14, 8) represents a 14-year-old who usually sleeps for 8 hours on a school night. ASK: Which student is this? (Mike) Write “(14, 8)” on the board and have a volunteer use the table or the scatter plot to complete the list of coordinates. ((14, 7), (13, 9), (13, 8), (13, 6), (14, 10))

Refer to the point at (14, 10), and ASK: Which student’s data is represented by this point? (Glen) Have a volunteer circle the point that represents Liz. ((13, 6)) ASK: Which two people sleep for the same number of hours on a school night? (Mike and Will) How is this shown on the scatter plot? (the points are at the same height)

Tell students that you want to add a point for Alice, who is 13 years old and usually sleeps for 8 hours on a school night. Have a volunteer plot Alice’s data.
Point out that there is already a point at (13, 8) for Will, so we can’t add a distinct point for Alice. In this case, our scatter plot will have 6 points that represent data for 7 students.

Exercises: Yu recorded the number of vowels and letters in six students’ first names, and then made a scatter plot.

                             Anna   Bobby   Ed   Ava   Han   Tess
Number of Vowels in Name     2      2       1    2     1     1
Number of Letters in Name    4      5       2    3     3     4

[Scatter plot: Vowels and Letters in Name. Horizontal axis: Number of Vowels, 0 to 2. Vertical axis: Number of Letters, 0 to 5.]

a) Write the ordered pairs (vowels, letters) for the six students.
b) Underline Ava’s ordered pair in your list from part a).
c) Who has the most letters in his or her name? How is this shown on the scatter plot?
Bonus:
d) Write the ordered pair that represents the data for your first name.
e) Think of a first name that would be plotted at (3, 8).

Answers: a) (2, 4), (2, 5), (1, 2), (2, 3), (1, 3), (1, 4); b) (2, 3); c) Bobby, his point is the highest on the scatter plot; Bonus: d) answers must be in the format (number of vowels in first name, number of letters in first name); e) Sample answers: Samantha, Jennifer, Benjamin, Jonathan

Writing ordered pairs from a table and plotting points on a scatter plot. Draw on the board:

Height (cm)    Arm Span (cm)
174            170
176            172
155            152
167            163
152            145
158            157
164            160

SAY: This table shows the height and arm span of seven 14-year-olds. Have a volunteer write the ordered pairs (height, arm span) for the data on the board. ((174, 170), (176, 172), (155, 152), (167, 163), (152, 145), (158, 157), (164, 160))

Draw on the board:

[Blank scatter plot: Height and Arm Span. Horizontal axis: Height (cm), marked 0, 140, 150, 160, 170, 180, with a break. Vertical axis: Arm Span (cm), marked 0, 140, 150, 160, 170, 180, with a break.]

Have a volunteer plot the points on the scatter plot. PROMPT: The first number in the ordered pair tells you the height (x-axis), and the second number tells you the arm span (y-axis).
(see completed graph below)

[Completed scatter plot: Height and Arm Span. Horizontal axis: Height (cm), marked 0, 140, 150, 160, 170, 180, with a break. Vertical axis: Arm Span (cm), marked 0, 140, 150, 160, 170, 180, with a break.]

Leave this scatter plot on the board.

Exercises: This table shows the standard values for temperature for different altitudes above sea level.

Altitude (ft)       0      4,000   8,000   12,000   16,000
Temperature (°F)    59     44.7    30.5    16.2     1.9

a) Write the ordered pairs for the data in the table.
b) Plot the points on the scatter plot.

[Blank scatter plot: Altitude and Temperature. Horizontal axis: Altitude (thousands of ft), 0 to 20 in steps of 4. Vertical axis: Temperature (°F), 0 to 60 in steps of 10.]

(MP.4) Bonus: Leadville, Colorado, is 10,152 feet above sea level. Use the scatter plot to estimate the temperature for this altitude.

Answers: a) (0, 59), (4,000, 44.7), (8,000, 30.5), (12,000, 16.2), (16,000, 1.9)
b) [Completed scatter plot: Altitude and Temperature. Horizontal axis: Altitude (thousands of ft), 0 to 20 in steps of 4. Vertical axis: Temperature (°F), 0 to 60 in steps of 10.]
Bonus: Draw a line from (0, 59) to (16,000, 1.9). For the x-value of 10,000, there is a point on the line with a y-value that is a little less than 25. So the estimated temperature is 23°F.

(MP.6) Drawing scatter plots. Use the scatter plot for Height and Arm Span to point out the following features of scatter plots:
• short, descriptive title
• labels on horizontal and vertical axes, indicating units if possible
• an even, appropriate scale on the axes
• a break symbol on an axis if some numbers have been left off the scale
• points to represent the data
• no lines connecting the points

Draw on the board:

Year    Number of Athletes    Number of Medals
1998    186                   13
2002    202                   34
2006    211                   25
2010    216                   37
2014    230                   28

Tell students that the table shows how many American athletes participated in the Winter Olympics from 1998 to 2014, and how many medals were won by Team USA. SAY: Let’s make a scatter plot for the year and the number of athletes. On the board, draw a horizontal axis and a vertical axis.
Remind students that we put the independent variable on the horizontal axis and the dependent variable on the vertical axis. SAY: Bivariate data does not necessarily have a dependent variable. For example, for height and weight, either variable could go on the horizontal axis. In this case, the data is collected by year so we will put year on the horizontal axis. ASK: What goes on the vertical axis? (Number of Athletes) What title should we give the scatter plot? (Olympic Athletes by Year)

SAY: The data for the horizontal axis goes from 1998 to 2014. ASK: How should we divide the horizontal axis? (with five ticks, one for every 4 years, starting at 1998) Have a volunteer mark the scale of the horizontal axis. Point out that the convention is not to break the axis if it shows years.

SAY: The data for the vertical axis goes from 186 to 230. This range is too big to count by ones, so we need to make a scale for the axis. Point out that if you make the scale too big, it will be hard to plot exact numbers. For example, if you count by fifties, you’ll have to approximate the position of each point. On the other hand, if you make the scale too small, the scatter plot might not fit on the paper. For example, if you count by twos, the axis will be almost 25 squares high! Tell students that we usually aim for 5 to 10 divisions on an axis. ASK: What is the range of this data? (230 − 186 = 44) For this data, how should we count? (by tens) What number should we start at? (185) Have a volunteer break and label the vertical axis. Have another volunteer plot the points. (see completed graph below)

[Completed scatter plot: Olympic Athletes by Year. Horizontal axis: Year, marked ’98, ’02, ’06, ’10, ’14. Vertical axis: Number of Athletes, marked 0, 185, 195, 205, 215, 225, 235, with a break.]

Exercises: Use the data for the Winter Olympics.
a) Make a scatter plot for the year and number of medals.
b) Make a scatter plot for the number of athletes and number of medals.

(MP.4) Bonus: Does sending more athletes to the Olympics guarantee more medals? Which points on the scatter plot for part b) support your answer?

Answers:
a) [Scatter plot: Olympic Medals by Year. Horizontal axis: Year, marked ’98, ’02, ’06, ’10, ’14. Vertical axis: Number of Medals, marked 0, 10, 15, 20, 25, 30, 35, 40, with a break.]
b) [Scatter plot: Olympic Athletes and Medals. Horizontal axis: Number of Athletes, marked 0, 185, 195, 205, 215, 225, 235, with a break. Vertical axis: Number of Medals, marked 0, 10, 15, 20, 25, 30, 35, 40, with a break.]
Bonus: No. The point (230, 28) represents more athletes but fewer medals than (216, 37). The point (211, 25) represents more athletes but fewer medals than (202, 34).

Extensions

1. a) Complete the table. All dimensions are given in inches. Hint: Rectangular prisms have SA = 2ℓw + 2ℓh + 2hw and V = ℓwh.

i)
Rectangle    Perimeter (in)    Area (in²)
3 × 5
1 × 7
7 × 4
5 × 8
5 × 6
1 × 3

ii)
Rectangular Prism    Surface Area (in²)    Volume (in³)
3 × 4 × 5
1 × 2 × 8
4 × 4 × 10
6 × 6 × 6
2 × 7 × 9
5 × 1 × 11

b) Draw a scatter plot to represent the data.

Answers:
a) i)
Rectangle    Perimeter (in)    Area (in²)
3 × 5        16                15
1 × 7        16                7
7 × 4        22                28
5 × 8        26                40
5 × 6        22                30
1 × 3        8                 3

ii)
Rectangular Prism    Surface Area (in²)    Volume (in³)
3 × 4 × 5            94                    60
1 × 2 × 8            52                    16
4 × 4 × 5            112                   80
3 × 3 × 3            54                    27
2 × 5 × 8            132                   80
5 × 1 × 11           142                   55

b) i) [Scatter plot: Perimeter and Area. Horizontal axis: Perimeter (in), marked 0 to 25 in steps of 5. Vertical axis: Area (in²), marked 0 to 40 in steps of 5.]
ii) [Scatter plot: Surface Area and Volume. Horizontal axis: Surface Area (in²), marked 0, 50, 70, 90, 110, 130, 150, with a break. Vertical axis: Volume (in³), marked 0 to 90 in steps of 15.]

(MP.3, MP.6) 2. Kyle made five mistakes when he drew this scatter plot. Identify and correct his mistakes.

[Kyle’s scatter plot, with no title. Horizontal axis: Height, marked 0, 58, 60, 62, 64, 66, 68, 70. Vertical axis: Weight, marked 0, 100, 110, 120, 140, 160, 180, 200.]

Answers: Kyle needs a title (Height and Weight); he needs units (inches for height and pounds for weight); he needs break symbols to show that there are breaks in the horizontal and vertical axes; his vertical axis is uneven and should count by twenties; the points should not be joined by a line.

(MP.1, MP.4) 3.
a) Write two questions to collect data from your classmates. The data must be numerical, such as number of pets and number of family members. Categorical data, such as favorite sports teams and languages spoken at home, should not be collected because there could be too many different answers. Don’t ask about sensitive or private information, such as money or weight. Ask your teacher to approve your questions before doing parts b) to d).
b) Draw a table to collect your information. The table should have three columns and take up a whole page. The first column is where you will record the names of your classmates. In the second and third columns, you will record the answers to your two questions. Write the questions in the headers of the columns.
c) Ask your classmates the two questions and record their answers.
d) Draw a scatter plot to represent the data.

4. The table shows data for 10 students in Grades 4 to 12.

Foot Length (cm)    28     24     28     29     21     26     25     22     30     23
Height (cm)         174    152    171    177    140    163    158    146    191    155

a) Draw a scatter plot with foot length on the horizontal axis and height on the vertical axis.
b) Draw a scatter plot with height on the horizontal axis and foot length on the vertical axis.
c) What difference does changing the axes make on the graphs?
d) Use the scatter plot to predict the foot length of a student with a height of 185 cm.
e) Can you use the scatter plot to predict the foot length of a student with a height greater than 200 cm?

Answers:
a) [Scatter plot: Foot Length and Height. Horizontal axis: Foot Length (cm), marked 0, 20, 22, 24, 26, 28, 30, with a break. Vertical axis: Height (cm), marked 0, 140, 150, 160, 170, 180, 190, 200, with a break.]
b) [Scatter plot: Height and Foot Length. Horizontal axis: Height (cm), marked 0, 140, 150, 160, 170, 180, 190, with a break. Vertical axis: Foot Length (cm), marked 0, 20, 22, 24, 26, 28, 30, 32, with a break.]
c) the two graphs are reflections of each other in a diagonal line; d) about 29.5 cm; e) No, the graph does not show this information. Not many students are taller than 200 cm or have feet longer than 30 cm, so the pattern of the graph will change.
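The scale reasoning used in this lesson (count how many squares a given step would need, and aim for 5 to 10 divisions) can be checked with a few lines of code. The Python sketch below is only an illustration and is not part of the AP Book; the function names `divisions_needed` and `reasonable_steps` and the list of candidate steps are assumptions made for this example.

```python
import math

def divisions_needed(low, high, step):
    """Squares needed on an axis to span low..high when counting by `step`."""
    return math.ceil((high - low) / step)

def reasonable_steps(low, high, candidates=(1, 2, 5, 10, 20, 50),
                     min_div=5, max_div=10):
    """Candidate steps (an assumed list) that give 5 to 10 divisions."""
    return [s for s in candidates
            if min_div <= divisions_needed(low, high, s) <= max_div]

# Number of athletes from the Winter Olympics table: 186 to 230.
athletes = [186, 202, 211, 216, 230]
low, high = min(athletes), max(athletes)

# Compare the steps discussed in the lesson: twos, tens, and fifties.
for step in (2, 10, 50):
    print(step, divisions_needed(low, high, step))

print(reasonable_steps(low, high))
```

For the athletes data, counting by twos needs 22 squares and counting by fifties needs only 1, while counting by fives or by tens gives between 5 and 10 divisions. The lesson chooses tens and starts the broken axis at 185.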
G8-2 Describing Scatter Plots
Pages 205–207
Standards: 8.SP.A.1
Goals: Students will recognize and describe patterns of association of bivariate data in scatter plots.

Prior Knowledge Required:
Can read scatter plots
Can draw scatter plots

Vocabulary: association, data, negative association, no association, ordered pair, plot, positive association, scatter plot

Materials:
overhead projector (optional)
BLM Describing Scatter Plots (p. H-25), either on a transparency for display or a copy for each student or student pair

Describing scatter plots. Show students Scatter Plot 1 (Age and Weight of Children) from BLM Describing Scatter Plots. SAY: This scatter plot shows the age and weight of some children. ASK: For how many 7-year-olds does the scatter plot show data? (three) How do you know? (there are three points above 7) According to the data, are 4-year-olds likely to weigh less than 10-year-olds? (yes) How do you know? (the points for 4-year-olds are lower than the points for 10-year-olds) Was every 2-year-old heavier than every 1-year-old? (no) How do you know? (the lowest point for 2-year-olds is lower than the highest point for 1-year-olds) According to this data, does weight increase or decrease when age increases? (increase) How does the scatter plot show this? (as we move to the right along the horizontal axis, the points get higher)

Show students Scatter Plot 2 (Age and Weight of Adults) from BLM Describing Scatter Plots for the following Exercises.

Exercises: This scatter plot shows the age and weight of some adults.
a) How many 80-year-olds are shown in the data?
b) According to the data, are 60-year-olds more likely to weigh more than, or less than, 40-year-olds, or is it difficult to say?
c) Explain your answer to part b) using the scatter plot.
(MP.7) Bonus: Why do the two scatter plots comparing age and weight (Scatter Plots 1 and 2) look so different?
Answers: a) 3; b) it is difficult to say; c) 40-year-olds and 60-year-olds have very similar points; Bonus: Children are growing, so an older child will usually weigh more than a younger child. Adults have stopped growing, so an older adult will not necessarily weigh more than a younger adult.

Describing relationships shown in scatter plots. Show students Scatter Plot 3 (Households with Landline Phones Only) and Scatter Plot 4 (Households with No Phones) from BLM Describing Scatter Plots. SAY: Look at the scatter plot that shows the percentage of households with landline phones only (this means that there is at least one landline in the house, apartment, etc., but no one who lives there has a cellphone). ASK: Since 2008, has the percentage of households with only landlines increased, decreased, or not changed significantly? (decreased) How does the scatter plot show this? (as the years increase, the points get lower)

SAY: Look at the scatter plot that shows the percentage of households with no phones (neither landline nor cellphone). ASK: Since 2008, has the percentage of households with no phones increased, decreased, or not changed significantly? (not changed significantly) How does the scatter plot show this? (the heights of the points haven’t changed much) Does this make sense? (yes, some people will probably never have a phone)

Show students Scatter Plot 5 (Games Played and Jersey Number) and Scatter Plot 6 (Games Played and Points Scored) from BLM Describing Scatter Plots for the following exercises.

Exercises:
1. The scatter plots show some data for professional basketball players. Based on the scatter plots, write “increases,” “decreases,” or “is not affected.”
a) As the number of games a player has played increases, the number on the player’s jersey ___________.
b) As the number of games a player has played increases, the average number of points scored by the player each game ___________.
Answers: a) is not affected, b) increases

(MP.3) 2. Explain why your answers to Exercise 1 make sense.
Sample answers: a) coaches don’t pick who plays in a game based on the number on their jersey, and jersey numbers are chosen randomly by the players; b) the more games a player plays, the more opportunity he has to score; better players play more games and score more points

Identifying patterns of positive and negative association. SAY: Bivariate data allows us to look for relationships between two sets of data. You can analyze the pattern of the points on the scatter plot to determine how the data is related. Draw on the board:

[Three scatter plots side by side: in the left plot the points rise from left to right, in the middle plot the points fall from left to right, and in the right plot the points neither rise nor fall consistently.]

SAY: Associations describe how sets of data are related. Three types of associations are shown in these scatter plots. Point at the scatter plot on the left, and ASK: As we move to the right, do the points get higher or lower? (higher) SAY: There is a positive association between two sets of data if the values in the sets of data increase together.

Point at the scatter plot in the middle, and ASK: As we move to the right, do the points get higher or lower? (lower) SAY: There is a negative association if the values in one set of data increase as the values in the other set of data decrease.

Point at the scatter plot on the right, and ASK: As we move to the right, do the points get higher or lower? (both) SAY: There is no association if there is neither a positive association nor a negative association between the two sets of data.

Label the three scatter plots on the board with the type of association (positive association, negative association, no association). SAY: In this unit, you will look at scatter plots to determine whether there is an association between two variables.
You will be able to see clearly if there is or isn’t an association in the graphs you are given. In high school, you will learn statistical methods, beyond just looking at a graph, to determine whether there is or isn’t an association.

(MP.7) Exercises: State the type of association for the scatter plot.

[Four scatter plots, labeled a) to d).]

Answers: a) negative association, b) no association, c) positive association, d) no association

Drawing a scatter plot to identify the pattern of association. Draw on the board:

Number of Pages in Book    Length of Movie (minutes)
309                        152
341                        161
435                        142
734                        157
870                        138
652                        153

SAY: May researched six books that were made into movies. The table shows the data she collected. Have a volunteer write the data as ordered pairs on the board. ((309, 152), (341, 161), (435, 142), (734, 157), (870, 138), (652, 153)) SAY: We want to make a scatter plot for the data. On the board, draw a horizontal and a vertical axis. ASK: How should we label the horizontal axis? (Number of Pages in Book) The vertical axis? (Length of Movie (minutes)) What title could we give the scatter plot? (Book and Movie Lengths)

SAY: The data for the horizontal axis goes from 309 to 870. Let’s break the axis and start counting at 300. ASK: How should we divide the horizontal axis? (into hundreds) Have a volunteer break and label the horizontal axis. SAY: The data for the vertical axis goes from 138 to 161. ASK: What scale should we use for the axis? (start at 135, count by fives) Have a volunteer break and label the vertical axis. Have another volunteer plot the points. (see completed graph below)

[Completed scatter plot: Book and Movie Lengths. Horizontal axis: Number of Pages in Book, marked 0, 300, 400, 500, 600, 700, 800, 900, with a break. Vertical axis: Length of Movie (minutes), counting by fives, with a break.]

ASK: Does the scatter plot indicate an association between the number of pages in the book and the length of the movie? (no) Does this make sense?
(yes, short books don’t necessarily get made into short movies)

(MP.4) Exercises:
1. The table shows the gender, age, and arm span of 14 students.

Gender           M     F     F     M     M     F     M     M     F     F     M     F     F     M
Age (years)      13    12    13    11    13    10    13    11    12    10    10    11    12    13
Arm Span (cm)    159   150   156   150   180   126   163   145   160   144   135   155   157   173

a) Draw a scatter plot using gender for the horizontal axis and age for the vertical axis.
b) Draw a scatter plot using gender for the horizontal axis and arm span for the vertical axis.
c) Draw a scatter plot using age for the horizontal axis and arm span for the vertical axis.

Answers:
a) [Scatter plot: Gender and Age. Horizontal axis: Gender (F, M). Vertical axis: Age (years), 10 to 13.]
b) [Scatter plot: Gender and Arm Span. Horizontal axis: Gender (F, M). Vertical axis: Arm Span (cm), 120 to 180 in steps of 10.]
c) [Scatter plot: Age and Arm Span. Horizontal axis: Age (years), marked 0, 10, 11, 12, 13, with a break. Vertical axis: Arm Span (cm), 120 to 180 in steps of 10.]

2. Does the scatter plot for Exercise 1.c) indicate an association between age and arm span? If so, describe the association.
Answer: Yes. The scatter plot indicates a positive association.

3. Look at the scatter plots you drew for Exercise 1. Which statement(s) do you agree with?
A. Female students are likely to be older.
B. Younger students are likely to have a shorter arm span.
C. Male students are likely to have a longer arm span.
Answer: B and somewhat C

Bonus: Which statement from your answer to Exercise 3 has the strongest data to support it? Explain.
Answer: B. The data clearly shows that younger students are likely to have a shorter arm span. There is a lot of overlap in the data for C (gender and arm span), so a male student is only slightly more likely to have a longer arm span than a female student.

Extensions

(MP.4) 1. Bill is studying the properties of water at different temperatures in science class. He filled three graduated cylinders with 100 mL of water. He heated the water and performed three separate experiments. He recorded his results in a table.
Temperature (°C)    Calories    Density (kg/m³)    Amount of Sugar that Could Be Dissolved (g)
20                  0           998.3              203.9
40                  0           992.3              238.1
60                  0           983                287.3
80                  0           972                362.1
100                 0           958                487.2

To finish his lab report, Bill needs to draw three scatter plots and answer these questions:
a) How did the temperature of water affect its calorie content?
b) How did the temperature of water affect its density?
c) How did the temperature of water affect the solubility of sugar (the amount of sugar that can be dissolved)?
Help Bill finish his lab report.

Answers:
[Scatter plot: Calories in Water. Horizontal axis: Temperature (°C), 0 to 100 in steps of 20. Vertical axis: Calories, 0 to 1 in steps of 0.2.]
[Scatter plot: Density of Water. Horizontal axis: Temperature (°C), 0 to 100 in steps of 20. Vertical axis: Density (kg/m³), marked 0, 950, 960, 970, 980, 990, 1,000, with a break.]
[Scatter plot: Solubility of Sugar in Water. Horizontal axis: Temperature (°C), 0 to 100 in steps of 20. Vertical axis: Amount of Sugar that Could Be Dissolved (g), 0 to 600 in steps of 100.]
a) The temperature of water had no effect on the calorie content.
b) As the temperature of water increased, the density of water decreased.
c) As the temperature of water increased, the solubility increased.

(MP.8) 2. a) Find the slope of the line AB. Hint: slope = rise/run

[Three coordinate grids, labeled i) to iii), each showing a line through points A and B. x-axis: 0 to 5. y-axis: 0 to 6.]

b) How is the graph of a line with positive slope similar to a scatter plot with positive association?
c) How is the graph of a line with negative slope similar to a scatter plot with negative association?
Bonus:
d) Find the slope of the line AB.

[Coordinate grid showing a line through points A and B. x-axis: 0 to 5. y-axis: 0 to 6.]

e) How is the graph of a line with a slope of 0 similar to a scatter plot with no association?
Answers: a) i) slope = 2; ii) slope = −1; iii) slope = 1/2; b) when data has positive association, the points in the scatter plot lie on or near a line with positive slope; c) when data has negative association, the points in the scatter plot lie on or near a line with negative slope; Bonus: d) slope = 0; e) when data has no association or a line has slope = 0, the y-values do not consistently increase or decrease as the x-values increase

(MP.1) 3. If students completed Extension 3 from G8-1, display their scatter plots around the classroom. Have students walk around the classroom and write down their observations based on the scatter plots. For example, “In my class, there is no association between the number of pets a student has and how many people are in the student’s family.”

G8-3 Interpreting Scatter Plots
Pages 208–210
Standards: 8.SP.A.1
Goals: Students will describe patterns of association in scatter plots, including nonlinear association. Students will identify clusters and outliers. Students will predict the relationship between two quantities.

Prior Knowledge Required:
Can read scatter plots
Can draw scatter plots
Can recognize and describe positive association, negative association, and no association in a scatter plot

Vocabulary: cluster, linear association, negative association, no association, nonlinear association, outlier, pattern of association, positive association, scatter plot, strong association, weak association

Materials:
overhead projector (optional)
BLM Interpreting Scatter Plots (p. H-26), either on a transparency for display or a copy for each student or student pair

Describing patterns of association. Draw on the board:

[Three scatter plots, labeled A, B, and C.]

ASK: What type of association do the three scatter plots show? (positive) How do you know? (the values in the sets of data increase together) In which graph do the points lie on a nearly straight line?
(graph A) SAY: Data that lie more or less along a line have a linear association. Data can have a strong association or a weak association. Graph A shows a strong association because its points are close to a line. Graph B is also linear, but it shows a weak association because the points are more spread out, or scattered. ASK: Does Graph C show a linear association? (no, it’s a curve) SAY: If the points lie on a curve or bent line, then the data shows a nonlinear association. Nonlinear associations can also be strong or weak. Graph C shows a strong nonlinear association.

Tell students that if a scatter plot shows an association between two sets of data, then to fully describe the pattern of association, they should say whether the association is strong or weak, positive or negative, and linear or nonlinear. Summarize the pattern of association on the board below each scatter plot, as shown below:
A. strong positive linear association
B. weak positive linear association
C. strong positive nonlinear association

(MP.7) Exercises: Determine the pattern of association.

[Four scatter plots, labeled A to D.]

For which scatter plot(s) does the data show …
a) a positive association?
b) a negative association?
c) a linear association?
d) a nonlinear association?
e) a strong association?
f) a weak association?
g) a strong positive association?
h) a weak negative association?
Answers: a) A, D; b) B, C; c) A, B, D; d) C; e) B, D; f) A, C; g) D; h) C

Review drawing scatter plots. Tell students that you collected some data on the budget and box-office revenue of 10 Hollywood movies. ASK: What type of relationship do you think there is between the budget of a movie and how much money the movie makes at the box office? Tell students that you will say “positive association,” “negative association,” and “no association.” Instruct students to raise their hands once to indicate which type of relationship they think exists.
Tally their responses on the board. Draw on the board:

Budget (millions of dollars)                170    105    200    1      7      300    185    75     60     250
Box-Office Revenue (millions of dollars)    77     10     669    117    230    950    995    815    468    260

SAY: We want to draw a scatter plot for this data to see if there is a relationship. Draw axes on the board, and ask students how they should be labeled. (horizontal: Budget (millions of dollars), vertical: Box-Office Revenue (millions of dollars)) Have students suggest a title. (Budget and Revenue of Movies) Have a student label the horizontal axis from 0 to 300 in increments of 50. Have another student label the vertical axis from 0 to 1,000 in increments of 100. Emphasize that we can choose any increment for the axes. SAY: In this example, both units are “millions of dollars,” but the horizontal axis increases by increments of 50 million and the vertical axis increases by increments of 100 million. Have a third student plot the data. (see completed graph below)

[Completed scatter plot: Budget and Revenue of Movies. Horizontal axis: Budget (millions of dollars), 0 to 300 in increments of 50. Vertical axis: Box-Office Revenue (millions of dollars), 0 to 1,000 in increments of 100.]

ASK: Is there any association between budget and box-office revenue? (no) Does this make sense? (yes, some low-budget movies are surprise hits, and some big-budget movies are flops) Could you use this information to predict the box-office revenue of a movie, based on its budget? (no, the scatter plot shows that there is no relationship between those variables)

(MP.4) Exercises: The table shows the value of e-commerce sales from 2000 to 2014.

Year                                      2000     2002      2004      2006      2008      2010      2012      2014
E-Commerce Sales (millions of dollars)    7,765    12,203    19,588    30,148    33,188    45,006    59,862    79,567

a) Draw a scatter plot for the data.
b) Is there any association between the year and the value of e-commerce sales? Explain.
Bonus: Are there any points that don’t follow the pattern?

Answers:
a) [Scatter plot of the year and e-commerce sales data.]
b) There is a strong positive association. From 2000 to 2014, the value of e-commerce sales increases.
Bonus: There is an increase every 2 years, but the increase from 2006 to 2008 is smaller than the other increases, so 2008 does not follow the pattern.

Describing nonlinear patterns of association. Show students Scatter Plot 1 (Distance and Height of Football) from BLM Interpreting Scatter Plots. SAY: A football player punts the football. The scatter plot shows the horizontal distance of the ball from the player, and the height of the ball. Cover up the half of the scatter plot for distances greater than 60 feet. ASK: Is there any association between height and distances less than 60 feet? (yes, positive) Cover up the half of the scatter plot for distances less than 60 feet. ASK: Is there any association between height and distances greater than 60 feet? (yes, negative) SAY: Imagine if you had only been given a scatter plot for distances up to 60 feet. ASK: What conclusion would you make? (as horizontal distance increases, height increases) Tell students to make sure that they are given all the information before making observations based on a scatter plot! ASK: For the scatter plot overall, can we say there is an association between height and horizontal distance? (there is a nonlinear association)

(MP.4) Exercises: The table shows the cost of hiring a DJ to play for an event.

Length of Event (hours)    Fee Charged by DJ (dollars)
1                          400
2                          400
3                          400
4                          400
5                          400
6                          475
7                          550
8                          625
9                          700

a) Draw a scatter plot for the data.
b) For an event that is less than 6 hours long, what type of association is there between the length of the event and the fee charged by the DJ?
c) For an event that is longer than 5 hours, what type of association is there between the length of the event and the fee charged by the DJ?
d) For the scatter plot overall, can we say there is an association between the length of the event and the fee charged by the DJ? Explain.
e) If you wanted to show that the fee increases for every hour, what scale would you use?

Answers: a) [Scatter plot of length of event and fee charged by DJ]; b) no association; c) positive linear association; d) there is no linear association because the pattern is not consistent, but there is a nonlinear association; e) you would break the horizontal axis and start the scale at 5 hours to show that the fee charged is always increasing

Identifying clusters and outliers. Draw on the board:

[Scatter plot]

ASK: What type of association does the scatter plot show? (positive) Are there points that stand out? (two points: one in the top left, one in the bottom right) Why do they stand out? (most of the points are grouped together and these points are not part of the group) SAY: A data point that is very different from the rest of the data in the set is called an outlier. ASK: Are there any groups of data? (yes, in the top right and near the origin) SAY: Data points that are grouped close to each other are called a cluster. Data can be clustered about a point (indicate the cluster near the origin) or a line (indicate the cluster at the top right). Clusters tell you about trends in the data, and outliers tell you which points don’t follow the trend. Tell students that identifying clusters and outliers gives you more information about the data.

Show students Scatter Plot 2 (Life Expectancy and Income) from BLM Interpreting Scatter Plots for the following exercises.

(MP.4, MP.7) Exercises: The scatter plot shows data for 20 countries around the world. Each point represents the life expectancy at birth, and the average income per person (IPP) for one country’s population.
a) What association does the scatter plot show? Explain.
b) Circle the cluster. What does the cluster tell you about the data?
c) Are there any data points that you would identify as outliers? Explain.
d) Algeria has a life expectancy of 76.3 years and an IPP of $12,957. Circle the data point for Algeria.
e) India is not yet shown on the scatter plot. Add the data point for India, where the population has a life expectancy of 66.2 years and an IPP of $5,198.
f) Use the scatter plot to estimate the IPP of a person with a life expectancy of 81 years.
g) Create two points for this data: one that would be part of the cluster and one that would be an outlier.

Selected answers: a) weak positive nonlinear association, countries with a greater life expectancy generally have a greater IPP; b) there is a cluster in the top right (there are many countries with a life expectancy of about 82 years and an IPP around $40,000), and a roughly linear cluster along the x-axis (as life expectancy increases from 55 to 75, IPP increases from $1,000 to $20,000); c) the point near (59, 35,000) is an outlier because an income of $35,000 is more likely in a country with a life expectancy of more than 80 years, and the point near (82, 75,000) is also an outlier because the income is much higher than in other countries with a life expectancy of 82 years; f) about $40,000; g) Sample answers: (80, 40,000) would be part of the cluster, (55, 50,000) would be an outlier

Choosing scatter plots to represent associations. Write and draw on the board:

minutes of exercise per week and bone strength
minutes of exercise per week and risk of cardiovascular disease
minutes of exercise per week and length of hair

[Three unlabeled scatter plots: left, middle, and right]

SAY: Sometimes we can make predictions about relationships. ASK: From the options on the board, which data do you think would produce a scatter plot that looks like the one on the left? (minutes of exercise per week and risk of cardiovascular disease) Why? (the more you exercise, the lower your risk of cardiovascular disease) Which measurements would produce a scatter plot that looks like the one in the middle?
(minutes of exercise per week and bone strength) Why? (the more you exercise, the stronger your bones are) Would “minutes of exercise per week and length of hair” create a scatter plot like the one on the right? (yes) Why? (there is no relationship between how much you exercise and how long your hair is) Have volunteers label the axes of the scatter plots, as shown below:

[Three scatter plots, each with the horizontal axis labeled Exercise (minutes per week); vertical axes, from left to right: Risk of Cardiovascular Disease, Bone Strength, Hair Length]

Show students Scatter Plot 3 (A to D) from BLM Interpreting Scatter Plots for the following exercises.

(MP.4) Exercises:
a) Which scatter plot best represents the association between the time a student spends studying for a test and the student’s mark on the test? Explain.
b) Explain what the other scatter plots mean.
Bonus: What scale could you draw on the vertical axis of the scatter plot you chose for part a)?

Sample answers: a) C because it shows a generally positive relationship, but has some points for students who didn’t study much but did well and for students who studied a lot but didn’t do well; b) A: Every student studied for the same amount of time and got a different mark, B: The mark that every student got was directly proportional to the time the student spent studying, D: The time that a student spent studying had no relationship to the student’s mark on the test; Bonus: The vertical scale should go from 0 to 100, since 0% is the minimum mark and 100% is the maximum mark.

Describing versus interpreting scatter plots. SAY: Scatter plots that show an association often indicate a causal relation between the variables, but not always. Say we collected data from 5-year-olds to 18-year-olds. We recorded their age and their calorie intake per day. ASK: What type of association do you think this data would have? (positive) Why? (as we get older, we need to eat more) If students struggle with the answer, PROMPT: Do we need to eat more because we get older?
(yes, as we get older, we grow and our bodies need more energy) ASK: If we sat on a busy downtown street and recorded how many people were holding umbrellas and how many cars had their wipers on, what type of association do you think there would be? (positive) Why? (if lots of people have umbrellas, then it’s probably raining, so cars will also have their wipers on) Do cars have their wipers on because people are using umbrellas? (no, they have them on because it’s raining) SAY: We can predict associations in bivariate data, but we have to be careful about saying that one variable influences the other.

Predicting associations in bivariate data. Read the statements and have students signal if they agree (thumbs up) or disagree (thumbs down).
a) As a student’s age increases, the number of baby teeth that the student has decreases. (thumbs up)
b) As a student’s age increases, the number of Americans who travel to Paris each year increases. (thumbs down)
c) As the temperature increases, a person’s cell phone bill decreases. (thumbs down)
d) As the temperature increases, the number of winter coats that a store sells decreases. (thumbs up)
e) As the time it takes a student to get to school increases, the number of pencils in the student’s pencil case is not affected. (thumbs up)
f) As the number of sports that a student plays increases, the amount of time the student spends at practices is not affected. (thumbs down)

Exercises:
1. Predict the relationship. Write “increases,” “decreases,” or “is not affected.”
a) As the price of a pair of jeans increases, the customer satisfaction rating of the jeans _______________.
b) As the size of a diamond on a ring increases, the price of the ring _______________.
c) As the mileage on a car increases, the value of the car _______________.
d) As the number of pages in a book increases, the price of the book _______________.
Answers: a) is not affected, b) increases, c) decreases, d) is not affected

2. Sketch scatter plots for the four sentences in Exercise 1. Include the titles and labels, but no scales.

Answers:
a) [Scatter plot: Price and Rating of Jeans; axes: Price of Jeans ($), Customer Satisfaction Rating]
b) [Scatter plot: Size and Price of Diamond Rings; axes: Size of Diamond (carats), Price of Ring ($)]
c) [Scatter plot: Mileage and Value of Cars; axes: Mileage on Car, Value of Car ($)]
d) [Scatter plot: Length and Price of Books; axes: Number of Pages in Book, Price of Books ($)]

Bonus: Fill in the blank with a variable that could make the sentence true.
a) As a student’s age increases from 5 years to 18 years, _______ increases.
b) As a student’s age increases from 5 years to 18 years, _______ decreases.
c) As a student’s age increases from 5 years to 18 years, _______ is not affected.

Sample answers: a) arm span, shoe size; b) number of rules set by parents, number of toys; c) length of hair, number of siblings

Interpreting scatter plots. Show students Scatter Plot 4 (Location and Price of Hotel) and Scatter Plot 5 (Rating and Price of Hotel) from BLM Interpreting Scatter Plots. SAY: Two friends are going to New York City for the weekend. They collected data about hotel rooms. ASK: What type of association is there between the price of a hotel room and the hotel’s distance from Times Square? (weak negative linear) What does that mean? (as the hotel’s distance from Times Square increases, the price of a room decreases) What type of association is there between the price of a hotel room and the percentage of guests who gave the hotel a positive review? (strong positive) What does this mean? (as the percentage of positive reviews increases, the price of a room increases) Have students identify the clusters. (there are groups centered around $150 and $300–$400) ASK: Why are there clusters here?
(these are the common prices for hotel rooms) Looking at both scatter plots, could you say that as the distance from Times Square to the hotel decreases, the percentage of positive reviews increases? (no, the data doesn’t tell you that)

(MP.3, MP.4) Exercises: The table shows information for 10 employed adults.

Salary (tens of thousands of dollars)    Years of Education    Commute Time (minutes, one way)
20                                       13                    5
80                                       20                    45
135                                      22                    24
45                                       19                    25
70                                       22                    8
14                                       12                    16
150                                      22                    30
95                                       21                    16
30                                       15                    28
65                                       20                    20

a) Draw a scatter plot for salary and years of education. What type of association is there?
b) Draw a scatter plot for salary and commute time. What type of association is there?
c) Are there any outliers in the graph you drew for part a)?
d) What is the significance of there being no clusters in either scatter plot?
e) Look at both scatter plots. Can you say if there is an association between years of education and commute time? Explain.

Answers: a) [Scatter plot of salary and years of education] There is a weak positive linear association. b) [Scatter plot of salary and commute time] There is no association. c) the salaries of $135,000 and $150,000 are much higher than the other salaries; d) the people in the sample have a wide range of salaries; e) you can’t tell that from the scatter plots

Extensions

(MP.3) 1. The scatter plot shows data for 14 countries around the world. Each point represents the average amount of chocolate that each person in the country consumes in a year, and how many people (for every 10 million people in the country) have won Nobel Prizes.
a) Describe the association between a country’s chocolate consumption and number of Nobel Prize winners.
b) Do you think that if a country increases its chocolate consumption, it would win more Nobel Prizes?

Answers: a) positive association; b) no, buying chocolate does not make you more likely to win a Nobel Prize. It’s a coincidence that the data are related. The association does not indicate causation.

(MP.1, MP.6) 2.
Describe a set of real-world bivariate data that the scatter plot could represent.

[Scatter plots a), b), c), and d)]

Sample answers: a) x-axis: age of people in a daycare, y-axis: height; b) x-axis: the number of times you wash your hair, y-axis: the amount of shampoo in the bottle; c) x-axis: how many coins are in your pocket, y-axis: your mark on a science test; d) x-axis: years of experience, y-axis: salary

(MP.1, MP.5) 3. Students can use the Internet to research a topic of interest to them, such as horses or baseball. They should choose two variables to investigate, such as age and jumping height of horses, or number of triples and number of home runs of baseball players. They should gather at least 30 data points. Show them how to use software to create a scatter plot. They will then describe the association.

Useful sources of data:
• Census at School is an international online project for students in Grades 4 to 12. Students can complete a brief online survey, analyze their class results, and compare their data with data from random samples of students in the United States and other countries.
• The World Factbook provides information about countries around the world.
• Professional sports associations have sortable stats for each player and team.
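
For the software step in Extension 3, a short script is one possible option. The sketch below is only an illustration and is not part of the AP Book materials. It uses the Python standard library to draw the e-commerce data from the lesson's exercise on a character grid. The function name text_scatter and the grid size (40 columns by 12 rows) are assumptions made for this example; a spreadsheet or graphing program would serve the same purpose.

```python
# E-commerce data from the lesson's exercise (sales in millions of dollars).
years = [2000, 2002, 2004, 2006, 2008, 2010, 2012, 2014]
sales = [7765, 12203, 19588, 30148, 33188, 45006, 59862, 79567]


def text_scatter(xs, ys, width=40, height=12):
    """Return a scatter plot of the points (xs, ys) drawn on a character grid.

    text_scatter is a hypothetical helper written for this illustration.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column or row index; "or 1" avoids dividing
        # by zero when all values on an axis are equal.
        col = round((x - x_min) / ((x_max - x_min) or 1) * (width - 1))
        row = round((y - y_min) / ((y_max - y_min) or 1) * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the plot
    lines = ["|" + "".join(cells) for cells in grid]  # vertical axis
    lines.append("+" + "-" * width)                   # horizontal axis
    return "\n".join(lines)


plot = text_scatter(years, sales)
print(plot)
```

The printed points rise from the bottom left to the top right, which matches the strong positive association given in the answer to the exercise. Students could replace the two lists with the data they gathered and then describe the association they see.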