Chapter 07 7/6/07 9:37 AM Page 289

CHAPTER 7: One-Variable Data

What You’ll Learn
To collect and analyse one-variable data from primary and secondary sources

And Why
People often make business and personal decisions based on data. Collecting data in an unbiased way and analysing data effectively are important life skills.

Key Words
• one-variable data
• discrete data
• continuous data
• categorical data
• distribution
• population
• sample
• bias
• measures of central tendency
• standard deviation

Activate Prior Knowledge

Interpreting Circle Graphs (Prior Knowledge for 7.1)

A circle graph is also known as a pie chart. It displays data by dividing a circle into sectors that represent parts of a whole proportionally.

Example
This circle graph shows the results of a survey on the method of communication with friends used most often by Ontario secondary students.

[Circle graph: Method of Communication. Internet chat or MSN 38%; In person 37%; Telephone 12%; Text message, e-mail 7%; Cell phone 6%]

a) From the graph, which method is the most popular?
b) What percent of students prefer to use a cell phone?
c) The survey was completed by 15 600 students. How many prefer to communicate in person?

Solution
a) Online chat or MSN has the greatest percent, 38%, so it is the most popular.
b) 6% of students prefer to use a cell phone.
c) 37%, or 0.37 of the students surveyed prefer to communicate in person.
0.37 × 15 600 = 5772
So, 5772 students prefer to communicate in person.

✓ Check
1. The same 15 600 students in the Guided Example were also asked this question: “If you had $1000 to give to charity, which type would you choose?” The results are shown in the circle graph.

[Circle graph: Charity Type. Health 31%; Other 21%; International aid 18%; Wildlife/animals 17%; Arts, culture, sports 13%]

a) Which choice is the most popular? What percent of students chose this type of donation?
b) How many students chose to donate to wildlife and animals?
2. A circle graph shows each category of data as a percent of the complete set of data. Why do you think this is a good way to display data? What type of data would not fit in a circle graph?

Bar Graphs and Pictographs (Prior Knowledge for 7.1)

A bar graph has horizontal or vertical bars that represent the data. The graph compares the data in categories, such as the average rainfall of different months. A pictograph is similar to a bar graph, but uses pictures or symbols to compare data.

Example
This table shows the average rainfall of Toronto, to the nearest millimetre, for each month of the year over a period of 40 years.

Month           Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
Rainfall (mm)    47   46   58   66   67   66   72   82   70   63   67   62

a) Draw a bar graph and a pictograph to represent the same data.
b) About how much rain falls in the summer months, July and August?

Solution
a) In the bar graph, the length of a bar represents the amount of rain. In the pictograph, each symbol represents 10 mm of rainfall.

[Bar graph: Average Rainfall in Toronto. Vertical axis: Rainfall (mm), 0 to 90; horizontal axis: Month, Jan to Dec]
[Pictograph: Average Rainfall in Toronto. Horizontal axis: Month, Jan to Dec]

b) Add the average rainfalls for July and August.
About 154 mm of rain falls in July and August.

✓ Check
1. Here is a record of the types of books read by students in English classes during 1 year. Draw a bar graph and a pictograph to represent the data.

Genre        Mystery  Comic  Poetry  Romance  Biography
Books read        22     43      12       54         34

2. Do you prefer to use a bar graph or a pictograph to represent the data in question 1? Explain when you would use a bar graph and when you would use a pictograph.

Organizing Data into Intervals (Prior Knowledge for 7.1)

During a survey, you may collect data that spreads over a wide range.
To display and analyse the data, you need to group the data into appropriate intervals. To group the data, use a suitable range to divide the data into a reasonable number of intervals. Then determine the data that should go into each interval.

Example
The heights of 15 players of the 2006 Toronto Raptors basketball team are listed.

7′ 0″    6′ 10″   6′ 3″    6′ 0″    6′ 9″
6′ 7″    6′ 9″    5′ 11″   7′ 0″    6′ 6″
6′ 7″    6′ 10″   6′ 10″   6′ 5″    6′ 2″

Determine the number of players in each interval:
Under 6′, 6′ – 6′ 5″, 6′ 6″ – 6′ 11″, 7′ and over

Solution
Create a tally chart to help count the number of players in each interval.

Height            Tally        Number of players
Under 6′          |            1
6′ – 6′ 5″        ||||         4
6′ 6″ – 6′ 11″    ||||| |||    8
7′ and over       ||           2

✓ Check
1. In 2005, the populations of the world’s 30 largest cities are:

35 327 000   19 013 000   18 498 000   18 336 000   18 333 000
15 334 000   14 299 000   13 349 000   13 194 000   12 665 000
12 560 000   12 146 000   11 819 000   11 469 000   11 286 000
11 146 000   11 135 000   10 849 000   10 677 000   10 672 000
 9 854 000    9 760 000    9 592 000    9 346 000    8 711 000
 8 180 000    7 615 000    7 594 000    7 352 000    7 182 000

Determine the number of cities in each interval:
Under 10 million, 10 million – 14 999 999, 15 million – 19 999 999, 20 million and over

2. These are the heights of 28 players of the 2006 Toronto Maple Leafs hockey team.

6′ 6″    5′ 11″   6′ 2″    6′ 5″    6′ 1″    6′ 1″    6′ 2″
6′ 7″    6′ 1″    6′ 4″    6′ 4″    6′ 2″    5′ 10″   6′ 1″
6′ 0″    5′ 11″   6′ 0″    6′ 4″    6′ 0″    6′ 1″    6′ 1″
6′ 0″    6′ 5″    5′ 10″   5′ 10″   6′ 0″    5′ 10″   6′ 5″

a) Choose the intervals and organize the heights.
b) Explain how you chose the intervals.
c) Suppose you were organizing the data for the heights of students in your class. What intervals would you use? Explain your choice.

Math in the Media: Be Informed!

Information in newspapers, magazines, and on Internet sites often involves numbers. But not everything you read is unbiased, or even true.
SOME COMMON MISTAKES AND MISLEADING PRACTICES

Misuse of language
• The words average or typical are sometimes used without identifying whether the number used is the mean, median, or mode.
• Survey questions can be worded so results favour one opinion.

Distorted visuals
• In some pictographs or 3-D graphs, the sizes of parts of the graph can make differences between numbers appear greater or less than they are.
• When axes do not start at 0, it is easy to conclude that differences between numbers are greater than they are.

Questionable sources
• Do the data come from a random, unbiased sample?
• When you see the word expert, ask yourself what makes the person an expert. Is he or she an expert in the appropriate field?
• Are the presented data facts or are they opinions? Just because many people believe something does not make it true.

(Try this after you have completed Section 7.6.)
Look at these data. How might they be misleading?

[Figures: Purchasing power of the Canadian dollar, 1980 to 2000; Value of Average Canadian $151,850]

7.1 Organizing and Representing Data

Many people find a visual display of data easier to interpret than a set of numbers. Looking at the shape of a bar graph or the size of the sectors in a circle graph is a way to begin analysing the data.

Investigate: Exploring Shapes of Histograms

(In a histogram, each bar represents the data within an interval. There are no gaps between bars.)
(The interval “9 up to 10” includes 9 and all numbers greater than 9 but less than 10.)

Work with a partner or in a group. You will need grid paper.
Four sets of data are given. Each data set is organized in intervals. Draw a histogram for one data set. Make sure each data set is graphed by at least one group.
Data Set 1: Fuel consumption of mid-size cars, 2006

Fuel consumption rating (L/100 km)    Number of models available
9 up to 10                             9
10 up to 11                           11
11 up to 12                           35
12 up to 13                           23
13 up to 14                           13
14 up to 15                            7

Data Set 2: Heights of high school students

Height (cm)     Number of students
150 up to 155    3
155 up to 160    7
160 up to 165   17
165 up to 170   10
170 up to 175   10
175 up to 180   16
180 up to 185   10
185 up to 190    6

Data Set 3: Ages of cars in a parking lot

Age (years)    Number of cars
0 up to 3      20
3 up to 6      24
6 up to 8      22
9 up to 11     18
12 up to 14    20

Data Set 4: Population of Canada by age, 1994

Age (years)    Population (thousands)
0 up to 19     7930
20 up to 39    9590
40 up to 59    7040
60 up to 79    3920
80 and over     650

➢ Describe the shape of the histogram you graphed.
➢ Which statement best describes the data you graphed? Explain your choice.
• There are two distinct peaks in the data.
• Most of the data are clustered in the middle of the graph.
• The data are evenly distributed across the intervals.
• Most of the data are in the upper intervals.
• Most of the data are in the lower intervals.
What does the statement you chose tell you about the data?

Reflect
Share your graph and analysis with a group that used a different data set. Compare your results. Then discuss this question.
➢ What can the shape of a histogram tell you about the data set that it represents?
➢ How could you predict the shape of the histogram by looking at the data set?
➢ Repeat this with another group that used another data set.

Connect the Ideas

Types of data
One-variable data describe one piece of information about a person, place, or thing. Each piece of one-variable data is one number or word.
Data that involve numbers are called numeric data. Numeric data may be discrete or continuous.
Discrete data consist of values from a countable set of possibilities. Examples of discrete data are the number of siblings a person has, the year a person was born, or the number of courses a person is taking at school.
Continuous data consist of values from a range. A person’s height, the length of time a competitor takes to run a race, or the distance a person commutes to work are examples of continuous data.
Data that are grouped by categories are called categorical data. Examples include the colours of cars in a parking lot, yes or no responses on a questionnaire, or favourite types of music.

Data sets
Sometimes a data set consists of a list of numbers or words.

Favourite Colours of Students in My Class
blue     red      red      black    blue     red      purple
green    green    blue     blue     black    black    red
black    black    blue     yellow   green    blue     purple

Other times, the data are in a table, with tally marks or numbers showing how many pieces of data are in each interval or category. These are called frequency tables.

Favourite Colours of Students in My Class
Colour    Tally      Frequency
black     |||||      5
blue      ||||| |    6
green     |||        3
red       ||||       4
purple    ||         2
yellow    |          1

The number of pieces of data in each interval or category is called the frequency.

Types of graphs
The type of graph you draw depends on the type of data being represented.
Circle graphs and pictographs can represent categorical data or discrete data. Circle graphs show the parts that make up a whole.

[Circle graph: Favourite Colours of People in My Class. Blue 28%; Black 24%; Red 19%; Green 14%; Purple 10%; Yellow 5%]
[Pictograph: Number of Students at Pinewood High with a Driver’s Licence. Rows: Grade 9, Grade 10, Grade 11, Grade 12; each symbol represents 5 students]

Bar graphs can represent categorical or discrete data. A histogram is a type of bar graph that shows numeric data grouped in intervals. There are no gaps between the maximum value of one interval and the minimum value of the next. There is no overlap between the intervals.
So, the bars have no spaces between them.

[Histogram: Household Incomes in Toronto. Vertical axis: Number of households (thousands); horizontal axis: Annual income (thousands of $)]
[Bar graph: Favourite Colours of People in My Class. Horizontal axis: Number of people; categories: Black, Blue, Green, Red, Purple, Yellow]

Distributions
The shape of a histogram shows the distribution of the data. Because some shapes are common, they are given special names.

(Choosing data at random means choosing so that every member of the data set has the same chance of being chosen.)

Uniform Distribution
The data are approximately evenly distributed across the range. If you choose one piece of data at random, you are just as likely to get a low number as a high number.
[Histogram: Time of Birth for All Babies Born in One Day in Canada. Vertical axis: Frequency; horizontal axis: Time of day]

Normal or Bell-Shaped Distribution
Most of the data are clustered in the middle. The graph tails off the farther you are away from the middle. If you choose one piece of data at random, you are likely to get a number near the middle.
[Histogram: 2006 Toronto Marathon Results. Vertical axis: Frequency; horizontal axis: Time (h:min)]

Skewed Left
This is named for the tail on the left. Most of the data are at the high end of the range. The graph tails off to the left. If you choose one piece of data at random, you are more likely to get a high number than a low number.
[Histogram: Vacant Apartments. Vertical axis: Number of apartments; horizontal axis: Monthly rent ($)]

Skewed Right
This is named for the tail on the right. Most of the data are at the low end of the range. The graph tails off to the right. If you choose one piece of data at random, you are more likely to get a low number than a high number.
[Histogram: Household Size in Canada, 2000. Vertical axis: Number of households (millions); horizontal axis: Number of people in household]

Bimodal
The histogram has two distinct peaks. If you choose one piece of data at random, you are likely to get a number from one of the peaks.
[Histogram: Time Between Eruptions of Old Faithful Geyser During 14-Day Period. Vertical axis: Frequency; horizontal axis: Time (min)]

Practice

1. Is each piece of data numeric or categorical? For those that are numeric, identify each as continuous or discrete.
a) A person’s eye colour
b) A person’s age
c) A person’s birth month
d) A person’s mass
e) A person’s height
f) A person’s favourite pet
g) Whether a person agrees with the statement “Hockey is more fun than baseball.”

2. Copy this graphic organizer. Write the words categorical, continuous, data, discrete, and numeric in the appropriate boxes to show the relationship between the types of data.

3. Describe the shape of each histogram. Then name the distribution that each graph matches the most.
a) [Histogram: Toronto Blue Jays Batting Averages, 2006. Vertical axis: Number of players; horizontal axis: Batting average]
b) [Histogram: Time to Get to School. Vertical axis: Number of students; horizontal axis: Time (min)]

Circle graphs are often used to display categorical data. If the data are provided in a list, you need to make a frequency table before graphing.

Example
These data show 24 high school students’ answers to the question: “What are your plans after graduation?” Display the data in a circle graph.
(You need a compass and a protractor to draw a circle graph.)
college      work         college      college      college      work
university   college      university   college      work         work
college      work         college      university   college      college
university   work         college      college      college      college

Solution
First, make a frequency table for the data.

Destination    Tally                 Frequency (number of students)
College        ||||| ||||| ||||      14
University     ||||                   4
Work           ||||| |                6

(The sectors in a circle graph proportionally represent parts of a whole. When the data are provided as fractions or percents, begin by determining each sector angle.)

Determine the fraction of students that selected each destination. Multiply each fraction by 360° to determine the sector angle.

14 out of 24 students responded “college”; 14/24 × 360° = 210°
So, the sector representing college will have an angle of 210°.
4 out of 24 students responded “university”; 4/24 × 360° = 60°
So, the sector representing university will have an angle of 60°.
The rest of the circle will represent the response “work.” You can check the angle by calculating.
6 out of 24 students responded “work”; 6/24 × 360° = 90°
So, the sector representing work will have an angle of 90°.

Draw the circle graph. Label the sectors, and colour each sector a different colour.

[Circle graph: Planned Destinations of 24 High School Students. College 58%; Work 25%; University 17%]

4. A survey of how high school students get to school had these results: car 37.5%, public transit 37.5%, walk 17.5%, cycle 7.5%
a) Draw a circle graph to illustrate the data.
b) Write a question someone could answer using the graph.

5. These data show the sizes of T-shirts sold by a band at a concert.
(S, M, L, and XL represent small, medium, large, and extra large.)

S    XL   XL   XL   L    L    M    XL   XL   L
S    L    XL   XL   XL   M    L    L    S    XL
L    XL   L    XL   L    XL   S    XL   XL   M

a) Make a frequency table for the data.
b) Draw a circle graph to illustrate the data.
c) How could the graph help the band when it orders T-shirts for the next tour?

6. The table shows the prices of houses for sale in Thunder Bay in January 2007.

House price ($)              Number for sale
50 000 up to 100 000          5
100 000 up to 150 000         9
150 000 up to 200 000        12
200 000 up to 250 000        17
250 000 up to 300 000        10
300 000 up to 350 000         4
350 000 up to 400 000         2
400 000 up to 450 000         1

a) Are house prices discrete or continuous data? Explain.
b) Predict the shape of a histogram for these data. Explain your prediction.
c) Draw a histogram for these data. How does the shape compare to your prediction in part b?

7. These are the birth months of a group of people.

Jul  Oct  Sep  Sep  Oct  Jul  Aug  Jun  Mar  Mar
Jul  Aug  Apr  May  Feb  Dec  Jul  Dec  Sep  Sep
May  May  Jan  Jun  Jul  Apr  Aug  May  Mar  Oct

a) Are the data discrete, continuous, or categorical? How do you know?
b) Make a frequency table for the data.
c) Draw a circle graph, bar graph, pictograph, or histogram. Explain how you decided which type of graph to draw.
d) Write a question someone could answer using the graph.

8. Assessment Focus These are daily high temperatures in Waterloo for May in a recent year. Each temperature is recorded in degrees Celsius (°C).

19.0  19.8  23.3  21.1  15.2   9.9  17.2  21.7
21.2  23.9  16.3  13.6  17.1  16.1  15.3  14.8
19.0  11.9  12.8  15.0  10.7   7.8  16.1  22.3
23.5  18.9  24.7  27.6  32.7  31.4  27.9

a) Are temperatures discrete or continuous data? Explain.
b) What are the greatest and least temperatures in the data set?
c) Make a frequency table for the data. Use at least 6 intervals. Explain how you chose the intervals.
d) Graph the data. Explain how you decided which type of graph to draw. Describe the shape of the graph.
e) Suppose you could graph similar data for July or November. How do you think each graph might compare to this one? Explain your thinking.

9. Would you use a circle graph, bar graph, histogram, or pictograph to display data about each topic? Explain your choice.
a) The prices of houses for sale in your town or neighbourhood
b) The most popular car colours in the world
c) The per capita carbon dioxide emissions in different countries
d) The hours of television watched each week by Canadians of different ages
e) The amount of Ontario’s electricity generated from nuclear, coal, hydro, natural gas, and renewable sources

10. Take it Further Suppose you collected data about each topic and drew a histogram. Which distribution would you expect the data to have? Explain your thinking.
a) A class set of marks out of 100 on a test
b) The foot lengths of a group of females
c) The foot lengths of a mixed group of males and females
d) The numbers of people in the families of students in your class

Explain how the type of graph you draw depends on the type of data you are representing. Include examples in your explanation.

7.2 Organizing and Representing Data Using Technology

Statisticians deal with large amounts of data every day. They use technology to organize and display these data. A common tool for organizing data is a spreadsheet. Spreadsheets allow you to graph the same data set in many different ways.

Inquire: Organizing and Representing Data Using a Spreadsheet

You will need Microsoft Excel. Work with a partner.
A school council is organizing extra-curricular clubs. To find out about students’ interests, the council conducted a survey about the number of hours each week the students spend doing certain activities.
• Using the Internet or playing computer games
• Watching TV
• Reading
• Volunteering in the community
• Playing sports
• Spending time with friends
• Other

(In Excel and many other spreadsheet programs, a graph is called a chart.)

The council randomly selected 26 students for the survey.
Open the file Freetime.xls. The spreadsheet contains the results of the survey. Each row contains the responses of one student.
The columns contain different types of one-variable data. Below the survey data, there are four summary tables, which you will graph in Excel.

1. The summary table with title Frequency of favourite activities shows the number of students who spent the greatest amount of their free time on each of the activities in the survey.
a) Are favourite activities categorical, discrete, or continuous data? Justify your answer.
b) Hold down the mouse button while you drag over the data in the table, except for the Total row. This selects the data. From the Insert menu, select Chart. Click Finish to insert the graph. (You may need to drag the graph so that it does not cover the table.)
What kind of graph appears on the screen? What is the most common favourite activity? How does the graph show this? How does the table show it?
c) Right-click on the blank area around the graph. From the menu that appears, select Chart Type. In the Chart Type box, select Pie. Click OK. What kind of graph is shown now?
With the chart selected, select Chart Options from the Chart menu. Click on the Data Labels tab, then check off Value and Percentage. Click OK. What does this graph show more clearly than the previous graph?
d) Right-click on the blank area around the graph. From the menu that appears, select Chart Type. What other types of graphs would be appropriate for these data? Choose one of these types and graph the data. Does the new graph provide any new information? Explain your thinking.
e) Which graph do you think best represented the data? Explain your choice.

2. The summary table with title Distribution of free hours shows the number of students with different amounts of free time each week.
a) Are the amounts of free time categorical, discrete, or continuous data? Justify your answer.
b) Select the data in the table, except for the Total row. From the Insert menu, select Chart. Click Finish to insert the graph. What kind of graph appears? Describe the distribution of the data. Why might this happen?
c) Right-click on the blank area around the graph. From the menu that appears, select Chart Type. What other types of graphs would be appropriate for these data? Explain your thinking. Choose one of these types and graph the data. Does the new graph provide any new information? Explain.
d) Which graph do you think better represents the data? How is it better?

3. The summary table with title Distribution of number of activities shows the number of students who spend time doing different numbers of the activities in the survey.
a) Are the numbers of activities categorical, discrete, or continuous data? Justify your answer.
b) Choose a graph that you think will represent the data best. Explain your choice. Graph the data. Do most of the students spend time on many different activities or only a few? Explain how you know.

4. The summary table with title Total time spent on each activity shows the total amount of time spent by all students doing each activity.
a) Are the total amounts of time categorical, discrete, or continuous data? Justify your answer.
b) Choose a graph that you think will represent the data best. Explain your choice. Graph the data. What is the most popular activity? How does the graph show this?
c) Graph the data from question 1 about the frequency of favourite activities the same way. Do the two graphs give the same information about the popularity of different activities? Explain your answer.

5. What clubs do you think the school council should organize? Explain how to use your graphs to convince the council that you are correct.

Practice

1.
The Canadian Union of Farmers warns of a crisis in rural Canada due to the decreasing numbers of family farms. A family farm is a farm that is owned and operated by one family. Its operating costs are generally less than those of a large farm run as an agribusiness or a collective.
This table shows the number of farms in Ontario with different operating costs in 1996 and 2001.

Operating costs               1996      2001
Under $50 000                 1329       402
$50 000 – $99 999             2427      1164
$100 000 – $199 999         11 151      6794
$200 000 – $349 999         17 962    13 791
$350 000 – $499 999         10 770      9453
$500 000 – $999 999         14 857    15 060
$1 000 000 – $1 499 999       4530      5698
$1 500 000 and over           4494      7366

a) Are operating costs categorical, discrete, or continuous data?
Enter the data into a spreadsheet or load the spreadsheet file Farms.xls.
b) Use a spreadsheet to make a bar graph for the data for each year.
c) Compare the bar graphs for the two years. How are the shapes different? What does your answer tell you about how farms in Ontario changed from 1996 to 2001? Do the graphs support the argument that the number of family farms is decreasing? Justify your answer.
d) Change the two bar graphs to another type of graph and compare them. How does the information this graph shows compare with your bar graph? Explain the reason for your choice of the type of graph.

2. In 2006, members of the Students’ Assembly on Political Reform met to discuss changing the way members of provincial parliament (MPPs) are elected.
The first table shows the numbers of winning candidates in the 2003 election who received different percents of the votes in their ridings. For example, 3 MPPs won with between 35% and 40% of the votes in their ridings. The second table shows the number of MPPs elected from each political party in 2003.
a) Are the percent of votes categorical, discrete, or continuous data?
b) What kind of data are the political parties?
Enter the data into a spreadsheet or load the spreadsheet file Election.xls.
c) Use a spreadsheet to graph the percent votes data. Explain your choice of graph. What does the shape of the graph tell you about the data?
d) About what percent of the MPPs won with less than 50% of the votes? Explain your strategy for answering this question.
e) What do you know about the votes received by the other parties in the ridings represented by MPPs who received less than 50% of the votes?
f) Use a spreadsheet to graph the party totals data. Explain your choice of graph. About what fraction of the MPPs elected came from each of the three parties?
g) In 2007, the Students’ Assembly on Political Reform recommended that each party’s share of the vote determine the party’s share of the seats in parliament. Do you agree with this idea? Use these data and graphs to support your answer.

Reflect
➢ Is it better to use technology to graph some data sets? Explain using examples from this section.
➢ How can a graph help you interpret data? Explain using examples from this section.

7.3 Sampling Techniques

One way to determine the population of fish or reptiles in a habitat would be to count all of them. This is usually not possible and could harm some species. So, biologists use a mark-and-recapture technique. The biologist catches some animals, marks them in a non-destructive way, then releases them. Later, another sample is caught. The ratio of marked animals to all the animals in the second sample should be approximately the same as the ratio in the total population.

Investigate: Estimating Using Mark-and-Recapture

Work with a partner or group to simulate a mark-and-recapture experiment.
You will be given a bag containing between 30 and 50 slips of paper.
➢ Without looking, reach into the bag and take some slips of paper. Mark an X on each one. Record the number of slips drawn, then return them to the bag.
➢ Shake the bag, then take a handful of slips from it. Count the number of marked slips and the total number of slips in the handful.
➢ Estimate the total number of slips of paper in the bag. Explain how you made your estimate.
➢ Return the slips. Draw and estimate a few more times.

Reflect
➢ Use your estimates from each draw. What is your best prediction for the number of slips in the bag? Explain your thinking.
➢ Take out all the slips from the bag and count them. How accurate was your prediction? Explain your thinking.

Connect the Ideas

Population and sample
The population of a city is all the people who live in it. Similarly, the population of any set is all the objects in the set. The members of the population are called individuals.
Collecting data about every individual in a population is called a census. Conducting a census can be costly and time consuming. It may even be physically impossible. In product testing, items may be damaged by the testing, so a census is impractical.
Usually, data are collected for a smaller set of individuals selected from the population. This is called a sample. If a sample is not typical of the population it represents, it is called biased. A good sample should be of a suitable size compared to the population size and as unbiased as possible.

Random samples
In all forms of random sampling, every individual in the population has the same likelihood of being chosen. There are different random sampling techniques.

Simple random sampling
The sampling you did in Investigate was simple random sampling. Individuals are randomly chosen from the entire population.

Stratified sampling
Data sets are grouped before sampling. Data can be grouped based on a characteristic such as income or location. A few individuals from each group are then chosen at random. Data are collected for these individuals.
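As a rough illustration of the two techniques just described, the sketch below draws a simple random sample and a stratified sample using Python's standard random module. The function names, the student labels, and the grade groupings are all made up for illustration; they do not come from the text.

```python
import random


def simple_random_sample(population, size, seed=None):
    """Choose `size` individuals at random from the entire population."""
    rng = random.Random(seed)
    return rng.sample(population, size)


def stratified_sample(groups, per_group, seed=None):
    """Group first, then choose a few individuals at random from each group.

    `groups` maps a group label (here, a grade) to the list of its members.
    """
    rng = random.Random(seed)
    return {label: rng.sample(members, per_group)
            for label, members in groups.items()}


# Hypothetical population: 40 students, 10 in each of 4 grades.
groups = {grade: [f"G{grade}-student{i}" for i in range(1, 11)]
          for grade in (9, 10, 11, 12)}
population = [student for members in groups.values() for student in members]

print(simple_random_sample(population, 8, seed=1))
print(stratified_sample(groups, 2, seed=1))
```

A simple random sample of 8 students might, by chance, include several students from one grade and none from another. The stratified version always includes 2 students from every grade.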
Cluster sampling The population is grouped so that each group is representative of the whole population. Groups are chosen at random, and data are collected for every individual in the selected groups. 310 CHAPTER 7: One-Variable Data Chapter 07 7/6/07 9:40 AM Page 311 Home Quit +10 +10 +10 Systematic sampling Every nth individual from a 1 2 3 4 5 6 7 8 9 10 11 12 13 14 … 24 … 34 … population is chosen. For example, to sample every 10th name on a list of names, you could randomly begin with the 4th name, then continue with the 14th, 24th, and so on. Other types of samples There are other, non-random sampling techniques. For each of these techniques, every individual in the population does not have the same likelihood of being chosen. Convenience sampling Only individuals who are easy to sample are chosen. During elections, researchers often ask people leaving polling stations, “Who did you vote for?” These exit polls are a convenience sample. Judgement sampling The person doing the sampling chooses a sample based on her or his knowledge of the population. The sample chosen may not be representative of the population if the person is biased in her or his selection. Voluntary sampling Only individuals who volunteer to participate are included in the sample. Phone-in polls used by television and radio programs are examples of voluntary sampling. Suppose a downtown merchant’s association wants to determine people’s opinions on parking availability in the city. The association could conduct a survey using one of these sampling techniques: • Leave questionnaires in various locations in town for people to pick up and fill in if they wish. • Use a random number generator to select names from the phone book. • Choose random pages from the phone book and sample every household on the pages. • Begin with the 45th name in the phone book and choose every 100th name after that: 45, 145, 245, 345, . . . 
• Randomly choose people from each neighbourhood in the city, making the sample size in each neighbourhood proportional to the number of people who live in the neighbourhood.
• Survey people who are shopping downtown.

Practice

1. Describe two reasons people collect data from a sample rather than the population.

2. Refer to the parking survey described in Connect the Ideas.
a) What is the population?
b) Use the sampling techniques described in Connect the Ideas. Decide which technique each survey uses. Give reasons for your choices.
c) Which samples might be biased? Explain your thinking.
d) Which sample would you recommend the association uses for its survey? Justify your choice.
e) Suppose the association hopes to convince the city to provide more parking spaces. Would your answer to part d change? Explain your thinking.

3. For each situation, identify the population. Recommend whether to collect data from a sample or the population. Give reasons for your recommendation.
a) The capacity of a battery is the number of hours it will work at a particular rate of current. A battery manufacturer wants to test the batteries produced each day to ensure that they have an appropriate capacity.
b) A student wants to determine the sport that is most popular among her classmates.
c) An environmental group wants to determine people’s opinions about pesticide use in a city.
d) A college placement officer wants to survey drug companies in his province to determine which companies will hire students on work terms.
e) The student council wants to determine which local band students would pay to see at a school concert.

4. For each situation in question 3 for which you recommended using a sample, suggest a sampling technique. Identify the type of sample and your reasons for suggesting it.
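As a rough illustration of systematic sampling, the sketch below picks every 10th name from a list, beginning with the 4th name, as in the example from Connect the Ideas. The list of 50 placeholder names and the function name are assumptions made for the example. In practice the starting position would itself be chosen at random.

```python
def systematic_sample(items, start, step):
    """Return every step-th item, beginning with the start-th item.

    Positions are counted from 1, the way names on a printed list are counted.
    """
    return items[start - 1::step]

# Hypothetical list of 50 names.
names = [f"Name {i}" for i in range(1, 51)]

sample = systematic_sample(names, start=4, step=10)
print(sample)  # the 4th, 14th, 24th, 34th, and 44th names
```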
Selecting a sample can sometimes involve several steps. Each step should be as unbiased as possible.

Example
A company plans to survey Ontarians’ attitudes toward sport utility vehicles and trucks. To ensure it gets a representative view, the company wants every person in the province to have the same chance of being selected. It decides to use a stratified sample. The company selects 3 cities at random, then randomly selects 200 people from each city. The cities selected are:
• Guelph, population 126 000
• Peterborough, population 75 000
• Windsor, population 208 000
a) What is wrong with selecting 200 people from each city?
b) What else might bias the results of this survey? How could this be corrected?

Solution
a) The cities’ populations are different, so the same number of people should not be selected in each city. To determine how many people to select for each city:
• calculate the fraction of the total population for each city; then
• multiply by the total number of people to be sampled, in this case 600.

City            Population    Fraction of total population     Sample size (fraction of total × 600)
Guelph          126 000       126 000/409 000, or 0.308        0.308 × 600 ≈ 185
Peterborough     75 000        75 000/409 000, or 0.183        0.183 × 600 ≈ 110
Windsor         208 000       208 000/409 000, or 0.509        0.509 × 600 ≈ 305
TOTAL           409 000                                        600

The company should have selected 185 people in Guelph, 110 people in Peterborough, and 305 people in Windsor.

b) Only people who live in cities were included in this sample. According to government statistics, the population of Ontario is about 85% urban and 15% rural. So, there are almost 6 times as many urban residents as rural. Since 600 urban residents are surveyed, it would reduce the bias to randomly survey 100 rural residents as well.

5. Suppose you want to sample 50 students from a high school.
For each situation below, calculate how many students should be sampled from each grade so that the numbers in the sample are proportional to the number of students in each grade.
a) The school has 220 Grade 9 students, 180 Grade 10 students, 160 Grade 11 students, and 190 Grade 12 students.
b) Use the enrollment numbers from your school or estimates, if the numbers are not available.

6. Assessment Focus Students in a social studies class are to write a biography of a person selected from a list of names. The names are organized in four groups.

Alexander Graham Bell, Marie Curie, Thomas Edison, Albert Einstein, Isaac Newton
Salvador Dali, Frida Kahlo, Pablo Picasso, Henri Matisse, Andy Warhol
Margaret Atwood, Robertson Davies, Margaret Laurence, Rohinton Mistry, Alice Munro
Norman Bethune, Tommy Douglas, Nellie McClung, Louis Riel, Tecumseh

a) How many individuals are in the total population?
b) How many individuals are in each group? How do the groups appear to be organized?
c) Suppose the teacher decides to reduce the choices to 8 names. Describe how he could do this using a simple random sample, a stratified random sample, and a judgement sample.
d) Write each name on a slip of paper. Try your ideas from part c, using the slips of paper. Record the 8 names in your sample each time. Which technique do you think ensures the most variety for students? Explain your thinking.

7. Take it Further Explain why the groups of names as arranged in question 6 are not suitable for a cluster sample. Explain how to reorganize the groups so a cluster sample could be used.

Explain the difference between a sample and a population. Include examples from your school or community in which you might collect data from a sample and from a population.

7.4 Designing and Using a Questionnaire

Suppose you need data about the opinions and habits of people in your school or community.
You will not usually find these data by researching in published sources of data. You may need to collect your own data.

Inquire
Collecting Data with a Questionnaire
Work with a partner or in a group.

1. Investigating what makes a good question
➢ Questions can often be asked in an open way or with choices that help organize the responses. Which type of response in these examples do you think will be easier to organize and use? Explain your thinking.

Open: How much did you spend on entertainment last week? _________
With choices: How much did you spend on entertainment last week? Less than $16 ____ $16–$30 ____ over $30 _____

Open: What is your favourite subject? __________
With choices: Which is your favourite subject? (choose one) Math ____ Science _____ English ______ Tech ____ Other _______ Don’t have one ____

Open: Should school cafeterias be banned from selling certain foods? If so, which foods? ______________________________
With choices: Foods such as pop, chips, and French fries should be banned in school cafeterias. Strongly disagree ______ Disagree _____ Agree ____ Strongly agree ____ No opinion _____

➢ Try to avoid questions that may bias the results of a survey or provide vague responses. Here are some ways in which questions can be flawed. A flawed question may:
• Lead people to respond in a particular way because of the wording.
• Lead people to respond in a particular way by not providing enough information or alternatives.
• Ask about something that is too general.
• Ask several things at once without allowing for this in the answers.

➢ Identify the better question in each pair. Explain what is wrong with the question you do not choose.

Pair 1
• Police officers, who perform vital services in our community, should receive a pay raise. Agree _____ Disagree _____
• Which statement best describes your opinion about police salaries?
  Police salaries are lower than they should be.
  Police salaries are higher than they should be.
  Police salaries are appropriate.
  Don’t know, no opinion

Pair 2
• Have you purchased dog food in the last 3 months? Yes No
  If yes, what type was it? Canned ______ Dry ______ Both ________
• Have you ever purchased canned or dried dog food? Yes No Unsure

Pair 3
• Do you think the minimum wage in Ontario should be raised to $10/h? Yes No Don’t know, no opinion
• Do you support the efforts of Ontario anti-poverty groups to raise the minimum wage to $10/h? Yes No Don’t know, no opinion

2. Designing the questionnaire
➢ Choose a topic. If possible, choose a topic where the information you collect might help to achieve some goal or to make a decision.
• Suppose you were interested in starting an environmental club. You could ask questions that would help you decide whether people would join the club and what days and times would be best for meetings.
• Or, suppose the student council is planning a fund-raising dance. You could ask questions that would help determine the types of music people would like to hear and how much they would pay for a ticket.
➢ Write the questions. Keep these guidelines in mind:
• Make the questions short and easy to understand.
• Respect the respondents’ privacy: do not ask questions that are too personal.
• Avoid questions that may offend or provoke an emotional response. If people refuse to participate because they are offended, it may bias your results.
• Avoid biased questions.
• Include a few questions that collect demographic data such as age, gender, and grade.
➢ Organize your questions in a logical way.
➢ Write a brief introduction that explains the survey and how you will use the results. Your introduction should grab the attention of potential respondents and encourage their participation. Make it clear that the responses will be anonymous.
➢ Test your questionnaire on people in another group to check that the questions are clear.
Change any questions that were misunderstood or caused confusion.

3. Collecting the data
➢ Use the questionnaire you have written.
➢ What is the population you hope to survey? Use your knowledge of sampling techniques to plan how to select people for your survey. Write a brief explanation of your strategy.
➢ Decide how you will conduct the survey:
• Will you meet with each respondent, ask the questions orally, and record the answers yourself?
• Will you hand out the questionnaires for respondents to complete independently?
What are some benefits and drawbacks of each approach?
➢ Carry out your survey.

4. Displaying and analysing the data
➢ Decide which types of graphs are most appropriate to represent the data you have collected. Create a visual display to represent the data, either with paper and pencil, or with a spreadsheet.
➢ Make your decision about the idea that inspired the survey. If you need more data, explain what you have learned so far and what steps you could follow to obtain more data.

Reflect
➢ Were you able to get data for an appropriate sample? If not, how did this affect your results?
➢ Were the results what you expected or hoped for? Explain.
➢ How could you improve your questionnaire?

Mid-Chapter Review

7.1
1. Is each type of data numeric or categorical? Identify data that are numeric as continuous or discrete.
a) Age
b) Zodiac sign
c) Marital status
d) Height
e) Income
f) Gender

2. Ioana is shopping for a used car.
a) List two categorical choices she might consider.
b) List two numeric choices she might consider.

3. a) Make a frequency table for this set of data. Explain how you chose the intervals.
Scores in a golf tournament
281 272 269 278 273 277 282
283 292 269 277 278 280 275
284 288 274 295 296 283 300
289 296 295 294 301 306 299
b) Draw a histogram to display the data. Describe the distribution.

7.1, 7.2
4. These data show how students answered the question “What is your favourite season?”
Summer 310
Winter 125
Spring 28
Fall 12
a) Graph the data. Explain how you chose which type of graph to draw.
b) Write a question you could answer using your graph.

7.3
5. For each situation, identify the population. Recommend whether to collect data from a sample or the entire population. Explain your thinking.
a) A radio station wants to determine what type of music it should play to attract 18- to 24-year-old listeners.
b) A company wants to test the quality of the fuses it makes.
c) A teacher wants to determine which of two field trips her class prefers.

6. Thom is conducting a survey about school library use. What sampling technique is he using in each case?
a) He asks the first 30 people entering the cafeteria at lunchtime each day for a week.
b) He leaves questionnaires in the library and cafeteria.
c) He chooses one class in each grade and asks students in those classes.
d) He randomly selects 10% of the people in each grade to ask.

7. In question 6, which sample would you recommend Thom use? Explain your thinking.

7.4
8. Write a survey question about each topic. Include several intervals or category choices as answers for each question.
a) Favourite sport
b) Hours spent participating in sports each week
c) Hours spent doing volunteer work each week
d) A topic of your choice

7.5 Measures of Central Tendency and Spread

Quality control technicians use measures of central tendency and measures of spread to analyse and compare data and to make predictions.

Investigate
Determining Mode, Mean, and Median
Work with a partner. You will need a scientific calculator.

The mode of a set of data is the number that occurs most often.
The mean of a set of data is the number you get if you divide the total evenly amongst the set of numbers.

The annual salaries of employees at two small companies are shown.
Company A salaries ($)
Company B salaries ($)
20 000 25 000 30 000 40 000 25 000 25 000 40 000 40 000 25 000 35 000 50 000 50 000 50 000 85 000 60 000 80 000 130 000

The median of a set of data is a number such that, when the data are arranged in order, half the data are above the number and half are below.

➢ What is the difference between the highest and lowest salaries at each company?
➢ What is the mode salary at each company?
➢ What is the median salary at each company? How did you determine each median?
➢ What is the mean salary at each company? How did you determine each mean?
➢ You are asked to describe a typical salary at each company. Would you use the mode, median, mean, or some other value? Explain.

Reflect
➢ Compare your strategy for determining the median salary at Company A with your strategy for Company B. How are the strategies the same? How are they different? Why?
➢ Suppose you are offered an entry-level job at each company. The job duties and benefits are similar. The salaries fit these data. Which job would you take? Give reasons for your choice.

Connect the Ideas

Measures of central tendency
The mode, mean, and median are measures of central tendency for a data set. A measure of central tendency is sometimes called an average.

Mode
The mode is the number that occurs most often. There may be no mode or there may be more than one mode.
The mode is usually the best measure when the data represent measures such as shoe sizes or other clothing sizes.
At a recent concert, J.J. sold these sizes of band T-shirts: 21 small, 16 medium, 50 large, and 14 extra-large.
J.J. sold more large T-shirts than any other size. The mode size sold that night was large.

Mean
To determine the mean, add the numbers, then divide the sum by the number of numbers.
Lila recently bought 5 CDs that cost: $14.95, $9.99, $9.99, $13.95, and $12.95
The total cost was: $14.95 + $9.99 + $9.99 + $13.95 + $12.95 = $61.83
$61.83 ÷ 5 ≈ $12.37
The mean price Lila paid was about $12.37.
The mean is usually the best measure when no data in the set are significantly different from the other numbers.

Median
To determine the median, arrange the numbers in order. The median is the middle number. For an even number of numbers, the median is the mean of the two middle numbers.
The median is usually the best measure when data in the set are significantly different.
Walid is a real estate agent. Arranged from least to greatest, the prices of the last 6 houses he sold were: $185 500, $194 900, $219 900, $245 000, $259 900, and $749 500
The 2 middle prices are: $219 900 and $245 000
The mean of these prices is: ($219 900 + $245 000) ÷ 2 = $232 450
The median price of these houses was $232 450.

Measures of spread
The measures of spread are the range and the standard deviation.
The range is the difference between the greatest number and the least number in a data set.
The standard deviation tells how widely spread around the mean the data in a set are. If the data points are all close to the mean, then the standard deviation is close to 0.
We use measures of spread to compare 2 or more sets of data.

Petra scored 5 goals in 1 game, 2 goals in each of 4 games, 1 goal in each of 4 games, and no goals in 1 game. Her teammate Hasieba scored 3 goals in each of 2 games, 2 goals in each of 3 games, and 1 goal in each of 5 games. Based on these games, which player is the more consistent goal scorer?

Calculate the mean number of goals scored per game for each player.
Determine the mean
                                   Petra                      Hasieba
Total number of goals              5 + (2 × 4) + 4 = 17       (3 × 2) + (2 × 3) + 5 = 17
Total number of games              10                         10
Mean (total number of goals ÷
total number of games)             17 ÷ 10 = 1.7              17 ÷ 10 = 1.7

The mean is the same for each girl. To compare their consistency, we need to calculate the range and standard deviation.

Determine the range
                                   Petra                      Hasieba
Greatest number of goals           5                          3
Least number of goals              0                          1
Range (greatest number −
least number)                      5 − 0 = 5                  3 − 1 = 2

Calculate the standard deviation
To calculate the standard deviation:
• Calculate the mean.
• Subtract the mean from each data value.
• Square each difference.
• Add the squared numbers.
• Divide the sum by one less than the number of data items.
• Determine the square root of the result.

When you have data for a sample, divide by 1 less than the number of items. When you have data for an entire population, divide by the number of items.

Organize the calculations in charts. Each mean is 1.7.

Petra
Each data value    Data value − mean    Square of difference
5                  5 − 1.7 = 3.3        10.89
2                  2 − 1.7 = 0.3        0.09
2                  2 − 1.7 = 0.3        0.09
2                  2 − 1.7 = 0.3        0.09
2                  2 − 1.7 = 0.3        0.09
1                  1 − 1.7 = −0.7       0.49
1                  1 − 1.7 = −0.7       0.49
1                  1 − 1.7 = −0.7       0.49
1                  1 − 1.7 = −0.7       0.49
0                  0 − 1.7 = −1.7       2.89

Hasieba
Each data value    Data value − mean    Square of difference
3                  3 − 1.7 = 1.3        1.69
3                  3 − 1.7 = 1.3        1.69
2                  2 − 1.7 = 0.3        0.09
2                  2 − 1.7 = 0.3        0.09
2                  2 − 1.7 = 0.3        0.09
1                  1 − 1.7 = −0.7       0.49
1                  1 − 1.7 = −0.7       0.49
1                  1 − 1.7 = −0.7       0.49
1                  1 − 1.7 = −0.7       0.49
1                  1 − 1.7 = −0.7       0.49

Because these data are sample results for part of the hockey season, divide by n − 1.

For Petra
Sum of squared differences: 10.89 + (4 × 0.09) + (4 × 0.49) + 2.89 = 16.1
Sum divided by 1 less than the number of data items: 16.1 ÷ (10 − 1) = 16.1 ÷ 9
Take the square root: √(16.1 ÷ 9) ≈ 1.3375
So, the standard deviation for Petra’s goals is about 1.3.
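The steps listed above can be checked with a short Python sketch. It follows the same six steps for a sample (dividing by one less than the number of data items) and compares the result with `statistics.stdev` from Python's standard library, which also uses the sample formula. The function name `sample_std_dev` is made up for this illustration.

```python
import math
import statistics

def sample_std_dev(data):
    """Standard deviation of a sample, following the steps in the text."""
    mean = sum(data) / len(data)                     # calculate the mean
    squared_diffs = [(x - mean) ** 2 for x in data]  # subtract the mean, then square
    total = sum(squared_diffs)                       # add the squared numbers
    return math.sqrt(total / (len(data) - 1))        # divide by n - 1, take the square root

# Petra's goals in each of her 10 games
petra = [5, 2, 2, 2, 2, 1, 1, 1, 1, 0]

print(round(sample_std_dev(petra), 4))    # step-by-step result
print(round(statistics.stdev(petra), 4))  # library result, for comparison
```

Both lines should agree with the value of about 1.3375 found above. For data about an entire population, `statistics.pstdev` divides by the number of items instead.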
For Hasieba
Sum of squared differences: (2 × 1.69) + (3 × 0.09) + (5 × 0.49) = 6.1
Sum divided by 1 less than the number of data items: 6.1 ÷ (10 − 1) = 6.1 ÷ 9
Take the square root: √(6.1 ÷ 9) ≈ 0.8233
So, the standard deviation for Hasieba’s goals is about 0.8.

Interpret the measures of spread
Hasieba has the smaller range and lower standard deviation. The spread of the data for Hasieba is smaller. The data are closer to the mean. This means Hasieba is the more consistent goal scorer.

You can use a graphing calculator to determine the mean and standard deviation of a data set. For example, to determine the standard deviation for Petra’s goals, follow these steps on a TI-83 or TI-84 graphing calculator.
• Press 2nd { to begin a list. Enter each number of goals, using commas to separate the numbers. Then press 2nd } STO→ 2nd 1 ENTER. The data are now stored in L1 for the next step in the calculations.
• To display the statistical calculation menu, press STAT ▶. Because you are analysing one-variable data, you will use the first set of calculations. Press 1.
• Press 2nd 1 ENTER. A list of statistical data about the numbers in L1 is displayed.
  x̄ is the mean
  Sx is the standard deviation for a sample
  σx is the standard deviation for a population
The standard deviation for Petra’s goals is about 1.3.

Practice
You may use a graphing calculator to calculate measures.

1. Compare the mode, mean, and median of each data set. Which measure best represents the data? Give reasons for your choice.
a) Cost of tickets: $5, $6, $2, $4, $6, $5, $5
b) Number of prizes in a package: 7, 2, 8, 4, 0, 9
c) Lengths of timber rattlesnakes: 82 cm, 90 cm, 150 cm, 112 cm, 184 cm

2. Determine the mean, range, and standard deviation for each set of data. Explain what each measure of spread tells about the data.
a) Number of points scored in some games: 5, 2, 1, 10, 12, 8, 4, 7, 3, 14
b) Number of games won in a tournament: 7, 7, 6, 7, 5, 6

3. The mean annual temperature in Windsor, Ontario is about 9.4°C. The temperature range is about −25°C to 35°C. The mean annual temperature in Edinburgh, Scotland is about 8.3°C. The temperature range is about 0°C to 20°C. Which city would you say has the milder climate? Justify your answer.

4. Use the data from Connect the Ideas about Hasieba’s goals. Use a graphing calculator to determine the standard deviation.

5. Astrid recorded the prices of gas at a station near her school.
78.3¢ 72.4¢ 76.6¢ 79.3¢
Gabe recorded the prices of gas at a station near his home.
71.9¢ 76.3¢ 71.2¢ 74.6¢ 78.3¢ 76.3¢
They calculated measures of central tendency and measures of spread.

                      Astrid’s data    Gabe’s data
Mode                  no mode          76.3¢
Mean                  76.65¢           about 74.77¢
Median                77.45¢           75.45¢
Range                 6.9¢             7.1¢
Standard deviation    about 3.04¢      about 2.76¢

Use the measures of central tendency and spread to compare the cost of gas at these stations.

6. Choose two measures in Astrid and Gabe’s calculations in question 5. Explain how you can estimate to show whether they are reasonable.

7. Assessment Focus A coach is taking members of the high school cross-country team to OFSAA in Ottawa on October 27th. He researched minimum daily temperatures in previous years.

        Oct 25    Oct 26    Oct 27    Oct 28
2006    5°C       4°C       3°C       4°C
2004    6°C       8°C       6°C       4°C
2001    11°C      6°C       4°C       2°C

a) Determine the measures of central tendency for each year.
b) Determine the measures of spread for each year.
c) Which year has the least standard deviation? How could you predict this by looking at the data?
d) Why do you think the coach would research temperatures for more than one year?

Data are sometimes provided in a frequency table or histogram.
Example
A company is testing two egg carton designs to see which could better withstand a drop from a specified height. The results are shown in the table.

Broken eggs                    0    1    2    3    4    5    6
Carton A: Number of cartons    2   12   22   28   25    8    3
Carton B: Number of cartons    0    5   27   36   28    3    1

a) Without calculating, which appears to be the better carton? Explain.
b) Draw a histogram for the number of broken eggs in each carton. Which appears to be the better carton? Explain.
c) Calculate the mean and standard deviation for the number of broken eggs for each carton. Which appears to be the better carton?

Solution
a) While Carton A had 2 results where no eggs broke, it also had 11 results where 5 or 6 eggs broke. The results for Carton B were more consistent. So, Carton B appears to be more reliable, and thus the better carton.

b) [Histograms: Test Results for Carton A and Test Results for Carton B; horizontal axis: Number of broken eggs; vertical axis: Number of cartons]
The data for Carton B appear to be more clustered around the centre of the histogram than the data for Carton A. This shows Carton B is more reliable, and thus the better carton.

c) Use a TI-83 or TI-84 graphing calculator. Use σx, the standard deviation for a population.
Use the technique described in Connect the Ideas to store the whole numbers from 0 to 6 in L1.
To store the frequencies for Carton A in L2, press 2nd {. Enter the numbers of cartons from the first row of the table, separated by commas. Then press 2nd } STO→ 2nd 2 ENTER.
Press STAT ▶ 1 2nd 1 , 2nd 2 ENTER to display the mean 2.98 and standard deviation 1.3113.
To store the frequencies for Carton B in L3, press 2nd {. Enter the numbers of cartons from the second row of the table, separated by commas. Then press 2nd } STO→ 2nd 3 ENTER.
Press STAT ▶ 1 2nd 1 , 2nd 3 ENTER to display the mean 3 and standard deviation 0.9798.
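The calculator results above can be cross-checked with a short Python sketch that works directly from the frequency table. It weights each number of broken eggs by the number of cartons and divides by the total number of cartons, which is the population formula used in this Example. The function name `mean_and_population_sd` is made up for this illustration.

```python
import math

def mean_and_population_sd(values, frequencies):
    """Mean and population standard deviation of data given as a frequency table."""
    n = sum(frequencies)  # total number of data items
    mean = sum(v * f for v, f in zip(values, frequencies)) / n
    variance = sum(f * (v - mean) ** 2
                   for v, f in zip(values, frequencies)) / n  # divide by n (population)
    return mean, math.sqrt(variance)

broken_eggs = [0, 1, 2, 3, 4, 5, 6]
carton_a = [2, 12, 22, 28, 25, 8, 3]   # number of cartons, first row of the table
carton_b = [0, 5, 27, 36, 28, 3, 1]    # number of cartons, second row of the table

for name, freq in (("Carton A", carton_a), ("Carton B", carton_b)):
    mean, sd = mean_and_population_sd(broken_eggs, freq)
    print(name, round(mean, 2), round(sd, 4))
```

The printed values should match the calculator: a mean of 2.98 and standard deviation of about 1.3113 for Carton A, and a mean of 3 and standard deviation of about 0.9798 for Carton B.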
The means are very similar, but the standard deviation for Carton B is less than the standard deviation for Carton A. So, Carton B is more reliable, and thus is the better carton.

8. During rush hour, a high-occupancy vehicle (HOV) lane is reserved for vehicles carrying at least three people. At other times anyone can use the lane. Without calculating, determine which set of data is likely to have the greatest standard deviation and which the least standard deviation. Explain your thinking.
a) [Histogram: People in Each Car in HOV Lane, 8:00 a.m. to 8:30 a.m.; horizontal axis: Number of people; vertical axis: Number of cars]
b) [Histogram: People in Each Car in HOV Lane, 3:30 p.m. to 4:00 p.m.; horizontal axis: Number of people; vertical axis: Number of cars]
c) [Histogram: People in Each Car in HOV Lane, 8:00 p.m. to 8:30 p.m.; horizontal axis: Number of people; vertical axis: Number of cars]

9. A company has three machines that manufacture bolts. Each bolt should have length 150 mm. A quality control technician takes a sample of 25 bolts produced on each machine and measures the lengths.

Bolt length (mm)              148    149    150    151    152
Machine A: Number of bolts      2      4     13      5      1
Machine B: Number of bolts      1      3     18      3      0
Machine C: Number of bolts      4      5      7      6      3

a) Without calculating, predict which set of data is likely to have the greatest standard deviation and which the least standard deviation. Explain your thinking.
b) Calculate the mean and standard deviation for each set of data. How do the results compare with your predictions in part a?
c) Which machine appears to be the most reliable producer of 150 mm long bolts? Which appears to be the least reliable?

10. Take it Further Explain how well each mean describes a typical member of the population it represents.

Data set                                         Mean    Standard deviation
Hourly salaries of employees ($)                  20      8
Monthly bonuses for sales representatives ($)    200      8

Why do you think mean, median, and mode are called measures of central tendency?
Why do you think range and standard deviation are called measures of spread?

7.6 Analysing Data

Technology can be used to calculate measures of central tendency and measures of spread. This makes it easier to focus on the interpretation of these measures and how appropriate they are to describe a data set.

Inquire
Determining Measures of Central Tendency and Spread Using a Spreadsheet
You will need Microsoft Excel. Open the file sunriseandsunset.xls.

1. This spreadsheet shows the length of each day in June and December in Yellowknife, NWT in a recent year. The daylight time is shown in hours, minutes, and seconds.
a) How long was there daylight on June 15?
b) How long was there daylight on the shortest day in December?

2. Select any five empty cells in a column. From the Format menu, choose Cells.... On the Number tab, under Category: choose Custom. Under Type:, choose h:mm:ss. Then click OK.

A formula in a spreadsheet always starts with an equals sign.

3. a) Enter the formula for mode in a cell you formatted: =MODE(B2:B31)
For what cells did the formula determine the mode?
b) Tell what data the mode describes.
c) What does the value in the cell for the mode tell you? Does this make sense? Why or why not?

4. a) Enter the formula for median in a cell you formatted: =MEDIAN(B2:B31)
How does the formula show what cells the median describes?
b) Tell what data the median describes.
c) What is the median for the data?

The formula for mean uses the word AVERAGE.

5. a) Enter the formula for mean in a cell you formatted: =AVERAGE(B2:B31)
b) Tell the value of the mean and the data it describes.

6. a) Enter the formula for range in a cell you formatted: =MAX(B2:B31)-MIN(B2:B31)
Explain how this formula determines the range.
b) Tell the value of the range and the data it describes.

7.
a) Enter the formula for standard deviation of a population in a cell you formatted: =STDEVP(B2:B31)
How does the formula show what cells the standard deviation describes?
b) Tell the value of the standard deviation and the data it describes.

8. a) Repeat question 2 for five other cells.
b) Use a similar process to question 3 for the mode of the lengths of days in December. What cells will you reference in your formula?
c) What does the value in the cell for the mode tell you? Does this make sense? Why or why not?

9. Repeat questions 4 to 7 to determine the median, mean, range, and standard deviation for the length of days in December. Remember to change the cells you reference in your formulas.

10. a) Determine the median and mean average number of hours of daylight each month. Use these averages to describe the differences in daylight times in June and December. Which measure better describes these differences?
b) What do standard deviations for June and December tell you about the daylight times in these months?

Practice

1. Josef researched these data about the cause of all identified forest fires in a recent year. Open the file forestfires.xls.
Use STDEV for these data since they are only all identified fires.
a) Select any five empty cells in a column of the file forestfires.xls. From the Format menu, choose Cells... . On the Number tab, under Category: choose General.
b) Determine the measures of central tendency and measures of spread for the number of forest fires due to human activities.
c) Repeat part b for forest fires due to lightning.
d) Repeat part b for forest fires due to unknown cause.
e) How can you use the standard deviation to interpret the mean?
f) Explain how you could use the measures you calculated to develop awareness of the need for fire safety.

2.
Hilda researched the maximum depths of all the oceans and of the deepest seas. Open the file oceansandseas.xls.
Use STDEVP for the oceans and STDEV for the seas.
a) Select any five empty cells in a column. From the Format menu, choose Cells... . On the Number tab, under Category: choose General.
b) Determine each measure of central tendency for the oceans.
c) Determine each measure of spread for the oceans.
d) Repeat parts b and c for the seas.
e) Use the measures you calculated to compare the data for oceans and the data for seas.

3. The yield of a crop is the number of bushels that are produced for every acre of land farmed. Open the file cropyields.xls. It shows the yields of different crops in each of 10 years.
a) Without calculating, which crop appears to produce the most consistent yields?
b) Which appears to have the greatest variation in yield?
c) Use measures of central tendency and measures of spread to check your predictions. How might this information be useful to a farmer?

Reflect
➢ Explain your strategy for naming the cells to be included in a formula. Suppose you have data in the first 18 rows of Column A. Describe how to use your strategy to enter a formula to determine the standard deviation for data in these cells.
➢ Choose an example in this section. Explain how the measures of central tendency and measures of spread can help you compare data.

Dice Choice

Materials
• 10 dice
• graphing calculator or Microsoft Excel

Play in a group of 2 to 4.
➢ Roll 10 dice.
➢ Each player writes the digit that appears on each die.
➢ Then each player decides how to use the digits to write five 2-digit numbers. Here are two examples of numbers players might create for the digits on these dice.
41, 46, 31, 52, 52
26, 31, 41, 42, 55
➢ Each player uses technology to determine the standard deviation for her or his 2-digit numbers. Since you are calculating the standard deviation for all the numbers you wrote, divide by the total number of data items.
➢ The player with the lower or lowest standard deviation scores 1 point.
➢ Roll the dice to continue.
➢ The first player to score 4 points wins.
➢ Is there a strategy that can help you win? If so, describe it. If not, explain why you cannot develop a strategy.

7.7 Designing and Conducting an Experiment

A questionnaire involves asking people about their opinions or habits. An experiment involves counting or measuring physical properties to test an idea or answer a question.

Inquire
Conducting an Experiment to Collect Data
Work with a partner or in a group.

Designing an experiment
When you plan an experiment, think about these questions.
➢ What factors might influence your results? How can you consider these factors when you design your experiment?
➢ How many observations will you make or for how long will you observe?
➢ What materials will you need?
➢ How will you record your observations?
Use your answers to these questions to help plan the experiment.

A good experimental plan should include these items:
➢ The question you are investigating
➢ A list of the materials you will need
➢ The steps you will follow, in as much detail as possible
➢ Any tables you might need for recording your observations

For example, suppose you want to explore how quickly after exercise a person’s heart rate returns to its resting rate. You will have to consider these issues:
➢ A person’s age may affect the result. Will you collect data for a variety of ages or just one age group?
➢ The amount of time the person exercises may affect the result. How will you ensure all the people in your experiment exercise for the same amount of time?
➢ The type of exercise may affect the result.
How will you ensure all the people in your experiment do the same exercise?
➢ What materials will you need?
➢ From how many people should you collect data?

1. Suppose you are to conduct a heart rate experiment like the one described above. Answer each question that was posed above. Write a plan for the experiment.

2. Suppose you want a bike lane on the street where your school is located. You design a questionnaire asking people whether they will use a bike lane. You also want to measure how much bike traffic the street has now.
a) Why should you allow for each of these factors in your experimental design?
• time of day
• weather
• day of the week
How could you do this?
b) Write a plan for the experiment.

3. Can a person balance on one foot longer with eyes open or with eyes closed?
a) What are some issues that you will need to consider when designing an experiment to answer this question?
b) Write a plan for the experiment.

4. Which brand of orange juice do students prefer in a taste test?
a) What are some issues that you will have to consider when designing an experiment to answer this question?
b) Write a plan for the experiment.

5. Choose one of the experiments you planned.
a) Compare your plan with the plan developed by another group. Discuss any differences you notice.
b) Revise your plan if you see ways to improve it.

Conducting the experiment
➢ Choose one of the experiments you planned, or design a new experiment about another topic. What question will you try to answer with the data you collect?
➢ If your experiment involves having people perform tasks, use your knowledge of sampling techniques to plan how you will get data about a representative sample of people. If your experiment involves observing and counting things that occur without you planning them, think about where, when, and how you make your observations. If your experiment is a taste test, ask participants about food allergies.
➢ Gather any materials you need. Carry out your experiment.

Displaying and analysing the data
➢ Decide which types of graphs are most appropriate for the data you have collected. Create a visual display to represent the data, either by hand or with a spreadsheet.
➢ If the data are numeric, which measures of central tendency or spread best represent the data? Explain your choice of measures.
➢ Answer the question that inspired the experiment. If you need more data, explain what you have learned so far and what steps you could follow to obtain more data.

Reflect
➢ Were you able to get data for an appropriate sample or find an appropriate time and place to make your observations? If not, how did this affect your results?
➢ How could you improve the design of your experiment?

7.8 Collecting Data from Secondary Sources

A student writing an essay, a business person preparing a proposal to win a new client, or a charitable organization completing a grant application all have one thing in common. They need to know how to collect, analyse, and display data to support their cases.

Inquire: Collecting and Analysing Data

To complete both parts of this Inquire, you will need a computer with access to the Internet and E-STAT. If you do not have access to E-STAT, you can complete Part 2 of this Inquire using the Internet or printed materials. You will also need Microsoft Excel.

Part 1: Collecting Data Using E-STAT
➢ Go to www.statcan.ca. Click English. Select Learning Resources from the menu on the left. Click on E-STAT in the yellow box on the right. Then click Accept and Enter. If you are working at home, you will need to enter the user name and password assigned to your school. You should see a table of contents on your screen. You may need to scroll down.
➢ Click on Environment in the Land and Resources section. You will be using tables from Human Activity and the Environment, Annual Statistics 2006 and later. Click on the link to this document. In the box that appears, click on View HTML.
➢ The next screen shows a table of contents. Click on Tables. The table you need is in Section 4: Socio-economic response to environmental conditions. Click on the link to this section. Then click on the HTML link for Table 4.12. The table shows Canadian recycling data for 2002. Right click on the table and Export to Microsoft Excel. This screen shows the downloaded Excel file.
If you cannot download the file, enter the data for Ontario, British Columbia, and Nova Scotia into a Microsoft Excel spreadsheet.
➢ Before you explore these data, think about recycling in your community. What different materials can you recycle at home or school? Which two or three materials make up the biggest part of the material you recycle?
➢ Use the spreadsheet data to make a circle graph of materials recycled in Ontario, by mass. You will need to copy and paste the labels from Column A into a new section of the spreadsheet. Then copy and paste the Ontario data so that they are in a column beside the labels you pasted. What were the top three materials recycled in Ontario, by mass?
➢ British Columbia is the only other province with a complete set of data. Make a circle graph for this province. Again, you will have to copy and paste so that the labels are beside the data. Compare the two graphs. What were the top three materials recycled in British Columbia, by mass?
➢ Two categories of data are missing for Nova Scotia. Make a circle graph for this province. Again, you will have to copy and paste so that the labels are beside the data. Replace the Xs in the Copper and aluminum and Other metals categories with zeros before graphing. This ensures the colours used for each category will match those in the other two graphs for ease of comparison. What were the top three materials recycled in Nova Scotia, by mass?
➢ What might you conclude about recycling programs in different parts of the country? Explain your thinking.
➢ Write another question that someone could answer using one or more of your graphs.
➢ Choose another topic to research for which data are available in E-STAT. Think of a question, collect data to determine the answer, and graph the data you find.

Part 2: Collecting Data Using Other Websites or Printed Materials
With clever searching on the Internet, you can find data on almost any topic. Governments and international organizations, such as the United Nations Statistics Division, are usually trustworthy sources of data. When you collect data, either electronically or in print, it is important to consider how reliable the source of the data is. Other sources of data are the websites of professional sports organizations, the International Olympic Committee, and the Census at School section of the Statistics Canada website.

A search engine is a program that finds information by searching for keywords you enter. It returns a list of websites where the keywords were found. Here are some tips for using search engines to find data.
• Be as specific as possible in the keywords you enter.
• Use the + symbol before each word if you want only results that include all the words you have entered.
• Use the – symbol before a word if you do not want any results that contain this word.
• If you want a series of words to appear together in a particular order, type quotation marks before the first word and after the last word.
Sometimes data can be downloaded from websites as a spreadsheet file or as a CSV (comma separated value) file that can be used in any spreadsheet. If there are only a few pieces of data or you use data from printed sources, write down the data on paper or enter them in a spreadsheet.

➢ Look back at the topics suggested in question 9 in Section 7.1. Choose one of these topics or use a different topic of your choice. Pose a problem you can try to solve with the data you collect. If appropriate, predict what you think the answer will be.

Questions I could answer by collecting second-hand data
The Environment
• Which countries produce the most carbon dioxide per capita?
• How is most of Ontario’s electricity generated?
Sports
• How many Stanley Cups has each NHL team won?
• How many minutes do professional basketball players play per game?
Interesting Facts
• What are the five most popular car colours in the world?
• How much of their gross national product do different countries spend on education?
Students in Canada
• How many cigarettes do Canadian teens smoke each week?
• What percent of Canadian students are left handed?

➢ Use the Internet or printed sources to find the data you need. Record the web addresses or names of your data sources as references. If you are having difficulty finding data, you may need to choose a different topic. It is often better to change topics than to try locating data which may not exist.
➢ Decide which types of graphs are most appropriate for the data you find. Create a visual display to represent the data, either by hand or with a spreadsheet.
➢ If the data are numeric, which measures of central tendency or spread best represent the data? Explain your choice of measures.
➢ Solve the problem you posed. If you need to find more data, explain what you have learned so far and what steps you could follow to obtain more data.
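If you have access to Python rather than a spreadsheet, the steps above can be sketched in a few lines once the data are in CSV form. This is only an illustration under stated assumptions: the CSV text, the column names label and value, and every number in it are invented, and a real downloaded file would have its own column names.

```python
import csv
import io
import statistics

# Hypothetical CSV text standing in for a downloaded file. The column
# names and the numbers are invented for illustration only.
csv_text = """label,value
A,12
B,15
C,15
D,20
E,28
"""

# Read each row as a dictionary keyed by the header row.
reader = csv.DictReader(io.StringIO(csv_text))
values = [float(row["value"]) for row in reader]

# Measures of central tendency.
mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)

# One measure of spread: the range.
data_range = max(values) - min(values)

print("mean:", mean, "median:", median, "mode:", mode, "range:", data_range)
```

To read a real file instead of the text above, you could replace io.StringIO(csv_text) with an opened file, for example open("mydata.csv", newline=""), where mydata.csv is a placeholder name.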
Reflect
➢ What difficulty might someone have collecting data? How could they deal with this difficulty?
➢ Does using second-hand data have advantages over collecting your own data? If so, what are they? If not, why not?
➢ Why is it important to collect data from reliable sources?

Chapter Review

What Do I Need to Know?

Types of Data and Graphs
One-variable data describe one piece of information about a person, place, or thing.
• Data that involve numbers are called numeric data. They may be discrete or continuous.
• Data that are grouped by categories are called categorical.
The type of graph you draw depends on the type of data being represented.
• Circle graphs and pictographs can represent categorical data or discrete data.
[Figures: example graphs titled "Hours Spent Listening to Music Each Week by Ontarians", "Car Thefts in Selected Canadian Cities, 2001", and "Courses Taken by a First Year Apprentice Millwright".]
• A histogram is a type of bar graph that shows numeric data that have been grouped in intervals. The shape of a histogram provides information about how the data are distributed.
[Figures: example histograms labelled Uniform Distribution, Normal Distribution, Skew Left, Skew Right, and Bimodal, with the titles "Age of Audience Members at Movie", "Age of Passengers Riding in a Subway Car", "Household Incomes in Canada, 2001", "Minutes per Game Played by Toronto Raptors Players", and "Edmonton Oilers Shifts Played per Game, 2005/2006".]

Sampling Techniques
The population of a data set is all the pieces of data in the set. A sample is a smaller set selected from the population. There are different techniques for selecting a sample. With a random sampling technique, each member of the population has the same chance of being selected. This is not true with other techniques.

Random sampling techniques
• Simple random sampling
• Stratified sampling
• Cluster sampling
• Systematic sampling

Other techniques
• Convenience sampling
• Judgement sampling
• Voluntary sampling

Measures of Central Tendency and Spread
The mode, mean, and median are measures of central tendency for a data set. They are used to describe a typical or average value for a data set.
• The mode is the number that occurs most often.
• To determine the mean, add the numbers, then divide the sum by the number of numbers.
• To determine the median, arrange the numbers in order. The median is the middle number. For an even number of numbers, the median is the mean of the two middle numbers.

The measures of spread are the range and the standard deviation.
• The range is the difference between the greatest number and the least number in a data set.
• The standard deviation tells how widely spread around the mean the data in a set are.
To calculate the standard deviation:
➢ Calculate the mean.
➢ Subtract the mean from each data value.
➢ Square each difference.
➢ Add the squared numbers.
➢ Divide the sum by one less than the number of data items if the data are for a sample.
➢ Divide by the number of data items if the data are for an entire population.
➢ Determine the square root of the result.
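The steps for calculating the standard deviation can also be followed in a short program. The sketch below is written in Python, which this chapter does not otherwise use; the function name standard_deviation and its is_sample parameter are made up for illustration. The divisor for a population corresponds to the spreadsheet function STDEVP, and the divisor for a sample corresponds to STDEV.

```python
import math

def standard_deviation(values, is_sample):
    """Follow the listed steps. Assumes at least two data values."""
    n = len(values)
    mean = sum(values) / n                      # Calculate the mean.
    differences = [x - mean for x in values]    # Subtract the mean from each value.
    squares = [d ** 2 for d in differences]     # Square each difference.
    total = sum(squares)                        # Add the squared numbers.
    divisor = n - 1 if is_sample else n         # n - 1 for a sample, n for a population.
    return math.sqrt(total / divisor)           # Determine the square root of the result.

# Five 2-digit numbers used only as an illustration.
numbers = [26, 31, 41, 42, 55]

population_sd = standard_deviation(numbers, is_sample=False)
sample_sd = standard_deviation(numbers, is_sample=True)

print("population:", round(population_sd, 2))
print("sample:", round(sample_sd, 2))
```

For these numbers the mean is 39 and the squared differences add to 502, so the population result is the square root of 502 ÷ 5 and the sample result is the square root of 502 ÷ 4. The sample value is always the larger of the two because the same sum is divided by a smaller number.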
What Should I Be Able to Do?

Section 7.1

1. Is each type of data numeric or categorical? Identify those data that are numeric as continuous or discrete.
a) A yes/no response on a questionnaire
b) The fuel consumption rating of a vehicle
c) The colour options for a new car
d) A person’s shoe size
e) The type of transportation a person uses to get to work
f) The distance a person travels to get to work

2. a) Make a frequency table for this set of data. Explain how you choose the intervals.

Heights of trees in a woodlot (m)
18.0  21.3  17.1  23.5  19.8
17.9  17.0  21.5  19.2  19.0
20.6  19.5  14.5  12.4  24.0
15.4  17.6  22.8  13.6  21.7

b) Draw a histogram to display the data. Describe the distribution.

Sections 7.1 and 7.2

3. These data show the geographic origins of international students at the University of Toronto in a recent year.

Region                 Number of undergraduate students
Asia                   2577
Americas               650
Europe                 487
Middle East            359
Oceania and Africa     245

Graph the data. Explain how you decided which type of graph to draw.

Section 7.3

4. Identify each population below, then recommend whether to collect data from a sample or the entire population. If you recommend a sample, suggest a sampling technique. Explain the reason for your suggestion.
a) Surveying the residents of a condominium to determine their opinions about a proposed renovation
b) Surveying students at your school to determine whether they would participate in a fundraiser for a local hospital
c) Testing chocolate bars produced each day in a factory to check for peanut cross-contamination

5. A company wants to survey 500 of its employees about job satisfaction. The company employs 860 people in British Columbia, 1100 people in Ontario, and 560 people in New Brunswick. How many employees should be sampled in each province so that the number in each provincial sample is proportional to the number of employees in that province?

Section 7.4

6. Suppose you want to determine data about the geographic origins of students at your school.
a) Would you do a census or collect data from a sample? Why? If you suggest using a sample, recommend an appropriate sampling technique.
b) Write a question you could include on a questionnaire to collect these data.

7. Which question would you use on a questionnaire? Explain your choice.
a) How do you get to school on a typical day? _________________________
b) How do you usually travel to school (select one): walk ____ bike ____ car ____ public transit ____ other (please specify) ____

Section 7.5

8. Calculate the mean, median, and mode heights for the tree data in question 2. Which measure do you think best represents the data? Explain your choice.

9. Lila had 10 members of a high school volleyball team and 10 people randomly selected from a shopping mall try serving a ball 10 times each. She counted the number of successful serves for each person. Lila calculated the mean and the standard deviation for each group. Which group do you think would have a greater standard deviation? Why?

Sections 7.5 and 7.6

10. A travel agent is gathering data to help a client plan a trip. He found data on the maximum temperatures in a few cities for one week during the previous year.

          North Bay   Vancouver   Halifax   Winnipeg
May 11    17°C        16°C        15°C      10°C
May 12    19°C        13°C        16°C      15°C
May 13    15°C        15°C        17°C      13°C
May 14    16°C        17°C        20°C      14°C
May 15    14°C        21°C        24°C      14°C
May 16    17°C        28°C        16°C      23°C
May 17    20°C        20°C        18°C      19°C
May 18    21°C        19°C        18°C      15°C

a) Determine the measures of central tendency for North Bay.
b) Determine the measures of spread for North Bay.
c) Repeat parts a and b for each of the other cities.
d) Choose one of the cities. Which measure of central tendency do you think best describes the average weather? Why is it best?
e) What do the measures of spread tell about the temperatures?
f) Did you use a spreadsheet for parts a or b? Explain the reason for your choice.

Section 7.7

11. How many sit-ups can a typical Canadian teenager do in 1 min?
a) What are some issues that you would have to consider when designing an experiment to answer this question?
b) Write a plan for the experiment. Include an explanation of how you would select people to participate in the experiment.

Section 7.8

12. Suppose you need to find data about each subject. Explain how you would search for the data.
a) The maximum temperatures for your community or region for one month last year
b) The population of each province in 1981, 1991, and 2001
c) The distance the average Canadian commutes to work or the number of minutes it takes the average Canadian to commute to work

Practice Test

Multiple Choice: Choose the correct answer for questions 1 and 2. Justify each choice.

1. What is the name for a data set made up of some of the individuals in a target group?
A. population
B. sample
C. distribution
D. census

2. Which type of graph would not be suitable to display eye colours of students in a class?
A. circle graph
B. bar graph
C. histogram
D. pictograph

Show your work for questions 3 to 6.

3. Communication The points scored in each game by a basketball player are given.

11   8  17   3  22  13   8
16  10  18  10  19  12  10
 9  20   6  13  15  20   5
20  13  14  12   7   9  10
19  20  21  17  14   8  16

a) Make a frequency table for these data. Explain how you choose the intervals.
b) Draw a histogram to display the data. Describe the shape of the distribution.

4. Knowledge and Understanding Use the data in question 3.
a) Calculate the measures of central tendency and the range.
b) Which measure of central tendency do you think best represents the data? Explain your choice.
c) What additional information would the standard deviation provide?

5. Thinking A group wants to determine Ontarians’ opinions about raising the minimum wage. What sampling technique is used in each of these samples?
Which sample do you think would best represent public opinion? Explain.
a) Phone the human resources managers at the 500 largest companies in the province.
b) Select several cities and rural areas. Telephone randomly selected households in each place.
c) Ask people at employment centres in 10 cities across the province.
d) Advertise on radio and in newspapers asking people to phone with their opinions.

6. Application Two sprinters’ times in seconds for running 100 m are given. Who would you choose to run the final leg of your relay team? Give reasons for your choice.

Kate    13.22  11.39  13.53  12.99  11.18  12.34  13.05  11.36  11.46  14.13
Fiona   12.50  12.66  12.25  12.31  12.37  12.56  12.74  13.11  12.19  12.61