Survey

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Survey

Document related concepts

Transcript

Grade: 7 Unit #5: Statistics and Probability Time: 25 Days Unit Overview In this unit, students begin their study of probability, learning how to interpret probabilities and how to compute probabilities in simple settings. They also learn how to estimate probabilities empirically. Probability provides a foundation for the inferential reasoning developed in the second half of this module. Additionally, students build on their knowledge of data distributions that they studied in Grade 6, compare data distributions of two or more populations, and are introduced to the idea of drawing informal inferences based on data from random samples. Random sampling: In earlier grades students have been using data, both categorical and measurement, to answer simple statistical questions, but have paid little attention to how the data were selected. A primary focus for Grade 7 is the process of selecting a random sample, and the value of doing so. If three students are to be selected from the class for a special project, students recognize that a fair way to make the selection is to put all the student names in a box, mix them up, and draw out three names “at random.” Individual students realize that they may not get selected, but that each student has the same chance of being selected. In other words, random sampling is a fair way to select a subset (a sample) of the set of interest (the population). A statistic computed from a random sample, such as the mean of the sample, can be used as an estimate of that same characteristic of the population from which the sample was selected. This estimate must be viewed with some degree of caution because of the variability in both the population and sample data. A basic tenet of statistical reasoning, then, is that random sampling allows results from a sample to be generalized to a much larger body of data, namely, the population from which the sample was selected. “What proportion of students in the seventh grade of your school chooses football as their favorite sport?” Students realize that they do not have the time and energy to interview all seventh graders, so the next best way to get an answer is to select a random sample of seventh graders and interview them on this issue. The sample proportion is the best estimate of the population proportion, but students realize that the two are not the same and a different sample will give a slightly different estimate. In short, students realize that conclusions drawn from random samples generalize beyond the sample to the population from which the sample was selected, but a sample statistic is only an estimate of a corresponding population parameter and there will be some discrepancy between the two. Understanding variability in sampling allows the investigator to gauge the expected size of that discrepancy. The variability in samples can be studied by means of simulation. Students are to take a random sample of 50 seventh graders from a large population of seventh graders to estimate the proportion having football as their favorite sport. Suppose, for the moment, that the true proportion is 60%, or 0.60. How much variation can be expected among the sample proportions? The scenario of selecting samples from this population can be simulated by constructing a “population” that has 60% red chips and 40% blue chips, taking a sample of 50 chips from that population, recording the number of red chips, replacing the sample in the population, and repeating the sampling process. (This can be done by hand or with the aid of technology, or by a combination of the two.) Record the proportion of red chips in each sample and plot the results. The dot plots show results for 200 such random samples of size 50 each. Note that the sample proportions pile up around 0.60, but it is not too rare to see a sample proportion down around 0.45 or up around .0.75. Thus, we might expect a variation of close to 15 percentage points in either direction. Interestingly, about that same amount of variation persists for true proportions of 50% and 40%, as shown in the dot plots. Students can now reason that random samples of size 50 are likely to produce sample proportions that are within about 15 percentage points of the true population value. They should now conjecture as to what will happen if the sample size is doubled or halved, and then check out the conjectures with further simulations. Why are sample sizes in public opinion polls generally around 1000 or more, rather than as small as 50? Informal comparative inference: To estimate a population mean or median, the best practice is to select a random sample from that population and use the sample mean or median as the estimate, just as with proportions. But, many of the practical problems dealing with measures of center are comparative in nature, as in comparing average scores on the first and second exam or comparing average salaries between female and male employees of a firm. Such comparisons may involve making conjectures about population parameters and constructing arguments based on data to support the conjectures (MP3). If all measurements in a population are known, no sampling is necessary and data comparisons involve the calculated measures of center. Even then, students should consider variability figures in the margin show the female life expectancies for countries of Africa and Europe. It is clear that Europe tends to have the higher life expectancies and a much higher median, but some African countries are comparable to some of those in Europe. The mean and MAD for Africa are 53.6 and 9.5 years, respectively, whereas those for Europe are 79.3 and 2.8 years. In Africa, it would not be rare to see a country in which female life expectancy is about ten years away from the mean for the continent, but in Europe the life expectancy in most countries is within three years of the mean. For random samples, students should understand that medians and means computed from samples will vary from sample to sample and that making informed decisions based on such sample statistics requires some knowledge of the amount of variation to expect. Just as for proportions, a good way to gain this knowledge is through simulation, beginning with a population of known structure. The following examples are based on data compiled from nearly 200 middle school students in the Washington, DC area participating in the Census at Schools Project. Responses to the question, “How many hours per week do you usually spend on homework?,” from a random sample of 10 female students and another of 10 male students from this population gave the results plotted to the left. Females have a slightly higher median, but students should realize that there is too much variation in the sample data to conclude that, in this population, females have a higher median homework time. An idea of how much variation to expect in samples of size 10 is needed. Simulation to the rescue! Students can take multiple samples of size 10 from the Census of Schools data to see how much the sample medians themselves tend to vary. The sample medians for 100 random samples of size 10 each, with 100 samples of males and 100 samples of females, is shown to left. This plot shows that the sample medians vary much less than the homework hours themselves and provides more convincing evidence that the female median homework hours is larger than that for males. Half of the female sample medians are within one hour of 4 while half of the male sample medians are within half hour of 3, although there is still overlap between the two groups. Chance processes and probability models In Grade 7, students build their understanding of probability on a relative frequency view of the subject, examining the proportion of “successes” in a chance process—one involving repeated observations of random outcomes of a given event, such as a series of coin tosses. “What is my chance of getting the correct answer to the next multiple choice question?” is not a probability question in the relative frequency sense. “What is my chance of getting the correct answer to the next multiple choice question if I make a random guess among the four choices?” is a probability question, because the student could set up an experiment of multiple trials to approximate the relative frequency of the outcome. And two students doing the same experiment will get nearly the same approximation. These important points are often overlooked in discussions of probability. Students begin by relating probability to the long-run (more than five or ten trials) relative frequency of a chance event, using coins, number cubes, cards, spinners, bead bags, and so on. Hands-on activities with students collecting the data on probability experiments are critically important, but once the connection between observed relative frequency and theoretical probability is clear, they can move to simulating probability experiments via technology (graphing calculators or computers). It must be understood that the connection between relative frequency and probability goes two ways. If you know the structure of the generating mechanism (e.g., a bag with known numbers of red and white chips), you can anticipate the relative frequencies of a series of random selections (with replacement) from the bag. If you do not know the structure (e.g., the bag has unknown numbers of red and white chips), you can approximate it by making a series of random selections and recording the relative frequencies. This simple idea, obvious to the experienced, is essential and not obvious at all to the novice. The first type of situation, in which the structure is known, leads to “probability”; the second, in which the structure is unknown, leads to “statistics.” A probability model provides a probability for each possible non-overlapping outcome for a chance process so that the total probability over all such outcomes is unity. The collection of all possible individual outcomes is known as the sample space for the model. For example, the sample space for the toss of two coins (fair or not) is often written as {TT, HT, TH, HH}. The probabilities of the model can be either theoretical (based on the structure of the process and its outcomes) or empirical (based on observed data generated by the process). In the toss of two balanced coins, the four outcomes of the sample space are given equal theoretical probabilities of ¼ because of the symmetry of the process—because the coins are balanced, an outcome of heads is just as likely as an outcome of tails. Randomly selecting a name from a list of ten students also leads to equally likely outcomes with probability 0.10 that a given student’s name will be selected. If there are exactly four seventh graders on the list, the chance of selecting a seventh grader’s name is 0.40. On the other hand, the probability of a tossed thumbtack landing point up is not necessarily ½ just because there are two possible outcomes; these outcomes may not be equally likely and an empirical answer must be found by tossing the tack and collecting data. The product rule for counting outcomes for chance events should be used in finite situations like tossing two or three coins or rolling two number cubes. There is no need to go to more formal rules for permutations and combinations at this level. Students should gain experience in the use of diagrams, especially trees and tables, as the basis for organized counting of possible outcomes from chance processes. For example, the 36 equally likely (theoretical probability) outcomes from the toss of a pair of number cubes are most easily listed on a two-way table. An archived table of census data can be used to approximate the (empirical) probability that a randomly selected Florida resident will be Hispanic. After the basics of probability are understood, students should experience setting up a model and using simulation (by hand or with technology) to collect data and estimate probabilities for a real situation that is sufficiently complex that the theoretical probabilities are not obvious. For example, suppose, over many years of records, a river generates a spring flood about 40% of the time. Based on these records, what is the chance that it will flood for at least three years in a row sometime during the next five years? Supporting Cluster Standards Use random sampling to draw inferences about a population. 7.SP.1 Understand that statistics can be used to gain information about a population by examining a sample of the population; generalizations about a population from a sample are valid only if the sample is representative of that population. Understand that random sampling tends to produce representative samples and support valid inferences. 7.SP.2 Use data from a random sample to draw inferences about a population with an unknown characteristic of interest. Generate multiple samples (or simulated samples) of the same size to gauge the variation in estimates or predictions. For example, estimate the mean word length in a book by randomly sampling words from the book; predict the winner of a school election based on randomly sampled survey data. Gauge how far off the estimate or prediction might be. Investigate chance processes and develop, use and evaluate probability models. 7.SP.5 Understand that the probability of a chance event is a number between 0 and 1 that expresses the likelihood of the event occurring. Larger numbers indicate greater likelihood. A probability near 0 indicates an unlikely event, a probability around ½ indicates an event that in neither unlikely nor likely, and a probability near 1 indicates a likely event. 7.SP.6 Approximate the probability of a chance event by collecting data on the chance process that produces it and observing its long-run relative frequency, and predict the approximate relative frequency given the probability. For example, when rolling a number cube 600 times, predict that a 3 or 6 would be rolled roughly 200 times, but probably not exactly 200 times. 7.SP.7 Develop a probability model and use it to find probabilities of events. Compare probabilities from a model to observed frequencies; if the agreement is not good, explain possible sources of the discrepancy. a) Develop a uniform probability model by assigning equal probability to all outcomes, and use the model to determine probabilities of events. For example, if a student is selected at random from a class, find the probability that Jane will be selected and the probability that a girl will be selected. b) Develop a probability model (which may not be uniform) by observing frequencies in data generated from a chance process. For example, find the approximate probability that a spinning penny will land heads up or that a tossed paper cup will land open-end down. Do the outcomes for the spinning penny appear to be equally likely based on the observed 7.SP.8 Find probabilities of compound events using organized lists, tables, tree diagrams, and simulation. a) Understand that, just as with simple events, the probability of a compound event is the fraction of outcomes in the sample space for which the compound event occurs. b) Represent sample spaces for compound events using methods such as organized lists, tables and tree diagrams. For an event described in everyday language (e.g., “rolling double sixes”), identify the outcomes in the sample space which compose the event. c) Design and use a simulation to generate frequencies for compound events. For example, use random digits as a simulation tool to approximate the answer to the question: If 40% of donors have type A blood, what is the probability that it will take at least 4 donors to find one with type A blood? Supporting Cluster Standards Unpacked 7.SP.1 Students recognize that it is difficult to gather statistics on an entire population. Instead a random sample can be representative of the total population and will generate valid predictions. Students use this information to draw inferences from data. A random sample must be used in conjunction with the population to get accuracy. For example, a random sample of elementary students cannot be used to give a survey about the prom. Example 1: The school food service wants to increase the number of students who eat hot lunch in the cafeteria. The student council has been asked to conduct a survey of the student body to determine the students’ preferences for hot lunch. They have determined two ways to do the survey. The two methods are listed below. Determine if each survey option would produce a random sample. Which survey option should the student council use and why? 1. Write all of the students’ names on cards and pull them out in a draw to determine who will complete the survey. 2. Survey the first 20 students that enter the lunchroom. 3. Survey every 3rd student who gets off a bus. 7.SP.2 Students collect and use multiple samples of data to make generalizations about a population. Issues of variation in the samples should be addressed. Example 1: Below is the data collected from two random samples of 100 students regarding student’s school lunch preference. Make at least two inferences based on the results. Student Sample Hamburgers Tacos Pizza Total #1 12 14 74 100 #2 12 11 77 100 Solution: • Most students prefer pizza. • More people prefer pizza and hamburgers and tacos combined. 7.SP.5 This is the students’ first formal introduction to probability. Students recognize that the probability of any single event can be can be expressed in terms such as impossible, unlikely, likely, or certain or as a number between 0 and 1, inclusive, as illustrated on the number line below. The closer the fraction is to 1, the greater the probability the event will occur. Larger numbers indicate greater likelihood. For example, if someone has 10 oranges and 3 apples, you have a greater likelihood of selecting an orange at random. Students also recognize that the sum of all possible outcomes is 1. Example 1: There are three choices of jellybeans – grape, cherry and orange. If the probability of getting a grape is and the probability of getting cherry is , what is the probability of getting orange? Solution: The combined probabilities must equal 1. The combined probability of grape and cherry is 5 5 . The probability of orange must equal to 10 10 get a total of 1. Example 2: The container below contains 2 gray, 1 white, and 4 black marbles. Without looking, if Eric chooses a marble from the container, will the probability be closer to 0 or to 1 that Eric will select a white marble? A gray marble? A black marble? Justify each of your predictions. Solution: White marble: Gray marble: Black marble: Closer to 0 Closer to 0 Closer to 1 Students can use simulations such as Marble Mania on AAAS or the Random Drawing Tool on NCTM’s Illuminations to generate data and examine patterns. Marble Mania http://www.sciencenetlinks.com/interactives/marble/marblemania.html Random Drawing Tool - http://illuminations.nctm.org/activitydetail.aspx?id=67 7.SP.6 Students collect data from a probability experiment, recognizing that as the number of trials increase, the experimental probability approaches the theoretical probability. The focus of this standard is relative frequency -- The relative frequency is the observed number of successful events for a finite sample of trials. Relative frequency is the observed proportion of successful event, expressed as the value calculated by dividing the number of times an event occurs by the total number of times an experiment is carried out. Example 1: Suppose we toss a coin 50 times and have 27 heads and 23 tails. We define a head as a success. The relative frequency of heads is: 27 = 54% 50 The probability of a head is 50%. The difference between the relative frequency of 54% and the probability of 50% is due to small sample size. The probability of an event can be thought of as its long-run relative frequency when the experiment is carried out many times. Students can collect data using physical objects or graphing calculator or web-based simulations. Students can perform experiments multiple times, pool data with other groups, or increase the number of trials in a simulation to look at the long-run relative frequencies. Example 2: Each group receives a bag that contains 4 green marbles, 6 red marbles, and 10 blue marbles. Each group performs 50 pulls, recording the color of marble drawn and replacing the marble into the bag before the next draw. Students compile their data as a group and then as a class. They summarize their data as experimental probabilities and make conjectures about theoretical probabilities (How many green draws would are expected if 1000 pulls are conducted? 10,000 pulls?). Students create another scenario with a different ratio of marbles in the bag and make a conjecture about the outcome of 50 marble pulls with replacement. (An example would be 3 green marbles, 6 blue marbles, 3 blue marbles.) Students try the experiment and compare their predictions to the experimental outcomes to continue to explore and refine conjectures about theoretical probability. Example 3: A bag contains 100 marbles, some red and some purple. Suppose a student, without looking, chooses a marble out of the bag, records the color, and then places that marble back in the bag. The student has recorded 9 red marbles and 11 purple marbles. Using these results, predict the number of red marbles in the bag. 7.SP.7 Probabilities are useful for predicting what will happen over the long run. Using theoretical probability, students predict frequencies of outcomes. Students recognize an appropriate design to conduct an experiment with simple probability events, understanding that the experimental data give realistic estimates of the probability of an event but are affected by sample size. Students need multiple opportunities to perform probability experiments and compare these results to theoretical probabilities. Critical components of the experiment process are making predictions about the outcomes by applying the principles of theoretical probability, comparing the predictions to the outcomes of the experiments, and replicating the experiment to compare results. Experiments can be replicated by the same group or by compiling class data. Experiments can be conducted using various random generation devices including, but not limited to, bag pulls, spinners, number cubes, coin toss, and colored chips. Students can collect data using physical objects or graphing calculator or web-based simulations. Students can also develop models for geometric probability (i.e. a target). Example 1: If Mary chooses a point in the square, what is the probability that it is not in the circle? Solution: The area of the square would be 12 x 12 or 144 units squared. The area of the circle would be 113.04 units squared. The probability that a point is not in the circle would be 30.96 or 21.5% 144 Example 2: Jason is tossing a fair coin. He tosses the coin ten times and it lands on heads eight times. If Jason tosses the coin an eleventh time, what is the probability that it will land on heads? Solution: The probability would be 1 . The result of the eleventh toss does not depend on the previous results. 2 Example 3: Devise an experiment using a coin to determine whether a baby is a boy or a girl. Conduct the experiment ten times to determine the gender of ten births. How could a number cube be used to simulate whether a baby is a girl or a boy or girl? Example 4: Conduct an experiment using a Styrofoam cup by tossing the cup and recording how it lands. How many trials were conducted? How many times did it land right side up? How many times did it land upside down/ How many times did it land on its side? Determine the probability for each of the above results 7.SP.8 Students use tree diagrams, frequency tables, and organized lists, and simulations to determine the probability of compound events. Example 1: How many ways could the 3 students, Amy, Brenda, and Carla, come in 1st, 2nd and 3rd place? Solution: Making an organized list will identify that there are 6 ways for the students to win a race A, B, C A, C, B B, C, A B, A, C C, A, B C, B, A Example 2: Students conduct a bag pull experiment. A bag contains 5 marbles. There is one red marble, two blue marbles and two purple marbles. Students will draw one marble without replacement and then draw another. What is the sample space for this situation? Explain how the sample space was determined and how it is used to find the probability of drawing one blue marble followed by another blue marble. Example 3: A fair coin will be tossed three times. What is the probability that two heads and one tail in any order will result? Solution: 3 8 HHT, HTH and THH so the probability would be . Example 4: Show all possible arrangements of the letters in the word FRED using a tree diagram. If each of the letters is on a tile and drawn at random, what is the probability of drawing the letters F-R-E-D in In that order? What is the probability that a “word” will have an F as the first letter? Solution: There are 24 possible arrangements (4 choices • 3 choices • 2 choices • 1 choice) The probability of drawing F-R-E-D in that order is 1 . 24 The probability that a “word” will have an F as the first letter is 6 1 or . 24 4 Instructional Strategies Grade 7 is the introduction to the formal study of probability. Through multiple experiences, students begin to understand the probability of chance (simple and compound), develop and use sample spaces, compare experimental and theoretical probabilities, develop and use graphical organizers, and use information from simulations for predictions. Help students understand the probability of chance is using the benchmarks of probability: 0, 1 and ½. Provide students with situations that have clearly defined probability of never happening as zero, always happening as 1 or equally likely to happen as to not happen as 1/2. Then advance to situations in which the probability is somewhere between any two of these benchmark values. This builds to the concept of expressing the probability as a number between 0 and 1. Use this to build the understanding that the closer the probability is to 0, the more likely it will not happen, and the closer to 1, the more likely it will happen. Students learn to make predictions about the relative frequency of an event by using simulations to collect, record, organize and analyze data. Students also develop the understanding that the more the simulation for an event is repeated, the closer the experimental probability approaches the theoretical probability. Have students develop probability models to be used to find the probability of events. Provide students with models of equal outcomes and models of not equal outcomes are developed to be used in determining the probabilities of events. Students should begin to expand the knowledge and understanding of the probability of simple events, to find the probabilities of compound events by creating organized lists, tables and tree diagrams. This helps students create a visual representation of the data; i.e., a sample space of the compound event. From each sample space, students determine the probability or fraction of each possible outcome. Students continue to build on the use of simulations for simple probabilities and now expand the simulation of compound probability. Providing opportunities for students to match situations and sample spaces assists students in visualizing the sample spaces for situations. Students often struggle making organized lists or trees for a situation in order to determine the theoretical probability. Having students start with simpler situations that have fewer elements enables them to have successful experiences with organizing lists and trees diagrams. Ask guiding questions to help students create methods for creating organized lists and trees for situations with more elements. Students often see skills of creating organized lists, tree diagrams, etc. as the end product. Provide students with experiences that require the use of these graphic organizers to determine the theoretical probabilities. Have them practice making the connections between the process of creating lists, tree diagrams, etc. and the interpretation of those models. Additionally, students often struggle when converting forms of probability from fractions to percents and vice versa. To help students with the discussion of probability, don’t allow the symbol manipulation conversions to detract from the conversations. By having students use technology such as a graphing calculator or computer software to simulate a situation and graph the results, the focus is on the interpretation of the data. Students then make predictions about the general population based on these probabilities. Additional Cluster Standards Draw informal comparative inferences about two populations 7.SP.3 Informally assess the degree of visual overlap of two numerical data distributions with similar variabilities, measuring the difference between the centers by expressing it as a multiple of a measure of variability. For example, the mean height of players on the basketball team is 10 cm greater than the mean height of players on the soccer team, about twice the variability (mean absolute deviation) on either team; on a dot plot, the separation between the two distributions of heights is noticeable. 7.SP.4 Use measures of center and measures of variability for numerical data from random samples to draw informal comparative inferences about two populations. For example, decide Additional Cluster Standards Unpacked 7.SP.3 This is the students’ first experience with comparing two data sets. Students build on their understanding of graphs, mean, median, Mean Absolute Deviation (MAD) and interquartile range from 6th grade. Students understand that 1. a full understanding of the data requires consideration of the measures of variability as well as mean or median, 2. variability is responsible for the overlap of two data sets and that an increase in variability can increase the overlap, and 3. median is paired with the interquartile range and mean is paired with the mean absolute deviation . Example: Jason wanted to compare the mean height of the players on his favorite basketball and soccer teams. He thinks the mean height of the players on the basketball team will be greater but doesn’t know how much greater. He also wonders if the variability of heights of the athletes is related to the sport they play. He thinks that there will be a greater variability in the heights of soccer players as compared to basketball players. He used the rosters and player statistics from the team websites to generate the following lists. Basketball Team – Height of Players in inches for 2010 Season 75, 73, 76, 78, 79, 78, 79, 81, 80, 82, 81, 84, 82, 84, 80, 84 Soccer Team – Height of Players in inches for 2010 73, 73, 73, 72, 69, 76, 72, 73, 74, 70, 65, 71, 74, 76, 70, 72, 71, 74, 71, 74, 73, 67, 70, 72, 69, 78, 73, 76, 69 To compare the data sets, Jason creates a two dot plots on the same scale. The shortest player is 65 inches and the tallest players are 84 inches. In looking at the distribution of the data, Jason observes that there is some overlap between the two data sets. Some players on both teams have players between 73 and 78 inches tall. Jason decides to use the mean and mean absolute deviation to compare the data sets. The mean height of the basketball players is 79.75 inches as compared to the mean height of the soccer players at 72.07 inches, a difference of 7.68 inches. The mean absolute deviation (MAD) is calculated by taking the mean of the absolute deviations for each data point. The difference between each data point and the mean is recorded in the second column of the table The difference between each data point and the mean is recorded in the second column of the table. Jason used rounded values (80 inches for the mean height of basketball players and 72 inches for the mean height of soccer players) to find the differences. The absolute deviation, absolute value of the deviation, is recorded in the third column. The absolute deviations are summed and divided by the number of data points in the set. The mean absolute deviation is 2.14 inches for the basketball players and 2.53 for the soccer players. These values indicate moderate variation in both data sets. Solution: There is slightly more variability in the height of the soccer players. The difference between the heights of the teams (7.68) is approximately 3 times the variability of the data sets (7.68 ÷ 2.53 = 3.04; 7.68 ÷ 2.14 = 3.59). Soccer Players (n = 29) Height (in) 65 67 69 69 69 70 70 71 71 71 72 72 72 72 73 73 73 73 73 73 74 74 74 74 76 76 76 78 ∑ = 2090 Deviation from Mean (in) Absolute Deviation (in) -7 -5 -3 -3 -3 -2 -2 -1 -1 -1 0 0 0 0 +1 +1 +1 +1 +1 +1 +2 +2 +2 +2 +4 +4 +4 +6 7 5 3 3 3 2 2 1 1 1 0 0 0 0 1 1 1 1 1 1 2 2 2 2 4 4 4 6 ∑ = 62 Mean = 2090 ÷ 29 =72 inches MAD = 62 ÷ 29 = 2.14 inches Basketball Players (n = 16) Deviation Height (in) from Mean (in) 73 -7 75 -5 76 -4 78 -2 78 -2 79 -1 79 -1 80 0 80 0 81 +1 81 +1 82 +2 82 +2 84 +4 84 +4 84 +4 ∑ = 1276 Mean = 1276 ÷ 16 =80 inches MAD = 40 ÷ 16 = 2.53 inches Absolute Deviation (in) 7 5 4 2 2 1 1 0 0 1 1 2 2 4 4 4 ∑ = 40 7.SP.4 Students compare two sets of data using measures of center (mean and median) and variability (MAD and IQR). Showing the two graphs vertically rather than side by side helps students make comparisons. For example, students would be able to see from the display of the two graphs that the ideas scores are generally higher than the organization scores. One observation students might make is that the scores for organization are clustered around a score of 3 whereas the scores for ideas are clustered around a score of 5. Example 1: The two data sets below depict random samples of the management salaries in two companies. Based on the salaries below which measure of center will provide the most accurate estimation of the salaries for each company? • Company A: 1.2 million, 242,000, 265,500, 140,000, 281,000, 265,000, 211,000 • Company B: 5 million, 154,000, 250,000, 250,000, 200,000, 160,000, 190,000 Solution: The median would be the most accurate measure since both companies have one value in the million that is far from the other values and would affect the mean. Focus Standards for Mathematical Practice MP.2 Reason abstractly and quantitatively. Students reason quantitatively by posing statistical questions about variables and the relationship between variables. Students reason abstractly about chance experiments in analyzing possible outcomes and designing simulations to estimate probabilities. MP.3 Construct viable arguments and critique the reasoning of others. Students construct viable arguments by using sample data to explore conjectures about a population. Students critique the reasoning of other students as part of poster or similar presentations. MP.4 Model with mathematics. Students use probability models to describe outcomes of chance experiments. They evaluate probability models by calculating the theoretical probabilities of chance events, and by comparing these probabilities to observed relative frequencies. MP.5 Use appropriate tools strategically. Students use simulation to approximate probabilities. Students use appropriate technology to calculate measures of center and variability. Students use graphical displays to visually represent distributions. MP.6 Attend to precision. Students interpret and communicate conclusions in context based on graphical and numerical data summaries. Students make appropriate use of statistical terminology. Skills and Concepts Prerequisite Skills/Concepts: Students should already be able to… Advanced Skills/Concepts: Some students may be ready to… Summarize and describe distributions. 6.SP.5 Summarize numerical data sets in relation to their context, such as by: Reporting the number of observations. Describing the nature of the attribute under investigation, including how it was measured and its units of measurement. Giving quantitative measures of center (median and/or mean) and variability (interquartile range and/or mean absolute deviation), as well as describing any overall pattern and any striking deviations from the overall pattern with reference to the context in which the data were gathered. Relating the choice of measures of center and variability to the shape of the data distribution and the context in which the Investigate patterns of association in bivariate data (8.SP.1-4) Understand independence and conditional probability and use them to interpret data. (S-CP.1-5) Use the rules of probability to compute probabilities of compound events in a uniform probability model. (S-CP.6-7) Skills: Students will be able to … Recognize and identify that different sampling techniques must be used in real life situations, because it is very difficult to survey an entire population. (7.SP.1) Select appropriate sample sizes based on a population in real-life situations and explain why generalizations about a population from a data were gathered. Understand ratio concepts and use ratio reasoning to solve problems. 6.RP.3c Find a percent of a quantity as a rate per 100 (e.g., 30% of a quantity means 30/100 times the quantity); solve problems involving finding the whole, given a part and the percent. Analyze proportional relationships and use them to solve real-world and mathematical problems. 7.RP.2 Recognize and represent proportional relationships between quantities. sample are valid only if the sample is random and representative of that population. (7.SP.1) Collect data from a sample population in order to predict information about a population. (7.SP.1) Interpret data from a random sample to draw inferences about a population with an unknown characteristic of interest. (7.SP.2) Generate multiple samples (or simulated samples) of the same size to determine the variation in estimates or predictions by comparing the samples. (7.SP.2) Identify the degree of overlap between two numerical sets of data. (7.SP.3) Visually compare two numerical data distributions with like ranges. (7.SP.3) Measure the difference between the centers of two different data distributions and express this difference as a multiple of a measure of variability. (7.SP.3) Use measures of center and measures of variability for numerical data from random samples to draw informal comparative inferences about two populations.(7.SP.4) Represent the probability of a chance event as a number between 0 and 1. (7.SP.5) Use the terms “likely”, “unlikely,” to describe the probability represented by the fractions used. (7.SP.5) Approximate the probability of a chance event by collecting data on the chance process that produces it and observing its long-run relative frequency. (7.SP.6) Predict the approximate relative frequency of a chance event given the probability. (7.SP.6) Develop a uniform probability model by assigning equal probability to all outcomes, and use the model to determine probabilities of events. (7.SP.7) Develop a probability model (which may not be uniform) by observing frequencies in data generated from a chance process. (7.SP.7) Compare probabilities from a model to observed frequencies. (7.SP.7) If the agreement between a model and observed frequencies is not good, explain possible sources of the discrepancy. (7.SP.7) Find probabilities of compound events using organized lists, tables, tree diagrams, and simulation. (7.SP.8) Represent the probability of a compound event as the fraction of outcomes in the sample space for which the compound event occurs. (7.SP.8) Represent sample spaces for compound events using methods such as organized lists, tables and tree diagrams. (7.SP.8) For an event described in everyday language (e.g., “rolling double sixes”), identify the outcomes in the sample space which compose the event. (7.SP.8) Design and use a simulation to generate frequencies for compound events. (7.SP.8) Academic Vocabulary Probability (A number between and that represents the likelihood that an outcome will occur.) Probability model (A probability model for a chance experiment specifies the set of possible outcomes of the experiment—the sample space—and the probability associated with each outcome.) Compound event (An event consisting of more than one outcome from the sample space of a chance experiment.) Tree diagram (A diagram consisting of a sequence of nodes and branches. Tree diagrams are sometimes used as a way of representing the outcomes of a chance experiment that consists of a sequence of steps, such as rolling two number cubes, viewed as first rolling one number cube and then rolling the second.) Simulation (The process of generating “artificial” data that are consistent with a given probability model or with sampling from a known population.) Long-run relative frequency (The proportion of the time some outcome occurs in a very long sequence of observations.) Random sample Inference Measures of center Measures of variability Mean absolute deviation (MAD) Shape Unit Resources