Mathematics A30 Module 1 Lesson 1

Statistics

Introduction

The word statistics is a very common word. Announcers at sports events are always eager to give you the statistics for every possible situation! Local, provincial and federal governments use statistics in many ways to show trends in the economy or population growth and decline. Businesses of all types keep statistics as a measure of what has happened in the past in hopes of making things better in the future.

Statistics don't mean very much if they can't be analyzed properly. This lesson will show you different ways of analyzing data by measures of central tendency. The data will also be organized so that the methods of analysis will be easier.

The graphing calculator will be used as an extension when creating box and whisker plots. An appendix at the end of this lesson will take you through the steps necessary when creating a box and whisker plot using the graphing calculator.

Part of the assignment will be activity-based in that you will be responsible for organizing and analyzing a set of data with the knowledge and skills that you have attained from the lesson.

Objectives

After completing this lesson you will be able to
• list and describe the methods used to collect data.
• obtain data for real-world situations by using simulations.
• obtain data for real-world situations by using the Monte Carlo simulation.
• review the methods for determining the measures of central tendency.
• construct box and whisker plots from simulated data.
• define and utilize the concept of percentiles (including the first, second, and third quartiles).
• solve related problems using statistical inference.

1.1 Introduction to Statistics and Data Collection

Data analysis is a big part of many businesses and institutions.
People are always trying to determine bigger and better ways to do things. Situations can only get better if people know what has happened in the past.

The principal goal of data analysis is to draw information from data. It is an opportunity to investigate and to explore information that is given in data. Data is often open to a variety of alternative explanations and sometimes a problem may have no single correct answer.

Statistics is the mathematical study of data. All statistics begin with the collection of data, whether for politics, sport, research, or industry. In order to analyze data, you must go through the process of collecting, organizing and displaying that data. Descriptive statistics are sets of data that have gone through this process. Once data has been collected, organized and displayed it can then be useful in making conclusions and predictions.

The statistical process follows these steps:
1. Collect Data
2. Organize and Display
3. Analyze Results
4. Use Results for Predictions and Decisions

There are many different ways to collect data.
• household surveys
• censuses
• telephone surveys
• experiments
• questionnaires

These different ways of data collection can be categorized by the following methods of gathering statistical information.

• Census
  • You gather information by surveying the entire population being studied.
  • Every person or item in the population is surveyed.
  • Often this method is too time-consuming and costly to be the best way to collect the data.
• Simulation
  • Information is gathered using experimental or indirect methods.
  • This is used as a substitute when normal sampling methods are not practical or safe.
• Sample
  • A section of the population is used to obtain representative information about the topic that is being studied.
  • It is important that the sample should represent the entire population, so a random sample is chosen.
  • Validity is a measure of how well a sample represents the entire population.
    It is important that other factors have not skewed or altered the results.
  • Samples are often named by the method used to collect the data.

Clustered sample
• a sample chosen from a particular portion of the population

Stratified sample
• a random sample which is based on dividing the population into groups based on a common feature

Destructive sample
• a sampling method often used to test products in manufacturing, where the item tested is destroyed or can't be returned to the population

Example 1
For each of the following questions, determine the population being studied, the sample group, the type of sample, and 2 sources of sample error.
a) Inspection of grapefruits by slicing 3 grapefruits from each of 400 crates.
b) Determining the number of eligible voters in Saskatchewan by doing a door to door survey.
c) Determining how many households in Estevan have cable TV by phoning every 20th phone number listed in the Estevan exchange.
d) Polling Saskatchewan readers of Sports Illustrated to determine which sport is the most popular for people to watch.
Solution:

a) Inspection of grapefruits
• Population: contents of 400 crates
• Sample group: 1 200 grapefruits
• Type of sample: destructive sample
• Errors:
  • sample size may be inadequate
  • visual inspection may have more validity

b) Number of eligible voters
• Population: every Saskatchewan household
• Sample group: every Saskatchewan household
• Type of sample: census
• Errors:
  • people may be missed if they are not at home
  • people with more than one residence may be listed twice

c) Cable TV watchers
• Population: all households in Estevan
• Sample group: 20th name in the Estevan phone book
• Type of sample: random sample
• Errors:
  • businesses aren't separated from residences
  • households with no phones or with unlisted numbers will be missed

d) Most popular sport
• Population: all households in Saskatchewan
• Sample group: Sports Illustrated subscribers
• Type of sample: stratified sample
• Errors:
  • lots of people watch sports that aren't subscribers to Sports Illustrated
  • Saskatchewan residents may be biased to Canadian sports

Simulations and The Monte Carlo Method

The data collection for some real-world situations is very difficult because of time or money constraints or when sampling methods are not practical or safe. One example of this would be using crash test dummies to test safety standards in a vehicle. In these instances, a simulation of an experiment can be conducted to acquire data that will help in analyzing the situation.

A simulation is a procedure used to answer questions about a real-world situation by setting up an experiment that will portray the actual outcome.

Simulations that involve an element of chance are referred to as Monte Carlo simulations.
A Monte Carlo simulation often involves the following methods:
• rolling a die
• tossing a coin
• using a table of random numbers
• using calculators having a random number function

The most important part of a Monte Carlo simulation is designing an experiment to simulate the data that is needed to analyze the problem.

Example 2
Design an experiment to estimate the probability of a family with four children having all girls.

Solution:
Determine the possible methods.
1. Toss a coin
   • Heads = Boy
   • Tails = Girl
2. Roll a die
   • Even number = Boy
   • Odd number = Girl
3. Table of random numbers
   • Even number = Boy
   • Odd number = Girl

Activity 1.1
• Toss a coin to determine the probability of a family with four children having all girls.
• Toss the coin in sets of four. If the four tosses are all tails, enter yes in the table. For any other combination of heads and tails, enter no.
• Conduct 50 sets of four tosses.

[Recording table: Tosses, Total]

• Record the total number of "yes" responses after
  • 10 sets of tosses.
  • 20 sets of tosses.
  • 30 sets of tosses.
  • 40 sets of tosses.
  • 50 sets of tosses.

The probability of an event happening is the number of favourable outcomes divided by the number of total outcomes.

What is the probability of a family of four children having all girls?

What result would you expect if you conducted the experiment with 100 sets of tosses? 200 sets of tosses?

A table of random numbers can also be used to simulate a situation. One of these tables is on the next page. Even though the numbers are grouped in fives, any total number can be used. Example 3 will use the table of random numbers to simulate an experiment.
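The coin-toss experiment in Activity 1.1 can also be run by computer, which makes it easy to try far more than 50 sets of tosses. The following is a minimal Python sketch and not part of the original lesson. The function name `estimate_all_girls` and its parameters are hypothetical, and the computer's random number generator stands in for the coin (heads = boy, tails = girl, each assumed equally likely).

```python
import random


def estimate_all_girls(sets_of_tosses, seed=None):
    """Estimate P(four girls) by simulating sets of four coin tosses.

    Hypothetical helper: 'T' (tails) stands for a girl, 'H' for a boy.
    """
    rng = random.Random(seed)  # a seed makes a run repeatable
    yes_count = 0
    for _ in range(sets_of_tosses):
        tosses = [rng.choice("HT") for _ in range(4)]
        if all(toss == "T" for toss in tosses):
            yes_count += 1  # this set would be recorded as "yes"
    # favourable outcomes divided by total outcomes
    return yes_count / sets_of_tosses


if __name__ == "__main__":
    for n in (10, 50, 200, 10000):
        print(n, "sets:", estimate_all_girls(n))
```

Under the fair-coin assumption, the estimates should settle near (1/2)^4 = 1/16 as the number of sets grows, while small runs such as 10 or 50 sets can vary a great deal from one run to the next.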
Table of Random Numbers

98299 62016 50712 82841 02554 08614 13873
26431 74576 00780 98575 63110 43438 65725
31098 87395 04669 71690 14985 32489 79669
06439 71610 48862 10140 88016 92085 78448
43274 42808 51823 07702 92051 34118 32520

87432 34188 31878 63911 18507 86670 87577
30871 83554 99380 51768 13119 83438 69628
31501 77509 40649 50717 54631 59867 50972
28672 56618 89969 34523 54664 46204 84909
60639 68932 15851 92143 50695 76752 32355

27401 61516 02216 72766 24451 43244 87838
07287 31010 57412 94994 49382 62978 58593
05701 54732 77813 00137 50294 01227 25906
22244 54854 60381 14293 65107 41155 74234
27165 09574 37637 87392 22281 05975 79677

16420 98769 01353 52927 45502 53549 33937
28451 02557 18510 49678 10030 66868 59299
56787 68652 62632 39065 63005 74245 98425
55096 86106 61924 51072 30070 30295 24517
56464 00665 84951 73419 54955 38884 24567

30206 80922 18604 80628 56145 32658 26792
43084 50326 69059 07899 71179 41125 99900
08753 14697 61528 32169 63282 20012 64673
95492 69161 22953 71860 41650 53453 09010
82720 08850 99900 03601 57596 33940 24921

Example 3
Use the table of random numbers to estimate the probability that a student guessing the answers on a 10-question true-false test will get 7 or more correct.

Solution:
Read the problem.
On the test, each of the ten responses is independent of the others. The answers are either correct or not correct. Within the ten responses, find the number of times that 7, 8, 9 or 10 of the answers are correct.

Develop a plan.
Use the table of random numbers.
• A correct response is an even digit.
• An incorrect response is an odd digit.
Start anywhere in the table and select a group of ten digits (two groups of five) reading across to the right. Record the number of even digits in the group of ten.

Carry out the plan.
Starting at              Random Numbers   Number of Even Digits
1st number, 1st row      98299 62016      6
6th number, 3rd row      32489 79669      5
2nd number, 5th row      42808 51823      7
5th number, 9th row      54664 46204      9
3rd number, 10th row     15851 92143      3
1st number, 6th row      87432 34188      6
5th number, 11th row     24451 43244      7
3rd number, 12th row     57412 94995      3
3rd number, 18th row     62632 39065      6
1st number, 21st row     30206 80922      8
6th number, 15th row     05975 79677      2
1st number, 23rd row     08753 14697      4

There were four times that the number of even digits in the group of ten was 7 or more. The probability of guessing 7 or more correct answers is:

$P(A) = \dfrac{\text{number of outcomes in the event}}{\text{total number of possible outcomes}} = \dfrac{4}{12} = \dfrac{1}{3}$

Write a concluding statement.
Based on these 12 trials, the estimated probability of guessing 7 or more correct answers on a true-false test with 10 questions is about 0.33.

The more trials that you conduct, the more accurate you will be in determining the actual probability of being able to guess correctly 7 or more questions on a true-false test.

Activity 1.12
This activity will be handed in with Assignment 1.

Your favourite fast food restaurant distributes one of 6 different toys with each Kid's Meal. How many times do you have to order a Kid's Meal so that you are able to collect all 6 toys?
• Use one die.
• Shake the die.
• For each trial, cross out the number that appears on the die.
• Determine the total number of shakes that it takes to complete each row.
• Every number will have to be crossed out at least once.
• Each row represents one trial.
• If you do not have a die, use 6 pieces of numbered paper in a container.

Trial   Toss the Die     Total Shakes
#1      1 2 3 4 5 6
#2      1 2 3 4 5 6
#3      1 2 3 4 5 6
#4      1 2 3 4 5 6
#5      1 2 3 4 5 6
#6      1 2 3 4 5 6
#7      1 2 3 4 5 6
#8      1 2 3 4 5 6

• You may perform more trials to get a more accurate result.

On average, how many times do you have to buy a Kid's Meal before you get all six of the toys?
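Both the true-false experiment of Example 3 and the Kid's Meal activity can be repeated thousands of times by computer. The sketch below is an illustration only and is not the lesson's method: the function names `estimate_seven_or_more` and `average_meals_for_all_toys` are hypothetical, random digits from Python's `random` module replace the printed table, and each toy is assumed to be equally likely in every meal.

```python
import random


def estimate_seven_or_more(trials, seed=None):
    """Estimate P(7 or more correct out of 10 guesses on a true-false test).

    As in Example 3, an even random digit counts as a correct answer.
    """
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        digits = [rng.randrange(10) for _ in range(10)]  # ten random digits 0-9
        correct = sum(1 for d in digits if d % 2 == 0)
        if correct >= 7:
            successes += 1
    return successes / trials


def average_meals_for_all_toys(trials, toys=6, seed=None):
    """Average number of meals needed to collect every toy at least once.

    Each meal is simulated like one shake of a fair die with `toys` faces.
    """
    rng = random.Random(seed)
    total_meals = 0
    for _ in range(trials):
        collected = set()
        meals = 0
        while len(collected) < toys:
            collected.add(rng.randrange(toys))  # toy received with this meal
            meals += 1
        total_meals += meals
    return total_meals / trials


if __name__ == "__main__":
    print("P(7 or more correct), 12 trials:   ", estimate_seven_or_more(12))
    print("P(7 or more correct), 20000 trials:", estimate_seven_or_more(20000))
    print("Average meals, 8 trials:    ", average_meals_for_all_toys(8))
    print("Average meals, 20000 trials:", average_meals_for_all_toys(20000))
```

For fair guessing, the exact probability of 7 or more correct is 176/1024, or about 0.17, so an estimate from only 12 trials (such as 1/3) can be well off. This is why more trials give a more trustworthy result.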
Exercise 1.1

1. Design an experiment to estimate the probability that a family with 3 children will have 2 boys and a girl.
2. The A-side final of a hockey league championship is a best of seven series. Theodore and Langenburg are evenly matched. If a team must win four games to win the championship, estimate the probability that the series will last seven games using random numbers.
3. There are 20 questions on a multiple choice quiz. Each multiple choice question has four responses, a, b, c, and d. Design an experiment to predict the probability that a student, guessing at the answers, will have at least half the questions correct.

1.2 Measures of Central Tendency

Often it is useful to find a single number that can best represent the entire sample or set of data that is being studied. This single number is called a statistic, and the measure of central tendency is one example of such a statistic.

Three different measures of central tendency will be studied in this course. Each has its advantages and disadvantages. The most appropriate choice of a measure depends on the situation and the data that is presented. It is not unusual for all three measures to be recorded and comparisons made between the three.

Mean

The first measure of central tendency is the mean. This is the arithmetic average of the numbers in the sample group. The symbol for the mean of a set of data is $\bar{x}$. The average of a set of numbers is the sum of the numbers divided by the total number of numbers.

Example 1
The following data represents the results of a Mathematics A30 midterm test:
62, 57, 86, 74, 93, 44, 73, 90, 72, 79, 82, 87, 38, 97, 63, 90, 89, 62, 58, 74, 66, 75, 90, 89, 63.
Determine the mean.

Solution:
• There are 25 test scores.
• The sum of the test scores is 1853.

Determine the mean.

$\bar{x} = \dfrac{1853}{25} \approx 74.1$

The mean of the results of a Mathematics A30 midterm exam is 74.1%.

A scientific calculator can be used to average numbers.

Median

The second measure of central tendency is the median. This is the middle number when all of the data is arranged from smallest to largest. When you are given a set of data, the data can be arranged in an array. This is another way to say that the data is arranged from smallest to largest values.

Example 2
The ages of the mathematics teachers at a school are as follows:
27, 61, 35, 38, 58, 50, 48, 33, 37.
Order the data in an array and determine the median.

Solution:
Arrange the data in order from smallest to largest.
27, 33, 35, 37, 38, 48, 50, 58, 61

Determine the middle number.
The median of the data is 38, the 5th of the 9 values.

The median age of the mathematics teachers at the school is 38 years.

• If there are 20 numbers in the sample group, there will be 2 numbers that are the middle numbers, the 10th number and the 11th number. If this is the case, the median is the average of these two middle numbers. This is the same for all sample groups that have an even number of data.

In this example there were not very many numbers, and it was easy to order them from smallest to largest. If you have a sample group where there is a lot of data, a stem and leaf plot can help you to organize the data.

The following table shows a stem and leaf plot, where the stem is used to organize the "first" or tens digits, and the leaf is used to record the "last" or units digits. Once the table is completed it is much easier to write the data in an array.

The data from Example 2 was as follows: 27, 61, 35, 38, 58, 50, 48, 33, 37. This data is shown in the following table.

Stem   Leaf
2      7
3      5, 8, 3, 7
4      8
5      8, 0
6      1

These values can now be written in an array. If a value is repeated in the sample group, each occurrence is still recorded in the stem and leaf plot.
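The mean, the stem and leaf plot, and the median described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of the original lesson: the helper names `mean`, `stem_and_leaf` and `median` are hypothetical, and `stem_and_leaf` assumes whole-number data so that the stem is the tens digit and the leaf is the units digit.

```python
from collections import defaultdict


def mean(values):
    """Arithmetic average: the sum divided by the number of values."""
    return sum(values) / len(values)


def stem_and_leaf(values):
    """Group whole numbers by tens digit (stem), keeping units digits (leaves).

    Leaves are kept in the order the data was entered, as in the lesson's table.
    """
    plot = defaultdict(list)
    for v in values:
        plot[v // 10].append(v % 10)
    return dict(sorted(plot.items()))


def median(values):
    """Middle value of the ordered array, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


scores = [62, 57, 86, 74, 93, 44, 73, 90, 72, 79, 82, 87, 38,
          97, 63, 90, 89, 62, 58, 74, 66, 75, 90, 89, 63]
ages = [27, 61, 35, 38, 58, 50, 48, 33, 37]

if __name__ == "__main__":
    print("Mean test score:", round(mean(scores), 1))
    for stem, leaves in stem_and_leaf(ages).items():
        print(stem, "|", ", ".join(str(leaf) for leaf in leaves))
    print("Median age:", median(ages))
```

The stem and leaf output for the teachers' ages has the same rows as the table above, and the median function covers both the odd case (9 ages) and the even case (two middle numbers averaged).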
Example 3
The following data is the same data as in Example 1, representing the results of a Mathematics A30 midterm test:
62, 57, 86, 74, 93, 44, 73, 90, 72, 79, 82, 87, 38, 97, 63, 90, 89, 62, 58, 74, 66, 75, 90, 89, 63.
Enter the data in a stem and leaf plot, organize the data in an array and then determine the median of the data.

Solution:
Enter the data in a stem and leaf plot.

Stem   Leaf
3      8
4      4
5      7, 8
6      2, 3, 2, 6, 3
7      4, 3, 2, 9, 4, 5
8      6, 2, 7, 9, 9
9      3, 0, 7, 0, 0

Arrange the data in an array.
38, 44, 57, 58, 62, 62, 63, 63, 66, 72, 73, 74, 74, 75, 79, 82, 86, 87, 89, 89, 90, 90, 90, 93, 97

Determine the median or the middle term.
• There are 25 terms.
• The middle term will be the 13th term.
• The 13th term in the array is 74.

The median of the test scores on the Mathematics A30 midterm test is 74.

Mode

The mode is the third way to measure central tendency. The mode is the most frequently occurring number or result in the sample group. This is perhaps the least reliable way of determining central tendency, as will be shown in the example on test scores.

Example 4
The following data is the same data as in Examples 1 and 3, representing the results of a Mathematics A30 midterm test:
62, 57, 86, 74, 93, 44, 73, 90, 72, 79, 82, 87, 38, 97, 63, 90, 89, 62, 58, 74, 66, 75, 90, 89, 63.
Determine the mode.

Solution:
Determine the number that is repeated the most often.
• The score, 90, is repeated three times.

The mode of this data is 90.

Is the mode a good measure of central tendency in this example? Which measure would you say best describes the data? Is there much difference between the median and the mean in this situation?

Example 5
Data was collected for the monthly precipitation in the grassland area of Shaunavon. The following table shows the results.
Determine the mean, median and mode of the data. Which value is the best measure of central tendency?

Monthly Precipitation (cm)
J     F     M     A     M     J     J     A     S     O     N     D
1.2   2.0   2.1   2.8   4.0   8.0   5.1   3.9   3.1   2.1   1.5   1.4

Solution:
Determine the mean.
• The sum of the data is 37.2.
• There are 12 numbers in the sample group.

$\bar{x} = \dfrac{37.2}{12} = 3.1$

The mean is 3.1 cm.

Determine the median.
Construct a stem and leaf plot.

Stem   Leaf
1      .2, .5, .4
2      .0, .1, .8, .1
3      .9, .1
4      .0
5      .1
8      .0

Arrange the data in an array.
1.2, 1.4, 1.5, 2.0, 2.1, 2.1, 2.8, 3.1, 3.9, 4.0, 5.1, 8.0

The two middle values are the 6th and 7th values, 2.1 and 2.8. Find the average of the two middle values.

$\text{Median} = \dfrac{2.1 + 2.8}{2} = 2.45$

The median is 2.45 cm.

Determine the mode.
• The value, 2.1, is repeated 2 times in the sample group.
• No other value has been repeated.

The mode is 2.1 cm.

Which value would be the best indicator of central tendency?

Mode
• The values are spread out and the 2.1 is only repeated 2 times, therefore this is not a good measure.

Mean
• There is an extreme value of 8.0 cm that is not close to the other values. This increases the mean and perhaps skews this value a little.

Median
• This is the best measure of central tendency because it depends only on the middle two terms and is not pulled upward by the extreme value of 8.0 cm.

The following are the three most common measures of central tendency:
• Mean: This is the arithmetic average of the numbers in the sample group.
• Median: This is the middle number when all the data is arranged in order from smallest to largest.
• Mode: This is the most frequently occurring number or result in the sample group.

Exercise 1.2

1. The mature heights of species of trees in Saskatchewan recommended for shelterbelts are as follows:

Species            Mature Height (m)
Green Ash          15
Manitoba Maple     14
American Elm       20
Siberian Elm       10
Willow             14
Poplar             20
Caragana           4
Villosa Lilac      4
Chokecherry        7
Buffaloberry       5
Honeysuckle        4
White Spruce       20
Colorado Spruce    25
Scots Pine         20

Determine the three measures of central tendency for this data. Which value is the best measure of central tendency?

2. The estimated price of crops in Saskatchewan for 1995 is shown in the following table.

Crop              Price ($/t)
Winter Wheat      140
Spring Wheat      150
Durum             180
Oats              90
Barley            85
Fall Rye          90
Spring Rye        90
Flax              255
Canola            305
Mixed Grains      85
Mustard Seed      265
Sunflower Seed    275
Lentils           360
Field Peas        165
Canary Seed       240

Determine the three measures of central tendency for this data. Which value is the best measure of central tendency?

3. The winning team must win four out of seven games. The scores are shown in the table.

Game   Team A   Team B
1      12       1
2      7        8
3      10       3
4      1        2
5      3        4
6      2        1
7      0        1

Explain why the mean score is not a good measure to determine the winner.

1.3 Box and Whisker Plots

When analyzing data, it is often helpful to see the full range of results. Graphs are one of the most common ways used to display data in a meaningful manner. The box and whisker plot is one way of showing statistical data. Once the data has been arranged and plotted, it is easier to analyze the results.

It is very important to understand how to determine the median value of a set of data. This is the measure of central tendency that will guide the construction of the box and whisker plots.

The data from the examples in Section 1.2 on the results of a Mathematics A30 midterm test will again be used to illustrate the process used to develop a box and whisker plot.
The results of a Mathematics A30 midterm test:
62, 57, 86, 74, 93, 44, 73, 90, 72, 79, 82, 87, 38, 97, 63, 90, 89, 62, 58, 74, 66, 75, 90, 89, 63

The following steps are used when organizing data to be entered on a box and whisker plot:

1. Organize the data into an array and determine the median.
   38, 44, 57, 58, 62, 62, 63, 63, 66, 72, 73, 74, 74, 75, 79, 82, 86, 87, 89, 89, 90, 90, 90, 93, 97
   • The median of the data is 74. This value is also called the 2nd quartile.
   • The data has now been divided into two parts.
   • There are 12 values in the upper half of the data and 12 values in the lower half of the data.

2. Determine the median of the upper half of the data.
   75, 79, 82, 86, 87, 89, 89, 90, 90, 90, 93, 97
   • The two middle terms are 89 and 89. The average of the middle two terms is 89, and therefore the median of the upper terms is 89.
   • Another name for this value is the 3rd quartile.

3. Determine the median of the lower half of the data.
   38, 44, 57, 58, 62, 62, 63, 63, 66, 72, 73, 74
   • The two middle terms are 62 and 63. The average of the middle two terms is 62.5 and therefore the median of the lower terms is 62.5.
   • Another name for this value is the 1st quartile.

4. Draw a number line which includes the entire range of data from the smallest number to the largest number.
   • For this example the number line will have to go from 38 to 97.
   • Label the number line with an even scale.

5. Plot the data on the number line by placing dots on the line or above the line to indicate each score.

6. Draw the box.
   • Draw a vertical line through the 2nd quartile or the median which is 74.
   • Draw a vertical line through the 3rd quartile or the upper median which is 89.
   • Draw a vertical line through the 1st quartile or the lower median which is 62.5.
   • Draw a box around the values between the 1st and 3rd quartiles.

7. Draw the whiskers.
   • Draw whiskers extending from each of the quartiles to the lowest and highest values.
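Steps 1 to 3 above (the median, then the median of each half) can be checked with a short Python sketch. It is an illustration only: the function names `median` and `quartiles` are hypothetical, and the sketch follows this lesson's convention of leaving the median out of both halves when the number of values is odd. Other software may use a different quartile convention and give slightly different values.

```python
def median(ordered):
    """Median of a list that is already sorted from smallest to largest."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (1st quartile, 2nd quartile, 3rd quartile) by the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd count, the median itself belongs to neither half.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower), median(ordered), median(upper)


scores = [62, 57, 86, 74, 93, 44, 73, 90, 72, 79, 82, 87, 38,
          97, 63, 90, 89, 62, 58, 74, 66, 75, 90, 89, 63]

if __name__ == "__main__":
    q1, q2, q3 = quartiles(scores)
    print("Lowest value:", min(scores))
    print("1st quartile:", q1)
    print("2nd quartile (median):", q2)
    print("3rd quartile:", q3)
    print("Highest value:", max(scores))
```

For the midterm scores this gives a 1st quartile of 62.5, a median of 74 and a 3rd quartile of 89, with extremes of 38 and 97. These five values are all that is needed to draw the box and the whiskers in steps 4 to 7.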
How can you use the box and whisker plot to analyze the data?
• 50% of the data is contained in the box.
• 25% of the data is between the upper quartile and the upper extreme.
• 25% of the data is between the lower quartile and the lower extreme.

A box and whisker plot can, at a glance, give you a quick impression of certain important features of the set of data.

Location of the data:
• The median shows the centre of the data.

Spread of the data:
• The length of the box shows the spread of the middle 50% of the data.
• The length of each whisker shows the spread of the values in the upper and lower 25% of the data.

Symmetry of the data:
• The symmetry of the box with respect to the median usually indicates the symmetry of the data with respect to the median.

How can you analyze the data from the box and whisker plot of the results on the Mathematics A30 test?
• The median is a good indication of the centre of the data.
• The values in the box are spread out showing that there is a range of values in the middle 50% of the data.
• The lower 25% of the data is spread out more because of a lower extreme.
• The data within the box is fairly symmetrical, showing that the entire sample group is symmetrical as well. This means that the median is a good indication of central tendency.

A graphing calculator can create a box and whisker plot. The appendix at the end of this lesson (page 49) will explain how to do this with the TI-83 Plus graphing calculator.
• Use this procedure to enter the above data from the question on the test scores.
• Set the WINDOW at Xmin = 30, Xmax = 100, Xscl = 10.
• Use the TRACE function to determine the upper and lower extremes, the 1st and 3rd quartiles and the median.

Example 1
The number of passes attempted and completed by a Saskatchewan Roughrider quarterback in each of the games in the 1996 season is given in the table.
a) Show each of the sets of data in a box and whisker plot.
b) Analyze the data.

Game            Passes Attempted   Passes Completed
vs. Calgary     44                 21
vs. Ottawa      33                 19
vs. Edmonton    25                 16
vs. Hamilton    38                 22
vs. Toronto     28                 14
vs. Calgary     33                 17
vs. Montreal    25                 16
vs. Edmonton    33                 19
vs. Montreal    33                 23
vs. Winnipeg    25                 12
vs. Toronto     33                 20
vs. Ottawa      23                 11
vs. B.C.        29                 14
vs. Winnipeg    27                 11
vs. Hamilton    31                 22
vs. Winnipeg    26                 13
vs. B.C.        22                 13
vs. Calgary     39                 20

Solution:
a) Make an array of the attempted passes. Use a stem and leaf plot if necessary.
22, 23, 25, 25, 25, 26, 27, 28, 29, 31, 33, 33, 33, 33, 33, 38, 39, 44

Determine the 1st, 2nd and 3rd quartiles.
• 1st quartile: 25
• 2nd quartile: 30 (the average of 29 and 31)
• 3rd quartile: 33

Create a box and whisker plot.

Make an array of the completed passes.
11, 11, 12, 13, 13, 14, 14, 16, 16, 17, 19, 19, 20, 20, 21, 22, 22, 23

Determine the 1st, 2nd and 3rd quartiles.
• 1st quartile: 13
• 2nd quartile: 16.5 (the average of 16 and 17)
• 3rd quartile: 20

Create a box and whisker plot.

Both box and whisker plots are shown on the same scale so that a comparison between the data can be made.

Use the graphing calculator to create a box and whisker plot for each set of data.
• Set the WINDOW at: Xmin = 0, Xmax = 60, Xscl = 10.

b) Analyze the data.
• The number of completions is less than the number of passes attempted.
• The spread of the data for the middle 50% of the data is about the same for both the attempted passes and the completed passes.
• The number of passes in the upper 25% of the data is more spread out than in the upper 25% for the number of completions.
• Both sets of data are fairly symmetrical.

It could be concluded that during the 1996 season, more passes by Saskatchewan Roughrider quarterbacks did not result in more completions.
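The comparison in part b) can be backed up with numbers. The sketch below is a hypothetical illustration, not part of the original lesson: the function name `five_number_summary` is an assumption, the quartiles use the same median-of-halves method as the box and whisker steps, and the median itself comes from Python's standard `statistics` module.

```python
from statistics import median


def five_number_summary(values):
    """Return (lowest, 1st quartile, median, 3rd quartile, highest)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd count, the median is left out of both halves.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]


attempted = [44, 33, 25, 38, 28, 33, 25, 33, 33, 25, 33, 23, 29, 27, 31, 26, 22, 39]
completed = [21, 19, 16, 22, 14, 17, 16, 19, 23, 12, 20, 11, 14, 11, 22, 13, 13, 20]

if __name__ == "__main__":
    for label, data in (("Attempted", attempted), ("Completed", completed)):
        low, q1, q2, q3, high = five_number_summary(data)
        print(label)
        print("  five-number summary:", low, q1, q2, q3, high)
        print("  length of box (middle 50%):", q3 - q1)
        print("  length of upper whisker:", high - q3)
```

With this data the box lengths come out to 8 for attempted passes and 7 for completed passes, while the upper whiskers are 11 and 3. That matches the observations that the middle 50% has about the same spread in both sets and that the upper 25% is more spread out for attempted passes.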
Percentiles

When studying box and whisker plots it was necessary to determine the median and the upper and lower quartile numbers. The following observations about percent could be made from these values:
• 25% of the data was below the 1st quartile.
• 50% of the data was below the 2nd quartile.
• 75% of the data was below the 3rd quartile.

Percentiles allow you to determine the position of a number in a distribution of data.
• The 1st quartile is at the 25th percentile.
• The 2nd quartile is at the 50th percentile.
• The 3rd quartile is at the 75th percentile.

Other points can also mark the percentages of distribution, such as 5%, 15%, 62%, etc. Each of these values also states the percent of the data that is below.

Percentile
The nth percentile $P_n$ of a distribution is the number below which n% of the values fall.

Example 2
The following list contains the shoe sizes of some of the teachers at a school.
9, 4, 11, 6, 10, 7.5, 7.5, 7.5, 7, 6, 7.5, 8.5, 7.5, 9, 6.5, 8, 7, 8
Organize the data to determine $P_{10}$, $P_{25}$, $P_{60}$, $P_{75}$, and $P_{90}$.

Solution:
Read the problem.
A list of shoe sizes is given. Determine the 10th, 25th, 60th, 75th and 90th percentiles.

Develop a plan.
Arrange the data in an array and determine the median. The 1st quartile is at the 25th percentile. The 3rd quartile is at the 75th percentile. Find the other percentiles by determining how many values are below them.

Carry out the plan.
Create a stem and leaf plot.

Stem   Leaf
4      .0
6      .0, .0, .5
7      .5, .5, .5, .0, .5, .5, .0
8      .5, .0, .0
9      .0, .0
10     .0
11     .0

Write the data in an array.
4, 6, 6, 6.5, 7, 7, 7.5, 7.5, 7.5, 7.5, 7.5, 8, 8, 8.5, 9, 9, 10, 11

Determine the median.
There are 18 values, so the middle two values are the 9th and 10th values, which are both 7.5.
The median is 7.5.

Determine the 1st quartile.
The lower half of the array is 4, 6, 6, 6.5, 7, 7, 7.5, 7.5, 7.5, and its middle (5th) value is 7.
The value of the 1st quartile is 7.
• Therefore $P_{25}$ is 7.

Determine the 3rd quartile.
The upper half of the array is 7.5, 7.5, 8, 8, 8.5, 9, 9, 10, 11, and its middle (5th) value is 8.5.
• Therefore $P_{75}$ is 8.5.

Determine $P_{10}$.
• 10% of the values are below this value.
• 10% of 18 values is 1.8.
• Only one value will be below the 10th percentile, so $P_{10}$ is the 2nd value in the array.
• Therefore $P_{10}$ is 6.

Determine $P_{60}$.
• 60% of the values are below this value.
• 60% of 18 values is 10.8.
• 10 values will be below the 60th percentile, so $P_{60}$ is the 11th value in the array.
• Therefore $P_{60}$ is 7.5.

Determine $P_{90}$.
• 90% of the values are below this value.
• 90% of 18 values is 16.2.
• 16 values will be below the 90th percentile, so $P_{90}$ is the 17th value in the array.
• Therefore $P_{90}$ is 10.

Example 3
The following graph shows the percentiles of the height and weight of girls from the ages of 2 to 18.
a) Using the percentile chart, determine what percentile Sacha, a 5 year old girl, would be for a weight of 18 kg and a height of 114 cm. Explain what this means.
b) A public health nurse tells a father that Paula, his 11 year old daughter, is at the 75th percentile for weight and at the 30th percentile for height. What are her weight and height? Describe her physical stature.

[Percentile charts: height and weight of girls, ages 2 to 18]

Solution:
a)
• She would be at the 50th percentile for weight and the 90th percentile for height.
• 50% of other girls her age would be lighter than Sacha.
• Only 10% of other girls her age would be taller than Sacha.
• Sacha would more than likely be tall and slim.

b)
• Paula would weigh 43 kg.
• Paula would be 141 cm tall.
• Paula would more than likely be shorter and heavier than most girls her age.

Exercise 1.3

1. In Exercise 1.2, questions 1 and 2 showed data from the mature height of trees, and the estimated price of crops in Saskatchewan.
   a) Create a box and whisker plot for each of these sets of data.
   b) Determine $P_{10}$, $P_{40}$, $P_{75}$, and $P_{85}$ for each set of data.

2. Four boys and their fathers attended the public health clinic. The following information was gathered and put into a table. Complete the percentiles for each child.

Name       Age   Weight (kg)   Percentile   Height (cm)   Percentile
Mark       2     15                         88
Kristian   7     23                         62
Joey       12    35                         160
Blair      16    73                         173

1.4 Data Analysis

Tourism Saskatchewan uses data from Statistics Canada and Customs Canada to monitor and analyze the number of vehicles that enter Saskatchewan from the United States. These vehicles carry tourists that will be staying in Saskatchewan for a limited amount of time. This section will present data that was obtained from this study and show you how to analyze this data using all of the different methods that have been described in this lesson.

The following table shows data from two different ports of entry from the United States into Saskatchewan. One is Northgate and the other one is Oungre. The number of vehicles entering Saskatchewan is broken down by month.

Month       Northgate   Oungre
January     154         270
February    182         286
March       389         294
April       247         356
May         447         615
June        893         1037
July        1864        1092
August      1454        860
September   454         740
October     393         446
November    250         325
December    280         324

Determine the mean number of vehicles entering each port monthly.

Northgate: $\bar{x} = \dfrac{7007}{12} \approx 584$

Oungre: $\bar{x} = \dfrac{6645}{12} \approx 554$

The mean number of vehicles entering Northgate each month is 584 and entering Oungre each month is 554.

Determine the modes.
There is no mode value for either of the sets of data.

Determine the medians by creating a stem and leaf plot.

Northgate:

Stem   Leaf
1      54, 82
2      47, 50, 80
3      89, 93
4      47, 54
8      93
14     54
18     64

Write the data in an array.
154, 182, 247, 250, 280, 389, 393, 447, 454, 893, 1454, 1864

Determine the median and 1st and 3rd quartiles.
154, 182, 247, 250, 280, 389, 393, 447, 454, 893, 1 454, 1 864

• The 1st quartile is 249 vehicles per month.
• The 2nd quartile (median) is 391 vehicles per month.
• The 3rd quartile is 674 vehicles per month.

Oungre:

Stem   Leaf
 2     70, 86, 94
 3     24, 25, 56
 4     46
 6     15
 7     40
 8     60
10     37, 92

Write the data in an array.

270, 286, 294, 324, 325, 356, 446, 615, 740, 860, 1 037, 1 092

Determine the median and 1st and 3rd quartiles.

270, 286, 294, 324, 325, 356, 446, 615, 740, 860, 1 037, 1 092

• The 1st quartile is 309 vehicles per month.
• The 2nd quartile (median) is 401 vehicles per month.
• The 3rd quartile is 800 vehicles per month.

Represent each set of data on a box and whisker plot.

• Keep the same scale for both box and whisker plots so that the data can be compared.

[Box and whisker plots: Northgate and Oungre, drawn on the same scale]

Use your graphing calculator to show these two box and whisker plots. Enter the data for Northgate in L1. Enter the data for Oungre in L2. Use the appendix at the end of this lesson to guide you through the steps. The graphing calculator will also show you the values of the median and the upper and lower quartiles when you use the TRACE function key.

Analyze the data.

• Both sets of data have similar central tendencies. The means (584 and 554) and the medians (391 and 401) of the two ports are very similar.
• The box and whisker plots give a better indication of differences.
• The data for the port of Northgate is spread out further, as shown by the length of the upper whisker. Northgate has a couple of months that are really busy, and in the other months the flow of traffic is fairly steady.
• The data for the port of Oungre is not spread out as far. The values are not symmetrical: the lower end of the graph is bunched up, which means the values there are similar.
• The upper 50% of the data are spread out evenly.

Can you think of any other conclusions that can be made from this data?

Exercise 1.4

1.
Two other ports of entry for United States tourists entering Saskatchewan have been chosen. Analyze the data in the same manner as in this section.

Month        Coronach   Estevan
January            83       346
February           85       367
March             214       604
April             248       536
May               353       619
June              330       720
July              391       878
August            372       717
September         332       621
October           143       613
November          144       605
December          131       661

Appendix

Using the TI-83 Plus Graphing Calculator for Box and Whisker Plots

1. Clear off any data that has been previously entered into the calculator:
• Press Y= to clear any functions that have been entered.
• Press STAT, ENTER and three columns will appear. To clear a column, use the arrows to move to the top of the column, press CLEAR, then use the down arrow to go back into the column. It should now be empty. Do this for all the columns that have data in them.

2. To enter your data into the calculator:
• Press STAT, ENTER. (This takes you back to the three columns, which should now be empty.)
• To enter your data, move the cursor to the first line in L1.
• Type in the first number, then press ENTER. The cursor will then move down to line two. Repeat this until you have all your data entered. If you have more than one set of data, move the cursor to the first line in L2 and repeat the process.

3. To calculate the Mean:
• Press 2nd, MODE (QUIT) to get back to the original screen.
• Press 2nd, STAT, use the arrow to move over to MATH, then use the down arrow to get to mean( (option 3), and press ENTER. You will see mean( and a flashing cursor. Press 2nd, 1, ), ENTER. The mean will appear on the right. This is the mean for the data in column 1 (L1).

4. To calculate the Median:
• Repeat the steps for finding the mean, except move the arrow down to median.

5. To construct a Box and Whisker Plot:
• You need to specify the values for the x and y axes. To do this press WINDOW; the cursor will be waiting on Xmin. Type your value, then ENTER. It will then move to Xmax.
Type your value, then ENTER. It will then move to Xscl. This is the amount you want your x values to go up by. Type your value, then ENTER. Your max/min values correspond to your data. The y-values are not necessary for a box and whisker plot.
• Press STAT PLOT (2nd, Y=).
• Press ENTER, ENTER (this turns the stat plot on).
• Use the arrows to move down to Type, then across to the fifth picture, which is a box and whisker plot, and press ENTER. Then move down to Xlist and across to the list that corresponds to your graph. (You can graph three at a time.) Do this for each list that you have entered.
• To draw the box and whisker plot, press GRAPH. Using the TRACE button you can determine the values of the median, 1st quartile and 3rd quartile. Simply move the cursor left and right with the arrow keys.

Answers to Exercises

Exercise 1.1

Answers may vary. Here are some suggestions.

1. The simulation can be done with a method similar to the one in Example 2, except that three coins may be tossed at once. A “yes” is recorded when the coins land with two heads and one tail up.

2. Select groups of seven digits at random from the table of random numbers, as was done in Example 3. Let an even digit represent a win and an odd digit a loss. If the first game played is represented by the digit on the left, any series of 4 wins or 4 losses before the last game is recorded as a NO.

$\text{Probability of going seven games} = \dfrac{\text{number of yes's}}{\text{number of seven-digit groups}}$

3. Select a row at random from the random numbers table and, going from left to right, select the first two digits that are either 1, 2, 3 or 4 and bypass the other digits. The first digit represents the correct answer of Question 1 and the second digit represents a guess at the answer of Question 1. If the two digits are the same, the guess is correct. The third digit represents the correct answer for Question 2 and the fourth digit represents the guess for Question 2, etc.
One trial is completed when 20 guesses are made. Do about 10 or 20 trials.

$\text{Probability that at least half the choices are correct} = \dfrac{\text{number of trials in which at least half are correct}}{\text{number of trials}}$

Exercise 1.2

1. Mean 13, Median 14, Mode 20
The mean or the median is a good measure of central tendency.

2. Mean 185, Median 165, Mode 90
The mean is a good measure of central tendency.

3. Team A lost the series but their average score over the seven games is much higher than that of team B.

Exercise 1.3

1. Height of Trees for Shelterbelts:

a. Create an array.

4, 4, 4, 5, 7, 10, 14, 14, 15, 20, 20, 20, 20, 25

[Array diagram marking the 1st quartile, 2nd quartile and 3rd quartile]

b. P10: 10% of 14 is 1.4. One value lies below the 10th percentile. Therefore P10 = 4.
P40 = 10
P75 = 20
P85 = 20

Estimated Price of Crops in Saskatchewan for 1995:

a. Create an array.

85, 85, 90, 90, 90, 140, 150, 165, 180, 240, 255, 265, 275, 305, 360

[Array diagram marking the median]

b. P10: 10% of 15 = 1.5. One value lies below the 10th percentile. Therefore P10 = 85.
P40 = 150
P75 = 265
P85 = 275

2.

Name       Age   Weight (kg)   Percentile   Height (cm)   Percentile
Mark         2            15   90th                  88   75th
Kristian     7            23   50th                  62   5th
Joey        12            35   25th                 160   92nd
Blair       16            73   82nd                 173   50th

Exercise 1.4

1. Coronach:
Mean: 235.5
Mode: no mode
Median: 231
1st quartile: 137
3rd quartile: 342.5

2. Estevan:
Mean: 607.25
Mode: no mode
Median: 616
1st quartile: 570
3rd quartile: 689

• The means show that, on average, the traffic through Estevan is much greater than through Coronach.
• For both ports, the box and whisker plots are fairly symmetrical. This shows that either the mean or the median is a good indicator of central tendency.
• The data for Coronach is closer together, showing a steady stream of traffic through this port.
• The middle 50% of data for Estevan is close together, but the whiskers extend out farther, showing months where the traffic is extremely heavy or, at the lower end, not heavy at all.
• You may also have more conclusions from this data.

Mathematics A30 Module 1 Assignment 1

Staple here to the upper left corner of your assignment

Before submitting your assignment, please complete the following procedures:

1. Write your name and address and the course name and assignment number in the upper right corner of the first page of each assignment.
2. Number all the pages and place them in order.
3. Complete the required information details on this address sheet.
4. Staple this address sheet to the appropriately numbered assignment. Use one address sheet for each assignment.
5. Staple the appropriately numbered Assignment Submission Sheet to the upper left corner, on top of this address sheet.

Print your name and address, with postal code. This address sheet will be used when mailing back your corrected assignment.

Student Number:
Name:
Street Address or P.O. Box:
City/Town, Province:
Country:
Postal Code:
Course Number: 8404
Course Title: Mathematics A30
Assignment Number: 01
Distance-Learning Teacher’s Name:
Mark Assigned:

Assignment 1

The value of each question is shown in parentheses.

Part A can be answered in the space provided. You also have the option to do the remaining questions in this assignment on separate lined paper. If you choose this option, please complete all of the questions on the separate paper.

A.

(8) 1. Hand in Activity 1.12.

(8) 2. The production of flaxseed in Saskatchewan for the years 1980-1995, in thousands of tonnes, was as follows. (The data has been organized in an array.)

122, 123, 128, 145, 146, 160, 191, 206, 212, 230, 237, 260, 263, 306, 312, 407

a. Label the 1st and 3rd quartiles of the data and the median of the data.
b. Underline all the data in the first quarter (the data below the first quartile) and double underline all the data in the fourth quarter (the data above the 3rd quartile).

(8) 3. The population of sheep and lambs in Saskatchewan for the years 1982-1995, in thousands, was as follows:

Year   Pop.
‘82    74
‘83    66
‘84    62
‘85    57
‘86    53
‘87    59
‘88    64
‘89    72
‘90    83
‘91    92
‘92    85
‘93    82
‘94    79
‘95    83

a. Use a stem-and-leaf plot to order the data.
b. Draw a box and whisker plot for the data.

(8) 4.
a. Describe in detail a simulation using one die to select one person from a group of 2 persons. In your simulation does each person have an equal chance of being selected?
b. Describe in detail a simulation using a table of random numbers to select three persons from a group of 12 persons.

(8) 5. Katrina got 75% on a mathematics exam. The marks in her class, including her own, were:

75, 75, 53, 64, 70, 79, 92
55, 65, 65, 44, 60, 89
71, 69, 68, 61, 60, 60

Calculate her percentile rank in the class.

(8) 6. For the given data determine P45, P50, and P90.

21, 13, 35, 19, 35, 10, 17, 56, 35, 20, 50, 22, 13, 9, 39, 22, 0, 2, 24, 38, 6, 5, 44, 22, 24, 25, 3, 13, 6, 12, 13, 33, 59, 10, 7, 63, 9, 25, 1, 15

(8) 7. Determine the mean, median, and mode for the data in Question 6.

(8) 8. Select a population of 100 digits from the table of random numbers in Section 1.1 and determine the frequency with which each digit occurs. State the probability of each digit occurring if a digit is selected at random from the table. Be sure to organize your data in an appropriate table.

Answer Part B and Part C in the space provided. Evaluation of your solution to each problem will be based on the following.

(30) B.

• A correct mathematical method for solving the problem is shown.
• The final answer is accurate and a check of the answer is shown where asked for by the question.
• The solution is written in a style that is clear, logical, well organized, uses proper terms, and states a conclusion.

1. The grade twelve students at Grassy Side High wrote an exam on probability and, after an additional review of the material, wrote a similar exam one week later. The scores for both exams are listed.

EXAM #1 (March 20)
67 0 37 63 59 95 15 61 59 20 29 32 41 50 45 35 100 71 65 43 70 70 41 43 71 29 57 46 89 38 66 56 91 46 30 50 96 65 40 50 64 62 5 60 97 85 5 91 90 89 26 55 54 95 45 50 53 85 50 47 50 61 55 50

EXAM #2 (March 27)
95 64 85 84 35 34 51 97 40 46 55 96 52 50 86 81

a. Show a stem and leaf plot for the data of each exam.
b. Write the data for each exam in an array.
c. Represent each set of data in a box-and-whisker plot.
d. How many students are above or equal to the 90th percentile in each of the exams?
e. Determine P60 for each exam.
f. If a student made 70% in the first exam, in what percentile is this mark? If this student hoped to remain in at least the same percentile in the next exam, what score would have to be made on the next exam?

(6) C.

1. (STUDENT JOURNAL) Each lesson assignment will include a question called STUDENT JOURNAL. For each assignment you will be asked to present written material on some aspect of the course. For this assignment:

a. Write one or two paragraphs introducing yourself, including a reason for taking this course and what you would like to learn from this course.
b. Summarize the content of this lesson into a single page. Do this in a way which may be used to review at a later date.

_____ / 100
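The percentile counting rule and the quartile method used in this lesson can also be written as a short program, which may help when checking hand calculations. The following is a minimal Python sketch and is not part of the course requirements. The function names `lesson_percentile` and `lesson_quartiles` are hypothetical, and the rule of dropping the fractional part (for example, 10.8 becomes 10 values below P60) is inferred from the worked examples. Other textbooks and calculators may define percentiles differently.

```python
from statistics import median


def lesson_percentile(values, k):
    """Return P_k using the counting rule from the lesson's worked examples."""
    data = sorted(values)
    # k% of n values, with the fractional part dropped, lie below P_k.
    below = (k * len(data)) // 100
    # Assumption: keep the position inside the array when k is close to 100.
    below = min(below, len(data) - 1)
    # P_k is the next value in the array after those `below` values.
    return data[below]


def lesson_quartiles(values):
    """Return (Q1, median, Q3); Q1 and Q3 are the medians of the two halves.

    Needs at least two values. The middle value is left out of both
    halves when the number of values is odd.
    """
    data = sorted(values)
    half = len(data) // 2
    return median(data[:half]), median(data), median(data[len(data) - half:])


# The 18 values from the percentile example in this lesson.
example_values = [4, 6, 6, 6.5, 7, 7, 7.5, 7.5, 7.5,
                  7.5, 7.5, 8, 8, 8.5, 9, 9, 10, 11]

# Monthly vehicle counts for Northgate from Section 1.4.
northgate = [154, 182, 389, 247, 447, 893, 1864, 1454, 454, 393, 250, 280]

print(lesson_percentile(example_values, 10))  # 6
print(lesson_percentile(example_values, 60))  # 7.5
print(lesson_percentile(example_values, 90))  # 10
print(lesson_quartiles(northgate))            # (248.5, 391.0, 673.5)
```

Rounded to whole vehicles, the Northgate quartiles agree with the values 249, 391 and 674 found in Section 1.4.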