Accelerated Math I
Unit 6: Data Analysis and Probability – Teachers Edition

Table of Contents
Introduction
Task 1: Regression Learning Task
   Written by George R. Shirley, Jr.
   Airline Online Instructions written by Susan Hotle, Edited by George R. Shirley, Jr.
Summative Evaluation for Regression based on Climate Change
   Written by George R. Shirley, Jr.
Task 2: Probabilities Learning Task
   Written by Brittany Luken, Edited by George R. Shirley, Jr.
Task 3: Correlation and its Meaning
   Written by Brittany Luken, Edited by George R. Shirley, Jr.

Data Analysis and Probability Using Applications from the Airline Industry with Evaluation Exploring Climate Change

The purpose of this unit is to introduce Data Analysis and Probability using real data gleaned from the airline industry and simulations involving the airline industry. Task 1 uses a program simulating building and maintaining an airline to generate data that is used to explore the topic of regression analysis. Task 2 takes place during the maintaining part of Task 1 and explores the topic of probability by evaluating delayed departures in the aviation industry. Task 3 uses departure times to explore linear models. All of this material was developed for Dr. Laurie Garrow in the School of Civil and Environmental Engineering at Georgia Tech as a part of a statistics project by Brittany Luken and a G.I.F.T. project by George R. Shirley, Jr., with guidance from Brittany Luken and Susan Hotle.

Task 1 - Regression Learning Task: Regression Analysis through Airline Simulation

Accelerated Math I Performance Standards:
MA1D5. Students will determine an algebraic model to quantify the association between two quantitative variables.
a. Gather and plot data that can be modeled with linear and quadratic functions.
b.
Examine the issues of curve fitting by finding good linear fits to data using simple methods such as the median-median line and “eyeballing.” c. Understand and apply the processes of linear and quadratic regression for curve fitting using appropriate technology. Process Standards: MA1P1. Students will solve problems (using appropriate technology). a. Build new mathematical knowledge through problem solving. b. Solve problems that arise in mathematics and in other contexts. c. Apply and adapt a variety of appropriate strategies to solve problems. d.. Monitor and reflect on the process of mathematical problem solving. MA1P2. Students will reason and evaluate mathematical arguments. a. Recognize reasoning and proof as fundamental aspects of mathematics. b. Make and investigate mathematical conjectures. c. Develop and evaluate mathematical arguments and proofs. d. Select and use various types of reasoning and methods of proof. MA1P3. Students will communicate mathematically. a. Organize and consolidate their mathematical thinking through communication. b. Communicate their mathematical thinking coherently and clearly to peers, teachers, and others. c. Analyze and evaluate the mathematical thinking and strategies of others. d. Use the language of mathematics to express mathematical ideas precisely. MA1P4. Students will make connections among mathematical ideas and to other disciplines. a. Recognize and use connections among mathematical ideas. b. Understand how mathematical ideas interconnect and build on one another to produce a coherent whole. c. Recognize and apply mathematics in contexts outside of mathematics. 2 MA1P5. Students will represent mathematics in multiple ways. a. Create and use representations to organize, record, and communicate mathematical ideas. b. Select, apply, and translate among mathematical representations to solve problems. c. Use representations to model and interpret physical, social, and mathematical phenomena. 
I - Engage Activity
Materials: Smartboard/Whiteboard/ActivSlate
Activity Type: Small group, Full class
Timetable: This task should take one class period (30 to 45 minutes).
Instructions:
1. Divide students into small groups of no more than three students. This may be adjusted depending on the needs of the teacher.
2. Give groups 6 minutes to brainstorm a list of variables that influence the success of an airline.
3. After the tasks are completed, the teacher should discuss the explanatory/response relationship between some of the variables and be sure to introduce the pair (time, profit/loss), where time is the number of iterations of the simulation.
4. If possible, it would be helpful to have someone from the airline industry come in and share ideas related to this discussion.
Task(s):
1. Students are to develop an all-inclusive list of the variables they identified through the brainstorming process.
a. Provide students 3 minutes within their small groups to identify variables that influence airline success.
b. Provide students 3 minutes to intermingle and share group findings.
c. The teacher will make an extensive list on the board of the variables found by all groups.
Answers will vary but may include: percentage of on-time flights, percentage of delayed flights, airline name recognition, cost of airline fuel, pilot salaries, aircraft maintenance costs, cost of terminal rental, cost of terminal ownership, food costs, navigator salaries, stewardess salaries, cargo costs, ticket agent costs, etc.
2. Within their small groups, students are to identify each of the variables as qualitative or quantitative and explain why they made each choice.
Qualitative: Airline name recognition
Quantitative: Percentage of on-time flights, Percentage of delayed flights, Cost of airline fuel, Pilot salaries, etc. (any variable where finding the average would have meaning)
3.
Using class discussion, the teacher will label each variable on the board as qualitative or quantitative. Answers will be based on the student responses. 4. For quantitative variables, students are to group them and identify explanatory and response variables if possible. Answers will vary. 5. The teacher will discuss the findings from #4 while writing them on the board. While discussing the answers in #4 the teacher needs to explain explanatory and response variables. II – Explore Activity Materials: Airline Online1, Instruction packets2 as needed, Chart for recording simulations results for each iteration of the task, the chart will have to be modified to meet the instructor’s timetable. Activity Type: Small group Timetable: This is an ongoing task that will take all of the first two days and parts of five or more days thereafter, depending on the desire of the instructor. Instructions: 1. Discuss simulation and its importance as a mathematical tool for generating data. 2. Each small group is to create an airline for use in the simulation task. 3. After developing the airline, a simulation iteration will be run to determine the profit and/or loss generated by each airline. 4. Students will then record the information from the simulation and modify their airline to improve its profitability. Task(s): 1. Using the instruction packet as a guide, each group will create an airline they believe will generate the greatest profit. [ 2 days ] 2. Each group will record the specifics of their airline and answer questions found in the instruction packet regarding how and why they made particular decisions while creating their 4 airline. 3. Following each simulation iteration, each group will modify their airline to improve its profitability. 4. Each group must maintain a record of all modifications made during the improvement process. 
Instruction Packet: This instruction packet is a modification of the one developed by Susan Hotle at the Georgia Institute of Technology for use in guiding students through the process of setting-up and maintaining an airline using the program Airline Online found at www.airlinesimulation.com/. Airline Simulation for High School This is a group (2-3 people) activity that uses the simulation program Airline Online. The program puts your group in charge of the financial and operational decisions for your own airline company. Your group will be competing against other groups to get the highest company liquidated value after the simulation has run 3.5 years. Each group starts with $1 billion. This packet contains: Your groups website, username, and password Airport Simulation Instructions (to be turned in at the due date for the project) Three-letter codes for particular airports Airplane Types (Flight Charts will be created by your teacher and presented at the time of the simulation) Usernames and Passwords www.airlineonline.com/ Username Password These will be filled in according to the needs of the class and the set-up the teacher uses. Airport Simulation Instructions Name of Airline/Username: _________________________________________________ Group Members: _________________________________________________________ 5 NOTE BEFORE STARTING: Decisions made during this simulation cannot be undone. This program does not give the teacher or you the ability to reset your account or undo any actions (including buying airplanes, hiring staff, and purchasing maintenance bases). To be safe, make sure you have the approval of your group members before taking action. It is highly recommended that you follow this order when creating your airline. Throughout this process, this task will require that you answer questions. Please be specific and complete in answering these questions, as this will be used in determining your grade for this portion of the assignment. 
Questions will be printed in italics and underlined for convenience. Day 1 Building an Airline (Part 1) Go to the website, sign in and press Start Simulation. Make sure pop-ups are allowed on your computer. After you get to the home page, in order to see all of the tabs at any point in the simulation press Main at the top of the screen. Staffing Tab – Go to Managers. Hire the managers you want. The number employed will stay zero until the simulation is run. The following is a list of what each staff member does: Maintenance Manager – Scheduling: provides advice to an airline on setting up A and B checks during scheduling. This manager will also provide information on the check effectiveness. Maintenance Manager – Staffing: increases the effectiveness of staff by 20%. Maintenance Manager – Purchasing: reduces final maintenance costs by 10%. Maintenance Manager – Training: increases effectiveness of staff by 20% (reduces A/B check times by 20% and can be used in addition to staffing to get 40%). However, this also increases the maintenance staff costs. Maintenance Manager – Reporting: provides access to maintenance reports. These can be viewed from the maintenance report button on the maintenance screen. Operations Analyst: provides you access to all financial reporting via the Analyst section of the simulation. Advertising Manager: Assists you when setting up an advertising campaign by providing suggestions on effective budgeting and allocation of funds. Cargo Manager: provides you with additional information, via the Analyst section, on your cargo operations. Explain which managers you chose and why. Answers will vary. The teacher needs to remind students that once you hire a manager it will cost you double to let them go later on. Although students will hire as they see fit, certain managers may be more important to the success of the airline than others. 6 Aircraft Tab – Go to Buy Used Aircraft. 
Because it takes years to obtain a new airplane and this simulation is only for 3.5 years, you should only buy used aircraft. Before buying, look at the attached Airplane Types sheet. Keep in mind, you will have to purchase a maintenance base for each type of engine (General Electric or Rolls-Royce) and train staff for each type of airplane (Boeing or Regional) you buy. To buy an airplane, click View next to the airplane and then purchase it. It is best that you have at least $300 million left over after buying airplanes to cover future costs. Explain your purchasing decisions and why they were made. Students should explain why they purchased particular aircraft types, particular engine types, and the number of aircraft. Airports Tab – At the bottom of the screen, type in the three-letter code of the airport in which you want to purchase space. Purchase offices, cargo handling centers, terminals, and maintenance bases. These are not needed at every airport. You should look at the attached flight chart for your specific airline before purchasing. Each airplane will need to go to an engine overhaul centre multiple times a week. An aircraft will be grounded if it has been 7500 hours since the last C check or if it has flown 22,000 hours. D checks reset flown hours to zero. Buying a terminal is only needed if you feel the airport will be so busy that your airplane might not have a terminal to go to. Explain what you did and why. Where did you put your maintenance bases? Why? Students should explain why they picked particular cities for maintenance bases, cargo handling centers, offices, and terminals. Staffing Tab – Hire the pilots and staff to operate the airplanes. Make an educated guess about the staffing numbers. You will come back to fix them later (without being penalized financially). You have to staff an airplane before you can schedule it. The “Actual Number” employed will stay zero until the simulation is run. Explain who you hired and why. 
If students hired a staffing manager they will have recommendations as to the number of staff to hire and the salary for each staff member. Students should explain if they keep these recommendations or made changes and why. Service Levels Tab – Create a New Flight Profile. This is where you designate what services each class will get. Save the profile. You can make more than one profile. Explain the profile(s) you set up and why. What effect do you believe your profile(s) will have on your profit/loss? 7 Students may decide to make up profiles based on the distance of the particular routes. Students may also decide to make profiles based on the planes that travel to and from particular locations. Students need to explain what effect the different profiles may have on the earnings of the airline. Day 2 Building an Airline (Part 2) Analyst Tab (if you hired any) – Look over the analyst information which includes a list of the airports with the highest demand and click on the City Pair Data button for more demand information. This tab is full of useful information for scheduling. It is helpful for students to look through all of the information provided under the analyst tab and where they can use it in future decisions. Scheduling Tab – Select the aircraft you want to schedule. Choose the destination and click Complete Scheduling. Select the times for A and B checks (the table at the top of the screen specifies the time requirements for A and B checks). The maintenance checks work only if the aircraft is arriving at one of your maintenance bases and if there is enough time for a particular maintenance check. Generally, a plane needs about 2 A checks and 1 B check every week. You need to select Apply Maintenance Checks and then press Complete Scheduling three more times. Then proceed to assign prices and luggage fees. 
If you want to keep the prices that are originally there while adding the service profile to a flight, click View next to the flight and then at the bottom of the screen select the service level profile. Select the flight and choose Apply to Selected. Finish the scheduling of that airplane by pressing Complete. Explain what you did and why. Did the size of your airplanes affect the destinations they flew to? How? Did the distance to a destination affect the service level for the flight? How? Scheduling takes a lot of time. Students need to look at which airlines fly to the same cities they fly to and which cities are unique to their particular airline. Students also need to consider the distance between cities and the length of time they are at a maintenance base for A and B checks. Students need to look at what other airlines charge for different routes and make decisions on prices accordingly. Going to the analyst tab throughout this process and taking notes is helpful. Aircraft Tab – Go to Configuration. Create a new configuration. You can make more than one. (The attached Airplane Types sheet can help you with figuring out the approximate sizes of the planes.) To make a cargo-only airplane, save the configuration screen without anything checked. Then assign those configurations to the airplanes. (You may have to press the Back button at the top of the screen before assigning configurations to the airplanes.) If you get an error, it is because you have too many seats to fit that particular airplane. Explain what you did and why. There are a number of reasons for creating particular configurations. Students should give reasons for why they created each configuration. Also, students do not have to use each configuration created. Aircraft Tab – Go to Cargo Infrastructure. Buy x-ray machines, containers, and loading equipment as needed. If you hired an analyst, suggested numbers should appear. Did you have recommendations? Did you follow the recommendations? Why? 
If you did not have recommendations, how did you choose the number of items to purchase? Answers will vary. Staffing Tab – Go to Staff Salaries. Make sure your staffing numbers are reasonable for the number of airplanes owned. If you hired an analyst, suggested staffing numbers should appear. How did you decide what was reasonable for staffing your planes? Students may wish to go to the analyst tab to see if they meet union requirements. Students should discuss the decisions they made concerning staffing and salaries. Advertising Tab – Fill out how much you want to spend in each category for advertising and the percentages. If you hired an advertising manager, you can click on the suggestions button so you will not have to mentally make the percentages sum to 100%. Also, set the commission percentage. How did you allocate your advertising dollars? Why? It is helpful to look at the effectiveness of different categories in the advertising campaign. The analyst tab will also let students know if they are not using their advertising dollars wisely. Airfares Tab – Search the average prices your competitors have on flights between any two destinations. You can also change your prices on tickets in order to have competitive pricing. Analyst Tab – If you hired an analyst, this will give you any overall recommendations on your airline. Aircraft Tab – Go to Maintenance. Schedule C and D checks if needed. An aircraft will be grounded if it has been 7500 hours since the last C check or if it has flown 22,000 hours. D checks reset flown hours to zero. Every quarter, an airplane flies about 800 hours (if it flies about 9 hours per day). 9 Explain what you did and why. Students will need to look at the total hours flown and estimate when C and D maintenance checks are needed. Based on this information, students need to explain why they did or did not decide to schedule a particular maintenance check. 
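Teacher note: the C and D check arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the limits (7500 and 22,000 hours) and the figure of about 800 flight hours per quarter come from the packet, while the function name and the sample aircraft hours are hypothetical and do not come from Airline Online.

```python
import math

# Limits and flying rate quoted in the instruction packet.
C_CHECK_LIMIT_HOURS = 7500     # hours allowed since the last C check
D_CHECK_LIMIT_HOURS = 22000    # total flown hours allowed before a D check
HOURS_PER_QUARTER = 800        # about 9 flight hours per day


def quarters_until_limit(hours_used, limit_hours, hours_per_quarter=HOURS_PER_QUARTER):
    """Whole quarters an aircraft can keep flying before it reaches a limit.

    hours_used is a hypothetical reading taken from the maintenance screen.
    """
    remaining = max(limit_hours - hours_used, 0)
    return math.floor(remaining / hours_per_quarter)


# Hypothetical used aircraft: 4000 hours since its last C check, 15000 hours flown.
print(quarters_until_limit(4000, C_CHECK_LIMIT_HOURS))    # quarters before a C check is due
print(quarters_until_limit(15000, D_CHECK_LIMIT_HOURS))   # quarters before a D check is due
```

Students can compare the results with the 14 quarters in the 3.5-year simulation to decide whether a check needs to be scheduled before the simulation ends.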
When you have completed these steps, feel free to play around with the other functions of this program (connecting flights, code shares, bonds, etc.). Completing the above steps makes sure you have a functioning airline. A simulation will be run following this time period and at the end of the maintenance time for each of the following seven days (the end of class on days 3, 4, 5, 6, 7, 8, and 9). The teacher may choose to continue this process until the Probability section of this unit is complete. Maintaining an Airline Complete the Profitability Chart for the most recent simulation. If your airline went bankrupt during the latest simulation run, you will have to start from scratch again. You will start just as you did at the beginning of the simulation. If your airline made it through the latest simulation run, continue with the following instructions for maintaining an airline. Aircraft Tab – Go to Maintenance. Schedule C and D checks if needed. An aircraft will be grounded if it has been 7500 hours since the last C check or if it has flown 22,000 hours. D checks reset flown hours to zero. Every quarter, an airplane flies about 800 hours (if it flies about 9 hours per day). Explain what you did and why. Students should discuss all maintenance changes and additions they made to their airline and why. Analyst Tab – If you hired an analyst, this will give you any overall recommendations for your airline. Adjust your settings as needed. What changes did you make and why? Did you purchase additional airplanes? Why? Did you add additional routes? Where and why? Answers will vary. Teachers should monitor the reasonableness of the changes and voice recommendations if needed. This is particularly important if a group of students goes bankrupt during a simulation period. Airfares Tab – Search the average prices of your competitors for any flights between two destinations. You can change prices on tickets to have competitive pricing. Did you make any changes? What did you change? 
Why? Answers will vary. Teachers should monitor the reasonableness of the changes and voice recommendations if needed. This is particularly important if a group of students goes bankrupt during a simulation period.

Three-letter Airport Codes
MRY   Monterey
LAX   Los Angeles
LAS   Las Vegas
PHX   Phoenix
SLC   Salt Lake City
ABQ   Albuquerque
DEN   Denver
DFW   Dallas-Fort Worth
MSP   Minneapolis-St. Paul
ORD   Chicago-O’Hare
MDW   Chicago-Midway
ATL   Atlanta
BWI   Baltimore
LGA   New York-LaGuardia
JFK   New York-Kennedy

Airplane Types
717-200HGW: Type Boeing; Engine Rolls-Royce; Max Capacity (all seats in economy) 117; Max Distance 3815 km (2371 miles); Fuel Efficiency 3.2 l/km; Min Runway Length 1570 m
767-400ER GE: Type Boeing; Engine General Electric; Max Capacity (all seats in economy) 375; Max Distance 10420 km (6475 miles); Fuel Efficiency 8.7 l/km; Min Runway Length 2600 m
777-200ER RR: Type Boeing; Engine Rolls-Royce; Max Capacity (all seats in economy) 440; Max Distance 11037 km (6858 miles); Fuel Efficiency 11.9 l/km; Min Runway Length 2600 m
CRJ700: Type Regional; Engine General Electric; Max Capacity (all seats in economy) 70; Max Distance 3124 km (1941 miles); Fuel Efficiency 2 l/km; Min Runway Length 1560 m
CRJ900: Type Regional; Engine General Electric; Max Capacity (all seats in economy) 90; Max Distance 2774 km (1724 miles); Fuel Efficiency 2.3 l/km; Min Runway Length 1870 m
EMB 190LR: Type Regional; Engine General Electric; Max Capacity (all seats in economy) 98; Max Distance 4260 km (2647 miles); Fuel Efficiency 2.5 l/km; Min Runway Length 1800 m
ERJ145LR: Type Regional; Engine Rolls-Royce; Max Capacity (all seats in economy) 50; Max Distance 2955 km (1836 miles); Fuel Efficiency 1.5 l/km; Min Runway Length 1245 m

The teacher may decide to allow other aircraft according to the needs of the students and the restrictions the teacher desires to impose.

End of Airport Simulation Instructions
________________________________________________________________________

Profitability Chart:
Iteration              1       2       3       4       5       6       7
Beginning Balance      _____   _____   _____   _____   _____   _____   _____
Ending Balance         _____   _____   _____   _____   _____   _____   _____
Profit/Loss            _____   _____   _____   _____   _____   _____   _____
Percent Profit/Loss    _____   _____   _____   _____   _____   _____   _____

This section should take the two days following the simulation activity. This may be adjusted depending on the duration of the Probability section of the unit.

Day 1
III – Explain Activity
Materials: Profitability Chart from the Explore Activity, Median-Median Line Activity, Least Squares Regression Line Activity, Sum of Squared Error Activity.
Activity Type: Small group, Large group/Class
Timetable: This task will take two days following the simulation (Explore) activity.
Instructions:
1.
Students will plot the data from the Profitability Chart and discuss the correlation coefficient of the data and the linearity of the data.
2. Using the plot from #1, students will draw a possible line of best fit and find the equation of their line (“Eyeballing” a line of best fit).
3. Using the data found in the Profitability Chart and the Median-Median Line Activity, students will generate a median-median line of best fit.
4. Using the data found in the Profitability Chart and the Least Squares Regression Line (LSRL) Activity, students will generate an LSRL for the data.
5. Using the Sum of Squared Error (SSE) Activity, students will compute the SSE for each line and use this measure to compare the lines.
Tasks:
“Eyeballing” a line of best fit:
1. Create data points using the form (iteration, percent profit/loss). ( _____, _______ ), ( _____, _______ ), ( _____, _______ ), ( _____, _______ ), ( _____, _______ ), ( _____, _______ ), ( _____, _______ ), ( _____, _______ )
2. Plot the points on a coordinate plane where the x-axis covers the domain [0, highest number of iterations used] and the y-axis covers the range of the percent profit/loss values in your Profitability Chart. Answers will vary.
3. Draw a line that you believe comes closest to hitting all of your data points. Answers will vary.
4. Choose two points on your line and find the equation of the line that goes through those two points. You may use any form to find your equation, but please write your final equation in y = mx + b form to be consistent. Answers will vary but should match the line from #3. Students should list the points they use and show their work for finding the linear equation.
Median-Median Line Activity:
1. Create data points using the form (iteration, percent profit/loss). ( _____, _______ ), ( _____, _______ ), ( _____, _______ ), ( _____, _______ ), ( _____, _______ ), ( _____, _______ ), ( _____, _______ ), ( _____, _______ ) Answers should match those from #1 in the previous section.
2. 
Plot the points on a coordinate plane where the x-axis covers the domain [0, highest number of iterations used] and the y-axis covers the range of the percent profit/loss values in your Profitability Chart. The graph, at this point, should match the one in the previous section.
3. Order your data by iteration and divide the set into three equal parts. (If it is not possible to have three equal parts, make sure the first and third parts are equal and the middle part contains one data point more or fewer than the first and third parts.) Answers will vary.
4. Find the point (median X, median Y) for each of the three parts of your data. This will yield points M1, M2, and M3, which are the corresponding (median X, median Y) for each of your three subsets of data. Answers will vary.
5. Find the equation of the line, in y = mx + b form, passing through points M1 and M3. Think of the y-intercept of this line as b1. Answers will vary.
6. Using the slope from the equation in #5 and the point M2, find the equation of the line, in y = mx + b form, parallel to the line in #5 and passing through the point M2. Think of the y-intercept of this line as b2. Answers will vary.
7. If b1 > b2, let b3 = b1 - |b1 - b2|/3; if b2 > b1, let b3 = b1 + |b1 - b2|/3. (If b1 = b2, then b3 = b1.) The Median-Median Line is the line with the equation ŷ = mx + b3, where m is the slope found in #5 and #6 above and b3 is the y-intercept that lies one third of the way from b1 toward b2. Answers will vary.
Least Squares Regression Line (LSRL) Activity:
1. Create data points using the form (iteration, percent profit/loss). ( _____, _______ ), ( _____, _______ ), ( _____, _______ ), ( _____, _______ ), ( _____, _______ ), ( _____, _______ ), ( _____, _______ ), ( _____, _______ ) Answers should match #1 in both of the previous sections.
2. 
Plot the points on a coordinate plane where the x-axis covers the domain [0, highest number of iterations used] and the y-axis covers the range of the percent profit/loss values in your Profitability Chart. The graph should match the ones from #2 in the previous sections.
3. Using the TI-83, TI-84, or TI-nSpire, go to STAT, EDIT, and place the x-values in L1 and the corresponding y-values in L2, pressing ENTER after each value. Go to STAT, CALC, and then go down to LinReg (a+bx) (it should be #8 on your list), press ENTER, and press ENTER again (the calculator will default to L1, L2). The calculator will then give you the values for a (the y-intercept) and b (the slope) of your LSRL. If you have diagnostics turned on, the calculator will also give you values for r and r-squared. Your teacher may not allow the use of graphing calculators since some states will not allow their use on End of Course Tests. In this case, your teacher will provide an alternate method for finding the LSRL.
LSRL: ŷ = _____________________________
Answers will vary. The teacher may decide to give each group their particular LSRL.

Day 2
Sum of Squared Error (SSE) Activity:
Use your data from the Profitability Chart and the appropriate regression equation to complete the following charts used to find the Sum of Squared Error.
1. Sum of Squared Error for the line created by “eyeballing”
Iteration (x)   Profit/Loss (y)   Predicted value (ŷ)   (y - ŷ)   (y - ŷ)²
1               _____             _____                 _____     _____
2               _____             _____                 _____     _____
3               _____             _____                 _____     _____
4               _____             _____                 _____     _____
5               _____             _____                 _____     _____
6               _____             _____                 _____     _____
7               _____             _____                 _____     _____
The sum of the (y - ŷ)² column is the Sum of Squared Error: _____
2. Sum of Squared Error for the Median-Median Line
Iteration (x)   Profit/Loss (y)   Predicted value (ŷ)   (y - ŷ)   (y - ŷ)²
1               _____             _____                 _____     _____
2               _____             _____                 _____     _____
3               _____             _____                 _____     _____
4               _____             _____                 _____     _____
5               _____             _____                 _____     _____
6               _____             _____                 _____     _____
7               _____             _____                 _____     _____
The sum of the (y - ŷ)² column is the Sum of Squared Error: _____
3. Sum of Squared Error for the LSRL
Iteration (x)   Profit/Loss (y)   Predicted value (ŷ)   (y - ŷ)   (y - ŷ)²
1               _____             _____                 _____     _____
2               _____             _____                 _____     _____
3               _____             _____                 _____     _____
4               _____             _____                 _____     _____
5               _____             _____                 _____     _____
6               _____             _____                 _____     _____
7               _____             _____                 _____     _____
The sum of the (y - ŷ)² column is the Sum of Squared Error: _____
All answers for this section will vary.
4. Compare the SSE for each regression line.
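Teacher note: for checking student charts, the three lines and their SSE values can be reproduced with a short Python sketch. The data set below is hypothetical (it is not output from Airline Online), the eyeballed line is simply taken through the first and last points as a stand-in for a student's line, and all function names are illustrative. The median-median steps follow the activity above, and the LSRL uses the standard least-squares formulas rather than the calculator.

```python
from statistics import mean, median

# Hypothetical (iteration, percent profit/loss) points for seven iterations.
data = [(1, -4.0), (2, -1.0), (3, 1.0), (4, 2.0), (5, 6.0), (6, 5.0), (7, 9.0)]


def line_through(p, q):
    """Slope and y-intercept of the line through points p and q."""
    slope = (q[1] - p[1]) / (q[0] - p[0])
    return slope, p[1] - slope * p[0]


def median_median_line(points):
    """Median-median line: outer groups equal in size, middle within one point."""
    pts = sorted(points)
    n = len(pts)
    outer = n // 3 + (1 if n % 3 == 2 else 0)
    groups = [pts[:outer], pts[outer:n - outer], pts[n - outer:]]
    m1, m2, m3 = [(median(x for x, _ in g), median(y for _, y in g)) for g in groups]
    slope, b1 = line_through(m1, m3)      # line through M1 and M3
    b2 = m2[1] - slope * m2[0]            # parallel line through M2
    b3 = b1 + (b2 - b1) / 3               # one third of the way from b1 toward b2
    return slope, b3


def lsrl(points):
    """Least squares regression line from the usual closed-form formulas."""
    x_bar = mean(x for x, _ in points)
    y_bar = mean(y for _, y in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def sse(points, slope, intercept):
    """Sum of squared errors: the total of the (y - y-hat)^2 column."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points)


lines = {
    "eyeballed": line_through(data[0], data[-1]),
    "median-median": median_median_line(data),
    "LSRL": lsrl(data),
}
for name, (m, b) in lines.items():
    print(f"{name}: y-hat = {m:.3f}x + {b:.3f}, SSE = {sse(data, m, b):.3f}")
```

Replacing `data` with a group's own points gives values to check against that group's completed charts.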
The regression line with the lowest SSE is the better regression line. This process works for comparing any regression lines for a given set of data. Which of your lines is the best fit line? The SSE for the LSRL should be lowest.
5. Discuss your results with other groups in the class. What do you discover? This provides teachers the opportunity to explain the meaning of the LSRL in terms of SSE.
6. Explain why you think your discovery in #4 is important. This should open the class up to a discussion of regression lines and curves in general. This discussion will be extended in the next activity.
7. If you were to continue this process, what would you predict your percent profit/loss to be for the 10th iteration as compared to the 9th iteration? This process is called extrapolation because you are predicting for values outside the domain of your data. Would you consider extrapolation good or bad? Why? Answers may vary as students do not have a concept of extrapolation yet.
8. Predicting for values inside the domain of your data is called interpolation. Which would be better, extrapolation or interpolation? Why? Answers may vary. Following this activity the teacher needs to have a class discussion of interpolation and extrapolation. The teacher may choose to use this as a full-class activity instead of small group/large group. This activity should take 1 day including the wrap-up discussion.

IV – Extend Activity
Materials: Profitability Chart from the Explore Activity, Scatter Plot and Sum of Squared Error for the LSRL in the Explain Activity component, Sum of Squared Error Activity for this component.
Activity Type: Small group, Large group/Class
Timetable: This task will take one day following the Explain component.
Instructions:
1. Students will graph the LSRL on the scatter plot of the data from the Explain Activity component.
2. Students will create a quadratic regression of the data from the Explore Activity.
3. 
Using the Sum of Squared Error (SSE) Activity, students will compute the SSE for the LSRL and the quadratic regression model.
4. The teacher will present additional modeling equations and their use in modeling real phenomena.
Tasks:
1. Copy your original data into the following table and create a scatterplot of the data.
Iteration (x)   Profit/Loss Percentage
1               _____
2               _____
3               _____
4               _____
5               _____
6               _____
7               _____
2. Graph your LSRL on the scatterplot. The answers to #1 and #2 should follow from the previous work.
3. (a) How close does your line come to hitting all of the data points? Answers will vary. (b) What measure quantitatively supports your argument in part (a)? SSE.
4. Is there a better model than a linear one? ____________ Using the following procedure, create a quadratic model of the data. Using the TI-83, TI-84, or TI-nSpire, go to STAT, EDIT, and place the x-values in L1 and the corresponding y-values in L2, pressing ENTER after each value. Go to STAT, CALC, and then go down to QuadReg (it should be #5 on your list), press ENTER, and press ENTER again (the calculator will default to L1, L2). The calculator will then give you the values for a, b, and c of your Quadratic Regression model. Your teacher may not allow the use of graphing calculators since some states will not allow their use on End of Course Tests. In this case, your teacher will provide an alternate method for finding the Quadratic Regression model or the model itself.
QuadReg: ŷ = _____________________________
The teacher may need to provide the quadratic regressions if students are not using graphing calculators.
5. Graph your quadratic regression equation on the scatterplot in question #1. Did it come closer to hitting the data points than the LSRL? Answers will vary and should provide the opportunity to discuss quadratic regressions and the shape of the graph that should be expected.
6. Defend your answer in question #5 by finding the SSE for your Quadratic Regression model and comparing it to the SSE for your LSRL.
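Teacher note: where graphing calculators are not available, the comparison in #6 can be sketched in Python. This is one possible alternate method, not the calculator's own routine: it solves the least-squares normal equations directly. The data set is hypothetical and the function names are illustrative. Coefficients are returned constant term first, so the quadratic result [c0, c1, c2] corresponds to the calculator's c, b, a.

```python
# Hypothetical (iteration, percent profit/loss) points for seven iterations.
data = [(1, -4.0), (2, -1.0), (3, 1.0), (4, 2.0), (5, 6.0), (6, 5.0), (7, 9.0)]


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_polynomial(points, degree):
    """Least-squares coefficients [c0, c1, ...] for y-hat = c0 + c1*x + c2*x^2 + ..."""
    size = degree + 1
    a = [[sum(x ** (i + j) for x, _ in points) for j in range(size)] for i in range(size)]
    b = [sum(y * x ** i for x, y in points) for i in range(size)]
    return solve(a, b)


def sse_poly(points, coeffs):
    """Sum of squared errors for a polynomial model."""
    return sum((y - sum(c * x ** k for k, c in enumerate(coeffs))) ** 2 for x, y in points)


linear = fit_polynomial(data, 1)      # LSRL: [intercept, slope]
quadratic = fit_polynomial(data, 2)   # quadratic model: [c, b, a] in calculator terms
print("LSRL SSE:", round(sse_poly(data, linear), 4))
print("Quadratic SSE:", round(sse_poly(data, quadratic), 4))
```

One point worth raising in the class discussion: because a line is a quadratic whose squared term has coefficient zero, the least-squares quadratic can never have a larger SSE than the LSRL on the same data. The useful question is whether the drop in SSE is large enough, and the curve's shape sensible enough, to justify the extra term.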
Sum of Squared Error for the LSRL
Iteration (x)   Profit/Loss (y)   Predicted value (ŷ)   (y - ŷ)   (y - ŷ)²
1
2
3
4
5
6
7
The sum of the rows is the Sum of Squared Error:

Sum of Squared Error for the Quadratic Model
Iteration (x)   Profit/Loss (y)   Predicted value (ŷ)   (y - ŷ)   (y - ŷ)²
1
2
3
4
5
6
7
The sum of the rows is the Sum of Squared Error:

Which is the better model and why?
A discussion of the SSE should be provided. Answers will vary as to which is the better fit, depending on the actual data.

7. Notice that there are other regression models available on your calculator. Explain when each of the following models would be used.
(a) LinReg:
(b) QuadReg:
(c) CubicReg:
(d) LnReg:
(e) ExpReg:
(f) Logistic:
The teacher should graph the parent graph of each of the above regression equations and explain the general shape of the data that would be useful in determining which regression model should be used. For this course we will be using only linear and quadratic regression models.

Summative evaluation day for the regression part of this unit. This activity can be used in conjunction with the remaining summative activities for the unit.

V – Evaluate Activity
Materials: GRASP-based Summative Evaluation Task with scoring guide.
Activity Type: Individual
Timetable: This task will take one day at the end of the unit.
Instructions:
1. Students will plot the Average Temperatures for McDonough from 1998 to 2007 from the data in the provided chart.
2. Showing all necessary work, students will determine each of the following regression models:
a. Median-Median line
b. Least Squares Regression Line
c. Quadratic Regression
3. Using the method of SSE and showing all work, students will determine which of the above regression models is most appropriate for the data and defend their answer.
4. Students will determine whether average temperatures are decreasing, staying the same, or increasing and defend their determination based on the findings of their regression model.
5. Students will predict the average temperature in McDonough for August 1, 1990, August 1, 1997, August 1, 2005, and August 1, 2009. Students will then determine which of their predictions they would consider most accurate.
6. Provided the actual data for August 1, 1990, August 1, 1997, August 1, 2005, and August 1, 2009, students will discuss the relationship of the real data to the predicted results and whether adding this data prior to making their models would have had an effect on the model.
7. Students will write a short paper discussing their findings.

Tasks:

Summative Assessment for Regression Part of Data Analysis and Probability

Student Name ___________________________ Date ____________ Period _______________

The total number of points that can be achieved in a particular situation is found in square brackets [ points ] beside the situation.

The Guidelines for this performance assessment are:
Real-world Goal: The goal is to take a defensible position on the concept of Climate Change.
Real-world Role: Your role as a data analyst is to organize and analyze the provided data to demonstrate evidence for or against the idea of an increase in temperature over time.
Real-world Audience: Your target audience is the average person who may or may not understand the process of analyzing data to demonstrate evidence for or against climate change.
Real-world Situation: You are provided average daily temperatures for the same date over multiple years for a given location. You must organize and analyze this information and make a judgment on its implications regarding climate change.
Real-world Products and Performances: You are provided a task that will be used to defend your conclusions concerning climate change. The final part of the task is to communicate your conclusion to your audience by writing a short paper stating your position on climate change, based on the data provided, and your defense of that position.
Standards: As you work through the task, you will acquire points that sum to a final grade for the overall task.

Data: The following data, gleaned from www.wunderground.com, are the average daily temperatures for McDonough, GA on August 1 of the given year, in the form year – temperature.
1998 – 78°, 1999 – 86°, 2000 – 78°, 2001 – 78°, 2002 – 82°, 2003 – 78°, 2004 – 84°, 2006 – 84°, 2007 – 83°.

1. Organize the data in the chart below, with X being the number of years since 1990 and Y being the average temperature on August 1 of the year. [ 5 pts. ]

X    8    9    10   11   12   13   14   16   17
Y    78   86   78   78   82   78   84   84   83

2. Plot the points from your chart as a scatterplot on the graph below. Make sure you label each axis. [ 5 pts. ]

3. Showing all work, for the data in #1, find the (a) Median-Median Line, (b) Least Squares Regression Line, and (c) Quadratic Regression Equation. Graph these regression equations on the graph in #2. [ 30 pts. total ]
(a) Median-Median Line
(Predicted Temperature) = 69.4286 + 0.8571 (number of years since 1990)
(b) LSRL
(Predicted Temperature) = 76.2794 + 0.4044 (number of years since 1990)
(c) Quadratic Regression Equation
(Predicted Temperature) = 0.0751x² – 1.4796x + 87.4508, where x is the number of years since 1990.

4. Showing all work, determine the SSE for each of the regression equations in #3. Based on the SSE and the graphs in #2, discuss which is the best regression equation and defend your answer.
[ 30 points total ]

(a) SSE for Median-Median Line

Sum of Squared Error for the Median-Median Line
year (x)   temp (y)   Predicted temp (ŷ)   (y - ŷ)    (y - ŷ)²
8          78         76.2854              1.7146     2.9399
9          86         77.1425              8.8575     78.4553
10         78         77.9996              0.0004     0.0000
11         78         78.8567              -0.8567    0.7339
12         82         79.7138              2.2862     5.2267
13         78         80.5709              -2.5709    6.6095
14         84         81.4280              2.5720     6.6152
16         84         83.1422              0.8578     0.7358
17         83         83.9993              -0.9993    0.9986
The sum of the rows is the Sum of Squared Error: 102.3149

(b) SSE for LSRL

Sum of Squared Error for the LSRL
year (x)   temp (y)   Predicted temp (ŷ)   (y - ŷ)    (y - ŷ)²
8          78         79.5146              -1.5146    2.2940
9          86         79.9190              6.0810     36.9786
10         78         80.3234              -2.3234    5.3982
11         78         80.7278              -2.7278    7.4409
12         82         81.1322              0.8678     0.7531
13         78         81.5366              -3.5366    12.5075
14         84         81.9410              2.0590     4.2395
16         84         82.7498              1.2502     1.5630
17         83         83.1542              -0.1542    0.0238
The sum of the rows is the Sum of Squared Error: 71.1986

(c) SSE for Quadratic Regression Equation

Sum of Squared Error for the Quadratic Regression
year (x)   temp (y)   Predicted temp (ŷ)   (y - ŷ)    (y - ŷ)²
8          78         80.4204              -2.4204    5.8583
9          86         80.2175              5.7825     33.4373
10         78         80.1648              -2.1648    4.6864
11         78         80.2623              -2.2623    5.1180
12         82         80.5100              1.4900     2.2201
13         78         80.9079              -2.9079    8.4559
14         84         81.4560              2.5440     6.4719
16         84         83.0028              0.9972     0.9944
17         83         84.0015              -1.0015    1.0060
The sum of the rows is the Sum of Squared Error: 68.2483

5. Based on your findings at this point, are the average temperatures on August 1 in McDonough, GA increasing, staying the same, or decreasing? Defend your answer. [ 5 points ]
Answers may vary but should be based on the findings from problems 2 and 3. The solutions from problems 2 and 3 should be a part of the discussion for full credit.

6. Using the regression equation you found to be best in #4, predict the average temperature for the following dates and explain why it is or is not a good prediction.
[ 5 points total ]
August 1, 1990 - __________ 87.4508° (extrapolation)
August 1, 1997 - __________ 80.7755° (extrapolation)
August 1, 2005 - __________ 82.1633° (interpolation)
August 1, 2009 - __________ 86.4638° (extrapolation)

7. At this point in the task, see your teacher to get the actual average temperatures in McDonough, GA on the above dates. Discuss the relationship between your predicted values and the real data. If you had added these data points to the original data set, would it have had an effect on the regression equation you found in #4? If so, what would that effect have been? [ 5 points ]
Answers will vary. Some students will discuss the graph; some will input the points and discuss the change in the regression equation. Grade according to the completeness of the discussion.

8. Based on the results of your findings, take a position on climate change and defend that position using the information and understanding you have developed during this task. Write a short paper describing your position and its defense in such a way that a person who does not understand data analysis and has not seen your work in problems #1 through #7 will understand what you are discussing. [ 15 points ]
Answers will vary. Teachers will need to determine whether the defense follows from the information found throughout the task.

Resource data for use in the Evaluation Activity. This data was downloaded from the website www.wunderground.com.

Average Temperature on August 1 in McDonough, GA
Year   Temperature
1990   78°
1991   78°
1992   75°
1993   82°
1994   76°
1995   81°
1996   78°
1997   69°
1998   78°
1999   86°
2000   78°
2001   78°
2002   82°
2003   78°
2004   84°
2005   77°
2006   84°
2007   83°
2008   80°
2009   81°

Task 2: INTRODUCTION
On-time departure is an essential part of a successful transportation network. This problem is set up to evaluate delayed departures in the aviation industry.
By looking at Delta's flights departing out of Atlanta's Hartsfield-Jackson Airport (ATL) on Saturday, September 13, 2008, various delay parameters can be evaluated. This data can lead to improving the efficiency of the aviation network. This topic was chosen because it is of widespread interest. Most people traveling have important schedules that they need or want to meet. Thus, this topic should pique the interest of many of the high school students who are learning probability and statistics with it. Specifically, the problem is set up as if particular individuals, Tim and Brittany, are traveling out of Atlanta on a Delta flight.

The problem is set up to use two sets of data. The first set of data is from the Bureau of Transportation Statistics and simply offers on-time arrival performance for the entire aviation network (1). The second set of data contains all the Delta flights that flew out of ATL on Saturday, September 13, their scheduled departure time, their actual departure time, and whether or not the flight arrived in Atlanta late. This data has been simplified to include the flight number, the relative departure time, and how late the flight originally arrived in Atlanta. This data set can be found in the appendix. It was acquired from FlightStats.com (2).

WEB RESOURCES
http://www.transtats.bts.gov/OT_Delay/OT_DelayCausetasp
http://www.flightstats.com/go/FlightStatus/flightStatusByAirport.do

FIVE ESSENTIAL QUESTIONS AND THE STANDARDS THEY MEET
Using the Accelerated Mathematics 1 track, the following questions were developed and expanded upon:
1. What are the various causes of delayed flights?
a. Probability calculations:
i. What is the probability that a flight departs early?
ii. What is the probability that a flight departs on-time?
iii. What is the probability that a flight is delayed?
iv. What is the probability that a flight is cancelled?
v. What is the probability that a flight is early or on-time?
b. 
Are Delta flights out of Atlanta delayed more or less often than the national average?
c. Conditional Probability: Given that their flight is delayed, what is the probability that the cause of delay is air carrier delay?
d. What are the chances that both flights are delayed?
e. What's the probability that the first flight is delayed due to weather and the second is delayed due to security?
[MA1D2]
2. Should someone departing out of Atlanta's Hartsfield-Jackson airport expect to leave on time?
a. What percentage of Delta flights out of Atlanta's Hartsfield-Jackson airport depart early, on-time (+ 15 minutes), or late?
b. What is the average delay time? What is the most common delay time?
c. What is the expected value of the departure time of a Delta flight out of ATL compared to its scheduled departure time?
[MA1D2, MA2D1(b)]
3. Should someone departing out of any given airport expect to leave on time?
a. How does Delta's delay time out of ATL compare to that of the entire population?
[MA2D1(a)]
4. Does the amount of departure delay depend on how late the aircraft arrived?
a. Plot the departure delay against the arrival delay of all the flights with delayed departures that arrived at the airport late.
b. Determine the regression line for predicting departure time from arrival time and display it on the graph.
c. What does the low R² value mean mathematically and in the context of how departure time relates to the arrival time of each aircraft?
[MA1D5]
5. Although most travelers only consider delay, a fair number of airplanes actually leave ahead of schedule. Is this ahead-of-schedule departure significant?
a. Find the probability that a flight is not late.
b. Find the probability that a flight leaves 15 minutes late at the latest.
c. Find the probability that a flight leaves over 45 minutes late.
d. Find the probability that a flight leaves between 10 and 30 minutes late.
[MA2D3]

Task
Tim and Brittany are planning their honeymoon.
They have decided to fly out of Atlanta's Hartsfield-Jackson Airport (ATL) on Tuesday, September 15, 2009. They have acquired delay information for flights departing out of Atlanta's Hartsfield-Jackson Airport (ATL) on the Tuesday a year prior to their scheduled trip, assuming it will mimic the behavior of flights on their trip in 2009. Brittany and Tim want your help analyzing the data to determine whether or not they will have an on-time departure for their honeymoon.

First, fill out the percentages of early flights, on-time flights, delayed flights, and cancelled flights in the chart below.

               Early    On-Time   Delayed   Cancelled
# of Flights   603      223       81        3
% of Flights   .6626    .2451     .089      .0033

Remember that probability is the number of outcomes that meet a condition divided by the total number in the sample space:
P(x) = (# correct) / (total)

1. A flight is considered early if it leaves before its scheduled departure time. What is the probability that Tim and Brittany's flight departs out of ATL early?
P(early departure) = 603 / (603+223+81+3) = .6626

2. An on-time departure is any flight that leaves between its scheduled departure time and fifteen minutes late. What is the probability that Tim and Brittany's flight departs out of ATL on-time?
P(on-time departure) = 223 / (603+223+81+3) = .2451

3. A delayed flight is any flight that departs more than 15 minutes after its scheduled departure time. What is the probability that Tim and Brittany's flight has a delayed departure?
P(delayed departure) = 81 / (603+223+81+3) = .089

4. What's the probability that Tim and Brittany's flight is cancelled?
P(cancelled) = 3 / (603+223+81+3) = .0033

5. Tim and Brittany always arrive early to the airport. What are the chances that their flight departs either early or on-time?
P(early or on-time departure) = (603+223) / (603+223+81+3) = .9077

Brittany and Tim decided to compare Delta's flight departure statistics to the national on-time arrival performance.
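For teachers who want to check the arithmetic in questions 1 through 5 with technology, the following is a minimal Python sketch. It assumes only the four flight counts from the chart; the names `counts` and `probability` are illustrative and not part of the original task.

```python
# Flight counts copied from the chart (Delta departures out of ATL).
counts = {"early": 603, "on_time": 223, "delayed": 81, "cancelled": 3}


def probability(outcomes, counts):
    """P(flight falls in one of `outcomes`) = (# matching flights) / (total flights).

    The categories are mutually exclusive, so their counts can simply be added.
    """
    total = sum(counts.values())
    return sum(counts[name] for name in outcomes) / total


if __name__ == "__main__":
    for label, outcomes in [
        ("early", ["early"]),
        ("on-time", ["on_time"]),
        ("delayed", ["delayed"]),
        ("cancelled", ["cancelled"]),
        ("early or on-time", ["early", "on_time"]),
    ]:
        print(f"P({label}) = {probability(outcomes, counts):.4f}")
```

Because every flight falls into exactly one category, the four single-category probabilities should sum to 1, which gives students a quick check on their chart.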
Data acquired from the Bureau of Transportation Statistics can be seen below:

National On-Time Arrival Performance (%)
Air Carrier Delay: 6.3
Weather Delay: 1.01
National Aviation System Delay: 7.78
Security: 0.05
Aircraft Arriving Late: 7.17
Cancelled: 1.697
Diverted: 0.3

6. Are Delta flights out of ATL on this particular day delayed more often or less often than the national average?
On this particular day, 8.9% of Delta's flights were delayed. This is significantly less than the National On-Time Arrival Performance delay percentage of 22.31%, found by summing air carrier delay, weather delay, national aviation system delay, security delay, and delay caused by aircraft arriving late.

7. Given that their flight is delayed, what is the probability that the cause of delay is air carrier delay?
P(air carrier delay | flight delayed) = 6.3 / (6.3+1.01+7.78+.05+7.17) = .2824

Assume Tim and Brittany have to transfer flights on their way to their honeymoon destination:

8. What are the chances that both flights are delayed? (Hint: use Atlanta delay statistics for the first leg and the national performance for the second leg, and assume delays are independent.)
Probability of both flights delayed = .089 * .2231 = .0199

9. What's the probability that the first flight is delayed due to weather and the second is delayed due to security?
P(flight 1 weather delay and flight 2 security delay) = .0101 * .0005 = .00000505

Task 3: Correlation and its meaning
As observed above in the National On-Time Arrival Performance, 7.17% of flights are delayed due to the aircraft arriving late. Tim and Brittany have speculated that, on average, the later the flight arrives at ATL, the longer the delay will be. Departure times for flights with previous legs arriving late into ATL can be seen below. This data was obtained from www.flightstatus.com.
Flight     Late Arrival   Departure
DL 802     21             12
DL 2016    52             49
DL 6773    28             15
DL 1548    17             31
DL 724     16             12
DL 4516    85             96
DL 1272    20             -2
DL 4422    29             14
DL 4798    37             49
DL 6515    18             0
DL 6493    21             8
DL 879     46             47
DL 1447    33             -2
DL 4416    22             25
DL 363     57             60
DL 411     17             9
DL 1551    25             37
DL 4427    15             -3
DL 1619    19             -1
DL 1592    40             -1
DL 836     27             14
DL 727     16             -3
DL 19      36             31
DL 1679    35             5
DL 4740    36             44
DL 4715    15             -2
DL 1541    18             -5
DL 763     31             4
DL 857     37             -3
DL 756     27             -7
DL 881     18             -2
DL 1268    23             11
DL 4594    21             8
DL 1717    27             -8
DL 1635    18             22
DL 5880    26             -3
DL 4294    45             35
DL 1041    64             79
DL 745     29             37
DL 6492    27             5
DL 1577    26             8
DL 6622    101            99
DL 4801    20             3
DL 4025    17             8
DL 705     30             30
DL 1682    15             29
DL 2010    20             -5
DL 1622    27             32
DL 6908    15             10
DL 865     21             -3
DL 971     24             40
DL 4230    26             23
DL 6284    46             70
DL 4447    32             31
DL 1770    32             -3
DL 6322    98             111
DL 6448    79             85
DL 46      78             128
DL 973     24             26
DL 6456    138            150
DL 34      55             70
DL 1566    30             -1
DL 4612    15             15
DL 5859    20             -6
DL 947     23             26
DL 4343    66             63
DL 4544    63             75
DL 2024    19             -6
DL 50      19             15
DL 4910    22             17
DL 349     28             29
DL 317     17             5
DL 1681    31             35
DL 6345    38             45
DL 1145    16             -5
DL 341     26             -2
DL 3674    16             10
DL 5905    26             45
DL 1033    34             50
DL 4695    17             21
DL 76      17             24
DL 63      18             23
DL 180     42             79
DL 4658    29             30
DL 1555    60             59
DL 114     19             5
DL 4450    103            125
DL 719     41             33
DL 211     19             5
DL 4579    41             50
DL 6434    39             47
DL 8517    33             72
DL 4845    86             94
DL 6631    47             70
DL 1766    33             41
DL 4309    46             56
DL 147     42             40
DL 897     22             22

Brittany and Tim speculate that there is a linear relationship between how delayed a flight arrives to ATL and how late it departs from ATL. They want you to follow the steps below to predict a straight line relationship between arrival time and departure time of flights.

1. By hand or using technology, draw a scatterplot of the data: Graph the data from the table, arrival time (x) and departure time (y).
The teacher will need to provide graph paper or allow students to use a graphing calculator and explain how to use it to graph this information.

2. 
Determine the regression line for predicting departure time from arrival time and display it and the r-value on the graph.
The teacher may decide to provide students with the regression equation (LSRL) and the corresponding r-value.
Predicted departure time = 13.571 + 0.3371 * (arrival time)
r = 0.2834

3. What does the low r-value represent mathematically?
The data is not very linear, and thus the line is probably not a good fit for the data. However, the line still represents the average y-value for each x-value input for the sample space.

4. What does the low r-value mean in the context of how departure time relates to the arrival time of each aircraft?
There is large variation in departure time for each arrival time of the aircraft. Although the best-fit line has a positive slope, the linear relationship between x and y is not as strong as we may have expected.

5. Assume that the specific flight Tim and Brittany are planning to take arrives at ATL 35 minutes late. Using the linear regression equation, on average how late will their flight depart out of ATL?
Predicted departure time = 13.571 + 0.3371 * (arrival time) = 13.571 + 0.3371 * 35 = 25.3695 minutes
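
For classes that use technology instead of a graphing calculator, the following Python sketch shows one way to reproduce the prediction in question 5 and to compute a least squares line and r-value from (arrival, departure) pairs. The intercept and slope are the ones stated in question 2; the function names and the five-row sample (the first five rows of the flight table) are illustrative assumptions, so the sample's line and r-value will not match the values for the full table.

```python
import math

# Coefficients of the regression line stated in question 2.
INTERCEPT = 13.571
SLOPE = 0.3371


def predict_departure(arrival_minutes_late):
    """Predicted departure time (minutes) for a given late arrival (minutes)."""
    return INTERCEPT + SLOPE * arrival_minutes_late


def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares regression line of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pearson_r(xs, ys):
    """Return the correlation coefficient r between xs and ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Question 5: a flight arriving 35 minutes late.
    print(f"Predicted departure: {predict_departure(35):.4f} minutes")

    # Illustrative subset only: the first five rows of the flight table.
    arrivals = [21, 52, 28, 17, 16]
    departures = [12, 49, 15, 31, 12]
    intercept, slope = least_squares_line(arrivals, departures)
    print(f"Sample line: y = {intercept:.3f} + {slope:.4f} x")
    print(f"Sample r = {pearson_r(arrivals, departures):.4f}")
```

To analyze the whole data set, replace the two sample lists with all of the Late Arrival and Departure values from the table.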