What Every Math Professor Needs to Know About Statistics
Rick Cleary, Babson College
For AMATYC, Denver, 11/19/16

Thanks!
• To AMATYC and President Jane Tanner for the invitation!
• To all of the excellent teachers who helped me think about these topics.
• To all of you for the really important work you do. I am glad I got to experience it!

Three friends of mine…
• Prof. Ken Mann … Ken's Website - Mohawk Valley Community College
• Prof. Jim Hobert … Jim's Website - SUNY Ulster
• Prof. Marty Wells … Marty's website - Shasta College

A bit about Babson…
• Located in Wellesley, Massachusetts
• 2200 undergraduates, all business majors with different concentrations.
• All students take at least two semesters of a quant methods sequence with statistics, a little calculus, math of finance and operations research.
• Quite a few two-year college transfers!

A fun puzzle!
• Find the next number in each sequence
A.) 91, 72, 45, 9, 81 …
B.) 13, 19, 60, 51, 9 …
(This is the one previewed for the conference.)

Sequence A
• 91, 72, 45, 9, 81 …
• The next number is … 63
- Sequence was 91, 72, 45, 9, 81 … Consider each as a two-digit number.
Reverse the digits, then take the absolute value of the difference.

Sequence B
• 13, 19, 60, 51, 9, …
• The next number is …

Last call!
B.) 13, 19, 60, 51, 9, …
But before we answer, let’s list some numbers that we’re pretty sure are NOT next!
How about π? Probably not? 1,346,700? - 215?

What’s NOT next?
• Thinking about what the answer is NOT gives us quite a bit of information ...
• OK … on to the solution:

Solution
• I don’t know! … B is a list of five numbers randomly generated using Minitab on a uniform distribution on {1, 2, …, 100}.

LESSON NUMBER 1
• Some things are inherently random. Somebody might correctly guess the next number in the sequence, but they would have to be lucky.
• Or to put it another way: The world is more random than most people think!

Other examples:
• Stars in the night sky … pretty random but people attached stories to them!
• Stock market results … analysts on the news offer reasons every day for an inherently random process!
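
For anyone who wants to check the two sequences at home, here is a minimal sketch. The first part applies the digit-reversal rule for Sequence A; the second part draws five numbers uniformly from {1, 2, …, 100} in the spirit of Sequence B. The function name `reverse_diff`, the seed, and the use of Python's `random` module in place of Minitab are all assumptions made for illustration, so the random draws will not reproduce the numbers on the slide.

```python
import random

def reverse_diff(n: int) -> int:
    """Treat n as a two-digit number (9 is read as 09), reverse its digits,
    and return the absolute difference between n and the reversed number."""
    tens, ones = divmod(n, 10)
    return abs(n - (10 * ones + tens))

# Sequence A: start at 91 and apply the rule five times.
seq_a = [91]
for _ in range(5):
    seq_a.append(reverse_diff(seq_a[-1]))
print(seq_a)  # [91, 72, 45, 9, 81, 63]

# A Sequence-B-style list: five independent uniform draws on {1, ..., 100}.
# The seed is arbitrary (hypothetical); there is no rule to discover here.
rng = random.Random(2016)
seq_b_like = [rng.randint(1, 100) for _ in range(5)]
print(seq_b_like)
```

The contrast is the point of Lesson 1: the first list is fully determined by its rule, while no amount of cleverness predicts the next entry of the second.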
Another fun experiment
• Step 1: Draw a square
• Step 2: Place five points (x’s) at random in the square
• Step 3: Draw nine equal sub-squares in your original, i.e. a tic-tac-toe board in your original square.

A guess…
• Your five points are in five different subsquares, aren’t they? (And if yours aren’t, your students will be!)
• THAT’S NOT RANDOM!!! That’s a pattern. Real randomness is ‘clumpier’ than people expect.

A quick representation of Lesson #1
• MATH: y = f(x)
• STAT: y = f(x) + error

Lesson #2 - Models
• Quick quiz …
1.) True/False: Mathematics is a useful tool for modeling real world problems.
2.) Name one of the problems.

Lesson #2 - Models
Let’s see what the crowd likes for math models:

What about stat models?
They are everywhere … like in any comparative sentence!
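
Returning to the five-point experiment above: the claim that real randomness is clumpier than people expect can be checked with a short calculation. The sketch below assumes the five points are independent and uniform on the square, so each point lands in one of the nine sub-squares with equal probability. The names `exact`, `all_different_cells`, and `estimate`, the seed, and the number of trials are choices made for this illustration.

```python
import random
from math import perm

# Exact probability that five uniform points land in five different cells
# of a 3x3 grid: (9 * 8 * 7 * 6 * 5) / 9**5.
exact = perm(9, 5) / 9**5

def all_different_cells(rng: random.Random) -> bool:
    """Drop five uniform points in the unit square and report whether
    they occupy five distinct cells of the tic-tac-toe grid."""
    cells = {(int(rng.random() * 3), int(rng.random() * 3)) for _ in range(5)}
    return len(cells) == 5

# Monte Carlo check of the exact value (seed and trial count are arbitrary).
rng = random.Random(1)
trials = 100_000
estimate = sum(all_different_cells(rng) for _ in range(trials)) / trials

print(f"exact    = {exact:.4f}")
print(f"estimate = {estimate:.4f}")
```

Under these assumptions the exact value is 15120/59049, or about 0.256, so truly random placements put at least two points in the same sub-square roughly three times out of four.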
Lesson #2 in language…
“This is a pretty good sandwich for a fast food place!”
Quality = f(venue) + error
“Pedroia has a lot of power for such a small guy.”
Power = f(size) + error

Power = f(size) + error
• A mathematician modeling this relationship might develop an equation or system that has inputs like swing plane, bat speed, pitch speed and location and an output of a trajectory…
• A statistician would be interested in the strength of the association between the inputs and the outputs. How much of power can be explained by size?

To Recap:
• Lesson 1: The world is more random than people think.
• Lesson 2: Statements that can be interpreted as statistical models are common.

From models to decisions:
• Statistical models are easy to find, but studying them and applying them takes a careful approach. This is where statistical expertise comes in!
- What model?
- What population?
- What sample?
- What measures?
- Outliers?
- Presentation?

A timely aside…
• Statisticians’ reputations took a hit with the recent election. But a sophisticated look at the question paints a much less negative picture.
• On election eve, www.fivethirtyeight.com had the probability of Trump winning the election at about 29%. Events like that happen all the time!

A timely aside:
• By and large the election results nationally were well within the margins of error of the polls.
• Media outlets and their consumers want clear-cut results like “Clinton will win” when a statement like “Clinton appears likely to win, but Trump has a reasonable chance” tells the story!

A timely aside:
• That said, the fact that most polling data had a small but consistent bias is something that statisticians within those polling firms will need to address.
• Opinion: The most likely causes are non-response, not false response; and incorrectly predicting likelihood of voting.

Probability and Statistics
• Many math departments have a course with this title.
Inside joke: Why are most Prob and Stat courses like BWI, the Baltimore-Washington International Airport?
Answer: BWI is a lot closer to Baltimore.
Lesson 3…
• Statisticians teaching probability should try to back up the theoretical probability models with real data and statistical thinking.

Speaking of sports…
Sports media and sports fans love streaks … Famous examples?
HOWEVER, sports media and sports fans tend to underestimate the probability of streaks by taking them out of context.

More on streaks…
• Google the phrase “Sports Probability Streaks” and the resulting link is: http://www.mathaware.org/mam/2010/essays/ClearyStreaks.pdf

Streak Example:
• At a 2007 game, the Boston Red Sox hit four home runs in a row.
• The next day, the Boston Globe quoted a local math professor that the chance of this was about one in two million.
• Done by taking MLB home run rate (.027) to the fourth power.

Surprising streaks
• Seems reasonable, right?
• EXCEPT this was the fourth time it had happened in about 140,000 MLB games.
• So how was the expert opinion off by two orders of magnitude???

Two reasons…
• Small but significant reason … the four players (Manny Ramirez, Mike Lowell, JD Drew, Jason Varitek) all had much higher than average home run rates.
• Big reason: Context! Four home runs in a row during the game vs. four home runs in a row RIGHT NOW!
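
The gap between "right now" and "sometime during the game" can be sketched numerically. The code below first reproduces the quoted calculation (.027 to the fourth power), then computes the chance of at least one run of four straight home runs somewhere in a sequence of plate appearances. It assumes independent plate appearances with the same home run rate, and it treats a game as a single sequence of 80 plate appearances; both are simplifications introduced here (real batters differ, and the two teams alternate), and the function name `prob_run` is hypothetical.

```python
def prob_run(n: int, k: int, p: float) -> float:
    """Probability of at least one run of k consecutive successes in n
    independent trials, each succeeding with probability p."""
    # state[j] = probability that no run of k has happened yet and the
    # current streak of successes has length j (0 <= j < k).
    state = [1.0] + [0.0] * (k - 1)
    for _ in range(n):
        new = [0.0] * k
        new[0] = (1 - p) * sum(state)      # a failure resets the streak
        for j in range(k - 1):
            new[j + 1] = p * state[j]      # a success extends the streak
        state = new                        # streaks reaching k are dropped
    return 1 - sum(state)

p_hr = 0.027                               # MLB home run rate from the slide
right_now = p_hr ** 4                      # four in a row starting RIGHT NOW
per_game = prob_run(80, 4, p_hr)           # somewhere in ~80 plate appearances

print(f"right now: about 1 in {1 / right_now:,.0f}")
print(f"per game:  about 1 in {1 / per_game:,.0f}")
print(f"observed:  about 1 in {140_000 / 4:,.0f}")
```

Under these assumptions the per-game figure is dozens of times larger than the "right now" figure, which puts it in the same range as the observed four occurrences in about 140,000 games.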
Context explained…
• About 80 hitters come up during a major league baseball game so the Red Sox and opponent (Yankees) had MANY CHANCES to start such a streak!
• This is why streak probabilities are underestimated; people forget there are many chances to start a streak.

Classroom fun…
1.) Have some students actually flip a coin about 80 times, tell others to just simulate by writing H’s/T’s. The ones with streaks of five or more H’s or T’s really did the flips. (You can look like a genius! … OR not!)
2.) Sports interested students: Find some NBA teams with records of about .500 at www.basketball-reference.com. See if they had a winning streak or losing streak of at least five in a row. (About 85% chance that yes they did…)
Does ANY fan, writer, blogger think a five game winning streak is random?

Lesson 3 Lesson 1
• Recapping again:
1.) The world is more random than you think.
2.) Stat models are everywhere and take that randomness into account.
3.) Check claims with real data … and back to 1!

What About the Other Way?
• What can statisticians learn from mathematicians?
• Sometimes statisticians need to evaluate integrals in high dimensional spaces.
• Examples: Variance calculations, joint probabilities, Bayesian analysis…

Stat learns from Math…
• These high dimensional integrals are sometimes easiest to evaluate in a particular order ...
• A key question: WHEN CAN WE CHANGE THE ORDER OF INTEGRATION?

Math to The Rescue!
Fubini’s Theorem! Suppose A and B are complete measure spaces. Suppose f(x,y) is A × B measurable. If
\[ \int_{A \times B} |f(x,y)| \, d(x,y) < \infty, \]
where the integral is taken with respect to a product measure on the space over A × B, then we can change the order of integration… i.e.

Fubini continued…
\[ \int_A \left( \int_B f(x,y) \, dy \right) dx = \int_B \left( \int_A f(x,y) \, dx \right) dy = \int_{A \times B} f(x,y) \, d(x,y). \]

A slight modification to Fubini
• A statistician has a slightly different answer to the question, “When can we change the order of integration?” Whenever we want!

Neat Streak #2…
• Barton College basketball…
1/21/13 … Barton 76, Pfeiffer 68
1/24/13 … Barton 76, Queens 68
1/26/13 … Barton 76, Erskine 68
What’s the next score in THIS sequence??
(Alas, they lost to Mount Olive, 80-75.)

What are the chances?
• Of a team winning three straight games by the same score?
• - For YOUR favorite or hometown team, this season, the probability is very small!
• - For SOME team in somebody’s town during the next 20 years? There’s a pretty good chance it will happen again!

Neat Streak #3…
• The Chicago Blackhawks opened this year’s National Hockey League season with 24 straight games without a regulation time loss.
• Again, what are the chances? And again, the media was interested: http://www.usatoday.com/story/sports/nhl/blackhawks/2013/03/04/blackhawks-streakodds/1963659/

Getting quoted is (usually) fun…
• Later in the year, this appeared in USA Today:
• http://www.usatoday.com/story/sports/2013/05/23/home-iceadvantage/2354541/

The lesson here is…
A business implication: Mathematicians rarely get consulting jobs solving problems from Calculus I. But statisticians routinely get paid (and praised) for solving problems from Stat 101.
• THANKS for your attention and for all the good work you do!