The Surprising Consequences of Randomness
LS 829 Mathematics in Science and Civilization
Feb 6, 2010
5/25/2017 LS 829 - 2010

Sources and Resources
• Peck, R., et al. (2006) Statistics: A Guide to the Unknown, 4th ed. Duxbury.
• Taleb, N. N. (2008) Fooled by Randomness: The Hidden Role of Chance in the Markets and Life, 2nd Edition. Random House.
• Mlodinow, L. (2008) The Drunkard’s Walk. Vintage Books. New York.
• Rosenthal, J. S. (2005) Struck by Lightning. Harper Perennial. Toronto.
• www.stat.sfu.ca/~weldon

Introduction
• Randomness concerns Uncertainty - e.g. Coin
• Does Mathematics concern Certainty? - P(H) = 1/2
• Probability can help to Describe Randomness & “Unexplained Variability”
• Randomness & Probability are key concepts for exploring implications of “unexplained variability”

[Diagram labels: Abstract, Real World, Mathematics, Applications of Mathematics, Probability, Applied Statistics, Surprising Findings, Useful Principles. Caption: Nine Findings and Associated Principles]

Example 1 - When is Success just Good Luck?
An example from the world of Professional Sport

Sports League - Football
Success = Quality or Luck?
2007 AFL Ladder

Team                Played  Win  Draw  Loss  Points For  Points Against  Ratio  Points
Geelong                 22   18     0     4        2542            1664    153      72
Port Adelaide           22   15     0     7        2314            2038    114      60
West Coast Eagles       22   15     0     7        2162            1935    112      60
Kangaroos               22   14     0     8        2183            1998    109      56
Hawthorn                22   13     0     9        2097            1855    113      52
Collingwood             22   13     0     9        2011            1992    101      52
Sydney Swans            22   12     1     9        2031            1698    120      50
Adelaide                22   12     0    10        1881            1712    110      48
St Kilda                22   11     1    10        1874            1941     97      46
Brisbane Lions          22    9     2    11        1986            1885    105      40
Fremantle               22   10     0    12        2254            2198    103      40
Essendon                22   10     0    12        2184            2394     91      40
Western Bulldogs        22    9     1    12        2111            2469     86      38
Melbourne               22    5     0    17        1890            2418     78      20
Carlton                 22    4     0    18        2167            2911     74      16
Richmond                22    3     1    18        1958            2537     77      14

Recent News Report
“A crowd of 97,302 has witnessed Geelong break its 44-year premiership drought by crushing a hapless Port Adelaide by a record 119 points in Saturday's grand final at the MCG.” (2007 Season)

Are there better teams?
• How much variation in the total points table would you expect IF every team had the same chance of winning every game? i.e. every game is 50-50.
• Try the experiment with 5 teams.
  H = Win, T = Loss (ignore Ties for now)

5 Team Coin Toss Experiment
• Win=4, Tie=2, Loss=0, but we ignore ties. P(W)=1/2
• 5 teams (1,2,3,4,5), so 10 games as follows:
• 1-2, 1-3, 1-4, 1-5, 2-3, 2-4, 2-5, 3-4, 3-5, 4-5
My experiment … (H = first-named team wins)
• T T H T T H H H H T

Experiment Result ----->
Team  Points
3     16
2     12
5      8
1      4
4      0
But all teams Equal Quality (Equal Chance to win)

Implications?
• Points spread due to chance?
• Top team may be no better than the bottom team (in chance to win).

Simulation: 16 teams, equal chance to win, 22 games

Sports League - Football
Success = Quality or Luck?
(2007 AFL ladder, as shown earlier)

Does it Matter?
• Avoiding foolish predictions
• Managing competitors (of any kind)
• Understanding the business of sport
• Appreciating the impact of uncontrolled variation in everyday life

Point of this Example?
Need to discount “chance” in making inferences from everyday observations.

Example 2 - Order from Apparent Chaos
An example from some personal data collection

Gasoline Consumption
Each Fill - record kms and litres of fuel used
Smooth ---> Seasonal Pattern …. Why?

Pattern Explainable?
• Air temperature?
• Rain on roads?
• Seasonal Traffic Pattern?
• Tire Pressure?
Info Extraction Useful for Exploration of Cause
Smoothing was key technology in info extraction

Intro to smoothing with context … (Jan 12, 2010, STAT 100)

Optimal Smoothing Parameter?
• Depends on Purpose of Display
• Choice Ultimately Subjective
• Subjectivity is a necessary part of good data analysis

Summary of this Example
• Surprising? Order from Chaos …
• Principle - Smoothing and Averaging reveal patterns encouraging investigation of cause

3. Weather Forecasting

Chaotic Weather
• 1900 – equations too complicated to solve
• 2000 – solvable but still poor predictors
• 1963 – The “Butterfly Effect”: small changes in initial conditions -> large short term effects
• today – ensemble forecasting, see p 173
• Rupert Miller p 178 – stats for short term …

Conclusion from Weather Example?
• It may not be true that weather forecasting will improve dramatically in the future
• Some systems have inherent instability, and increased computing power may not be enough to break through this barrier

Example 4 - Obtaining Confidential Information
How can you ask an individual for data on
• Incomes
• Illegal Drug use
• Sex modes
• ….. Etc
in a way that will get an honest response?
There is a need to protect confidentiality of answers.

Example: Marijuana Usage
• Randomized Response Technique
Pose two Yes-No questions and have a coin toss determine which is answered:
Head: 1. Do you use Marijuana regularly?
Tail: 2. Is your coin toss outcome a tail?

Randomized Response Technique
• Suppose 60 of 100 answer Yes. Then about 50 are saying they have a tail. So about 10 of the other 50 are users: 20%.
• It is a way of using randomization to protect Privacy. Public Data banks have used this.
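The arithmetic on the Randomized Response slide can be sketched in a few lines of Python. This is only an illustration under the slide's design (a fair coin, truthful answers, and everyone who tosses a tail answering Yes); the function names `estimate_user_rate` and `simulate_survey` are hypothetical and are not part of the original talk.

```python
import random


def estimate_user_rate(yes_count, n, p_tail=0.5):
    """Estimate the proportion of users from randomized-response answers.

    Assumed model (from the slide): P(Yes) = p_tail * 1 + (1 - p_tail) * rate,
    so rate = (yes_count / n - p_tail) / (1 - p_tail).
    """
    return (yes_count / n - p_tail) / (1 - p_tail)


def simulate_survey(n, true_rate, rng):
    """Simulate n truthful respondents and return the number of Yes answers."""
    yes = 0
    for _ in range(n):
        if rng.random() < 0.5:
            # Tail: the question is "Is your coin toss outcome a tail?" -> Yes
            yes += 1
        elif rng.random() < true_rate:
            # Head: the sensitive question, answered Yes only by users
            yes += 1
    return yes


# The slide's numbers: 60 Yes answers out of 100 respondents.
print(f"{estimate_user_rate(60, 100):.2f}")  # prints 0.20

# A larger simulated survey with an assumed true rate of 20%.
rng = random.Random(829)
yes = simulate_survey(10_000, 0.20, rng)
print(f"{estimate_user_rate(yes, 10_000):.3f}")
```

The interviewer never learns which question any one person answered, yet the overall Yes count still pins down the group rate, because the share of tails is predictable.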
Summary of Example 4
• Surprising that people can be induced to provide sensitive information in public
• The key technique is to make use of the predictability of certain empirical probabilities.

5. Randomness in the Markets
• 5A. Trends That Deceive
• 5B. The Power of Diversification
• 5C. Back-the-winner fallacy

5A. Trends That Deceive
People often fail to appreciate the effects of randomness

The Random Walk

Trends that do not persist

Longer Random Walk

Recent Intel Stock Price

Things to Note
• The random walk has no patterns useful for prediction of direction in future
• Stock price charts are well modeled by random walks
• Advice about future direction of stock prices – take with a grain of salt!

5B. The Power of Diversification
People often fail to appreciate the effects of randomness

Preliminary Proposal
I offer you the following “investment opportunity”:
You give me $100. At the end of one year, I will return an amount determined by tossing a fair coin twice, as follows:
$0 ……… 25% of the time (TT)
$50 ……. 25% of the time (TH)
$100 …… 25% of the time (HT)
$400 …… 25% of the time (HH)
Would you be interested?

Stock Market Investment
• Risky Company - example in a known context
• Return in 1 year for 1 share costing $1:
0.00 25% of the time
0.50 25% of the time
1.00 25% of the time
4.00 25% of the time
i.e. Lose Money 50% of the time
Only Profit 25% of the time
“Risky” because high chance of loss

Independent Outcomes
• What if you have the chance to put $1 into each of 100 such companies, where the companies are all in very different markets?
• What sort of outcomes then?
Use coin tossing (by computer) to explore

Diversification - Unrelated Companies
• Choose 100 unrelated companies, each one risky like this. Outcome is still uncertain but look at typical outcomes ….
One-Year Returns to a $100 investment

Looking at Profit only
Avg Profit approx 38%

Gamblers like Averages and Sums!
• The sum of 100 independent investments in risky companies is very predictable!
• Sums (and averages) are more stable than the things summed (or averaged). For an average of n independent items: Variance -----> Variance/n
• Square root law for variability of averages: SD -----> SD/√n

Summary - Diversification
• Variability is not Risk
• Stocks with volatile prices can be good investments
• Criteria for Portfolio of Volatile Stocks
  – profitable on average
  – independence (or not severe dependence)

5C - Back-the-winner fallacy
• Mutual Funds - a way of diversifying a small investment
• Which mutual fund?
• Look at past performance?
• Experience from symmetric random walk …

Implication from Random Walk …?
• Stock market trends may not persist
• Past might not be a good guide to future
• Some fund managers better than others?
• A small difference can result in a big difference over a long time …

A simulation experiment to determine the value of past performance data
• Simulate good and bad managers
• Pick the best ones based on 5 years data
• Simulate a future 5-yrs for these select managers

How to describe good and bad fund managers?
• Use TSX Index over past 50 years as a guide ---> annualized return is 10%
• Use a random walk with a slight upward trend to model each manager.
• Daily change positive with probability p

Manager  ROR      p
Good     13% pa   .56
Medium   10% pa   .55
Poor      8% pa   .54

Simulation to test “Back the Winner”
• 100 managers assigned various p parameters in .54 to .56 range
• Simulate for 5 years
• Pick the top-performing managers (top 15%)
• Use the same 100 p-parameters to simulate a new 5 year experience
• Compare new outcome for “top” and “bottom” managers

START=100

Mutual Fund Advice?
Don’t expect past relative performance to be a good indicator of future relative performance.
Again - need to give due allowance for randomness (i.e. LUCK)

Summary of Example 5C
• Surprising that Past Performance is such a poor indicator of Future Performance
• Simulation is the key to exploring this issue

6. Statistics in the Courtroom
• Kristen Gilbert Case
• Data p 6 of article – 10 years data needed!
• Table p 9 of article – rare outcome if only randomness involved. P-value logic.
• Discount randomness but not quite proof
• Prosecutor’s Fallacy: P[E|I] ≠ P[I|E]

Lesson from Gilbert Case
• Statistical logic is subtle
• Easy to misunderstand
• Subjectivity necessary in some decision-making

Example 7 - Lotteries: Expectation and Hope
• Cash flow
  – Ticket proceeds in (100%)
  – Prize money out (50%)
  – Good causes (35%)
  – Administration and Sales (15%)
• $1.00 ticket worth 50 cents, on average
• Typical lottery P(jackpot) = .00000007 (about 1 in 14 million)

How small is .00000007?
• Buy 10 tickets every week for 60 years
• Cost is $31,200.
• Chance of winning jackpot is = 31,200 × .00000007 ≈ .0022, i.e. about 1/5 of 1 percent!

Summary of Example 7
• Surprising that lottery tickets provide so little hope!
• Key technology is simple use of probabilities

Nine Surprising Findings
1. Sports Leagues - Lack of Quality Differentials
2. Gasoline Mileage - Seasonal Patterns
3. Weather - May be too unstable to predict
4. Marijuana – Can get Confidential info
5A. Random Walk – Trends that are not there
5B. Risky Stocks – Possibly a Reliable Investment
5C. Mutual Funds – Past Performance not much help
6. Gilbert Case – Finding Signal amongst Noise
7. Lotteries - Lightning Seldom Strikes

Nine Useful Concepts & Techniques?
1. Sports Leagues - Unexplained variation can cause illusions - simulation can inform
2. Gasoline Mileage - Averaging (and smoothing) amplifies signals
3. Weather – Beware the Butterfly Effect!
4. Marijuana – Randomized Response Surveys
5A. Random Walks – Simulation can inform
5B. Risky Stocks - Simulation can inform
5C. Mutual Funds - Simulation can inform
6. Gilbert Case – Extracts Signal from Noise
7. Lotteries – 14 million is a big number!

Role of Math?
• Key background for
  – Graphs
  – Probabilities
  – Simulation models
  – Smoothing Methods
• Important for constructing theory of inference

Limitation of Math
• Subjectivity Necessary in Decision-Making
• Extracting Information from Data is still partly an “art”
• Context is suppressed in a mathematical approach to problem solving
• Context is built in to a statistical approach to problem solving

The End
Questions, Comments, Criticisms…..
[email protected]