The Surprising Consequences
of Randomness
LS 829
Mathematics in Science and Civilization
Feb 6, 2010
5/25/2017
LS 829 - 2010
1
Sources and Resources
• Peck, R., et al. (2006) Statistics: A Guide to the Unknown, 4th ed. Duxbury.
• Taleb, N. N. (2008) Fooled by Randomness: The Hidden Role of Chance in the Markets and Life, 2nd Edition. Random House.
• Mlodinow, L. (2008) The Drunkard’s Walk. Vintage Books. New York.
• Rosenthal, J. S. (2005) Struck by Lightning. Harper Perennial. Toronto.
• www.stat.sfu.ca/~weldon
Introduction
• Randomness concerns Uncertainty - e.g. Coin
• Does Mathematics concern Certainty? - P(H) = 1/2
• Probability can help to Describe Randomness &
“Unexplained Variability”
• Randomness & Probability are key concepts for
exploring implications of “unexplained variability”
[Diagram labels: Abstract; Real World; Mathematics; Applications of Mathematics; Probability; Applied Statistics; Surprising Findings; Useful Principles]
Nine Findings and Associated Principles
Example 1 When is Success just
Good Luck?
An example from the world of
Professional Sport
Sports League - Football
Success = Quality or Luck?
2007 AFL LADDER
TEAM               Played  Win  Draw  Loss  Points For  Points Against  Ratio  Points
Geelong                22   18     0     4        2542            1664    153      72
Port Adelaide          22   15     0     7        2314            2038    114      60
West Coast Eagles      22   15     0     7        2162            1935    112      60
Kangaroos              22   14     0     8        2183            1998    109      56
Hawthorn               22   13     0     9        2097            1855    113      52
Collingwood            22   13     0     9        2011            1992    101      52
Sydney Swans           22   12     1     9        2031            1698    120      50
Adelaide               22   12     0    10        1881            1712    110      48
St Kilda               22   11     1    10        1874            1941     97      46
Brisbane Lions         22    9     2    11        1986            1885    105      40
Fremantle              22   10     0    12        2254            2198    103      40
Essendon               22   10     0    12        2184            2394     91      40
Western Bulldogs       22    9     1    12        2111            2469     86      38
Melbourne              22    5     0    17        1890            2418     78      20
Carlton                22    4     0    18        2167            2911     74      16
Richmond               22    3     1    18        1958            2537     77      14
Recent News Report
“A crowd of 97,302 has witnessed Geelong break
its 44-year premiership drought by crushing a hapless
Port Adelaide by a record 119 points in Saturday's
grand final at the MCG.” (2007 Season)
Are there better teams?
• How much variation in the total points table
would you expect IF
every team had the same chance of winning
every game? i.e. every game is 50-50.
• Try the experiment with 5 teams.
H=Win T=Loss (ignore Ties for now)
5 Team Coin Toss Experiment
• Win=4, Tie=2, Loss=0 but we ignore ties. P(W)=1/2
• 5 teams (1,2,3,4,5) so 10 games as follows
• 1-2, 1-3, 1-4, 1-5, 2-3, 2-4, 2-5, 3-4, 3-5, 4-5
My experiment …
• T T H T T H H H H T
Experiment Result ----->
Team  Points
3     16
2     12
5     8
1     4
4     0
But all teams Equal Quality (Equal Chance to win)
Implications?
• Points spread due to chance?
• Top team may be no better than the
bottom team (in chance to win).
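The five-team experiment can be re-tallied in a few lines of Python. This is a minimal sketch, assuming that H means the first-listed team in each pairing wins; the slides do not state this, but the reading reproduces the points table shown for the experiment. The function name `league_points` is illustrative only.

```python
from itertools import combinations

def league_points(tosses, n_teams=5, win_points=4):
    """Tally league points when toss i decides game i.

    Assumption: 'H' means the first-listed team of the pairing wins,
    'T' means the second-listed team wins. Ties are ignored.
    """
    # combinations() yields 1-2, 1-3, 1-4, 1-5, 2-3, ..., 4-5 in slide order
    games = list(combinations(range(1, n_teams + 1), 2))
    if len(tosses) != len(games):
        raise ValueError("need exactly one toss per game")
    points = {team: 0 for team in range(1, n_teams + 1)}
    for (first, second), toss in zip(games, tosses):
        winner = first if toss == "H" else second
        points[winner] += win_points
    return points

points = league_points("TTHTTHHHHT")
for team, pts in sorted(points.items(), key=lambda kv: -kv[1]):
    print(team, pts)
```

Replacing the toss string with freshly generated coin tosses gives a different ladder each time, even though every team has the same chance of winning every game.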
Simulation: 16 teams, equal chance to win, 22 games
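A simulation of this kind might look like the sketch below. It is not the code behind the slide: the random re-pairing of teams each round is an assumption made for simplicity (the real AFL fixture is not random), ties are ignored, and `simulate_season` is a hypothetical name.

```python
import random

def simulate_season(n_teams=16, n_rounds=22, win_points=4, seed=None):
    """Simulate one season in which every game is a 50-50 coin toss.

    Assumption: teams are paired at random in each round, so every
    team plays exactly n_rounds games.
    """
    rng = random.Random(seed)
    points = [0] * n_teams
    for _ in range(n_rounds):
        order = list(range(n_teams))
        rng.shuffle(order)                      # random pairing for this round
        for i in range(0, n_teams, 2):
            a, b = order[i], order[i + 1]
            winner = a if rng.random() < 0.5 else b
            points[winner] += win_points
    return sorted(points, reverse=True)         # ladder, best team first

ladder = simulate_season(seed=1)
print(ladder)
print("spread between top and bottom:", ladder[0] - ladder[-1])
```

Running it with several seeds shows how wide a points spread can arise purely by chance, which is the comparison to make against the real 2007 ladder.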
Does it Matter?
Avoiding foolish predictions
Managing competitors (of any kind)
Understanding the business of sport
Appreciating the impact of uncontrolled variation
in everyday life
Point of this Example?
Need to discount “chance”
In making inferences from
everyday observations.
Example 2 - Order from
Apparent Chaos
An example from some
personal data collection
Gasoline Consumption
Each Fill - record kms and litres of fuel used
Smooth ---> Seasonal Pattern …. Why?
Pattern Explainable?
Air temperature?
Rain on roads?
Seasonal Traffic Pattern?
Tire Pressure?
Info Extraction Useful for Exploration of Cause
Smoothing was key technology in info extraction
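As an illustration of the smoothing step, here is a minimal sketch using a centered moving average. The slides do not say which smoother was used, so the moving average is a stand-in, and the fuel data below are synthetic (generated for this example, not the author's records).

```python
import math
import random

def moving_average(values, window):
    """Centered moving average; the window shrinks near the ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

# Synthetic data: a seasonal cycle plus noise, litres per 100 km, one per fill.
rng = random.Random(0)
fills = [9.0 + 0.8 * math.cos(2 * math.pi * i / 50) + rng.gauss(0, 1.0)
         for i in range(200)]
smooth = moving_average(fills, window=15)

def roughness(xs):
    """Total absolute change between successive values."""
    return sum(abs(b - a) for a, b in zip(xs, xs[1:]))

print(round(roughness(fills), 1), round(roughness(smooth), 1))
```

The `window` argument plays the role of the smoothing parameter: a wider window gives a smoother curve but can flatten genuine seasonal swings.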
Intro to smoothing with context …
Jan 12, 2010
STAT 100
22
Optimal Smoothing Parameter?
• Depends on Purpose of Display
• Choice Ultimately Subjective
• Subjectivity is a necessary part
of good data analysis
Summary of this Example
• Surprising? Order from Chaos …
• Principle - Smoothing and Averaging reveal
patterns encouraging investigation of cause
3. Weather Forecasting
Chaotic Weather
• 1900 – equations too complicated to solve
• 2000 – solvable but still poor predictors
• 1963 – The “Butterfly Effect”
small changes in initial conditions ->
large short term effects
• today – ensemble forecasting see p 173
• Rupert Miller p 178 – stats for short term …
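Sensitivity to initial conditions can be seen in a very small experiment. The sketch below uses the logistic map as a stand-in chaotic system; it is not a weather model and not the 1963 equations, and the starting values are arbitrary choices for illustration.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x), a standard toy chaotic system."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two starting points that differ by one part in ten billion.
a = logistic_orbit(0.2, 100)
b = logistic_orbit(0.2 + 1e-10, 100)
gaps = [abs(x - y) for x, y in zip(a, b)]

print("largest gap in first 5 steps:", max(gaps[:6]))
print("largest gap after step 40:   ", max(gaps[40:]))
```

The two orbits are indistinguishable at first and unrelated a few dozen steps later, which is why averaging over an ensemble of slightly perturbed starts is more informative than a single forecast run.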
Conclusion from
Weather Example?
• It may not be true that weather forecasting
will improve dramatically in the future
• Some systems have inherent instability and
increased computing power may not be
enough to break through this barrier
Example 4 - Obtaining
Confidential Information
• How can you ask an individual for data on
– Incomes
– Illegal Drug use
– Sex modes
– ….. Etc
in a way that will get an honest response?
• There is a need to protect confidentiality of answers.
Example: Marijuana Usage
• Randomized Response Technique
Pose two Yes-No questions and have coin
toss determine which is answered
Head 1. Do you use Marijuana regularly?
Tail 2. Is your coin toss outcome a tail?
Randomized Response
Technique
• Suppose 60 of 100 answer Yes. Then about
50 are saying they have a tail. So 10 of the
other 50 are users. 20%.
• It is a way of using randomization to protect
Privacy. Public Data banks have used this.
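The arithmetic in the first bullet generalizes to a one-line estimator. A minimal sketch, assuming the design described on the previous slide: heads means the sensitive question is answered, tails means the answer is automatically Yes. The function name and the `p_head` parameter are illustrative.

```python
def estimate_prevalence(yes_count, n_respondents, p_head=0.5):
    """Estimate the proportion answering Yes to the sensitive question.

    Model: P(Yes) = p_head * prevalence + (1 - p_head) * 1,
    because everyone who tosses a tail truthfully answers Yes.
    """
    yes_rate = yes_count / n_respondents
    return (yes_rate - (1.0 - p_head)) / p_head

estimate = estimate_prevalence(60, 100)
print(round(estimate, 3))   # the slide's example: about 20% users
```

No individual answer reveals anything, since a Yes may simply report a tail; only the aggregate Yes rate is informative.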
Summary of Example 4
• Surprising that people can be induced to
provide sensitive information in public
• The key technique is to make use of the
predictability of certain empirical
probabilities.
5. Randomness in the Markets
• 5A. Trends That Deceive
• 5B. The Power of Diversification
• 5C. Back-the-winner fallacy
5A. Trends That Deceive
People often fail to appreciate the
effects of randomness
The Random Walk
Trends that do not persist
Longer Random Walk
Recent Intel Stock Price
Things to Note
• The random walk has no patterns useful for
prediction of direction in future
• Stock price charts are well modeled by
random walks
• Advice about future direction of stock
prices – take with a grain of salt!
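A symmetric random walk of the kind plotted on the preceding slides can be generated as follows. This is a sketch with unit steps of +1 or -1; the step size and walk length used for the slides are not given, so the values here are assumptions.

```python
import random

def random_walk(n_steps, seed=None):
    """Symmetric random walk: each step is +1 or -1 with equal chance."""
    rng = random.Random(seed)
    position, path = 0, [0]
    for _ in range(n_steps):
        position += rng.choice((-1, 1))
        path.append(position)
    return path

def longest_run(path):
    """Length of the longest streak of consecutive moves in one direction."""
    best = run = 0
    prev = 0
    for a, b in zip(path, path[1:]):
        step = b - a
        run = run + 1 if step == prev else 1
        prev = step
        best = max(best, run)
    return best

path = random_walk(1000, seed=3)
print("final position:", path[-1], " longest streak:", longest_run(path))
```

Plotting `path` for a few seeds usually shows stretches that look like trends, although by construction the next step never depends on the past.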
5B. The Power of
Diversification
People often fail to appreciate the
effects of randomness
Preliminary Proposal
I offer you the following “investment opportunity”
You give me $100. At the end of one year, I will
return an amount determined by tossing a fair
coin twice, as follows:
$0 ........ 25% of the time (TT)
$50 ....... 25% of the time (TH)
$100 ...... 25% of the time (HT)
$400 ...... 25% of the time (HH)
Would you be interested?
Stock Market Investment
• Risky Company - example in a known context
• Return in 1 year for 1 share costing $1
0.00   25% of the time
0.50   25% of the time
1.00   25% of the time
4.00   25% of the time
i.e. Lose Money 50% of the time
Only Profit 25% of the time
“Risky” because high chance of loss
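The average outcome of this risky share follows directly from the four equally likely payoffs. The short calculation below uses exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

# One-year value of a $1 share; each outcome has probability 1/4.
outcomes = [Fraction(0), Fraction(1, 2), Fraction(1), Fraction(4)]
prob = Fraction(1, 4)

expected = sum(prob * x for x in outcomes)                   # mean payoff per $1
p_loss = sum(prob for x in outcomes if x < 1)                # chance of losing money
p_profit = sum(prob for x in outcomes if x > 1)              # chance of a profit
variance = sum(prob * (x - expected) ** 2 for x in outcomes)

print("expected value per $1:", float(expected))
print("P(loss):", p_loss, " P(profit):", p_profit)
print("standard deviation:", float(variance) ** 0.5)
```

So the share loses money half the time, yet its mean payoff is $1.375 per dollar invested, an average profit of 37.5%.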
Independent Outcomes
• What if you have the chance to put $1 into
each of 100 such companies, where the
companies are all in very different markets?
• What sort of outcomes then? Use coin-tossing (by computer) to explore
Diversification
Unrelated Companies
• Choose 100 unrelated companies, each one
risky like this. Outcome is still uncertain
but look at typical outcomes ….
One-Year Returns to a $100 investment
Looking at Profit only
Avg Profit approx 38%
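The computer coin-tossing can be sketched as below. This is not the code behind the slides: the number of repetitions (2000) and the seed are arbitrary choices, and full independence between the 100 companies is assumed.

```python
import random
import statistics

PAYOFFS = (0.0, 0.5, 1.0, 4.0)   # one-year value of $1, each with probability 1/4

def portfolio_value(n_companies, rng):
    """Value after one year of $1 placed in each of n independent companies."""
    return sum(rng.choice(PAYOFFS) for _ in range(n_companies))

rng = random.Random(2010)
values = [portfolio_value(100, rng) for _ in range(2000)]

mean_value = statistics.mean(values)
sd_value = statistics.stdev(values)
share_losing = sum(v < 100 for v in values) / len(values)

print("average value of $100 portfolio:", round(mean_value, 1))
print("standard deviation:", round(sd_value, 1))
print("share of simulated years ending below $100:", share_losing)
```

The theoretical mean is $137.50, matching the roughly 38% average profit, and a single company's 50% chance of loss shrinks to a small fraction of simulated portfolios.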
Gamblers like
Averages and Sums!
• The sum of 100 independent investments in
risky companies is very predictable!
• Sums (and averages) are more stable than
the things summed (or averaged).
Variance -----> Variance/n (for an average of n independent items)
• Square root law for variability of averages: SD -----> SD/√n
Summary - Diversification
• Variability is not Risk
• Stocks with volatile prices can be good
investments
• Criteria for Portfolio of Volatile Stocks
– profitable on average
– independence (or not severe dependence)
5C - Back-the-winner fallacy
• Mutual Funds - a way of diversifying a
small investment
• Which mutual fund?
• Look at past performance?
• Experience from symmetric random walk
…
Implication from Random Walk
…?
• Stock market trends may not persist
• Past might not be a good guide to future
• Some fund managers better than others?
• A small difference can result in a big
difference over a long time …
A simulation experiment to
determine the value of past
performance data
• Simulate good and bad managers
• Pick the best ones based on 5 years data
• Simulate a future 5-yrs for these select
managers
How to describe good and bad
fund managers?
• Use TSX Index over past 50 years as a
guide ---> annualized return is 10%
• Use a random walk with a slight upward
trend to model each manager.
• Daily change positive with probability p
Good manager     ROR = 13% pa   p=.56
Medium manager   ROR = 10% pa   p=.55
Poor manager     ROR = 8% pa    p=.54
Simulation to test
“Back the Winner”
• 100 managers assigned various p
parameters in .54 to .56 range
• Simulate for 5 years
• Pick the top-performing managers (top 15%)
• Use the same 100 p-parameters to simulate
a new 5 year experience
• Compare new outcome for “top” and
“bottom” managers
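The steps listed above can be sketched as follows. Several details are assumptions, not taken from the slides: each fund moves up or down by a fixed 0.4% per day, a year has 250 trading days, and the p values are drawn uniformly from .54 to .56. With those choices the annual returns come out roughly in line with the 8% to 13% range quoted earlier.

```python
import random

def simulate_fund(p, n_days, rng, step=0.004, start=100.0):
    """Fund value after n_days; each day is up by `step` with probability p."""
    value = start
    for _ in range(n_days):
        value *= (1 + step) if rng.random() < p else (1 - step)
    return value

def back_the_winner(n_managers=100, years=5, days_per_year=250,
                    top_frac=0.15, seed=0):
    rng = random.Random(seed)
    n_days = years * days_per_year
    skills = [rng.uniform(0.54, 0.56) for _ in range(n_managers)]
    past = [simulate_fund(p, n_days, rng) for p in skills]
    future = [simulate_fund(p, n_days, rng) for p in skills]   # same p, new luck

    ranked = sorted(range(n_managers), key=lambda i: past[i], reverse=True)
    k = round(top_frac * n_managers)
    top, bottom = ranked[:k], ranked[-k:]

    def average(indices, results):
        return sum(results[i] for i in indices) / len(indices)

    return {
        "top_past": average(top, past),
        "top_future": average(top, future),
        "bottom_past": average(bottom, past),
        "bottom_future": average(bottom, future),
    }

result = back_the_winner()
for name, value in result.items():
    print(name, round(value, 1))
```

Comparing `top_future` with `bottom_future` over several seeds shows how much of the past gap between "top" and "bottom" managers survives when the same managers are given fresh luck.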
START=100
Mutual Fund Advice?
Don’t expect past relative performance to be a
good indicator of future relative
performance.
Again - need to give due allowance for
randomness (i.e. LUCK)
Summary of Example 5C
• Surprising that Past Performance is such a
poor indicator of Future Performance
• Simulation is the key to exploring this issue
6. Statistics in the Courtroom
• Kristen Gilbert Case
• Data p 6 of article – 10 years data needed!
• Table p 9 of article – rare outcome if only
randomness involved. P-value logic.
• Discount randomness but not quite proof
• Prosecutor’s Fallacy P[E|I] ≠ P[I|E]
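The gap between P[E|I] and P[I|E] can be made concrete with Bayes' rule. The numbers below are purely hypothetical, chosen for illustration, and are not taken from the Gilbert case: E is the evidence, I is innocence.

```python
from fractions import Fraction

# Hypothetical inputs (NOT from the Gilbert case):
p_e_given_innocent = Fraction(1, 1000)   # evidence is rare if innocent
p_e_given_guilty = Fraction(1)           # evidence is certain if guilty
prior_guilty = Fraction(1, 1000)         # guilt is also rare beforehand
prior_innocent = 1 - prior_guilty

# Bayes' rule: P[I|E] = P[E|I] P[I] / (P[E|I] P[I] + P[E|G] P[G])
numerator = p_e_given_innocent * prior_innocent
p_innocent_given_e = numerator / (numerator + p_e_given_guilty * prior_guilty)

print("P[E|I] =", float(p_e_given_innocent))
print("P[I|E] =", float(p_innocent_given_e))
```

With these made-up inputs the evidence has only a 0.1% chance under innocence, yet the chance of innocence given the evidence is close to 50%, so the two quantities cannot be swapped.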
Lesson from Gilbert Case
• Statistical logic is subtle
• Easy to misunderstand
• Subjectivity necessary in some decision-making
Example 7 - Lotteries:
Expectation and Hope
• Cash flow
– Ticket proceeds in (100%)
– Prize money out (50%)
– Good causes (35%)
– Administration and Sales (15%)
• $1.00 ticket worth 50 cents, on average
• Typical lottery P(jackpot) = .00000007 (about 1 in 14 million)
How small is .00000007?
• Buy 10 tickets every week for 60 years
• Cost is $31,200.
• Chance of winning jackpot is = ….
1/5 of 1 percent!
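The calculation behind this slide is short enough to check directly. The sketch assumes a jackpot chance of 1 in 14 million per ticket, the figure mentioned in the closing summary, and a ticket price of $1.00.

```python
p_jackpot = 1 / 14_000_000        # assumed chance per ticket (1 in 14 million)
tickets = 10 * 52 * 60            # 10 tickets a week for 60 years
cost = tickets * 1.00             # at $1.00 per ticket

# Chance of at least one jackpot among all tickets bought
p_at_least_one = 1 - (1 - p_jackpot) ** tickets

# With 50% of proceeds paid out as prizes, average winnings are half the cost
expected_prizes = 0.50 * cost

print("tickets bought:", tickets, " total cost: $", cost)
print("chance of at least one jackpot:", round(100 * p_at_least_one, 2), "%")
print("average prize money returned: $", expected_prizes)
```

The chance comes out a little over 0.2%, in line with the "1/5 of 1 percent" on the slide.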
Summary of Example 7
•Surprising that lottery tickets provide
so little hope!
•Key technology is simple use of
probabilities
Nine Surprising Findings
1. Sports Leagues - Lack of Quality Differentials
2. Gasoline Mileage - Seasonal Patterns
3. Weather - May be too unstable to predict
4. Marijuana – Can get Confidential info
5A. Random Walk – Trends that are not there
5B. Risky Stocks – Possibly a Reliable Investment
5C. Mutual Funds – Past Performance not much help
6. Gilbert Case – Finding Signal amongst Noise
7. Lotteries - Lightning Seldom Strikes
Nine Useful
Concepts & Techniques?
1. Sports Leagues - Unexplained variation can
cause illusions - simulation can inform
2. Gasoline Mileage - Averaging (and smoothing)
amplifies signals
3. Weather – Beware the Butterfly Effect!
4. Marijuana – Randomized Response Surveys
5A. Random Walks – Simulation can inform
5B. Risky Stocks - Simulation can inform
5C. Mutual Funds - Simulation can inform
6. Gilbert Case – Extracts Signal from Noise
7. Lotteries – 14 million is a big number!
Role of Math?
• Key background for
– Graphs
– Probabilities
– Simulation models
– Smoothing Methods
• Important for constructing theory of
inference
Limitation of Math
• Subjectivity Necessary in Decision-Making
• Extracting Information from Data is still
partly an “art”
• Context is suppressed in a mathematical
approach to problem solving
• Context is built in to a statistical approach
to problem solving
The End
Questions, Comments, Criticisms…..