The Kelly criterion and its variants: theory and practice
in sports, lottery, futures & options trading
The symmetric downside Sharpe ratio and the evaluation of
great investors & speculators and their use of the Kelly
criterion
William T Ziemba
Alumni Professor at Financial Modeling and Stochastic Optimization, Emeritus,
Sauder School of Business, UBC, Vancouver, BC, Canada V6T 1Z2
email: [email protected]
Mathematical Finance Seminar
University of Chicago
April 6, 2007
References
MacLean, L C and W T Ziemba (2006) The Kelly criterion: theory and practice
Thorp, E O (2006) The Kelly criterion in blackjack, sports betting and the stock market, in S A Zenios and W T Ziemba, eds., Handbook of Asset and Liability Management, Volume A: Theory and Methodology, North Holland.
MacLean, Sanegre, Zhao, Ziemba (2004) Capital growth with security, Journal of Economic Dynamics and Control. How to calculate the optimal Kelly fraction subject to being above a wealth path with high probability.
Ziemba, W T (2005) The symmetric downside Sharpe ratio and the evaluation of great investors and speculators, Journal of Portfolio Management (Fall).
Chapter 6 of W T Ziemba (2003) The Stochastic Programming Approach to Asset Liability and Wealth Management, AIMR, updated into various chapters in Ziemba and Ziemba (2007), Scenarios for Risk Management and Global Investment Strategies, Wiley, July, which is Wilmott columns merged into a book.
Samuelson, P A and W T Ziemba (2007) Understanding the finite properties of Kelly log betting: a tale of five investors, Tech Report, UBC
2
Abstract
•
The Kelly or capital growth criterion maximizes the expected logarithm of wealth period by period; that is, it uses log as its utility function.
•
It has many desirable properties, such as being myopic, in that today’s optimal decision does not depend upon yesterday’s or tomorrow’s data.
•
It asymptotically maximizes long run wealth almost surely and it attains arbitrarily large wealth goals faster than any other strategy.
•
Also, in an economy with one log bettor and all other bettors using essentially different strategies, the log bettor will eventually get all the economy’s wealth.
•
The drawback of log with its essentially zero Arrow-Pratt absolute risk aversion is that in
the short run it is the most risky utility function one would ever consider.
•
Since there is essentially no risk aversion, the wagers it suggests are very large and
typically undiversified.
•
Simulations show that log bettors have much more final wealth most of the time than those
using other strategies but can essentially go bankrupt a small percentage of the time, even
facing very favorable investment choices.
•
One way to modify the growth-security profile is to use either ad hoc or scientifically computed fractional Kelly strategies that blend the log optimal portfolio with cash, either to keep one above the highest possible wealth path with high probability or to risk adjust the wealth with convex penalties for being below the path.
3
Abstract (cont’d)
•
For log normally distributed assets this simply means using a negative power utility
function whose risk aversion coefficient is 1:1 determined by the fraction and vice versa.
•
For other asset returns this is an approximate solution.
•
Thus one moves the risk aversion away from zero to a higher level.
•
This results in a smoother wealth path but usually has less growth.
•
This talk is a review of the good and bad properties of the Kelly and fractional Kelly
strategies and a discussion of their use in practice by great investors and speculators
most of whom have become centi-millionaires or billionaires by isolating profitable
anomalies and betting on them well with these strategies.
•
The latter include Bill Benter, the Hong Kong racing guru, and Ed Thorp, the inventor of blackjack card counting, who compiled one of the finest hedge fund records.
•
Both of these gamblers had very smooth, low variance wealth paths.
•
Additionally legendary investors such as John Maynard Keynes (0.8 Kelly) running the
King’s College Cambridge endowment, George Soros (? Kelly) running the Quantum
funds and Warren Buffett (full Kelly) running Berkshire Hathaway had similarly good
results but had much more variable wealth paths.
•
The difference seems to be in the choice of fraction and other risk control measures that
relate to true diversification and position size relative to liquid assets under management.
4
Success in investments has two key pillars:
•
devising a strategy with positive expectation and
•
betting the right amount to balance growth of one’s fortune against the risk of
losses.
This talk discusses the Kelly or capital growth log utility criterion for investing,
a strategy which has wonderful asymptotic long run properties:
•
the log bettor will dominate other strategies with probability one and
•
accumulate an unbounded amount more wealth.
But in the short run the strategy can be very risky since it has very low Arrow-Pratt risk
aversion.
5
Fractional Kelly strategies provide more security but with less growth.
•
Examples from blackjack, horseracing, lotteries and futures trading illustrate the theory and its use in practice.
•
I have been fortunate to work/consult with seven individuals who turned a humble beginning with essentially zero wealth into hundreds of millions (at least five are billionaires) using security market imperfections and anomalies in racing, futures trading and options mispricings.
•
Once they reach 200-300 million, their utility often moves from log to linear: they bet on anything with a “positive expectation” as long as they diversify, and they move their wealth into the best hedge and alternative investment funds.
•
All of them used Kelly or fractional Kelly betting strategies.
6
Some points to learn from this research
• Means are by far the most important aspect of any portfolio problem.
• You must have the mean right to have good performance.
• If you have the mean right and do not overbet you should do well.
• In levered bets, it’s the left tail that can lead to trouble so you must not overbet or you can
have a large disaster occurring without warning.
• Behavioral and other anomalies can yield strategies that have positive means.
• These biases yield ideas that yield profitable positive mean strategies in racing, sports
betting and options markets.
• The capital growth or Kelly criterion strategy yields the most wealth in the long run and
dominates all other essentially different strategies.
• But in the short run, the expected log criterion with its essentially zero Arrow-Pratt risk
aversion index is very risky and can have substantial losses.
• The most you should ever bet is the log optimal amount; betting more is suboptimal and
betting double yields a zero growth rate.
• Negative power utility, which blends cash with the expected log maximizing portfolio
provides more security but has less long run growth.
• These fractional Kelly strategies are attractive for many investment situations;
which fraction to use can be determined with constrained optimization models.
7
Growth versus Security: Tradeoffs in Dynamic Investment Analysis
• One is faced with a sequence of investments in periods 1, …, n
some favorable, some unfavorable
• Given an initial fortune, how should one invest over time to have long-run
growth of their fortune while at the same time maintaining its security?
• Develop computational schemes so that the investor can have a desired
growth and security tradeoff.
• Find simple operational policies that achieve these tradeoffs
• Transactions costs are crucial in practice so stochastic programming is
needed
• Use results to analyze favorable investment situations
8
POLAR APPROACHES
, Ziemba (2003)
Markowitz (1976), Hausch, Ziemba and Rubinstein (1981,
1985), Luenberger (1993) and others below
9
Laplace (17xy) and others including Bhulmann
10
If you bet on a horse, that’s gambling.
If you bet you can make three spades, that’s entertainment.
If you bet cotton will go up three points, that’s business.
See the difference?
Blackie Sherrod
11
Games: favorable or unfavorable
•
Blend growth versus security to your risk tolerance and the situation at hand
12
Effect of data input errors on portfolio performance
13
It’s the means that are the most important for investment success
14
Mean Percentage Cash Equivalent Loss Due to Errors in Inputs
Risk tolerance: $t = \dfrac{100}{\tfrac{1}{2} R_A}$
Pension funds 60-40 mix, RA=4 (Kallberg-Ziemba, 1983, Management Science)
Conclusion: spend your money getting good mean estimates and use historical variances and
covariances
Reference: Chopra and Ziemba (1993), Journal of Portfolio Management, reprinted in Ziemba-Mulvey (1998), Worldwide Asset and Liability Management, Cambridge University Press
Results similar in period 1 of multiple period models and the sensitivity is especially high in
continuous time models. See examples in AIMR, 2003.
15
The results here apply to essentially all models. You must get the means
right to win!
Optimal asset weights at stage 1 for varying levels of US equity means in a multiperiod stochastic programming pension fund model for Siemens Austria: see Geyer and Ziemba (2007, Operations Research)
• μ’s: euro equities (.9 US)
• US bonds 7.2, σ=11.3
• Euro bonds 6.8, σ=3.7
16
If the mean return for US stocks is assumed equal to the long run mean of 12% estimated by Dimson et al. (2002, 2006), the model yields an optimal weight for equities of 100%.
A mean return for US stocks of 9% yields an optimal weight for equities of less than 30%.
This is in a five period ten year stochastic programming model.
The sensitivity to the mean is much less in periods 2, …, T
17
Asset proportions: not practical: bonds vs stocks vs T-bill futures
18
The Symmetric Downside-Risk Sharpe Ratio
•
The Sharpe ratio is a very useful measure of investment performance.
•
However, it is based on mean-variance theory and thus is basically valid only for
quadratic preferences or normal distributions.
•
Hence skewed investment returns can lead to misleading conclusions.
•
This is especially true for superior investors such as Warren Buffett and others
with a large number of high returns.
•
Many of these superior investors use capital growth wagering ideas to implement
their strategies which leads to higher growth rates but also higher variability of
wealth.
•
A simple modification of the Sharpe ratio to assume that the upside deviation is
identical to the downside risk provides a useful modification that gives more
realistic results.
19
Using the Sharpe ratio
20
Using a modified Sharpe ratio that does not penalize gains
Summary over funds of negative observations and arithmetic and geometric means
21
The symmetric downside Sharpe ratio performance measure
•
We want to determine, in some fair way, if Warren Buffett really is a better investor than the rather good but lesser funds mentioned here, especially the Ford Foundation and the Harvard endowment.
•
The idea is presented in a figure where we have plotted the Berkshire Hathaway and Ford Foundation monthly returns as a histogram and show the losing months and the winning months in a smooth curve. We want to penalize Warren for losing but not for winning, so we define the downside risk using only the losing months.
•
This is the downside variance measured from zero, not the mean, so it is more
precisely the downside risk.
•
To get the total variance we use twice the downside variance
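A minimal sketch of this measure, under stated assumptions: the downside variance sums squared returns below zero and divides by n - 1, where n counts all observations (the divisor is an assumption of this sketch), the total variance is twice that, and the ratio is the mean excess return over the square root of the total variance. The function names and sample returns are hypothetical, and no annualization is applied.

```python
import math
import statistics

def downside_variance(returns, benchmark=0.0):
    """Downside variance measured from `benchmark` (zero here), using only losing periods.

    Dividing by n - 1, with n the number of all observations, is an assumption.
    """
    n = len(returns)
    return sum((x - benchmark) ** 2 for x in returns if x < benchmark) / (n - 1)

def symmetric_downside_sharpe(returns, risk_free=0.0):
    """(mean return - risk free rate) / sqrt(2 * downside variance)."""
    total_variance = 2.0 * downside_variance(returns)
    if total_variance == 0.0:
        return math.inf  # no losing periods in the sample
    return (statistics.mean(returns) - risk_free) / math.sqrt(total_variance)

def ordinary_sharpe(returns, risk_free=0.0):
    """Ordinary Sharpe ratio, which penalizes upside and downside deviations alike."""
    return (statistics.mean(returns) - risk_free) / statistics.stdev(returns)

# Hypothetical, positively skewed sample: a few small losses and one large gain.
sample = [0.10, -0.02, 0.05, -0.04, 0.30]
print(ordinary_sharpe(sample), symmetric_downside_sharpe(sample))
```

On a positively skewed sample like this one, the large gain inflates the ordinary standard deviation but leaves the downside variance untouched, so the symmetric downside measure is the higher of the two.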
22
The wealth levels from December 1985 to April 2000 for the Windsor Fund of
John Neff, the Ford Foundation, the Tiger Fund of Julian Robertson, the
Quantum Fund of George Soros and Berkshire Hathaway, the fund run by
Warren Buffett, as well as the S&P500 total return index.
23
Ford Foundation and Harvard Investment Corporation
Returns, quarterly data, June 1977 to March 2000
24
Comparison of ordinary and symmetric downside Sharpe yearly
performance measures
Only Buffett improves, but he still does not beat the Ford Foundation; with the quarterly data, Harvard is also better than Buffett but not better than Ford.
Why? The tails are still too fat.
Thorp (2006) shows that Buffett is essentially a full Kelly bettor.
25
Berkshire Hathaway versus Ford Foundation, monthly returns
distribution, January 1977 to April 2000
26
Return distributions of all the funds, quarterly returns
distribution, December 1985 to March 2000
27
The Chest Fund, 1927-1945 (Keynes)
$u(w) = -w^{-0.25}$ (80% Kelly, 20% cash), see Ziemba (2003)
28
Gamblers like smooth wealth paths using fractional Kelly strategies
29
Princeton Newport Partners, LP, cumulative results, Nov 1968-Dec 1998
(Thorp)
DSSR=13.8
30
PNP: 15.1% net vs 10.2% for the S&P500
31
Log Utility
•
In the theory of optimal investment over time, it is not quadratic (the utility function behind the Sharpe ratio) but log that yields the most long term growth.
•
But the elegant results on the Kelly (1956) criterion, as it is known in the gambling literature, or the capital growth theory, as it is known in the investments literature (see the surveys by Hakansson and Ziemba (1995) and MacLean and Ziemba (2006)), which were proved rigorously by Breiman (1961) and generalized by Algoet and Cover (1988), are long run asymptotic results.
•
However, the Arrow-Pratt absolute risk aversion index of the log utility criterion, $R_A(w) = -u''(w)/u'(w) = 1/w$, is essentially zero, where u is the utility function of wealth w and primes denote differentiation.
•
Hence, in the short run, log can be an exceedingly risky utility function with wide
swings in wealth values.
32
Long run exponential growth is equivalent to maximizing the expected
log of one period’s returns
33
• Thus the criterion of maximizing the long run exponential rate of
asset growth is equivalent to maximizing the one period expected
logarithm of wealth.
So an optimal policy is myopic.
• Max G(f) = p log(1+f) + q log(1-f) ⇒ f* = p - q
• The optimal fraction to bet is the edge p-q
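As a small numerical check of this result, assuming an even-money bet won with probability p, the growth rate G(f) can be evaluated on a grid and its maximizer compared with the edge p - q. The function names and the value of p are illustrative, not from the talk.

```python
import math

def growth_rate(f, p):
    """Expected log growth per even-money bet when a fraction f of wealth is wagered."""
    q = 1.0 - p
    return p * math.log(1.0 + f) + q * math.log(1.0 - f)

def kelly_fraction(p):
    """Closed-form maximizer of growth_rate for an even-money bet: the edge p - q."""
    return p - (1.0 - p)

def grid_argmax(p, steps=10_000):
    """Brute-force maximizer of G(f) over f in [0, 1), as a sanity check."""
    grid = [i / steps for i in range(steps)]
    return max(grid, key=lambda f: growth_rate(f, p))

p = 0.57  # hypothetical win probability, giving a 14% edge
print(kelly_fraction(p), grid_argmax(p))
```

Because G is concave in f, the grid search and the closed form agree up to the grid spacing.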
34
Slew O’ Gold, 1984 Breeders’ Cup Classic
f* = 64% for place/show; suggests fractional Kelly.
35
36
Classic Breiman Results
37
38
Kelly and half Kelly medium time simulations: Ziemba-Hausch (1986)
These were independent
39
The good, the bad and the ugly
In 166 of the simulations final wealth is more than 100 times initial wealth with full Kelly, but only once with half Kelly.
But the probability of being ahead is higher with half Kelly: 95.4% vs 87% with full Kelly.
Minimum wealth is 18 with full Kelly and 145 with half Kelly.
Even with 700 independent bets, each with a 14% edge, you can still lose over 98% of your fortune in bad scenarios.
With half Kelly, you lose half of your wealth only 1% of the time, versus 8.40% with full Kelly.
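A rough simulation in the spirit of this comparison can be sketched as below. It is not the Ziemba-Hausch experiment: it assumes a single even-money bet type with win probability 0.57 (a 14% edge), 1000 paths, and an initial wealth of 1000, all of which are choices made for this sketch. It also looks only at final wealth, not at the minimum along each path.

```python
import math
import random
import statistics

def simulate_final_wealth(fraction, p=0.57, n_bets=700, n_paths=1000, w0=1000.0, seed=0):
    """Final wealth on each path when a fixed fraction of wealth is bet at even money."""
    rng = random.Random(seed)  # same seed => same win/loss sequences for every fraction
    finals = []
    for _ in range(n_paths):
        wins = sum(rng.random() < p for _ in range(n_bets))
        finals.append(w0 * (1.0 + fraction) ** wins * (1.0 - fraction) ** (n_bets - wins))
    return finals

def summarize(finals, w0=1000.0):
    """A few of the statistics discussed on the slide, computed on final wealth."""
    return {
        "min": min(finals),
        "median": statistics.median(finals),
        "share_ahead": sum(w > w0 for w in finals) / len(finals),
        "share_over_100x": sum(w > 100 * w0 for w in finals) / len(finals),
    }

kelly = 0.57 - 0.43  # edge p - q for the assumed even-money bet
full = simulate_final_wealth(kelly)
half = simulate_final_wealth(kelly / 2)
print("full Kelly:", summarize(full))
print("half Kelly:", summarize(half))
```

Reusing the seed pairs the two strategies on identical win and loss sequences, so differences in the summaries come from the bet size alone.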
40
Kentucky Derby 1934-1998
• Use inefficient market system in Hausch, Ziemba, Rubinstein (1981) and Ziemba-Hausch books
• Place/show wagers made when prices off sufficiently and EX≥ 1.10
w0 = $2500
63 years 72 wagers with 45 (62.5%) successful
41
Typical wealth level histories with one scenario (the actual results)
from place and show betting (Dr Z system) on the Kentucky Derby,
1934-1994 with Kelly, half Kelly and betting on the favorite strategies
42
Overbetting
Probability of doubling and quadrupling before halving and relative growth rates versus fraction of wealth wagered for Blackjack (2% advantage, p=0.51 and q=0.49)
Should you ever bet above 0.02? That corresponds to positive power utility.
I think it’s dominated!
Betting more than the Kelly bet is non-optimal as risk increases and growth decreases;
betting double the Kelly leads to a growth rate of zero plus the riskfree asset.
LTCM was at this level or more, see AIMR, 2003.
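The shape of these curves can be sketched under a continuous-time (Brownian motion) approximation, which is an assumption added here rather than the calculation behind the slide. Betting c times the Kelly fraction then gives log wealth a drift proportional to c - c²/2 and a variance proportional to c², so the relative growth rate is c(2 - c) and the chance of doubling before halving follows from the standard two-barrier hitting formula. The function names are illustrative.

```python
import math

def relative_growth(c):
    """Growth rate at c times Kelly, relative to full Kelly (continuous-time approximation)."""
    return c * (2.0 - c)

def prob_up_before_down(c, up=2.0, down=0.5):
    """Probability that wealth is multiplied by `up` before `down` when betting c times Kelly.

    Log wealth is modelled as Brownian motion with drift (c - c^2/2) s^2 and
    variance c^2 s^2, so 2 * drift / variance = 2/c - 1 does not depend on s.
    """
    if c <= 0:
        raise ValueError("c must be positive")
    theta = 2.0 / c - 1.0
    if abs(theta) < 1e-12:  # driftless case (c = 2): plain gambler's ruin on log wealth
        return math.log(1.0 / down) / math.log(up / down)
    return (1.0 - down ** theta) / (1.0 - (down / up) ** theta)

for c in (0.25, 0.5, 1.0, 1.5, 2.0):
    print(c, relative_growth(c), prob_up_before_down(c), prob_up_before_down(c, up=4.0))
```

Under this approximation half Kelly keeps 75% of the full Kelly growth rate while raising the chance of doubling before halving from 2/3 to 8/9, and double Kelly has zero growth with an even chance of doubling before halving.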
43
Growth Rates Versus Probability of Doubling Before Halving for Blackjack
44
Fractional Kelly and negative power utility
$u(w) = -w^{\alpha}$, $\alpha < 0$; as $\alpha \to 0$, $u \to \log$
$f = 1/(1-\alpha)$ = fraction (Kelly) in log optimal portfolio, rest in cash
α = 0, f = 1: full Kelly
α = -1, f = 1/2: 1/2 Kelly
α = -3, f = 1/4: 1/4 Kelly (futures trading down here)
This is exact with lognormality and approximate otherwise.
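The mapping between the power utility exponent and the Kelly fraction is simple enough to write down directly. In this sketch the exponent is called `alpha` (a name chosen here) and the relation f = 1/(1 - alpha) is taken from the slide. The function names are hypothetical.

```python
def kelly_fraction_from_alpha(alpha):
    """Fraction of the log optimal portfolio held under power utility with exponent alpha.

    alpha = 0 is log utility (full Kelly), alpha < 0 gives a fractional Kelly
    strategy, and 0 < alpha < 1 (positive power utility) gives f > 1, i.e. overbetting.
    """
    if alpha >= 1.0:
        raise ValueError("alpha must be below 1")
    return 1.0 / (1.0 - alpha)

def alpha_from_kelly_fraction(f):
    """Inverse mapping: the exponent whose fractional Kelly strategy uses fraction f."""
    if f <= 0.0:
        raise ValueError("f must be positive")
    return 1.0 - 1.0 / f

for alpha in (0.0, -0.25, -1.0, -3.0):
    print(alpha, kelly_fraction_from_alpha(alpha))
```

For example, the exponent -0.25 quoted for the Keynes Chest Fund corresponds to 80% Kelly and 20% cash.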
45
Samuelson’s critique of Kelly betting
46
Commodity Trading: Turn of Year Effect
Small cap stocks have outperformed large cap stocks in January on a regular basis
since 1926
Average excess returns of smallest minus largest decile of US stocks, 1926-93, Source:
Ibbotson Associates
47
[Figure: Value Line minus S&P 500 spread, cash (VL-S&P) and futures (VL-S&P) series with buy and sell points; x-axis: trading days from December to January]
1993/1994 Turn of the Year
Futures play with anticipation, mid December to mid January, this is a typical
year in the mid 90s, Value Line versus S&P, 1992-3
48
Turn of the year effect
Relative growth rate and
probability of doubling,
tripling or tenfolding before
halving for various Kelly
strategies
Probability of reaching $10
million before ruin for Kelly, half
Kelly and quarter Kelly strategies
49
Turn of the year effect, recent developments
Futures markets - much more violent
Russell 2000 - has more volume than Value Line
Effect moved into December
Textbooks and finance experts say effect is not there
Graphs in Hensel-Ziemba paper in Keim-Ziemba (2000) Worldwide security market
imperfections, Cambridge University Press.
Doing this trade is like driving a dynamite truck smoking a cigar. You do it carefully.
Rendon-Ziemba (2005) update to 2005 turn of the year Value Line/S&P500 and Russell
2000/S&P500 spread trades
50
Unpopular numbers in the Canadian 6/49, 1984, 1986, and 1996 Lotto
51
Lotto games, experimental data
52
Probability of doubling, quadrupling and tenfolding before halving, Lotto 6/49
Case A
Case B
53
Probability of reaching the goal of $10 million before falling to $25,000 with
various initial wealth levels for Kelly, 1/2 Kelly and 1/4 Kelly wagering strategies
The downside of the analysis is that the expected time to win a lot is in the millions of
years.
54
The Investors
•
Tom, I believe, is overbetting and dominated and will go bankrupt
•
Harriet has a limited degree of risk tolerance, fits well with lots of
empirical Wall St equity premium data
55
Some tests
56
Horseracing
• market in miniature
• fundamental and technical systems
• returns and odds are determined by
1) participants -- like stock market, unlike roulette
2) transaction costs -- track take (17%), breakage;
rebates now plus Betfair (long short)
• bet to
1) win -- must be 1st
2) place -- must be 1st or 2nd
3) show -- must be 1st, 2nd or 3rd
68
Place market in horseracing
Inefficiencies are possible since:
1) more complex wager
2) prob(horse places) > prob(horse wins) ==> favorites may be good bets
To investigate place bets we need:
1) determine place payoffs
2) their likelihood
3) expected place payoffs
4) betting strategy, if expected payoffs are positive
Bettors do not like place and show bets.
69
The Idea
1. Use data in a simple market (win) to generate probabilities of outcomes
2. Then use those in a complex market (place and show) to find positive expectation
bets
3. Then bet on them following the capital growth theory to maximize long run wealth
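Steps 1 and 2 can be sketched as follows. The talk does not spell out the formulas, so this sketch assumes that each horse's win probability equals its share of the win pool and that order of finish follows Harville-type formulas, where the probability that i wins and j is second is q_i q_j / (1 - q_i). The function names and pool amounts are hypothetical.

```python
def win_probabilities(win_bets):
    """Win probability of each horse, taken as its share of the win pool (an assumption)."""
    total = sum(win_bets)
    return [b / total for b in win_bets]

def place_probabilities(q):
    """Harville-style probability that each horse finishes first or second."""
    n = len(q)
    place = []
    for i in range(n):
        # Horse i is second behind some other horse j with probability q_j * q_i / (1 - q_j).
        second = sum(q[j] * q[i] / (1.0 - q[j]) for j in range(n) if j != i)
        place.append(q[i] + second)
    return place

win_bets = [5000.0, 3000.0, 2000.0]  # hypothetical amounts bet to win on three horses
q = win_probabilities(win_bets)
print(place_probabilities(q))
```

Comparing these probabilities with the payoffs implied by the place pool is what identifies the positive expectation bets in step 2.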
70
Effect of transactions costs, calculation of optimal place and show Kelly bets
Non concave program but it seems to converge.
In practice, adjust q’s to replicate biases.
71
72
Use in a calculator
What we do in the system is to reduce the non-convex log optimization problem
down to four numbers: W_i, W and S_i, S (or P_i, P).
Regressions on thousands of race results give the expected value and the optimal Kelly bet
as functions of these four variables.
Hence, you just find horses where the relative amount bet to place or show is
below the bet in the win pool.
The calculator tells you when the expected value is say 1.10 or better and
calculates the optimal Kelly bet.
So this can be done in say 15 seconds.
73
Exhibition Park, 1978, typical returns.
74
Aqueduct, 1981-82
75
Expected value approximation equations
$\text{Ex Place}_i = 0.319 + 0.559 \, \dfrac{w_i / w}{p_i / p}$
$\text{Ex Show}_i = 0.543 + 0.369 \, \dfrac{w_i / w}{s_i / s}$
•
Expected value (and optimal wager) are functions of only four numbers - the
totals and the horse in question.
•
These equations approximate the full optimized optimal growth model.
•
Solving the complex NLP: too much work and too much data for most
people.
•
This is used in the calculators; see Hausch-Ziemba (1985, Management
Science) for differing track take, etc.
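The two approximation equations translate directly into code. The coefficients 0.319, 0.559, 0.543 and 0.369 and the 1.10 cutoff come from the slides. The function names and the pool amounts in the example are hypothetical, and the sketch covers only the expected value screen, not the optimal Kelly bet size.

```python
def expected_place(w_i, w, p_i, p):
    """Approximate expected value per dollar of a place bet on horse i.

    w_i, w: amount bet to win on horse i and the total win pool.
    p_i, p: amount bet to place on horse i and the total place pool.
    """
    return 0.319 + 0.559 * (w_i / w) / (p_i / p)

def expected_show(w_i, w, s_i, s):
    """Approximate expected value per dollar of a show bet on horse i."""
    return 0.543 + 0.369 * (w_i / w) / (s_i / s)

def is_candidate(expected_value, cutoff=1.10):
    """Screen used on the slides: bet only when expected value is 1.10 or better."""
    return expected_value >= cutoff

# Hypothetical pools: the horse has 40% of the win pool but only 20% of the place pool.
ex = expected_place(w_i=40_000, w=100_000, p_i=10_000, p=50_000)
print(ex, is_candidate(ex))
```

When a horse's share of the place or show pool matches its share of the win pool, the approximations give values below 1, so only horses that are underbet to place or show pass the screen.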
76
1983 Kentucky Derby
77
1991 Breeders’ Cup Race 5
78
Simulations in 2004-5
Real results April 2005-March 2006
Up ~ 36,000 ~ 2% on bets ~ 1.5 M,
System -7%, rebate ~ 9%, edge ~ +2%
79
Calculating the optimal Kelly fraction
Kelly fractions and path achievement
To stay above a wealth path using a Kelly strategy is very difficult:
• the more attractive the investment opportunity,
• the larger the bet size and
• hence the larger is the chance of falling below the path.
MSZZ, using a continuous time lognormally distributed asset model, calculate
the fraction needed to stay above a path at various points in time with a high
exogenously specified value at risk probability.
Convex case like Geyer-Ziemba (2007) Siemens Vienna pension model: can
do on a computer; will develop the math
80
The planning horizon is T=3, with 64 scenarios each with probability 1/64
81
With initial wealth W(1)=1, the value at risk is a. The optimal investment decisions and
optimal growth rate for a, the secured average annual growth rate and 1-a, the security
level are shown in the table.
82
Guide to Capital Growth Theory and Kelly Criterion Literature
1956  Kelly: heuristic paper, original idea (Latane, 1957, also)
1961  Breiman: original correct proofs
1969  Thorp: original application to sports betting
1981  Hausch-Ziemba-Rubinstein: application to place and show system, books later (1984, 86, 87)
1988  Algoet and Cover: most general proofs
1994  Hausch-Lo-Ziemba: reprints many key articles
1995  Hakansson & Ziemba: survey, finance viewpoint, in Finance Handbook; reviews Hakansson’s work
1998  Janacek: MSc thesis, Charles Univ., creative student
1998  Thorp: brilliant math analysis
1999  MacLean-Ziemba: fractional Kelly examples (series of papers 1986+)
2002* MacLean-Ziemba: theory of targets rather than time, Time to Wealth
2003  Ziemba: AIMR, more simply written, many references
2004  MacLean-Ziemba et al: how to calculate the “optimal” fractional Kelly; controversial constrained optimization, JEDC
87
Some properties of the Capital Growth Theory
88
References
Essentially all of the material in this talk is in the following books plus the papers
handed out
Ziemba, The Stochastic Programming Approach to Asset Liability Management, AIMR,
2003
Ziemba-Hausch, Dr Z’s Beat the Racetrack, William Morrow, 1987 (has UK betting
system)
Hausch-Lo-Ziemba, Efficiency of Racetrack Betting Systems, Academic Press, 1994.
Classic new and reprinted articles, bible for Hong Kong professional betting teams.
Originals sell for huge prices as high as $12,000 I am told, I sold one for $1400 last
week.
Ziemba-Vickson, Stochastic Optimization Models in Finance, Academic Press, 1975.
Classic articles, new articles, huge collection of portfolio theory, problems.
Reprinted by World Scientific, Singapore, 2006.
Ziemba et al, 6/49 Lotto Guidebook, 1986
Ziemba-Hausch, Betting at the Racetrack, 1986, exotic bet pricing
Books all available, [email protected] for information.
94