Financial Noise and Overconfidence
Ubiquitous News and Advice
 CNBC
 WSJ
 Jim Cramer
 Assorted talking heads
 How good can it be?
Thought Experiment
 Listen to CNBC every day
 Choose ten stocks from recommendations
 Buy stocks
 How good can this strategy be?
Thought Experiment
 10 stocks x 260 business days = 2,600 bets
 Assume the bets are independent
 How good does the information have to be such that the probability
of losing money is 1 in 10,000?
Detour: Gaussian Distribution
 Random Variable X depends upon two parameters:
‣ mean value: μ
‣ uncertainty (standard deviation): σ

$$\mathrm{Prob}(a \le X \le b) = \frac{1}{\sqrt{2\pi\sigma^2}} \int_a^b \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)\, dx$$
Detour: Gaussian Distribution
 Universal “Bell Curve” Distribution
Prob(X < 0) depends only on the ratio μ / σ:
$$\mathrm{Prob}(X < 0) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-\mu/\sigma} \exp\!\left(-\frac{x^2}{2}\right)\, dx$$
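A quick numerical check of this claim (a minimal sketch using SciPy's normal CDF; the (μ, σ) pairs are arbitrary illustrations, not figures from the talk):

```python
from scipy.stats import norm

# For a Gaussian with mean mu and standard deviation sigma,
# Prob(X < 0) = norm.cdf(0, loc=mu, scale=sigma) = norm.cdf(-mu / sigma).
for mu, sigma in [(1.0, 2.0), (10.0, 20.0), (0.5, 5.0)]:
    print(mu / sigma, norm.cdf(0, loc=mu, scale=sigma))

# The first two cases share mu/sigma = 0.5 and print the same probability
# (~0.31); only the ratio matters, not mu or sigma separately.
```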
The Central Limit Theorem
 {Xi}i = 1..N a set of r.v.:
‣ Identically distributed
‣ Independent
‣ All moments are finite
 If X = x1 + x2 + … + xN then:
‣ X has a Gaussian distribution with:
• μ = N * average(x1)
• σ = √N * stddev(x1)

$$\mathrm{Prob}(a \le X \le b) = \frac{1}{\sqrt{2\pi\sigma^2}} \int_a^b \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)\, dx$$
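A small simulation illustrating these scaling rules (a sketch; the uniform distribution, seed, and trial count are arbitrary choices, with N = 2,600 taken from the thought experiment):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 2600          # number of i.i.d. terms, as in the thought experiment
trials = 20_000   # number of simulated sums

# Simulate X = x_1 + ... + x_N with each x_i uniform on (-1, 1).
sums = np.array([rng.uniform(-1, 1, size=N).sum() for _ in range(trials)])

# CLT scaling: mean(X) ~ N * mean(x_i) = 0, stddev(X) ~ sqrt(N) * stddev(x_i).
print(sums.mean())                 # close to 0
print(sums.std(), np.sqrt(N / 3))  # uniform(-1, 1) has variance 1/3; both ~29.4
```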
The Central Limit Theorem
 X = x1 + x2 + … + xN is a Gaussian r.v. with:
‣ mean(X) = μ = N * average(x1)
‣ stddev(X) = σ = √N * stddev(x1)
 For a Gaussian r.v., the probability that it is greater than zero is just a function of the ratio of the mean to the stddev: Prob(X > 0) = f(μ / σ)
In our experiment, our annual profit/loss will have a ratio of mean to stddev √2,600 ≈ 50 times greater than that of a single stock bet. This is the advantage of aggregation.
How Good is the Information?
Assume Prob(Xannual < 0) ≈ 1 / 10,000
 mean(Xannual) ≈ 3.7 * stddev(Xannual)
 mean(Xind) ≈ (3.7 / 50) * stddev(Xind) ≈ 0.07 * stddev(Xind)
 Prob(Xind < 0) ≈ 47%
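These numbers can be reproduced directly (a minimal sketch using SciPy; the 2,600 bets and the 1-in-10,000 target come from the thought experiment above):

```python
from math import sqrt
from scipy.stats import norm

n_bets = 10 * 260                    # 10 stocks x 260 business days
z_annual = norm.isf(1 / 10_000)      # mean/stddev ratio needed for a 1-in-10,000 loss

# Aggregating n independent bets multiplies the mean/stddev ratio by sqrt(n),
# so each individual bet only needs a ratio sqrt(n) times smaller.
z_per_bet = z_annual / sqrt(n_bets)

print(z_annual)               # ~3.7
print(z_per_bet)              # ~0.07
print(norm.cdf(-z_per_bet))   # ~0.47: each bet can lose ~47% of the time
```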
Probability That Advice Is Right?
 Any given piece of advice can’t be much more than 53% reliable,
compared to 50% for a random flip of a coin
 Otherwise, we could make a strategy that almost never loses
money by just aggregating all the different pieces of advice
 Contrast “my price target for this stock is 102 by mid-November”
with “I think it’s 53–47% that this goes up vs. down”
 Precision here greatly exceeds possible accuracy
 Overconfidence
 No error bars
Probability That Advice Is Right?
 If we want Prob(Xannual < 0) ≈ 1 / 1,000,000 then:
‣ Prob(Xind < 0) ≈ 46%, instead of 47%
 A computer algorithm (or team of analysts) predicting something every day for all ≈ 2,000 U.S. stocks, instead of just 10 stocks, would need only Prob(Xind < 0) ≈ 49.75% to get Prob(Xannual < 0) to 1 in 1,000,000 (see the sketch below)!
 The existence of such a strategy seems unlikely
 Statements expressed with certainty about the market need to be
viewed with skepticism
 Low signal-to-noise ratio
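The sketch referenced above, repeating the same calculation for the 1-in-1,000,000 target and for a universe of roughly 2,000 stocks:

```python
from math import sqrt
from scipy.stats import norm

z_annual = norm.isf(1 / 1_000_000)   # ~4.75

# 10 stocks, one bet per business day:
print(norm.cdf(-z_annual / sqrt(10 * 260)))      # ~0.46

# ~2,000 U.S. stocks, one bet per business day:
print(norm.cdf(-z_annual / sqrt(2_000 * 260)))   # ~0.497, i.e. ~49.75%
```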
Where Are Our Assumptions Wrong?
 Transaction costs matter
 Individual stock bets almost certainly not independent
By placing bets on linear combinations of individual stocks, one can partially overcome correlations between bets, but this reduces the effective number of bets.
 These aren't just technical details; they can make the difference between making and losing money.
Honesty Is The Best Policy
Despite these caveats, the big picture is correct
‣ Only small effects exist
‣ The data is very noisy
‣ One can find statistical significance only by aggregating lots of data
 As a result there is great value in:
‣ Clean data
‣ Quantitative sophistication
‣ Discipline
‣ Honesty
Overconfidence and Overfitting
 Humans are good at learning patterns when:
‣ Patterns are persistent
‣ Signal-to-noise ratio isn’t too bad
 Examples include: language acquisition, human motor control,
physics, chemistry, biology, etc.
 Humans are bad at finding weak patterns that come and go as
markets evolve
 They tend to overestimate their abilities to find patterns
Further Consequences of Aggregation
If the profit/loss distribution is Gaussian, then only the mean and the standard deviation matter
 Consider 3 strategies:
‣ Invest all your money in the S&P 500
‣ Invest all your money in the Nasdaq
‣ 50% in the Nasdaq and 50% in short-term U.S. Government
Bonds
 Over the past 40 years, these strategies returned approximately:
‣ 7% +/- 20%
‣ 8% +/- 35%
‣ 5.5% +/- 17.5%
Diversifying Risk
 Imagine we can lend or borrow at some risk-free rate r
 Suppose we have an amount of capital c, which we distribute:
‣ X in S&P 500
‣ Y in Nasdaq
‣ Z in “cash” (where c = X + Y + Z)
 Our investment returns are:
‣ rP = X*rSP + Y*rND + Z*r
‣ rP = X*(rSP – r) + Y*(rND – r) + c*r
‣ E(rP) = X*E(rSP – r) + Y*E(rND – r) + c*r
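The second line follows from the first because Z = c − X − Y; a quick numerical check (all numbers below are illustrative, not figures from the talk):

```python
# Verify X*rSP + Y*rND + Z*r == X*(rSP - r) + Y*(rND - r) + c*r when Z = c - X - Y.
c, X, Y = 1.0, 0.6, 0.3          # illustrative capital split
Z = c - X - Y
rSP, rND, r = 0.07, 0.08, 0.03   # illustrative returns and risk-free rate

lhs = X * rSP + Y * rND + Z * r
rhs = X * (rSP - r) + Y * (rND - r) + c * r
print(lhs, rhs)                  # both 0.069: the two forms are the same
```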
Consequences of Diversification
 As long as E(rSP) and/or E(rND) exceeds the risk-free rate, r, we
can target any desired return
 The price for high return is high volatility
One measure of a strategy is the ratio of its return above the risk-free rate to its volatility
 Investing everything in Nasdaq gave the best return of the three
strategies in the original list
Assuming the correlation between ND and SP is 0.85, the optimal mixture of investments gives:
Y / X ≈ −0.2
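One way to arrive at this ratio is to maximize excess return per unit of volatility (the measure defined above) and then scale the weights to a chosen risk level. The sketch below does this with the 40-year figures quoted earlier; the helper name best_mix is mine, and the 3% risk-free rate is an assumption (it is not stated on these slides, but it is consistent with the ~10.1% figure on the next slide):

```python
import numpy as np

def best_mix(mu_sp, mu_nd, sig_sp, sig_nd, rho, r, target_vol):
    """Weights (X, Y) maximizing excess return / volatility, scaled to target_vol."""
    excess = np.array([mu_sp - r, mu_nd - r])
    cov = np.array([[sig_sp**2, rho * sig_sp * sig_nd],
                    [rho * sig_sp * sig_nd, sig_nd**2]])
    w = np.linalg.solve(cov, excess)          # optimal direction ~ cov^-1 * excess
    w *= target_vol / np.sqrt(w @ cov @ w)    # lever to the chosen volatility
    return w

# 40-year figures from the earlier slide; r = 3% is an assumption.
X, Y = best_mix(0.07, 0.08, 0.20, 0.35, rho=0.85, r=0.03, target_vol=0.35)
print(Y / X)                                         # ~ -0.2
print(X, Y)                                          # ~ 2.4 and ~ -0.5
print(X * (0.07 - 0.03) + Y * (0.08 - 0.03) + 0.03)  # ~ 0.10, roughly the 10.1% quoted
```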
Consequences of Diversification
For our example, we want Y ≈ −0.2*X, but we get to choose the value of X. We can choose the level of risk!
 If we choose X ≈ 2.4 and Y ≈ −0.5, we get the same risk as investing everything in the Nasdaq, but our return is ≈ 10.1% rather than 8%
 Returns are meaningless if not quoted with:
‣ Volatility
‣ Correlations
 Why does everyone just quote absolute returns?
Past Results Do Not Guarantee Future Results
 If we only look back over the past 20 years, the numbers change:
‣ 3.5% +/- 20% for S&P
‣ 7% +/- 35% for Nasdaq
‣ 5.0% +/- 17.5% for 50% Nasdaq and 50% risk-free
 The same optimal weighting calculation gives X ≈ −2.6 and Y ≈ 1.9, which gives the same risk as Nasdaq but with a return of 9.3%!
 40 years of data suggests that an investor should go long S&P and short Nasdaq, but 20 years of data suggests the opposite. If we used the 40-year weights on the last 20 years, we'd end up making 2%.
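Continuing the sketch above (same assumed 3% risk-free rate), the 20-year figures flip the sign of the weights:

```python
# Re-run best_mix from the earlier sketch with the 20-year figures.
X, Y = best_mix(0.035, 0.07, 0.20, 0.35, rho=0.85, r=0.03, target_vol=0.35)
print(X, Y)                                            # ~ -2.6 and ~ 1.9
print(X * (0.035 - 0.03) + Y * (0.07 - 0.03) + 0.03)   # ~ 0.09, roughly the 9.3% quoted
```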
Application to the Credit Crisis
 Securitization
‣ Bundles of loans meant to behave independently
‣ CDOs are slices of securitized bundles
‣ Rating agencies suggested 1 in 10,000 chance of investments
losing money
‣ All the loans defaulted together, rather than independently!
Credit Models Were Not Robust
 Even if mortgages were independent, the process of securitization
can be very unstable.
 Thought Experiment:
‣ Imagine a mortgage pays out $1 unless it defaults and then
pays $0.
‣ All mortgages are independent and have a default probability
of 5%.
‣ What happens to default probabilities when one bundles
mortgages?
Mortgage Bundling Primer: Tranches
 Combine the payout of 100 mortgages to make 100 new
instruments called “Tranches.”
 Tranche 1 pays $1 if no mortgage defaults.
Tranche 2 pays $1 if at most one mortgage defaults.
 Tranche i pays $1 if the number of defaulted mortgages < i.
 So far all we have done is transformed the cashflow.
 What are the default rates for the new Tranches?
Mortgage Bundling Primer: Tranches
 If each mortgage has default rate p=0.05 then the ith Tranche has
default rate:
$$p_T(i) = \sum_{j=i}^{100} \binom{100}{j}\, p^j (1-p)^{100-j}$$
 Where pT(i) is the default probability for the ith Tranche.
 For the first Tranche pT(1) = 99.4%, i.e., it’s very likely to default.
But for the 10th Tranche pT(10) = 2.8%.
 By the 10th Tranche we have created an instrument that is safer
than the original mortgages.
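These tranche default rates are easy to check (a minimal sketch using SciPy's binomial survival function; n = 100 mortgages and p = 0.05 are the values from the slide):

```python
from scipy.stats import binom

def tranche_default_prob(i, n=100, p=0.05):
    """Probability that at least i of n independent mortgages default."""
    return binom.sf(i - 1, n, p)   # survival function: P(#defaults >= i)

print(tranche_default_prob(1))    # ~0.994: the first tranche almost always defaults
print(tranche_default_prob(10))   # ~0.028: the 10th tranche is safer than a single mortgage
```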
Securitizing Tranches: CDOs
 That was fun, let’s do it again!
 Take 100 type-10 Tranches and bundle them up together using the
same method.
 The default rate for the kth type-10 Tranche is then:
$$p_{CDO}(k) = \sum_{j=k}^{100} \binom{100}{j}\, p_T(10)^j \left(1 - p_T(10)\right)^{100-j}$$
The default probability of the 10th type-10 Tranche is then pCDO(10) ≈ 0.05%, only 1/100 the default probability of the original mortgages!
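Continuing the sketch above, applying the same construction a second time reproduces the re-securitized figure (under the same independence assumption as the slide):

```python
# Securitize 100 type-10 tranches with the same method as before.
p_t10 = tranche_default_prob(10)          # ~0.028, from the previous sketch
p_cdo10 = binom.sf(10 - 1, 100, p_t10)    # 10th tranche of the tranche pool
print(p_cdo10)                            # roughly 0.05%, ~1/100 of the original 5%
```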
Why Did We Do This: Risk-Averse Investors
 After making these manipulations we still only have the same cash
flow from 10,000 mortgages, so why did we do it?
 Some investors will pay a premium for very low risk investments.
 They may even be required by their charter to only invest in
instruments rated very low risk (AAA) by rating agencies.
 They do not have direct access to the mortgage market, so they
can’t mimic securitization themselves.
Are They Really So Safe?
 These results are very sensitive to the assumptions.
If the underlying mortgages actually have a default probability of 6% (a 20% increase), then the 10th Tranches have a default probability of 7.8% (a 275% increase).
 Worse, the 10th type-10 Tranches will have a default probability of
25%, a 50,000% increase!
 These models are not robust to errors in the assumptions!
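The same sketch shows the sensitivity: bumping the underlying default probability from 5% to 6% and repeating both steps gives the numbers quoted above.

```python
# Continuing the sketch above: stress the underlying default rate from 5% to 6%.
p_t10_stressed = tranche_default_prob(10, p=0.06)          # ~0.078, up from ~0.028
p_cdo10_stressed = binom.sf(10 - 1, 100, p_t10_stressed)   # ~0.25, up from ~0.0005
print(p_t10_stressed, p_cdo10_stressed)
```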
Application to the Credit Crisis
 Connections with thought experiment:
‣ Overconfidence (bad estimate of error bars)
‣ Insufficient independence between loans (bad use of Central
Limit Theorem)
‣ Illusory diversification (just one big bet that houses wouldn't lose value)
About the D. E. Shaw Group
 Founded in 1988
 Quantitative and qualitative investment strategies
 Offices in North America, Europe, and Asia
 1,500 employees worldwide
 Managing approximately $22 billion (as of April 1, 2010)
About What I Do
 Quantitative Analyst (“Quant”)
 Most Quants have a background in math, physics, EE, CS, or
statistics
‣ Many hold a Ph.D. in one of these subjects
 We work on:
‣ Forecasting future prices
‣ Reducing transaction costs
‣ Modeling/managing risk