
Presidential Elections and the Stock Market: Comparing Markov-Switching and (FIE)GARCH Models of Stock Volatility1

David Leblang
Associate Professor
Dept. of Political Science
University of Colorado
[email protected]

Bumba Mukherjee
Assistant Professor
Dept. of Political Science
Florida State University
[email protected]

Abstract: Existing theoretical research on electoral politics and financial markets predicts that when investors expect left parties (Democrats in the US, Labor in the UK) to win elections, market volatility increases. In addition, current econometric research on stock market volatility suggests that Markov-Switching models provide more accurate volatility forecasts and fit stock price volatility data better than linear or non-linear GARCH (Generalized Autoregressive Conditional Heteroscedasticity) models. We take issue with both of these claims. We construct a formal model which predicts that if traders anticipate that the Democratic candidate will win the Presidential election, stock market volatility decreases. Using two data sets from the 2000 Presidential election, we test our claim by estimating several GARCH, Exponential-GARCH (EGARCH), Fractionally Integrated Exponential-GARCH (FIEGARCH) and Markov-Switching models. We also conduct extensive out-of-sample forecasting tests to evaluate these competing statistical models. Results from the out-of-sample forecasts show, in contrast to prevailing claims, that GARCH and EGARCH models provide substantially more accurate forecasts than the Markov-Switching models. Estimates from all the competing statistical models support the predictions from our formal model.

1 Prepared for presentation at the 2003 Political Methodology Summer Meetings, Minneapolis, MN. We are grateful to Charles Franklin, Christopher Wlezien, and Andre Gibson of the Chicago Mercantile Exchange for providing data and to Jude Hays, William Bernhard and Brian Gains for helpful discussions regarding the calculation of election night probabilities.

1.
Introduction

There is an extensive empirical literature focusing on the relationship between politics and financial markets. Scholars studying currency, stock and bond markets have examined the role that electoral systems, elections, partisanship, and political uncertainty play in shaping both the value and volatility of financial assets (e.g., Freeman, Hays and Stix 2000; Martin and Moore 2003; Leblang and Bernhard 2000a, 2000b; Blomberg and Hess 1997; Lobo and Tufte 1998; Alesina, Roubini and Cohen 1997; Cohen 1993; Gartner and Wellershoff 1995; Herron 2000; Herron et al 1999; Goodhart and Alter 2003; McGillivray 2000, 2002; Roberts 1994; Gemmill 1992, 1995). This literature shares two broad similarities. First, from a methodological perspective, scholars working in this research area typically use similar econometric tools, including ARMA, GARCH (Generalized Autoregressive Conditional Heteroscedasticity), EGARCH2 or Markov-Switching3 models, to test theoretical predictions4 on samples of high-frequency stock price and currency volatility data.5 Some of these scholars favor Markov-Switching models, claiming that they are more accurate and provide better forecasts than a variety of linear and nonlinear GARCH models (Turner, Stratz and Nelson 1989; Kim, Morley and Nelson 2002; Van Norden and Schaller 1993; Sola and Timmerman 1994; Simonato 1992).

2 Applications of the GARCH and/or (E)GARCH models to test hypotheses on currency or stock price volatility include Leblang and Bernhard (2000a, 2000b), Bollerslev et al (1992), Ramchand and Susmel (1998), Poon and Taylor (1992). An excellent literature review that surveys applications of the family of GARCH models in the financial economics literature can be found in Fan (2002).

This claim is surprising since, in reality, there is almost no
work that seriously evaluates the statistical merits of these two competing sets of models, GARCH versus Markov-Switching.6

3 Applications of the Markov-Switching Model (Hamilton 1989; 1994) in macroeconomics, financial economics and political economy are far too numerous to cite in a single footnote; however, for a literature review that provides a detailed bibliography of various applications of Markov-Switching models, see McCulloch and Tsay (1994). Among political economists, Freeman, Hays and Stix (1999, 2000) and Blomberg and Hess (1997) have used Markov-Switching Models to estimate the impact of political variables on exchange-rate volatility.

4 GARCH and Markov-Switching models are not the only econometric techniques used by scholars. For instance, Herron (2000) and Herron et al (1999) use nonlinear least squares, while Foster and Vishwanathan (1995) use GMM estimation to examine the impact of economic and/or political variables on stock market volatility. However, given the widespread use of GARCH and Markov-Switching models, we concentrate our analysis on comparing the forecasting performance of these two popular estimation techniques.

5 Financial time series data exhibit excess kurtosis, autoregressive conditional heteroscedasticity (ARCH), nonlinearity and nonstationarity.

Second, from a substantive viewpoint, the predominant theoretical prediction and empirical finding in the literature is that if traders or investors anticipate electoral victory by a “left” party (Democrats in the US or Labor in Britain), then the prices of various financial assets (exchange rates, stock indices, bond prices) will become increasingly volatile (Herron 2000; Cohen 1993; Alesina, Roubini and Cohen 1997; Freeman, Hays and Stix 2000; Gemmill and Saflekos 2000). In this paper we take issue with these methodological and substantive claims.
We use GARCH, EGARCH, FIEGARCH and Markov-Switching models to analyze the impact of electoral information, uncertainty and partisanship on stock price volatility during the 2000 U.S. Presidential Election. We find, using two different data sets, that GARCH-type models outperform Markov-Switching models in a number of important ways. Results from the out-of-sample forecasting tests, which include the RMSE and MAE statistics as well as realized volatility regressions, show that the GARCH and especially the EGARCH models forecast volatility more accurately than Markov-Switching models.

6 To the best of our knowledge, only Pagan and Schwert (1990) have compared the forecasting performance of GARCH, EGARCH and Markov-Switching models by using pre-war US stock returns data. There are a number of key differences between our paper and their work. First, our Markov-Switching model is estimated with time-varying transition probabilities whereas their model contains constant transition probabilities. Second, our specifications contain numerous political and electoral variables to capture the behavior of stock market returns. Finally, they do not compare forecasts from FIEGARCH and Markov-Switching models, as we do here.

In addition to this methodological contribution, our paper provides an important substantive addition to the burgeoning literature on democratic politics and financial markets. In particular, we construct a formal model of speculative trading where stock traders and the market-maker rationally respond to political information about potential electoral outcomes. Our model predicts that higher ex ante uncertainty amongst traders about the electoral outcome increases stock price volatility. More importantly, the model also predicts that if stock traders expect a Democrat to win the Presidential election, then stock market volatility decreases.
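The out-of-sample comparison mentioned above relies on the RMSE and MAE statistics. As a rough illustration of how the two criteria differ, the following sketch computes both for two made-up forecast series; the function names and all numbers are hypothetical and are not taken from the paper's data or code.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error between realized and forecast volatility."""
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, forecast):
    """Mean absolute error between realized and forecast volatility."""
    absolute = [abs(a - f) for a, f in zip(actual, forecast)]
    return sum(absolute) / len(absolute)


# Hypothetical numbers for illustration only (not the paper's data).
realized = [1.0, 2.0, 3.0, 4.0]
model_a = [1.0, 2.0, 3.0, 2.0]  # exact three times, one large miss
model_b = [1.5, 2.5, 3.5, 4.5]  # a small miss every period

print(rmse(realized, model_a), mae(realized, model_a))  # 1.0 0.5
print(rmse(realized, model_b), mae(realized, model_b))  # 0.5 0.5
```

Both hypothetical models have the same MAE, but the RMSE penalizes the single large miss of the first model more heavily, which is one reason forecast comparisons typically report both statistics.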
This theoretical prediction stands in stark contrast to existing findings in the literature, which show that bond yields and stock price volatility increase when investors anticipate left-oriented parties to win elections (Herron 2000; Gemmill and Saflekos 2000; Alesina, Roubini and Cohen 1997; Freeman, Hays and Stix 1999; Cohen 1993). This paper is organized as follows. We present the formal model in the next section. In section 3 we describe the data, variables and results from GARCH, EGARCH, FIEGARCH and Markov-Switching models. We report and compare the out-of-sample forecasts from all the estimated models in section 4 before concluding the paper.

2. The Model

2.1 Players, Sequence of Moves

To examine how uncertainty about the outcome of a Presidential election, the partisan identification of the Presidential candidates and political information from a public source affect stock price volatility, we construct a model of speculative trading. There are two players in our model: (i) a group of strategic traders7, $I$, where $i \in I$ denotes a single trader and (ii) a market-maker who sets prices based on the strategic behavior of traders. The traders in our model are primarily interested in trading an asset with a terminal value, $v$, which is known to them. Before trading, all players observe public signals $s \in S$ that provide political information on the potential outcome of the presidential election and the policies that will be adopted by the party that wins the elections. More substantively, then, $s$ includes aspects such as opinion polls, news events or campaign promises by candidates and is correlated with $v$.

7 The traders in our model trade stocks for short-run personal financial gain and as a service to clients. Following models of speculative trading by Kyle (1984, 1985), Admati and Pfleiderer (1988), we assume that traders and the market maker are risk-neutral to capture the linear relationship between risk and stock returns.
The political information8 learnt from observing $s$ influences the strategic behavior of traders in our model, which, as we show later, affects the price, $p$, of the traded stock set by the market-maker and the variance9 in the traded stock’s price. We denote the mean price of the traded stock as $p_0$ and the variance as $v - p_0$.10 After observing the public signal $s$, each trader submits to the market maker the order $x_i(I, s, v) + q_i$. The component $x_i(I, s, v)$ is the part of the order that a trader places given the political information provided by the signal $s$, the number of traders ($I$) and his (or her) expectations about the terminal value, $v$, of the traded stock; $q_i$ is the amount of nondiscretionary liquidity trading11 done by each trader. The market maker sets prices based on the information arising from the signal $s$ and net order flows12 $n = \sum_{i=1}^{I} x_i(I, s, v) + q$ that satisfy the zero profit condition.13

The sequence of moves in the game is as follows. First, political information that stems from the public signal $s$ is observed by all players. Following observation of $s$, traders submit orders to the market maker, who then determines the price knowing $s$ and the net order flow. After the market-maker sets the price, trade occurs and the terminal value of the traded stock is revealed. We now describe the players’ utility functions.

8 Unlike existing models, we do not analyze the case of endogenous information acquisition where traders incur search costs to acquire information. This is because electoral information stems exogenously from public sources and players do not incur search costs to acquire electoral information, which is publicly observable.

9 We use the terms “variance” or “variability” to mean “volatility” in the traded stock’s price (or vice-versa).

10 $p_0$ also denotes the full information price of the traded stock.
11 Following Kyle (1984, 1986) and Foster & Vishwanathan’s (1995) models, we let $q_i$ denote the stocks traded by a trader for purely portfolio-balancing reasons. This portfolio-balancing is done to diversify risk or to serve the interests of institutional clients, rather than to maximize his short-run personal profit. Hence, the argument $q_i$ is not optimized in each trader’s maximization problem, but is assumed to be correlated with $s$.

12 Note that $q = \sum_{i=1}^{I} q_i$.

13 We formally define the zero-profit condition later.

2.2 Utility Functions

Scholars have noted that there always exists variance in the traded stock’s price and the amount of liquidity trading by traders.14 This is obvious given that stock prices and liquidity trading are highly sensitive to variability in trading behavior. To formalize this variance, we denote the unconditional variance of the stock’s terminal value, $v$, as $\sigma_v^2$ and the unconditional variance of $q$ as $\sigma_q^2$. Recent scholarship in the financial economics literature15 has also emphasized that public signals that provide economic or political information exhibit variance largely because such signals stem from heterogeneous sources. We thus denote variance in the signal as $\sigma_s^2$ and we let $s_0$ denote the expected value (mean) of $s$.16 The behavior of traders in our model is influenced by their beliefs over the parameters $s$, $v$ and $q$. Instead of using the normal distribution to characterize beliefs,17 we follow Owen and Rabinovitch (1987) and Foster and Vishwanathan (1995) by assuming that $s$, $v$ and $q$, with unconditional18 expected values $s_0$, $p_0$ and $q_0$, are jointly distributed according to a multivariate elliptically contoured (hereafter ECC)19 distribution,

$$\begin{pmatrix} s \\ v \\ q \end{pmatrix} \sim ECC\left( \begin{pmatrix} s_0 \\ p_0 \\ q_0 \end{pmatrix},\; \Sigma,\; f(\cdot) \right) \tag{1}$$

14 For this, see Kyle (1984, 1985, 1986) and Gallant, Rossi and Tauchen (1992).

15 For this, see Kyle (1985) and Harris and Raviv (1993).
16 Substantively, $s_0$ denotes political information about the election result that is expected by traders.

17 Kyle (1984, 1985) and Admati and Pfleiderer (1988) use the normal distribution to characterize beliefs.

18 This occurs when the expected values of the parameters do not depend on $s$.

19 Typical examples of multivariate elliptically contoured distributions are the multivariate t-distribution as well as distributions that belong to the compound normal class. We use a multivariate ECC distribution to represent traders’ beliefs over the parameters $v$, $p$, $s$ and $q$ because it has the property that a higher absolute difference between the realized public signal $s$ and its mean $s_0$ leads to more variability in the traded stock’s price. This property is vital for the theoretical analysis in this paper. In Appendix B, we provide formal definitions from Cambanis, Huang and Simons (1981) and Johnson (1987) to characterize some technical properties of the multivariate ECC distribution, which are used to prove propositions from our model.

In (1), $f(\cdot)$ is the function that characterizes the density of the distribution.20 $\Sigma$ is the unconditional variance-covariance matrix,21 which is defined as:

$$\Sigma = \begin{pmatrix} \sigma_s^2 & \sigma_{sv} & 0 \\ \sigma_{sv} & \sigma_v^2 & 0 \\ 0 & 0 & \sigma_q^2 \end{pmatrix} \tag{2}$$

Equation (1) merely defines the multivariate ECC distribution. The distribution in (1) is, however, inadequate because we need to assume in our model that traders’ beliefs about the variance in the stock’s price, $v - p_0$, and liquidity trading are conditional on the signal $s$.22 To do so, we assume that the traders’ beliefs over $v - p_0$ and $\sigma_q^2$ given $s$ are represented by the conditional distribution $\begin{pmatrix} v - p_0 \mid s \\ \sigma_q^2 \mid s \end{pmatrix}$.23 More formally, the conditional distribution of $v - p_0$ and $\sigma_q^2$ given $s$ is jointly distributed according to the multivariate ECC distribution,

$$\begin{pmatrix} v - p_0 \mid s \\ \sigma_q^2 \mid s \end{pmatrix} \sim ECC\left( \begin{pmatrix} (\sigma_{sv}/\sigma_s^2)(s - s_0) \\ q_0 \end{pmatrix},\; h\!\left(\frac{(s - s_0)^2}{\sigma_s^2}\right) \begin{pmatrix} \Theta & 0 \\ 0 & \sigma_q^2 \end{pmatrix},\; f(\cdot) \right) \tag{3}$$

where $\Theta = \sigma_v^2 - (\sigma_{sv}^2 / \sigma_s^2)$ and $h : \Re \to \Re_+$.

The use of the multivariate ECC distribution unfortunately increases the technical complexity of our model. Yet this distribution is extremely useful for two reasons. First, it helps us to analyze how traders’ uncertainty about which candidate will win the Presidency and the arrival of new political information about electoral results affect the variability in the traded stock’s price. Second, it also allows us to analyze how traders’ ex ante expectation of whether the Democratic or Republican candidate will win the election (i.e. the candidates’ partisan identification) affects stock market volatility. We now turn to describe the players’ utility functions.

20 In Appendix B, we formally characterize the unconditional distribution for the parameters $s$, $v$ and $q$. The unconditional distribution is also a multivariate elliptically contoured distribution. We show that the mean, variance and the function $f(\cdot)$ fully characterize the multivariate elliptically contoured distribution in this case.

21 The term “unconditional” variance-covariance matrix refers to the variance-covariance matrix of the parameters $v$, $q$ and $s$ when they are not conditional on $s$.

22 Each trader is aware that all players adjust their behavior after observing $s$ and that this affects the variance in the traded stock’s price. Hence, ex ante, traders will have beliefs about how $s$ may influence the variance in the stock price. We also denote the variance of the traded stock’s price given $n$ and $s$ as $(v - p_0(n, s) \mid n, s)$. For proofs of propositions from our model, we examine variance in the stock price given $s$ (or given $n$ and $s$).

23 This conditional distribution captures how observation of political information about the potential electoral outcome stemming from the signal $s$ affects traders’ beliefs over the variance $v - p_0$ and $q$.
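To make the role of the conditional distribution more concrete, the sketch below evaluates the conditional mean $(\sigma_{sv}/\sigma_s^2)(s - s_0)$ and the scaled variance $h(\cdot)\,\Theta$ for one member of the ECC family. The paper does not specify $h$; the form used here is an assumption corresponding to a multivariate Student-t with $\nu > 2$ degrees of freedom, with $\Sigma$ read as the covariance matrix. All function names and parameter values are hypothetical.

```python
def conditional_mean(s, s0, sigma_sv, sigma_s2):
    """Conditional mean of (v - p0) given s: (sigma_sv / sigma_s^2) * (s - s0)."""
    return (sigma_sv / sigma_s2) * (s - s0)


def theta(sigma_v2, sigma_sv, sigma_s2):
    """Theta = sigma_v^2 - sigma_sv^2 / sigma_s^2."""
    return sigma_v2 - sigma_sv ** 2 / sigma_s2


def h_student_t(d2, nu):
    """ASSUMPTION: scale function for a multivariate Student-t with nu > 2
    degrees of freedom and a single conditioning signal. The paper leaves h
    unspecified; this form is used for illustration only."""
    if nu <= 2:
        raise ValueError("nu must exceed 2 so that variances exist")
    return (nu - 2.0 + d2) / (nu - 1.0)


def conditional_variance(s, s0, sigma_v2, sigma_sv, sigma_s2, nu):
    """Conditional variance of (v - p0) given s under the assumed h."""
    d2 = (s - s0) ** 2 / sigma_s2  # squared standardized surprise in the signal
    return h_student_t(d2, nu) * theta(sigma_v2, sigma_sv, sigma_s2)


# Hypothetical parameter values, chosen only to keep the arithmetic simple.
params = dict(s0=0.0, sigma_v2=4.0, sigma_sv=1.0, sigma_s2=1.0, nu=5.0)
near = conditional_variance(s=1.0, **params)  # signal close to its mean
far = conditional_variance(s=3.0, **params)   # signal far from its mean
print(near, far)  # 3.0 9.0
```

Under the normal distribution the conditional variance would equal $\Theta$ regardless of $s$; under the assumed Student-t form it grows with $(s - s_0)^2$, which is the property of ECC distributions that the model relies on.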
Let $p(n, s)$ be the price that the market maker sets knowing the net order flow, $n$, and the realized political information from the signal $s$. Following standard models of speculative trading (Admati and Pfleiderer 1988; Foster and Vishwanathan 1995), the market maker in our model uses a linear pricing rule,

$$p(n, s) = b(s) + \alpha(s)\, n \tag{4}$$

where $b > 0$ is some positive constant and $\alpha$ is the weight24 that the market maker places on adjusting prices (upward or downward) as a function of net order flows given the realized public signal $s$.25 The market maker’s objective is to set $p(n, s)$ such that it satisfies the expected zero profit condition, i.e.

$$p(n, s) = E[v \mid n, s] \tag{5}$$

where $E[v \mid n, s]$ is the expected terminal value of the traded stock conditional on $n$ and $s$. Given $p(n, s)$, each trader maximizes,

$$\arg\max_{x_i(\cdot)} E\left[ (v - p(n, s))\, x_i(I, s, v) + q_i \mid v, s \right] \tag{6}$$

where $E$ is the expectation operator. Substituting (4) into (6) yields the following utility function for each trader,

$$\arg\max_{x_i(\cdot)} E\left[ (v - b(s) - \alpha(s)\, n)\, x_i(I, s, v) + q_i \mid v, s \right] \tag{7}$$

24 We assume that $\alpha > 0$ but bounded from below and above. That is, $\alpha \in [\underline{\alpha}, \overline{\alpha}]$, $\alpha \in \Re_+$.

25 Observe that in (4), $\frac{\partial p(n, s)}{\partial n} > 0$ and $\frac{\partial \alpha(\cdot)}{d\alpha} > 0$, which indicates a monotonic relationship between $n$ and $\alpha$ as well as $p$ and $\alpha$. We exploit this monotonic relationship to demonstrate in the proofs that the higher (lower) the variance in net order flows, the greater (lesser) the weight the market maker places toward adjusting stock prices and hence the higher (lower) the variance in the traded stock price.

We now present equilibrium results and comparative statics from our model.

2.3 Equilibrium Results and Comparative Statics

The solution concept that we use is Nash equilibrium. With respect to our model, this means that each trader takes the strategies of all other traders as well as the terms of trade, summarized by the market maker’s price-setting strategy, as given while choosing his or her best response.
Solving for the Nash equilibrium leads to the following result,

Lemma 1: The Nash equilibrium in the single period model is:

$$p(n, s) = \alpha n + E[v \mid s], \qquad x_i(I, s, v) = \gamma\, (v - b(s)) \tag{8}$$

α = σq I Θ s (1 + I ) and γ = σ qI 2 s I Θ (9)

Proof: See Appendix A.

Lemma 1 formally characterizes the order that the $i$-th trader places in a Nash equilibrium, the Nash asset price and the $\alpha$ that is set by the market maker. It also characterizes the parameter $\gamma$ that denotes the “intensity” with which each trader trades the stock in equilibrium, i.e. the degree of the $i$-th trader’s trading (buying and/or selling) activity in equilibrium.26 We present some comparative statics from the equilibrium solution below. These are used later to explain the causal logic underlying various theoretical predictions from our model,

Corollary (to Lemma 1): (i) $\partial \alpha / \partial I > 0$ and $\partial \gamma / \partial I > 0$. (ii) $\alpha$ and $\gamma$ are monotonic in $s$.

Proof: See Appendix A.

26 In Kyle (1985, 1986) and Foster and Vishwanathan’s (1995) models, $\gamma$ is a proxy for the amount of stocks that traders trade in equilibrium. Hence, the greater the intensity of trading, the greater the amount of trading (or vice-versa). Note that the function $\gamma(s)$ in (8) increases when the difference between $v$ and $b(s)$ increases.

$\partial \gamma / \partial I > 0$ indicates that as the number of traders who trade increases, the intensity of trading activity by each trader also increases. This is not surprising since the entry of additional traders will drive down marginal trading profits, thereby giving each trader an incentive to increase his level of trading. $\partial \alpha / \partial I > 0$ implies that the weight that the market maker places on adjusting prices increases when the number of traders that engage in speculative trading increases. This is intuitive because an increase in the number of traders leads to $\partial \gamma / \partial I > 0$, which directly affects the degree of net order flows by traders. This, in turn, forces the market maker to adjust prices more often.
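As a numerical illustration of the linear equilibrium strategies in Lemma 1, the sketch below takes $\alpha$ and $\gamma$ as given inputs (it does not compute them from the model's primitives), builds each trader's order $\gamma (v - b(s))$, aggregates the net order flow $n$, and applies the price rule $p = \alpha n + E[v \mid s]$. The function names and all numbers are hypothetical.

```python
def equilibrium_order(gamma, v, b_s):
    """Order of a single trader in equilibrium: x_i = gamma * (v - b(s))."""
    return gamma * (v - b_s)


def net_order_flow(orders, liquidity_trades):
    """Net order flow n: the sum of strategic orders plus total liquidity trading q."""
    return sum(orders) + sum(liquidity_trades)


def equilibrium_price(alpha, n, expected_v_given_s):
    """Equilibrium price: p(n, s) = alpha * n + E[v | s]."""
    return alpha * n + expected_v_given_s


# Hypothetical inputs for illustration; alpha and gamma are taken as given.
alpha, gamma = 0.25, 0.5
v, b_s, expected_v = 12.0, 10.0, 10.0
liquidity = [0.5, -0.25, 0.75]  # q_i for I = 3 traders

orders = [equilibrium_order(gamma, v, b_s) for _ in liquidity]
n = net_order_flow(orders, liquidity)
p = equilibrium_price(alpha, n, expected_v)
print(n, p)  # 4.0 11.0
```

With these numbers each trader buys one unit, liquidity trading adds one more, and the market maker marks the price up from 10 to 11 in response to the positive net order flow.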
The more important comparative static result is the finding that both the intensity of trading activity, $\gamma$, and the weight that the market-maker places on adjusting prices, $\alpha$, are monotonic in $s$. This implies that the trading behavior of traders and the price-setting behavior of the market maker are directly affected by the nature27 of political information that arrives via $s$. We show below that the monotonicity of $\gamma$ and $\alpha$ in $s$ plays a critical role in explaining how uncertainty about Presidential election results and expectations of which candidate will win the Presidency affect variability in the stock price. Moreover, given that the behavior of the market-maker and the traders is affected by the signals $s$, it is likely that the conditional variance in the traded stock’s price, $v - p_0 \mid s$, will also be influenced by the political information that is learnt from observing $s$. The following result, Lemma 2, formally proves the aforementioned claim,

Lemma 2: The variance in the traded stock’s price $\mathrm{Var}(v - p_0 \mid s)$ is monotonic in $s$.

Proof: See Appendix A.

27 By “nature” of the political information/signal $s$, we mean information about whether the Democratic or the Republican candidate will win the election.

The causal logic behind the result in Lemma 2 is as follows. Specifically, we find that under certain conditions with respect to $s$, which we specify in more detail later, the variance in the net order flows by traders increases, while under other conditions, it decreases.
Now when the variance in $n$ increases, the weight that the market-maker places on adjusting prices upward and downward increases.28 This serves to increase the variance in the traded stock’s price.29 Conversely, if the variance in $n$ decreases, the market-maker is less likely to adjust the stock price and this engenders a decline in stock price volatility.30

28 This follows from the monotonicity of $\alpha$ with respect to $n$.

29 Since the Nash equilibrium asset price in (8) is monotonic in $\alpha$ and $n$, an increase in the variance of either of these two parameters leads to a higher variance in the traded stock’s price.

30 In the proof of Lemma 2, we also prove that $\mathrm{Var}(v - p_0 \mid s)$ increases when the variance in $n$ increases.

The brief discussion of the result in Lemma 2 gives rise to the following question: Under what conditions does political information from the signal $s$ increase or decrease variability in the traded stock’s price? We show that if the arrival of “new” political information via $s$ increases uncertainty about which candidate will win the Presidency, then the variance in the traded stock’s price also increases. We also demonstrate that the direction of political information from $s$ matters; specifically, stock market volatility is affected by signals that reveal whether the Democratic or the Republican candidate will win the Presidency. We begin by analyzing how the degree of electoral uncertainty about the outcome of a Presidential election can affect the variance in the traded stock’s price. Specifically, in our model, uncertainty among traders occurs when new information from the updated signals, $s_A$, about which Presidential candidate will win deviates from their expectation about the likely winner of the Presidential election, this expectation being formed by the signal $s_0$. More formally, uncertainty about the electoral outcome among traders occurs when $s_A$ deviates from its mean $s_0$.
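The link between the variance of net order flows and the variance of the price, discussed at the start of this passage, can be illustrated with the linear pricing rule alone. The sketch below holds $b(s)$ and $\alpha(s)$ fixed, which is a simplifying assumption (in the model $\alpha$ itself also responds to the variance in $n$), and compares price variance for two hypothetical order-flow series.

```python
import statistics


def prices_from_order_flow(b_s, alpha, order_flows):
    """Apply the linear rule p = b(s) + alpha(s) * n to each net order flow."""
    return [b_s + alpha * n for n in order_flows]


# Hypothetical order-flow series for illustration only.
calm = [-1.0, 1.0, -1.0, 1.0]       # low variance in net order flow
turbulent = [-3.0, 3.0, -3.0, 3.0]  # high variance in net order flow

# ASSUMPTION: alpha and b(s) are held fixed across the two scenarios.
alpha, b_s = 0.5, 10.0

var_calm = statistics.pvariance(prices_from_order_flow(b_s, alpha, calm))
var_turbulent = statistics.pvariance(prices_from_order_flow(b_s, alpha, turbulent))
print(var_calm, var_turbulent)  # 0.25 2.25
```

With $\alpha$ fixed, the price variance equals $\alpha^2$ times the variance of $n$, so a ninefold increase in order-flow variance produces a ninefold increase in price variance; any accompanying rise in $\alpha$ would amplify the effect further.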
Hence, the degree of electoral uncertainty is defined as the absolute difference $|s_A - s_0|$. This definition of electoral uncertainty can be illustrated by the following example. Suppose that, on average, opinion polls predict that the Democratic candidate is likely to win the elections; this is the information that stems from $s_0$. Assume further that new political information from $s_A$ indicates that recent opinion polls favor the Republican candidate and that the election race may be a close finish. The significant deviation between $s_A$ and $s_0$ in this case is likely to increase uncertainty among traders about the outcome of the Presidential election. In contrast, if $s_A$ and $s_0$ converge (i.e. if $s_A$ shows that the Democratic candidate is still likely to win the election), then uncertainty about the electoral outcome will be low. Given the conceptualization of electoral uncertainty in our model, we state the following result:

Proposition 1: The variability of the stock’s price $\mathrm{Var}(v - p_0 \mid |s_A - s_0|)$ increases when electoral uncertainty about which candidate will win the presidency increases; i.e. when the difference $|s_A - s_0|$ increases.

Proof: See Appendix A.

Three reasons explain the result in Proposition 1. First, when new political information from $s_A$ is substantially different from what traders expect, we find that the variance in net order flows by traders increases.31 The proof of Lemma 2 (Appendix A) shows that this contributes to a higher variance in the traded stock’s price. Second, the proof of Proposition 1 shows that an increase in $|s_A - s_0|$ leads to an upward revision of beliefs about the variance in the stock price among traders. This has a feedback effect that contributes to a higher conditional variance in the stock’s price.

31 We prove this claim in the proof of Proposition 1 in Appendix A.

Third, we claim that when uncertainty about the election result increases, traders find it difficult to assess, ex ante, how the electoral
outcome will affect the stock price ex post. We find that in equilibrium, traders rationally hedge against this increased uncertainty by temporarily selling the traded asset and maintaining larger cash balances. This, in turn, leads to a sharp downward price movement. Note, however, that after the stock’s price declines to some threshold, the demand for buying the stock rapidly increases, thereby inducing the market maker to revise the price upwards.32 The upward revision in prices can be mean-reverting, that is, converge back to $p_0$, or, depending on the degree of order flows, it can rise above $p_0$. In short, when political uncertainty about the outcome of presidential elections increases, we will observe relatively sharp dips and spikes in the price movement, which essentially implies increased variability in the traded stock’s price.33

32 Traders on the NYSE market floor typically buy stocks in large amounts when stock prices fall and there are arbitrage opportunities in the immediate future; see Gallant, Rossi and Tauchen (1992).

Our claim that greater electoral or (more broadly) political uncertainty leads to a higher variability in stock prices is not surprising.34 But does the partisan identification of the two Presidential candidates also affect stock market volatility? If so, how? As an answer to these questions, we make the counter-intuitive claim that anticipation of a Democratic victory reduces variability in the traded asset’s price, while expectations of a Republican victory increase variability in the stock price.35 More specifically, we argue that traders generally believe that the post-election economic policies that an incoming Democratic President will implement are not likely to deviate from his party’s pre-electoral policy announcements, which are likely to be “left-of-center” oriented. Our model demonstrates that this belief serves to lower the variability in
the stock price when the market expects a Democrat to win the Presidency. Conversely, we argue that traders typically perceive ex ante that an incoming Republican President will “move further to the right” and implement relatively more extreme anti-inflationary, anti-welfare and tax-reduction policies than those proposed by him and his party before the elections. We show below that the market’s anticipation of further deviation to the right by an incoming Republican President engenders higher variability in the stock’s price.

33 We prove in a more detailed version of our model that increasing uncertainty leads to higher trading volume.

34 McGillivray (2002) shows that under coalition governments, uncertainty with respect to implementation of policies by the governing coalition leads to higher stock price volatility. In a different context, Freeman, Hays and Stix (2000) show that more uncertainty is correlated with higher exchange rate volatility.

35 We use the term “counter-intuitive” since prevailing theoretical and empirical results by Herron (2000), for example, indicate that investors’ expectation of electoral victory by left parties (Labor in the UK or Democrats in the US), and not conservative parties, increases stock price volatility.

To see this more formally, let $s_D$ be the pre-electoral signals (e.g., campaign promises) provided by the Democratic candidate with respect to the economic policies that he intends to implement in office. From a substantive viewpoint, we assume that $s_D$ reveals information that the Democratic candidate will implement in office policies that lower unemployment, but lead to higher inflation, welfare and taxes. This assumption follows from existing results in the rational partisan literature (Alesina & Rosenthal 1995) and the general reputation of Democrats being liberal. Now let $\hat{s}_D$ represent the post-election policy signal that traders expect the Democratic candidate to announce (and implement) after he wins the elections.
The extent to which $\hat{s}_D$ is expected to deviate from $s_D$ is formalized as $|\hat{s}_D - s_D|$, where $|\hat{s}_D - s_D| \neq \emptyset$.36 Likewise, we denote $s_R$ as the pre-electoral policy signals/policy promises that are made by the Republican candidate. Based on the reputation of Republicans as conservative, we assume that the signal $s_R$ reveals information that the Republican candidate will implement in office conservative policies that help to reduce taxes and inflation. We let $\hat{s}_R$ represent the post-election policy signal that the market expects the Republican candidate to announce if he wins the elections. The degree to which $\hat{s}_R$ is expected to deviate from $s_R$ is $|\hat{s}_R - s_R| \neq \emptyset$.

36 Observe without loss of generality that the sign of $(|\hat{s}_{D(R)} - s_{D(R)}|)^2$ will be the same as $|\hat{s}_{D(R)} - s_{D(R)}|$.

Given our earlier discussion, we presume that $|\hat{s}_D - s_D| < |\hat{s}_R - s_R|$ and that $\lim_{\hat{s}_D \to s_D} |\hat{s}_D - s_D| < \varepsilon$. This formalizes our idea that traders expect less deviation between the pre-electoral policy signals and post-electoral policy announcements by the winning Democratic candidate. We now state the following result from our model:

Proposition 2: When traders anticipate the Republican candidate to win the Presidency, then the variance/volatility in the stock price increases. This is because

$$\frac{\partial E[\mathrm{Var}(v - p \mid |\hat{s}_R - s_R|)]}{\partial (|\hat{s}_R - s_R|^2)} > 0.$$

Proof: See Appendix A.

The causal intuition that explains the result in the above proposition is as follows. We claim that traders believe that after elections, a victorious Republican President will announce policies that curb inflation, but will also cut taxes more aggressively than he promised prior to elections. Consequently, traders expect better economic prospects and higher stock returns in the near future under a Republican administration.
This initially leads to “rational exuberance” in the market, where demand for stocks rises rapidly, which in turn engenders an increase in the expected price of the traded stock, as proved below.

Lemma 3: Let $|\hat{s}_R - s_R| = \lambda$, where $\lambda \neq \emptyset$. From the implicit function theorem,

$$\frac{dp(\cdot)}{d\lambda} > 0.$$

Proof: See Appendix A.

Now when the expected stock price increases owing to increased demand, traders have rational incentives to sell the stock in order to take advantage of a short-run arbitrage opportunity.37 We prove in the appendix (see Lemma 4) that such selling behavior engenders a mean reversion in the stock’s price. After the stock price reverts to its mean, traders have incentives to invest in the stock again owing to a cheaper buying price and expectations of higher returns under a Republican administration. This consequently leads to an increase in the traded stock’s price. Thus, rapid switching between buying and selling behavior by traders within a short time period generates increased volatility in the stock price when the market expects a Republican victory.

37 Since traders on the NYSE market floor trade largely for purposes of short-term financial gain, they rationally respond to an increase in a stock price by selling it to acquire short-term profits.

In contrast, our model predicts that variability in the stock’s price decreases when traders anticipate that a Democrat will win the Presidency. More formally,

Proposition 3: When traders expect the Democratic candidate to win the Presidency, variability in the stock price decreases because

$$\lim_{\hat{s}_D \to s_D} \mathrm{Var}(v - p_0 \mid (|\hat{s}_D - s_D|)) < 0.$$

Proof: See Appendix A.

Two reasons explain the above result. First, as mentioned earlier, traders expect that policies implemented by the incoming Democratic President are less likely to deviate from his pre-electoral policy announcements.
This implies that the degree of ex ante uncertainty among traders about the kind of economic policies that the Democratic candidate will follow ex post (after the Democratic candidate wins the elections) is low. The proof of Proposition 3 demonstrates that such low uncertainty reduces variability in the stock’s price. Second, we argue that reputation matters in that traders perceive ex ante that an incoming Democratic President will follow “left-oriented” measures (inflationary policies and relatively higher taxes) that are detrimental to the stock market.38 In Appendix A, we state and formally prove two lemmas from our model, Lemmas 5 and 6, which demonstrate that ex ante perceptions of Democrats being “bad” for the stock market decrease the incentives and intensity with which traders trade stocks. This serves to decrease the variance in net order flows and the weight that the market maker places on adjusting the stock price (Lemma 6). Both these factors lead to lower volatility.

38 Santa-Clara and Valkanov (2002) show that the average expected returns in the stock market are 1.8% higher under a Republican administration than under a Democratic government, thus justifying our assumption.

In sum, our model provides the following four testable hypotheses:

Hypothesis 1: Information arrival about electoral outcomes affects stock price volatility.
Hypothesis 2: Increased uncertainty about the electoral result increases volatility.
Hypothesis 3: If traders expect the Democratic candidate to win, then volatility decreases.
Hypothesis 4: If traders expect the Republican candidate to win, then volatility increases.

3. Empirical tests

3.1 Sample, Data and Variables

We test the hypotheses listed above on two distinct samples related to the 2000 Presidential election. The first sample comprises daily observations during the 2000 Presidential campaign: end-of-day returns for the S&P 500 and aggregated daily national polling results.
The second sample examines how actors trading S&P futures during the night of November 7, 2000 (the night of the election) respond to information regarding the likelihood of a candidate winning the Electoral College. These samples are discussed in turn.

The 2000 Presidential Campaign: We examine the response of stock market returns to the arrival of political information using a sample of daily observations from January 6, 2000 to November 6, 2000. Data limitations prevented us from extending the sample backwards, and November 6 was the last day for which polling information was available. We use returns (log changes in daily closing prices multiplied by 100) on the Standard and Poor’s 500 index as our dependent variable.39 To measure political information that captures expectations of a Gore (i.e., Democratic) victory, we utilize polling data that indicate, for each day, Gore’s share of the two major-party vote. These data, collected and used by Wlezien (2001) and Wlezien and Erikson (2001), are based on an aggregation of 295 separate national polls conducted during the 2000 presidential campaign. Missing values were filled in using linear interpolation.40

39 The S&P 500 index includes 80% industrials, 3% utilities, 1% transportation and 15% financial companies. Their market value is roughly 80% of the value of all equities traded on the NYSE; see www.standardpoors.com
40 See Wlezien (2001) for a detailed discussion of this variable.

We also include additional variables to control for other unmeasured influences on stock market volatility. These include two variables capturing “closing days effects,” that is, effects on market activity that result from weekends or holidays. The closing day effects variables measure the number of days BEFORE day t that the market was closed and the number of days AFTER day t that the market will be closed.41 It is expected that these variables will have a positive and statistically significant effect on stock market volatility.
Finally, we include a variable measuring (the log of) trading volume because studies find that including trading volume substantially accounts for observed volatility in stock market returns (Gallant et al., 1992). Since volume data are not available for the S&P 500 index, we use total daily volume traded on the New York Stock Exchange (hereafter, NYSE) as a proxy. These variables, their sources and measurement, are summarized in Table 1 in Appendix B.

41 In their analysis of currency markets, Beine, Laurent and Lecourt (1999) find this specification more parsimonious than the inclusion of a set of individual business-day dummy variables.

--Insert Table 1 about here--

Election Night: November 7, 2000

A second laboratory within which to examine the effect of political information on stock market volatility was created on the evening of November 7, 2000. As the evening progressed, network and cable news outlets (as well as the major wire services) “called” the electoral outcome of each state. These calls constitute the arrival into the market of political information, information that affects the strategic decisions of traders.

The NYSE is open for trading between 9:30am and 4:30pm Eastern Standard Time; it closes, therefore, prior to the reporting of election results. After hours, traders can trade options and futures contracts through the GLOBEX electronic trading system. GLOBEX, developed by Reuters and the Chicago Mercantile Exchange, is an automated system that provides information about trades (bid & ask prices), routes orders, and executes trades. Using the GLOBEX system, individuals can trade a variety of futures, options and interest rates. The GLOBEX system reports information on the price and volume for every individual transaction during the trading session.
We use these “tick” data to track the movements of futures prices for the S&P 500 Index.42 The tick data were aggregated to provide the average price and total volume of trades for each minute during the trading session. To avoid overlap with the NYSE, and because a five-minute lag is used, the sample period for the overnight data set is 4:35pm on November 7th through 8:59am on November 8th.

We measure the arrival of political information by constructing a variable that estimates the probability that Gore will win a majority of electors in the Electoral College and, thus, will become the 43rd president. This measure is based on state-level polls for each of the 50 states and exploits the fact that these polls contain a degree of sampling uncertainty. As each state is called over the evening of November 7th and into the morning of November 8th, traders update their priors regarding the likelihood of a Gore victory. The prior for each state is calculated using the final state-level poll available. Table 2 (Appendix B) reports the sample size of the poll (sample size), the percentage of respondents expressing a preference for Gore (Gore %) and for Bush (Bush %), and the share of the two-party vote for Gore (Gore/(Bush+Gore)).43 This information is used to test the null hypothesis that, in the population, Gore’s share of the two-party vote is greater than or equal to 0.50001 against the alternative hypothesis that Gore’s share is less than 0.50001. The p-values for rejection of the null are also listed in Table 2. A higher p-value indicates a higher probability of making an error by rejecting the null that Gore will win the state.

42 A future is a legally binding agreement to buy or sell the cash value of the asset at a specific future date. In the case of the futures used here the maturity date was November 15, 2000.
43 We are grateful to Charles Franklin and Chris Wlezien for sharing these data; see Franklin (2001) and Wlezien (2001).
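As an illustration only, the one-sided test described above can be sketched in Python under a normal approximation. The text does not state which test statistic was used, so the standard error evaluated at the null threshold, the function names and the sample size in the usage note are all assumptions made for this sketch.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_value_gore_wins(gore_share: float, n: int, threshold: float = 0.50001) -> float:
    """One-sided p-value for H0: share >= threshold against H1: share < threshold.

    Assumption: normal approximation to the sampling distribution of the
    two-party share, with the standard error evaluated at the threshold.
    gore_share is Gore/(Bush+Gore) on a 0-1 scale; n is the poll sample size.
    """
    se = math.sqrt(threshold * (1.0 - threshold) / n)
    z = (gore_share - threshold) / se
    # Small values are evidence against the null that Gore carries the state.
    return normal_cdf(z)
```

With a hypothetical sample of 500 respondents, a two-party share of 0.634 (the Massachusetts figure quoted in the text) gives a p-value indistinguishable from one, while shares well below one half give p-values near zero.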
The mapping between the share of the two-party vote for Gore and the p-value for rejecting the null results in an S-shaped relationship. For example, Gore’s share of the two-party vote in Massachusetts was 63.4%. The p-value for rejection of the null that Gore would get at least 50% and win the state is 1.00, indicating that it is certain that an error will be made if that state’s electoral votes are given to Bush. Likewise, the p-value for rejection of the null for Texas is 0.000, meaning that there is zero chance out of a thousand that Gore will win that state.

Since these polls contain sampling uncertainty there is a probability that an error will be made by rejecting the null hypothesis. The second step in variable construction is to exploit this sampling uncertainty. This is done by randomly drawing from a uniform [0,1] distribution and creating a variable Q with observations for each state. Denoting the p-value for rejection of the null hypothesis by P, if Q is less than P then Gore wins state i and gets all of state i’s electoral votes. This is done for each state. If Gore wins sufficient states to give him more than 270 electoral votes then he wins the election.

Third, the process in step two is repeated 1,000 times and the proportion of Gore victories is recorded. This measure, the probability that Gore wins the Electoral College, is graphed in Figure 1 (see Appendix B). At the beginning of the evening, at 3:45pm, the probability of Gore winning the Electoral College was .378; that is, he won 378 out of the 1,000 simulated elections.

Finally, this probability is updated over the course of the election night as each state is called by CNN. As a state is called, the probability of winning the state in question goes to either zero or one, depending on whether the state is called for Bush or for Gore, and steps two and three are repeated.
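The simulation steps described above can be sketched as follows. This is a minimal illustration, not the code used for the paper: the function names, the toy three-"state" example in the usage note and the random seed are hypothetical. The winning threshold is left as a parameter; the sketch defaults to 270, the majority of the 538 electors, whereas the text says "more than 270".

```python
import random


def simulate_win_probability(states, n_sims=1000, votes_needed=270, seed=0):
    """Share of simulated elections in which Gore reaches votes_needed.

    states: list of (electoral_votes, p) pairs, where p is the p-value
    treated as the probability that Gore carries the state.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        total = 0
        for votes, p in states:
            # Step two: draw Q ~ U[0, 1); Gore takes the state if Q < P.
            if rng.random() < p:
                total += votes
        if total >= votes_needed:
            wins += 1
    # Step three: proportion of simulated Gore victories.
    return wins / n_sims


def call_state(states, index, gore_won):
    """Return a copy of states with one state's probability set to 1 or 0."""
    updated = list(states)
    votes, _ = updated[index]
    updated[index] = (votes, 1.0 if gore_won else 0.0)
    return updated
```

For example, with the hypothetical input `[(200, 1.0), (200, 0.0), (138, 0.5)]` the outcome turns entirely on the third entry, so the estimate is close to one half; after `call_state(..., 2, True)` it becomes exactly 1.0, and after `call_state(..., 2, False)` exactly 0.0. Re-running the simulation after each call reproduces the minute-by-minute updating described in the text.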
Continuing this procedure until 6:21am on November 8th, when Wisconsin was called, results in a variable that measures, for each minute, the probability that Gore will win the Electoral College. As can be seen in Figure 1, this probability increases dramatically at 7:52pm when Florida is called for Gore and then declines at 8:55pm when CNN takes Florida from Gore’s win column. Likewise, the probability that Gore wins the Electoral College took a nosedive when Florida was called for Bush at 1:18am and then increased when Florida was once again labeled a “toss-up.”

Similar to the daily data set, we control for the total volume traded during each minute. We also control for the anticipated time interval between the current and previous trade. O’Hara (1995) argues that “if market participants can learn from watching the timing of trades, then the adjustment of prices to information will also depend on time” (p.169). Engle (1996) empirically implements this idea and argues that the expected duration between trades should have a statistically significant effect on the mean and variance of price changes.

3.2 Statistical Models: GARCH, EGARCH, FIEGARCH

Because we are interested in the effect of political information on stock market volatility, we utilize the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) model introduced by Engle (1982) and extended by Bollerslev (1986). A GARCH model is comprised of two equations: one for the conditional mean and the other for the conditional variance. In the GARCH (1,1) specification, the conditional mean can be written as:

$$\Delta P_t = \lambda + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma_t^2) \tag{10}$$

where $\Delta P_t$ is the change in closing price of the stock market index observed at time t, $\lambda$ is a constant and $\varepsilon_t$ is an error term that is normally distributed44 with mean zero and variance $\sigma_t^2$. Note that the mean is specified as following a random walk with a drift; no exogenous variables are thought to influence the mean change in price.
The unique feature of GARCH models is that we can specify how the conditional variance ($\sigma_t^2$) evolves over time in response to both past values and to exogenous shocks. The conditional variance for the standard GARCH (p, q) model is:

$$\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2 \tag{11}$$

Using the lag or backshift operator45, equation (11) can be rewritten as:

$$\sigma_t^2 = \omega + \alpha(L)\varepsilon_t^2 + \beta(L)\sigma_t^2 \tag{12}$$

In most cases there is one ARCH and one GARCH term. With exogenous variables affecting the conditional variance, the GARCH(1,1) can thus be written as:

$$\sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 + \delta_i I_{i,t} \tag{13}$$

The variance $\sigma_t^2$, called the conditional variance, is the one-period-ahead forecast variance based on all information available at time t-1. The conditional variance is a function of four terms: the constant ($\omega$), the ARCH term ($\varepsilon_{t-1}^2$), the GARCH term ($\sigma_{t-1}^2$), and a set of exogenous variables ($I_{i,t}$). GARCH models are often used to analyze financial time series because it is assumed that economic agents form expectations about this period’s variance based on the long-term mean of the variance ($\omega$), the forecasted variance from the prior period ($\sigma_{t-1}^2$), and new information about volatility gleaned in the prior period ($\varepsilon_{t-1}^2$). GARCH models can account for the large clustering of errors observed in financial time series where large deviations in the conditional variance are often followed by other large deviations.

44 We can also use the generalized exponential, student-t or double exponential distribution if desired. But we did not need to do so since we found that the residuals from the estimated models are conditionally normal.

While the standard GARCH model is useful, it has two limitations. First, volatility in tick data tends to persist over long periods of time.
Second, it is likely that positive and negative shocks from prior periods may have differential effects on price volatility at the current point in time. To deal with these problems, Bollerslev and Mikkelsen (1996) have developed the following Fractionally Integrated Exponential GARCH (FIEGARCH) model, which we present in more detail in Appendix B:

$$\ln(\sigma_t^2) = \omega + \delta_i I_{i,t} + \phi(L)^{-1}(1 - L)^{-d}[1 + \alpha(L)]\,g(z_{t-1}) \tag{14}$$

45 The backshift operator can be represented as: $\alpha(L) = \alpha_1 L + \dots + \alpha_q L^q$ and $\beta(L) = \beta_1 L + \dots + \beta_p L^p$.

In (14), $[1 - \beta(L)] = \phi(L)(1 - L)^d$ and $g(z_t) = \theta z_t + \gamma[|z_t| - E|z_t|]$. Ignoring the $g(z_{t-1})$ term for a moment, equation (14) says that (the log of) volatility is a function of the constant ($\omega$), a set of exogenous variables ($\delta_i I_{i,t}$) measured at time t, the ARCH term ($\alpha$), the GARCH term ($\beta$) and the fractional integration parameter (d). As in standard ARIMA models, d measures the speed at which shocks to the dependent variable (in this case, the variance) die out over time. If d equals zero then shocks have no long memory and equation (14) collapses to the standard EGARCH model. However, if d equals 1 then (14) becomes the Integrated EGARCH model. Bollerslev and Mikkelsen (1996) find that the FIEGARCH model fits tick data well.

Turning our attention to the $g(z_{t-1})$ term, this part of the equation captures the idea that volatility responds differently to “good news” than to “bad news.” Nelson (1991) noted that “to accommodate the asymmetric relation between stock returns and volatility changes…the value of g(zt) must be a function of both the magnitude and the sign of zt.” This is accomplished with the $\theta$ and $\gamma$ terms in $g(z_{t-1})$, where $\theta$ captures the sign and $\gamma$ captures the magnitude of past errors. Substantively this term means that negative errors in the prior period will have a larger effect on the conditional variance than positive shocks.
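To make the two building blocks concrete, the sketch below evaluates the GARCH(1,1) recursion in equation (13) and the asymmetric term g(z) in equation (14). It is an illustration under stated assumptions, not an estimator: parameters are taken as given, a single exogenous regressor is used, the starting variance defaults to the sample variance of the residuals, z is assumed standard normal so that E|z| = sqrt(2/pi), and the fractional-integration filter in (14) is omitted. All function and argument names are hypothetical.

```python
import math


def garch11_variance(residuals, omega, alpha, beta, delta=0.0, exog=None, sigma2_0=None):
    """Conditional variances from the recursion in equation (13).

    sigma2[t] = omega + alpha * eps[t-1]**2 + beta * sigma2[t-1] + delta * exog[t]
    """
    n = len(residuals)
    if exog is None:
        exog = [0.0] * n
    if sigma2_0 is None:
        # Assumption: initialise with the sample variance of the residuals.
        sigma2_0 = sum(e * e for e in residuals) / n
    sigma2 = [sigma2_0]
    for t in range(1, n):
        sigma2.append(
            omega
            + alpha * residuals[t - 1] ** 2  # ARCH term
            + beta * sigma2[t - 1]           # GARCH term
            + delta * exog[t]                # exogenous variable
        )
    return sigma2


def news_impact(z, theta, gamma):
    """g(z) = theta * z + gamma * (|z| - E|z|) for a standard normal z."""
    expected_abs_z = math.sqrt(2.0 / math.pi)
    return theta * z + gamma * (abs(z) - expected_abs_z)
```

With a negative theta, a shock of z = -1 yields a larger g(z) than a shock of z = +1, which is the sense in which negative errors raise the conditional variance more than positive ones; with theta equal to zero the response is symmetric.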
If $\theta$ and $\gamma$ in $g(z_{t-1})$ equal zero then equation (14) becomes the FIGARCH (p,d,q) model.

3.3 Results from the GARCH, EGARCH and FIEGARCH models

We estimate GARCH models on the daily and overnight samples. The results of these models are contained in Tables 3 and 4. Cell entries in both tables are maximum likelihood parameter estimates with Bollerslev-Wooldridge semi-robust standard errors in parentheses. Beginning with the daily sample in Table 3, we note that the Ljung-Box Q statistic, indicating no residual serial correlation, suggests that the differenced S&P price series follows a random walk. The Ljung-Box statistic for the squared residuals is also statistically insignificant; this suggests that there is no remaining ARCH in the residuals. The Jarque-Bera statistic also prevents us from rejecting the null hypothesis of normally distributed residuals.

--Insert Table 3 about here--

Turning our attention to the specification in column one in Table 3, the ARCH term ($\alpha$) is not statistically significant while the GARCH term ($\beta$) is significant at the .05 level. This means that while random errors from the prior period ($\varepsilon_{t-1}^2$) do not significantly affect the conditional variance at time t, the conditional variance from time t-1 does. We also note that the sum of the ARCH and GARCH terms ($\hat{\alpha} + \hat{\beta}$) is significantly less than one, indicating a non-integrated GARCH process. The coefficients on the variables included in the conditional variance are, for the most part, consistent with our expectations. Stock market volatility is greater on the days after traders return from vacation. Daily stock price volatility is also significantly (in both substantive and statistical terms) higher as a result of increased trading.

In column one, we test the hypothesis that expectation of Gore’s victory decreases stock market volatility.
Column one uses the percentage of people polled expressing a preference for Al Gore to test hypothesis three.46 The coefficient on the GORE variable is negative and statistically significant, indicating that a higher likelihood that Gore will win the popular vote for president, and subsequently become the President, decreases the volatility of the S&P 500 index.

46 While we report the results of contemporaneous information arrival, that is, measures at time t, there is no substantive or statistical difference if we use lagged (t-1) measures of information arrival.

Hypothesis two suggests that electoral uncertainty drives stock market volatility as uncertain information will bias traders’ forecasts. In column two we operationalize the idea of uncertainty by calculating a measure of entropy $E = 1 - 4(p - 0.5)^2$, where p is Gore’s share of the two-party vote. The entropy measure is greatest when p is closest to .5; the intuition is that there is little uncertainty about an outcome when the probability of an electoral victory is .10 or .90 and great uncertainty about an outcome when the probability is equal to .5 (see Freeman, Hays and Stix 2000). The entropy measure is substituted for the GORE variable in column two. Note initially that the entropy measure has very little variance: it has a mean of 99.57, a minimum of 97.39 and a maximum of 99.9, reflecting the fact that Gore’s share of the two-party vote in opinion polls hovered around 50% for most of the 2000 campaign. Interestingly, when incorporated in the GARCH model the entropy variable has a negative, as opposed to its hypothesized positive, effect on stock market volatility. This is contrary not only to expectations but also to prior research (e.g., Freeman, Hays and Stix, 2000) and to the results we report later using the overnight sample.

Hypothesis one, that the arrival of political information affects stock market volatility, is difficult to operationalize using polling data.
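The entropy measure defined above is simple enough to compute directly. The short sketch below, with a hypothetical function name, takes p on a 0 to 1 scale and returns E on the same scale; any rescaling (the reported summary statistics appear to be on a 0 to 100 scale) is not shown and would be an assumption.

```python
def entropy(p: float) -> float:
    """Uncertainty measure E = 1 - 4 * (p - 0.5)**2 for a vote share p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # E peaks at 1 when p = 0.5 and falls to 0 when p = 0 or p = 1.
    return 1.0 - 4.0 * (p - 0.5) ** 2
```

The measure is symmetric around one half, so shares of .10 and .90 receive the same low value (0.36), while a share of .5 receives the maximum of 1.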
We conceive of information arrival in terms of changes in the percentage of the population reporting a preference for Gore. As such, we use the change in GORE to measure this concept. As indicated in column three, the coefficient on this variable is negative but not statistically significant. This suggests that information arrival does not play a role in stock market volatility. This result should be interpreted with caution, however, as it is far from clear that we have measured the concept correctly.

In results not reported here we checked the robustness of these findings by using a different dependent variable and by also including a number of other political and economic variables in the conditional variance equation. For the three models reported in Table 3 we obtain very similar results using the return on the Dow Jones Industrial Average rather than on the S&P 500. We also experimented with including political variables such as the dates of the Republican and Democratic conventions and the dates when Republican and Democratic challengers dropped out of the race. We also included a dummy variable for the period after the Democratic convention interacted with GORE. Economic variables included dates of Federal Open Market Committee meetings, dates of interest rate changes and continuous variables measuring both the level of and change in the three-month Treasury bill rate. In no case did inclusion of any of these variables significantly (in both a statistical and substantive sense) alter the results reported in Table 3.

A second laboratory within which to examine the effect of political information on stock market volatility was created on the evening of November 7, 2000. As the evening progressed, network and cable news outlets (as well as the major wire services) “called” the electoral outcome of each state. These calls constitute the arrival into the market of political information, information that affects the strategic behavior of traders.
Table 4 includes a variety of econometric specifications to capture the price dynamics of S&P futures on the night of November 7, 2000. In the initial specification of the GARCH model, residual diagnostics revealed remaining serial correlation, so an AR(1) and an MA(1) term were included. This is consistent with Bollerslev and Mikkelsen (1996) and the findings summarized in Dacorogna et al. (2001) that volatility in high-frequency financial data is persistent. In addition, because the Jarque-Bera test consistently rejects the null hypothesis of normality, we utilize Bollerslev-Wooldridge semi-robust standard errors. The models in Table 4 also include two control variables. Following Engle (1996) we include a variable measuring the expected duration between trades in both the mean and variance equations. We also include a variable measuring the quantity traded each minute.

--Insert Table 4 about here--

Column one is the basic GARCH specification that includes the lagged (by five minutes) variable measuring the probability that Gore will win the Electoral College.47 While the coefficient on this variable is negative and statistically significant, providing support for both hypotheses three and four, the Ljung-Box test reveals remaining ARCH in the residuals. We re-estimate the model in column one using an EGARCH specification under the assumption that accounting for the asymmetric nature of shocks to volatility will render the residuals white noise. This intuition is borne out in column two: again the variable measuring the probability that Gore will win the Electoral College is negative and statistically significant, and the diagnostics reveal no remaining ARCH. Solving one problem, however, leads to another, as we now see that the sum of the ARCH and GARCH terms ($\hat{\alpha} + \hat{\beta}$) is not significantly different from one, indicating the existence of an integrated GARCH process. The solution, as presented in columns three through five, is to estimate a FIEGARCH model.
The coefficients for these models lend support to our four hypotheses. The FIEGARCH model using the probability that Gore will win the Electoral College is well behaved and passes all diagnostic tests. The coefficient on GORE is negative and statistically significant, providing support for hypothesis three. Political uncertainty, operationalized as entropy, exerts a positive and statistically significant effect on stock market volatility. Finally, the arrival of political information, measured by the calling of “tossup” states, increases volatility.

We obtain the same results if we substitute the return on NASDAQ futures for S&P futures. (Unfortunately, futures for the Dow Jones Industrial Average did not exist in 2000.) The results are also unchanged if we include a set of dummy variables reflecting the times when Florida was called for and subsequently taken away from both Gore and Bush. These findings are compelling from an experimental point of view: using a nightly sample minimizes the likelihood that exogenous factors such as earnings reports, interest rate expectations, or other events influenced the behavior of these asset prices.

47 A five-minute lag was chosen since that is the average amount of time that it takes for a trade to be executed by the GLOBEX system. Changing the lag to any value between one and ten minutes did not alter our results.

3.4 Estimating a Markov Switching Model

We also test the hypotheses from our theoretical model by estimating a Markov Regime-Switching model with time-varying transition probabilities (Freeman, Hays and Stix 2000; Diebold, Lee and Weinbach 1994).48 In our Markov-Switching model, we assume that the stock price series is governed by a two-state, first-order Markov-Switching process.49 Each state is characterized by a high (or low) variance and mean that corresponds to a separate regime.
The series that we observe is thus a “mixture” of these two regimes50 where this mixture is determined by a probabilistic transition between the two states. More formally, we estimate an autoregressive specification where the mean and variance are subject to switches between two states that evolve according to a first-order Markov process:

$$\Delta p_t = \mu_{S_t} + \phi(\Delta p_{t-1} - \mu_{S_{t-1}}) + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2_{S_t}), \quad S_t \in \{0, 1\}$$
$$\mu_{S_t} = S_t \mu_1 + (1 - S_t)\mu_2$$
$$\sigma^2_{S_t} = S_t \sigma_1^2 + (1 - S_t)\sigma_2^2 \tag{15}$$

48 A Markov-Switching model has useful statistical properties in that it can account for nonlinearities, nonstationarity, clustering, serial correlation and fat-tailed distributions of stock price volatility (Hamilton 1989, 1994; Engel 1994; Durland and McCurdy 1994; Filardo 1994).
49 In the Markov-Switching model that we estimate here, the probability that stock market returns and volatility are in either state (“regime”) 1 or 2 at time t is a function of the state that they were in at t-1. The term “state”/regime 1 denotes a situation of high variance and high mean, while state/regime 2 implies low variance and low mean.
50 Similar to Hamilton’s (1994: 685-696) Markov-Switching model, stock prices and volatility in our Markov model are drawn from a mixture of two normal densities. We assume that high volatility corresponds to a high mean and low volatility to a low mean. This follows from models in the theoretical finance literature, which presume that rational agents invest in highly volatile assets only if the conditional mean (or risk premium) of the returns of the volatile asset is high; see Merton (1973), Turner, Startz and Nelson (1989).

In (15), $\mu_1$ denotes high mean and $\sigma_1^2$ high variance, while $\mu_2$ represents low mean and $\sigma_2^2$ low variance. $\phi$ denotes the AR coefficient.
The states, $S_t$, are generated by a realization of the first-order Markov process with transition probabilities

$$\Pr(S_t = 1 \mid S_{t-1} = 1) = p_{11,t}, \quad \Pr(S_t = 2 \mid S_{t-1} = 1) = 1 - p_{11,t}$$
$$\Pr(S_t = 2 \mid S_{t-1} = 2) = p_{22,t}, \quad \Pr(S_t = 1 \mid S_{t-1} = 2) = 1 - p_{22,t} \tag{16}$$

$p_{11,t}$ denotes the probability of the state of high volatility, $S_t = 1$, and $p_{22,t}$ indicates the probability of the low volatility state, $S_t = 2$. To examine the effect of exogenous variables on volatility, we allow the transition probabilities in (16) to depend on electoral uncertainty, information arrival, partisanship and trading volume via the logistic specification

$$p_{11,t} = \frac{\exp(c_1 + \beta_{1,k} x'_{t-1,k})}{1 + \exp(c_1 + \beta_{1,k} x'_{t-1,k})}, \quad 1 - p_{11,t} = 1 - \frac{\exp(c_1 + \beta_{1,k} x'_{t-1,k})}{1 + \exp(c_1 + \beta_{1,k} x'_{t-1,k})}$$
$$p_{22,t} = \frac{\exp(c_2 + \beta_{2,k} x'_{t-1,k})}{1 + \exp(c_2 + \beta_{2,k} x'_{t-1,k})}, \quad 1 - p_{22,t} = 1 - \frac{\exp(c_2 + \beta_{2,k} x'_{t-1,k})}{1 + \exp(c_2 + \beta_{2,k} x'_{t-1,k})} \tag{17}$$

$x'_{t-1,k}$ is the vector of relevant political variable(s) (k=1)51 that affect the transition probabilities, while $\beta_{1,k}$ and $\beta_{2,k}$ denote the coefficients to be estimated and $c_i$ (i=1,2) is the constant.52 Notice that if $\beta_{1,k} > 0$ ($\beta_{1,k} < 0$), then $dp_{11,t}/dx'_{t-1,k} > 0$ ($dp_{11,t}/dx'_{t-1,k} < 0$) $\forall x'_{t-1,k} \in \Re^+$. This implies that for $\beta_{1,k} > 0$ ($\beta_{1,k} < 0$) the probability of remaining in state 1 increases (decreases). If $\beta_{2,k} > 0$ ($\beta_{2,k} < 0$), then $dp_{22,t}/dx'_{t-1,k} > 0$ ($dp_{22,t}/dx'_{t-1,k} < 0$) $\forall x'_{t-1,k} \in \Re^+$. Hence, for $\beta_{2,k} > 0$ ($\beta_{2,k} < 0$), the probability of remaining in state 2 increases (decreases). We briefly derive the log-likelihood of the Markov-Switching model in Appendix B.53

51 To accurately interpret coefficients, we include one independent variable separately in each Markov model.
52 Since $x'_{t-1,k}$ influences $\Delta P_t$ via $p_{11}$, $p_{22}$, it allows for nonlinear relationships between $x'_{t-1,k}$ and $\Delta P_t$.
53 We estimate our Markov-Switching model via the EM Algorithm (Diebold, Lee and Weinbach 1994: 690-695).
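The logistic specification in (17) and the state process in (16) can be illustrated with a short simulation. This is a sketch under stated assumptions rather than the estimation routine: the parameters are taken as given instead of being estimated by the EM algorithm, a single scalar covariate is used, and the function names, seed and example values are hypothetical.

```python
import math
import random


def transition_prob(c, beta, x_lag):
    """Probability of staying in the current state, as in (17)."""
    u = c + beta * x_lag
    # Numerically stable evaluation of exp(u) / (1 + exp(u)).
    if u >= 0.0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def simulate_states(x, c1, b1, c2, b2, s0=1, seed=0):
    """Simulate S_t in {1, 2} with time-varying transition probabilities (16)."""
    rng = random.Random(seed)
    states = [s0]
    for t in range(1, len(x)):
        if states[-1] == 1:
            stay = transition_prob(c1, b1, x[t - 1])  # p_11,t
            states.append(1 if rng.random() < stay else 2)
        else:
            stay = transition_prob(c2, b2, x[t - 1])  # p_22,t
            states.append(2 if rng.random() < stay else 1)
    return states
```

The sign result stated in the text can be checked numerically: with a positive coefficient the probability of remaining in a state rises with the lagged covariate, and with a negative coefficient it falls.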
We first discuss the results from estimating a Markov-Switching model on the daily data set from 6th January to 7th November 2000. A battery of pretest results, reported in Table 5, establishes the existence of regime switching in stock market volatility in this data set. Specifically, Wald tests reject the null hypotheses of equality of means (7.97) and of variances (12.85), as well as the null of independent switching between states (59.61), at the 1% level.54 Both the Hansen (1992, 1996b) and the Garcia (1998) tests reject the null hypothesis of no switching in the mean and variance at the 1% level.55

--Insert Table 5 about here--

In model (1), Table 5, the coefficient $\beta_{1,1}$ for the entropy variable is positive and significant, while $\beta_{2,1}$ is negative and significant. The coefficient of the high variance state, $\sigma_1^2$, is significantly positive and about seven times higher than the coefficient of the low variance state, $\sigma_2^2$, for the entropy variable. The estimated transition probability $p_{11,t} = 0.966$ is significant and almost equal to 1. These results indicate that increased uncertainty over the Presidential electoral outcome significantly increases the probability that stock prices will enter and remain in the state of high volatility.

The Markov-Switching results in model (2) confirm the predictions from the theoretical model. The coefficient $\beta_{2,1}$ in this model is positive and highly significant, while $\beta_{1,1}$ is negative and significant. The coefficient of low variance, $\sigma_2^2$, is roughly 17 times higher than the coefficient of high variance, $\sigma_1^2$. Not surprisingly, we find that the estimated transition probability $p_{22,t} = 0.974$ is close to 1 and highly significant. Put together, these results demonstrate that as the probability of a Democrat's (i.e. Gore's) victory increases, stock price volatility declines and the persistence of the low volatility state increases.

54 “Independent switching between states” means that the current state is not dependent on the previous state.
55 Garcia (1998) and Hansen (1992, 1996b) provide likelihood-ratio tests of the null of a single regime in the mean and variance. Garcia derives the asymptotic null distribution of the LR statistic to obtain critical values; in his test, the 1% critical value for rejecting the null is 17.67. Hansen (1992, 1996b) uses empirical-process theory to derive the upper bound of his standardized LR statistic.

In model (3), the coefficients $\beta_{2,1}$ and $\sigma_2^2$ are both positive, but insignificant. This result indicates that information arrival does not significantly affect stock price volatility. In model (4), the coefficients $\beta_{1,1}$ and $\sigma_1^2$ for the trading volume variable are both positive and significant. This confirms existing empirical results that indicate a positive correlation between trading volume and higher stock price volatility (Gallant, Rossi and Tauchen 1992). The coefficient of the AR parameter, $\phi$, is also highly significant in all four columns in Table 5. Finally, the Ljung-Box Q-Statistics for lags 1 and 3 for each model show that the Markov-Switching model eliminates much of the serial correlation in the residuals up to the third lag. To check for robustness, we also estimated each model in Table 5 on a smaller sub-sample, i.e. an “in-sample”, of 139 observations (results not reported here).56 There was no difference in the results obtained for the sub-sample for each model.

The estimates of the Markov-Switching model on stock volatility data from the night of November 7th, 2000 (see Table 6) produce no surprises. Wald tests reject the null hypotheses of equality of means (6.83), of variances (12.46) and of independent switching between states (39.21) at the 1% level. The Hansen (1992, 1996b) and Garcia (1998) tests also reject the null hypothesis of no switching in the mean and variance at the 1% level.
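The Ljung-Box Q-statistics cited for the residuals of the Markov-Switching models can be computed along the following lines. This is a minimal sketch of the textbook statistic $Q(h) = n(n+2)\sum_{k=1}^{h} \hat{\rho}_k^2/(n-k)$, not the routine used to produce the reported tables; the function name and the toy residual series are hypothetical.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic for residual autocorrelation up to max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)  # n times the sample variance
    q = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation at lag k.
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q


# Toy series with strong negative first-order autocorrelation (illustrative only).
toy_residuals = [1.0, -1.0] * 10
q_lag1 = ljung_box_q(toy_residuals, 1)
q_lag3 = ljung_box_q(toy_residuals, 3)
print(q_lag1, q_lag3)
```

Under the null of no serial correlation, Q(h) is compared with a chi-square distribution whose degrees of freedom equal h less the number of fitted ARMA parameters; a small Q at lags 1 and 3 is what the text means by residual correlation having been largely eliminated.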
--Insert Table 6 about here-The positive and significant coefficients of β1,1 and σ 12 for the entropy variable in model (5), Table 6, shows that there was certainly a significant correlation between increased uncertainty over which candidate will win the election and higher volatility during the night of November 56 The rest of the sample is used for “out-of-sample” forecast tests, which are described in the next section. 30 7th 2000. In contrast, the positively significant coefficients of β 2,1 > 0 , σ 22 > 0 for the Gore variable in this data set (see model (6)) suggests that when expectations of a Gore (a democrat’s) victory increased on the night of the November 7th, 2000, stock market volatility decreased. In fact, the coefficient of low variance, σ 22 , is 16 times higher than σ 12 , therein indicating that expectations of a Gore victory was strongly correlated with lower volatility.57 The significant coefficients of β1 > 0 and σ 12 > 0 for the Bush variable in model (6) demonstrate that traders’ expectation of a Bush victory on the night of November 7th, 2000 was correlated with significantly higher volatility. The coefficients for the information arrival variable in this data set provide ambiguous results. In particular, β1,1 , β 2,1 and σ 12 , σ 22 for this variable are insignificant (see model (7)). This result is different from the FIEGARCH model where the coefficient for information arrival is significant. As before, the coefficients of β1,1 and σ 12 for the trading volume variable are both positive and significant. The AR parameter, φ , is highly significant in all the five models in Table 6. The Ljung-Box Q-Statistics for lags 1 and 3 demonstrates that each Markov-Switching model eliminates much of the serial correlation in the residuals till the third lag. We also estimated each model in Table 6 on a smaller sub-sample, i.e. an “in-sample” of 385 observations58 for this data set (results not reported here). 
There was no difference in the results obtained for the sub-sample for each model.

57 The estimated transition probability $p_{22} = 0.989$ in model (5) confirms this result.
58 The remainder of the sample is used for “out-of-sample” forecast tests, as described in the next section.

4. Comparing Volatility Forecasts

In this section, we examine which estimator (GARCH, EGARCH, FIEGARCH or the Markov-Switching model) provides more accurate volatility forecasts, in order to judge the relative performance of these models. We first compare the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) statistics of all the models estimated for the daily data set. For the daily data set, the AIC statistic of each of the three estimated GARCH models is 1.24, while the BIC statistics are 1.36, 1.37 and 1.38 respectively. These AIC and BIC values are much lower than the lowest AIC (382.25) and BIC (417.78) statistics obtained from the estimated Markov-Switching models for the daily data set (see the last two rows in Table 5). Hence, in terms of information criteria, the GARCH models outperform all the Markov-Switching models in this case.

Another way of comparing the GARCH and the Markov-Switching models is through out-of-sample forecast errors. Out-of-sample tests are effective since they control for the possibility of over-fitting and hence provide a useful framework for evaluating the merits of competing models. We used 80 observations from 08/17/2000 to 11/06/2000 for the out-of-sample forecast evaluation in the daily data set.59 Similarly, we used 350 observations from the last 6 hours in the overnight data set for the out-of-sample forecast evaluation.60 Two well-known criteria were used to evaluate the forecast errors from the models across both data sets.
These are the RMSE (Root Mean Square Error) and the MAE (Mean Absolute Error).61 Panel A, Table 7 reports the RMSE and MAE statistics for the GARCH, EGARCH and FIEGARCH models from the daily and overnight data sets. Panel A, Table 8 reports the RMSE and MAE statistics for all the Markov-Switching models in the daily and overnight data sets.

--Insert Tables 7 and 8 about here--

59 We estimated all the GARCH and Markov-Switching models on this smaller sub-sample of 80 observations (results not reported but available on request). After doing so, we evaluated the out-of-sample forecast errors.
60 We estimated all the GARCH, EGARCH, FIEGARCH and Markov-Switching models on this smaller sub-sample of 359 observations (results not reported). We then evaluated the out-of-sample forecast errors.
61 The (root) mean squared error is based on a quadratic loss function, which weighs large forecast errors more heavily than the mean absolute error does. As a result, the RMSE may be particularly useful in forecasting situations where large forecast errors are disproportionately more serious than small errors.

Observe that for the daily data set, the RMSE statistic of all the GARCH models is 0.134 and the MAE statistic is 0.011. These values are substantially lower than the lowest RMSE (1.227) and MAE (0.319) statistics from the Markov-Switching models for this data set. The RMSE value for the GARCH, EGARCH and FIEGARCH models based on 350 observations from the overnight data set is 0.001, while the MAE statistic is either 0.0003 or 0.0005 for these models. Once again, these values are substantially lower than the lowest RMSE (1.184) and MAE (0.329) statistics obtained from the Markov-Switching models for this data set. Put together, the RMSE and MAE statistics discussed above indicate that all the GARCH, EGARCH and FIEGARCH models provide more accurate forecasts and fit the data better than any of the Markov-Switching models.
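The two loss criteria used above can be sketched as follows. The function names and the toy numbers are hypothetical and serve only to show how the RMSE and the MAE can rank the same pair of forecasts differently when one forecast contains a single large error.

```python
import math


def rmse(realized, forecast):
    """Root mean square error between realized and forecasted values."""
    errors = [r - f for r, f in zip(realized, forecast)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(realized, forecast):
    """Mean absolute error between realized and forecasted values."""
    errors = [r - f for r, f in zip(realized, forecast)]
    return sum(abs(e) for e in errors) / len(errors)


# Toy realized volatility and two competing forecasts (illustrative values only).
realized = [1.0, 2.0, 3.0, 4.0]
forecast_a = [1.5, 1.5, 3.5, 3.5]  # four moderate errors of 0.5 each
forecast_b = [1.0, 2.0, 3.0, 6.0]  # three exact forecasts and one error of 2.0

for name, fc in (("A", forecast_a), ("B", forecast_b)):
    print(name, "RMSE:", rmse(realized, fc), "MAE:", mae(realized, fc))
```

Both toy forecasts have the same MAE, but forecast B has the larger RMSE because the quadratic loss penalizes its single large miss more heavily.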
To check and compare the out-of-sample volatility forecasts, we also estimated the following ex post volatility regression62 for each GARCH and Markov-Switching model,

$$\sigma_t^2 = \alpha + \beta \hat{\sigma}_{t-1}^2 + u_t \tag{18}$$

In (18), the measure of ex post (i.e. realized) volatility, $\sigma_t^2$, is the square of the log change of daily closing prices (x 100) of the S&P 500 index for the daily data set and the square of the log change of prices (x 100) of S&P 500 futures for the overnight data set. The term $\hat{\sigma}_{t-1}^2$ denotes the forecasted conditional variance derived from the estimated out-of-sample GARCH and Markov-Switching models.63

The procedure that we adopted to estimate the above volatility/variance regression is as follows. We first estimated each of the GARCH (Table 3), GARCH, EGARCH, FIEGARCH (Table 4) and Markov-Switching models (Tables 5 and 6) on the out-of-sample observations from the daily and overnight data sets. We then derived the forecasted conditional variance for each of these estimated models. The volatility regression in (18) was estimated for each model by using its forecasted conditional variance. After estimating (18) for each model, we checked whether $\hat{\alpha} = 0$ and $\hat{\beta} = 1$ in each case, because these hypothesized values indicate that the relevant model provides perfect forecasts. Note that if the estimated slope for a model is $\hat{\beta} > (<)\ 1$, then that model underestimates (overestimates) the true realized volatility in the data (Pagan and Schwert 1990: 283).

62 This regression is also known as the Mincer-Zarnowitz (1969) regression.
63 In Appendix B, we describe how we estimated $\hat{\sigma}_{t-1}^2$ for our Markov-Switching model.

Results from estimating (18) for each GARCH model for the daily data set are reported in Panel B, Table 7. The estimated coefficients, $\hat{\alpha}$, for all the GARCH models are well above 0. The estimated $\hat{\beta}$ coefficients of the three GARCH models (-0.07, -0.068, 0.13) are insignificant and below 1.
This indicates that the GARCH models do not provide accurate volatility forecasts for this data set. The estimated $\hat{\alpha}$ and $\hat{\beta}$ for each Markov-Switching model for the daily data set (see Panel B, Table 8) fare much worse. The intercept estimates $\hat{\alpha}$ for all these models are far below 0, while the insignificant slope coefficient estimates $\hat{\beta}$ of the Markov-Switching models are much higher than 1. This demonstrates that there is a substantial bias in the forecasts from the Markov-Switching models and that these models underestimate the degree of volatility in the daily data set. Moreover, it suggests that the GARCH models provide relatively better forecasts than the Markov-Switching models.

The estimated volatility regressions for the GARCH, EGARCH and FIEGARCH models from the overnight data set (see Panel B, Table 7) provide mixed results. The estimated $\hat{\alpha}$ of all the GARCH, EGARCH and FIEGARCH models is approximately equal to zero. It is also significant for the EGARCH (p(gore)) and FIEGARCH (entropy) models. The estimates of $\hat{\beta}$ for the FIEGARCH models are disappointing since they are insignificant and well below 1. However, the estimated $\hat{\beta}$ for the EGARCH model is slightly encouraging because it is significant and only marginally different from 1. The estimates of $\hat{\alpha}$ for all the Markov-Switching models in the overnight data set are again well below their hypothesized value of 0 (see Panel B, Table 8). Likewise, the slope coefficient estimates $\hat{\beta}$ of the Markov-Switching models are substantially higher than 1. In sum, the volatility regression results for all models in both data sets are disappointing.
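The ex post volatility regression in (18) is a one-regressor least squares fit, which can be sketched as below. The function name and the toy series are hypothetical; the sketch assumes the realized and forecasted series have already been aligned so that each realized value is paired with the forecast made for it, and it omits the standard errors needed for significance tests.

```python
def mincer_zarnowitz(realized, forecast):
    """OLS intercept and slope from regressing realized on forecasted variance."""
    n = len(realized)
    mean_x = sum(forecast) / n
    mean_y = sum(realized) / n
    sxx = sum((x - mean_x) ** 2 for x in forecast)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(forecast, realized))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Toy forecasted variances (illustrative values only).
forecast = [0.5, 1.0, 1.5, 2.0]

# Case 1: realized variance equals the forecast, so alpha = 0 and beta = 1.
alpha_ok, beta_ok = mincer_zarnowitz(forecast, forecast)

# Case 2: realized variance is twice the forecast, so beta = 2 (forecast too low).
realized_high = [2.0 * f for f in forecast]
alpha_hi, beta_hi = mincer_zarnowitz(realized_high, forecast)

print(alpha_ok, beta_ok, alpha_hi, beta_hi)
```

A slope well above 1 with an intercept near 0, as in the second toy case, is the pattern that corresponds to forecasts that are systematically too low.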
Yet, combined with the RMSE and MAE statistics, the volatility regression results show that: (i) the GARCH and especially the EGARCH models outperform the Markov-Switching models with respect to out-of-sample forecasts in both the daily and overnight data sets, and (ii) the EGARCH model provides the most accurate volatility forecasts compared to the FIEGARCH and Markov-Switching models.

Our out-of-sample error and volatility forecast results are surprising. They challenge the claim that Markov-Switching models provide more accurate forecasts of stock price volatility than various linear and non-linear GARCH models (Kim, Morley and Nelson 2002; Van Norden and Schaller 1993; Sola and Timmerman 1994).64 The analysis presented in this section thus raises a key question: Why do Markov-Switching models provide poor volatility forecasts in our study? We provide brief answers to this question in the conclusion.

64 Our out-of-sample forecasts are similar to those of Akgiray (1989) and Franses and Van Dijk (1996), who found that GARCH models fit stock market data better than random-walk models. Our analysis differs from their work in that we compare the out-of-sample forecasts of various GARCH and Markov-Switching models, which the aforementioned scholars do not.

5. Conclusion

This paper makes two main contributions. First, unlike the existing literature, we argued and empirically demonstrated that anticipation of a Democratic victory decreases stock price volatility. Second, in sharp contrast to methodological claims in the literature, our out-of-sample forecasts show that the GARCH and EGARCH models provide more accurate forecasts of stock volatility than the Markov-Switching models. We posit three plausible reasons below to explain why the Markov-Switching models provided the worst forecasts.
First, the Markov-Switching models that we have estimated here, which are also commonly used in the empirical literature, cannot account for the presence of ARCH and GARCH effects within each regime. This is problematic since we know from the estimates of all the GARCH models that the ARCH and GARCH terms significantly affect the conditional variance and that this, in turn, plays an important role in determining the degree of realized volatility. Thus, by not taking into account the effects of ARCH and GARCH terms on the conditional variance, the Markov-Switching models underestimate the degree of realized volatility in both the daily and overnight data sets.65

Second, on more technical grounds, it is well known that the Markov-Switching model places an upper bound on the conditional variance that is too small (Pagan and Schwert, 1990: 283). As a result, volatility estimates from the Markov-Switching models are typically too low, which weakens their ability to predict realized volatility. This problem is evident in our analysis, where the estimates of $\hat{\beta}$ from the volatility regressions of the Markov-Switching models show that these models seriously underestimate the degree of volatility in both data sets.

Third, in comparison to the Markov-Switching model, the GARCH, EGARCH and FIEGARCH models estimated here are not only better at capturing the persistence of volatility,66 but can also account for the differential effects of positive and negative shocks on stock price volatility. This is crucial since volatility in daily and especially tick (overnight) data tends to persist over long periods of time. The persistence of volatility, in all likelihood, affects the conditional variance and hence the degree of realized volatility. Further, it is possible that differential effects of shocks that stem from both “good” and “bad” news about the expected electoral outcome exert a powerful influence on realized volatility. The inability of Markov-Switching models to account for volatility persistence and the differential effects of shocks could be a key cause of their poor forecasting performance.

65 Instead of estimating a “plain-vanilla” Markov-Switching model, it might have been more appropriate to estimate Markov-Switching-GARCH models (Dueker 1997, Klaassen 2002).
66 Note that, by construction, the switch between the states in our Markov-Switching model is determined by a first-order Markov process. Combined with the absence of ARCH and GARCH terms within regimes, this impairs the ability of Markov-Switching models to account for temporal persistence of volatility.

Although we found that Markov-Switching models seriously underestimate the degree of stock market volatility, we need to conduct more extensive out-of-sample forecasting tests and use an experimental design that samples the data at different intervals to confirm our methodological claims. We also need to estimate all the statistical models used here on different data sets, i.e. on data of stock price movements from other Presidential election years and from other advanced industrial democracies such as Britain. If the parameter estimates and out-of-sample forecasts obtained from different data sets are similar to those reported here, then the methodological and theoretical claims that we have posited in this paper will be truly generalizable.

Appendix A

Proof of Lemma 1: Since the market maker uses the linear pricing rule, $p(n, s) = b(s) + \alpha(s) n$, the $i$th trader maximizes

$$E[(v - b(s) - \alpha(s) n)\, x_i(I, s, v) + q_i \mid v, s] = \Big[v - b(s) - \alpha(s) \sum_{j \neq i} x_j(I, s, v)\Big] x_i(I, s, v) + q_i - \alpha(s) [x_i(I, s, v)]^2 \tag{1}$$

where $n = \sum_{j} x_j(I, s, v) + q$.
The first-order condition with respect to xi ( I , s, v) is, j [v − b( s ) − α ( s )∑ xi ( I , s, v) − 2α ( s)[ xi ( I , s, v)] = 0 (2) j ≠i and the second-order condition is − 2α ( s ) < 0 or α ( s ) > 0 .From (2), we obtain after some algebra 1 [v − b( s ) − α ( s ) ∑ x j ( I , s , v )] α ( s) j 1 x ( I , s, v ) = [v − b( s)] ( I + 1)α ( s) xi ( I , s , v ) = (3) (4) 1 . The market-maker treats the linear strategy of traders as given and sets ( I + 1)α ( s) p (n, s) = E [v | s, n] . Since elliptically contoured distributions have linear conditional expectations Hence γ ( s ) = (see property (3) in Appendix B), we obtain E [v − E[v | s ] | s, n] =   Iγ ( s )Θ | s I n− ( E[v | s ] − b( s )  2 2 σ | s + I γ ( s) Θ | s  ( I + 1)α ( s )  2 q This implies that, α ( s) = Iγ ( s )Θ | s σ | s + I 2γ ( s ) 2 Θ | s (5) (6) 2 q   I (7) b( s ) = E[v | s ] − α ( s )  ( E[v | s ] − b( s )  ( I + 1)α ( s )  This yields b( s ) = E[v | s ] . Rearranging (6), we get Iγ ( s)Θ | s = α ( s)[σ q2 | s + I 2γ ( s) 2 Θ | s ] . From γ ( s) and the conditional variance matrix in equation (3) in the text, we obtain,   I2 I Θ + σ q2  = Θ 2 2  ( I + 1) α ( s )  ( I + 1)α ( s ) α ( s)  α (s) = α = where | Θ | > 0 and σq I Θ and γ ( s ) = γ = (1 + I ) s I 2σ q s I Θ (8) (9) Θ > 0 . Q.E.D Proof of Corollary to Lemma 1: Part (i) Let s′ ∈ S , s′′ ∈ S and s ∈ S , such that s ′′ ≥ s ′ ≥ s . Suppose further that | s′ − s |≠ ∅ and | s"− s |≠ ∅ . Since S is bounded, it follows that, α (| s ′ − s |) = σq I (1 + I ) | s ′ − s | Θ≥ σq I (1 + I ) | s ′′ − s | 38 Θ = α (| s ′′ − s |) and γ (| s′ − s |) = I 2σ q | s′ − s | I Θ Then α (| s ′ − s |) = γ (| s′ − s |) = ≥ I 2σ q | s′′ − s | I Θ σq I Θ≤ (1 + I ) | s ′ − s | I 2σ q | s′ − s | I Θ ≤ = γ (| s′′ − s |) . Now suppose that s ′′ ≤ s ′ ≤ s . I 2σ q | s′′ − s | I Θ I 2σ q | s′′ − s | I Θ = γ (| s′′ − s |) and = γ (| s′′ − s |) . ∴ α , γ are monotonic in s. 
Part (ii) Differentiating α with respect to I, we obtain, ( ) ∂α σ q s Θ 1 / 2 I + I / 2 I − I = >0 ∂I ( s + Is) 2 where Θ > 0 . Differentiating γ with respect to I, we obtain, (10) ( ) ∂γ σ q I Θ 3I / 2 I = > 0 . Q.E.D ∂I IΘs 2 Proof of Lemma 2: We prove that Var (v − p 0 | s ) is monotonic in s given n and s. Let the signal s A denote the arrival of new political information where s A ≠ s 0 . Define the variance of n as σ n =| n − n 0 | where n0 is the mean level of net order flows. From Property 5 of ECC distributions (see Appendix B),  (| n − n |2 ) (| s − s |2 )  σ2 (11) Var (v − p0 | s ) = g  2 2 0 2 + A 2 0  2 2 q 2 Θ  I γ Θ +σq  I γ Θ +σq σ s A   From (11), observe that Var (v − p0 | s ) monotonically increases when | s A − s 0 | increases. Also observe that Var (v − p 0 | s ) monotonically increases when σ n increases. Q.E.D. Proof of Proposition 1: We use a multivariate t distribution (which is also an ECC distribution) with k degrees of freedom and x =3 variables to prove that volatility increases under higher uncertainty, i.e. when | s A − s 0 | increases. The conditional density for the multivariate t distribution is d ( s , v, q ) = Γ[1 / 2(k + x)] (det ∑) −1 / 2 Γ[1 / 2k ]((k − 2)π ) x / 2 −(k +v) / 2   s A − s0    1  ( s A − s0 v − p0 q ) ∑ −1  v − p0  × 1 + k −2  q     (12) Given (12), the conditional variance between v and the unconditional expected price is, 2  (| s − s | 2 )    Θ = k − 2 1 k + (| s A − s 0 | ) k  Θ (13) Var[v − p 0 || s A − s 0 |] = h A 2 0 2   − − k k k 1 2 σ σ sA sA     Observe in (13) that an increase in the deviation of s A from it’s mean s0 will lead to an increase in the conditional variance of the traded stock’s price. In footnote 28, we claimed that σ n increases when I | s A − s 0 | increases. Note that n = ∑ xi ( I , (| s A − s0 |), v] + q(| s A − s0 |) . Since xi (.) and q are i =1 monotonic in | s A − s 0 | , an increase in | s A − s 0 | increases the variance of xi (.) 
, q and n. Q.E.D 39 We state and prove in the following Lemma (which we term as Chu’s Lemma), Chu’s (1973) first and second characterization of the expected absolute value of the random variable a that is drawn from an elliptically contoured distribution. This characterization is used to prove propositions 2 and 3 Chu’s Lemma: Let the random variable “a” be drawn from an elliptically contoured description with mean 0. Then   σa  z dW ( z )  2π ∞  z 2 dW ( z )1 2    ∫0 ∞ E[| a |] is a monotonic function of its variance σ and E[| a |] = ∫ 2 2 a 0 where z represents the random variable that determines the variance of the independent random variable “a”. From Chu’s (1973) second characterization (see Property 5, Appendix B), we obtain E[| a |] = where 2 2 π E[ Z ] σa (E[Z]1 2 Z represents the random variable that multiplies the unit normal in Chu’s (1973) second characterization. Proof: From Royden (1968: 270) and Chu’s (1973) characterization of ECC distributions, we get, +∞ ∞ ∞ ∞ −∞ −∞ 0 −∞ 0 E[| a |] = ∫ | a | l (a )da = ∫ | a | ∫ j (a, z )dW ( z )da ≡ ∫ ∫ ∞ | a | j (a, z )da dW ( z ) (14) where dW(z) is a finite measure and | a | j (a, z ) is a positive function. Equation (14) implies that,   σa  z dW ( z )  2π ∞  z 2 dW ( z )1 2    ∫0 ∞ E[| a |] = ∫ 2 0 (15) In (15), one can observe that E[| a |] is monotonic in σ a . The second characterization of E[| a |] when it belongs to the compound class of normal distributions (taken from Chu (1973) is, E[| a |] = 2 2 π E[ Z ] σa (E[Z]1 2 (16) Proof of Proposition 2: We prove this proposition by assuming that | sˆR − sR | increases and by using Chu’s Lemma. Since variance in the stock price is influenced by ŝR --which is a random variable – we let Var[v − p | (| sˆ R − s R |) = a = U × e and E (Var[v − p | (| sˆ R − s R |)) = E[U | (| sˆ R − s R |) . 
From Chu’s(1973) characterization of ECC distributions in Property 5 and the proof of Chu’s lemma: ∞ E[U | (| sˆ R − s R ∫ Zr (Z ) g (| sˆ |) = ∫ r (Z ) g (| sˆ 0 R ∞ R 0 − s R |) / Z )dz − s R |) / Z )dz (17) where r(.) is the density function of z and g the normal density function of (| sˆ R − s R |) / Z . Now observe that ∂E[U | (| sˆR − sR |) ∂E[U | (| sˆR − sR |) > 0 when > 0 . From (17), note that ∂ (| sˆR − sR |) ∂ (| sˆR − sR |) 2 ∞ ∞ ∂E[U | (| sˆR − sR |) = ∫ Zr ( Z ) g (| sˆR − sR |) / Z )dz × ∫ 1 / Zr ( Z ) g (| sˆR − sR |) / Z )dz 2 0 0 ∂ (| sˆR − sR |) ∞ + ∫ r ( Z ) g (| sˆR − sR |) / Z )dz 0 Equation (18) is strictly positive which implies that ∂E[U | (| sˆR − sR |) > 0 Q.E.D. ∂ (| sˆR − sR |) 2 40 (18) Proof of Lemma 3: From Lemma 1, α = σq I (1 + I )(| sˆ R − s R |) Θ , p(n, s ) = αn + E [v | s] . Hence, ∂p(.) ∂I ∂p(.) ∂n ∂α < 0 , λ = (| sˆ R − s R |) . For α and p(n,s), we obtain the Jacobian, J =  . ∂α ∂n  ∂λ  ∂α ∂I Note that ∂α ∂I = 0, ∂α ∂n > 0, ∂p (.) ∂I = 0 and ∂p (.) ∂n > 0. Hence |J|< 0. From the ∂p(.) ∂λ ∂p(.) ∂n dp(.) dp(.) implicit function theorem = − | J | which ⇒ > 0. Q.E.D  ∂α ∂n  dλ dλ  ∂α ∂λ Lemma 4: When | sˆ R − s R | increases, then p(n,s) can revert to its mean p0 . Proof of Lemma 4: Note that if p (n, s ) = E[v | s ] , then p (n, s ) = p0 . From equation (9) in text and monotonicity of γ and α in s, it follows that γ and α decreases when | sˆ R − s R | increases. Suppose lim | sˆR − sR | j → ∞ . Then lim α j → 0 which ⇒ for lim , p (n, s) = E[v | s ] = p0 Q.E.D j →∞ j →∞ j →∞ Proof of Proposition 3: Define s′D ∈ S such that | s′D − sD | < | sˆD − sD | . From (18), this implies ∂E[U | (| s′D − sD |) ∂E[U | (| sˆD − sD |) < . If we continue this process and define s′D′ ∈ S ∂ (| s′D − sD |) ∂ (| sˆD − sD |2 ) ∂E[U | (| s′D′ − sD |) ∂E[U | (| s′D − sD |) < . 
such that $|s''_D - s_D| < |s'_D - s_D|$, then the corresponding inequality holds for the derivatives taken with respect to $|s''_D - s_D|^2$ and $|s'_D - s_D|^2$. Taking the limit $\hat{s}_D \to s_D$, we find that $\lim_{\hat{s}_D \to s_D} Var[v - p \mid (|\hat{s}_D - s_D|)]$ strictly decreases. Q.E.D

Lemma 5: Suppose $\lim_{\hat{s}_D \to s_D} |\hat{s}_D - s_D| \to 0$. Then $\gamma(|\hat{s}_D - s_D|)$ strictly decreases as $\lim_{\hat{s}_D \to s_D} |\hat{s}_D - s_D| \to 0$.

Proof of Lemma 5: Take the expression for $\gamma(s) = \gamma$ from Lemma 1 and substitute $|\hat{s}_D - s_D| = s$. Differentiating with respect to $|\hat{s}_D - s_D|$ leads to $\partial \gamma(\cdot) / \partial(|\hat{s}_D - s_D|) < 0$. Suppose $\hat{s}_D < s_D$ or $\hat{s}_D > s_D$. Since $\partial \gamma(\cdot) / \partial(|\hat{s}_D - s_D|) < 0$, $\gamma(\cdot)$ strictly decreases as $\lim_{\hat{s}_D \to s_D} |\hat{s}_D - s_D| \to 0$. Q.E.D

Lemma 6: $n$ and $\alpha(|\hat{s}_D - s_D|)$ strictly decrease in $|\hat{s}_D - s_D|$ as $\lim_{\hat{s}_D \to s_D} |\hat{s}_D - s_D| \to 0$.

Proof of Lemma 6: Take the expression for $\alpha(s) = \alpha$ from Lemma 1 and substitute $|\hat{s}_D - s_D| = s$. Differentiating with respect to $|\hat{s}_D - s_D|$ shows that $\partial \alpha(\cdot) / \partial(|\hat{s}_D - s_D|) < 0$, which implies that $\alpha(\cdot)$ strictly decreases as $\lim_{\hat{s}_D \to s_D} |\hat{s}_D - s_D| \to 0$. When $\alpha(|\hat{s}_D - s_D|)$ strictly decreases, $x_i(\cdot)$ decreases for each $i$ and, in the aggregate, $n$ decreases. Q.E.D

Appendix B

The FIEGARCH (1,d,1) Model: To motivate the discussion, rewrite the expression for the conditional variance in equation (13) by dropping the exogenous variables, adding $\varepsilon_t^2$ to both sides, and moving $\sigma_t^2$ to the right hand side:

$$\varepsilon_t^2 = \omega + (\alpha_1 + \beta_1) \varepsilon_{t-1}^2 + \nu_t - \beta_1 \nu_{t-1} \tag{B1}$$

where $\nu_t = \varepsilon_t^2 - \sigma_t^2$. Note that the GARCH (1,1) representation in (B1) can be thought of as an ARMA model for $\varepsilon_t^2$. It follows that the GARCH (1,1) model is covariance stationary iff $\alpha_1 + \beta_1 < 1$. In high frequency data, $\alpha_1 + \beta_1 = 1$, which is an Integrated GARCH (IGARCH) model since $\alpha_1 + \beta_1 = 1$ implies a unit root for $\varepsilon_t^2$ in (B1). As in an ARMA model, the existence of a unit root means that shocks to the conditional variance die out very slowly (Bollerslev et al. 1992:15).
This is in contrast with the expectation that volatility is mean reverting. An IGARCH model can be rewritten as:

$$\phi(L)(1 - L) \varepsilon_t^2 = \omega + \nu_t - \beta_1 \nu_{t-1} \tag{B2}$$

To deal with long memory in volatility, Baillie, Bollerslev and Mikkelsen (1996) introduce the Fractionally Integrated GARCH or FIGARCH (p,d,q) model. Denoting the fractional integration parameter by $d$ and adding $d$ to the first difference operator, Baillie et al (1996) derive the FIGARCH model:

$$\phi(L)(1 - L)^d \varepsilon_t^2 = \omega + \nu_t - \beta_1 \nu_{t-1} \tag{B3}$$

As in standard ARIMA type models, the fractional differencing parameter, $d$, indicates the speed at which shocks to $\varepsilon_t^2$ die out over time. Baillie et al (1996) and Bollerslev and Mikkelsen (1996) find that the FIGARCH model fits high frequency data quite well.

The second problem with the standard GARCH model is that it assumes that positive and negative innovations/errors ($\varepsilon_t$) have the same effect on the conditional variance. This is problematic since a positive (negative) shock or innovation may have a larger effect on the conditional variance than a negative (positive) one. This phenomenon is quite common in the analysis of stock (or currency) returns when investors engage in herding behavior: a negative shock leads to larger volatility than a positive shock. To deal with asymmetric effects of shocks, Nelson (1991) developed an Exponential GARCH (EGARCH) model, which from equation (12) can be written as follows:

$$\ln(\sigma_t^2) = \omega + \alpha z_{t-1} + \gamma_1 (|z_{t-1}| - E(|z_{t-1}|)) + \beta_1 \ln(\sigma_{t-1}^2) \tag{B4}$$

where $z_t$ represents standardized innovations ($\varepsilon_t / \sigma_t$), and $E$ is the expectations operator. In (B4), the conditional variance is a function of four terms: the constant, the GARCH term ($\sigma_{t-1}^2$) and two ARCH terms, an asymmetric component ($z_{t-1}$) and a symmetric component $(|z_{t-1}| - E(|z_{t-1}|))$. Consider $(|z_{t-1}| - E(|z_{t-1}|))$, which measures deviations between realized and expected innovations and can hence capture how unexpected innovations affect conditional volatility.
$\gamma$ is typically greater than zero. Thus, if $|z_t| > E|z_t|$, then future volatility will be higher than its average level. However, if $|z_t| < E|z_t|$, then future volatility will be lower than average. The term $\alpha z_{t-1}$ provides for the asymmetric effect of the standardized innovations. If $\gamma > 0$, then a positive (negative) value for $\alpha$ implies that positive (negative) shocks will have a larger effect on future volatility than negative (positive) shocks.

Bollerslev and Mikkelsen (1996) introduce the Fractionally Integrated Exponential GARCH model (FIEGARCH). This approach combines the FIGARCH and EGARCH specifications. Rewriting equation (B4) using the backshift operator:

$$\ln(\sigma_t^2) = \omega + [1 - \beta(L)]^{-1} [1 + \alpha(L)]\, g(z_{t-1}) \tag{B5}$$

where $g(z_t) = \theta z_t + \gamma [|z_t| - E|z_t|]$. Factorizing the autoregressive polynomial $[1 - \beta(L)] = \phi(L)(1 - L)^d$, Bollerslev and Mikkelsen (1996) derive the FIEGARCH model:

$$\ln(\sigma_t^2) = \omega + \phi(L)^{-1} (1 - L)^{-d} [1 + \alpha(L)]\, g(z_{t-1}) \tag{B6}$$

Derivation of Log-Likelihood for Markov-Switching Model: We briefly describe the log-likelihood for the Markov-Switching model with an AR(1) specification. For more details, readers can consult Diebold, Lee and Weinbach (1994: 285-289) and Hamilton (1994: 690-695). Let $\theta_1 = (\mu_1, \sigma_1^2, \phi)'$ and $\theta_2 = (\mu_2, \sigma_2^2, \phi)'$ denote the parameter vectors to be estimated. Define $\theta = (\theta_1, \theta_2)$ and $t = 1, 2, \ldots, T$. Let $\Pr(s_1 = 1) = \rho$ denote the probability that at $t = 1$ the state is 1. Hence, $\Pr(s_1 = 2) = 1 - \rho$ denotes the probability that at $t = 1$ the state is 2. Define the indicator function at $t = 1$ as $I(s_1 = 1)$ for state 1 and $I(s_1 = 2)$ for state 2. To avoid notational confusion, we let $\Delta p_t = y_t$. Finally, note that $p_{11,t} = \Pr(s_t = 1 \mid s_{t-1} = 1)$, $1 - p_{11,t} = \Pr(s_t = 2 \mid s_{t-1} = 1)$, $p_{22,t} = \Pr(s_t = 2 \mid s_{t-1} = 2)$ and $1 - p_{22,t} = \Pr(s_t = 1 \mid s_{t-1} = 2)$, where $p_{11,t} = \exp(x'_{t-1} \beta_1) / [1 + \exp(x'_{t-1} \beta_1)]$.
Using Diebold, Lee and Weinbach's (1994: 285-289) terminology, we can define the “complete-data likelihood” of the Markov-Switching model in log form as:

$$
\begin{aligned}
\log f(y_T \mid y_{T-1}, s_T; \theta) ={}& I(s_1 = 1)[\log \rho + \log f(y_1 \mid s_1 = 1; \theta_1)] + I(s_1 = 2)[\log(1 - \rho) + \log f(y_1 \mid s_1 = 2; \theta_2)] \\
&+ \sum_{t=2}^{T} \big\{ I(s_t = 1) \log f(y_t \mid y_{t-1}, s_t = 1; \theta_1) + I(s_t = 2) \log f(y_t \mid y_{t-1}, s_t = 2; \theta_2) \\
&\qquad + I(s_t = 1, s_{t-1} = 1) \log(p_{11,t}) + I(s_t = 2, s_{t-1} = 1) \log(1 - p_{11,t}) \\
&\qquad + I(s_t = 2, s_{t-1} = 2) \log(p_{22,t}) + I(s_t = 1, s_{t-1} = 2) \log(1 - p_{22,t}) \big\}
\end{aligned}
\tag{B7}
$$

Since in practice the complete data cannot be observed, we require the “incomplete-data” log likelihood, which can be obtained by summing over all possible state sequences:

$$\log f(y_T \mid y_{T-1}, x_T; \theta) = \log \left( \sum_{s_1 = 1}^{2} \sum_{s_2 = 1}^{2} \cdots \sum_{s_T = 1}^{2} f(y_T \mid y_{T-1}, x_T, s_T; \theta) \right) \tag{B8}$$

It is intractable to maximize the above log-likelihood directly with respect to $\theta$. We use the same EM algorithm as used by Diebold, Lee and Weinbach (1994: 287-288) and Hamilton (1994:pp#) for maximization of the incomplete-data likelihood. Interested readers can refer to the pages in Diebold et al (1994) where the EM algorithm is described.

Calculation of Conditional Variance, $\hat{\sigma}_t^2$, From Markov-Switching Model: We show how to compute $\hat{\sigma}_t^2$ from our Markov-Switching model; obtaining $\hat{\sigma}_{t-1}^2$ is trivial given $\hat{\sigma}_t^2$. Let $\mu_{S_t} = \alpha_1 S_t + \alpha_2$ and $\sigma_{S_t}^2 = \omega_1 S_t + \omega_2$; recall $\varepsilon_t \sim N(0, \sigma_{S_t}^2)$. Suppose that the stock price index was in regime 1 at $t-1$. From Pagan and Schwert (1990: 275), $\hat{\sigma}_t^2$ is derived from

$$[E\{\sigma(S_t) \mid S_{t-1} = 1\}]^2 + \mathrm{var}\{\sigma(S_t) \mid S_{t-1} = 1\} + E\{[\mu(S_t) - E(\mu(S_t))]^2 \mid S_{t-1} = 1\}$$

which yields,

$$[\omega_2 + \omega_1 p_{11}]^2 + \omega_1^2 p_{11}(1 - p_{11}) + \alpha_1^2 p_{11}(1 - p_{11}) \tag{B9}$$

Suppose that the stock price index was in regime 2 at $t-1$.
From Pagan and Schwert (1990: 275), $\hat{\sigma}_t^2$ is derived from

$$
[E\{\sigma(S_t) \mid S_{t-1} = 2\}]^2 + \mathrm{var}\{\sigma(S_t) \mid S_{t-1} = 2\} + E\{[\mu(S_t) - E(\mu(S_t))]^2 \mid S_{t-1} = 2\}
$$

i.e. from

$$
[\omega_2 + \omega_1 (1 - p_{22})]^2 + \omega_1^2 p_{22}(1 - p_{22}) + \alpha_1^2 p_{22}(1 - p_{22})
\tag{B10}
$$

Multiplying (B9) and (B10) by the estimates of the conditional probabilities of being in each regime given data through t-1 gives the estimate of the conditional variance $\hat{\sigma}_t^2$, which is then lagged to obtain $\hat{\sigma}_{t-1}^2$.

Table 1: Data, Variables and Sources

| Sample | Variable | Measure | Source |
|---|---|---|---|
| Daily | Return on S&P 500 Index | (d(log(sp)))*100, daily closing price | finance.yahoo.com |
| Daily | Volume Traded | log(volume) | finance.yahoo.com |
| Daily | Before | Number of non-trading days prior to day t | |
| Daily | After | Number of non-trading days after day t | |
| Daily | Gore | Share of Gore's 2-party vote based on aggregation of polls | Wlezien (2001); Franklin (2001) |
| Overnight | Return on S&P 500 Futures | (d(log(sp)))*100, average price per minute | Chicago Mercantile Exchange GLOBEX database |
| Overnight | Volume Traded | log(volume), total volume traded per minute | Chicago Mercantile Exchange GLOBEX database |
| Overnight | Duration | Predicted duration between trades based on hazard model | |
| Overnight | Gore | Probability that Gore will win Electoral College | Simulation based on times called on air by CNN |
| Overnight | Information Arrival | Time when CNN called PA, OH, MI, IL, WI, MO, WA, LA | CNN |

Summary Statistics From the Daily Sample

| Variable | Mean | Std. Dev | Min | Max | N |
|---|---|---|---|---|---|
| Dlsp | 0.0092563 | 1.339722 | -6.00451 | 4.654578 | 219 |
| Before | 0.3926941 | 0.8356405 | 0 | 3 | 219 |
| After | 0.3926941 | 0.8356405 | 0 | 3 | 219 |
| Ldv | 20.73713 | 0.1593171 | 19.92897 | 21.13898 | 219 |
| Rpctgore | 47.79178 | 2.4058 | 41.92393 | 55.29616 | 219 |
| Entropy | 99.57449 | 0.4410992 | 97.39109 | 99.9999 | 219 |
| drpctgore | 0.0183875 | 1.963907 | -5.957844 | 6.304455 | 219 |

Summary Statistics From the Overnight Sample

| Variable | Mean | Std. Dev | Min | Max | N |
|---|---|---|---|---|---|
| dlsp | 0.0001917 | 0.0305541 | -0.2961127 | 0.1990449 | 985 |
| sq | 3.767513 | 8.12765 | 0 | 110 | 985 |
| st_dur | 0.558605 | 1.291843 | -6.94119 | 2.204567 | 985 |
| l5gore | 0.4392081 | 0.154233 | 0.021 | 0.657 | 985 |
| l5entropy | 0.8901627 | 0.2278682 | 0.082236 | 0.999984 | 985 |
| ltossup | 0.0050761 | 0.0711021 | 0 | 1 | 985 |

Table 2: State Probabilities, Polls and p-values (Overnight Sample)

| State | Sample Size | Gore % | Bush % | Gore/(Bush+Gore) | P-Value | Electoral Votes | Time Called by CNN (EST) |
|---|---|---|---|---|---|---|---|
| Alaska | 400 | 26 | 47 | 0.356 | 0.000 | 3 | 12:00am (B) |
| Alabama | 625 | 38 | 55 | 0.409 | 0.000 | 9 | 8:00pm (B) |
| Arkansas | 286 | 44 | 47 | 0.484 | 0.287 | 6 | 12:12am (B) |
| Arizona | 423 | 39 | 49 | 0.443 | 0.010 | 8 | 11:51pm (B) |
| California | 600 | 45 | 44 | 0.506 | 0.607 | 54 | 11:00pm (G) |
| Colorado | 400 | 38 | 47 | 0.447 | 0.017 | 8 | 11:41pm (B) |
| Connecticut | 447 | 48 | 32 | 0.600 | 1.000 | 8 | 8:00pm (G) |
| Delaware | 625 | 42 | 46 | 0.477 | 0.127 | 3 | 8:00pm (G) |
| Florida | 600 | 48 | 46 | 0.511 | 0.697 | 25 | see below |
| Georgia | 512 | 37 | 53 | 0.411 | 0.000 | 13 | 7:59pm (B) |
| Hawaii | 261 | 50 | 31 | 0.617 | 1.000 | 4 | 11:00pm (G) |
| Iowa | 603 | 44 | 43 | 0.506 | 0.609 | 7 | 5:00am (G) |
| Idaho | 633 | 30 | 56 | 0.349 | 0.000 | 4 | 10:00pm (B) |
| Illinois | 600 | 50 | 42 | 0.543 | 0.983 | 22 | 8:00pm (G) |
| Indiana | 600 | 30 | 53 | 0.361 | 0.000 | 12 | 6:00pm (B) |
| Kansas | 600 | 32 | 55 | 0.368 | 0.000 | 6 | 8:00pm (B) |
| Kentucky | 625 | 41 | 51 | 0.446 | 0.003 | 8 | 6:00pm (B) |
| Louisiana | 660 | 38 | 46 | 0.452 | 0.007 | 9 | 9:21pm (B) |
| Massachusetts | 401 | 52 | 30 | 0.634 | 1.000 | 12 | 8:00pm (G) |
| Maryland | 627 | 52 | 38 | 0.578 | 1.000 | 10 | 8:00pm (G) |
| Maine | 400 | 47 | 36 | 0.566 | 0.996 | 4 | 10:10pm (G) |
| Michigan | 600 | 51 | 44 | 0.537 | 0.964 | 18 | 9:24pm (G) |
| Minnesota | 1015 | 47 | 37 | 0.560 | 1.000 | 10 | 10:25pm (G) |
| Missouri | 600 | 46 | 46 | 0.500 | 0.498 | 11 | 10:47pm (B) |
| Mississippi | 625 | 41 | 52 | 0.441 | 0.002 | 7 | 8:00pm (B) |
| Montana | 628 | 37 | 49 | 0.430 | 0.000 | 3 | 10:00pm (B) |
| North Carolina | 625 | 41 | 48 | 0.461 | 0.024 | 14 | 8:14pm (B) |
| North Dakota | 586 | 35 | 47 | 0.427 | 0.000 | 3 | 9:00pm (B) |
| Nebraska | 1007 | 31 | 56 | 0.356 | 0.000 | 5 | 9:00pm (B) |
| N. Hampshire | 801 | 39 | 45 | 0.464 | 0.021 | 4 | 12:07am (B) |
| New Jersey | 843 | 41 | 36 | 0.532 | 0.970 | 15 | 8:00pm (G) |
| New Mexico | 425 | 45 | 45 | 0.500 | 0.498 | 5 | not called* |
| Nevada | 625 | 43 | 47 | 0.478 | 0.132 | 4 | 1:31am (B) |
| New York | 700 | 54 | 37 | 0.593 | 1.000 | 33 | 9:00pm (G) |
| Ohio | 600 | 43 | 50 | 0.462 | 0.032 | 21 | 9:19pm (B) |
| Oklahoma | 625 | 39 | 54 | 0.419 | 0.000 | 8 | 8:00pm (B) |
| Oregon | 600 | 45 | 44 | 0.506 | 0.607 | 7 | not called* |
| Pennsylvania | 600 | 50 | 42 | 0.543 | 0.983 | 23 | 9:24pm (G) |
| Rhode Island | 370 | 47 | 29 | 0.618 | 1.000 | 4 | 9:00pm (G) |
| South Carolina | 625 | 38 | 53 | 0.418 | 0.000 | 8 | 7:00pm (B) |
| South Dakota | 300 | 33 | 51 | 0.393 | 0.000 | 3 | 9:00pm (B) |
| Tennessee | 500 | 46 | 51 | 0.474 | 0.124 | 11 | 11:03pm (B) |
| Texas | 625 | 30 | 64 | 0.319 | 0.000 | 32 | 8:00pm (B) |
| Utah | 914 | 27 | 59 | 0.314 | 0.000 | 5 | 10:00pm (B) |
| Virginia | 625 | 41 | 49 | 0.456 | 0.013 | 13 | 7:33pm (B) |
| Vermont | 400 | 52 | 36 | 0.591 | 1.000 | 3 | 7:00pm (G) |
| Washington | 500 | 50 | 42 | 0.543 | 0.974 | 11 | 12:08am (G) |
| Wisconsin | 400 | 39 | 44 | 0.470 | 0.113 | 11 | 6:21am (G) |
| West Virginia | 536 | 39 | 41 | 0.488 | 0.280 | 5 | 10:46pm (B) |
| Wyoming | 412 | 32 | 57 | 0.360 | 0.000 | 3 | 9:00pm (B) |

*New Mexico and Oregon were not called before markets closed on November 8, 2000.

Table 2 (Continued): Florida

| Time | Event |
|---|---|
| 7:52pm | Called for Gore |
| 8:55pm | Toss-up (taken away from Gore) |
| 1:18am | Called for Bush |
| 2:58am | Toss-up (taken away from Bush) |

Table 3: Daily GARCH Models (N=218 in all models)

| | (1) | (2) | (3) |
|---|---|---|---|
| Mean Intercept | -0.009 (0.074) | 0.005 (0.075) | -0.001 (0.072) |
| Variance Intercept | -60.267* (0.378) | -82.08* (0.285) | -53.82* (1.81) |
| ARCH | 0.138 (0.084) | 0.134 (0.127) | 0.136 (0.139) |
| GARCH | 0.628* (0.144) | 0.726* (0.207) | 0.704* (0.254) |
| Before | -0.656 (0.739) | -1.09 (1.78) | -0.313 (0.818) |
| After | 0.572* (0.228) | 0.545 (0.308) | 0.822 (0.551) |
| log(Volume) | 3.14* (0.003) | 4.195* (0.015) | 2.49* (0.009) |
| Gore | -0.134* (0.007) | | |
| Entropy | | -0.07* (0.003) | |
| d(Gore) | | | -0.175 (0.154) |
| Diagnostics | p-value | p-value | p-value |
| LB(10) | 0.1742 | 0.1742 | 0.1742 |
| LB²(10) | 0.1050 | 0.0978 | 0.0965 |
| Jarque-Bera | 0.6201 | 0.6209 | 0.6207 |
| AIC | 1.24 | 1.24 | 1.24 |
| BIC | 1.36 | 1.38 | 1.37 |

Notes: Cell entries are maximum likelihood estimates with semi-robust standard errors in parentheses.
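Table 1 describes the overnight Gore variable as a simulated probability of winning the Electoral College, based on the times CNN called each state. The sketch below shows one way such a simulation could be set up from the columns of Table 2. It is illustrative only and is not the authors' procedure: it assumes the P-Value column can be read as the probability that Gore carries a state, that uncalled states are independent, and that a called state is awarded with certainty. The function name, arguments, and the `base_votes` convenience parameter are hypothetical.

```python
import random


def gore_win_probability(states, called, threshold=270, base_votes=0,
                         n_draws=10_000, seed=0):
    """Monte Carlo estimate of Pr(Gore reaches `threshold` electoral votes).

    states: dict mapping state name -> (prob_gore, electoral_votes)
    called: dict mapping state name -> "G" or "B" for states already called
    base_votes: votes credited to Gore outside `states` (assumption)
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_draws):
        votes = base_votes
        for name, (prob_gore, electoral_votes) in states.items():
            if name in called:
                # A called state is treated as decided (assumption).
                gore_takes = called[name] == "G"
            else:
                # An uncalled state is an independent Bernoulli draw (assumption).
                gore_takes = rng.random() < prob_gore
            if gore_takes:
                votes += electoral_votes
        if votes >= threshold:
            wins += 1
    return wins / n_draws


# A few rows taken from Table 2: (P-Value, Electoral Votes).
sample_states = {
    "Florida": (0.697, 25),
    "Iowa": (0.609, 7),
    "Wisconsin": (0.113, 11),
    "Oregon": (0.607, 7),
    "New Mexico": (0.498, 5),
}
# Hypothetical situation: Gore holds 240 votes elsewhere; Florida called for Bush.
estimate = gore_win_probability(sample_states, {"Florida": "B"}, base_votes=240)
```

Re-running the function each time a state is added to `called` traces out a probability path over election night of the kind plotted in Figure 1, under the stated assumptions.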
Table 4: Overnight GARCH Models (N=985 in all models)

(1) (2) -0.001* (0.0003) AR(1) MA(1) Mean Intercept Duration Variance Intercept ARCH GARCH (3) (4) (5) -0.0004 (0.0004) -0.0004 (0.0004) -0.0004 (0.0005) -0.0004 (0.0005) 0.025 (0.26) 0.413* (0.159) 0.279* (0.150) 0.274* (0.130) 0.268* (0.133) 0.066 (0.248) -0.239 (0.175) -0.151 (0.167) -0.131 (0.167) -0.132 (0.150) -0.0027* (0.0010) -0.0005 (0.001) -0.0005 (0.001) -0.0006 (0.002) -0.0005 (0.003) 0.0015* (0.00003) 0.158* (0.026) -1.64* (0.117) 0.30* (0.03) -3.09* (0.344) 0.459* (0.045) -3.67* (0.373) 0.468* (0.046) -4.603* (0.487) 0.443* (0.045) 0.686* (0.008) 0.801* (0.014) 0.025 (0.068) 0.049 (0.048) 0.060 (0.037) 0.063 (0.048) 0.089* (0.040) 0.120 (0.048) 0.081* (0.038) 0.120* (0.015) 0.018* (0.002) -0.406* (0.075) 0.121* (0.018) 0.332* (0.034) 0.080* (0.006) -1.11* (0.218) 0.124* (0.018) 0.350* (0.034) 0.080* (0.007) 0.071* (0.018) 0.399* (0.036) 0.085* (0.007) EGARCH fraction (d) Duration Volume P[Gore_{t-5}] 0.0001 (0.001) 0.001* (0.0001) -0.0024* (0.0004) Entropy_{t-5} 0.232* (0.100) Info Arrival_{t-5} 1.544* (0.700)

Diagnostics: LB(12) 0.5176 0.5043 0.5269 0.4764 0.4682; LB²(12) 0.0725 0.1151 0.1345 0.1254 0.1201; Jarque-Bera 0.0000 0.0000 0.0000 0.0000 0.0000; AIC -5257 -4735 -4784 -4769 -4784; BIC -5208 -4681 -4725 -4710 -4725

Notes: Cell entries are maximum likelihood estimates with semi-robust standard errors in parentheses.
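For the Markov-Switching models reported in the tables that follow, the conditional variance is computed from expressions (B9) and (B10). The sketch below transcribes those two expressions as printed and combines them using the probabilities of having been in each regime at t-1. Two points are assumptions rather than statements from the text: the two weighted terms are summed, and the function and argument names are hypothetical.

```python
def variance_given_previous_regime(omega1, omega2, alpha1, q):
    """Expression shared by (B9) and (B10).

    q is the probability that S_t = 1 given the regime at t-1:
    q = p11 after regime 1 (B9), q = 1 - p22 after regime 2 (B10).
    """
    return (omega2 + omega1 * q) ** 2 + (omega1 ** 2 + alpha1 ** 2) * q * (1.0 - q)


def conditional_variance(omega1, omega2, alpha1, p11, p22, prob_regime1):
    """Weight (B9) and (B10) by the regime probabilities at t-1.

    prob_regime1: estimated probability of regime 1 at t-1 given data
    through t-1. Summing the two products is an assumption.
    """
    from_regime1 = variance_given_previous_regime(omega1, omega2, alpha1, p11)
    from_regime2 = variance_given_previous_regime(omega1, omega2, alpha1, 1.0 - p22)
    return prob_regime1 * from_regime1 + (1.0 - prob_regime1) * from_regime2


# Hypothetical parameter values, chosen only to exercise the function.
sigma2_hat = conditional_variance(omega1=1.0, omega2=2.0, alpha1=3.0,
                                  p11=0.5, p22=0.5, prob_regime1=0.3)
```

Because q(1 - q) equals p22(1 - p22) when q = 1 - p22, a single helper covers both cases; applying `conditional_variance` period by period and lagging the result gives the series used in a volatility regression.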
Table 5: Markov-Switching Estimates for Daily Sample (N=218)

| Parameters | (1) Entropy | (2) Gore | (3) Information | (4) Volume |
|---|---|---|---|---|
| µ1 | 1.113 (0.647) | 0.392 (0.237) | 0.413 (0.289) | -0.141 (0.078) |
| µ2 | 0.702 (0.422) | 1.436 (0.203) | 1.031 (0.244) | 0.375 (0.082) |
| β1,1 | 2.254 (0.660) | -0.415 (0.308) | -0.397 (0.325) | 0.698 (0.110) |
| c1 | 1.710 (0.512) | 0.895 (0.556) | 0.593 (0.417) | 0.190 (0.146) |
| β2,1 | -0.471 (0.289) | 3.142 (0.395) | 2.119 (0.324) | 0.423 (0.265) |
| c2 | 1.587 (0.978) | 1.280 (0.764) | 0.977 (0.743) | 0.147 (0.136) |
| σ1² | 2.360 (0.315) | 0.183 (0.225) | 0.237 (0.149) | 0.598 (0.151) |
| σ2² | 0.325 (0.643) | 3.134 (0.512) | 1.874 (0.419) | 0.147 (0.102) |
| φ | 0.334 (0.036) | 0.359 (0.048) | 0.293 (0.051) | 0.264 (0.067) |
| p11,t | 0.966 (0.083) | 0.712 (0.186) | 0.664 (0.297) | 0.950 (0.214) |
| p22,t | 0.684 (0.194) | 0.974 (0.181) | 0.955 (0.214) | 0.708 (0.129) |
| Log Likelihood | -245.342 | -230.246 | -211.348 | -316.022 |
| Ljung-Box Q-statistic LB-1 | 0.141 (0.718) | 0.157 (0.697) | 0.125 (0.724) | 0.158 (0.530) |
| Ljung-Box Q-statistic LB-3 | 0.925 (0.428) | 0.976 (0.411) | 0.912 (0.433) | 0.934 (0.522) |
| AIC | 342.45 | 394.73 | 382.25 | 376.23 |
| BIC | 430.22 | 422.15 | 417.78 | 439.06 |

Wald Tests: H0: µ1 = µ2, 7.97**; H0: σ1² = σ2², 12.85**; H0: p22 = 1 − p11, 59.61**.
LRT Tests: Garcia 85.14**; Hansen 5.98**.

Notes: Standard errors reported in parentheses. * 5% level, ** 1% level.

Table 6: Markov-Switching Estimates for Overnight Sample (N=985)

Parameters: µ1, µ2, β1,1, c1, β2,1, c2, σ1², σ2², φ, p11, p22

(5) Entropy 1.462 (0.852) 0.341 (0.536) (6) Gore 0.282 (0.165) 1.012 (0.314) (7) Bush 0.564 (0.307) 0.196 (0.288) (8) Information 0.555 (0.321) 0.482 (0.346) (9) Volume -0.156 (0.081) 0.450 (0.092) 1.941 (0.299) 1.027 (0.673) -0.335 (0.299) 0.583 (0.337) 2.424 (0.307) 0.632 (0.422) 0.276 (0.185) 0.373 (0.191) 0.677 (0.122) 0.208 (0.315) -0.558 (0.318) 0.664 (0.397) 3.529 (0.411) 1.119 (0.689) 0.714 (0.397) 0.878 (0.522) 0.459 (0.312) 0.248 (0.196) 0.222 (0.139) 0.135 (0.172) 1.862 (0.218) 0.265 (0.410) 0.185 (0.228) 2.977 (0.432) 4.056 (0.493) 0.386 (0.219) 0.458 (0.293) 0.677 (0.442) 0.662 (0.138) 0.148 (0.087) 0.455 (0.027) 0.988 (0.125) 0.701 (0.163) 0.413 (0.038) 0.771 (0.159) 0.989 (0.146) 0.298 (0.034) 0.991 (0.127) 0.698 (0.141) 0.340 (0.052) 0.797 (0.403) 0.824 (0.397) 0.236 (0.061) 0.932 (0.145) 0.783 (0.191) -179.062 -314.311 0.138 (0.755) 1.011 (0.205) -148.57 -211.02 0.215 (0.731) 1.226 (0.512) -136.35 -221.63

Wald Tests: H0: µ1 = µ2, 6.83**; H0: σ1² = σ2², 12.46**; H0: p22 = 1 − p11, 39.21**.
LRT Tests: Garcia 85.27**; Hansen 7.25**.

Log Likelihood -127.686 -165.801 -138.452; Ljung-Box Q-statistics LB-1 0.104 (0.836) 0.129 (0.784) 0.157 (0.695); LB-3 0.819 (0.514) 0.929 (0.371) 1.233 (0.299); AIC -106.68 -109.65 -114.22; BIC -214.51 -203.26 -259.37

Notes: Standard errors reported in parentheses. * 5% level, ** 1% level.

Table 7: Error and Volatility Forecasts from all GARCH models

| Sample | Model | RMSE (Panel A) | MAE (Panel A) | α̂ (Panel B) | β̂ on σ̂²(t-1) (Panel B) | R² (Panel B) |
|---|---|---|---|---|---|---|
| Daily | GARCH (RpctGore) | 0.134 | 0.011 | 1.32 (0.58) | -0.07 (0.28) | 0.0009 |
| Daily | GARCH (Entropy) | 0.134 | 0.011 | 1.33 (0.57) | -0.068 (0.26) | 0.0009 |
| Daily | GARCH (Information Arrival) | 0.134 | 0.011 | 0.94 (0.51) | 0.13 (0.23) | 0.0037 |
| Overnight | GARCH (P(Gore)) | 0.001 | 0.0003 | 0.0001 (0.0001) | 0.26 (0.087) | 0.0261 |
| Overnight | EGARCH (P(Gore)) | 0.001 | 0.0003 | 0.0002 (0.0001) | 0.328 (0.101) | 0.0252 |
| Overnight | FIEGARCH (P(Gore)) | 0.001 | 0.0005 | 0.0001 (0.0001) | -0.0001 (0.0001) | 0.023 |
| Overnight | FIEGARCH (P(Gore)) | 0.001 | 0.0003 | 0.0004 (0.0001) | -0.0001 (0.0002) | 0.024 |
| Overnight | FIEGARCH (Information Arrival) | 0.001 | 0.0005 | 0.0001 (0.0001) | -0.0001 (0.0001) | 0.022 |

Panel A: Error Forecasts. Panel B: Volatility regression.
Notes: Standard Errors in Parentheses.

Table 8: Error and Volatility Forecasts from all Markov Switching models

| Sample | Model | RMSE (Panel A) | MAE (Panel A) | α̂ (Panel B) | β̂ on σ̂²(t-1) (Panel B) | R² (Panel B) |
|---|---|---|---|---|---|---|
| Daily | Markov (RpctGore) | 1.227 | 0.385 | -0.406 (0.284) | 3.993 (1.048) | 0.0003 |
| Daily | Markov (Entropy) | 1.273 | 0.379 | -0.418 (0.288) | 4.112 (0.997) | 0.0002 |
| Daily | Markov (Information Arrival) | 1.351 | 0.371 | -0.402 (0.291) | 4.079 (1.023) | 0.0005 |
| Daily | Markov (Volume) | 1.462 | 0.319 | -0.395 (0.274) | 3.988 (1.103) | 0.0003 |
| Overnight | Markov (P(Gore)) | 1.184 | 0.329 | -0.327 (0.290) | 5.643 (1.482) | 0.0009 |
| Overnight | Markov (Bush) | 1.878 | 0.401 | -0.319 (0.279) | 5.187 (1.311) | 0.0011 |
| Overnight | Markov (Entropy) | 1.367 | 0.388 | -0.334 (0.293) | 5.889 (1.263) | 0.0012 |
| Overnight | Markov (Information Arrival) | 1.421 | 0.390 | -0.309 (0.282) | 5.074 (1.107) | 0.0010 |
| Overnight | Markov (Volume) | 1.193 | 0.365 | -0.328 (0.275) | 5.631 (1.505) | 0.0011 |

Panel A: Error Forecasts. Panel B: Volatility regression.
Notes: Standard Errors in Parentheses.

Figure 1 [plot of Pr(Gore Victory), vertical axis from 0 to 0.7, against Time (EST). Annotated events: 6pm: IN & KY for Bush; 7:33pm: VA for Bush; 7:52pm: FL for Gore; 8:55pm: FL from Gore; 11pm: CA for Gore; 1:18am: FL for Bush; 2:58am: FL from Bush; 5am: IA for Gore; 6:21am: WI for Gore.]

Appendix B (Continued)

Properties of Elliptically Contoured Distribution (Used for Proofs in Appendix A)
For proofs of the lemmas and propositions in Appendix A, we used the following definitions and properties of the Elliptically Contoured (ECC) Distribution. These are taken from Cambanis, Huang and Simons (1981), Johnson (1987), Foster and Viswanathan (1995) and Chu (1973).

Property 1: A random variable x is said to have an elliptically contoured distribution if its characteristic function is of the form

$$
E[e^{it'x}] = e^{it'\mu} \times f(t'\Sigma^{-1}t)
$$

where $\mu$ is the mean vector of the random variable x and $f(\cdot)$ satisfies the necessary conditions $f(0) = 1$, $|f(t)| < 1 \;\forall t \neq 0$, and $f(t)$ is continuous.

Property 2: Assume that the random variable x has $\Sigma^{-1} = I$. The elements of x are mutually independent if and only if $x \sim N(0, I)$. The Normal Distribution is the only elliptically contoured distribution that has the independence property.

Property 3: If $x \sim (\mu, \Sigma, f)$ and we partition $x' = (x_1', x_2')$ and, correspondingly,

$$
\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}
$$

then from the above matrix we obtain

$$
E[x_1 \mid x_2] = \Sigma_{12} \Sigma_{22}^{-1} x_2
$$

If $E[x_1 \mid x_2] = 0$, then $x_1, x_2$ are semi-independent. Elliptically contoured distributions are the only distributions with linear conditional expectations.

Property 4: From Property 3, it holds that $\mathrm{Var}[x_1 \mid x_2]$ is independent of $x_2$ if and only if $x \sim N(0, \Omega)$.

Property 5: If the random variable a is drawn from an elliptically contoured multivariate distribution with mean zero, then from Chu's (1973) lemma it follows that

$$
l(a) = \int_0^{\infty} u(a, z)\, dW(z)
$$

where $dW(\cdot)$ is a weighting function on $[0, \infty)$ that can assume negative values and $\int_0^{\infty} dW(z) = 1$. $u(\cdot, z)$ is the normal density with mean zero and variance $z^2 \Omega$. If $dW(z) \in \Re_+$ then it is itself a density and $l(a)$ is in the compound normal class. Following Chu (1973) and Foster and Viswanathan (1995), a second characterization of the random variable a when it is drawn from an elliptically contoured multivariate distribution is $a = Z \times e$, where Z is a positive random variable independent of the normal random variable e.

References

Admati, A.R. and P. Pfleiderer. 1988. “A Theory of Intraday Patterns: Volume and Price Variability,” Review of Financial Studies 1: 3-40.
Akgiray, V. 1989. “Conditional Heteroskedasticity in Time-Series of Stock Returns: Evidence and Forecasts,” Journal of Business 62: 55-80.
Alesina, Alberto and Nouriel Roubini. 1992. “Political Cycles in OECD Economies,” Review of Economic Studies 59: 663-688.
Alesina, Alberto, and Nouriel Roubini with Gerald D. Cohen. 1997. Political Cycles and the Macroeconomy. Cambridge: MIT Press.
Alesina, Alberto and Howard Rosenthal. 1995. Partisan Politics, Divided Government, and the Economy. Cambridge: Cambridge University Press.
Alter, Alison B. and Lucy M. Goodhart. 2003. “Explaining the Public Mind: Economic Expectations and Electoral Cycles,” Presented at the MPSA Conference, April 3-6, Chicago, Illinois.
Baillie, R., T. Bollerslev and H. Mikkelsen. 1996. “Fractionally Integrated Generalized Autoregressive Conditional Heteroskedasticity,” Journal of Econometrics 74: 3-30.
Beine, M., S. Laurent and C. Lecourt. 2001. “Official Central Bank Interventions and Exchange Rate Volatility: Evidence From a Regime Switching Analysis,” Ms., University of Lille, France.
Blomberg, S. Brock and Gregory D. Hess. 1997. “Politics and Exchange Rate Forecasts,” Journal of International Economics 43: 189-205.
Bollerslev, T. 1986. “Generalized Autoregressive Conditional Heteroskedasticity,” Journal of Econometrics 31: 307-27.
Bollerslev, T. 1990. “Modeling the Coherence in Short-Run Nominal Exchange Rates: A Multivariate Generalized Arch Model,” The Review of Economics and Statistics 72: 498-505.
Bollerslev, T. and H. Mikkelsen. 1996. “Modeling and Pricing Long-Memory in Stock Market Volatility,” Journal of Econometrics 73: 151-84.
Bollerslev, T., R. Chou, and K. Kroner. 1992. “ARCH Modeling in Finance: A Review of the Theory and Empirical Evidence,” Journal of Econometrics 52: 5-59.
Cambanis, S., S. Huang, and G. Simons. 1981. “On the Theory of Elliptically Contoured Distributions,” Journal of Multivariate Analysis 7: 551-559.
Chamberlain, G. 1983. “A Characterization of the Distributions That Imply Mean-Variance Utility Functions,” Journal of Economic Theory 29: 185-201.
Chu, K.C. 1973. “Estimation and Decision For Linear Systems with Elliptically Random Processes,” IEEE Transactions on Automatic Control, AC-18: 499-505.
Cohen, G.D. 1993. “Pre and Post-Electoral Macroeconomic Fluctuations,” Ph.D. Dissertation. Cambridge, MA: Harvard University.
Dacorogna, Michael M., Ramazan Gencay, Ulrich A. Muller, Richard B. Olsen, Olivier V. Pictet. 2001. An Introduction to High Frequency Finance. New York: Academic Press.
Diebold, F.X., J.-H. Lee and G.C. Weinbach. 1994. “Regime-Switching With Time-Varying Transition Probabilities.” In C. Hargreaves (ed.) Nonstationary Time Series Analysis and Cointegration. Oxford: Oxford University Press, 283-302.
Dueker, M.J. 1997. “Markov-Switching in GARCH Processes and Mean-Reverting Stock Market Volatility,” Journal of Business and Economic Statistics 15: 67-77.
Durland, J.M. and T.H. McCurdy. 1994. “Duration Dependent Transitions in a Markov-Model of U.S. GNP Growth,” Journal of Business and Economic Statistics 12: 26-34.
Engel, C. 1994. “Can the Markov-Switching Model Forecast Exchange Rates?” Journal of International Economics 36: 151-165.
Engle, R. 1982. “Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation,” Econometrica 50: 987-1007.
Engle, R. 1996. “The Econometrics of Ultra-High Frequency Data,” NBER Working Paper 5816.
Fan, Wenzhong. 2002. “Stock Market Volatility and the Forecasting Performance of ARCH Models,” Manuscript: Department of Economics, Yale University.
Filardo, A.J. 1994. “Business Cycle Phases and their Transitional Dynamics,” Journal of Business and Economic Statistics 12: 299-308.
Foster, F.D. and S. Viswanathan. 1995. “Can Speculative Trading Explain the Volume-Volatility Relation?,” Journal of Business and Economic Statistics, October: 379-396.
Franklin, Charles. 2001. “Pre-Election Polls in Nation and State: A Dynamic Bayesian Hierarchical Model,” Presented at the 2001 Annual Meeting of the APSA, San Francisco CA.
Franses, P.H. and D. Van Dijk. 1996. “Forecasting Stock Market Volatility Using Non-Linear GARCH Models,” Journal of Forecasting 15: 229-235.
Freeman, J., Jude C. Hays and Helmut Stix. 1999. “The Electoral Information Hypothesis Revisited,” Manuscript, University of Michigan.
Freeman, J., Jude C. Hays and Helmut Stix. 2000. “Democracy and Markets: The Case of Exchange Rates,” American Journal of Political Science 44:3: 449-468.
Gallant, A.R., P.E. Rossi and G. Tauchen. 1992. “Stock Prices and Volume,” Review of Financial Studies 5: 199-242.
Garcia, R. 1998. “Asymptotic Null Distribution of the Likelihood Ratio Test in a Markov-Switching Model,” International Economic Review 39: 763-788.
Gartner, Manfred, and Klaus W. Wellershoff. 1995. “Is there an Election Cycle in Stock Market Returns?” International Review of Economics and Finance 4: 387-410.
Gemmill, G. 1992. “Political Risk and Market Efficiency: Tests Based on British Stock and Options Markets in the 1987 Election,” Journal of Banking and Finance 16: 211-231.
Gemmill, G. 1995. “Stockmarket Behavior and Information in British Elections,” Working Paper, City University Business School.
Gemmill, G. and A. Saflekos. 2000. “How Useful are Implied Distributions? Evidence From Stock Index Options,” The Journal of Derivatives, Spring 2000: 1-17.
Hamilton, James D. 1989. “A New Approach to the Economic Analysis of Nonstationary Time Series and the Business Cycle,” Econometrica 57: 357-384.
Hamilton, James D. 1994. Time Series Analysis. Princeton, N.J.: Princeton University Press.
Hansen, B.E. 1992. “The Likelihood Ratio Test Under Non-Standard Conditions: Testing the Markov Switching Model of GNP,” Journal of Applied Econometrics 7: S61-S82.
Hansen, B.E. 1996b. “Inference When a Nuisance Parameter is not Identified under the Null Hypothesis,” Econometrica 64: 413-430.
Harris, M., and A. Raviv. 1993. “Differences of Opinion Make a Horse-Race,” The Review of Financial Studies 4: 571-595.
Herron, M. 2000. “Estimating the Economic Impact of Political Party Competition in the 1992 British Election,” American Journal of Political Science 44:3: 326-337.
Herron, M., J. Lavin, D. Cram and J. Silver. 1999. “Measurement of Political Effects in the United States Economy: A Study of the 1992 Presidential Election,” Economics and Politics 11: 51-81.
Johnson, M. 1987. Multivariate Statistical Simulation. New York: Wiley.
Kim, Chang-Jin, J.C. Morley and Charles R. Nelson. 2002. “Is there a Positive Relationship Between Stock Market Volatility and the Equity Premium?” Ms., Washington University, St. Louis.
Klaassen, Franc. 2002. “Improving GARCH Volatility Forecasts with Regime-Switching GARCH,” Empirical Economics 27: 363-394.
Kyle, A. 1984. “Market Structure, Information, Futures Markets, and Price Formation,” in G. Storey, A. Schmitz, and A.H. Harris (eds.), International Agricultural Trade: Advanced Readings in Price Formation, Market Structure, and Price Instability. Boulder, Colorado: Westview: 45-64.
Kyle, A. 1985. “Continuous Auctions and Insider Trading,” Econometrica 53: 1315-1335.
Kyle, A. 1986. “On the Incentives to Produce Private Information with Continuous Trading,” Working Paper, University of California, Berkeley.
Leblang, D. and W. Bernhard. 2000a. “Speculative Attacks in Industrial Democracies: The Role of Politics,” International Organization 54: 291-324.
Leblang, D. and W. Bernhard. 2000b. “Parliamentary Politics and Foreign Exchange Markets: The World According to GARCH,” Manuscript, University of Colorado.
Lobo, Bento J. and David Tufte. 1998. “Exchange Rate Volatility: Does Politics Matter?” Journal of Macroeconomics 20: 351-365.
Martin, L.W. and Will H. Moore. 2003. “Government Formation and Foreign Exchange Volatility,” Manuscript: Florida State University.
McCulloch, R.E. and R.S. Tsay. 1994. “Statistical Analysis of Economic Time-Series via Markov-Switching Models,” Journal of Time Series Analysis 15: 239-265.
McGillivray, F. 2000. “Government Hand-Outs, Political Institutions and Stock Price Dispersion,” Manuscript, Yale University.
McGillivray, F. 2002. “Coalition Formation and Stock Price Volatility,” Ms., New York University.
Merton, Robert C. 1973. “An Intertemporal Capital Asset Pricing Model,” Econometrica 41:5: 867-882.
Mincer, J. and V. Zarnowitz. 1969. “The Evaluation of Economic Forecasts,” in J. Mincer (ed.), Economic Forecasts and Expectations, National Bureau of Economic Research, New York.
Nelson, D. 1991. “Conditional Heteroskedasticity in Asset Returns: A New Approach,” Econometrica 59: 347-70.
O’Hara, M. 1995. Market Microstructure Theory. Oxford, England: Basil Blackwell.
Owen, J. and R. Rabinovitch. 1983. “On the Class of Elliptical Distributions and Their Applications to the Theory of Portfolio Choice,” Journal of Finance 38: 745-752.
Pagan, Adrian R. and G.W. Schwert. 1990. “Alternative Models of Stock Volatility,” Journal of Econometrics 45: 267-290.
Poon, S-H., and S.J. Taylor. 1992. “Stock Returns and Volatility: An Empirical Study of the UK Stock Market,” Journal of Banking and Finance 16: 37-59.
Ramchand, L. and R. Susmel. 1998. “Volatility and Cross Correlation Across Major Stock Markets,” Journal of Empirical Finance 5: 397-416.
Roberts, B. 1994. “The Industrial Organization of the 1992 Presidential Election,” Paper presented at the Annual Meeting of the MPSA, Chicago, Illinois.
Royden, H.L. 1968. Real Analysis, Second edition. Collier Macmillan.
Santa-Clara, P. and Rossen Valkanov. Forthcoming. “The Presidential Puzzle: Political Cycles and the Stock Market,” Journal of Finance.
Simonato, Jean-Guy. 1992. “Estimation of GARCH Processes in the Presence of Structural Change,” Economics Letters 40: 155-158.
Sola, Martin and Allan Timmermann. 1994. “Fitting the Moments: A Comparison of ARCH and Regime-Switching Models for Daily Stock Returns,” London Business School Discussion Paper No. DP 6-94.
Turner, Christopher M., Richard Startz, and Charles R. Nelson. 1989. “A Markov Model of Heteroskedasticity, Risk and Learning in the Stock Market,” Journal of Financial Economics 25: 3-22.
Van Norden, Simon and Huntley Schaller. 1997. “Regime-Switching in Stock-Market Returns,” Applied Financial Economics 7: 177-191.
Wlezien, C. 2001. “On Forecasting the Presidential Vote,” PS: Political Science and Politics 34: 25-31.
Wlezien, C. and R. Erikson. 2001. “Campaign Effects in Theory and Practice,” American Politics Research 29: 419-36.
