British Journal of Mathematical and Statistical Psychology (2013), 66, 1–7. © 2013 The British Psychological Society. www.wileyonlinelibrary.com. DOI:10.1111/bmsp.12004

Editorial

Prior approval: The growth of Bayesian methods in psychology

Within the last few years, Bayesian methods of data analysis in psychology have proliferated. In this paper, we briefly review the history of the Bayesian approach to statistics, and consider the implications that Bayesian methods have for the theory and practice of data analysis in psychology.

Until recently, Bayesian methods for data analysis in psychology were largely unheard of. Standard textbooks barely gave them even a cursory mention, and most researchers could spend their entire careers unaware of the existence of anything beyond the orthodox canon that they learned as students. For those who had heard of them, Bayesian methods were often dismissed as infected by subjectivism, something that few if any quantitative psychologists could endorse. Bayesian statistics was seen as a minority topic with a small but fiery cult following. It was not part of the mainstream of the discipline of statistics and could be safely ignored by scientists dealing with the practical realities of data analysis.

Times have changed. Beginning in the early 1990s, there was an abrupt proliferation of studies using Bayesian methods in mainstream statistics. This has continued for over 20 years (the figure above shows the number of articles on Bayesian statistics in five top-ranked statistics journals over four decades), so that the topic of Bayesian methods now comprises some 20% of published articles in statistics. This trend has been accompanied by a growth in the number and popularity of textbooks (e.g., Gelman, Carlin, Stern, & Rubin, 2003; Gelman & Hill, 2007) and general-purpose software for Bayesian data analysis and modelling (e.g., BUGS, JAGS). The rising tide has not gone unnoticed within psychology.
Although perhaps delayed in its reaction, there has been a remarkable increase in the use of Bayesian methods in quantitatively focused psychology journals in the last decade, as the following figure makes clear. In the last few years, popular textbooks (e.g., Kruschke, 2011) for Bayesian data analysis in psychology have emerged. However one might feel about them, Bayesian methods can no longer be ignored as an irrelevance.

The importance of these trends is that Bayesian methods are not just another set of topics in advanced statistics, such as structural equation modelling or nonlinear regression. For some, they represent a new paradigm (in the Kuhnian sense of the term) for the field. As such, their increasing adoption has potentially profound implications for the nature and practice of data analysis in psychology, possibly affecting everything from the editorial policies of journals to how statistics is taught to students.

Despite their growing appeal, however, there remains a troubling lack of clarity about what exactly Bayesian methods do and do not entail, and about how they differ from their so-called classical counterparts. Bayesian methods are often portrayed as being based on a subjective rather than frequentist interpretation of probability, with inference being an updating of personal beliefs in light of evidence. In practice, however, most modern applications of Bayesian methods to real-world data analysis problems are characterized by pragmatism and expediency: Bayesian methods are adopted because they promise (and arguably often deliver) solutions to important or difficult problems. Increasingly, the subjective-Bayesian versus objective-frequentist account begins to seem like historical baggage that has yet to be replaced by an account more in keeping with how Bayesian methods have evolved and are presently being used in psychology and other disciplines. A more realistic understanding of Bayesian methods is required by our discipline.
The better the appreciation of the nature of Bayesian methods, and of their similarities to and differences from classical methods, the more constructive will be any debate over best practices or the value of any given methodology. Likewise, it may avoid the dogmatism and unnecessary polemic that have sometimes accompanied the advocacy of, or opposition to, Bayesian methods in the past. In what follows, we attempt to outline what Bayesian methods are, as they are currently practised, and how they compare to their nominal rivals. To do so, we first briefly outline the history of Bayesian methods from their origin up to the recent past.

1. A brief history of Bayesian methods

The modern practice of Bayesian statistics has its origin in a single essay, An Essay towards Solving a Problem in the Doctrine of Chances, written by the Reverend Thomas Bayes and posthumously published in 1763, two years after the author's death. The topic addressed in the essay was clearly stated in its opening paragraph: "Given the number of times an unknown event has happened and failed: Required the chance that the probability of its happening in a single trial lies somewhere between any two degrees of probability that can be named" (Bayes & Price, 1763, p. 376).

In more modern terminology, Bayes was considering the problem of inferring the probability parameter θ of a Binomial distribution on the basis of observing n successes in N trials. In particular, he considered how to infer that θ had a value between two probabilities θ₀ and θ₁, and showed that

\[ P(\theta_0 \le \theta \le \theta_1 \mid N, n) = \frac{\int_{\theta_0}^{\theta_1} \theta^{n} (1-\theta)^{N-n} \, d\theta}{\int_{0}^{1} \theta^{n} (1-\theta)^{N-n} \, d\theta}, \]

when the possible values of θ are equally likely a priori. The historical importance of Bayes's essay was that it was the first clear solution to a problem of inverse probability.
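Bayes's quantity, the posterior probability that a Binomial parameter θ lies between θ₀ and θ₁ after observing n successes in N trials under a uniform prior, can be evaluated directly by numerical integration. The following is a minimal illustrative sketch; the function name and grid size are our own choices, not anything from the original essay:

```python
def posterior_prob(n, N, theta0, theta1, grid=10_000):
    """P(theta0 <= theta <= theta1 | N, n) under a uniform prior on
    theta, computed by trapezoidal integration of the unnormalized
    posterior theta^n * (1 - theta)^(N - n)."""
    def integrate(lo, hi):
        xs = [lo + (hi - lo) * i / grid for i in range(grid + 1)]
        ys = [x ** n * (1 - x) ** (N - n) for x in xs]
        h = (hi - lo) / grid
        return h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
    return integrate(theta0, theta1) / integrate(0.0, 1.0)
```

For instance, `posterior_prob(7, 10, 0.0, 1.0)` is 1 by construction, and `posterior_prob(5, 10, 0.0, 0.5)` is close to 0.5 because the resulting Beta(6, 6) posterior is symmetric about 0.5.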
Calculating forward probability, such as the probability of drawing a red marble from an urn of n red and m black marbles, had been worked out for many nontrivial problems by the end of the 17th century as a result of the work of, among others, Pascal, Fermat, and Jacob Bernoulli. By contrast, solving the inverse problem (for instance, inferring the values of n and m from observing a series of draws from the urn of marbles) had remained elusive. For some of the original pioneers like Jacob Bernoulli, solving this inverse problem was seen as the key to the eventual application of probability theory beyond the gambling table to real problems in the physical and social sciences. Bayes's essay was the first presentation of a solution to a special case of this problem.

As important as Bayes's essay was, it is undoubtedly a case of Stigler's Law of Eponymy (Stigler, 1999) that we speak of Bayesian statistics or even Bayes's theorem. Stigler's law states that no scientific discovery is named after its original discoverer, and indeed it was Pierre-Simon Laplace, a giant of 19th-century science, who was first to present what we now call Bayes's theorem (a generalization of Bayes's original work that he independently developed), and who established and developed the practice we now call Bayesian statistics. This methodology, which came to be known as inverse probability, was the dominant method of statistical inference until the early 20th century. It was applied to problems in the social sciences, earth sciences and, most famously, in astronomy (where Laplace accurately predicted the masses of Jupiter and Saturn).

1.1. The rise of sampling-theory-based inference

New statistical methods that began to gain traction in the late 19th and early 20th centuries sought to be founded on rigorous and objective principles. Questions about the nature of probability became increasingly important.
From these new perspectives, the dominant methods of inverse probability were found wanting. To the extent that the method survived, it was only for lack of suitable alternatives. As Pearson noted: "The practical man will … accept the results of inverse probability of the Bayes/Laplace brand till better are forthcoming" (Pearson, 1920, p. 3). As such, R. A. Fisher's declaration in his monumental 1925 Statistical Methods for Research Workers that "… the theory of inverse probability is founded upon an error, and must be wholly rejected" (Fisher, 1925, p. 10) can be seen as the coup de grâce for the use of inverse probability as a method of inference. Fisher made his central criticism of Bayesian methods clear: "Inferences respecting populations, from which known samples have been drawn, cannot be expressed in terms of probability, except in the trivial case when the population is itself a sample of a super-population the specification of which is known with accuracy" (Fisher, 1925, p. 10).

In other words, the objects of inference, such as parameters in a probability distribution, are not random variables, and so their possible values cannot be expressed in terms of probabilities. In data analysis, the to-be-inferred variables are fixed but unknown quantities, and it is only the observed data that can be described probabilistically. This perspective was based on the so-called aleatory definition of probability: the frequentist interpretation, which holds that probability applies only to the outcomes of random physical processes. This general perspective drastically limited the application of probability theory to problems of statistical inference. It became the received view, justifying the near-wholesale abandonment of Bayesian methods for several decades.

1.2. Bayes gets personal

While a strict frequentist interpretation of probability entailed the abandonment of Bayesian methods, an equally strict yet opposing interpretation allowed them to survive, albeit with minority status. This view interpreted probability strictly as degree of personal belief (see, e.g., De Finetti, 1974). Under this interpretation, Bayes's rule took on a central role as the means to update one's degree of belief in light of new evidence. Initially, our beliefs about some variable θ can be expressed by some probability distribution P(θ). In light of evidence from a set of data D₀, we update these beliefs as

\[ P(\theta \mid D_0) \propto P(D_0 \mid \theta) P(\theta). \]

With yet more data D₁, we continue this process as

\[ P(\theta \mid D_1, D_0) \propto P(D_1 \mid \theta) P(\theta \mid D_0), \]

so that the posterior probability at one step becomes the prior probability at the next. When applied to practical matters like data analysis, this subjective Bayesian approach required the starting point for analysis to be the statement and quantification of one's beliefs about the variables to be inferred. Having done this, the relevant data allow us to update our beliefs by the methodical application of Bayes's rule.

1.3. A Bayesian revival

From the early to the late 20th century, the methods that the likes of Fisher promoted became almost ubiquitous, and Bayesian methods remained marginalized. This marginalization seems to have had less to do with philosophical commitments to theories of probability than with the practical utility of Bayesian methods relative to their now well-established counterparts. For almost all of the commonplace data analysis methods, under general conditions, the results of classical and Bayesian methods were roughly comparable. For example, in a linear model, the frequentist confidence interval and the Bayesian posterior interval were identical under certain circumstances.
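The sequential updating described in §1.2, in which the posterior at one step becomes the prior at the next, is easy to verify with a discrete grid of candidate values for θ. The sketch below is our own illustration (the grid resolution and coin-flip counts are arbitrary); updating on two batches of data in turn yields the same posterior as updating once on the pooled data:

```python
# Candidate values of theta on a discrete grid.
GRID = [i / 100 for i in range(101)]

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def update(prior, heads, tails):
    """One application of Bayes's rule on the grid:
    P(theta | D) is proportional to P(D | theta) * P(theta)."""
    return normalize([p * th ** heads * (1 - th) ** tails
                      for p, th in zip(prior, GRID)])

prior = normalize([1.0] * len(GRID))           # flat prior over theta
after_d0 = update(prior, heads=3, tails=1)     # first batch D0
after_d1 = update(after_d0, heads=2, tails=4)  # second batch D1
pooled = update(prior, heads=5, tails=5)       # all the data at once
```

Here `after_d1` and `pooled` agree (up to floating-point noise), which is exactly the sense in which yesterday's posterior is today's prior.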
Bayesian methods did not appear to offer much difference in practical terms, yet seemed to demand a dubious commitment to subjectivity that many were reluctant to make. An advantage of the Bayesian approach was that it was based on the application of the probability calculus to problems of statistics to an extent far beyond that of classical methods. What this entailed was that whenever a probabilistic model of data could be specified, how to infer the values of any unobserved variables or parameters in that model could always be derived in principle. This meant that, in principle at least, challenging data analysis problems that required bespoke models could always be tackled. Classical methods, by contrast, were often stymied by nuisance variables, missing data and small data sets (not to mention latent variables and hierarchical data structures).

The catalyst for the adoption of Bayesian methods was the increasing availability and falling cost of computing power. When computing power was minimal, the "in principle" advantages of Bayesian methods were unimportant because the calculations involved were often intractable. However, the roughly exponential growth of computational power from the 1970s onwards has meant that calculations that were almost beyond imagination in one decade could become commonplace in the next. In statistics, this change began in the 1980s and was in full sway by the 1990s.

2. The theory and practice of Bayesian methods

The flourishing of Bayesian methods that began in the early 1990s coincided with the emergence of general Monte Carlo techniques for inference (e.g., Gelfand & Smith, 1990). With this development, Bayesian methods could be applied to previously intractable problems in statistics, and they began to be adopted widely in science, including in psychology, because they offered solutions to challenging statistical problems.
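The flavour of the sampling-based approaches mentioned above can be conveyed with the simplest possible Gibbs sampler: drawing from a bivariate normal distribution by alternating between its two one-dimensional full conditionals. This is a toy sketch of the general technique, not one of Gelfand and Smith's own examples; the correlation value and sample count below are arbitrary:

```python
import random

def gibbs_bivariate_normal(rho, n_samples, seed=1):
    """Gibbs sampling for a standard bivariate normal with correlation
    rho: alternate between the full conditionals
    x | y ~ N(rho * y, 1 - rho**2) and y | x ~ N(rho * x, 1 - rho**2)."""
    rng = random.Random(seed)
    sd = (1.0 - rho ** 2) ** 0.5
    x = y = 0.0
    samples = []
    for _ in range(n_samples):
        x = rng.gauss(rho * y, sd)   # draw x from p(x | y)
        y = rng.gauss(rho * x, sd)   # draw y from p(y | x)
        samples.append((x, y))
    return samples
```

With enough draws, the sample correlation approaches rho even though the sampler only ever draws from one-dimensional conditionals; this is the "in principle" generality that cheap computation finally made practical.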
Although it certainly appears that the growth in popularity of Bayesian methods was largely a consequence of their practical advantages, the debate about what exactly defines Bayesian models does not seem to be fully resolved. At the heart of this debate is the question of whether prior probabilities in Bayesian models are (or should be) a reflection of our beliefs about the nature of the phenomenon being studied. As we have mentioned, stock definitions of Bayesian methods seem to take this for granted. By contrast, in practice, priors seem to be chosen on the basis of convenience or expediency.

As we see it, the choice of priors is like the choice of the probabilistic model of the data. For example, given a set of observations x₁ … xₙ, we might model these data as

\[ x_i \sim N(\mu, \sigma^2), \quad \text{for } i \in \{1 \ldots n\}. \]

The choice of this probabilistic model need not be a reflection of our true beliefs about how the data were generated. Rather, it can be seen as literally just a model that can potentially provide insight into the nature and structure of the data. By the same reasoning, the priors on μ or σ² need not be a reflection of our true beliefs about the parameters, but are just part of our general modelling assumptions. Just as the generative model provides a probabilistic model of the data, the priors provide a probabilistic model of the parameters. Just as we assume that our data are drawn from some probability distribution with fixed but unobserved parameters, so too we assume that the values of the parameters are drawn from another probability distribution (also with fixed but unobserved parameters). Priors, therefore, are just assumptions of our model. Like any other assumptions, they can be good or bad and may need to be extended, revised or possibly abandoned on the basis of their suitability to the data being studied.

In the current issue, Gelman and Shalizi (2013) address in depth this question about the theory and practice of Bayesian methods.
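Treating the prior as just another modelling assumption can be made concrete in the normal model above. If σ is taken as known and μ is given a N(m₀, s₀²) prior, the posterior for μ is available in closed form; the function below is our own illustrative sketch of that standard conjugate update:

```python
def posterior_mu(data, sigma, m0, s0):
    """Closed-form posterior for mu when x_i ~ N(mu, sigma^2) with
    sigma known and a N(m0, s0^2) prior on mu.
    Returns the posterior mean and standard deviation of mu."""
    n = len(data)
    xbar = sum(data) / n
    precision = 1.0 / s0 ** 2 + n / sigma ** 2   # posterior precision
    mean = (m0 / s0 ** 2 + n * xbar / sigma ** 2) / precision
    return mean, precision ** -0.5
```

A very diffuse prior (large s0) leaves the posterior mean essentially at the sample mean; a very tight prior pins it near m0. Either way, the prior is one explicit, inspectable assumption among several, open to revision like any other part of the model.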
They argue that the practical use of Bayesian methods is often at odds with its official philosophy. In particular, they advocate treating priors, and models in general, as hypotheses that should be evaluated, and possibly revised or abandoned, in light of how well they fit the data and the problem being addressed. This contrasts sharply with the view of Bayesian models as ideally infallible representations of our beliefs. The paper by Gelman and Shalizi is followed by commentaries from a set of statisticians, philosophers and mathematical psychologists. Each provides their perspective on the general debate about the value of Bayesian methods and how they should be used. From these papers, it seems clear that Bayesian methods have entered a period of maturity. Defensive reactions either for or against Bayesian methods seem to have given way to more balanced views. Anti-Bayesians are rare, and few who use Bayesian methods treat them as the only method of statistical analysis.

3. The future of Bayesian methods in psychology?

In keeping with the general theme of the Gelman and Shalizi paper and its commentaries, the position we take here is that psychology needs to move beyond the premises of the standard critiques of frequentist and Bayesian methods and adopt methods that are useful in tackling pressing research questions. This is not a new message. For instance, E. G. Boring's critique of significance testing in psychology made the point that "… statistical ability, divorced from a scientific intimacy with the fundamental observations, leads nowhere" (Boring, 1919, p. 338).

For a psychologist, one crucial advantage of Bayesian data analysis is that it now provides a general, workable framework for incorporating prior information into a statistical analysis. There are two main objections to this assertion. The first, which we have argued is not true of modern Bayesian methods, is that this opens the door to subjectivism in quantitative psychology.
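One of the ad hoc routes by which prior information enters a frequentist analysis, discussed below, is the common fix of adding 0.5 to every cell of a sparse 2 × 2 table before computing an odds ratio, which keeps the estimate away from zero and infinity. A minimal sketch (the function name and table values are our own illustrative choices):

```python
def odds_ratio(a, b, c, d, correction=0.5):
    """Odds ratio for the 2 x 2 table [[a, b], [c, d]]. If any cell
    is zero, add `correction` (typically 0.5) to every cell, the
    common ad hoc fix for empty cells."""
    if 0 in (a, b, c, d):
        a, b, c, d = (x + correction for x in (a, b, c, d))
    return (a * d) / (b * c)
```

For example, `odds_ratio(5, 0, 3, 2)` gives a finite value, (5.5 × 2.5) / (0.5 × 3.5), where the uncorrected ratio would be infinite.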
The second is that classical, frequentist methods can and do incorporate prior information into their analyses. This objection is reasonable, and one that we agree with. The limitation of this approach, however, is that priors typically enter into a frequentist analysis in an ad hoc fashion. For example, consider the problem of estimating an odds ratio from a 2 × 2 contingency table with one or more zero cells. A common ad hoc fix is to add 0.5 to each observed cell value. This, in a sense, captures the prior intuition that an observed zero is an underestimate and that the odds ratio in the population is not zero or infinity but somewhere in between. More generally, we argue that prior information is used to structure an analysis, for example by assuming normal errors with constant variance or that observations are sampled from a binomial distribution with a fixed probability. Prior information, seen in this light, provides leverage to explore difficult analytic problems by adding information about the context or from theory. The leverage the extra information affords is particularly useful for analyses where data are not plentiful or the number of plausible models is large (encompassing most psychological research).

This needs to be done with care and a degree of humility, regardless of the methods being used. A poor Bayesian analysis is unlikely to offer any insights over and above a good frequentist analysis (and may be actively misleading). One reason for a degree of humility in our analyses is that no probability model, and hence no statistical model in psychology, is complete. There will always be some degree of uncertainty associated with the choice of model and the appropriateness of its assumptions. As Macdonald (2002, p. 187) wrote: "if the incompleteness of probability models … were more widely appreciated psychologists and others might adopt a more reasonable attitude to statistical tests, the debate about statistical inference might die down, and the emphasis could shift toward better understanding and presenting data".

Bayesian data analysis is not a panacea for the problems of statistical modelling in psychology. Rather, it extends the number and range of tools available to tackle substantive research questions in our discipline.

Mark Andrews and Thom Baguley (Nottingham Trent University, UK)

References

Bayes, T., & Price, R. (1763). An essay towards solving a problem in the doctrine of chances. By the late Rev. Mr. Bayes, F.R.S. Communicated by Mr. Price, in a letter to John Canton, A.M.F.R.S. Philosophical Transactions, 53, 370–418. doi:10.1098/rstl.1763.0053

Boring, E. G. (1919). Mathematical versus scientific significance. Psychological Bulletin, 16, 335–338. doi:10.1037/h0074554

De Finetti, B. (1974). Theory of probability: A critical introductory treatment. London, UK: Wiley.

Fisher, R. A. (1925). Statistical methods for research workers. Edinburgh, UK: Oliver and Boyd.

Gelfand, A., & Smith, A. (1990). Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association, 85(410), 398–409. doi:10.1080/01621459.1990.10476213

Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2003). Bayesian data analysis (2nd ed.). Chapman & Hall.

Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. New York, NY: Cambridge University Press.

Gelman, A., & Shalizi, C. (2013). The philosophy and practice of Bayesian statistics. British Journal of Mathematical and Statistical Psychology, 66, 8–34. doi:10.1111/j.2044-8317.2011.02037.x

Kruschke, J. K. (2011). Doing Bayesian data analysis. Burlington, MA: Academic Press.

Macdonald, R. R. (2002).
The incompleteness of probability models and the resultant implications for theories of statistical inference. Understanding Statistics, 1, 167–189. doi:10.1207/S15328031US0103_03

Pearson, K. (1920). The fundamental problem of practical statistics. Biometrika, 13(1), 1–16. doi:10.1093/biomet/13.1.1

Stigler, S. M. (1999). Statistics on the table: The history of statistical concepts and methods. Cambridge, MA: Harvard University Press.