MRC Cognition and Brain Sciences Unit Graduate Statistics Course
http://imaging.mrc-cbu.cam.ac.uk/statswiki/StatsCourse2009

1: The Anatomy of Statistics
Models, Hypotheses, Significance and Power
Ian Nimmo-Smith
MRC CBU Graduate Statistics Lectures, 8 October 2009

The Naming of Parts
Experiments and Data
Models and Parameters
Probability vs. Statistics
Likelihood
Hypotheses and Inference
Tests, Significance and Power

Experiments and Data
An Experiment E is prescribed by a Method.
When the Experiment E is performed, Data X are observed.
Repeated performance of E may produce Data which vary.

A Simple Experiment
Method: 16 volunteers were randomly selected from the large population of those suffering from Fear of Statistics Syndrome (FOSS). They were given a brief experimental therapy (‘CBU desensitization’ or CD).
Results: At the end of the course 13 volunteers were found to be cured. (Data: N = 16; X = 13.)

Models and Parameters (1)
A Model describes how Data arise, by identifying Systematic and Unexplained components.
Data = Systematic + Unexplained.

Models and Parameters (2)
The Systematic and Unexplained components are linked together through one or more Parameters via a Probability formulation.
Parameters can relate either to the Systematic components (e.g. Mean) or to the Unexplained components (e.g. Standard Deviation; Variance; Degrees of Freedom).

A Model for our Data (1)
Data: N observations, X without FOSS following CD.
Parameter p, 0 < p < 1: p = ‘Rate of recovery from FOSS’; ‘each person independently has the same chance p of recovery.’
Model: Probability(X | N, p) = C(N, X) p^X (1 - p)^(N - X), where C(N, X) is the number of ways of choosing X individuals from N.

A Model for our Data (2)
Probability(X | N, p) = C(N, X) p^X (1 - p)^(N - X)
N = 16, X = 13, p is unknown.
The formula expresses the combination of 13 events with probability p, 3 events with probability 1 - p, and the number of different ways that the 13 Recoverers could occur among the 16 volunteers.

A Model for our Data (3)
Probability(X | N, p) = C(N, X) p^X (1 - p)^(N - X)
N is fixed by the experimental design.
p represents the Systematic component we want to say something about.
The Data X have potentially unexplained variability, described by a Binomial Distribution.

Probability
Probability is a fundamental concept which is difficult to define.
There are divergent theories on what it means.
There is, however, common agreement on the calculus it obeys.
Intuitions can easily lead one astray.

The Birthdays paradox

Probability vs. Statistics
[Diagram: Probability runs from the Parameters (‘how the world is’) through the Model to the Data; Statistics runs from the observed Data back through the Model to Inferences about the Parameters (‘how the world is’).]
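Before moving on to inference, here is a minimal Python sketch (not part of the original slides) of the Binomial model above: it evaluates Probability(X | N, p) for the FOSS data at a few trial values of p. The function name binom_prob and the particular values of p tried are illustrative choices.

```python
from math import comb

def binom_prob(x: int, n: int, p: float) -> float:
    """Binomial model: probability of x recoveries out of n when each person
    independently recovers with probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Observed data from the FOSS experiment: N = 16 volunteers, X = 13 recovered.
N, X = 16, 13
for p in (0.5, 0.75, 13 / 16):
    print(f"Probability(X = {X} | N = {N}, p = {p:.4f}) = {binom_prob(X, N, p):.4f}")
```

As the model leads us to expect, the probability of the observed data is small when p is near 0.5 and largest when p is near 13/16.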
Hypotheses and Inference
What kinds of things can we say about the ‘true’ value of p?
Point estimates and Confidence Intervals?
Are the Data compatible with the ‘true’ value of p being, say, 0.75?
Is the weight of the evidence sufficient for us to prefer to say that p = 0.75 rather than that p = 0.5?

The weight of the evidence
We will step through a sequence of possible values of p, looking to see how our data X = 13 look.

p = 0.1, p = 0.2, p = 0.3, p = 0.4
[Figure slides: the Binomial(N = 16, p) distribution for each value of p]
… at last ... X = 13 just begins to show up on the radar at p = 0.4.
p = 0.5, p = 0.6, p = 0.7, p = 0.8, p = 0.9
[Figure slides continue for the remaining values of p]

The Rise and Fall of Probability (1)
The Probability of the Data (13/16 Recovered) rises and falls as p moves from near 0 to near 1.
This behaviour is described as the Likelihood Function for p relative to the Data X.
Here is a graph of the Likelihood Function, first for the values of p we have looked at so far ...

Likelihood values
[Figure: the likelihood of the data at p = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

The Rise and Fall of Probability (2)
… and here is a complete graph covering all possible values of p.

The Likelihood Function

Estimation and Inferences (1)
The Likelihood Function is pivotal in understanding how the Data throw light on the Parameters.

Estimation and Inferences (2)
The value of p where the Likelihood takes its largest value can be a sensible starting point for estimating p.
This is called the Maximum Likelihood Estimate (MLE).
Often the MLE is the ‘natural one’: MLE(p) = 13/16 = 0.8125.

Estimation and Inferences (3)
The sharpness of the peak of the curve tells us the possible scale of the error in this estimate; Confidence Intervals can be based on this.
The relative heights (Likelihood Ratios) are a principal tool for comparing different Parameter values.

Key Questions
Does the experimental manipulation have an effect? To what extent does it have an effect?
Does the treatment work? How well does it work?
Does behaviour B predict pathology P? How well does it predict it?
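The stepping-through-values-of-p exercise and the Maximum Likelihood Estimate can be reproduced numerically. The sketch below (illustrative, not from the slides) evaluates the Likelihood Function on a grid of candidate values of p and reads off the value where it peaks; the grid spacing of 0.001 is an arbitrary choice.

```python
from math import comb

N, X = 16, 13  # observed data: 13 of the 16 volunteers recovered

def likelihood(p: float) -> float:
    """The Likelihood Function: the binomial probability of the observed data,
    viewed as a function of the unknown recovery rate p."""
    return comb(N, X) * p**X * (1 - p)**(N - X)

# Evaluate the likelihood on a grid of candidate values of p ...
grid = [i / 1000 for i in range(1, 1000)]
values = [likelihood(p) for p in grid]

# ... and read off the value of p where it peaks: the Maximum Likelihood Estimate.
mle = grid[values.index(max(values))]
print(f"MLE of p on this grid: {mle:.3f} (the exact MLE is X/N = {X / N:.4f})")
```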
Schools of Statistical Inference
Ronald Aylmer FISHER
Jerzy NEYMAN and Egon PEARSON
Rev. Thomas BAYES

Fisherian Inference
R A Fisher
Likelihood
P values
Tests of Significance
Null Hypothesis Testing

Neyman & Pearson Inference
J Neyman and E Pearson
Testing between Alternative Hypotheses
Size
Power

Bayesian Inference
T Bayes
Prior and Posterior Probabilities
Revision of beliefs in the light of the data

R A Fisher: P values and Significance Tests (1)
Null Hypothesis H0, e.g. H0: p = 0.5.
Data may give evidence against H0.
Order possible outcomes in terms of degree of deviation from H0.
This may involve a judicious choice of Test Statistic.

R A Fisher: P values and Significance Tests (2)
The P value is the sum of the probabilities of possible outcomes of the Experiment at least as extreme (improbable) as the Data.
The P value is also known as the Significance Level or Significance of the Data.

R A Fisher: P values and Significance Tests (3)
Sometimes we quote the actual P value, e.g. P = 0.112.
Sometimes we quote the P value relative to conventional values, e.g. P > 0.1, P < 0.01, etc.

R A Fisher: P values and Significance Tests (4)
Sometimes, especially in Tables, a Baedeker star system operates:
* means 0.01 <= P < 0.05; ** means 0.001 <= P < 0.01; *** means P < 0.001.

R A Fisher and the Design of Experiments
Fisher’s influence on mainstream scientific methodology is enormous.
In particular he created a new science of the Design of Experiments:
Factors
Covariates
Interaction
Confounding
Randomization

Neyman and Pearson: Hypothesis Testing
Deciding between a Null Hypothesis and an Alternative Hypothesis, e.g. Hnull: p = 0.5 vs. Halt: p = 0.75.
Two permitted decisions: Accept Hnull; Reject Hnull.
Two types of Error.

A Tale of Two Errors (1)
Type I: when we incorrectly Reject Hnull although Hnull is correct.
Alpha (Type I error rate); ‘False Alarms’.
‘Size’ = Alpha.

A Tale of Two Errors (2)
Type II: when we incorrectly decide to Accept Hnull although Halt is correct.
Beta (Type II error rate); ‘Missed Signals’.
‘Power’ = 1 - Beta.

N-P Hypothesis Testing
Fix Alpha (in advance!).
Find the Rejection Region with the given Size Alpha and smallest possible Beta. This is intimately linked to the Likelihood Function (strictly, Likelihood Ratios).
Look to see if the Data fall in the Rejection Region.

N-P Hypothesis Testing (2)
If the Data fall in the Rejection Region, then ‘We Reject the Null Hypothesis’ (we don’t accept the alternative hypothesis).
If the Data fall outside the Rejection Region, then ‘We Do Not Reject the Null Hypothesis’ (we don’t accept the null hypothesis).
Alpha fixed in advance gets entangled with the observed P value.
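Both recipes above can be carried out directly for the FOSS data. The sketch below (mine, not from the slides) assumes the Null Hypothesis H0: p = 0.5 used as the example in the slides and a conventional Alpha of 0.05 (my choice): it computes the Fisher-style one-sided P value for X = 13, treating large X as extreme, and then finds the Neyman-Pearson Rejection Region of Size at most Alpha.

```python
from math import comb

N = 16
p_null = 0.5  # Null Hypothesis H0: p = 0.5, as in the slides' example

def upper_tail(c: int, p: float) -> float:
    """P(X >= c | N, p): the probability of an outcome at least as extreme as c,
    treating large numbers of recoveries as extreme."""
    return sum(comb(N, x) * p**x * (1 - p)**(N - x) for x in range(c, N + 1))

# Fisher: the P value for the observed Data X = 13 under H0.
print(f"One-sided P value for X = 13: {upper_tail(13, p_null):.4f}")

# Neyman-Pearson: fix Alpha in advance, then find the Rejection Region {X >= c}
# whose Size (Type I error rate) is no larger than Alpha.
alpha = 0.05  # conventional choice, assumed here for illustration
c = next(c for c in range(N + 1) if upper_tail(c, p_null) <= alpha)
print(f"Reject H0 when X >= {c}; Size of this test = {upper_tail(c, p_null):.4f}")
```

With these assumptions the Rejection Region comes out as X >= 12, which ties in with the ‘critical value = 12’ slide below.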
Conventional Hybrid Inference
Dress things up as Hypothesis Testing.
Use observed P values as differential indicators of significance.
Be aware, and beware!
Read Gerd Gigerenzer (1993), ‘The superego, the ego, and the id in statistical reasoning’, in A Handbook for Data Analysis (Hillsdale, NJ: Erlbaum), pp. 311-339.

How Statistics took over the scientific world

Neyman-Pearson Tests
N-P tests are those that have maximum power for a given maximum size.
For the comparison of two simple hypotheses these have rejection regions determined by the likelihood ratio:
LR(x) = P(x | p1) / P(x | p0).

Likelihood Ratio Test
In the case of testing between two binomial distributions the LR is an increasing function of the data X.
So N-P tests are of the form: Reject H0 if X is greater than or equal to some critical value.

Size and Power when critical value = 12

The Size of various Tests as a function of critical value

The Power of various Tests as a function of critical value

The Size/Power Trade-Off

What you buy with larger samples

Bayesian Inference
Prior probability distribution over the space of parameters, expressing prior beliefs;
multiply by the likelihood for the observed data, yielding ...
a Posterior probability distribution, expressing revised beliefs having observed the new data;
then a Summary based on the posterior distribution.

A Tale of Two Bayesians
Two Prior Distributions, one Vague and one Opinionated, combined with One Likelihood, give Two Posterior Distributions: the vague prior’s posterior is more influenced by the data, the opinionated prior’s less influenced by the data.
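To make the Bayesian recipe concrete, here is a small sketch using a conjugate Beta prior for p; the Beta prior and the particular choices Beta(1, 1) for the vague Bayesian and Beta(50, 50) for the opinionated one are my illustrative assumptions, not part of the slides. With a Beta(a, b) prior and X recoveries out of N, the posterior is again a Beta distribution, Beta(a + X, b + N - X).

```python
N, X = 16, 13  # observed data

def beta_binomial_update(a: float, b: float):
    """Combine a Beta(a, b) prior for p with the binomial likelihood of the data:
    the posterior is Beta(a + X, b + N - X)."""
    return a + X, b + (N - X)

# Two Bayesians, one likelihood: a vague prior and an opinionated prior centred on p = 0.5.
priors = {"vague Beta(1, 1)": (1, 1), "opinionated Beta(50, 50)": (50, 50)}
for name, (a, b) in priors.items():
    a_post, b_post = beta_binomial_update(a, b)
    mean = a_post / (a_post + b_post)
    print(f"{name}: posterior Beta({a_post}, {b_post}), posterior mean = {mean:.3f}")
```

The vague prior’s posterior mean lands close to the MLE of 0.8125, while the opinionated prior keeps its posterior mean near 0.5: the same likelihood, two different revisions of belief.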
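Looking back at the Neyman-Pearson slides, the Size and Power curves can also be tabulated directly. This sketch (illustrative, not from the slides) takes the test ‘Reject Hnull if X >= c’ and, for each possible critical value c, computes its Size under Hnull: p = 0.5 and its Power under Halt: p = 0.75, the trade-off the figures above display.

```python
from math import comb

N = 16
p_null, p_alt = 0.5, 0.75  # Hnull: p = 0.5 vs. Halt: p = 0.75, as in the slides

def upper_tail(c: int, p: float) -> float:
    """P(X >= c | N, p)."""
    return sum(comb(N, x) * p**x * (1 - p)**(N - x) for x in range(c, N + 1))

# For each possible critical value c, the Size (Type I error rate under Hnull)
# and the Power (1 - Beta, the probability of rejecting when Halt is true).
print(" c    Size (Alpha)    Power (1 - Beta)")
for c in range(N + 1):
    print(f"{c:2d}    {upper_tail(c, p_null):11.4f}    {upper_tail(c, p_alt):15.4f}")
```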
Conclusion
The concepts we have outlined are the basis of all the statistical procedures that we use, though we usually have to take the mathematical details on trust.
The concepts are not very easy, and effort spent establishing a clear understanding of them will yield dividends.
Used effectively, Statistics are a good support; they can, however, be a soft underbelly for examiners, referees, and journal editors.

Finding out more ...

Next Week ...
Peter WATSON will speak on Exploratory Data Analysis.