Applying bootstrap methods to time series and regression models
"An Introduction to the Bootstrap" by Efron and Tibshirani, chapters 8-9
M.Sc. Seminar in statistics, TAU, March 2017
By Yotam Haruvi

The general problem
• So far, we've seen so-called one-sample problems.
• Our data was an i.i.d. sample from a single, unknown distribution F.
• Note that F may have been multidimensional.
• We generated bootstrap samples from F̂, which gave each observation a probability of 1/n.
• But not all datasets comply with this simple probabilistic structure…
• Examples?

The bootstrap schematic
• Real world: an unknown probabilistic model P generates the observed data x = (x_1, …, x_n), from which we compute the statistic of interest s = s(x).
• Bootstrap world: the estimated probabilistic model P̂ generates a bootstrap sample x* = (x_1*, …, x_n*), from which we compute the bootstrap replication s* = s(x*).
• We will focus today on how P̂ is extracted from the data.

Agenda
We wish to extend the bootstrap method to other, more complex data structures:
• Time series
• Regression
We will review several ad-hoc bootstrap methods for each of the structures above. But we'll start with a simple example - the two-sample problem.

Two-sample problem – the framework
• For example, blood pressure measurements in treatment and placebo groups.
• Let n denote the number of patients in the treatment group.
• Let m denote the number of patients in the placebo group.
• Our data: (z_1, …, z_n, y_1, …, y_m).
• This isn't a one-sample problem, since z and y may come from different distributions.

Two-sample problems – a bootstrap solution
• The extension of the bootstrap is simple:
• We denote the distribution of blood pressure in the treatment group by F. That is, F is the distribution that generated z.
• Similarly, G is the distribution that generated y.
• We estimate F and G separately.
• F̂ gives probability 1/n to each of z_1, …, z_n, and Ĝ gives probability 1/m to each of y_1, …, y_m.
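The separate-resampling scheme can be sketched in a few lines of Python. This is a minimal illustration under the slides' setup, not code from the source; the function name `two_sample_boot_se` and the use of NumPy are my own choices.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

def two_sample_boot_se(z, y, n_boot=2000):
    """Bootstrap SE of z-bar* - y-bar*, resampling each group separately:
    F-hat puts mass 1/n on each z_i, G-hat puts mass 1/m on each y_j."""
    z = np.asarray(z, dtype=float)
    y = np.asarray(y, dtype=float)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        z_star = rng.choice(z, size=len(z), replace=True)  # n draws from F-hat
        y_star = rng.choice(y, size=len(y), replace=True)  # m draws from G-hat
        diffs[b] = z_star.mean() - y_star.mean()
    return diffs.std(ddof=1)  # empirical SE of the bootstrap replications
```

Note that each group is resampled only from its own empirical distribution; the group sizes n and m are carried over unchanged from the real world to the bootstrap world.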
Two-sample problems – a bootstrap solution (cont.)
• A single bootstrap sample contains n samples from F̂ and m samples from Ĝ.
• For each bootstrap sample, we calculate z̄* − ȳ*.
• We can estimate the standard error of the difference of means.
• Which parameters from the "real world" were mapped with no change to the bootstrap world? What is the justification for that choice?

Time series
• A time series {y_t}, t = 1, …, n, is a dataset for which we have reason to believe that if t_1 and t_2 are "close enough", then y_{t_1} and y_{t_2} are also "close".
• Example: measuring the level of a hormone in one subject, every 10 minutes, during an 8-hour time window.
• We assume that all (48) observations have the same mean μ.

Time series - illustration
[Figure: luteinizing hormone level over time. Diggle, 1990: 48 measurements taken from a healthy woman, every 10 minutes.]

Time series – the problem
• We denote the centered time series by z_t = y_t − ȳ.
• We'd like to fit a first-order autoregressive scheme - an AR(1) model:
• z_t = β·z_{t−1} + ε_t, t = 2, …, 48
• −1 ≤ β ≤ 1, E(ε) = 0
• How well does this model fit?
• What is the SE of β̂?
• We'd like to apply the bootstrap method to answer that.
• Can we use "one-sample" bootstrap here?

Time series – a bootstrap solution
• We estimate β by least squares: β̂ = argmin_β Σ_{t=2}^{48} (z_t − β·z_{t−1})².
• We estimate the error terms ε_t by ε̂_t = z_t − β̂·z_{t−1}.
• A bootstrap sample is generated:
• z_1* = z_1 = y_1 − ȳ
• z_2* = β̂·z_1 + ε_2*
• z_3* = β̂·z_2* + ε_3*
• …
• z_48* = β̂·z_47* + ε_48*
• where the ε_t* are drawn randomly with replacement from {ε̂_2, …, ε̂_48}.
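The residual-resampling recursion can be sketched as follows. This is an illustrative Python sketch of the scheme, not the authors' code; `fit_ar1` and `ar1_residual_bootstrap` are hypothetical names, and the sketch also refits β on each replicate to produce the bootstrap SE.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_ar1(z):
    """Least-squares estimate of beta in z_t = beta * z_{t-1} + eps_t."""
    return np.sum(z[1:] * z[:-1]) / np.sum(z[:-1] ** 2)

def ar1_residual_bootstrap(y, n_boot=1000):
    """Residual bootstrap for an AR(1) fit: fix z_1* = z_1 and build each
    bootstrap series forward from resampled residuals."""
    y = np.asarray(y, dtype=float)
    z = y - y.mean()                       # centered series z_t = y_t - y-bar
    beta_hat = fit_ar1(z)
    resid = z[1:] - beta_hat * z[:-1]      # eps-hat_t, t = 2, ..., n
    n = len(z)
    betas = np.empty(n_boot)
    for b in range(n_boot):
        eps_star = rng.choice(resid, size=n - 1, replace=True)
        z_star = np.empty(n)
        z_star[0] = z[0]                   # z_1* = z_1
        for t in range(1, n):
            z_star[t] = beta_hat * z_star[t - 1] + eps_star[t - 1]
        betas[b] = fit_ar1(z_star)         # beta-hat* for this replicate
    return beta_hat, betas.std(ddof=1)     # estimate and its bootstrap SE
```

The key design point is that the fitted model generates the bootstrap series: each z_t* is built from β̂ and a resampled residual, rather than resampled directly from the data.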
Time series – a bootstrap solution (cont.)
• For each bootstrap sample we calculate the LS estimator β̂*.
• We can now estimate the SE of β̂ by the empirical SE of all the β̂*.
• We can extend easily to a second-order autoregressive scheme – AR(2).

Time series – moving blocks bootstrap
• A different approach to time series – the moving blocks bootstrap.
• Choose a block length (3 in our illustration).
• Sample with replacement from all possible contiguous blocks of that length.
• Align those blocks until you get a sample of (approximately) size n.
[Figure: the original sample, and a bootstrap sample assembled from resampled blocks.]

Time series – moving blocks bootstrap (cont.)
• For each bootstrap sample we calculate the LS estimator β̂*.
• We can now estimate the SE of β̂ by the empirical SE of all the β̂*.
• We can extend easily to a second-order autoregressive scheme – AR(2).

Time series – discussion of the two methods
• Advantage of the moving blocks approach: it doesn't depend on a specific model.
• Note: we still use an AR model in this framework, as we fit it on each bootstrap sample. The difference is that we don't use it to generate the bootstrap sample!
• Disadvantage of the moving blocks approach: how to choose a block length ℓ?
• ℓ should be large enough that observations more than ℓ time steps apart are approximately independent.
• ℓ = 1 is a one-sample bootstrap; it implies no correlation between neighbors.
• ℓ = n is not helpful: we would get the same estimator every time…
• The authors state that there wasn't (as of 1993) a solid method for choosing an optimal ℓ.

Regression – the framework
• Consider a regression model in which we observe pairs x_i = (c_i, y_i), where c_i is a vector of length p and i = 1, …, n.
• A model: y_i = c_i·β + ε_i, where β = (β_1, …, β_p)ᵀ.
• And a function G(β, x), where β ∈ ℝ^p and x ∈ ℝ^{n×(p+1)}, by which we measure the "goodness of fit" of a model.
• The classic framework also includes the assumption that the error terms ε come from a single (centered) distribution, and that they are independent of c.

Regression – the problem
• The most common "fit function": G(β, x) = Σ_{i=1}^n (y_i − c_i·β)².
• For least squares we can derive an analytical expression not only for β̂, but for SE(β̂) as well.
• If we assume normality, we can easily test H_0: β_j = 0.
• But what if we're interested, for example, in the more robust Least Median of Squares model, in which G(β, x) = median_i (y_i − c_i·β)²?
• We can (numerically) calculate β̂ = argmin_β {median_i (y_i − c_i·β)²}, but what about its SE?

Regression – bootstrap solutions
We will cover two different ways in which we can generate bootstrap samples:
• Bootstrapping pairs
• Bootstrapping residuals

Regression - bootstrapping pairs
• Bootstrapping pairs means that we draw (with replacement) n pairs from x_1 = (c_1, y_1), …, x_n = (c_n, y_n) to create a single bootstrap sample x*.
• For each bootstrap sample we calculate β̂* - the minimizer of G(β, x*).
• We can now estimate SE(β̂).

Regression - bootstrapping residuals
• We've already seen it in the context of time series…
• Bootstrapping residuals requires that we first calculate β̂ using the original sample.
• Then we estimate the error terms ε̂_i = y_i − c_i·β̂ and obtain an empirical distribution of errors.
• A bootstrap sample is generated: x_i* = (c_i, c_i·β̂ + ε_i*), where ε_i* is drawn with replacement from {ε̂_1, …, ε̂_n}.
• For each bootstrap sample, we calculate β̂* - the minimizer of G(β, x*). We can now estimate SE(β̂).

Regression – discussion of the two methods
• We will prefer bootstrapping pairs if the assumption that the error terms and the covariates are independent is violated.
• In other words, bootstrapping residuals is (slightly) more sensitive to the assumption above (though it seems the differences aren't large).
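The two resampling schemes can be sketched side by side for the least-squares case. This is a minimal Python sketch, not code from the slides; the function names are illustrative, and NumPy's `lstsq` stands in for the generic minimizer of G(β, x).

```python
import numpy as np

rng = np.random.default_rng(0)

def ls_fit(C, y):
    """Ordinary least-squares coefficients (the usual squared-error fit)."""
    return np.linalg.lstsq(C, y, rcond=None)[0]

def boot_pairs_se(C, y, n_boot=1000):
    """Bootstrapping pairs: resample whole rows (c_i, y_i) together."""
    n = len(y)
    betas = np.empty((n_boot, C.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # n pairs, with replacement
        betas[b] = ls_fit(C[idx], y[idx])
    return betas.std(axis=0, ddof=1)

def boot_residuals_se(C, y, n_boot=1000):
    """Bootstrapping residuals: keep the covariates fixed, resample eps-hat."""
    n = len(y)
    beta_hat = ls_fit(C, y)
    resid = y - C @ beta_hat               # eps-hat_i from the original fit
    betas = np.empty((n_boot, C.shape[1]))
    for b in range(n_boot):
        eps_star = rng.choice(resid, size=n, replace=True)
        y_star = C @ beta_hat + eps_star   # same c_i, new synthetic responses
        betas[b] = ls_fit(C, y_star)
    return betas.std(axis=0, ddof=1)
```

When the classical assumptions hold, the two SE estimates should be close on the same data; the contrast to notice is that `boot_residuals_se` reuses the design matrix C in every replicate, while `boot_pairs_se` resamples it.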
• When bootstrapping residuals, each bootstrap sample has exactly the same covariate vectors as the original sample. This structure is suitable for data in which there is no variability in the covariates, such as a designed experiment.
• As n grows, bootstrapping pairs approaches bootstrapping residuals.

Conclusion
• Some data structures - anything but i.i.d. samples - require more careful thinking about the process by which we extract F̂ from the observed data.
• We've seen that in the presence of a statistical model, one way of dealing with this issue is bootstrapping residuals. We've applied it to a time series model as well as to a regression model.
• The downside of bootstrapping residuals may be its reliance on some of the model's assumptions.
• To tackle this problem, we've offered slightly more robust approaches: moving blocks in the context of time series, and bootstrapping pairs in the context of regression.
• It turns out that in many cases, the different methods agree, even if not all model assumptions are justified.

Thank you!