Applying bootstrap methods to time series and regression models
"An Introduction to the Bootstrap" by Efron and Tibshirani, chapters 8-9
M.Sc. Seminar in statistics, TAU, March 2017
By Yotam Haruvi
The general problem
• So far, we've seen so-called one-sample problems.
• Our data was an i.i.d. sample from a single, unknown distribution F.
• Note that F may have been multidimensional.
• We generated bootstrap samples from F̂, which gave each observation a probability of 1/n.
• But not all datasets comply with this simple probabilistic structure…
• Examples?
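As a reminder, the one-sample bootstrap can be sketched as follows (a minimal illustration with hypothetical data and the sample median as the statistic of interest; numpy is assumed available):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=100)  # observed i.i.d. sample from an unknown F

B = 2000
stats = np.empty(B)
for b in range(B):
    # F-hat puts probability 1/n on each observation, so sampling from
    # F-hat means sampling the data with replacement
    x_star = rng.choice(x, size=x.size, replace=True)
    stats[b] = np.median(x_star)

se_hat = stats.std(ddof=1)  # bootstrap estimate of SE(median)
```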
[Diagram: real world vs. bootstrap world. In the real world, an unknown probabilistic model F generates the observed data x = (x_1, …, x_n), from which we compute the statistic of interest s = s(x). In the bootstrap world, the estimated probabilistic model F̂ generates a bootstrap sample x* = (x_1*, …, x_n*), from which we compute the bootstrap replication s* = s(x*). We will focus on the estimated-model part today.]
Agenda
We wish to extend the bootstrap method to other, more complex data structures:
• Time series
• Regression
We will review several ad-hoc bootstrap methods for each of the structures above.
But we'll start with a simple example: the two-sample problem.
Two-sample problem – the framework
• For example, blood pressure measurements in treatment and placebo groups.
• Let n denote the number of patients in the treatment group.
• Let m denote the number of patients in the placebo group.
• Our data: (z_1, …, z_n, y_1, …, y_m).
• This isn't a one-sample problem, since Z and Y may come from different distributions.
Two-sample problems – a bootstrap solution
• The extension of the bootstrap is simple:
• We denote the distribution of blood pressure in the treatment group by F. That is, F is the distribution that generated Z.
• Similarly, G is the distribution that generated Y.
• We estimate F and G separately.
• F̂ gives probability 1/n to each of z_1, …, z_n, and Ĝ gives probability 1/m to each of y_1, …, y_m.
Two-sample problems – a bootstrap solution
• A single bootstrap sample contains n samples from F̂ and m samples from Ĝ.
• For each bootstrap sample, we calculate z̄* − ȳ*.
• We can estimate the standard error of the difference of means.
• What parameters from the "real world" were mapped with no change to the bootstrap world? What is the justification for that choice?
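The two-sample scheme above can be sketched in code (a minimal sketch with hypothetical blood-pressure numbers; group sizes, means, and variances are ours, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
# hypothetical data: n treatment measurements (z) and m placebo measurements (y)
z = rng.normal(118.0, 10.0, size=30)   # draws from F
y = rng.normal(125.0, 10.0, size=40)   # draws from G

B = 2000
diffs = np.empty(B)
for b in range(B):
    # F and G are estimated separately, so each group is resampled on its own
    z_star = rng.choice(z, size=z.size, replace=True)
    y_star = rng.choice(y, size=y.size, replace=True)
    diffs[b] = z_star.mean() - y_star.mean()

se_diff = diffs.std(ddof=1)  # bootstrap SE of the difference of means
```

Note that the group sizes n and m are carried over unchanged from the real world to the bootstrap world.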
Time series
• A time series (y_t), t = 1, …, n, is a dataset for which we have a reason to believe that if t_1 and t_2 are "close enough", then y_{t_1} and y_{t_2} are also "close".
• Example: measuring the level of a hormone in one subject, every 10 minutes, during an 8-hour time window.
• We assume that all (48) observations have the same mean μ.
Time series – illustration
[Plot: luteinizing hormone level (roughly 1 to 4) against time point.]
Diggle, 1990: 48 measurements taken from a healthy woman, every 10 minutes
Time series – the problem
• We denote the centered time series by z_t = y_t − ȳ.
• We'd like to fit a first-order autoregressive scheme, an AR(1) model:
• z_t = β·z_{t−1} + ε_t,  t = 2, …, 48
• −1 ≤ β ≤ 1,  E(ε) = 0
• How well does this model fit?
• What is the SE of β̂?
• We'd like to apply the bootstrap method to answer that.
• Can we use "one-sample" bootstrap here?
Time series – a bootstrap solution
• We estimate β using LS: β̂ = argmin_β Σ_{t=2}^{48} (z_t − β·z_{t−1})²
• We estimate the error terms ε_t by ε̂_t = z_t − β̂·z_{t−1}
• A bootstrap sample is generated:
• z_1* = z_1 = y_1 − ȳ
• z_2* = β̂·z_1* + ε_2*
• z_3* = β̂·z_2* + ε_3*
• …
• z_48* = β̂·z_47* + ε_48*
• Where the ε_t* are drawn randomly with replacement from {ε̂_2, …, ε̂_48}.
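The recursion above can be sketched as follows (a minimal sketch; the series here is simulated as a hypothetical stand-in for the 48 hormone measurements, and numpy is assumed available):

```python
import numpy as np

rng = np.random.default_rng(2)

# hypothetical stand-in for the 48 hormone measurements
n = 48
y = np.empty(n)
y[0] = 2.4
for t in range(1, n):
    y[t] = 2.4 + 0.6 * (y[t - 1] - 2.4) + rng.normal(0, 0.3)

z = y - y.mean()                     # centered series z_t = y_t - y-bar

def ls_beta(z):
    # LS estimate: beta-hat = argmin sum_t (z_t - beta * z_{t-1})^2
    return np.sum(z[1:] * z[:-1]) / np.sum(z[:-1] ** 2)

beta_hat = ls_beta(z)
resid = z[1:] - beta_hat * z[:-1]    # estimated error terms eps-hat_t, t = 2..48

B = 1000
betas = np.empty(B)
for b in range(B):
    eps_star = rng.choice(resid, size=n - 1, replace=True)
    z_star = np.empty(n)
    z_star[0] = z[0]                 # z_1* = z_1
    for t in range(1, n):
        # z_t* = beta-hat * z_{t-1}* + eps_t*
        z_star[t] = beta_hat * z_star[t - 1] + eps_star[t - 1]
    betas[b] = ls_beta(z_star)

se_beta = betas.std(ddof=1)          # bootstrap SE of beta-hat
```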
Time series – a bootstrap solution
• For each bootstrap sample we calculate the LS estimator β̂*.
• We can now estimate the SE of β̂ by the empirical SE of all the β̂*.
• We can extend easily to a second-order autoregressive scheme, AR(2).
Time series – moving blocks bootstrap
• A different approach to time series: the moving blocks bootstrap.
• Choose a block length (3 in our illustration).
• Sample with replacement from all possible contiguous blocks of that length.
• Align those blocks until you get a sample of (approximately) size n.
[Illustration: contiguous blocks of the original sample are resampled and concatenated to form the bootstrap sample.]
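The block-resampling step can be sketched as follows (a minimal sketch with a hypothetical centered series; block length 3 matches the illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
z = rng.normal(size=48)            # hypothetical stand-in for the centered series

n, block_len = z.size, 3           # block length l = 3, as in the illustration
# all possible contiguous blocks of length l
blocks = np.array([z[i:i + block_len] for i in range(n - block_len + 1)])

# sample blocks with replacement, then align them and trim to size n
n_blocks = int(np.ceil(n / block_len))
chosen = rng.integers(0, len(blocks), size=n_blocks)
z_star = np.concatenate(blocks[chosen])[:n]
```

Each bootstrap sample `z_star` preserves the within-block dependence structure without assuming any particular model for the series.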
Time series – moving blocks bootstrap
• For each bootstrap sample we calculate the LS estimator β̂*.
• We can now estimate the SE of β̂ by the empirical SE of all the β̂*.
• We can extend easily to a second-order autoregressive scheme, AR(2).
Time series – discussion of two methods
• Advantage of the moving blocks approach: it doesn't depend on a specific model.
• Note: we still use an AR model in this framework, as we apply it to each bootstrap sample. The difference is that we don't use it to generate the bootstrap sample!
• Disadvantage of the moving blocks approach: how to choose a block length l?
• l should be large enough so that observations that are more than l time steps apart from each other are approximately independent.
• l = 1 is the one-sample bootstrap; it implies no correlation between neighbors.
• l = n is not helpful; we will get the same estimators…
• The authors state that there isn't (as of 1993) a solid method for choosing an optimal l.
Regression – the framework
• Consider a regression model in which we observe pairs x_i = (c_i, y_i), where c_i is a vector of length p and i = 1, …, n.
• A model: y_i = c_i·β + ε_i, where β = (β_1, …, β_p)ᵀ.
• And a function G(β, x), where β ∈ ℝ^p and x ∈ ℝ^{n×(p+1)}, by which we measure the "goodness of fit" of a model.
• The classic framework also includes the assumption that the error terms ε come from a single (centered) distribution, and that they are independent of c.
Regression – the problem
• The most common "fit function": G(β, x) = Σ_{i=1}^{n} (y_i − c_i·β)²
• We can derive an analytical expression, not only for β̂, but for SE(β̂) as well.
• If we assume normality, we can easily test H_0: β_j = 0.
• But what if we're interested, for example, in the more robust Least Median of Squares model, in which G(β, x) = median_i (y_i − c_i·β)²?
• We can (numerically) calculate β̂ = argmin_β {median_i (y_i − c_i·β)²}, but what about its SE?
Regression – bootstrap solutions
We will cover two different ways in which we can generate bootstrap samples:
• Bootstrapping pairs
• Bootstrapping residuals
Regression – bootstrapping pairs
• Bootstrapping pairs means that we draw (with replacement) n pairs from x_1 = (c_1, y_1), …, x_n = (c_n, y_n), to create a single bootstrap sample x*.
• For each bootstrap sample we calculate β̂*, the minimizer of G(β, x*).
• We can now estimate SE(β̂).
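Bootstrapping pairs can be sketched as follows (a minimal sketch on hypothetical data, using the sum of squares as G, so the minimizer is the LS solution; numpy is assumed available):

```python
import numpy as np

rng = np.random.default_rng(4)
# hypothetical data: n observations, covariate vectors c_i of length p = 2
n, p = 50, 2
C = np.column_stack([np.ones(n), rng.normal(size=n)])   # intercept + one covariate
y = C @ np.array([1.0, 2.0]) + rng.normal(0, 0.5, size=n)

B = 1000
betas = np.empty((B, p))
for b in range(B):
    idx = rng.integers(0, n, size=n)        # draw n pairs (c_i, y_i) with replacement
    C_star, y_star = C[idx], y[idx]
    # G here is the sum of squares, so the minimizer is the LS solution
    betas[b], *_ = np.linalg.lstsq(C_star, y_star, rcond=None)

se_beta = betas.std(axis=0, ddof=1)         # bootstrap SE of each coefficient
```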
Regression – bootstrapping residuals
• We've already seen it in the context of time series…
• Bootstrapping residuals requires that we first calculate β̂ using the original sample.
• Then we estimate the error terms ε̂_i = y_i − c_i·β̂ and obtain an empirical distribution of errors.
• A bootstrap sample is generated: x_i* = (c_i, c_i·β̂ + ε_i*), where ε_i* is drawn with replacement from {ε̂_1, …, ε̂_n}.
• For each bootstrap sample, we calculate β̂*, the minimizer of G(β, x*). We can now estimate SE(β̂).
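Bootstrapping residuals can be sketched in the same hypothetical setup (a minimal sketch; note that the design matrix C stays fixed across bootstrap samples, unlike in bootstrapping pairs):

```python
import numpy as np

rng = np.random.default_rng(5)
# hypothetical data: fixed design C, responses y
n, p = 50, 2
C = np.column_stack([np.ones(n), rng.normal(size=n)])
y = C @ np.array([1.0, 2.0]) + rng.normal(0, 0.5, size=n)

# step 1: fit beta-hat on the original sample (LS here)
beta_hat, *_ = np.linalg.lstsq(C, y, rcond=None)
resid = y - C @ beta_hat                # estimated error terms eps-hat_i

B = 1000
betas = np.empty((B, p))
for b in range(B):
    eps_star = rng.choice(resid, size=n, replace=True)
    y_star = C @ beta_hat + eps_star    # covariates stay fixed; only errors resampled
    betas[b], *_ = np.linalg.lstsq(C, y_star, rcond=None)

se_beta = betas.std(axis=0, ddof=1)     # bootstrap SE of each coefficient
```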
Regression – discussion of two methods
• We will prefer bootstrapping pairs if the assumption that the error terms and covariates are independent is violated.
• In other words, bootstrapping residuals is (slightly) more sensitive to the assumption above (it seems that the differences aren't large).
• When bootstrapping residuals, each bootstrap sample has exactly the same covariate vectors as the original sample. This structure is suitable for data in which there is no variability in the covariates.
• As n grows, bootstrapping pairs approaches bootstrapping residuals.
Conclusion
• Some data structures (anything but i.i.d. samples) require more careful thinking about the process by which we extract F̂ from the observed data.
• We've seen that in the presence of a statistical model, one way of dealing with this issue is bootstrapping residuals. We've applied it to a time series model as well as to a regression model.
• The downside of bootstrapping residuals may be its reliance on some of the model's assumptions.
• To tackle this problem, we've offered slightly more robust approaches: moving blocks in the context of time series, and bootstrapping pairs in the context of regression.
• It turns out that in many cases, different methods agree, even if not all model assumptions are justified.
Thank you!