The Problem with Parameter Redundancy
Diana Cole, University of Kent
Parameter Redundancy
• A model is parameter redundant (or non-identifiable) if it is not possible to estimate all of its parameters.
• Consider a basic occupancy model, which considers whether or not a species is present at a particular site.
– Parameters: ψ, the probability that a site is occupied; p, the probability that the species is detected at an occupied site.
– Species detected at a site with probability ψp.
– Species not detected at a site with probability ψ(1 − p) + (1 − ψ) = 1 − ψp.
– The basic model is parameter redundant: only the product ψp can be estimated, rather than ψ and p separately.
• There are several different methods for detecting parameter redundancy, including
– numerical methods (e.g. Viallefont et al., 1998)
– symbolic methods (e.g. Cole et al., 2010)
– the hybrid symbolic-numeric method (Choquet and Cole, 2012)
• These methods generally involve calculating the rank of a matrix, which gives the number of parameters that can be estimated.
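The occupancy example can be checked directly. The sketch below is a minimal illustration rather than the full symbolic method of Cole et al. (2010): it assumes each site is surveyed once, writes out by hand the derivatives of the two outcome probabilities with respect to (ψ, p), and reads the rank off the determinant. The function names and the data (50 sites, 20 detections) are hypothetical.

```python
import math

def derivative_matrix(psi, p):
    """Derivatives of [P(detected), P(not detected)] = [psi*p, 1 - psi*p].

    Row 0 differentiates with respect to psi, row 1 with respect to p.
    """
    return [[p, -p],
            [psi, -psi]]

def rank_2x2(m, tol=1e-12):
    """Rank of a 2x2 matrix from its determinant and entries."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if abs(det) > tol:
        return 2
    return 1 if any(abs(x) > tol for row in m for x in row) else 0

def log_likelihood(psi, p, n_sites, n_detected):
    """Binomial log-likelihood (constant term dropped) for single-visit data."""
    q = psi * p
    return n_detected * math.log(q) + (n_sites - n_detected) * math.log(1.0 - q)

# Hypothetical values: any (psi, p) with the same product gives the same fit.
rank = rank_2x2(derivative_matrix(0.8, 0.5))
ll_a = log_likelihood(0.8, 0.5, n_sites=50, n_detected=20)
ll_b = log_likelihood(0.5, 0.8, n_sites=50, n_detected=20)
print(rank, ll_a, ll_b)
```

The rank is 1 although there are two parameters, so only one quantity (the product ψp) can be estimated, and the two parameter sets give identical log-likelihoods.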
Problems with Parameter Redundancy
• There will be a flat ridge in the likelihood of a parameter redundant model (Catchpole and Morgan, 1997), resulting in more than one set of maximum likelihood estimates.
• Numerical methods for finding the MLE will not pick up the flat ridge, although it could be picked up by trying multiple starting values and looking at profile log-likelihoods.
• The Fisher information matrix will be singular (Rothenberg, 1971), and therefore the standard errors will be undefined.
• However, the exact Fisher information matrix is rarely known. Standard errors are typically approximated using a Hessian matrix obtained numerically. Can parameter redundancy be detected from the standard errors?
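To see why a numerically obtained Hessian can hide the singularity, consider a single-visit occupancy model in which each site yields a detection with probability ψp. The sketch below builds the exact expected information matrix for that model (singular by construction) and then perturbs one entry by ±0.01. That perturbation is a hypothetical stand-in for finite-difference error, not a value from the talk. Standard errors are taken as square roots of the diagonal of the inverse Hessian.

```python
import cmath

def information_matrix(psi, p, n_sites):
    """Exact expected information for (psi, p) when each of n_sites
    independently yields a detection with probability psi * p."""
    q = psi * p
    scale = n_sites / (q * (1.0 - q))
    return [[scale * p * p, scale * p * psi],
            [scale * p * psi, scale * psi * psi]]

def determinant(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def standard_errors(hessian):
    """Square roots of the diagonal of the inverse of a 2x2 Hessian.

    cmath.sqrt turns a negative variance into an imaginary number
    instead of raising an error.
    """
    det = determinant(hessian)
    variances = [hessian[1][1] / det, hessian[0][0] / det]
    return [cmath.sqrt(v) for v in variances]

exact = information_matrix(psi=0.8, p=0.5, n_sites=50)

# Hypothetical numerical error of +/-0.01 in a single entry.
too_large = [[exact[0][0] + 0.01, exact[0][1]], exact[1]]
too_small = [[exact[0][0] - 0.01, exact[0][1]], exact[1]]

print(determinant(exact))          # zero up to rounding: singular
print(standard_errors(too_large))  # defined but very large
print(standard_errors(too_small))  # imaginary
```

Under these assumptions, a small error in one direction gives standard errors that are defined but far larger than the parameter range, and the same error in the other direction gives imaginary ones.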
Is example 1 parameter redundant?
Parameter   Estimate   Standard Error
θ1          0.39       imaginary
θ2          0.64       0.061
θ3          0.09       imaginary
θ4          0.18       imaginary
• The Hessian (H) computed numerically has rank 4 (the exact Hessian would have rank < 4 if the model is parameter redundant).
• Singular Value Decomposition
• Write H = USV, where S is a diagonal matrix of singular values (for a symmetric Hessian, the absolute values of its eigenvalues); the number of non-zero values is the rank of the matrix.
• S_ii = 68.65, 48.3996, 12.7670, 0.0019
• Standardised: 1, 0.71, 0.19, 0.000028
• Hybrid symbolic-numeric method: rank 3, only θ2 is estimable.
• Symbolic method: rank 3, estimable parameter combinations θ2, (1 − θ1)θ3, θ1θ4.
Is example 2 parameter redundant?
Parameter   Estimate   Standard Error
θ1          0.41       0.70
θ2          0.83       0.07
θ3          0.10       0.11
θ4          0.19       0.33
• The Hessian (H) computed numerically has rank 4 (the exact Hessian would have rank < 4 if the model is parameter redundant).
• Standardised Singular Value Decomposition: 1, 0.70, 0.045, 0.0010
• Hybrid symbolic-numeric method: rank 3, only θ2 is estimable.
• Symbolic method: rank 3, estimable parameter combinations θ2, (1 − θ1)θ3, θ1θ4.
Is example 3 parameter redundant?
Parameter   Estimate   Standard Error
θ1          0.37       0.19
θ2          0.48       0.19
θ3          0.39       0.20
θ4          0.34       0.17
θ5          0.40       0.20
θ6          0.65       0.06
θ7          0.10       0.03
θ8          0.18       0.09
• Standardised Singular Value Decomposition: 1.00, 0.65, 0.11, 0.096, 0.074, 0.039, 0.034, 0.0011
• Hybrid symbolic-numeric method: rank 8, so the model is not parameter redundant.
• Symbolic method: rank 8, so the model is not parameter redundant, but further tests reveal that the model could be near redundant, because when θ1 = θ2 = θ3 = θ4 = θ5 the model is the same as example 1.
Simulation Study for Example 1/2
Parameter   True Value   Average MLE   St. Dev. MLE
θ1          0.4          0.57          0.27
θ2          0.7          0.50          0.29
θ3          0.1          0.50          0.31
θ4          0.2          0.52          0.30
52% have defined standard errors
SVD threshold   % SVD test correct
0.01            100%
0.001           72%
0.0001          11%
0.00001         2%
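The sensitivity to the threshold can also be seen in the three fitted examples. The sketch below does not reproduce the simulation study. It only applies the four thresholds from the table to the standardised singular values reported for examples 1 to 3, and compares the resulting numerical rank with the rank given by the symbolic method. Counting values strictly above the threshold is an assumption.

```python
# Standardised singular values and symbolic ranks reported for each example.
examples = {
    "example 1": ([1.0, 0.71, 0.19, 0.000028], 3),
    "example 2": ([1.0, 0.70, 0.045, 0.0010], 3),
    "example 3": ([1.00, 0.65, 0.11, 0.096, 0.074, 0.039, 0.034, 0.0011], 8),
}
thresholds = [0.01, 0.001, 0.0001, 0.00001]

def numerical_rank(standardised, threshold):
    """Count standardised singular values strictly above the threshold."""
    return sum(1 for s in standardised if s > threshold)

# For each threshold, record whether the numerical rank matches the symbolic rank.
results = {
    t: {name: numerical_rank(values, t) == symbolic_rank
        for name, (values, symbolic_rank) in examples.items()}
    for t in thresholds
}

for threshold, outcome in results.items():
    print(threshold, outcome)
```

A threshold of 0.01 recovers rank 3 for examples 1 and 2 but gives rank 7 for example 3, which the symbolic method shows to be full rank (though near redundant). Smaller thresholds get example 3 right but begin to miss the redundancy in examples 1 and 2.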
Computer Packages and Parameter Redundancy
MARK (Cooch and Evans, 2014)
• Counts the number of estimable parameters using a numerical procedure involving a Singular Value Decomposition, if “2ndPart” rather than “Hessian” is chosen for variance estimation.
• With the “Hessian” method parameter redundancy is missed, which supports the recommendation of Cooch and Evans (2014) to use the default of “2ndPart”.
• Standard errors for non-identifiable parameters are either very large or zero and should be ignored. Parameter estimates for non-identifiable parameters are unreliable and should also be ignored.
• Parameter redundancy could be caused by the model or the data.
• Refitting any parameter redundant model with suitable constraints is recommended.
Computer Packages and Parameter Redundancy
M-SURGE / E-SURGE (Choquet et al., 2004; Choquet et al., 2009)
• Uses the hybrid symbolic-numeric method to detect parameter redundancy, but cannot tell whether the redundancy is caused by the model or the data. (Parameter redundancy caused by the model could be examined by using simulated data.)
• Reports which parameters can and cannot be estimated, but cannot find estimable parameter combinations in parameter redundant models (currently only possible symbolically).
• Refitting parameter redundant models with suitable constraints is also recommended.
Conclusion
• It is not always possible to tell from model fitting alone that a model is parameter redundant.
• It is recommended to use at least a numerical method to check for parameter redundancy, but symbolic or hybrid methods are more reliable.
• Fitting parameter redundant models results in large bias for non-identifiable parameters and can introduce bias in the identifiable parameters.
• If a model is parameter redundant, it needs to be (re)fitted with constraints, which can be obtained using the symbolic method.
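As a minimal sketch of refitting with a constraint, take the single-visit occupancy model in which only ψp is estimable. If p is fixed at a known value, ψ becomes estimable. The value p = 0.5 and the data below are hypothetical, and the constraint is chosen by hand here rather than derived with the symbolic method.

```python
import math

def fit_product(n_sites, n_detected):
    """MLE of the estimable combination psi * p in the single-visit model."""
    return n_detected / n_sites

def fit_psi_with_fixed_p(n_sites, n_detected, p_fixed):
    """MLE of psi once p is constrained to a known value.

    The estimate is capped at 1 because psi is a probability.
    """
    return min(1.0, fit_product(n_sites, n_detected) / p_fixed)

# Hypothetical data: 20 detections at 50 sites, with p constrained to 0.5.
product_hat = fit_product(n_sites=50, n_detected=20)
psi_hat = fit_psi_with_fixed_p(n_sites=50, n_detected=20, p_fixed=0.5)
print(product_hat, psi_hat)
```

The estimate of ψ is only as good as the constraint: a different fixed value of p gives a different ψ with exactly the same likelihood.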
References
• Catchpole, E.A. and Morgan, B.J.T. (1997) Detecting parameter redundancy. Biometrika, 84, 187-196.
• Choquet, R. and Cole, D.J. (2012) A Hybrid Symbolic-Numerical Method for Determining Model Structure. Mathematical Biosciences, 236, p117.
• Choquet, R., Reboulet, A.M., Pradel, R., Gimenez, O. and Lebreton, J.D. (2004) M-SURGE: new software specifically designed for multistate capture-recapture models. Animal Biodiversity and Conservation, 27(1), 207-215.
• Choquet, R., Rouan, L. and Pradel, R. (2009) Program E-SURGE: a software application for fitting Multievent models. Series: Environmental and Ecological Statistics, Vol. 3, Thomson, David L.; Cooch, Evan G.; Conroy, Michael J. (Eds.), p 845-865.
• Cole, D.J., Morgan, B.J.T. and Titterington, D.M. (2010) Determining the Parametric Structure of Non-Linear Models. Mathematical Biosciences, 228, 16-30.
• Cooch and Evans (2014) Program Mark. A Gentle Introduction.
• Rothenberg, T.J. (1971) Identification in parametric models. Econometrica, 39, 577-591.
• Viallefont, A., Lebreton, J.D., Reboulet, A.M. and Gory, G. (1998) Parameter Identifiability and Model Selection in Capture-Recapture Models: A Numerical Approach. Biometrical Journal, 40, 313-325.