36th ANNUAL GIRO CONVENTION
Theoretical and Practical Aspects of Parameter and Model Uncertainty
Edinburgh, October 2009
Dimitris Papachristou, AON BENFIELD

Topics
- Sources of parameter uncertainty
- Diversifiable and non-diversifiable parameter risk
- Effect of model/parameter uncertainty on a single risk and on a portfolio of risks
- Methods of estimating parameter uncertainty: analytical, Monte Carlo, asymptotic approximations
- Pitfalls of commonly used methods
- Model uncertainty and common actuarial distributions
- Applications: frequency of aviation losses, risk transfer assessment

Sources of Uncertainty of Some Risk Quantity
- Process risk: due to the stochastic nature of the risk, given a statistical model and its parameters
- Parameter uncertainty: uncertainty in the values of the parameters of the statistical model
- Model uncertainty: uncertainty in the choice of the statistical model

Types of Parameter and Model Uncertainty
The nature of parameter and model risk is similar:
- statistical uncertainty due to the limited amount of data
- uncertainty because of heterogeneity in the portfolio, e.g. a portfolio of motor policyholders with a mix of drivers of different skills
- uncertainty because of changes in the parameters over time
Parameter uncertainty caused by errors in the data, or by uncertainty in our estimates of the losses, is an important source of uncertainty, but it is not examined in this paper.

Diversifiable and Non-Diversifiable Parameter Risk

Three Experiments – Nature of Parameter Uncertainty
- 1st experiment: a fair coin with known p = 0.5
- 2nd experiment: several coins, each with a different p, where p ~ U(0, 1)
- 3rd experiment: one coin with p ~ U(0, 1); the same coin is used for several throws

1st experiment: fair coin with p = 0.5.
Insurance analogy: we have sufficient information to assess the parameters of the risk with reasonable accuracy, or simply a decision has been made to ignore parameter risk. Here
\[ P[N = k] = \binom{n}{k} p^k (1-p)^{n-k}. \]

2nd experiment: several coins, each set of trials has a different p, with p ~ U(0, 1). Insurance analogy: a heterogeneous portfolio of policies, e.g. a portfolio of motor policies where the drivers have varying driving skills and each driver has a different risk parameter. Here
\[ P[N = k] = \int_0^1 \binom{n}{k} p^k (1-p)^{n-k} \, dp. \]

3rd experiment: one coin, p ~ U(0, 1), several throws. Insurance analogy: limited information to assess the parameters of a single type of risk. In the third experiment I would prefer to look at the probability of heads as a random variable:
\[ P[N = k] = \binom{n}{k} p^k (1-p)^{n-k}, \]
which is a function of the random variable p, with p ~ U(0, 1).

Parameter Uncertainty Due to Limited Amount of Data
- The process risk can be diversified over time, or over a portfolio of similar and independent policies. The degree of diversification may be limited by the time horizon of the risk taker or by the number of policies in the portfolio.
- The parameter and model risk for a single type of policy cannot always be diversified over time. The model and its parameters are fixed but unknown.
- The belief about the value of the parameters could take the form of a prior distribution, or, combined with some available data, the form of a posterior distribution.
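The first two experiments can be checked numerically; a minimal sketch (function names are illustrative), with the mixture integral evaluated by a midpoint rule:

```python
from math import comb

def binom_pmf(n, k, p):
    # 1st experiment: P[N = k] for a coin with known p
    return comb(n, k) * p**k * (1 - p)**(n - k)

def mixture_pmf(n, k, steps=20000):
    # 2nd experiment: P[N = k] with p ~ U(0,1), integrated numerically;
    # analytically this integral equals 1/(n+1) for every k
    h = 1.0 / steps
    return sum(binom_pmf(n, k, (i + 0.5) * h) for i in range(steps)) * h

print(binom_pmf(10, 5, 0.5))   # 0.24609375
print(mixture_pmf(10, 5))      # ~ 1/11, and mixture_pmf(10, 0) gives the same
```

In the third experiment the same pmf applies, but p is kept as a random variable, so P[N = k] itself has a distribution rather than a single value.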
- In practice we do not have a purely homogeneous portfolio (a single type) of policies.
- As time passes, more data is collected and the parameter estimates and parameter distributions are updated.

Parameter Uncertainty Due to Limited Amount of Data
- Summary descriptions of the risk, such as the mean, make more sense when the risk can be diversified. The mean of the process risk distribution shows the expected loss over time, or over a very large portfolio of identical and independent policies.
- The mean of the prior or posterior distribution of the risk parameters can always be calculated, but it does not necessarily have an intuitive practical interpretation.
- When insurance practitioners, especially practitioners who are not mathematically trained, refer to the "1 in 10 years event", it is doubtful whether they refer to both the space of possible loss outcomes and the space of possible parameter and model values.
- The discussion touches on the old debate between "frequentists" and "Bayesians". I accept both philosophical interpretations of probability and their applications, but in risk applications I distinguish between probability as an opinion and probability as the frequency of an experiment.

Comparison: Mixture Distribution – Fixed Known Parameter – Fixed but Unknown Parameter
First case: fixed and known parameter. Let us assume that X is a random variable following an Exponential(λ) distribution, where the parameter λ is fixed and known:
\[ F(x; \lambda) = 1 - e^{-\lambda x}, \qquad \lambda, x > 0. \]
A p-percentile is given by
\[ h(p; \lambda) = \frac{-\ln(1-p)}{\lambda}. \]

Second case: heterogeneous portfolio – mixture distribution. Now let us assume that X | λ ~ Exponential(λ), and that λ is a random variable which follows a Gamma(a, b) distribution.
The unconditional distribution function of X is
\[ F(x; a, b) = \int_0^x \int_0^\infty \lambda e^{-\lambda y} \, \frac{b^a \lambda^{a-1} e^{-b\lambda}}{\Gamma(a)} \, d\lambda \, dy = 1 - \left( \frac{b}{b+x} \right)^a, \qquad x, a, b > 0. \]
A p-percentile is given by
\[ h(p; a, b) = \frac{b}{(1-p)^{1/a}} - b. \]

Third case: single policy with unknown parameter. Here the percentile
\[ h(p; \lambda) = \frac{-\ln(1-p)}{\lambda} \]
is a random variable, because λ is a random variable which follows a Gamma(a, b) distribution. Although the mean of this random variable does not have a clear intuitive interpretation, one could estimate it:
\[ E[h(p; \lambda)] = E\!\left[ \frac{-\ln(1-p)}{\lambda} \right] = \int_0^\infty \frac{-\ln(1-p)}{\lambda} \, \frac{b^a \lambda^{a-1} e^{-b\lambda}}{\Gamma(a)} \, d\lambda = \frac{-\ln(1-p)\, b}{a-1}, \qquad a > 1. \]

Numerical example: take p = 0.95; λ = 1 when it is fixed and known; and λ ~ Gamma(a = 1.1, b = 1.1) when assumed to be a random variable, so that E[λ] = 1.
- First case: h(p; λ) = −ln(1 − p)/λ = 3.00
- Second case: h(p; a, b) = b(1 − p)^{−1/a} − b = 15.66
- Third case: E[h(p; λ)] = −ln(1 − p)·b/(a − 1) = 32.95
The results differ significantly in the three cases, and the ordering is not always the same.

Simulation: Mixture Distribution – Fixed but Unknown Parameter
- Simulation Method 1: a parameter is simulated, and then a loss is simulated given the simulated parameter. This is repeated many times. The 95th percentile of the empirical simulated distribution is the estimate of the 95th percentile.
- Simulation Method 2: a parameter is simulated, and then the 95th percentile of the distribution given the simulated parameter is calculated. By repeating this process a number of times, an estimate of the distribution of the 95th percentile is obtained.
The first method is more appropriate for the case where we have a heterogeneous portfolio of policies.
The second method is more appropriate when we have a single type of risk and the parameter is considered to be a random variable. In practice, the first method tends to be more common, irrespective of whether one policy or a portfolio of policies with different parameters is examined.

Portfolio of Policies
Heterogeneous portfolio: e.g. the frequency of motor accidents follows Poisson(λ), and λ is different for each driver, following a Gamma(a, b) distribution. In general,
\[ E[X] = E[E[X \mid \theta]], \qquad V[X] = V[E[X \mid \theta]] + E[V[X \mid \theta]]. \]
For a heterogeneous portfolio, the process and parameter risk can be diversified by increasing the number of independent policies. In practice there may be some dependency between the policies, but this will not be due to the different risk parameters.

Homogeneous portfolio: e.g. lives of identical risk whose probability of death depends on a life table, the parameters of which are subject to estimation error. The process risk can be diversified by increasing the number of policies, but the parameter risk cannot be diversified by increasing the number of identical policies. My preferred treatment of quantities such as percentiles of the loss to the portfolio is to consider them as random variables and present their distribution. However, some may want to treat the space of parameter values in the same way they would treat the space of loss outcomes. Alternatively, we could consider groups of large portfolios, each with its own parameter.

Result 1: if \( \{X_j \mid \Theta\}_{j=1}^{\infty} \) are independent and identically distributed random variables, then it can be shown that
\[ \mathrm{Cov}(X_i, X_j) = V[E[X_i \mid \Theta]]. \]
This is an interesting result in its own right when correlation is modelled through a common parameter.

Result 2: if \( \{X_j \mid \Theta\}_{j=1}^{\infty} \) are independent and identically distributed random variables, then
\[ V\!\left[\sum_{i=1}^{n} X_i\right] = n \cdot V[X_i] + n(n-1)\,\mathrm{Cov}(X_i, X_j), \qquad \frac{1}{n^2}\, V\!\left[\sum_{i=1}^{n} X_i\right] \;\xrightarrow{\,n \to \infty\,}\; \mathrm{Cov}(X_i, X_j) = V[E[X_i \mid \Theta]]. \]
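Under the Poisson–Gamma assumptions of the numerical example that follows (λ ~ Gamma with shape a = 4 and rate b = 12), Result 2 can be evaluated directly; a small sketch:

```python
from math import sqrt

a, b = 4.0, 12.0              # lambda ~ Gamma(shape a, rate b), E[lambda] = 1/3
v_param = a / b**2            # V[E[X|lambda]] = V[lambda]
v_total = a / b + v_param     # V[X] = E[V[X|lambda]] + V[E[X|lambda]]

def sd_per_policy(n, common_lambda):
    # Result 2: V[sum X_i] / n^2 = V[X_i]/n + (1 - 1/n) * Cov(X_i, X_j),
    # where Cov(X_i, X_j) = V[E[X_i|Theta]] when the parameter is shared
    cov = v_param if common_lambda else 0.0
    return sqrt(v_total / n + (1 - 1 / n) * cov)

for n in (1, 10, 100, 1000, 10000):
    print(n, round(sd_per_policy(n, True), 4), round(sd_per_policy(n, False), 4))
# the shared-lambda column tends to sqrt(V[lambda]) = 1/6 ~ 0.1667, not to 0
```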
The standard deviation per policy therefore does not go to 0 as the number of policies goes to infinity: the parameter risk cannot be diversified.

Portfolio of Policies – Numerical Example
E.g. the frequency of motor accidents follows Poisson(λ) with λ ~ Gamma(a = 4, b = 12).

Standard deviation per policy:

number of policies | same λ across policies | different λ for each policy
1        | 0.6009 | 0.6009
10       | 0.2472 | 0.1900
100      | 0.1764 | 0.0601
1,000    | 0.1677 | 0.0190
10,000   | 0.1668 | 0.0060
infinite | 0.1667 | 0.0000

Application 1: Risk Transfer
- Probability of reinsurer's negative result: no specified threshold, sometimes "10% probability of a 10% loss"; loss defined as (PV of losses − PV of premiums)/PV of premiums; includes profit commissions and additional premiums; only cash-flows between reinsurer and reinsured are considered.
- Expected Reinsurer's Deficit (ERD): the conditional expectation of the reinsurer's loss is considered; no standard threshold, usually around 3%.
- Wang transformation
- Van Slyke–Kreps approach
- Other

Application 1: Risk Transfer
Report to the Casualty Actuarial Task Force of the NAIC:
- Common method: about 37% of the respondents in their survey take parameter uncertainty into account when calculating risk transfer.
- The report emphasises the importance of parameter uncertainty in risk transfer calculations: simulate the parameters, then simulate losses using the simulated parameters; for each simulation the risk transfer measure is calculated, and summary statistics are derived. The Casualty Actuarial Task Force uses this implicitly in one of the examples of its report.
- The assessment of risk transfer is made on an individual contract basis. This means that the parameter risk cannot be diversified.
- Whatever risk transfer measure is used, it is a random variable whose value depends on the unknown parameters. E.g. the ERD is not a single number, as it is commonly perceived to be, but a random variable with a distribution which depends on the distribution of the parameters.
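The two simulation methods described earlier can be contrasted on the exponential–gamma example (parameter values as in the percentile comparison; a sketch, not production code):

```python
import random
from math import log
random.seed(1)

a, b, p, sims = 1.1, 1.1, 0.95, 200000   # lambda ~ Gamma(shape a, rate b)

# Method 1: simulate a parameter, then a loss; one empirical percentile.
# Note random.gammavariate takes shape and SCALE, hence 1/b for rate b.
losses = sorted(random.expovariate(random.gammavariate(a, 1 / b))
                for _ in range(sims))
q1 = losses[int(p * sims)]               # close to b*((1-p)**(-1/a) - 1) ~ 15.66

# Method 2: simulate a parameter, then the exact percentile given it;
# the output is a whole distribution of 95th percentiles, not one number.
qs = sorted(-log(1 - p) / random.gammavariate(a, 1 / b) for _ in range(sims))
print(q1)
print(qs[sims // 20], qs[sims // 2], qs[-sims // 20])  # 5th, 50th, 95th points
```

Method 1 collapses parameter and process risk into one empirical distribution, the heterogeneous-portfolio view; Method 2 keeps the percentile itself random, the single-risk view.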
Application 1: Risk Transfer – Example
Contract term: 3 years
Annual limit: 15
Term limit: 30
Annual deductible: 30
Premium: 5
Reinsurer's margin: 30% of premium
Additional premium: 25% of annual losses
Profit commission: maximum of (premium − margin − losses, 0)

Losses to the contract are assumed to follow a Pareto(c = 10, a = 1.2) distribution. The parameter is assumed to have been estimated from 10 losses. For simplicity, the discount rate is assumed to be 0%. If the simulations are done in the common way, the ERD is approximately 3.5%.

The process risk can be diversified over time; the parameter risk in this case cannot be diversified. The ERD is not a single number but a random variable.
[Figure: cumulative distribution of the ERD, ranging from 0% to about 16%]
There is a non-negligible probability that the ERD takes values significantly higher than the 3.5% estimated with the commonly used method of calculation.

One could use the average of the ERD distribution as a measure of risk transfer, which in this case is 3.7%. However, a summary number does not necessarily present the amount of risk adequately. It would be similar to trying to base decisions about the value of an excess of loss reinsurance programme on the expected loss rather than on the whole distribution of the loss. I have seen reinsurance transactions with a genuine risk transfer motivation and a significant loss uncertainty, with parameter uncertainty being one of the main sources of that uncertainty, such that several reinsurers did not accept the risk. However, the usual actuarial simulation methods, averaging over the space of parameters, estimated risk transfer measures just above the usual benchmarks. The averaging over the space of parameter values failed to show one of the main drivers of risk and one of the primary motivations for the transaction.
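The "ERD as a random variable" point can be sketched in code. The contract here is deliberately simplified to a one-year cover with a premium of 5 and a limit of 15 (the profit commission, additional premium and multi-year features of the example are ignored), and the Gamma posterior for the Pareto parameter is an illustrative assumption standing in for "estimated from 10 losses":

```python
import random
random.seed(7)

c, a_hat, n_obs = 10.0, 1.2, 10       # Pareto threshold, point estimate, sample size
premium, limit = 5.0, 15.0            # simplified one-year contract (assumption)

def erd(a, sims=10000):
    # Expected Reinsurer's Deficit = E[max(loss - premium, 0)] / premium,
    # with one Pareto(c, a) ground-up loss per year, capped at the limit
    total = 0.0
    for _ in range(sims):
        x = c * random.random() ** (-1.0 / a)   # inverse-CDF Pareto draw
        total += max(min(x - c, limit) - premium, 0.0)
    return total / sims / premium

# Common method: a single ERD at the point estimate ...
print(erd(a_hat))

# ... v. the distribution of the ERD over a Gamma posterior for a
# (shape n_obs, mean a_hat; an assumed posterior, for illustration)
erds = sorted(erd(random.gammavariate(n_obs, a_hat / n_obs)) for _ in range(200))
print(erds[100], erds[190])   # median and ~95th point of the ERD distribution
```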
Likelihood Function and Estimation

Bias in MLE Estimates – Example: Pareto
The MLE is asymptotically unbiased, but unfortunately our world is not asymptotic. Consider
\[ F(x) = 1 - \left( \frac{c}{x} \right)^a, \qquad 0 < c < x, \; a > 0. \]
In practice the distribution is usually capped. The lower the value of the parameter a, the "fatter" the tail. Given data \( x_1, \ldots, x_n \), the MLE estimate is
\[ \hat{a} = \frac{n}{\sum_{i=1}^{n} \ln(x_i / c)}. \]

If the true parameter is \( a_0 \), it can be shown that the distribution of the MLE estimator has p.d.f.
\[ f(a) = \frac{(n a_0)^n \, e^{-n a_0 / a}}{a^{n+1} \, \Gamma(n)}. \]
It can also be shown that
\[ E[\hat{a}] = \frac{n a_0}{n - 1}, \]
which on average is higher than \( a_0 \) (a lighter tail). It is also more likely that the estimate is higher than \( a_0 \).

Assume the true parameter is 1.2. Both the median and the mean of the estimator are higher than 1.2.
[Figure: probability density function of the estimator for samples of 10, 20 and 100 points]

A "Paradox"
Consider the p × 100-th percentile. If \( \hat{a} \) is our estimate of the parameter, then the p × 100-th percentile is
\[ Y = c \, (1-p)^{-1/\hat{a}}. \]
It can be shown that, under certain conditions,
\[ E[Y] = c \left( \frac{n a_0}{n a_0 + \ln(1-p)} \right)^n. \]
"Paradox": although on average our estimate of the parameter is higher (lighter tail) than its true value, on average the percentiles are also higher than their true values. However, it is more likely to underestimate the percentile (as it is more likely to overestimate the value of the parameter).

[Figure: distribution of the 95th percentile, assuming c = 10, a = 1.2 and n = 10, with the true and the average 95th percentile marked]
An unbiased parameter estimator does not necessarily imply an unbiased percentile estimator.

Likelihood Function
Same example as before, with a Pareto sample. The likelihood function is
\[ L(a) = a^n \prod_{i=1}^{n} x_i^{-(a+1)} \propto a^n e^{-a \sum_{i=1}^{n} \ln(x_i)}, \]
which has the form of a \( \mathrm{Gamma}\!\left(n+1, \sum_{i=1}^{n} \ln(x_i)\right) \) density:
\[ f(a) = \frac{\left( \sum_{i=1}^{n} \ln(x_i) \right)^{n+1} a^n \, e^{-a \sum_{i=1}^{n} \ln(x_i)}}{\Gamma(n+1)}. \]
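The bias result above, \( E[\hat a] = n a_0/(n-1) \), and the tendency to overestimate \( a_0 \) can both be checked by simulation (true \( a_0 = 1.2 \), n = 10, c = 10, as in the example):

```python
import random
from math import log
random.seed(42)

c, a0, n, sims = 10.0, 1.2, 10, 100000

def pareto_mle(sample):
    # a_hat = n / sum(ln(x_i / c))
    return len(sample) / sum(log(x / c) for x in sample)

# inverse-CDF Pareto draws: x = c * U**(-1/a0)
estimates = [pareto_mle([c * random.random() ** (-1.0 / a0) for _ in range(n)])
             for _ in range(sims)]

mean_hat = sum(estimates) / sims
print(mean_hat)                               # close to n*a0/(n-1) = 1.3333
print(sum(e > a0 for e in estimates) / sims)  # above 50%: overestimation is more likely
```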
Likelihood Function – Bootstrap
Compare: if the true parameter is \( a_0 \), it was shown that the distribution of the MLE estimator has p.d.f.
\[ f(a) = \frac{(n a_0)^n \, e^{-n a_0 / a}}{a^{n+1} \, \Gamma(n)}, \]
while the likelihood function is
\[ f(a) = \frac{\left( \sum_{i=1}^{n} \ln(x_i) \right)^{n+1} a^n \, e^{-a \sum_{i=1}^{n} \ln(x_i)}}{\Gamma(n+1)}. \]
The two are different: the first is an inverse gamma distribution, the second a gamma distribution. In the first case the parameter is given; in the second, the sample is given.

Common bootstrap practice:
- We estimate the parameter.
- We assume it is the correct one and simulate data samples based on the estimated parameter.
- For each sample we estimate the parameter.
- We construct the distribution of the parameter.
This is not strictly correct, because we assume that the parameter is given, while what is actually given is the sample.
[Figure: estimator distribution (first case) v. likelihood function (second case) for the Pareto parameter a]

Numerical Example
Assume Pareto(c = 1, a = 1.2) and the simulated data
1.215, 1.383, 2.203, 1.360, 1.171, 1.304, 5.511, 1.604, 4.409, 1.725.
The likelihood estimate is a = 1.6. (If the true a = 1.2, there is a probability of around 20% that the likelihood estimate will be greater than 1.6.)
The true 95th percentile is \( (1 - 0.95)^{-1/1.2} = 12.139 \).
The 95th percentile based on the maximum likelihood estimate is \( (1 - 0.95)^{-1/1.6} = 6.503 \). The expected value of the 95th percentile when allowing for parameter uncertainty is
\[ E_{0.95}[Y] = E\!\left[ (1 - 0.95)^{-1/a} \right] \approx 8.851. \]
Allowing for parameter uncertainty does not necessarily result in percentile estimates which are close to the true ones.

Calculation and Estimation Methods of the Parameter Distribution
The parameter distribution can be obtained by:
- the exact distribution, when it has a "nice" known analytical form
- the normal approximation
- Monte Carlo methods

[Figure: asymptotic v. actual distribution of the Weibull parameter estimators, based on 11, 20 and 50 data points]
- The actual distribution does not look like its normal approximation when data are sparse.
- The normal distribution can produce negative values for parameters which are positive by definition. Ignore the negative values? Convenient, but not always a good approximation!

Calculation of Parameter Distribution
The Pareto examples discussed earlier led to nice analytical formulae for the likelihood, which had a gamma form. However, this is not generally the case. For example, the Weibull distribution
\[ f(x) = c \, \gamma \, x^{\gamma - 1} e^{-c x^{\gamma}}, \qquad x, c, \gamma > 0, \]
has likelihood function
\[ L(c, \gamma) = c^n \gamma^n e^{-c \sum_{i=1}^{n} x_i^{\gamma}} \prod_{i=1}^{n} x_i^{\gamma - 1} = c^n \gamma^n \, e^{(\gamma - 1) \sum_{i=1}^{n} \ln(x_i)} \, e^{-c \sum_{i=1}^{n} x_i^{\gamma}}, \]
which does not have a recognisable standard form.

Calculation of Parameter Distribution – Monte Carlo Methods
Monte Carlo statistical methods, such as the Gibbs sampler, can be of great help in these situations. These methods have found applications in Bayesian statistics, which is briefly mentioned later, but they can also be used in the classical case. The details of the Monte Carlo methods are not discussed here; they can be found in Robert, C. P.
& Casella, G. (2004). A shorter description, which is also more relevant to actuarial work, can be found in Scollnik, D. P. M. (2000). The exact method used depends on the problem; here the generic case is shown.

Monte Carlo Methods – Gibbs Sampling
Suppose we want to simulate from the joint density \( f(\theta_1, \theta_2, \ldots, \theta_k) \). Gibbs sampling proceeds as follows:
- Initial values \( \theta_1^{(0)}, \theta_2^{(0)}, \ldots, \theta_k^{(0)} \) are arbitrarily chosen.
- The \( \theta \)s are then simulated from the conditional distributions:
\[ \theta_1^{(1)} \sim f(\theta_1 \mid \theta_2^{(0)}, \ldots, \theta_k^{(0)}) \]
\[ \theta_2^{(1)} \sim f(\theta_2 \mid \theta_1^{(1)}, \theta_3^{(0)}, \ldots, \theta_k^{(0)}) \]
\[ \vdots \]
\[ \theta_k^{(1)} \sim f(\theta_k \mid \theta_1^{(1)}, \ldots, \theta_{k-1}^{(1)}) \]
- This is repeated many times; the first few simulated values are usually discarded.
If the conditional distributions cannot be recognised, a generic sampling algorithm, usually based on variations of the Metropolis algorithm, is used.

Example: for the Weibull distribution we have
\[ f(c \mid \gamma) \propto c^n e^{-c \sum_{i=1}^{n} x_i^{\gamma}}, \qquad f(\gamma \mid c) \propto \gamma^n \, e^{(\gamma - 1) \sum_{i=1}^{n} \ln(x_i)} \, e^{-c \sum_{i=1}^{n} x_i^{\gamma}}. \]
\( f(c \mid \gamma) \) has the form of a \( \mathrm{Gamma}\!\left(n+1, \sum_{i=1}^{n} x_i^{\gamma}\right) \) distribution, but \( f(\gamma \mid c) \) does not have a standard, easily recognisable form. Random numbers from \( f(\gamma \mid c) \) can be simulated using some version of the Metropolis algorithm.

The Metropolis Algorithm
Used to simulate values (under some conditions) from a distribution which does not have a standard recognisable form. There are many variations of the algorithm; the appropriate variation depends on the nature of the problem and the form of the distribution. One simple version:
- Generate \( U_1 \sim \mathrm{Uniform}(0, A) \), where A is sufficiently large.
- Generate \( U_2 \sim \mathrm{Uniform}(0, 1) \).
- If \( U_2 < \min\!\left( \frac{f(U_1)}{f(a_{t-1})}, 1 \right) \), then \( a_t = U_1 \); else \( a_t = a_{t-1} \).

Temporal Parameter Uncertainty
- Changes may be gradual or sudden.
- The effect of a change may differ between risk covers, and may affect severity, frequency or both.
- Changes may be estimated using external information or from the data.
- A good measure of exposure may provide a better indication of risk changes than what can be extracted from sparse data.
- Some special features may not be described sufficiently by any external information; the loss data may then be able to reveal some trends.
- When data are used, some non-parametric methods may be useful in extracting trends (see later).
- It is often the case that trends are hidden in randomness and cannot be identified or measured easily.

Temporal Parameter Uncertainty – Fooled by Randomness
Assume the frequency follows Poisson(λ), and the Poisson parameter changes in a compound way by a factor k = 1 + g each year. Experiment: investigate the behaviour of the estimators. The log-likelihood function is
\[ l(\lambda, k) = -\lambda \left( 1 + k + \cdots + k^{n-1} \right) + \left( \sum_{i=1}^{n} n_i \right) \ln(\lambda) + \left( \sum_{i=1}^{n} (i-1)\, n_i \right) \ln(k). \]
For λ = 3, g = 10% and a number of years n = 10, based on 5,000 simulations:
[Figure: scatter plot of the estimated growth g against the estimated parameter λ]
Random fluctuations in the simulated number of losses can be interpreted as growth trends.

Effect of "Unusually" Large Losses
"Unusually" Large Losses and Rules of Thumb
Large losses are often treated separately. But what is a large loss? Rules of thumb are used, e.g. the largest loss is 3 times larger than the second largest. Assume a Pareto with a = 1.2 and follow the above rule of thumb.
[Figure: distributions of the Pareto parameter and of the 95th percentile, using all data v. excluding "unusual" losses]
There is a significant difference in the distribution of both the parameter and the 95th percentile if we exclude an "unusually" high loss.
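The rule of thumb can be stress-tested by simulation: for a Pareto tail with a = 1.2, the ratio of the largest to the second-largest loss exceeds 3 surprisingly often (a quick sketch; the threshold c cancels in the ratio, so c = 1 is used):

```python
import random
random.seed(0)

a, n, sims = 1.2, 10, 100000
count = 0
for _ in range(sims):
    # inverse-CDF draws from Pareto(c=1, a)
    xs = sorted(random.random() ** (-1.0 / a) for _ in range(n))
    if xs[-1] > 3.0 * xs[-2]:        # largest more than 3x the second largest
        count += 1
print(count / sims)   # close to (1/3)**1.2 ~ 0.2675, independent of n
```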
"Unusually" Large Losses and Rules of Thumb
If \( X_{k,n} \) is the kth upper order statistic, i.e. \( X_{1,n} \) is the maximum of a sample of size n and \( X_{n,n} \) is the minimum of a sample of size n, then it can be shown that
\[ \Pr[X_{k,n} > y \mid X_{k+1,n} = x_{k+1}] = \left( \frac{1 - F(y)}{1 - F(x_{k+1})} \right)^{k}, \qquad y > x_{k+1}. \]
Applying this to the Pareto distribution,
\[ \Pr[X_{1,n} > \lambda \cdot x_{2,n} \mid X_{2,n} = x_{2,n}] = \frac{\left( c / (\lambda x_{2,n}) \right)^a}{\left( c / x_{2,n} \right)^a} = \left( \frac{1}{\lambda} \right)^a, \qquad \lambda > 1. \]
For a = 1.2 and λ = 3, there is a probability of 26.75% that the largest loss will be more than 3 times higher than the second largest loss. A large loss should not be considered "unusual" without careful examination.

Model Uncertainty
- In the previous slides the model (distribution) was assumed known and the parameter was estimated. The model (distribution) can also be estimated from the data.
- The likelihood function can again be used. Different distributions have different numbers of parameters, and several criteria have been proposed for comparing models. An example criterion is the Schwarz Bayesian Criterion (SBC),
\[ \ln(\text{likelihood}) - \frac{r}{2} \ln\!\left( \frac{n}{2\pi} \right), \]
where r is the number of parameters and n is the sample size.

Model Uncertainty: Experiment
- Assume that the true distribution is Pareto(c = 10, a = 1.2).
- Random samples are generated and the following distributions are fitted: Pareto, Exponential, Weibull, Gamma, LogNormal and LogGamma.
- This is repeated 10,000 times.
[Figure: distributions of the 95th percentile when only the Pareto is fitted v. when a selection of distributions is used, with the true 95th percentile marked]
For a sample size of 10, the true distribution (Pareto) is chosen only 27.1% of the time. The LogNormal distribution is chosen more often than the true underlying model. When a Pareto model is
assumed, the 95th percentile is underestimated about 55% of the time; when a selection of distributions is used, the 95th percentile is underestimated about 65% of the time. In the social sciences, the use of particular distributions is often justified by data and general reasoning; e.g. it has been argued that the distribution of the size of cities, the size of earthquakes or the income of people follows a Pareto-type distribution. There have not been similar studies for the loss distributions in insurance. Results are subject to simulation error.

Application 2: Frequency of Aviation Losses
The data are the numbers of losses which include at least 1 fatality, from a subset of commercial airlines. A non-parametric method is helpful in exploring possible temporal trends in the data: the number of losses in year t is assumed to follow a Poisson(\( \mu_t \)) distribution with
\[ \ln(\mu_t) = s(t) + \ln(E_t), \]
where \( E_t \) is the exposure for year t and s(t) is a smoothing spline.
[Figure: fitted smoothing splines for the temporal trend]
There is a continuous reduction in the expected number of losses; an ANOVA analysis is inconclusive as to whether the trend accelerated after 2001 (WTC).
[Figure: number of losses by year, 1995-2005, with a smoothing spline (red) and a straight line (blue)]

Now assume
\[ \ln(\mu_t) = a + b t + \ln(E_t), \qquad \text{i.e.} \qquad \mu_t = e^{a + bt} E_t = \lambda \cdot k^{t-1} \cdot E_t, \]
where k can be thought of as k = 1 + g, with g the annual rate of change in the Poisson parameter. λ and g are to be estimated.
The likelihood function is
\[ l(\lambda, k) = \prod_{t=1}^{n} e^{-\lambda E_t k^{t-1}} \, \frac{\left( \lambda E_t k^{t-1} \right)^{n_t}}{n_t!} \;\propto\; e^{-\lambda \sum_{t=1}^{n} E_t k^{t-1}} \cdot \lambda^{\sum_{t=1}^{n} n_t} \cdot \prod_{t=1}^{n} k^{n_t (t-1)}. \]
For the Gibbs sampler,
\[ l(\lambda \mid k) \propto \mathrm{Gamma}\!\left( 1 + \sum_{t=1}^{n} n_t, \; \sum_{t=1}^{n} E_t k^{t-1} \right), \]
while \( l(k \mid \lambda) \) is not in the form of a standard distribution, so the Metropolis algorithm can be used.

[Figure: cumulative distribution of the Poisson parameter]

Application 2: Frequency of Aviation Losses
- The growth parameter is negatively correlated with the initial Poisson parameter.
- The estimate of the growth parameter is -11.7%, with 5th percentile -15.8% and 95th percentile -7.1%.
- The estimate of the number of losses is 1.87 per year, with 5th and 95th percentiles of 0.8 and 3.3 respectively.
- The choice of the parameter needs to take into account other qualitative information related to the airline industry.

Observations and Comments (1)
- Whether parameter risk can be diversified or not depends on the source of the parameter uncertainty.
- The method of estimation/simulation also depends on the source of parameter uncertainty and on whether it can be diversified or not. Using the wrong method could result in underestimation and poor understanding of the risk.
- The correct distribution of the parameter estimator, given the sample, is given by the likelihood function. The distribution of the estimator given its true value can be used for investigations and for the validation of methods.
- Allowing for parameter uncertainty does not necessarily result in percentile estimates which are close to the true ones.

Observations and Comments (2)
- The normal approximation to the parameter distribution is convenient but not always good; Monte Carlo methods can be used instead.
- Random fluctuations in the simulated number of losses can be interpreted as growth trends.
- A large loss should not be considered "unusual" without careful examination of the properties of its
distribution.

The views expressed in this presentation are my personal views; in particular, they should not necessarily be regarded as being those of my employer.

References
AMERICAN ACADEMY OF ACTUARIES, COMMITTEE ON PROPERTY AND LIABILITY FINANCIAL REPORTING (2005) Risk Transfer in P&C Reinsurance: Report to the Casualty Actuarial Task Force of the NAIC.
BINMORE, K. (2009) Rational Decisions, Princeton University Press.
BUHLMANN, H. & GISLER, A. (2005) A Course in Credibility Theory and its Applications, Springer.
CAIRNS, A. J. G. (2000) A Discussion of Parameter and Model Uncertainty in Insurance, Insurance: Mathematics and Economics 27, pp. 313-330.
CASELLA, G. & GEORGE, E. I. (1992) Explaining the Gibbs Sampler, The American Statistician, Vol. 46, No. 3.
EMBRECHTS, P., KLUPPELBERG, C. & MIKOSCH, T. (1997) Modelling Extremal Events, Springer.
FASB (1992) FAS 113 Accounting and Reporting for Reinsurance of Short-Duration and Long-Duration Contracts.
GAZZANIGA, M. (2008) Human: The Science Behind What Makes Us Unique, Ecco.
GILLIES, D. (2000) Philosophical Theories of Probability, Routledge.
HALL, P. & TAJVIDI, N. (2000) Nonparametric Analysis of Temporal Trend when Fitting Parametric Models to Extreme-Value Data, Statistical Science, Vol. 15, No. 2, pp. 153-167.
KLUGMAN, S., PANJER, H. & WILLMOT, G. (1998) Loss Models, Wiley.
MAJOR, J. (1999) Taking Uncertainty into Account: Bias Issues Arising from Parameter Uncertainty in Risk Models, Casualty Actuarial Society Forum, Summer 1999.
MATA, A. (2000) Parameter Uncertainty for Extreme Value Distributions, GIRO Convention.
MILLNS, R. & WESTON, R. (2005) Parameter Uncertainty and Capital Modelling, GIRO Convention.
PRESS, W. H., FLANNERY, B. P., TEUKOLSKY, S. A. & VETTERLING, W. T. (1994) Numerical Recipes in Pascal, Cambridge University Press.
ROBERT, C. P. & CASELLA, G. (2004) Monte Carlo Statistical Methods, Springer.
ROOTZEN, H. & TAJVIDI, N.
(1995) Extreme Value Statistics and Wind Storm Losses: A Case Study, Scandinavian Actuarial Journal.
SCHMOCK, U. (1999) Estimating the Value of the WINCAT Coupons of the Winterthur Insurance Convertible Bond: A Study of the Model Risk, ASTIN Bulletin, Vol. 29, No. 1, pp. 101-163.
SCOLLNIK, D. P. M. (2000) Actuarial Modelling with MCMC and BUGS, North American Actuarial Journal, Vol. 5, No. 2, pp. 96-124.
SORNETTE, D. (2000) Critical Phenomena in Natural Sciences, Springer.
TIERNEY, L. (1994) Markov Chains for Exploring Posterior Distributions, The Annals of Statistics, Vol. 22, No. 4, pp. 1701-1728.
WACEK, M. (2005) Parameter Uncertainty in Loss Ratio Distributions and its Implications, Casualty Actuarial Society Forum, Fall 2005.