1 Conditional Probability

Let $A$ and $B$ be two events such that $P(B) > 0$. Then $P(A|B) = P(A \cap B)/P(B)$.

Bayes' Theorem: Let $A$ and $B$ be two events such that $P(B) > 0$. Then $P(A|B) = P(B|A) \cdot P(A)/P(B)$.

Theorem of total probability: Let $A_1, A_2, \ldots$ be a countable collection of mutually exclusive and exhaustive events, so that $A_i \cap A_j = \emptyset$ for $i \neq j$ and $\cup_{i=1}^{\infty} A_i = \Omega$. Then
\[ P(B) = \sum_{i=1}^{\infty} P(B|A_i) \cdot P(A_i). \]

2 Conditional Distributions

The conditional probability function of $X$ given that $Y = y$ is $f_{X|Y}(x|y) = f_{X,Y}(x,y)/f_Y(y)$. If $X$ and $Y$ are independent, then $f_{X,Y}(x,y) = f_X(x) \cdot f_Y(y)$, which implies $f_{X|Y}(x|y) = f_X(x)$.

Rewrite $f_{X,Y}(x,y) = f_{X|Y}(x|y) \cdot f_Y(y)$. Then
\[ f_X(x) = \int f_{X,Y}(x,y)\,dy = \int f_{X|Y}(x|y) \cdot f_Y(y)\,dy. \]
Application: $f_X(x) = \int f_{X|\Theta}(x|\theta) \cdot f_\Theta(\theta)\,d\theta$; $F_X(x) = \int F_{X|\Theta}(x|\theta) \cdot f_\Theta(\theta)\,d\theta$.

Note that $f_{X|Y}(x|y) \cdot f_Y(y) = f_{X,Y}(x,y) = f_{Y|X}(y|x) \cdot f_X(x)$, implying that
\[ f_{X|Y}(x|y) = \frac{f_{Y|X}(y|x) \cdot f_X(x)}{f_Y(y)} \propto f_{Y|X}(y|x) \cdot f_X(x). \]

3 Conditional Expectation

Let $X$ be a discrete random variable such that $x_1, x_2, \ldots$ are the only values that $X$ takes on with positive probability. Define the conditional expectation of $X$ given that event $A$ has occurred, denoted by $E[X|A]$, as
\[ E[X|A] = \sum_{i=1}^{\infty} x_i \cdot P[X = x_i \,|\, A]. \qquad \text{Continuous case: } E[X|Y=y] = \int x \cdot f_{X|Y}(x|y)\,dx. \]
Let $A_1, A_2, \ldots$ be a countable collection of mutually exclusive and exhaustive events, and let $X$ be a discrete random variable for which $E[X]$ exists. Then
\[ E[X] = \sum_{i=1}^{\infty} E[X|A_i] \cdot P[A_i]. \qquad \text{Continuous case: } E[X] = \int E[X|Y=y] \cdot f_Y(y)\,dy, \]
that is, $E_Y[E_X(X|Y)] = E[X]$. More generally,
\[ E[h(X,Y)|Y=y] = \int h(x,y) \cdot f_{X|Y}(x|y)\,dx, \qquad E_Y\{E_X[h(X,Y)|Y]\} = E[h(X,Y)]. \]
\[ Var[X|Y] = E\{[X - E(X|Y)]^2 \,|\, Y\} = E[X^2|Y] - [E(X|Y)]^2, \]
\[ E[Var[X|Y]] = E[E(X^2|Y)] - E\{[E(X|Y)]^2\} = E[X^2] - E\{[E(X|Y)]^2\}, \]
\[ Var[E(X|Y)] = E\{[E(X|Y)]^2\} - \{E[E(X|Y)]\}^2 = E\{[E(X|Y)]^2\} - [E(X)]^2, \]
\[ Var[X] = E[Var(X|Y)] + Var[E(X|Y)]. \]

4 Nonparametric Unbiased Estimators

If $X_1, X_2, \ldots, X_n$ are independent but not necessarily identically distributed with common mean $\mu = E[X_j]$ and common variance $\sigma^2 = Var[X_j]$, then
\[ \hat{\mu} = \bar{X} = \frac{1}{n}\sum_{j=1}^n X_j \ \text{is an unbiased estimator of } \mu, \qquad \hat{\sigma}^2 = \frac{1}{n-1}\sum_{j=1}^n (X_j - \bar{X})^2 \ \text{is an unbiased estimator of } \sigma^2. \]

5 Full and Partial Credibility

Each of the following credibility factors $Z$ is subject to $Z \le 1$; if the formula gives $Z \ge 1$, then full credibility is assigned. The standards for no full credibility are expressed in terms of $\lambda_0 = (y_p/r)^2$.

(1) (Frequency case) In terms of the number of policies (use the observed number of policies for $n$):
\[ n \le n_F = \lambda_0 \Big(\frac{\sigma_X}{\theta_X}\Big)^2 \ \Rightarrow\ Z = \Big(\frac{n}{n_F}\Big)^{1/2} = \Big(\frac{n}{\lambda_0}\Big)^{1/2}\frac{\hat\theta_X}{\hat\sigma_X} \stackrel{X=N}{=} \Big(\frac{n}{\lambda_0}\Big)^{1/2}\frac{\hat\theta_N}{\hat\sigma_N} \stackrel{Poisson}{=} \Big(\frac{n\hat\lambda}{\lambda_0}\Big)^{1/2}; \]

(2) (Frequency case) In terms of the expected number of claims (use the observed number of claims $\sum_{j=1}^n N_j = n\hat\lambda$ for $n\lambda = n\,E[N_j] = n\,\theta_N$):
\[ n\lambda \le \lambda_0\,\lambda\Big(\frac{\sigma_X}{\theta_X}\Big)^2 \stackrel{X=N}{=} \lambda_0\,\lambda\Big(\frac{\sigma_N}{\theta_N}\Big)^2 = \frac{\lambda_0}{\lambda}\,\sigma_N^2 \stackrel{Poisson}{=} \lambda_0 \ \Rightarrow\ Z = \bigg\{\frac{n\hat\lambda}{(\lambda_0/\hat\lambda)\,\hat\sigma_N^2}\bigg\}^{1/2} \stackrel{Poisson}{=} \Big(\frac{n\hat\lambda}{\lambda_0}\Big)^{1/2}; \]

(3) (Severity case, based on a compound Poisson assumption) In terms of the number of policies (use the observed number of policies for $n$):
\[ n \le \lambda_0\Big(\frac{\sigma_X}{\theta_X}\Big)^2 = \lambda_0\,\frac{\theta_N\sigma_Y^2 + \sigma_N^2\theta_Y^2}{\theta_N^2\,\theta_Y^2} \stackrel{Poisson}{=} \frac{\lambda_0}{\lambda}\Big[1 + \Big(\frac{\sigma_Y}{\theta_Y}\Big)^2\Big] \ \Rightarrow\ Z = \bigg\{\frac{n}{(\lambda_0/\hat\lambda)\,[1 + (\hat\sigma_Y/\hat\theta_Y)^2]}\bigg\}^{1/2}; \]

(4) (Severity case, compound Poisson) In terms of the expected number of claims (use $\sum_{j=1}^n N_j = n\hat\lambda$ for $n\lambda = n\,E[N_j]$):
\[ n\,E[N_j] = n\lambda \le \lambda_0\Big[1 + \Big(\frac{\sigma_Y}{\theta_Y}\Big)^2\Big] \ \Rightarrow\ Z = \bigg\{\frac{n\hat\lambda}{\lambda_0\,[1 + (\hat\sigma_Y/\hat\theta_Y)^2]}\bigg\}^{1/2}; \]

(5) (Severity case, compound Poisson) In terms of the expected total dollars of claims (use the observed total dollars of claims $\sum_{i=1}^n\sum_{j=1}^{N_i} Y_{i,j} = \sum_{i=1}^n X_i = n\hat\theta_X = n\hat\lambda\hat\theta_Y = (\sum_j N_j)\hat\theta_Y$ for $n\,E[X_j] = n\,\lambda\,\theta_Y$):
\[ n\,E[X_j] = n\lambda\theta_Y \le \lambda_0\Big[\theta_Y + \frac{\sigma_Y^2}{\theta_Y}\Big] \ \Rightarrow\ Z = \bigg\{\frac{n\hat\lambda\hat\theta_Y}{\lambda_0\,[\hat\theta_Y + (\hat\sigma_Y^2/\hat\theta_Y)]}\bigg\}^{1/2}. \]

Here
\[ \hat\theta_X = \text{Pure Premium} = \frac{\sum_{i=1}^n X_i}{n} = \frac{\sum_{j=1}^n N_j}{n}\cdot\frac{\sum_{i=1}^n X_i}{\sum_{j=1}^n N_j} = \frac{\sum_{j=1}^n N_j}{n}\cdot\frac{\sum_{i=1}^n\sum_{j=1}^{N_i} Y_{i,j}}{\sum_{j=1}^n N_j} = \hat\theta_N\cdot\hat\theta_Y, \]
that is,
\[ \frac{\text{Losses}}{\text{Exposures}} = \frac{\#\text{ of Claims}}{\text{Exposures}}\cdot\frac{\text{Losses}}{\#\text{ of Claims}} = (\text{Frequency})\cdot(\text{Severity}). \]
6 Predictive and Posterior Distributions

Assume we have observed $\tilde{X} = \tilde{x}$, where $\tilde{X} = (X_1, \ldots, X_n)$ and $\tilde{x} = (x_1, \ldots, x_n)$, and want to set a rate to cover $X_{n+1}$. Let $\theta$ be the associated risk parameter ($\theta$ is unknown and comes from a r.v. $\Theta$), and let $X_j$ have conditional probability density function $f_{X_j|\Theta}(x_j|\theta)$, $j = 1, \ldots, n, n+1$. We are interested in (1) $f_{X_{n+1}|\tilde{X}}(x_{n+1}|\tilde{x})$, the predictive probability density; and (2) $\pi_{\Theta|\tilde{X}}(\theta|\tilde{x})$, the posterior probability density.

The joint probability density of $\tilde{X}$ and $\Theta$ is, by conditional independence,
\[ f_{\tilde{X},\Theta}(\tilde{x},\theta) = f(x_1,\ldots,x_n|\theta)\,\pi(\theta) = \Big[\prod_{j=1}^n f_{X_j|\Theta}(x_j|\theta)\Big]\cdot\pi(\theta) \]
(where $\pi(\theta)$ is called the prior probability density), and the probability density of $\tilde{X}$ is
\[ f_{\tilde{X}}(\tilde{x}) = \int \Big[\prod_{j=1}^n f_{X_j|\Theta}(x_j|\theta)\Big]\cdot\pi(\theta)\,d\theta, \]
implying that the posterior probability density is
\[ \pi_{\Theta|\tilde{X}}(\theta|\tilde{x}) = \frac{f_{\tilde{X},\Theta}(\tilde{x},\theta)}{f_{\tilde{X}}(\tilde{x})} = \frac{\big[\prod_{j=1}^n f_{X_j|\Theta}(x_j|\theta)\big]\cdot\pi(\theta)}{f_{\tilde{X}}(\tilde{x})}. \]
The predictive probability density is
\[ f_{X_{n+1}|\tilde{X}}(x_{n+1}|\tilde{x}) = \frac{f_{\tilde{X},X_{n+1}}(\tilde{x},x_{n+1})}{f_{\tilde{X}}(\tilde{x})} = \int f_{X_{n+1}|\Theta}(x_{n+1}|\theta)\cdot\pi_{\Theta|\tilde{X}}(\theta|\tilde{x})\,d\theta. \]
If $\theta$ is known, the hypothetical mean (the premium charged, dependent on $\theta$) is
\[ \mu_{n+1}(\theta) = E[X_{n+1}|\Theta=\theta] = \int x_{n+1}\cdot f_{X_{n+1}|\Theta}(x_{n+1}|\theta)\,dx_{n+1}; \]
if $\theta$ is unknown, the mean of the hypothetical means, or the pure premium (the premium charged, independent of $\theta$), is
\[ \mu_{n+1} = E[X_{n+1}] = E[E[X_{n+1}|\Theta]] = E[\mu_{n+1}(\Theta)]. \]
The mean of the predictive distribution (the Bayesian premium) is
\[ E[X_{n+1}|\tilde{X}=\tilde{x}] = \int x_{n+1}\cdot f_{X_{n+1}|\tilde{X}}(x_{n+1}|\tilde{x})\,dx_{n+1} = \int \mu_{n+1}(\theta)\cdot\pi_{\Theta|\tilde{X}}(\theta|\tilde{x})\,d\theta = E[\mu_{n+1}(\Theta)|\tilde{X}=\tilde{x}], \]
that is, (the mean of the predictive distribution) = (the mean of the posterior distribution).

7 The Credibility Premium

We would like to approximate $\mu_{n+1}(\Theta)$ by a linear function of the past data. That is, we choose $\alpha_0, \alpha_1, \ldots, \alpha_n$ to minimize the squared error loss
\[ Q = E\Big\{\Big[\mu_{n+1}(\Theta) - \alpha_0 - \sum_{j=1}^n \alpha_j X_j\Big]^2\Big\}. \]
The minimizers satisfy
\[ E[X_{n+1}] = E[\mu_{n+1}(\Theta)] = \hat\alpha_0 + \sum_{j=1}^n \hat\alpha_j\cdot E[X_j], \tag{1} \]
and, for $i = 1, 2, \ldots, n$, $E[\mu_{n+1}(\Theta)\cdot X_i] = E[X_{n+1}\cdot X_i]$ and
\[ Cov(X_i, X_{n+1}) = \sum_{j=1}^n \hat\alpha_j\cdot Cov(X_i, X_j). \tag{2} \]
$\hat\alpha_0 + \sum_{j=1}^n \hat\alpha_j\cdot X_j$ is called the credibility premium. Equation (1) and the $n$ equations (2) together are called the normal equations, which can be expressed in matrix form as
\[ \begin{pmatrix} \mu_{n+1} \\ \sigma^2_{1,n+1} \\ \sigma^2_{2,n+1} \\ \vdots \\ \sigma^2_{n,n+1} \end{pmatrix} = \begin{pmatrix} 1 & \mu_1 & \mu_2 & \cdots & \mu_n \\ 0 & \sigma^2_{1,1} & \sigma^2_{1,2} & \cdots & \sigma^2_{1,n} \\ 0 & \sigma^2_{2,1} & \sigma^2_{2,2} & \cdots & \sigma^2_{2,n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & \sigma^2_{n,1} & \sigma^2_{n,2} & \cdots & \sigma^2_{n,n} \end{pmatrix} \begin{pmatrix} \hat\alpha_0 \\ \hat\alpha_1 \\ \hat\alpha_2 \\ \vdots \\ \hat\alpha_n \end{pmatrix}, \]
where $\mu_j = E[X_j]$ and $\sigma^2_{i,j} = Cov(X_i, X_j)$, $i = 1, \ldots, n$ and $j = 1, \ldots, n+1$.

Note that the values $\hat\alpha_0, \hat\alpha_1, \ldots, \hat\alpha_n$ also minimize
\[ Q_1 = E\Big\{\Big[E(X_{n+1}|\tilde{X}) - \alpha_0 - \sum_{j=1}^n \alpha_j X_j\Big]^2\Big\} \quad\text{and}\quad Q_2 = E\Big\{\Big[X_{n+1} - \alpha_0 - \sum_{j=1}^n \alpha_j X_j\Big]^2\Big\}. \]
That is, the credibility premium $\hat\alpha_0 + \sum_{j=1}^n \hat\alpha_j X_j$ is the best linear estimator of each of the hypothetical mean $\mu_{n+1}(\Theta) = E[X_{n+1}|\Theta]$, the Bayesian premium $E[X_{n+1}|\tilde{X}]$, and $X_{n+1}$ (all of $\mu_{n+1}(\Theta)$, $E[X_{n+1}|\tilde{X}]$ and $X_{n+1}$ have the same expectation).

Special case: If $\mu_j = E[X_j] = \mu$, $\sigma^2_{j,j} = Var(X_j) = \sigma^2$, and $\sigma^2_{i,j} = Cov(X_i,X_j) = \rho\sigma^2$ for $i \neq j$, where the correlation coefficient $\rho$ satisfies $-1 < \rho < 1$, then
\[ \hat\alpha_0 = \frac{(1-\rho)\,\mu}{1-\rho+n\rho} \quad\text{and}\quad \hat\alpha_j = \frac{\rho}{1-\rho+n\rho}. \]
The credibility premium is then
\[ \hat\alpha_0 + \sum_{j=1}^n \hat\alpha_j X_j = \frac{1-\rho}{1-\rho+n\rho}\cdot\mu + \frac{\rho}{1-\rho+n\rho}\cdot\sum_{j=1}^n X_j = (1-Z)\cdot\mu + Z\cdot\bar{X}, \]
where $Z = n\rho/(1-\rho+n\rho)$. Thus, if $0 < \rho < 1$, then $0 < Z < 1$ and the credibility premium is a weighted average of the sample mean $\bar{X}$ and the pure premium $\mu_{n+1} = E[X_{n+1}] = \mu$.

8 The Loss Functions

loss function $L(\Theta,\hat\Theta)$            minimizer of $E_\Theta[L(\Theta,\hat\Theta)]$     minimizer of $E_\Theta[L(\Theta,\hat\Theta)\,|\,\tilde{X}]$
$(\Theta-\hat\Theta)^2$                          the mean $E_\Theta[\Theta]$                       the posterior mean $E_\Theta[\Theta|\tilde{X}]$
$|\Theta-\hat\Theta|$                            the median $\Pi^{-1}_\Theta[1/2]$                 the posterior median $\Pi^{-1}_{\Theta|\tilde{X}}[1/2]$
$c$ if $\hat\Theta\neq\Theta$, $0$ if $\hat\Theta=\Theta$   the mode $\max_\theta P_\Theta[\Theta=\theta]$   the posterior mode $\max_\theta P_{\Theta|\tilde{X}}[\Theta=\theta]$

9 The Parametric Bühlmann Model

Assume $X_1|\Theta, X_2|\Theta, \ldots, X_n|\Theta$ are independent and identically distributed.
Denote the hypothetical mean $\mu(\theta) = E[X_j|\Theta=\theta]$; the process variance $v(\theta) = Var[X_j|\Theta=\theta]$; the expected value of the hypothetical mean (the collective premium) $\mu = E[\mu(\Theta)]$; the expected value of the process variance $v = E[v(\Theta)] = E[Var(X_j|\Theta)]$; and the variance of the hypothetical mean $a = Var[\mu(\Theta)] = Var[E(X_j|\Theta)]$. Then
\[ E[X_j] = E[E(X_j|\Theta)] = E[\mu(\Theta)] = \mu, \]
\[ Var[X_j] = E[Var(X_j|\Theta)] + Var[E(X_j|\Theta)] = E[v(\Theta)] + Var[\mu(\Theta)] = v + a, \]
for $j = 1, 2, \ldots, n$. Also, for $i \neq j$, $Cov[X_i, X_j] = Var[\mu(\Theta)] = a$. From the special case of Section 7, we have $\sigma^2 = v + a$ and $\rho = a/(v+a)$. The credibility premium is
\[ \hat\alpha_0 + \sum_{j=1}^n \hat\alpha_j X_j = Z\cdot\bar{X} + (1-Z)\cdot\mu, \]
a linear function of $\bar{X}$ with slope $Z$ and intercept $(1-Z)\cdot\mu$, where
\[ Z = \frac{n\rho}{1-\rho+n\rho} = \frac{n}{n+v/a} = \frac{n}{n+k} \]
is called the Bühlmann credibility factor, and
\[ k = \frac{v}{a} = \frac{E[v(\Theta)]}{Var[\mu(\Theta)]} = \frac{E[Var(X_j|\Theta)]}{Var[E(X_j|\Theta)]}. \]
Note that $Z$ is increasing in $n$ and in $a = Var[\mu(\Theta)] = Var[E(X_j|\Theta)]$, but decreasing in $k$ and in $v = E[v(\Theta)] = E[Var(X_j|\Theta)]$.

Theorem: Suppose that (1) in a single period of observation there are $n$ independent trials $X_1, \ldots, X_n$, each with probability distribution function $F_{X|\Theta}(x|\theta)$, and (2) in each of $n$ periods of observation there is a single trial $X_i$, $i = 1, \ldots, n$, where the $X_i$'s are independent and identically distributed with probability distribution function $F_{X|\Theta}(x|\theta)$. Then $Z_1 = Z_2$ (the credibility factor of case (1) is equal to the credibility factor of case (2)) and $k_2 = n\cdot k_1$:
\[ Z_1 = \frac{1}{1+k_1} = \frac{1}{1 + \dfrac{E_\Theta[Var(X_1+X_2+\cdots+X_n|\Theta)]}{Var_\Theta[E(X_1+X_2+\cdots+X_n|\Theta)]}} = \frac{n}{n + \dfrac{E_\Theta[Var(X_1|\Theta)]}{Var_\Theta[E(X_1|\Theta)]}} = \frac{n}{n+k_2} = Z_2. \]
(Note, regarding Section 8, that the mode is not necessarily unique.)

10 The Non-parametric Bühlmann Model

Given $n$ policy years of experience data on $r$ group policyholders, $n \ge 2$ and $r \ge 2$, let $X_{i,j}$ denote the random variable representing the aggregate loss amount of the $i$th policyholder during the $j$th policy year, for $i = 1, \ldots, r$ and $j = 1, \ldots, n, n+1$. We would like to estimate $E[X_{i,n+1}|X_{i,1},\ldots,X_{i,n}]$ for $i = 1, \ldots, r$. Let $\tilde{X}_i = (X_{i,1},\ldots,X_{i,n})$ denote the random vector of aggregate claim amounts for the $i$th policyholder, $i = 1, \ldots, r$. Furthermore, we assume (1) $\tilde{X}_1, \ldots, \tilde{X}_r$ are independent; (2) for $i = 1, \ldots, r$, the distribution of each element $X_{i,j}$ ($j = 1, \ldots, n$) of $\tilde{X}_i$ depends on an (unknown) risk parameter $\Theta_i = \theta_i$; (3) $\Theta_1, \ldots, \Theta_r$ are independent and identically distributed random variables; (4) given $i$, $X_{i,1}|\Theta_i, \ldots, X_{i,n}|\Theta_i$ are independent; and (5) each combination of policy year and policyholder has an equal number of underlying exposure units.

For $i = 1, \ldots, r$ and $j = 1, \ldots, n$, define $\mu(\theta_i) = E[X_{i,j}|\Theta_i=\theta_i]$ and $v(\theta_i) = Var[X_{i,j}|\Theta_i=\theta_i]$; let $\mu = E[\mu(\Theta_i)]$ be the expected value of the hypothetical means, $a = Var[\mu(\Theta_i)] = Var[E(X_{i,j}|\Theta_i)]$ the variance of the hypothetical means, and $v = E[v(\Theta_i)] = E[Var(X_{i,j}|\Theta_i)]$ the expected value of the process variances. The Bühlmann estimate for the $i$th policyholder and the $(n+1)$th policy year is
\[ E[X_{i,n+1}|X_{i,1},\ldots,X_{i,n}] = \hat{Z}\,\bar{X}_i + (1-\hat{Z})\,\hat\mu, \quad i = 1,\ldots,r, \]
where $\hat{Z} = n/(n+\hat{k})$, $\hat{k} = \hat{v}/\hat{a}$, and
\[ \bar{X}_i = \frac{1}{n}\sum_{j=1}^n X_{i,j}, \qquad \hat\mu = \bar{X} = \frac{1}{r}\sum_{i=1}^r \bar{X}_i, \]
\[ \hat{v}_i = \frac{1}{n-1}\sum_{j=1}^n (X_{i,j}-\bar{X}_i)^2, \qquad \hat{v} = \frac{1}{r}\sum_{i=1}^r \hat{v}_i = \frac{1}{r(n-1)}\sum_{i=1}^r\sum_{j=1}^n (X_{i,j}-\bar{X}_i)^2, \]
\[ \hat{a} = \frac{1}{r-1}\sum_{i=1}^r (\bar{X}_i-\bar{X})^2 - \frac{\hat{v}}{n}. \]
Note that (1) $\hat{Z}$ and $(1-\hat{Z})\hat\mu$ are independent of $i$; (2) it is possible that $\hat{a}$ could be negative due to the subtraction. When that happens, it is customary to set $\hat{a} = \hat{Z} = 0$, and the Bühlmann estimate becomes $\hat\mu = \bar{X}$.

11 The Die-Spinner Model

\[ X_k\,|\,(A_i\cap B_j) = I_k\cdot S_k\,|\,(A_i\cap B_j) = (I_k|A_i)\cdot(S_k|B_j), \]
where the frequency of claims $I_k|A_i \sim \text{Bernoulli}(p_i)$, and $S_k|B_j$ is the severity of claims.
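The frequency-times-severity product structure can be checked numerically. A minimal sketch for a single die-spinner cell (all numbers hypothetical: $p_i = 0.4$ and a two-point spinner), computing the conditional mean and variance of $X_k = I_k\cdot S_k$ both from the product-moment formulas and by direct enumeration of the joint outcomes:

```python
# Hypothetical die-spinner cell: frequency I ~ Bernoulli(p) given die face A_i,
# severity S a two-point spinner given B_j (all numbers made up for illustration).
p = 0.4                            # p_i = P[I_k = 1 | A_i]
sev_vals = [20.0, 40.0]            # possible severities given B_j
sev_probs = [0.5, 0.5]

# Severity moments.
ES = sum(v * q for v, q in zip(sev_vals, sev_probs))        # E[S_k | B_j]
ES2 = sum(v * v * q for v, q in zip(sev_vals, sev_probs))   # E[S_k^2 | B_j]
VarS = ES2 - ES ** 2

# Product formulas for X_k = I_k * S_k with I_k independent of S_k.
EX = p * ES
VarX = p * VarS + p * (1 - p) * ES ** 2

# Check by direct enumeration over the joint outcomes (i, s, probability).
outcomes = [(0, v, (1 - p) * q) for v, q in zip(sev_vals, sev_probs)] + \
           [(1, v, p * q) for v, q in zip(sev_vals, sev_probs)]
EX_direct = sum(i * v * q for i, v, q in outcomes)
VarX_direct = sum((i * v) ** 2 * q for i, v, q in outcomes) - EX_direct ** 2

print(EX, VarX)                    # 12.0 256.0
```

Here $E[X_k] = 0.4\cdot 30 = 12$ and $Var[X_k] = 0.4\cdot 100 + 0.4\cdot 0.6\cdot 900 = 256$, matching the enumeration.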
The die and spinner are selected independently: $P[A_i\cap B_j] = P[A_i]\cdot P[B_j]$.

Bayesian estimate: by conditional independence,
\[ P[\tilde{X}=\tilde{x}\,|\,(A_i\cap B_j)] = \prod_{k=1}^n P[X_k=x_k\,|\,(A_i\cap B_j)], \]
where, for $x_k > 0$,
\[ P[X_k=x_k\,|\,(A_i\cap B_j)] = P[I_k=1|A_i]\cdot P[S_k=x_k|B_j] = p_i\cdot P[S_k=x_k|B_j], \]
and $P[X_k=0\,|\,(A_i\cap B_j)] = P[I_k=0|A_i] = 1-p_i$. Then
\[ P[(A_i\cap B_j)\,|\,\tilde{X}=\tilde{x}] = P[\tilde{X}=\tilde{x}\,|\,(A_i\cap B_j)]\cdot P[A_i\cap B_j]\,/\,P[\tilde{X}=\tilde{x}], \]
\[ P[\tilde{X}=\tilde{x}] = \sum_i\sum_j P[\tilde{X}=\tilde{x}\,|\,(A_i\cap B_j)]\cdot P[A_i\cap B_j], \]
\[ P[X_{n+1}=x_{n+1}\,|\,\tilde{X}=\tilde{x}] = \sum_i\sum_j P[X_{n+1}=x_{n+1}\,|\,(A_i\cap B_j)]\cdot P[(A_i\cap B_j)\,|\,\tilde{X}=\tilde{x}]. \]
Since $E[X_k\,|\,(A_i\cap B_j)] = E[I_k|A_i]\cdot E[S_k|B_j] = p_i\cdot E[S_k|B_j]$, the Bayesian premium is
\[ E[X_{n+1}\,|\,\tilde{X}=\tilde{x}] = \sum_k m_k\cdot P[X_{n+1}=m_k\,|\,\tilde{X}=\tilde{x}] = \sum_i\sum_j E[X_{n+1}\,|\,(A_i\cap B_j)]\cdot P[(A_i\cap B_j)\,|\,\tilde{X}=\tilde{x}]. \]

Credibility estimate (combined):
\[ \mu = E[X_k] = E[E(X_k|\Theta)] = \sum_i\sum_j E[X_k\,|\,(A_i\cap B_j)]\cdot P[A_i\cap B_j], \]
\[ v = E\{Var[X_k|\Theta]\} = \sum_i\sum_j Var[X_k\,|\,(A_i\cap B_j)]\cdot P[A_i\cap B_j], \]
where
\[ Var[X_k\,|\,(A_i\cap B_j)] = E[I_k^2|A_i]\cdot E[S_k^2|B_j] - \{E[I_k|A_i]\cdot E[S_k|B_j]\}^2 = p_i\cdot Var[S_k|B_j] + p_i(1-p_i)\cdot\{E[S_k|B_j]\}^2, \]
and
\[ a = Var\{E[X_k|\Theta]\} = E\{[E(X_k|\Theta)]^2\} - \{E[X_k]\}^2 = \sum_i\sum_j \{E[X_k\,|\,(A_i\cap B_j)] - \mu\}^2\cdot P[A_i\cap B_j]. \]
Then $P_C = Z\cdot\bar{X} + (1-Z)\cdot\mu$, where $Z = n/(n+k)$ and $k = v/a$.

Credibility estimate (separated):

Frequency:
\[ \mu_F = E[I_k] = \sum_i E[I_k|A_i]\cdot P[A_i] = \sum_i p_i\cdot P[A_i], \qquad v_F = \sum_i Var[I_k|A_i]\cdot P[A_i] = \sum_i p_i(1-p_i)\cdot P[A_i], \]
\[ a_F = Var\{E[I_k|\Theta_A]\} = \sum_i \{E[I_k|A_i]-\mu_F\}^2\cdot P[A_i] = \sum_i (p_i-\mu_F)^2\cdot P[A_i], \]
and $P_F = Z_F\cdot\bar{I} + (1-Z_F)\cdot\mu_F$, where $Z_F = n/(n+k_F)$ and $k_F = v_F/a_F$.

Severity:
\[ \mu_S = E[S_k] = \sum_j E[S_k|B_j]\cdot P[B_j], \qquad v_S = \sum_j Var[S_k|B_j]\cdot P[B_j], \]
\[ a_S = Var\{E[S_k|\Theta_B]\} = \sum_j \{E[S_k|B_j]-\mu_S\}^2\cdot P[B_j], \]
and $P_S = Z_S\cdot\bar{S} + (1-Z_S)\cdot\mu_S$, where $Z_S = n_S/(n_S+k_S)$, $k_S = v_S/a_S$, and $n_S$ is the number of non-zero claims. Finally, $P_C = P_F\cdot P_S$.

12 The Parametric Bühlmann-Straub Model

Assume $X_1|\Theta, X_2|\Theta, \ldots, X_n|\Theta$ are independent with common hypothetical mean $\mu(\theta) = E[X_j|\Theta=\theta]$ but different process variances
\[ Var[X_j|\Theta=\theta] = \frac{v(\theta)}{m_j}, \]
where $m_j$ is a known constant measuring exposure, $j = 1, \ldots, n$ ($m_j$ could be the number of months the policy was in force in past year $j$, the number of individuals in the group in past year $j$, or the amount of premium income for the policy in past year $j$). Let the expected value of the hypothetical mean (the collective premium) be $\mu = E[\mu(\Theta)]$, the expected value of the process variance $v = E[v(\Theta)]$ (note $v \neq E[Var(X_j|\Theta)] = E[v(\Theta)]/m_j$), and the variance of the hypothetical mean $a = Var[\mu(\Theta)] = Var[E(X_j|\Theta)]$. Then
\[ E[X_j] = E[E(X_j|\Theta)] = E[\mu(\Theta)] = \mu, \qquad Var[X_j] = E[Var(X_j|\Theta)] + Var[E(X_j|\Theta)] = \frac{v}{m_j} + a, \]
for $j = 1, 2, \ldots, n$. Also, for $i \neq j$, $Cov[X_i,X_j] = a$. The credibility premium $\hat\alpha_0 + \sum_{j=1}^n\hat\alpha_j X_j$ minimizes
\[ Q = E\Big\{\Big[\mu_{n+1}(\Theta) - \alpha_0 - \sum_{j=1}^n\alpha_j X_j\Big]^2\Big\}, \]
where, with $m = \sum_{j=1}^n m_j$ and $k = v/a$,
\[ \hat\alpha_0 = \frac{v/a}{m+v/a}\cdot\mu = \frac{k}{m+k}\cdot\mu \qquad\text{and}\qquad \hat\alpha_j = \frac{a}{v}\cdot\frac{\hat\alpha_0}{\mu}\cdot m_j = \frac{m_j}{m+v/a} = \frac{m}{m+k}\cdot\frac{m_j}{m}. \]
The credibility premium is then
\[ \hat\alpha_0 + \sum_{j=1}^n\hat\alpha_j X_j = \frac{k}{m+k}\cdot\mu + \frac{m}{m+k}\cdot\frac{\sum_{j=1}^n m_j X_j}{m} = (1-Z)\cdot\mu + Z\cdot\bar{X}, \]
where $Z = m/(m+k)$ and $\bar{X} = (\sum_{j=1}^n m_j X_j)/(\sum_{j=1}^n m_j)$. Note that if $X_j$ is the average loss per individual for the $m_j$ group members in year $j$, then $m_j X_j$ is the total loss for the $m_j$ group members in year $j$, and $\bar{X}$ is the overall average loss per member over the $n$ years. The credibility premium to be charged to the group in year $n+1$ would be $m_{n+1}\cdot[Z\cdot\bar{X} + (1-Z)\cdot\mu]$ for $m_{n+1}$ members.
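As a numeric illustration of the Bühlmann-Straub premium $Z\bar{X}+(1-Z)\mu$ with $Z=m/(m+v/a)$, here is a minimal sketch; the structural parameters $\mu$, $v$, $a$ and the exposures and yearly averages are all hypothetical:

```python
# A numeric sketch of the Buhlmann-Straub premium; the structural parameters
# mu, v, a and the experience data below are hypothetical.
mu, v, a = 100.0, 5000.0, 50.0       # collective premium, E[v(Theta)], Var[mu(Theta)]
m_years = [40, 50, 60]               # exposures m_1, ..., m_n
x_years = [95.0, 110.0, 105.0]       # average loss per exposure unit X_1, ..., X_n

m = sum(m_years)                                               # total exposure m
x_bar = sum(mj * xj for mj, xj in zip(m_years, x_years)) / m   # exposure-weighted mean
k = v / a                                                      # k = v/a
Z = m / (m + k)                                                # Z = m/(m + k)
per_unit = Z * x_bar + (1 - Z) * mu                            # Z*X_bar + (1-Z)*mu

m_next = 55                          # m_{n+1} exposure units next year
premium = m_next * per_unit          # premium charged to the whole group
print(round(per_unit, 2), round(premium, 2))   # 102.4 5632.0
```

With these made-up numbers, $m = 150$, $\bar{X} = 104$, $k = 100$, so $Z = 0.6$ and the per-unit premium is $0.6\cdot 104 + 0.4\cdot 100 = 102.4$.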
For the single observation $\bar{X}$, the process variance is
\[ Var[\bar{X}|\theta] = Var\bigg[\sum_{j=1}^n \frac{m_j X_j}{m}\,\bigg|\,\theta\bigg] = \sum_{j=1}^n \frac{m_j^2}{m^2}\cdot Var[X_j|\theta] = \sum_{j=1}^n \frac{m_j^2}{m^2}\cdot\frac{v(\theta)}{m_j} = \frac{v(\theta)}{m}, \]
and the expected process variance is $E[Var(\bar{X}|\Theta)] = E[v(\Theta)]/m = v/m$. The hypothetical mean is
\[ E[\bar{X}|\theta] = E\bigg[\sum_{j=1}^n \frac{m_j X_j}{m}\,\bigg|\,\theta\bigg] = \sum_{j=1}^n \frac{m_j}{m}\cdot E[X_j|\theta] = \mu(\theta), \]
the variance of the hypothetical means is $Var[E(\bar{X}|\Theta)] = Var[\mu(\Theta)] = a$, and the expected hypothetical mean is $E[E(\bar{X}|\Theta)] = E[\mu(\Theta)] = \mu$. Therefore, the credibility factor is
\[ Z = \frac{1}{1 + v/(am)} = \frac{m}{m+v/a}. \]

13 The Non-parametric Bühlmann-Straub Model

Let $r$ group policyholders be such that the $i$th policyholder has $n_i$ years of experience data, $i = 1, 2, \ldots, r$ ($r \ge 2$). Let $m_{i,j}$ denote the number of exposure units and $X_{i,j}$ the random variable representing the average claim amount per exposure unit of the $i$th policyholder during the $j$th policy year, for $j = 1, \ldots, n_i, n_i+1$ ($n_i \ge 2$) and $i = 1, \ldots, r$. We would like to estimate $E[X_{i,n_i+1}|X_{i,1},\ldots,X_{i,n_i}]$ for $i = 1, \ldots, r$. Let $\tilde{X}_i = (X_{i,1},\ldots,X_{i,n_i})$ denote the random vector of average claim amounts and $\tilde{m}_i = (m_{i,1},\ldots,m_{i,n_i})$ the vector of exposure units for the $i$th policyholder, $i = 1, \ldots, r$. Furthermore, we assume (1) $\tilde{X}_1,\ldots,\tilde{X}_r$ are independent; (2) for $i = 1,\ldots,r$, the distribution of each element $X_{i,j}$ ($j = 1,\ldots,n_i$) of $\tilde{X}_i$ depends on an (unknown) risk parameter $\Theta_i = \theta_i$; (3) $\Theta_1,\ldots,\Theta_r$ are independent and identically distributed random variables; and (4) given $i$, $X_{i,1}|\Theta_i,\ldots,X_{i,n_i}|\Theta_i$ are independent.

For $i = 1,\ldots,r$ and $j = 1,\ldots,n_i$ ($n_i \ge 2$), define $E[X_{i,j}|\Theta_i=\theta_i] = \mu(\theta_i)$ and $Var[X_{i,j}|\Theta_i=\theta_i] = v(\theta_i)/m_{i,j}$. Let $\mu = E[\mu(\Theta_i)]$ denote the expected value of the hypothetical means, $a = Var[\mu(\Theta_i)] = Var[E(X_{i,j}|\Theta_i)]$ the variance of the hypothetical means, and $v = E[v(\Theta_i)] = m_{i,j}\cdot E[Var(X_{i,j}|\Theta_i)]$ the expected value of the process variances.

The estimators are
\[ \bar{X}_i = \frac{1}{m_i}\sum_{j=1}^{n_i} m_{i,j}X_{i,j}, \qquad m_i = \sum_{j=1}^{n_i} m_{i,j}, \qquad \hat{v}_i = \frac{1}{n_i-1}\sum_{j=1}^{n_i} m_{i,j}(X_{i,j}-\bar{X}_i)^2, \]
\[ \hat\mu = \bar{X} = \frac{\sum_{i=1}^r m_i\bar{X}_i}{\sum_{i=1}^r m_i}, \qquad \hat{v} = \frac{\sum_{i=1}^r (n_i-1)\,\hat{v}_i}{\sum_{i=1}^r (n_i-1)}, \qquad \hat{a} = \frac{\sum_{i=1}^r m_i(\bar{X}_i-\bar{X})^2 - (r-1)\,\hat{v}}{m - \frac{1}{m}\sum_{i=1}^r m_i^2}. \]
In this case, $\bar{X}_i$ is the average claim amount per exposure unit and $m_i$ the total number of exposure units for the $i$th policyholder during the first $n_i$ policy years, $i = 1,\ldots,r$, and
\[ \bar{X} = \frac{1}{m}\sum_{i=1}^r m_i\bar{X}_i = \frac{1}{m}\sum_{i=1}^r\sum_{j=1}^{n_i} m_{i,j}X_{i,j} \]
is the overall past average claim amount per exposure unit of the $r$ policyholders, where $m = \sum_{i=1}^r m_i = \sum_{i=1}^r\sum_{j=1}^{n_i} m_{i,j}$ is the total number of exposure units.

The Bühlmann-Straub estimate for the $i$th policyholder and the $(n_i+1)$th policy year is
\[ E[X_{i,n_i+1}|X_{i,1},\ldots,X_{i,n_i}] = \hat{Z}_i\,\bar{X}_i + (1-\hat{Z}_i)\,\hat\mu, \quad i = 1,\ldots,r, \]
where $\hat{Z}_i = m_i/(m_i+\hat{k})$ and $\hat{k} = \hat{v}/\hat{a}$, and the credibility premium to cover all $m_{i,n_i+1}$ exposure units for policyholder $i$ in the $(n_i+1)$th policy year is $m_{i,n_i+1}\cdot E[X_{i,n_i+1}|X_{i,1},\ldots,X_{i,n_i}]$. Note that
(1) $\hat{Z}_i$ depends on $i$;
(2) it is possible that $\hat{a}$ could be negative due to the subtraction; when that happens, it is customary to set $\hat{a} = \hat{Z}_i = 0$, and the Bühlmann-Straub estimate becomes $\hat\mu = \bar{X}$;
(3) if $m_{i,j} = 1$ and $n_i = n$ for all $i$ and $j$, then $m_i = n$, $m = r\cdot n$, and the ordinary Bühlmann estimators are recovered.

The method that preserves total losses (TP = TL): the total losses on all policyholders are $TL = \sum_{i=1}^r m_i\bar{X}_i$, and the total premium is $TP = \sum_{i=1}^r m_i\,[\hat{Z}_i\bar{X}_i + (1-\hat{Z}_i)\hat\mu]$.
Setting TP = TL and solving for the collective premium yields
\[ \hat\mu^* = \sum_{i=1}^r \frac{\hat{Z}_i}{\sum_{j=1}^r \hat{Z}_j}\cdot\bar{X}_i, \]
which is a credibility-factor-weighted average of the $\bar{X}_i$'s with weights $w_i = \hat{Z}_i/\sum_{j=1}^r\hat{Z}_j$, $i = 1,\ldots,r$. Compare the alternative $\hat\mu^*$ with the original
\[ \hat\mu = \sum_{i=1}^r \frac{m_i}{\sum_{j=1}^r m_j}\cdot\bar{X}_i, \]
which is an exposure-unit-weighted average of the $\bar{X}_i$'s with weights $w_i = m_i/\sum_{j=1}^r m_j$, $i = 1,\ldots,r$. Note that
(1) $\bar{X} = \sum_{i=1}^r m_i\bar{X}_i/\sum_{j=1}^r m_j$ is still used in the $\hat{a}$ of the $\hat{k}$ of $\hat\mu^*$;
(2) the difference between the two credibility premiums for policyholder $i$ based on the two estimators of $\mu$ is $(1-\hat{Z}_i)\cdot(\hat\mu^* - \hat\mu)$.

The above analysis assumes that the parameters $\mu$, $v$ and $a$ are all unknown and need to be estimated. If $\mu$ is known, then $\hat{v}$ given above can still be used to estimate $v$, as it is unbiased whether or not $\mu$ is known, and an alternative and simpler unbiased estimator for $a$ is
\[ \tilde{a} = \sum_{i=1}^r \frac{m_i}{m}\,(\bar{X}_i-\mu)^2 - \frac{r}{m}\,\hat{v}. \]
If there are data on only one policyholder (say policyholder $i$), $\tilde{a}$ with $r = 1$ becomes
\[ \tilde{a}_i = \frac{m_i}{m_i}\,(\bar{X}_i-\mu)^2 - \frac{1}{m_i}\,\hat{v}_i = (\bar{X}_i-\mu)^2 - \frac{\hat{v}_i}{m_i} = (\bar{X}_i-\mu)^2 - \frac{\sum_{j=1}^{n_i} m_{i,j}(X_{i,j}-\bar{X}_i)^2}{m_i\,(n_i-1)}. \]

14 Semi-parametric Estimation

Assume the number of claims $N_{i,j} = m_{i,j}X_{i,j}$ for policyholder $i$ in year $j$ is Poisson distributed with mean $m_{i,j}\theta_i$ given $\Theta_i = \theta_i$, that is, $m_{i,j}X_{i,j}\,|\,\Theta_i=\theta_i \sim \text{Poisson}(m_{i,j}\theta_i)$. Then $E[m_{i,j}X_{i,j}|\Theta_i] = Var[m_{i,j}X_{i,j}|\Theta_i] = m_{i,j}\Theta_i$, or
\[ \mu(\Theta_i) = E[X_{i,j}|\Theta_i] = \Theta_i \quad\text{and}\quad v(\Theta_i) = m_{i,j}\cdot Var[X_{i,j}|\Theta_i] = \Theta_i. \]
Therefore $\mu = E[\mu(\Theta_i)] = E[\Theta_i] = E[v(\Theta_i)] = v$. In this case, we can use $\hat\mu = \bar{X}$ to estimate $v$.

Example: Assume a (conditional) Poisson distribution for the number of claims per policyholder, and estimate the Bühlmann credibility premium for the number of claims next year.

number of claims:    0    1    2    ...  k    | total
number of insureds:  r_0  r_1  r_2  ...  r_k  | r

Assume that we have $r$ policyholders, $n_i = 1$ and $m_{i,j} = 1$ for $i = 1,\ldots,r$. Since $X_{i,1}|\Theta_i \sim \text{Poisson}(\Theta_i)$, we have $E[X_{i,1}|\Theta_i] = Var[X_{i,1}|\Theta_i] = \Theta_i$, so $\mu(\Theta_i) = v(\Theta_i) = \Theta_i$ and $\mu = E[\mu(\Theta_i)] = E[\Theta_i] = E[v(\Theta_i)] = v$. Moreover,
\[ \hat\mu = \bar{X} = \frac{1}{r}\sum_{i=1}^r X_{i,1} = \frac{1}{r}\sum_{j=1}^k j\cdot r_j. \]
Since
\[ Var[X_{i,1}] = Var[E(X_{i,1}|\Theta_i)] + E[Var(X_{i,1}|\Theta_i)] = Var[\mu(\Theta_i)] + E[v(\Theta_i)] = a + v = a + \mu \]
and $E[X_{i,1}] = E[E(X_{i,1}|\Theta_i)] = E[\mu(\Theta_i)] = \mu$, the variables $X_{1,1}, X_{2,1}, \ldots, X_{r,1}$ are independent random variables with common mean $\mu$ and common variance $a + v = a + \mu$. Since the sample variance of $X_{1,1}, \ldots, X_{r,1}$ is an unbiased estimator of $Var[X_{i,1}] = a + v = a + \mu$, we have
\[ \frac{1}{r-1}\sum_{i=1}^r (X_{i,1}-\bar{X})^2 = \widehat{Var}[X_{i,1}] = \hat{a} + \hat{v} = \hat{a} + \hat\mu, \quad\text{or}\quad \hat{a} = \frac{1}{r-1}\sum_{i=1}^r (X_{i,1}-\bar{X})^2 - \hat\mu. \]
Then $\hat{k} = \hat{v}/\hat{a} = \hat\mu/\hat{a}$, $\hat{Z} = 1/(1+\hat{k})$, and the Bühlmann credibility premium for the number of claims $X_{i,2}$ next year is
\[ \hat{X}_{i,2} = \hat{Z}\,X_{i,1} + (1-\hat{Z})\,\hat\mu, \quad\text{for } X_{i,1} = 0, 1, \ldots, k. \]

15 Parametric Estimator

Assume that, given $i$, the $X_{i,j}|\Theta_i$ are independent and identically distributed with probability density function $f_{X_{i,j}|\Theta_i}(x_{i,j}|\theta_i)$, and that $\Theta_1, \Theta_2, \ldots, \Theta_r$ are also independent and identically distributed with probability density function $\pi_\Theta(\theta)$. Let $\tilde{X}_i = (X_{i,1},\ldots,X_{i,n_i})$; then the unconditional joint density of $\tilde{X}_i$ is
\[ f_{\tilde{X}_i}(\tilde{x}_i) = \int f_{\tilde{X}_i,\Theta_i}(\tilde{x}_i,\theta_i)\,d\theta_i = \int f_{\tilde{X}_i|\Theta_i}(\tilde{x}_i|\theta_i)\,\pi_{\Theta_i}(\theta_i)\,d\theta_i = \int\Big[\prod_{j=1}^{n_i} f_{X_{i,j}|\Theta_i}(x_{i,j}|\theta_i)\Big]\pi_{\Theta_i}(\theta_i)\,d\theta_i. \]
The likelihood function is given by
\[ L = \prod_{i=1}^r f_{\tilde{X}_i}(\tilde{x}_i) = \prod_{i=1}^r\bigg\{\int\Big[\prod_{j=1}^{n_i} f_{X_{i,j}|\Theta_i}(x_{i,j}|\theta_i)\Big]\pi_{\Theta_i}(\theta_i)\,d\theta_i\bigg\}. \]
Maximum likelihood estimators of the associated parameters are chosen to maximize $L$ or $\log L$.

16 Exact Credibility

When "the Bayesian premium = the credibility premium", we say the credibility is "exact". Recall that the solutions of the normal equations, $\tilde\alpha_0, \tilde\alpha_1, \ldots, \tilde\alpha_n$, yield the credibility premium $\tilde\alpha_0 + \sum_{j=1}^n\tilde\alpha_j X_j$, which minimizes
\[ Q = E\Big\{\Big[\mu_{n+1}(\Theta)-\alpha_0-\sum_{j=1}^n\alpha_j X_j\Big]^2\Big\}, \qquad Q_1 = E\Big\{\Big[E(X_{n+1}|\tilde{X})-\alpha_0-\sum_{j=1}^n\alpha_j X_j\Big]^2\Big\}, \]
\[ \text{and}\qquad Q_2 = E\Big\{\Big[X_{n+1}-\alpha_0-\sum_{j=1}^n\alpha_j X_j\Big]^2\Big\}. \]
If the Bayesian premium $E(X_{n+1}|\tilde{X})$ is a linear function of $X_1, X_2, \ldots, X_n$ (in general, it is NOT), that is, $E(X_{n+1}|\tilde{X}) = a_0 + \sum_{j=1}^n a_j X_j$, then $\tilde\alpha_j = a_j$ for $j = 0, 1, \ldots, n$, and therefore $Q_1 = 0$. Thus the credibility premium $\tilde\alpha_0 + \sum_{j=1}^n\tilde\alpha_j X_j = a_0 + \sum_{j=1}^n a_j X_j = E(X_{n+1}|\tilde{X})$, and the credibility is "exact".

In summary, the Bühlmann estimator $\tilde\alpha_0 + \sum_{j=1}^n\tilde\alpha_j X_j$ is the "best linear" approximation to the Bayesian estimate $E(X_{n+1}|\tilde{X})$ under the squared error loss function $Q_1$. Recall that in linear regression,
\[ Y_i = \alpha_0 + \sum_{j=1}^n \alpha_j X_j + \epsilon_i, \quad \epsilon_i \sim N(0,\sigma^2), \quad i = 1, 2, \ldots, m, \]
and $\hat\alpha_0, \hat\alpha_1, \ldots, \hat\alpha_n$ are chosen to minimize
\[ Q = E\Big[\sum_{i=1}^m \epsilon_i^2\Big] = E\Big[\sum_{i=1}^m \Big(Y_i-\alpha_0-\sum_{j=1}^n\alpha_j X_j\Big)^2\Big]. \]
Therefore, the regression line $\hat{Y}_i = \hat\alpha_0 + \sum_{j=1}^n\hat\alpha_j X_j$ corresponds to the credibility premium $\tilde\alpha_0 + \sum_{j=1}^n\tilde\alpha_j X_j$, and the observation $Y_i$ corresponds to the Bayesian premium $E[X_{n+1}|\tilde{X} = (\tilde{x})_i]$, $i = 1, 2, \ldots, m$.

Condition for exact credibility: Suppose that $X_j|\Theta=\theta$ is independently distributed and is from the "linear exponential family" ("linear" means the exponent is a linear function of $x_j$) with probability function
\[ f_{X_j|\Theta}(x_j|\theta) = \frac{p(x_j)\cdot e^{-\theta x_j}}{q(\theta)}, \quad j = 1, \ldots, n+1, \]
and that $\Theta$ has probability function
\[ \pi_\Theta(\theta) = \frac{[q(\theta)]^{-k}\cdot e^{\mu k\theta}}{c(\mu,k)}, \]
where $\theta\in(\theta_0,\theta_1)$ with $\pi_\Theta(\theta_0) = \pi_\Theta(\theta_1) = 0$ and $-\infty\le\theta_0<\theta_1\le\infty$. Then $\mu = E_\Theta[\mu(\Theta)] = E[E(X_j|\Theta)] = E[X_j]$, and the posterior distribution is
\[ \pi_{\Theta|\tilde{X}}(\theta|\tilde{x}) = \frac{[q(\theta)]^{-k^*}\cdot e^{\mu^* k^*\theta}}{c(\mu^*,k^*)}, \quad \theta\in(\theta_0,\theta_1), \]
where $k^* = n+k$ and
\[ \mu^* = \frac{n\cdot\bar{X}+\mu\cdot k}{n+k} = \frac{n}{n+k}\cdot\bar{X} + \frac{k}{n+k}\cdot\mu. \]
The Bayesian premium is
\[ E[X_{n+1}|\tilde{X}] = E_{\Theta|\tilde{X}}[\mu_{n+1}(\Theta)] = \int_{\theta_0}^{\theta_1}\mu(\theta)\,\pi_{\Theta|\tilde{X}}(\theta|\tilde{x})\,d\theta = \mu^* = Z\cdot\bar{X} + (1-Z)\cdot\mu, \]
where $Z = n/(n+k)$ and
\[ k = \frac{v}{a} = \frac{E[Var(X_j|\Theta)]}{Var[E(X_j|\Theta)]}. \]
Note that the prior distribution $\pi_\Theta(\theta)$ is a conjugate prior distribution (a prior distribution is a "conjugate" prior distribution if the resulting posterior distribution is of the same type as the prior one, but perhaps with different parameters).
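For a concrete check of exact credibility, consider Poisson counts with a Gamma prior, a conjugate pair from the linear exponential family. The sketch below uses hypothetical prior parameters and claim counts, with the Gamma prior in the scale parametrization (prior mean $\alpha\beta$), and verifies that the posterior mean equals $Z\bar{X}+(1-Z)\mu$:

```python
# Exact credibility check for a conjugate pair from the linear exponential
# family: Poisson(theta) counts with a Gamma(alpha, beta) prior (scale beta,
# prior mean alpha*beta). Prior parameters and data are hypothetical.
alpha, beta = 3.0, 0.5
data = [2, 0, 1, 3, 1]               # observed claim counts x_1, ..., x_n
n = len(data)

# Posterior is Gamma(alpha + sum(x), beta/(n*beta + 1)); its mean is the
# Bayesian premium.
post_mean = (alpha + sum(data)) * beta / (n * beta + 1)

# Buhlmann structural quantities: mu = v = alpha*beta, a = alpha*beta**2.
mu = alpha * beta
k = mu / (alpha * beta ** 2)         # k = v/a = 1/beta
Z = n / (n + k)
cred_premium = Z * (sum(data) / n) + (1 - Z) * mu

print(post_mean, cred_premium)       # both equal 10/7 = 1.428571...
```

With these numbers, both routes give $(\alpha+\sum x_i)\beta/(n\beta+1) = 10/7$; the agreement is exact, not approximate, which is the content of the theorem that follows.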
Theorem: If $f_{X_j|\Theta}(x_j|\theta)$ is a member of a "linear exponential family", and the prior distribution $\pi_\Theta(\theta)$ is the "conjugate" prior distribution, then the Bühlmann credibility estimator is equal to the Bayesian estimator (i.e., exact credibility), assuming a squared error loss function.

quantity                                      Bernoulli(Θ) with Beta(α,β) prior          Poisson(Θ) with Gamma(α,β) prior
conditional distribution X_j|Θ                Bernoulli(Θ)                               Poisson(Θ)
prior distribution Θ                          Beta(α,β)                                  Gamma(α,β)
µ(Θ) = E[X_j|Θ]                               Θ                                          Θ
µ = E[E(X_j|Θ)]                               α/(α+β)                                    αβ
v(Θ) = Var[X_j|Θ]                             Θ(1−Θ)                                     Θ
v = E[Var(X_j|Θ)]                             αβ/[(α+β+1)(α+β)]                          αβ
a = Var[E(X_j|Θ)]                             αβ/[(α+β+1)(α+β)²]                         αβ²
k = v/a                                       α+β                                        1/β
Z                                             n/(n+α+β)                                  nβ/(nβ+1)
credibility premium                           Z·X̄ + (1−Z)·α/(α+β)                       Z·X̄ + (1−Z)·αβ
posterior distribution Θ|X̃                   Beta(Σ X_i+α, β+n−Σ X_i)                   Gamma(Σ X_i+α, β/(nβ+1))
posterior mean E[Θ|X̃]                        (Σ X_i+α)/(n+α+β)                          β(Σ X_i+α)/(nβ+1)
predictive distribution X_{n+1}|X̃            Bernoulli((Σ X_i+α)/(n+α+β))               NB(Σ X_i+α, β/(nβ+1))
predictive mean E[X_{n+1}|X̃]                 (Σ X_i+α)/(n+α+β)                          β(Σ X_i+α)/(nβ+1)

Note that the predictive mean (the Bayesian premium) $E[X_{n+1}|\tilde{X}]$ = the posterior mean $E[\Theta|\tilde{X}]$ ($= E[\mu_{n+1}(\Theta)|\tilde{X}]$, since $\mu_{n+1}(\Theta) = E[X_{n+1}|\Theta] = \Theta$) = the credibility premium $Z\cdot\bar{X} + (1-Z)\cdot\mu$.

In some situations, we may want to work with the logarithm of the data ($W_j = \log X_j$, $j = 1, 2, \ldots, n$). Then $\mu_{\log}(\Theta) = E[W_j|\Theta] = E[\log X_j|\Theta]$, $v_{\log}(\Theta) = Var[W_j|\Theta] = Var[\log X_j|\Theta]$, $\mu_{\log} = E[\mu_{\log}(\Theta)] = E[W_j]$, $v_{\log} = E[v_{\log}(\Theta)] = E[Var(W_j|\Theta)]$, $a_{\log} = Var[\mu_{\log}(\Theta)] = Var[E(W_j|\Theta)]$, and $Z_{\log} = n/(n + v_{\log}/a_{\log})$. Thus
\[ \log C = Z_{\log}\cdot\bar{W} + (1-Z_{\log})\cdot\mu_{\log}, \quad\text{or}\quad C = e^{Z_{\log}\bar{W}+(1-Z_{\log})\mu_{\log}}, \]
which is denoted by $C_{\log}$ and is not unbiased, because we use linear credibility to estimate the mean of the distribution of the logarithms. Recall that $\hat{C} = Z\cdot\bar{X} + (1-Z)\cdot\mu$ with $\mu = E[\mu(\Theta)] = E[E(X_j|\Theta)] = E[X_j]$ satisfies $E[\hat{C}] = Z\cdot E[\bar{X}] + (1-Z)\cdot E[X_j] = \mu = E[X_j]$.
To make $C_{\log}$ unbiased, let $C_{\log} = c\cdot e^{Z_{\log}\bar{W}+(1-Z_{\log})\mu_{\log}}$, where $c$ is determined by
\[ E[e^{W_j}] = E[X_j] = E[C_{\log}] = c\cdot E\big[e^{Z_{\log}\bar{W}+(1-Z_{\log})\mu_{\log}}\big]. \]
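The need for the constant $c$ can be seen numerically: exponentiating an estimate of $E[\log X]$ is biased low by Jensen's inequality. A minimal sketch assuming lognormal losses (so $E[X] = e^{\mu_W+\sigma_W^2/2}$ and, absent parameter uncertainty, the multiplicative gap is $e^{\sigma_W^2/2}$; all numbers hypothetical):

```python
import math
import random

# Assuming W = log X is Normal(mu_w, sigma_w) (lognormal losses; parameters
# are hypothetical), exp(E[W]) understates E[X]; the gap is the factor c.
random.seed(1)
mu_w, sigma_w = 5.0, 0.8
xs = [math.exp(random.gauss(mu_w, sigma_w)) for _ in range(200_000)]

naive = math.exp(sum(math.log(x) for x in xs) / len(xs))   # exp(sample mean of W)
sample_mean = sum(xs) / len(xs)                            # estimates E[X]
true_mean = math.exp(mu_w + sigma_w ** 2 / 2)              # lognormal E[X]

c = true_mean / math.exp(mu_w)     # correction factor exp(sigma_w^2 / 2)
print(naive < sample_mean, round(c, 4))                    # True 1.3771
```

In this fully lognormal case the correction is the deterministic factor $e^{\sigma_W^2/2}$; in the credibility setting $c$ must instead be backed out from $E[C_{\log}] = E[X_j]$ as above.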