Download Estimating Claim Size and Probability in the Auto

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Probability wikipedia , lookup

History of statistics wikipedia , lookup

Statistics wikipedia , lookup

Transcript
Estimating Claim Size and
Probability in the
Auto-insurance
Industry: the Zero-adjusted
Inverse Gaussian (ZAIG)
Distribution
Adriana Bruscato Bortoluzzo
Danny Pimentel Claro
Marco Antonio Leonel Caetano
Rinaldo Artes
Insper Working Paper
WPE: 175/2009
Copyright Insper. Todos os direitos reservados.
É proibida a reprodução parcial ou integral do conteúdo deste
documento por qualquer meio de distribuição, digital ou impresso, sem a expressa autorização do
Insper ou de seu autor.
A reprodução para fins didáticos é permitida observando-sea
citação completa do documento
Estimating Claim Size and Probability in the Auto-insurance Industry: the Zero-adjusted Inverse
Gaussian (ZAIG) Distribution
Adriana Bruscato Bortoluzzo
[email protected]
Danny Pimentel CLaro
Marco Antonio Leonel Caetano
Rinaldo Artes
This article aims at the estimation of insurance claims from an auto data set. Using a ZAIG method, we
identify factors that influence claim size and probability, and compared the results with the analysis of a
Tweedie method. Results show that ZAIG can accurately predict claim size and probability. Factors like
territory, vehicles´ advanced age, origin and body influence distinctly claim size and probability. The
distinct impact is not always present in Tweedie’s estimated model. Auto insurers should consider
estimating risk premium using ZAIG method. The fitted models may be useful to develop a strategy for
premium pricing.
Keywords: ZAIG method, Tweedie method, Premium pricing, Insurance
1. Introduction
There is a well-known problem in the insurance industry that is the proper pricing of an
insurance policy. The pure premium of an insurance company to an insured individual is made up of two
components: claim probability and expected claim size. The claim probability for any individual is
related to the expected number of claims occurring in a given year. The claim size is simply the dollar
cost associated with each claim. It is extensively reported in literature the difficulty to estimate size and
probability of claims in the insurance industry (e.g. Jong and Heller, 2008). In the past, the main
difficulty was related to the credibility of the data sets of insurance companies (Weisberg and
Tomberlin, 1982). Insurance data sets were typically very large, of the order of ten of thousands up to
millions of cases. Problems arise such as missing values and inconsistent or invalid records. As current
information technology systems have become sophisticated every year, the processing of information is
more credible than ever before.
The challenge is then to employ a proper statistical technique to analyze insurance data. Claims
and risks have long been estimated as a pure algorithmic technique or a simple stochastic technique
(Wüthrich and Merz, 2008). These methods result in poor estimations. Jørgensen and de Souza (1994)
suggested a Poisson sum of Gamma random variables called Tweedie to estimate insurance risk.
According to Smyth and Jørgensen, another problem is that, based on the proposed models, no
estimation for claim sizes is possible. One problem with the Tweedie distribution model is that more
explanatory variables are likely to be found significant (Smyth and Jørgensen, 2002).
Recent literature realized that a zero-adjusted Inverse Gaussian (ZAIG) distribution may be
appropriate to estimate claim and risk in insurance data (Heller, Stasinopoulos and Rigby, 2006). A
mixed discrete-continuous model, with a probability mass at zero and an Inverse Gaussian continuous
component, appears to accurately estimate in extreme right skewness distribution. This suggests that
probabilities can be calculated from data sets with a large number of zero claims. The ZAIG model
explicitly specifies a logit-linear model for the occurrence of a claim (i.e. claim probability). When a
claim has occurred, the ZAIG model also specifies log-linear models for the mean claim size and the
dispersion of claim sizes. It is important to measure the probability and size of claims separately because
it is possible for the probability to depend on a set of independent variables different from those that
influence claim size. Therefore, ZAIG estimation appears to be more appropriate to estimate price of
insurance policies.
Once a method of estimation is defined, the challenge is to identify potential explanatory
variables. Typically, policy holders are divided into discrete classes on the basis of certain measurable
characteristics predictive of their propensity to generate losses. We evaluate claims by considering
variables related to vehicles that are frequently used in literature. In addition to territory, claims have
also been studied according to the brand manufacturer and vehicle’s characteristics: age, body and
origin. Following previous research all of these variables must become part of the estimation.
Our objective is to present the ZAIG method of estimation to estimate probability claims and
expected claim size in the insurance industry and to formally test the results with the estimation of a
Tweedie regression model using insurance data set. An insurance data was collected to analyze the
impact of factors estimated by Tweedie and ZAIG methods.
This work is divided in six sections. In the next section, the theoretical background is discussed
based on previous research on claim insurance estimation. Also the ZAIG method is presented in the
section. The third section discusses the methodology and the data set. The fourth section presents the
analysis of results and the comparison of findings of the two methods, ZAIG and Tweedie. Finally,
concluding remarks highlight the major contributions of our study.
2. Theoretical Background
Insurance: Importance of predictions and predictors
Insurance companies are constantly looking for ways to better predict claims. Overall, insurance
involves the sum of a large number of individual risks from which very few insurance claims will be
reported. Meulbroek (2001) argues that insurance companies need to treat risk management as a series
of related factors and events. Boland (2007) suggests that, in order to handle claims arising from
incidents that already occurred; insurers must employ methods of prediction to deal with the extent of its
liability. Therefore, an insurance company has to find ways to predict claims and appropriately charge a
premium to take responsibility for the risk.
The prediction problem has to be considered in the light of competitive market insurance
(Weisberg, Tomberlin and Chatterjee, 1984). It is possible for an insurer to benefit at least temporarily
by identifying segments of the market that are currently being overcharged and offering coverage at
lower rates or by avoiding segments that are being undercharged (Doherty, 1981). Regulators are
usually concerned about the possibility of rate structures that severely penalize individuals with some
characteristics (e.g. place they live, model of vehicle). Therefore, insurers look for better ways to
capture the characteristics of individuals that affect claim size and probability, and consequently
discriminate among insurer drivers that have high propensity to generate losses.
Insurance companies attempt to estimate reasonably price insurance policies based on the losses
reported in certain kinds of policy-holders. The estimation has to consider past data in order to grasp
trends that occurred (Weisberg and Tomberlin, 1982). Information available to predict the price for a
period in the future usually consists of the claim experience for a population or a large sample from the
population over a period in the past. Precise estimation may consider a large number of exposures in a
data set and a stable claim generation process over time.
The predictors to estimate price insurance policy were selected from the automobile industry. In
our study, we consider the issue of price prediction in the context of automobile industry because the
most sophisticated proposals have been developed in this industry (Jong and Heller, 2008). Previous
research in the automobile setting includes predictors such as territory (e.g. Chang and Fairley, 1979)
and brand manufacturer (e.g. Heller, Stasinopoulos and Rigby, 2006). Weisberg, Tomberlin and
Chatterjee (1984) suggest including variables associated with the status of the vehicle, such as age,
body, origin.
Previous studies have recognized the use of Tweedie method of estimating auto insurance claims
(Smyth and Jørgensen, 2002) and recent studies have shown that ZAIG method may produce accurate
models of estimation (Heller, Stasinopoulos and Rigby, 2006). In order to estimate, it is necessary to let
yi be the amount expended with claims for client i and xi be a vector of independent variables related to
this client. One may represent the variable yi as
 0, with probability (1 - π i )
yi = 
,
Wi , with probability π i
where Wi is a positive right skewed distribution. This type of variable belongs to the class of the zero
inflated probability distributions (e.g. Gan, 2000). The parameter π i is the claim probability and Wi
represents the claim size, related to client i.
It is important to note that a claim is, in general, a rare event. A small proportion of claims in a
sample may lead to problems in predicting claim occurrence by a logistic model because, in this case,
the predicted probabilities tend to be small. King and Zeng (2001) proposed a correction to be used in
these situations. They used the fact that, in the presence of rare events, the independent variable
coefficients are consistent, but the intercept may not be.
Tweedie regression models
A Tweedie distribution (Jørgensen, 1987, 1997) is a member of the class of exponential dispersion
models. It is defined as a distribution of the exponential family (e.g. Jong and Heller, 2008) with mean µ
and variance φ µ p; as in Jørgensen and de Souza (1994) and Smyth and Jørgensen (2002), it will be
considered, in this work, the case 1<p<2. It is possible to write
(i
 0, if ( i = 0
yi = 
, Wi = ∑ X ij ,
j =i
Wi , if ( i > 0
(1)
where (i is a Poisson random variable that represents the number of claims occurred for the client i and
X i1 , L , X i(i are independent identically distributed Gamma random variables (continuous variable). As
a consequence Wi also follows a Gamma distribution, which has a positive and right skewed density
probability function. In this work, it was used a log-linear Tweedie regression model, given by
µ i = ex γ ,
T
i
where xi is a matrix of independent variables and γ is the parametric vector.
ZAIG regression model
The variable yi follows a ZAIG distribution (Heller, Stasinopoulos and Rigby, 2006) if Wi is a
Gaussian inverse random variable. The Gaussian inverse is a positive and highly skewed distribution
with two parameters: the mean (µi) and a dispersion parameter (λi). It may be proved that
E ( yi ) = π i µ i
and
(
)
Var ( yi ) = π i µ i2 1 − π i + µ i λ i2 .
In the context of this work, µi is the expected claim size and λi is a parameter related to claim
size dispersion. It is possible to propose regression models for πi, µi and λi as
π i = h 1(xiT β ), µ i = h 2 (ziT γ ) , λ i = h 3(wiT δ ),
where h1, h2 and h2 are continuous twice differentiable invertible functions, β, γ and δ are parametric
vectors and xi, zi and wi are vector of independent variables for client i.
In an insurance context, it is highly convenient to use different sets of independent variables to model
these three parameters. Consider, for instance, a variable that indicates the local of the residence of the
car´s owner. It is well known that robbery rates vary in a city, but the price of a vehicle do not. It is
expected that the local of residence will be important in explaining πi, but not µi, then one may include
the variable in the probability model but not in the expected claim size model. This example illustrates
Heller, Stasinopoulos and Rigby (2006) statement “… A problem with the Tweedie distribution model is
that the probabilities at zero can not modeled explicit as a function of explanatory variables…”.
The following models are adjusted:
e xi β
T
πi =
1+ e
xTi β
, µ i = e xi γ and λ i = e xi δ .
T
T
(2)
In short this modeling option assures that µi and λi are, as expected, always positive and that πi is
modeled as a logistic regression. It is important to remember the bad performance of logistic models in
predicting claims, when the frequency of claims in the sample is small.
3. Methodology
A sample was collected from a major automobile insurance company resulting in a data set of
33,257 passenger vehicle records. The data set was processed to clean for missing values and selection
of relevant variables. We focus the analysis on yearly claims that refer to robberies or accidents with
repair costs greater than the vehicles value. Claim size refers to the amount of dollars paid as liability of
a claim. Claim probability refers to the percentage of claims of a year period. The average annual claim
probability is 1.15%, and the average claim size is $21,002.30. For every insurance policy holder,
twenty explanatory variables were employed. The variables are coded by means of binary variables,
described in Table 1. Table 2 shows the descriptive statistics of the variables.
Table 1. List of explanatory variables
Variable
Description
Vehicle’s Age (II)
Equals 1 if the insured vehicle has one or two years (related to contract’s
year), 0 otherwise
Vehicle’s Age (III)
Equals 1 if the insured vehicle has three or four years (related to contract’s
year), 0 otherwise
Vehicle’s Age (IV)
Equals 1 if the insured vehicle has five or six years (related to contract’s year),
0 otherwise
Vehicle’s Age (V)
Equals 1 if the insured vehicle has seven to nine years (related to contract’s
year), 0 otherwise
Vehicle’s Age (VI)
Equals 1 if the insured vehicle has ten or more years (related to contract’s
year), 0 otherwise
Origin
Equals 1 if the insured vehicle is imported and 0 if it is national
Model/Manufacturer
A combination of different models and manufacturers in the data set. Groups
were assigned on the basis of a CHAID analysis.
Territory
Clusters of territory were assigned on the basis of Hierarchical Cluster
Analysis Method for claim size. It represents a division of the area into a set
of exclusive areas thought to be relatively homogeneous in terms of claims.
Vehicles are assigned to territories according to where they are usually
garaged.
Vehicle Body (II)
Equals 1 for midsize vehicle, 0 otherwise
Vehicle Body (III)
Equals 1 for luxury vehicle, 0 otherwise
Intercept
Vehicle with zero years (related to contract’s year - Vehicle’s Age (I)),
national, Model/Manufacturer (I), Territory (I) and small vehicle (Vehicle
Body (I))
Table 2. Descriptive statistics for claim size and for proportion of claims
Claim size
Proportion
Variable
Value
Total sample
Claim size>0
2
of claim
Mean
SD
Mean
SD
I
0.94%
306.50
3,511.06
32,645.59
16,132.40
II
1.06%
244.61
2,703.39
22,974.27
12,868.49 10,707
III
1.12%
210.90
2,181.39
18,766.88
8,726.57
6,674
IV
1.07%
157.03
1,672.73
14,660.81
7,063.96
3,361
V
2.08%
221.47
1,624.80
10,672.59
3,990.81
2,554
VI
2.17%
199.84
1,482.98
9,210.65
4,374.82
1,014
National
1.19%
251.06
2,735.15
21,049.92
13,781.40 30,771
Imported
0.68%
136.59
1,849.15
19,974.17
10,491.21
Model/
I
1.68%
330.31
2,883.55
19,676.86
10,732.91 11,676
Manufacturer
II
1.27%
627.34
6,009.68
49,380.62
21,589.70
Vehicle’s Age
Origin
8947
2,486
1,102
Claim size
Proportion
Variable
Value
Total sample
Claim size>0
2
of claim
Mean
Territory
Vehicle Body
SD
Mean
SD
III
0.37%
73.19
1,313.43
19,644.96
9,559.60
1,879
IV
1.07%
152.81
1,646.32
14,244.86
7,255.25
6,432
V
2.67%
416.29
3,139.55
15,594.96
11,948.36
487
VI
0.81%
260.14
3,607.39
32,256.95
26,900.88
620
VII
0.52%
91.11
1,352.48
17,497.01
7,007.88
3,841
VIII
0.72%
147.18
2,177.97
20,401.90
16,009.04
2,911
IX
1.35%
424.82
4,065.73
31,547.34
16,003.95
1,708
X
0.62%
203.19
2,686.77
33,030.77
9,728.11
2,601
I
7.69%
1,608.42
5,685.66
20,909.50
1,007.63
26
II
2.07%
502.80
4,213.33
24,259.88
16,871.42
2,895
III
0.45%
55.39
828.71
12,406.50
716.30
IV
1.07%
218.91
2,494.36
20,445.81
12,964.14 29,888
Small
1.20%
183.95
1,854.85
15,345.89
7,390.55 15,517
Midsize
1.44%
368.18
3,537.86
25,586.44
15,029.12 11,397
Luxury
0.54%
159.92
2,586.33
29,834.43
19,322.83
1.15%
242.50
2,679.22
21,002.30
13,643.46 33,257
448
6,343
Complete
sample
4. Results and Discussion
ZAIG model was estimated by the package GAMLSS (Stasinopoulos and Rigby, 2007;
Stasinopoulos, Rigby and Akantziliotou, 2008) for R system (R Development Core Team, 2008).
Tweedie model was estimated in SPSS package (version 16.0). Table 1 presents the results of our
estimates. The Vehicle’s age, Model/Manufacturer, territory and Vehicle Body served as the
independent variables. The dependent variable is the amount expended with claims that refers to a
robbery or an accident with repair costs greater than the vehicles value. Coefficients t-statistic and
standard error for the estimated models are reported.
Table 3. Results
ZAIG
Tweedie
Variable
Equation 2: ν=1-π
π
Equation 3: µ
(Claim probability)
(Claim size)
Equation 1
Estimate t value Estimate
Intercept
7.61**
11.08
(.69)
Vehicle’s
-0.05
Age (II)
(.16)
Vehicle’s
-0.19
Age (III)
(.18)
Vehicle’s
-0.43
Age (IV)
(.23)
Vehicle’s
-0.16
Age (V)
(.21)
Vehicle’s
-0.22
Age (VI)
(.30)
Model/Manu-
0.56
facturer (II)
(.31)
Model/Manu-
-1.26**
facturer (III)
(.44)
Model/Manu-
-0.71**
facturer (IV)
(.16)
Model/Manu-
0.47
facturer (V)
(.42)
Model/Manu-
-0.45
-2.35**
t value
Estimate
t value
Estimate
t value
-3.11
9.91**
16.16
-0.55**
-58.90
(.76)
-0.30
0.14
(.61)
0.93
(.15)
-1.09
0.18
0.16
1.08
0.79**
0.80
0.77**
4.31
-0.12
3.13
-1.08**
-0.40
-0.52**
-2.74
0.90**
-3.65
-0.76
-10.11
-1.24**
0.51**
-0.04
-0.08
2.86
0.06
-11.50
0.25
-0.06
-0.32*
-0.56**
(.17)
2.66
-0.20
-1.25
0.41
(. 14)
-1.65
25.70
-0.39
-2.54
(.12)
(. 07)
(.31)
-0.83
-1.07**
2.93**
(.14)
(.19)
(.14)
1.13
-6.62
(.19)
(.39)
-4.42
-0.81**
-2.11
(.11)
(.11)
(.29)
-2.84
-0.44
(.11)
(.25)
1.79
-0.53
-0.22*
(.10)
(.12)
(.18)
-0.73
-2.86
(1.19)
(.20)
-0.78
-0.30**
(.09)
(.11)
(.16)
-1.91
Equation 4: λ
0.48
-3.27
ZAIG
Tweedie
Variable
Equation 2: ν=1-π
π
Equation 3: µ
(Claim probability)
(Claim size)
Equation 1
Estimate t value Estimate
facturer (VI)
(.55)
Model/Manu-
-1.23**
facturer (VII)
(.26)
Model/Manu-
-0.78**
Facturer (VIII)
(.28)
Model/Manu-
0.05
Facturer (IX)
(0.26)
Model/Manu-
-0.69*
Facturer (X)
(.29)
Origin
-0.19
(.46)
-4.82
-1.10
(II)
(.69)
Territory
-3.31**
(III)
(.98)
Territory
-1.91**
(IV)
(.68)
Vehicle Body
0.44**
(II)
(.14)
Vehicle Body
-0.36
(II)
(.24)
Scale
104.51
-1.31**
-2.77
-0.82
-5.56
-0.19
-3.53
-1.04**
-0.80
-0.08
-3.85
-1.32
-0.28
-2.90**
-1.75
-1.90**
-2.82
0.11
-2.55
-1.02**
(.22)
0.22
4.50
1.31
0.33
1.55
0.43**
3.58
0.25
0.40
-0.23
-0.31
0.20
0.32
(.62)
0.86
(.12)
-1.51
0.35**
1.45
(.73)
(.75)
3.19
0.15
(.62)
(1.03)
-2.81
1.63
(.12)
(.75)
-3.38
0.19
(.22)
(.28)
-1.58
t value
(.17)
(.27)
-0.54
Estimate
(.10)
(.23)
-2.41
t value
(.12)
(.23)
0.20
Estimate
(.52)
(.24)
(.34)
Territory
t value
Equation 4: λ
0.18*
2.50
(.07)
-4.63
0.46**
(.07)
(.08)
6.55
-0.68**
-5.06
(.13)
*p<0.05; **p<0.01. Regression coefficients are standardized coefficients (β) and standard error within
parentheses.
Several explanatory variables were significantly related with dependent variables. Considering
all vehicle´s age variables, we can say that there is a significant increase in the expected claim
probability as the vehicle becomes older (0.79 and 0.77). On the other hand, the expected claim size
decreases for older vehicles. This is in line with intuition, because old vehicles are less expensive to
replace and there is also the fact that old vehicles are more attractive to be stolen. One might suggest
that old vehicles are more attractive because there is a great auto-part replacement market that is flooded
with parts of these old cars and old cars tend to be poorly maintained, which increases the probability of
accidents.
The variable model/manufacturer is related to claims probability and size. In general, the
model/manufacturer is more related to claim probability than claim size. Noteworthy, there is no way to
clearly identify whether claim size or probability is causing the significance of Tweedie´s coefficients.
The variable for origin of vehicles influences the claim size (Equation 3). This suggests, ceteris
paribus, that imported vehicles tend to be related to great claim size. Overall, imported vehicles are of
high purchase value.
The variable territory is generally related to claim probability, however not related to claim size.
One might suggest that claim size depends solely on the vehicle’s value and not on the territory of the
claim. Based only on Tweedie model, there is no way to clearly identify whether claim size or
probability is causing the significance of Tweedie´s coefficients for the territory variable.
Vehicles’ body is related to claim size and probability. The claim probably decreases for luxury
vehicles. However luxury cars lead to higher claim size followed by midsize cars. When taking
Tweedie´s results, the difficulty to accurately predict claims is shown in the non significance of
Tweedie’s coefficient for luxury cars (i.e. vehicle body 3). One might suggest that the not significant
coefficient is due to a negative claim probability effect and a positive claim size effect, as found in
ZAIG’s coefficients.
5. Concluding Remarks
In this work we tackled a well-known problem in the insurance industry that is the proper pricing
of an insurance policy. Employing the ZAIG method of estimation for claims and risks in the insurance
industry, we found distinct factors that influence claim size and probability. Factors like territory,
vehicles´ advanced age, origin and body influence distinctly claim size and probability. The distinct
impact is not always present in Tweedie’s estimated model. The ZAIG method of estimation also allows
for insurance companies to create a score system to predict claims, based on the logistic model. This
score system intends to identify police holders who tend to be more risky. The estimated models may be
employed to develop a strategy for premium pricing.
Some limitations of this study may be recognized. First, the methods require a high
computational effort which might preclude the use of large data sets. Second, there is room for
developing suitable methods for longitudinal data analysis. Future work may consider the use of
estimating equations techniques or multivariate ZAIG distributions. We concentrated our research on
the auto insurance industry and specific vehicles’ variables. Studies may address other insurance
industries and include customer related variables.
6. References
[1] Boland, Philip J. Statistical and probabilistic methods in actuarial science. Boca Raton: Chapman
& Hall/CRC, 2007. 351 p.
[2] Chang, L. and Fairley W. B. Pricing Automobile Insurance under Multivariate Classification of
Risks: Additive versus Multiplicative. The Journal of Risk and Insurance. Vol. 46, (2), 75-98,
1979.
[3] Doherty, N. A., Is Rate Classification Profitable? The Journal of Risk and Insurance, Vol. 48,
(2), 286-295, 1981.
[4] Gan, N., General zero-inflated models and their applications. North Carolina State University,
2000. Phd Thesis.
[5] Heller, G., Stasinopoulos, M. and Rigby, B. The zero-adjusted inverse Gaussian distribution as a
model for insurance claims. Proceedings of the 21th International Workshop on Statistical
Modelling. J. Hinde, J. Einbeck, and J. Newell (Eds.). 226-233, Ireland, Galway, 2006. Available
in http://studweb.north.londonmet.ac.uk/~stasinom/papers. Accessed on 10/27/2008.
[6] Jong, Piet de; Heller, Gillian Z. Generalized linear models for insurance data. Cambridge:
Cambridge University Press, 2008. 196 p.
[7] Jørgensen, B. Exponential dispersion models. Journal of the Royall Statistical Society, ser. B.,
49, 127-162, 1987.
[8] Jørgensen, B. Theory of Dispersion Models. London: Chapman & Hall, 1997.
[9] Jørgensen, B and De Souza, M. C. P. Fitting Tweedie´s compound Poisson model to insurance
claims data. Scandinavian Actuarial Journal, 69-93, 1994.
[10]
King, G. and Zeng, L. Logistic regression in rare events data. Political Analysis, 9, 137-
163, 2001.
[11]
Meulbroek, L. A Better Way to Manage Risk. Harvard Business Review, 22-23, 2001.
[12]
R Development Core Team. R: A Language and Environment for Statistical Computing,
2008. Available in http://www.R-project.org. Accessed on 12/01/2008.
[13]
Smyth, G. K. and Jørgensen, B. Fitting tweedie’s compound poisson model to insurance
claims data: dispersion modeling. ASTI( Bulletin 32: 143-157, 2002.
[14]
Stasinopoulos, D. M. and Rigby, R. A. Generalized additive models for location scale and
shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, (7), 2007.
[15]
Stasinopoulos, M., Rigby, B. and Akantziliotou, C. gamlss: Generalized Additive Models
for Location Scale and Shape, 2008. Available in http://www.gamlss.com/. Accessed on
12/05/08.
[16]
Weisberg, H. I.; Tomberlin, T. J. and Chartterjee, S. Predicting Insurance Losses under
Cross-Classification: A Comparison of Alternative Approaches. Journal of Business &
Economic Statistics. Vol. 2, (2)m 170-178, 1984.
[17]
Weisberg, H. I. and Tomberlin, T. J. A Statistical Perspective on Actuarial Methods for
Estimating Pure Premiums from Cross-Classified Data. The Journal of Risk and Insurance, Vol.
49, (4), 539-563, 1982.
[18]
Wüthrich, M. V.; Merz. M. Stochastic claims reserving methods in insurance. West
Sussex: John Wiley & Sons, 2008. 424 p.