Download Fitting a Loss Distribution: Household Theft Sample Data

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Foundations of statistics wikipedia , lookup

Taylor's law wikipedia , lookup

Bootstrapping (statistics) wikipedia , lookup

History of statistics wikipedia , lookup

Student's t-test wikipedia , lookup

Statistical inference wikipedia , lookup

Misuse of statistics wikipedia , lookup

Resampling (statistics) wikipedia , lookup

Time series wikipedia , lookup

Transcript
Example: Fitting a Loss Distribution to Claims Data1
The following example lists the claims amounts (in dollars) for 120 theft claims made to a
household insurance portfolio. The second table gives the sample statistics for the claims
amounts.
Table 1: Amounts of 120 Theft Claims Made in a Household Insurance Portfolio
3
130
233
412
716
877
1223
1436
11
138
237
423
734
942
1283
1470
27
139
254
436
743
942
1288
1512
36
140
257
456
756
945
1296
1607
47
143
259
473
784
998
1310
1699
49
153
265
475
786
1029
1320
1720
54
193
273
503
819
1066
1367
1772
77
195
275
510
826
1101
1369
1780
78
205
278
534
841
1128
1373
1858
85
207
281
565
842
1167
1382
1922
104
216
396
656
853
1194
1383
2042
121
224
405
656
860
1209
1395
2247
Table 2: Sample Statistics, Theft Data
Mean
2020.291667
Median
868.50
Standard Deviation
3949.85736
Minimum
3.00
Maximum
32043.00
Count
120
2348
2377
2418
2795
2964
3156
3858
3872
4084
4620
4901
5021
A histogram of the data is shown below (with the largest three data values excluded).
Frequency
Histogram
50
45
40
35
30
25
20
15
10
5
0
Bin
5331
5771
6240
6385
7089
7482
8059
8079
8316
11453
22274
32043
We may attempt to fit the data to each of several conjectured loss distributions, and determine
which type of loss distribution best fits the data, using the Kolmogorov distance or the Cramervon Mises statistic.
If we want to fit an exponential loss distribution, the MLE for the mean of the distribution is πœ‡Μ‚ =
π‘₯Μ… = $2020.291667. The Kolmogorov-Smirnov test of fit to an exponential loss distribution
yields D = 0.2013, with p-value = 0.0001192.
If we want to fit a Pareto II loss distribution, the likelihood equations are (with n = 120):
πœ•π‘™
πœ•π›Ό
𝑛
πœ•π‘™
= 𝛼 + 𝑛𝑙𝑛(𝛽) βˆ’ βˆ‘π‘›π‘–=1 𝑙𝑛(𝛽 + π‘₯𝑖 ) ≑ 0, and πœ•π›½ =
𝑛𝛼
𝛽
1
βˆ’ (𝛼 + 1) βˆ‘π‘›π‘–=1 𝛽+π‘₯ ≑ 0.
𝑖
These equations may be solved by numeric methods, yielding 𝛼̂ = 1.88047 and
𝛽̂ = 1872.131756.
For these parameter values, the graph of the Pareto II p.d.f. is shown below for loss values
between $0.00 and $8316.00 (the 4th largest value in the data set).
The Kolmogorov-Smirnov test of fit to a Pareto II distribution yields D = 0.0561, p-value =
0.8443. The Pareto II distribution provides a better fit than the exponential distribution.
1
P. J. Boland, Statistical and Probabilistic Methods in Actuarial Science, Chapman & Hall/CRC, 2007.