
Page 1: Bayesian Estimation and Credibilityhome.cc.umanitoba.ca/~farhadi/ASPER/Bayesian Estimation and Credibility.pdf · two claims with probability 0.1. Bad drivers make up the other 25%

Bayesian Estimation and Credibility

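$$\pi_{\Theta|X}(\theta|x) = \frac{f_{X|\Theta}(x|\theta)\,\pi(\theta)}{\int f_{X|\Theta}(x|\theta)\,\pi(\theta)\,d\theta}$$

$$f_{Y|X}(y|x) = \frac{\int f_{Y|\Theta}(y|\theta)\,f_{X|\Theta}(x|\theta)\,\pi(\theta)\,d\theta}{\int f_{X|\Theta}(x|\theta)\,\pi(\theta)\,d\theta}$$

$$f_{Y|X}(y|x) = \int f_{Y|\Theta}(y|\theta)\,\pi_{\Theta|X}(\theta|x)\,d\theta$$

$$P(Y \in A\,|\,x) = \frac{\int P(Y \in A\,|\,\theta)\,f_{X|\Theta}(x|\theta)\,\pi(\theta)\,d\theta}{\int f_{X|\Theta}(x|\theta)\,\pi(\theta)\,d\theta}$$

$$P(Y \in A\,|\,x) = \int P(Y \in A\,|\,\theta)\,\pi_{\Theta|X}(\theta|x)\,d\theta$$

$$E(Y|x) = \int E(Y|\theta)\,\pi_{\Theta|X}(\theta|x)\,d\theta$$

$$P(Y \in A\,|\,s(x)) = \frac{\int P(Y \in A\,|\,\theta)\,f_{S|\Theta}(s|\theta)\,\pi(\theta)\,d\theta}{\int f_{S|\Theta}(s|\theta)\,\pi(\theta)\,d\theta}$$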


Example. Let X be a random variable Bernoulli(θ). Suppose that θ ∼ Beta(a , b). Calculate the

posterior pdf of θ .

Solution.

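assumptions:

$$\pi(\theta) = \frac{\theta^{a-1}(1-\theta)^{b-1}}{B(a,b)}, \qquad 0 < \theta < 1$$

$$f_{X|\Theta}(x|\theta) = \theta^{x}(1-\theta)^{1-x}, \qquad x = 0, 1$$

joint density:

$$f_{X,\Theta}(x,\theta) = f_{X|\Theta}(x|\theta)\,\pi(\theta) = \frac{\theta^{a+x-1}(1-\theta)^{b-x}}{B(a,b)}$$

marginal density (by the beta integral formula):

$$f_X(x) = \int_0^1 f_{X,\Theta}(x,\theta)\,d\theta = \int_0^1 \frac{\theta^{a+x-1}(1-\theta)^{b-x}}{B(a,b)}\,d\theta = \frac{B(a+x,\ b-x+1)}{B(a,b)}$$

posterior density:

$$f_{\Theta|X}(\theta|x) = \frac{f_{X,\Theta}(x,\theta)}{f_X(x)} = \frac{\theta^{a+x-1}(1-\theta)^{b-x}}{B(a+x,\ b-x+1)} \sim \text{Beta}(a+x,\ b-x+1)$$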

Note. The calculations in a Bayesian study are normally complicated, because they require the posterior distribution, which is usually difficult to compute. However, for some classes of priors, the posterior distribution belongs to the same class and only the hyperparameters change. In such cases, the posterior pdf is computed with ease.

Definition. A prior pdf $f(\theta|\lambda)$ is said to be conjugate to the likelihood function $f(x|\theta)$ if the posterior pdf is of the form $f(\theta|\tilde\lambda)$, that is, it has the same functional form but a different hyperparameter $\tilde\lambda$.

Example. The previous example was based on one individual observation. Now calculate the posterior

pdf if there are n observations {X1 , ... , Xn}.

Solution. Let us denote {X1 , ... , Xn} by the bold X and denote {x1 , ... , xn} by the bold x. Then


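$$f_{\mathbf{X}|\Theta}(\mathbf{x}|\theta) = \prod_{i=1}^{n} \theta^{x_i}(1-\theta)^{1-x_i} = \theta^{\sum x_i}(1-\theta)^{n-\sum x_i}$$

Writing $n\bar{x} = \sum x_i$:

$$f_{\mathbf{X},\Theta}(\mathbf{x},\theta) = f_{\mathbf{X}|\Theta}(\mathbf{x}|\theta)\,\pi(\theta) = \left\{\theta^{\sum x_i}(1-\theta)^{n-\sum x_i}\right\}\left\{\frac{\theta^{a-1}(1-\theta)^{b-1}}{B(a,b)}\right\} = c\,\theta^{(a+n\bar{x})-1}(1-\theta)^{(b+n-n\bar{x})-1}, \qquad c \text{ being a constant}$$

$$f_{\mathbf{X}}(\mathbf{x}) = \int_0^1 f_{\mathbf{X},\Theta}(\mathbf{x},\theta)\,d\theta = c\int_0^1 \theta^{(a+n\bar{x})-1}(1-\theta)^{(b+n-n\bar{x})-1}\,d\theta = c\,B(a+n\bar{x},\ b+n-n\bar{x})$$

$$f_{\Theta|\mathbf{X}}(\theta|\mathbf{x}) = \frac{f_{\mathbf{X},\Theta}(\mathbf{x},\theta)}{f_{\mathbf{X}}(\mathbf{x})} = \frac{\theta^{(a+n\bar{x})-1}(1-\theta)^{(b+n-n\bar{x})-1}}{B(a+n\bar{x},\ b+n-n\bar{x})} \sim \text{Beta}(a+n\bar{x},\ b+n-n\bar{x})$$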

Corollary. Beta-Bernoulli (prior-likelihood) is a conjugate distribution.
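The Beta-Bernoulli update above amounts to adding the number of successes to a and the number of failures to b. The following is a minimal sketch of that rule; the function name `beta_bernoulli_posterior` and the sample data are illustrative assumptions, not part of the notes.

```python
def beta_bernoulli_posterior(a, b, xs):
    """Return the posterior Beta parameters (a + sum x_i, b + n - sum x_i).

    a, b : parameters of the Beta(a, b) prior on theta
    xs   : observed Bernoulli outcomes, each 0 or 1
    """
    if any(x not in (0, 1) for x in xs):
        raise ValueError("Bernoulli observations must be 0 or 1")
    n = len(xs)
    successes = sum(xs)  # this is n * x-bar in the notation of the notes
    return a + successes, b + n - successes


# Hypothetical data: prior Beta(2, 3) and four observations.
posterior = beta_bernoulli_posterior(2, 3, [1, 0, 1, 1])
print(posterior)
```

With a single observation x the same function returns (a + x, b - x + 1), which agrees with the first example.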

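Important Note. It is always true that

$$f_{\Theta|\mathbf{X}}(\theta|\mathbf{x}) = c\, f_{\mathbf{X}|\Theta}(\mathbf{x}|\theta)\,\pi(\theta)$$

where c does not depend on θ, so there is no actual need to calculate the marginal pdf of X in order to find the functional form of the posterior distribution.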

Example. Let X ∼ Binomial(m , θ) where m is fixed and θ ∼ Beta(a,b). Let {x1 , ... , xn} be observed.

Find the posterior distribution of θ .

Solution.

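$$f_{\Theta|\mathbf{X}}(\theta|\mathbf{x}) = c\,f_{\mathbf{X}|\Theta}(\mathbf{x}|\theta)\,\pi(\theta) = c\left\{\theta^{\sum x_i}(1-\theta)^{\sum (m-x_i)}\right\}\left\{\theta^{a-1}(1-\theta)^{b-1}\right\} = c\,\theta^{(a+n\bar{x})-1}(1-\theta)^{(b+mn-n\bar{x})-1}$$

⇒ posterior distribution ∼ Beta$(a+n\bar{x},\ b+mn-n\bar{x})$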

Corollary. Beta-Binomial (prior-likelihood) is a conjugate distribution.


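Example (Gamma-Poisson). Suppose {X1 , ... , Xn} is an i.i.d. sample from Poisson(λ) where λ ∼ Gamma(α, β), with β a scale parameter so that E(λ) = αβ. Then the posterior distribution of λ is Gamma$(\tilde\alpha, \tilde\beta)$ where

$$\tilde\alpha = \alpha + n\bar{x}, \qquad \tilde\beta = \frac{\beta}{1+\beta n}$$

Example (Gamma-Exponential). Suppose {X1 , ... , Xn} is an i.i.d. sample from Exponential(λ), with density $\lambda e^{-\lambda x}$, where λ ∼ Gamma(α, β). Then the posterior distribution of λ is Gamma$(\tilde\alpha, \tilde\beta)$ where

$$\tilde\alpha = \alpha + n, \qquad \tilde\beta = \frac{\beta}{1+\beta n\bar{x}}$$

Example (Beta-Geometric). Suppose {X1 , ... , Xn} is an i.i.d. sample from Geometric(β), with pmf $\beta(1-\beta)^x$ for x = 0, 1, 2, ..., where β ∼ Beta(a, b). Then the posterior distribution of β is Beta$(\tilde a, \tilde b)$ where

$$\tilde a = a + n, \qquad \tilde b = b + n\bar{x}$$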


Bayesian Estimation

Suppose that a sample x = {x1 , ... , xn} has been collected, and we want to estimate $\mu_{n+1}(\theta) = E(X_{n+1}|\theta)$ after x has been observed. This value is a function g(θ) of θ (for example, it might be θ²), and we may estimate it with a point estimator, just as we did in the non-Bayesian case, where for example we used the MLE. A point estimator, as in the case of the MLE, is a function $\hat\theta(x_1, \ldots, x_n)$ of the observed data set; for example, the MLE for an exponential population was $\bar{x} = \frac{x_1+\cdots+x_n}{n}$. Similarly, in the Bayesian approach to point estimation, we take some function $\hat\theta(x_1, \ldots, x_n)$ of the observed data set as the point estimator. In the MLE case, we maximized the likelihood function (or equivalently, the log-likelihood function) to find this estimator, but in Bayesian estimation we have the squared-error loss function

$$L(\hat\theta, g(\theta)) = \left(\hat\theta - g(\theta)\right)^2$$

and we minimize the expected loss

$$E\left[\left(\hat\theta - g(\theta)\right)^2 \,\Big|\, \mathbf{x}\right] = \int \left(\hat\theta - g(\theta)\right)^2 f_{\Theta|\mathbf{X}}(\theta|\mathbf{x})\,d\theta$$

Note that the value $\hat\theta = \hat\theta(x_1, \ldots, x_n)$ is a function of x1 , ... , xn (such as $\bar{x}$) and therefore does not depend on θ, hence we can simplify the expected loss as follows:

$$\int \left(\hat\theta - g(\theta)\right)^2 f_{\Theta|\mathbf{X}}(\theta|\mathbf{x})\,d\theta = \int \left\{g(\theta)^2 - 2\hat\theta\, g(\theta) + \hat\theta^2\right\} f_{\Theta|\mathbf{X}}(\theta|\mathbf{x})\,d\theta$$

$$= \left\{\int g(\theta)^2 f_{\Theta|\mathbf{X}}(\theta|\mathbf{x})\,d\theta\right\} - 2\hat\theta\left\{\int g(\theta) f_{\Theta|\mathbf{X}}(\theta|\mathbf{x})\,d\theta\right\} + \hat\theta^2$$

But we know from algebra that a quadratic $t^2 - 2bt + c$ is minimized if and only if t = b (just differentiate with respect to t and set the derivative equal to zero). Therefore the expected loss is minimized if and only if we have (viewing $\hat\theta$ as the variable of interest)

$$\hat\theta = \int g(\theta)\, f_{\Theta|\mathbf{X}}(\theta|\mathbf{x})\,d\theta = \int E(X_{n+1}|\theta)\, f_{\Theta|\mathbf{X}}(\theta|\mathbf{x})\,d\theta = E\left[X_{n+1} \,|\, \mathbf{x}\right]$$


Definition. The Bayesian point estimate $E[X_{n+1} \,|\, \mathbf{x}]$ is called the Bayesian Premium.

Example. Let the model distribution for a single observation be Bernoulli(θ), and let θ ∼ Beta(a, b). If the data x = {x1 , ... , xn} has been observed, calculate the Bayesian premium $E(X_{n+1}|\mathbf{x})$.

Solution. We already learned that the posterior distribution of θ is

$$\text{Beta}(\tilde a, \tilde b), \qquad \tilde a = a + n\bar{x}, \quad \tilde b = b + n - n\bar{x}$$

On the other hand, since $X_{n+1}$ ∼ Bernoulli(θ), we have $E(X_{n+1}|\theta) = \theta$. Therefore the premium is the expected value of θ under the posterior Beta$(\tilde a, \tilde b)$ distribution:

$$E(X_{n+1} \,|\, \mathbf{x}) = \int E(X_{n+1}|\theta)\, f_{\Theta|\mathbf{X}}(\theta|\mathbf{x})\,d\theta = \int \theta\, f_{\Theta|\mathbf{X}}(\theta|\mathbf{x})\,d\theta = \frac{\tilde a}{\tilde a + \tilde b} = \frac{a + n\bar{x}}{a + b + n}$$

Example. Let the model distribution for a single observation be Binomial(2, θ), and let θ ∼ Beta(a, b). If the data x = {x1 , ... , xn} has been observed, calculate the Bayesian premium $E(X_{n+1}|\mathbf{x})$.

Solution. We already learned that the posterior distribution of θ is

$$\text{Beta}(\tilde a, \tilde b), \qquad \tilde a = a + n\bar{x}, \quad \tilde b = b + 2n - n\bar{x}$$

On the other hand, since $X_{n+1}$ ∼ Binomial(2, θ), we have $E(X_{n+1}|\theta) = 2\theta$. Therefore the premium is 2 times the expected value of θ under the posterior Beta$(\tilde a, \tilde b)$ distribution:

$$E(X_{n+1} \,|\, \mathbf{x}) = \int E(X_{n+1}|\theta)\, f_{\Theta|\mathbf{X}}(\theta|\mathbf{x})\,d\theta = 2\int \theta\, f_{\Theta|\mathbf{X}}(\theta|\mathbf{x})\,d\theta = \frac{2\tilde a}{\tilde a + \tilde b} = \frac{2a + 2n\bar{x}}{a + b + 2n}$$
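Both premiums above are instances of one formula: for a Binomial(m, θ) model with a Beta(a, b) prior, the Bayesian premium is m(a + n x̄)/(a + b + mn). The sketch below evaluates it with exact fractions; the function name `bayesian_premium_beta_binomial` and the sample data are illustrative assumptions.

```python
from fractions import Fraction


def bayesian_premium_beta_binomial(m, a, b, xs):
    """Bayesian premium E(X_{n+1} | x) for a Binomial(m, theta) model
    with a Beta(a, b) prior on theta; m = 1 gives the Bernoulli case."""
    n = len(xs)
    post_a = a + sum(xs)          # a + n * x-bar
    post_b = b + m * n - sum(xs)  # b + m*n - n * x-bar
    posterior_mean = Fraction(post_a, post_a + post_b)
    return m * posterior_mean


# Hypothetical data for illustration only.
print(bayesian_premium_beta_binomial(1, 1, 1, [1, 0, 1]))  # Bernoulli case
print(bayesian_premium_beta_binomial(2, 1, 1, [2, 1]))     # Binomial(2, theta) case
```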

Example (from the textbook). There are two types of driver. Good drivers make up 75% of the population and in one year have zero claims with probability 0.7, one claim with probability 0.2, and two claims with probability 0.1. Bad drivers make up the other 25% of the population and have zero, one, or two claims with probabilities 0.5, 0.3, and 0.2, respectively.

(i) Describe this process and how it relates to an unknown risk parameter.

(ii) For a particular policyholder, suppose we have observed x1 = 0 and x2 = 1. Determine the predictive distribution of X3|(X1 = 0 , X2 = 1) and the posterior distribution of θ|(X1 = 0 , X2 = 1).

(iii) Calculate the Bayesian premium E(X3|X1 = 0 , X2 = 1) in two ways.

Solution to part (i). When a driver buys this insurance, we do not know to which class (good driver or

bad driver) he/she belongs. Therefore, the risk parameter takes one of two values θ = G for good

drivers and θ = B for bad drivers. Corresponding to the above information, we have the following

table:

x    P(X = x | θ = G)    P(X = x | θ = B)
0    0.7                 0.5
1    0.2                 0.3
2    0.1                 0.2

θ    P(θ)
G    0.75
B    0.25

Claim Probabilities Given State

State (B or G)    Number of Claims
                  0      1      2      Total
G                 0.7    0.2    0.1    1
B                 0.5    0.3    0.2    1

Solution to part (ii).

$$f_{\mathbf{X}}(0,1) = \sum_{\theta} f_{X_1|\Theta}(0|\theta)\, f_{X_2|\Theta}(1|\theta)\,\pi(\theta) = (0.7)(0.2)(0.75) + (0.5)(0.3)(0.25) = 0.1425$$

$$f_{\mathbf{X},X_3}(0,1,0) = (0.7)(0.2)(0.7)(0.75) + (0.5)(0.3)(0.5)(0.25) = 0.09225$$

$$f_{\mathbf{X},X_3}(0,1,1) = (0.7)(0.2)(0.2)(0.75) + (0.5)(0.3)(0.3)(0.25) = 0.03225$$

$$f_{\mathbf{X},X_3}(0,1,2) = (0.7)(0.2)(0.1)(0.75) + (0.5)(0.3)(0.2)(0.25) = 0.01800$$

predictive distribution

$$f_{X_3|\mathbf{X}}(0|0,1) = \frac{0.09225}{0.1425} = 0.647368$$

$$f_{X_3|\mathbf{X}}(1|0,1) = \frac{0.03225}{0.1425} = 0.226316$$

$$f_{X_3|\mathbf{X}}(2|0,1) = \frac{0.01800}{0.1425} = 0.126316$$

posterior distribution

$$\pi(G|0,1) = \frac{f(0|G)\, f(1|G)\,\pi(G)}{f(0,1)} = \frac{(0.7)(0.2)(0.75)}{0.1425} = 0.736842$$

$$\pi(B|0,1) = \frac{f(0|B)\, f(1|B)\,\pi(B)}{f(0,1)} = \frac{(0.5)(0.3)(0.25)}{0.1425} = 0.263158$$

Solution to part (iii).

We first calculate the (unobservable) hypothetical means:

$$\mu_3(G) = (0)(0.7) + (1)(0.2) + (2)(0.1) = 0.4$$

$$\mu_3(B) = (0)(0.5) + (1)(0.3) + (2)(0.2) = 0.7$$

Using the formula $E(X_{n+1}|\mathbf{X} = \mathbf{x}) = \int x_{n+1}\, f_{X_{n+1}|\mathbf{X}}(x_{n+1}|\mathbf{x})\,dx_{n+1}$ (a sum in this discrete case) we have:

$$E(X_3|x_1 = 0, x_2 = 1) = (0)(0.647368) + (1)(0.226316) + (2)(0.126316) = 0.478948$$

But, by using the formula $E(X_{n+1}|\mathbf{X} = \mathbf{x}) = \int \mu_{n+1}(\theta)\, \pi_{\Theta|\mathbf{X}}(\theta|\mathbf{x})\,d\theta$ we get:

$$E(X_3|x_1 = 0, x_2 = 1) = (0.4)(0.736842) + (0.7)(0.263158) = 0.478947$$

The two answers differ only in the last digit because of rounding.
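The driver example can be reproduced numerically. The sketch below recomputes the marginal, posterior, and predictive probabilities and both forms of the Bayesian premium from the table; the variable names are illustrative choices, not notation from the notes.

```python
# Prior over the risk parameter and claim-count probabilities for 0, 1, 2 claims.
prior = {"G": 0.75, "B": 0.25}
claim_probs = {"G": [0.7, 0.2, 0.1], "B": [0.5, 0.3, 0.2]}
observed = [0, 1]  # x1 = 0, x2 = 1


def likelihood(theta, xs):
    """Probability of the observed claim counts given the driver type."""
    result = 1.0
    for x in xs:
        result *= claim_probs[theta][x]
    return result


# Joint weight of each driver type with the data, and the marginal f(0, 1).
joint = {t: prior[t] * likelihood(t, observed) for t in prior}
marginal = sum(joint.values())

# Posterior distribution of theta given the data.
posterior = {t: joint[t] / marginal for t in prior}

# Predictive distribution of X3: posterior-weighted mixture of the claim tables.
predictive = [sum(posterior[t] * claim_probs[t][k] for t in prior) for k in range(3)]

# Bayesian premium, way 1: mean of the predictive distribution.
premium_predictive = sum(k * p for k, p in enumerate(predictive))

# Bayesian premium, way 2: posterior-weighted hypothetical means.
hyp_means = {t: sum(k * p for k, p in enumerate(claim_probs[t])) for t in prior}
premium_posterior = sum(hyp_means[t] * posterior[t] for t in prior)

print(marginal, posterior, predictive, premium_predictive, premium_posterior)
```

Without intermediate rounding the two premiums agree, which is why the hand calculations differ only in the sixth decimal place.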


Buhlmann Model

In this model we have a process {X1 , ... , XN , XN+1 , ...} whose terms are, given θ, independent and identically distributed with common mean and variance:

Hypothetical Mean: $\mu(\theta) = E(X_1|\theta) = E(X_2|\theta) = \cdots$

Process Variance: $\sigma^2(\theta) = \text{Var}(X_1|\theta) = \text{Var}(X_2|\theta) = \cdots$

The portion {X1 , ... , XN} is used to forecast the future outcomes {XN+1 , XN+2 , ...}. Now we define the following quantities:

(1) Population mean: $\mu = E[\mu(\theta)] = E[E[X_t|\theta]]$

(2) Expected Value of Process Variance: $EPV = E[\sigma^2(\theta)] = E[\text{Var}[X_t|\theta]]$

(3) Variance of Hypothetical Means: $VHM = \text{Var}[\mu(\theta)] = E\left[(\mu(\theta) - \mu)^2\right]$

If no prior information is available, then the population mean is used as an estimate for the expected values of the Xt's.

Example (from Dean's notes). The number of claims Xt during the t-th period for a risk has a Poisson distribution with parameter θ:

$$P[X_t = x] = \frac{\theta^x e^{-\theta}}{x!}$$

The risk was selected at random from a population for which θ is uniformly distributed over the interval [0,1]. It will be assumed that θ is constant through time for each risk.

(1) Hypothetical mean for risk with parameter θ is $\mu(\theta) = E[X_t|\theta] = \theta$ because the mean of the Poisson random variable is the parameter θ.

(2) Process variance for risk with parameter θ is

$$\sigma^2(\theta) = \text{Var}[X_t|\theta] = \theta$$

because the variance equals the parameter θ for the Poisson.

(3) Variance of the Hypothetical Means (VHM) is

$$\text{Var}\left(E[X_t|\theta]\right) = \text{Var}(\theta) = E(\theta^2) - E(\theta)^2 = \int_0^1 \theta^2\,d\theta - \left(\int_0^1 \theta\,d\theta\right)^2 = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}$$

(4) Expected Value of the Process Variance (EPV) is

$$E\left[\text{Var}(X_t|\theta)\right] = E[\theta] = \int_0^1 \theta\,d\theta = \frac{1}{2}$$

(5) Unconditional Variance (or total variance) is

$$\text{Var}[X_t] = VHM + EPV = \frac{1}{12} + \frac{1}{2} = \frac{7}{12}$$
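The three quantities in this example can be checked with exact arithmetic. The sketch below relies on the standard fact that a Uniform(0, 1) variable has k-th moment 1/(k + 1); the helper name `uniform01_moment` is an illustrative assumption.

```python
from fractions import Fraction


def uniform01_moment(k):
    """k-th raw moment of a Uniform(0, 1) random variable: 1 / (k + 1)."""
    return Fraction(1, k + 1)


# For Poisson(theta): hypothetical mean = theta and process variance = theta.
mean_theta = uniform01_moment(1)

epv = mean_theta                            # E[Var(X_t | theta)] = E[theta]
vhm = uniform01_moment(2) - mean_theta**2   # Var(theta) = E[theta^2] - E[theta]^2
total_variance = epv + vhm                  # Var(X_t) = EPV + VHM

print(vhm, epv, total_variance)
```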


Derivation of the Credibility factor in Buhlmann Model

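By setting $\bar{X} = \frac{X_1+\cdots+X_N}{N} = \frac{1}{N}\sum_{i=1}^{N} X_i$ we have

$$E(\bar{X}|\theta) = E\left[\frac{1}{N}\sum_{i=1}^{N} X_i \,\Big|\, \theta\right] = \frac{1}{N}\sum_{i=1}^{N} E(X_i|\theta) = \frac{1}{N}\sum_{i=1}^{N} \mu(\theta) = \mu(\theta)$$

So, in other words, $\bar{X}$ is an unbiased estimator for µ(θ). Now we seek a and b so as to minimize the expected value:

$$\min_{a,b}\ E\left[\left(a + b\bar{X} - \mu(\theta)\right)^2\right]$$

where the expectation is taken with respect to the joint distribution of (X1 , ... , XN , θ).

For simplicity, set

$$Y = \bar{X} - \mu(\theta)$$

Then $\bar{X} = Y + \mu(\theta)$ and of course we have

$$E(Y|\theta) = E(\bar{X}|\theta) - E(\mu(\theta)|\theta) = E(\bar{X}|\theta) - \mu(\theta) = 0$$

Now note that

$$\left[a + b\bar{X} - \mu(\theta)\right]^2 = \left[a + bY + (b-1)\mu(\theta)\right]^2 = \left(bY + c(\theta)\right)^2 = b^2Y^2 + 2b\,c(\theta)\,Y + c(\theta)^2, \qquad c(\theta) = a + (b-1)\mu(\theta)$$

Then

$$E\left[\left(a + b\bar{X} - \mu(\theta)\right)^2\right] = b^2E(Y^2) + 2bE\left[c(\theta)Y\right] + E\left[c(\theta)^2\right] \qquad (1)$$

But:

$$E\left[c(\theta)Y\right] = E\left[E\left[c(\theta)Y\,|\,\theta\right]\right] = E\left[c(\theta)\,E\left[Y\,|\,\theta\right]\right] = E\left[c(\theta)\cdot 0\right] = 0$$

So the equality (1) reduces to

$$E\left[\left(a + b\bar{X} - \mu(\theta)\right)^2\right] = b^2E(Y^2) + E\left[c(\theta)^2\right] \qquad (2)$$

To minimize this, we must set its partial derivatives equal to zero:

$$\frac{\partial}{\partial a} = 2E\left[c(\theta)\frac{\partial c(\theta)}{\partial a}\right] = 2E\left(c(\theta)\right) = 2\left\{a + (b-1)E[\mu(\theta)]\right\} = 2\left\{a + (b-1)\mu\right\}$$

Then

$$\frac{\partial}{\partial a} = 0 \ \Rightarrow\ a = (1-b)\mu$$

Next Step. Using the equality a = (1−b)µ we have $c(\theta) = (b-1)(\mu(\theta) - \mu)$, so we can now write the right-hand side of equation (2) as

$$b^2E(Y^2) + E\left[(1-b)^2(\mu(\theta)-\mu)^2\right] = b^2E(Y^2) + (1-b)^2E\left[(\mu(\theta)-\mu)^2\right] = b^2E(Y^2) + (1-b)^2\,\text{Var}(\mu(\theta)) = b^2E(Y^2) + (1-b)^2\,VHM \qquad (3)$$

Further note that

$$E(Y^2) = E\left\{E[Y^2|\theta]\right\} = E\left\{E\left[(\bar{X}-\mu(\theta))^2\,|\,\theta\right]\right\} = E\left\{\text{Var}\left[\bar{X}|\theta\right]\right\} = E\left\{\frac{1}{N}\text{Var}\left[X_1|\theta\right]\right\} = \frac{1}{N}E\left\{\text{Var}\left[X_1|\theta\right]\right\} = \frac{1}{N}EPV \qquad (4)$$

Putting this into (3), the right-hand side of (3) reads:

$$b^2\,\frac{EPV}{N} + (1-b)^2\,VHM$$

Now differentiate this with respect to b and set it equal to zero:

$$\frac{\partial}{\partial b} = 0 \ \Rightarrow\ 2b\,\frac{EPV}{N} - 2(1-b)\,VHM = 0 \ \Rightarrow\ b = \frac{VHM}{VHM + \frac{EPV}{N}} = \frac{N}{N + \frac{EPV}{VHM}} = \frac{N}{N+K} \qquad \text{where } K = \frac{EPV}{VHM}$$

This quantity is denoted by Z:

$$Z = \frac{VHM}{VHM + \frac{EPV}{N}}$$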

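Then

$$a = (1-b)\mu = (1-Z)\mu$$

Then our estimate for µ(θ) will be

$$\hat\mu(\theta) = a + b\bar{X} = (1-Z)\mu + Z\bar{X}$$

Note. As we saw in the calculations in (4) we have:

$$E\left\{\text{Var}\left[\bar{X}|\theta\right]\right\} = \frac{1}{N}EPV$$

Also:

$$\text{Var}\left\{E\left[\bar{X}|\theta\right]\right\} = \text{Var}\left\{E\left[\frac{1}{N}\sum_{i=1}^{N} X_i \,\Big|\, \theta\right]\right\} = \text{Var}\left\{\frac{1}{N}\sum_{i=1}^{N} E\left[X_i|\theta\right]\right\} = \text{Var}\left\{\frac{1}{N}\sum_{i=1}^{N} \mu(\theta)\right\} = \text{Var}(\mu(\theta)) = VHM$$

Now by adding up these expressions, we get:

$$E\left\{\text{Var}\left[\bar{X}|\theta\right]\right\} + \text{Var}\left\{E\left[\bar{X}|\theta\right]\right\} = \frac{EPV}{N} + VHM \ \Rightarrow\ \text{Var}(\bar{X}) = \frac{EPV}{N} + VHM$$

$$Z = \frac{VHM}{VHM + \frac{EPV}{N}} = \frac{\text{Var}(\mu(\theta))}{\text{Var}(\bar{X})} = \frac{\text{Variance of the Hypothetical Means}}{\text{Total Variance of the Estimator } \bar{X}}$$

Also note that, for a single observation X,

$$K = \frac{E(\text{Var}[X|\theta])}{\text{Var}(E[X|\theta])}$$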

Example (from Dean's notes). Two risks have the following severity distributions, and Risk 1 is twice as likely to be observed as Risk 2.

Amount of Claim    Probability of Claim Amount for Risk 1    Probability of Claim Amount for Risk 2
250                0.5                                       0.7
2,500              0.3                                       0.2
60,000             0.2                                       0.1

A claim of 250 is observed. Determine the Buhlmann credibility estimate of the second claim amount from the same risk.

Solution

Let us denote the claim amount by X.

Step 1. Calculate the variance of the hypothetical means:

$$E[X|\text{Risk 1}] = (0.5)(250) + (0.3)(2500) + (0.2)(60000) = 12875$$

$$E[X|\text{Risk 2}] = (0.7)(250) + (0.2)(2500) + (0.1)(60000) = 6675$$

$$E[X] = \left(\tfrac{2}{3}\right)(12875) + \left(\tfrac{1}{3}\right)(6675) = 10808.33$$

$$VHM = \left(\tfrac{2}{3}\right)(12875 - 10808.33)^2 + \left(\tfrac{1}{3}\right)(6675 - 10808.33)^2 = 8{,}542{,}222.2$$

Step 2. Calculate the expected value of the process variance:

$$\text{Var}[X|\text{Risk 1}] = (0.5)(250-12875)^2 + (0.3)(2500-12875)^2 + (0.2)(60000-12875)^2 = 556{,}140{,}625.0$$

$$\text{Var}[X|\text{Risk 2}] = (0.7)(250-6675)^2 + (0.2)(2500-6675)^2 + (0.1)(60000-6675)^2 = 316{,}738{,}125.0$$

$$EPV = \left(\tfrac{2}{3}\right)(556{,}140{,}625.0) + \left(\tfrac{1}{3}\right)(316{,}738{,}125.0) = 476{,}339{,}791.7$$

Step 3. Calculate K, Z, and the credibility estimate (here N = 1, since one claim has been observed):

$$K = \frac{EPV}{VHM} = \frac{476{,}339{,}791.7}{8{,}542{,}222.2} = 55.76$$

$$Z = \frac{N}{N+K} = \frac{1}{1+55.76} = \frac{1}{56.76}$$

$$\text{Buhlmann credibility estimate} = \left(\frac{1}{56.76}\right)(250) + \left(\frac{55.76}{56.76}\right)(10{,}808.33) = 10{,}622$$
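The severity example follows the general recipe: hypothetical means, VHM, process variances, EPV, then K and Z. A minimal sketch of that recipe applied to the table above is given here; the variable and function names are illustrative choices.

```python
amounts = [250, 2500, 60000]
severity = {"Risk 1": [0.5, 0.3, 0.2], "Risk 2": [0.7, 0.2, 0.1]}
weights = {"Risk 1": 2 / 3, "Risk 2": 1 / 3}  # Risk 1 is twice as likely as Risk 2


def mean(probs):
    """Expected claim amount for one risk."""
    return sum(p * x for p, x in zip(probs, amounts))


def variance(probs):
    """Process variance of the claim amount for one risk."""
    mu = mean(probs)
    return sum(p * (x - mu) ** 2 for p, x in zip(probs, amounts))


hyp_means = {risk: mean(probs) for risk, probs in severity.items()}
overall_mean = sum(weights[r] * hyp_means[r] for r in severity)

vhm = sum(weights[r] * (hyp_means[r] - overall_mean) ** 2 for r in severity)
epv = sum(weights[r] * variance(severity[r]) for r in severity)

k = epv / vhm
n = 1                      # one observed claim
z = n / (n + k)
observed_mean = 250        # the single observed claim amount
estimate = z * observed_mean + (1 - z) * overall_mean

print(round(k, 2), round(z, 4), round(estimate, 2))
```

Because the code carries full precision, its estimate can differ from the hand calculation in the decimal places.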

Example ∗. You are given the following:

(i) The number of claims made by an individual insured follows a Poisson distribution.

(ii) The expected number of claims, λ, for insureds in the population has the probability density function

$$f(\lambda) = 4\lambda^{-5} \qquad \text{for } 1 \le \lambda < \infty$$

Determine the value of the Buhlmann k used for estimating the expected number of claims for an individual insured.

Solution. Here X denotes the number of claims.

$$E[X|\lambda] = E(\text{Poisson}(\lambda)) = \lambda$$

$$E(\lambda^2) = 4\int_1^\infty \lambda^2\lambda^{-5}\,d\lambda = 4\int_1^\infty \lambda^{-3}\,d\lambda = \left.\frac{4}{-2}\lambda^{-2}\right]_1^\infty = 2$$

$$E(\lambda) = 4\int_1^\infty \lambda\,\lambda^{-5}\,d\lambda = 4\int_1^\infty \lambda^{-4}\,d\lambda = \left.\frac{4}{-3}\lambda^{-3}\right]_1^\infty = \frac{4}{3}$$

$$\text{Var}(E[X|\lambda]) = \text{Var}(\lambda) = E(\lambda^2) - E(\lambda)^2 = 2 - \frac{16}{9} = \frac{2}{9}$$

$$\text{Var}[X|\lambda] = \text{Var}(\text{Poisson}(\lambda)) = \lambda$$

$$E(\text{Var}[X|\lambda]) = E(\lambda) = \frac{4}{3}$$

$$K = \frac{E(\text{Var}[X|\lambda])}{\text{Var}(E[X|\lambda])} = \frac{4/3}{2/9} = 6$$

Definition. If the Bayesian estimate equals the Buhlmann estimate, then we say that the Buhlmann credibility estimate has exact credibility.

Note. In the above examples of conjugate distributions exact credibility occurs. We verify it for the Gamma-Poisson conjugate pair:

Example. Suppose {X1 , ... , Xn} is an i.i.d. sample from Poisson(λ) where λ ∼ Gamma(α, β). Calculate both the Bayesian and Buhlmann estimates and verify the exact credibility for this case.

Solution.

Step 1.

$$X_i \sim \text{Poisson}(\lambda) \ \Rightarrow\ \mu(\lambda) = E[X_i|\lambda] = \lambda$$

$$EPV = E\left[\text{Var}[X_i|\lambda]\right] = E[\lambda] = \alpha\beta$$

$$VHM = \text{Var}\left[E[X_i|\lambda]\right] = \text{Var}(\lambda) = \alpha\beta^2$$

$$k = \frac{EPV}{VHM} = \frac{1}{\beta}$$

$$z = \frac{n}{n+k} = \frac{n}{n+\frac{1}{\beta}} = \frac{n\beta}{n\beta+1}$$

$$\text{total expectation } \mu = E[E(X_i|\lambda)] = E(\lambda) = \alpha\beta$$

$$\text{Buhlmann credibility estimate} = z\bar{x} + (1-z)\mu = \left(\frac{n\beta}{n\beta+1}\right)\bar{x} + \left(\frac{1}{n\beta+1}\right)\alpha\beta = \frac{\beta(n\bar{x}+\alpha)}{n\beta+1}$$

Step 2.

$$\text{Bayesian estimate} = E[\mu(\lambda)\,|\,\mathbf{x}] = E(\lambda\,|\,\mathbf{x}) = \tilde\alpha\tilde\beta = (\alpha+n\bar{x})\left(\frac{\beta}{n\beta+1}\right) = \frac{\beta(n\bar{x}+\alpha)}{n\beta+1}$$

Buhlmann credibility estimate = Bayesian estimate ✓
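Exact credibility for the Gamma-Poisson pair can also be spot-checked numerically. The sketch below computes both estimates with exact fractions for one set of inputs; the function names and the sample values of α, β, and the data are illustrative assumptions, and β is treated as a scale parameter so that E(λ) = αβ.

```python
from fractions import Fraction


def buhlmann_estimate(alpha, beta, xs):
    """Buhlmann credibility estimate z * x-bar + (1 - z) * mu."""
    alpha, beta = Fraction(alpha), Fraction(beta)
    n = len(xs)
    xbar = Fraction(sum(xs), n)
    epv = alpha * beta        # E[lambda]
    vhm = alpha * beta**2     # Var(lambda)
    k = epv / vhm             # equals 1 / beta
    z = n / (n + k)
    mu = alpha * beta
    return z * xbar + (1 - z) * mu


def bayes_estimate(alpha, beta, xs):
    """Posterior mean of lambda: (alpha + n x-bar) * beta / (1 + beta n)."""
    alpha, beta = Fraction(alpha), Fraction(beta)
    n = len(xs)
    post_shape = alpha + sum(xs)
    post_scale = beta / (1 + beta * n)
    return post_shape * post_scale


# Hypothetical prior parameters and claim counts.
data = [0, 2, 1, 4]
print(buhlmann_estimate(3, Fraction(1, 2), data), bayes_estimate(3, Fraction(1, 2), data))
```

A single agreeing case is not a proof, but it is a quick guard against algebra slips in the derivation.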

Exercise. Verify the exact credibility for each of the conjugate distributions:

Beta-Bernoulli

Beta-Binomial

Gamma-Exponential

Beta-Geometric


Buhlmann-Straub Model

Buhlmann's model cannot be applied to group insurance because it does not allow for changes in the number of insured members of the group. Therefore we appeal to the Buhlmann-Straub model for such cases. In the Buhlmann-Straub model we assume that there are n policy years, that in each year t there are $m_t$ exposures, and that $X_t$ is the claim size, number of claims, or a similar quantity per unit of exposure during period t. Note that the quantity (loss, claim size, or number of claims) "per unit of exposure" is used because the exposure can vary through time and from risk to risk.

So, if the aggregate claim size in year t is $Y_t$, then we actually have $X_t = \frac{Y_t}{m_t}$. Note that since $X_t$ measures a quantity per unit of exposure, the $X_t$'s are no longer assumed to have the same distribution.

Risk    Periods of Experience
1       X11   X12   · · ·   X1N1
        m11   m12   · · ·   m1N1
2       X21   X22   · · ·   · · ·   X2N2
        m21   m22   · · ·   · · ·   m2N2
...     ...
R       XR1   XR2   · · ·   XRNR
        mR1   mR2   · · ·   mRNR

The number of periods of experience can vary by risk, and the experience periods do not have to start at the same time either.

Example (from the Dean’s lectures). ABC Insurance, Inc. sells dental insurance plans to companies

with fewer than one hundred employees. An actuary is analyzing the number of claims per employee.

Looking at the first company in her file, she sees that the company has three full years of plan coverage.

In the first year there were 40 employee-years with 84 claims, in the second year there were 44

employee-years with 88 claims, and in the third year there were 42 employee-years with 105 claims.

Designating this selected company as Risk 1, then:

X11 = 84 claims / 40 employee-years = 2.1 claims/employee-year


X12 = 88 claims / 44 employee-years = 2.0 claims/employee-year

X13 = 105 claims / 42 employee-years = 2.5 claims/employee-year

The exposures are m11 = 40 employee-years, m12 = 44 employee-years, and m13 = 42

employee-years. ■
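The per-exposure claim rates in the dental example are simple ratios, and the exposure-weighted average of those rates (the quantity the Buhlmann-Straub model later denotes by X̄) equals total claims divided by total exposure. A minimal sketch, with variable names chosen for illustration:

```python
claims = [84, 88, 105]      # claims for Risk 1 in years 1, 2, 3
exposures = [40, 44, 42]    # employee-years m_11, m_12, m_13

# Claims per employee-year in each year: X_1t = claims / exposure.
rates = [c / m for c, m in zip(claims, exposures)]

# Exposure-weighted average of the yearly rates.
total_exposure = sum(exposures)
weighted_mean = sum(m * x for m, x in zip(exposures, rates)) / total_exposure

print(rates, total_exposure, weighted_mean)
```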


Buhlmann-Straub Model for one policyholder when underlying probabilities are known

As in Buhlmann's model we assume that the conditional random variables X1|θ , X2|θ , ... are independent. A further assumption is that the process variances, Var(Xt|θ), are inversely proportional to the size (i.e., exposure) of the risk during each observation period; in other words, the product

$$\sigma^2(\theta) := m_t\,\text{Var}(X_t|\theta)$$

is constant (for all t).

Now we define the following quantities:

1. Hypothetical Mean for risk θ per unit of exposure:

$$\mu(\theta) = E(X_1|\theta) = E(X_2|\theta) = \cdots$$

2. Process Variance for risk θ:

$$\text{Var}(X_1|\theta) = \frac{\sigma^2(\theta)}{m_1}, \quad \ldots, \quad \text{Var}(X_t|\theta) = \frac{\sigma^2(\theta)}{m_t}, \quad \ldots$$

3. Population mean: $\mu = E[\mu(\theta)] = E[E[X_t|\theta]]$

4. Expected Value of Process Variance: $EPV = E[\sigma^2(\theta)]$

5. Variance of Hypothetical Means: $VHM = \text{Var}[\mu(\theta)]$

Example (Dean's notes, page 12). The annual numbers of claims for truck drivers in a homogeneous population are independently and identically distributed. [The population might represent the work force of a large trucking company with strict hiring standards and good safety training for each driver.] For each driver the number of claims per year has a mean of µ(θ) and a variance of σ²(θ). (The θ parameter applies to every driver in the group.) A group of 10 drivers is selected from the larger population.

(1) What is the expected annual claims frequency for the group of 10 drivers?

(2) What is the variance of the annual claims frequency for the group?

Solution (from Dean's notes). Let X1t, X2t, ..., X10t be random variables representing the number of claims in year t for each of the ten selected drivers. Then $X_t = \frac{1}{10}\sum_{i=1}^{10} X_{it}$ is the annual claims frequency for the group; that is, it is the annual number of claims per driver. The exposure is $m_t = 10$ and the unit of exposure is one driver. The expected value and variance for the annual claims frequency for the group are

$$E[X_t|\theta] = E\left[\frac{1}{10}\sum_{i=1}^{10} X_{it} \,\Big|\, \theta\right] = \frac{1}{10}\sum_{i=1}^{10} E[X_{it}|\theta] = \frac{1}{10}\sum_{i=1}^{10} \mu(\theta) = \mu(\theta)$$

$$\text{Var}[X_t|\theta] = \text{Var}\left[\frac{1}{10}\sum_{i=1}^{10} X_{it} \,\Big|\, \theta\right] = \frac{1}{(10)^2}\sum_{i=1}^{10} \text{Var}[X_{it}|\theta] = \frac{1}{100}\sum_{i=1}^{10} \sigma^2(\theta) = \frac{\sigma^2(\theta)}{10}$$

In this example, the exposure is the number of drivers in the group, which is 10. The expected claims frequency is the same whether there is one driver, 10 drivers, or 100 drivers in the group; however, the variance in the group's claims frequency is inversely proportional to the number of drivers in the group.

In the Buhlmann-Straub model one seeks a point estimate of E[µ(θ)|X1 = x1 , ... , Xn = xn]. But as we have argued before, this conditional expectation is the same as the conditional expectation E[Xn+1|X1 = x1 , ... , Xn = xn]:

$$E[\mu(\theta)\,|\,X_1 = x_1, \ldots, X_n = x_n] = E[X_{n+1}\,|\,X_1 = x_1, \ldots, X_n = x_n]$$

We set:

$$\bar{X} = \sum_{t=1}^{N}\left(\frac{m_t}{m}\right)X_t \qquad \text{where } m = \sum_{t=1}^{N} m_t$$

Then

$$E(\bar{X}|\theta) = E\left(\sum_{t=1}^{N}\left(\frac{m_t}{m}\right)X_t \,\Big|\, \theta\right) = \sum_{t=1}^{N}\left(\frac{m_t}{m}\right)E(X_t|\theta) = \sum_{t=1}^{N}\left(\frac{m_t}{m}\right)\mu(\theta) = \mu(\theta)$$

$$\text{Var}(\bar{X}|\theta) = \text{Var}\left(\sum_{t=1}^{N}\left(\frac{m_t}{m}\right)X_t \,\Big|\, \theta\right) = \sum_{t=1}^{N}\left(\frac{m_t}{m}\right)^2\text{Var}(X_t|\theta) = \sum_{t=1}^{N}\left(\frac{m_t}{m}\right)^2\frac{\sigma^2(\theta)}{m_t} = \frac{\sigma^2(\theta)}{m}$$

The unconditional mean and variance of $\bar{X}$ are:

$$E[\bar{X}] = E[E[\bar{X}|\theta]] = E[\mu(\theta)] = \mu$$

$$\text{Var}[\bar{X}] = \text{Var}[E[\bar{X}|\theta]] + E[\text{Var}[\bar{X}|\theta]] = \text{Var}[\mu(\theta)] + \frac{E[\sigma^2(\theta)]}{m} = VHM + \frac{EPV}{m}$$

In the Buhlmann-Straub model, the credibility assigned to $\bar{X}$ (to estimate µ(θ)) is

$$Z = \frac{\text{Variance of the Hypothetical Means}}{\text{Total Variance of the Estimator}} = \frac{VHM}{VHM + \frac{EPV}{m}}$$

Upon simplifying, we get:

$$Z = \frac{m}{m+K}$$

where the value K is defined by:

$$K = \frac{EPV}{VHM}$$

The credibility estimate is

$$\hat\mu(\theta) = Z\cdot\bar{X} + (1-Z)\cdot\mu$$

Note. Buhlmann's Model is a special case of the Buhlmann-Straub Model with $m_t = 1$ for all times t.


Buhlmann-Straub Model for more-than-one policyholder (nonparametric estimation)

Here we have r group policyholders, and for each group i we have $n_i$ policy years; the start of the years for different groups may differ. We adopt the following notation:

$X_{it}$ = the average loss/claim for policyholder i in year t, with

$$\mathbf{X}_i = (X_{i1}, \cdots, X_{in_i})$$

$m_{it}$ = the number of exposure units for policyholder i in year t.

The total number of exposure units over all years for each group i is

$$m_i = \sum_{t=1}^{n_i} m_{it}$$

The total exposure units for all policyholders over all years is

$$m = \sum_{i=1}^{r} m_i$$

The average loss experience of policyholder i over all the years is

$$\bar{X}_i = \frac{1}{m_i}\sum_{t=1}^{n_i} m_{it}X_{it}$$

The overall average loss is

$$\bar{X} = \frac{1}{m}\sum_{i=1}^{r} m_i\bar{X}_i$$

Assumptions:

1. The random vectors $\{\mathbf{X}_1, \ldots, \mathbf{X}_r\}$ are assumed to be mutually statistically independent.

2. The distribution of each vector $\mathbf{X}_i$ depends on a risk parameter $\theta_i$, and we assume that the random variables $\{\theta_1, \ldots, \theta_r\}$ are i.i.d.

3. Within any group i, the variables

$$X_{i1}|\theta_i, \ \ldots, \ X_{in_i}|\theta_i$$

are independent.

Set

$$\mu(\theta_i) = E[X_{it}|\theta_i]$$

so, for each group, the hypothetical means are constant over time. Here we have:

$$\sigma^2(\theta_i) = m_{it}\,\text{Var}(X_{it}|\theta_i)$$

$$\mu = E[\mu(\theta_i)]$$

$$EPV = E[\sigma^2(\theta_i)]$$

$$VHM = \text{Var}[\mu(\theta_i)]$$

We are going to estimate these parameters, which are called structural parameters.

Unbiased estimator for µ:

$$\hat\mu = \bar{X}$$

Unbiased estimator for $\sigma^2(\theta_i)$:

$$\hat\sigma_i^2 = \frac{1}{n_i-1}\sum_{t=1}^{n_i} m_{it}(X_{it} - \bar{X}_i)^2$$

Unbiased estimator for EPV:

$$\widehat{EPV} = \sum_{i=1}^{r} w_i\,\hat\sigma_i^2 = \frac{1}{\sum_{i=1}^{r}(n_i-1)}\sum_{i=1}^{r}\sum_{t=1}^{n_i} m_{it}(X_{it} - \bar{X}_i)^2, \qquad w_i = \frac{n_i-1}{\sum_{i=1}^{r}(n_i-1)}$$

Unbiased estimator for VHM:

$$\widehat{VHM} = \frac{m}{m^2 - \sum m_i^2}\left\{\sum_{i=1}^{r} m_i(\bar{X}_i - \bar{X})^2 - (r-1)\widehat{EPV}\right\}$$

If we set

$$\hat{k} = \frac{\widehat{EPV}}{\widehat{VHM}}, \qquad \hat{Z}_i = \frac{m_i}{m_i + \hat{k}}$$

then the credibility estimate for the credibility premium

$$E[X_{i,n_i+1}\,|\,X_{i,1} = x_{i,1}, \ldots, X_{i,n_i} = x_{i,n_i}]$$

and for $\mu(\theta_i)$ is

$$\hat{Z}_i\bar{X}_i + (1-\hat{Z}_i)\bar{X}$$

The estimate of the premium for policyholder i is:

$$m_{i,n_i+1}\left(\hat{Z}_i\bar{X}_i + (1-\hat{Z}_i)\bar{X}\right)$$

In summary, the steps are:

(i) Determine the number of periods $n_i$ for each of the policyholders.

(ii) Determine the exposure measure $m_{it}$ for each policyholder i during each period t.

(iii) Calculate the claim amounts $x_{it}$.

(iv) Calculate the average claim amounts $\bar{x}_i$ for each policyholder over all periods.

(v) Calculate the estimated $\hat\mu = \bar{x}$.

(vi) Calculate the estimated $\widehat{EPV} = \sum_{i=1}^{r} w_i\,\hat\sigma_i^2$, where $w_i = \frac{n_i-1}{\sum_{i=1}^{r}(n_i-1)}$.

(vii) Calculate the estimated $\widehat{VHM} = \frac{m}{m^2-\sum m_i^2}\left\{\sum_{i=1}^{r} m_i(\bar{X}_i - \bar{X})^2 - (r-1)\widehat{EPV}\right\}$.

(viii) Calculate $\hat{k} = \frac{\widehat{EPV}}{\widehat{VHM}}$.

(ix) Calculate the credibility factors: $\hat{Z}_i = \frac{m_i}{m_i+\hat{k}}$.

(x) Calculate the average claim amount per exposure unit for policyholder i:

$$\hat{Z}_i\bar{X}_i + (1-\hat{Z}_i)\bar{X}$$

(xi) Calculate the aggregate claim amount for (policyholder) group i:

$$m_{i,n_i+1}\left(\hat{Z}_i\bar{X}_i + (1-\hat{Z}_i)\bar{X}\right)$$

Example. The aggregate claim amounts for two groups over three years are given in the following table:

                              Policy Year
Group                         1         2         3         4
1    Aggregate Claim          8,000     11,000    15,000    ?
     Size of group            40        50        70        75
2    Aggregate Claim          20,000    24,000    19,000    ?
     Size of group            100       120       115       95

Estimate the aggregate claim amount to be observed during the fourth year for each group.

Solution.

Group 1. Exposure measures:

$$m_{11} = 40, \quad m_{12} = 50, \quad m_{13} = 70, \qquad m_1 = 40+50+70 = 160$$

Average claim amounts:

$$x_{11} = \frac{8{,}000}{40} = 200, \qquad x_{12} = \frac{11{,}000}{50} = 220, \qquad x_{13} = \frac{15{,}000}{70} = 214.29$$

$$\bar{x}_1 = \frac{8{,}000+11{,}000+15{,}000}{160} = 212.50$$

Group 2. Exposure measures:

$$m_{21} = 100, \quad m_{22} = 120, \quad m_{23} = 115, \qquad m_2 = 100+120+115 = 335$$

Average claim amounts:

$$x_{21} = \frac{20{,}000}{100} = 200, \qquad x_{22} = \frac{24{,}000}{120} = 200, \qquad x_{23} = \frac{19{,}000}{115} = 165.22$$

$$\bar{x}_2 = \frac{20{,}000+24{,}000+19{,}000}{335} = 188.06$$

Overall exposure units for the first three years:

$$m = m_1 + m_2 = 160+335 = 495$$

Estimate for overall mean:

$$\hat\mu = \bar{x} = \frac{m_1\bar{x}_1 + m_2\bar{x}_2}{m} = \frac{(160)(212.50)+(335)(188.06)}{495} = 195.96$$

Estimate of the EPV:

$$\widehat{EPV} = \frac{\sum_{i=1}^{2}\sum_{j=1}^{3} m_{ij}(x_{ij} - \bar{x}_i)^2}{\sum_{i=1}^{2}(3-1)} = \frac{40(200-212.5)^2 + 50(220-212.5)^2 + 70(214.29-212.5)^2 + 100(200-188.06)^2 + \cdots}{2+2} = 25160.58$$

$$\sum_{i=1}^{r} m_i(\bar{X}_i - \bar{X})^2 = (160)(212.5-195.96)^2 + 335(188.06-195.96)^2 = 64678.806$$

$$\frac{m}{m^2 - \sum m_i^2} = \frac{1}{m - \frac{1}{m}\sum m_i^2} = \frac{1}{495 - \frac{1}{495}\left\{(160)^2 + (335)^2\right\}} = 0.0046$$

$$\widehat{VHM} = \frac{m}{m^2 - \sum m_i^2}\left\{\sum_{i=1}^{r} m_i(\bar{X}_i - \bar{X})^2 - (r-1)\widehat{EPV}\right\} = 0.0046\left\{64678.806 - (1)(25160.58)\right\} = 182.48$$

(The factor 0.0046 is carried at full precision, 0.004618, in this product.)

$$\hat{k} = \frac{\widehat{EPV}}{\widehat{VHM}} = \frac{25160.58}{182.48} = 137.88$$

The credibility factors for the two policyholders:

$$\hat{Z}_1 = \frac{m_1}{m_1 + \hat{k}} = \frac{160}{160+137.88} = 0.537$$

$$\hat{Z}_2 = \frac{m_2}{m_2 + \hat{k}} = \frac{335}{335+137.88} = 0.708$$

Buhlmann-Straub estimates of the average claim amounts per exposure unit:

$$\hat{Z}_1\bar{X}_1 + (1-\hat{Z}_1)\bar{X} = (0.537)(212.50) + (0.463)(195.96) = 204.84$$

$$\hat{Z}_2\bar{X}_2 + (1-\hat{Z}_2)\bar{X} = (0.708)(188.06) + (0.292)(195.96) = 190.37$$

The aggregate claim amount for each of the two groups:

$$m_{1,4}\left(\hat{Z}_1\bar{X}_1 + (1-\hat{Z}_1)\bar{X}\right) = (75)(204.84) = 15363.00$$

$$m_{2,4}\left(\hat{Z}_2\bar{X}_2 + (1-\hat{Z}_2)\bar{X}\right) = (95)(190.37) = 18085.15$$
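Steps (i) to (xi) of the nonparametric procedure can be sketched as a single function. The code below is an illustrative implementation of the estimators stated earlier, applied to the two-group example; the function name `buhlmann_straub` and the dictionary keys are assumptions of this sketch. It carries full floating-point precision, so its results differ slightly from the hand calculation, which rounds intermediate values.

```python
def buhlmann_straub(claims, exposures):
    """Nonparametric Buhlmann-Straub estimates.

    claims[i][t]    : aggregate claim amount of group i in observed year t
    exposures[i][t] : exposure units of group i in the same year
    """
    r = len(claims)
    m_i = [sum(ms) for ms in exposures]            # total exposure per group
    m = sum(m_i)                                   # overall exposure
    xbar_i = [sum(cs) / mi for cs, mi in zip(claims, m_i)]
    xbar = sum(sum(cs) for cs in claims) / m       # estimate of mu

    # EPV: exposure-weighted within-group squared deviations.
    within = sum(
        m_it * (c_it / m_it - xb) ** 2
        for cs, ms, xb in zip(claims, exposures, xbar_i)
        for c_it, m_it in zip(cs, ms)
    )
    epv = within / sum(len(cs) - 1 for cs in claims)

    # VHM: between-group deviations, corrected by (r - 1) * EPV.
    between = sum(mi * (xb - xbar) ** 2 for mi, xb in zip(m_i, xbar_i))
    vhm = m / (m**2 - sum(mi**2 for mi in m_i)) * (between - (r - 1) * epv)

    if vhm <= 0:
        # Convention from the notes: treat VHM as 0, so k is infinite and Z = 0.
        k, z = float("inf"), [0.0] * r
    else:
        k = epv / vhm
        z = [mi / (mi + k) for mi in m_i]

    per_unit = [zi * xb + (1 - zi) * xbar for zi, xb in zip(z, xbar_i)]
    return {"epv": epv, "vhm": vhm, "k": k, "z": z, "per_unit": per_unit}


result = buhlmann_straub(
    claims=[[8000, 11000, 15000], [20000, 24000, 19000]],
    exposures=[[40, 50, 70], [100, 120, 115]],
)
next_exposure = [75, 95]  # group sizes in year 4
aggregate = [m_next * p for m_next, p in zip(next_exposure, result["per_unit"])]
print(result, aggregate)
```

Groups with different numbers of observed years can be passed as lists of different lengths, which is the situation in the next example.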

Example. The aggregate claim amounts for two groups over three years are given in the following table:

Group   Quantity          Year 1    Year 2    Year 3    Year 4
1       Aggregate claim   -         11,000    15,000    ?
1       Size of group     -         50        70        75
2       Aggregate claim   20,000    24,000    19,000    ?
2       Size of group     100       120       115       95

Estimate the aggregate claim amount to be observed during the fourth year for each group.

Solution.

Note that no data are available for Group 1 for the first year, so the calculations start like this:

$$m_{11} = 50, \qquad m_{12} = 70, \qquad m_1 = 50 + 70 = 120$$

Average claim amounts:

$$x_{11} = \frac{11{,}000}{50} = 220, \qquad x_{12} = \frac{15{,}000}{70} = 214.29$$

$$\bar{x}_1 = \frac{11{,}000 + 15{,}000}{120} = 216.67$$

Students will do the rest.

Note. In some situations the estimate of VHM turns out to be ≤ 0. In this case we set VHM = 0, which results in k = +∞ and hence Z = 0.

Example (from the Dean’s notes - page 20). Two risks were selected at random from a population. Risk 1 had 0 claims in year one, 3 claims in year two, and 0 claims in year three: (0 , 3 , 0). The claims by year for Risk 2 were (2 , 1 , 2). In this case, R = 2 and N = 3. Use Bühlmann’s model to estimate the expected number of claims per year for each risk for the fourth year.

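Solution.

$$\bar{x}_1 = \frac{0+3+0}{3} = 1, \qquad \bar{x}_2 = \frac{2+1+2}{3} = \frac{5}{3}, \qquad \bar{x} = \frac{1 + \frac{5}{3}}{2} = \frac{4}{3}$$

$$\sigma_1^2 = \frac{(0-1)^2 + (3-1)^2 + (0-1)^2}{3-1} = 3$$

$$\sigma_2^2 = \frac{\left(2-\frac{5}{3}\right)^2 + \left(1-\frac{5}{3}\right)^2 + \left(2-\frac{5}{3}\right)^2}{3-1} = \frac{1}{3}$$

$$\mathrm{EPV} = \frac{\sigma_1^2 + \sigma_2^2}{2} = \frac{3 + \frac{1}{3}}{2} = \frac{5}{3}$$

$$\mathrm{VHM} = \frac{1}{2-1}\left\{\left(1-\frac{4}{3}\right)^2 + \left(\frac{5}{3}-\frac{4}{3}\right)^2\right\} - \frac{5/3}{3} = -\frac{1}{3}$$

This happened to be negative, so we set it to zero. Then

$$Z = 0$$

and the estimate for each risk is the overall mean $\bar{x} = \frac{4}{3}$ claims per year.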
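The same steps can be reproduced with exact fractions. This is a minimal Python sketch, not part of the original notes; the function name `buhlmann_equal_exposure` is an assumption, and the floor at zero follows the note on VHM ≤ 0 above.

```python
from fractions import Fraction


def buhlmann_equal_exposure(claims):
    """Nonparametric Buhlmann estimates for r risks observed over n years each."""
    r, n = len(claims), len(claims[0])
    means = [Fraction(sum(row), n) for row in claims]
    overall = sum(means) / r

    # Sample variance of each risk (divisor n - 1), averaged to give EPV.
    variances = [
        sum((Fraction(x) - mean) ** 2 for x in row) / (n - 1)
        for row, mean in zip(claims, means)
    ]
    epv = sum(variances) / r

    # Raw VHM estimate; it can be negative, in which case it is set to zero.
    vhm_raw = sum((mean - overall) ** 2 for mean in means) / (r - 1) - epv / n
    vhm = max(vhm_raw, Fraction(0))

    z = Fraction(0) if vhm == 0 else n / (n + epv / vhm)
    estimates = [z * mean + (1 - z) * overall for mean in means]
    return epv, vhm_raw, z, estimates


epv, vhm_raw, z, estimates = buhlmann_equal_exposure([[0, 3, 0], [2, 1, 2]])
print(epv, vhm_raw, z, estimates)
```

With Z = 0 both risks receive the overall mean, which is what the hand calculation concludes.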


Semiparametric estimation

It may be possible to have information about the conditional distribution $f_{X_{ij}|\Theta_i}(x|\theta_i)$ of the loss variables. For example, in our study, $X_{ij}$ may be the number of claims per exposure unit, so that the number of claims for policyholder $i$ is $m_{ij}X_{ij}$, and this might be distributed as Poisson($m_{ij}\theta_i$). Then

$$E[m_{ij}X_{ij} \mid \theta_i] = m_{ij}\theta_i$$

$$\mathrm{Var}[m_{ij}X_{ij} \mid \theta_i] = m_{ij}\theta_i$$

$$\mu(\theta_i) = E[X_{ij} \mid \theta_i] = \theta_i$$

$$\sigma^2(\theta_i) = m_{ij}\mathrm{Var}[X_{ij} \mid \theta_i] = \theta_i$$

Taking expectations of the last two lines gives $\mu = \mathrm{EPV}$.

According to this equality and the fact that $\bar{X}$ is the MLE of $\mu$ and unbiased for it, we estimate EPV by $\bar{X}$.

Special case of Bühlmann credibility (uniform exposures)

In the special case of the Bühlmann model (uniform exposures) the assumption is this: we observe $\{X_1, \dots, X_n\}$ with $X_i \mid \theta \sim$ Poisson($\theta$). In this case, as above, we have $\mu = \mathrm{EPV}$, so we use $\bar{x}$ as the estimate of EPV:

$$\mathrm{EPV} = \bar{x}$$

Furthermore, by the law of total variance, $\mathrm{Var}(X_i) = \mathrm{EPV} + \mathrm{VHM}$, so

$$\mathrm{VHM} = \mathrm{Var}(X_i) - \mathrm{EPV} \approx s^2 - \bar{x}, \qquad \text{where } s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$$

$$\mathrm{VHM} = s^2 - \bar{x}$$

Once the estimates of EPV and VHM have been calculated, we calculate $k$ and $z$. The semiparametric estimate is then (bearing in mind that $\mu = \bar{x}$)

z (average of the values used for prediction) + (1 − z) (average of the values used to get the structural parameters)

Example (SOA sample question #240). For a group of auto policyholders, you are given:

(i) The number of claims for each policyholder has a conditional Poisson distribution.

(ii) During Year 1, the following data are observed for 8000 policyholders:

Number of Claims    Number of Policyholders
0                   5000
1                   2100
2                   750
3                   100
4                   50
5+                  0

A randomly selected policyholder had one claim in Year 1.

Calculate the semiparametric empirical Bayes estimate of the number of claims in Year 2 for the same policyholder.

Solution. Here we have $\{X_1, \dots, X_{8000}\}$ with the conditional distribution $X_i \mid \theta \sim$ Poisson($\theta$). We want to estimate $E[X_{\text{new}} \mid X_{\text{old}} = 1]$. The prior distribution is not given.

$$\mathrm{EPV} = \bar{x} = \frac{(5000)(0) + (2100)(1) + (750)(2) + (100)(3) + (50)(4)}{8000} = 0.5125$$

$$s^2 = \frac{(5000)(0-0.5125)^2 + (2100)(1-0.5125)^2 + (750)(2-0.5125)^2 + (100)(3-0.5125)^2 + (50)(4-0.5125)^2}{7999} = 0.5874$$

$$\mathrm{VHM} = s^2 - \bar{x} = 0.5874 - 0.5125 = 0.0749$$

$$k = \frac{\mathrm{EPV}}{\mathrm{VHM}} = \frac{0.5125}{0.0749} = 6.8425$$

The prediction is based on $N = 1$ observation:

$$z = \frac{N}{N+k} = \frac{1}{1 + 6.8425} = 0.1275$$

Bayes estimate $= z(1) + (1-z)(0.5125) = (0.1275)(1) + (1 - 0.1275)(0.5125) = 0.5747$
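The semiparametric calculation works directly from the frequency table. The Python sketch below is illustrative only; the variable names are assumptions, and because it keeps full precision the value of k differs from the hand calculation in the third decimal place.

```python
# Year 1 data: number of claims -> number of policyholders.
counts = {0: 5000, 1: 2100, 2: 750, 3: 100, 4: 50}

n = sum(counts.values())
mean = sum(x * f for x, f in counts.items()) / n

# Sample variance with divisor n - 1, as in the definition of s^2.
s2 = sum(f * (x - mean) ** 2 for x, f in counts.items()) / (n - 1)

epv = mean                  # Poisson assumption: EPV is estimated by the mean
vhm = max(s2 - mean, 0.0)   # VHM = s^2 - mean, floored at zero
k = epv / vhm

observed_years = 1          # the policyholder is observed for one year
observed_mean = 1           # one claim in Year 1
z = observed_years / (observed_years + k)
estimate = z * observed_mean + (1 - z) * mean

print(round(mean, 4), round(s2, 4), round(z, 4), round(estimate, 4))
```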


Talk about Regression on page 21


Limited Fluctuation Credibility

Limited fluctuation credibility (also called the classical approach) updates the prediction of loss as a weighted average of the prediction based on recent data and the rate taken from the insurance manual. It distinguishes two cases:

• Full Credibility: the updated prediction is based on recent data only.

• Partial Credibility: the recent data alone are insufficient for updating the prediction, so they are combined with the manual rate.

We apply credibility theory to these measures:

• (i): Claim Frequency $N$.

• (ii): Aggregate Loss $S$.

• (iii): Claim Severity. The average claim severity is $S/N$.

• (iv): Pure Premium. If $E$ denotes the number of exposure units, then the quotient $S/E$ is called the pure premium.

Note. The claim frequency $N$ is random, but the number of exposure units $E$ is fixed over time (like the number of workers covered by a workers' compensation plan).

Note. If the predicted loss value based on the company's manual is denoted by $M$, and the predicted value based on the recent data is denoted by $D$, then the updated prediction is the weighted combination

$$ZD + (1-Z)M$$

The value $Z$ is called the credibility factor. If $Z = 1$, then we say that full credibility has been obtained. If $0 < Z < 1$, then partial credibility has been obtained.

In the classical credibility approach, the minimum size of data required for full credibility is called the standard for full credibility.

Full credibility for claim frequency

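Convention. In classical credibility, we say that full credibility has been reached if, with a probability of at least $p$ (large enough to give enough confidence), the relative error $\frac{N-E(N)}{E(N)}$ is small in absolute value:

$$\left|\frac{N-E(N)}{E(N)}\right| < r, \qquad r \text{ a small number.}$$

Writing $\mu = E(N)$ and $\sigma = s(N)$ for the mean and standard deviation of $N$, and using the normal approximation:

$$p \le P\left(\left|\frac{N-E(N)}{E(N)}\right| < r\right) = P\left(|N-E(N)| < rE(N)\right) = P\left(\left|\frac{N-E(N)}{s(N)}\right| < \frac{rE(N)}{s(N)}\right) = P\left(|N(0,1)| < \frac{r\mu}{\sigma}\right)$$

$$\Rightarrow \quad z_{\frac{1+p}{2}} \le \frac{r\mu}{\sigma}$$

In particular, if $N \sim$ Poisson($\lambda$), then $\mu = \lambda$ and $\sigma = \sqrt{\lambda}$, so

$$z_{\frac{1+p}{2}} \le \frac{r\lambda}{\sqrt{\lambda}} = r\sqrt{\lambda}$$

Therefore, full credibility is attained if

$$\lambda \ge \left(\frac{z_{\frac{1+p}{2}}}{r}\right)^2$$

In practice, if $n$ is the observed number of claims under the Poisson assumption, then for full credibility we check for

$$n \ge \left(\frac{z_{\frac{1+p}{2}}}{r}\right)^2$$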

Example. An insurance company wants to assign full credibility to 800 claims or more. What is the required coverage probability for the number of claims to be within 8% of the true value? Assume that the claim frequency is Poisson and the normal approximation applies (i.e. $\lambda$ is large).

Solution.

$$800 = \left(\frac{z_{\frac{1+p}{2}}}{0.08}\right)^2 \quad \Rightarrow \quad z_{\frac{1+p}{2}} = 2.2627 \quad \Rightarrow \quad p = 97.63\%$$

Example. Recent experience has given the mean accident rate to be 0.045 and the standard for full credibility of claims to be 1200. For a group with similar risk, what is the number of exposure units for full credibility?

Solution. The standard for full credibility based on exposure units:

$$\frac{1200}{0.045} = 26{,}667 \text{ exposure units}$$
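The two frequency examples above can be checked with the standard normal distribution in Python's standard library. This is a sketch under the Poisson and normal-approximation assumptions stated above; the function names `frequency_standard` and `coverage_probability` are illustrative, not from the notes.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def frequency_standard(p, r):
    """Expected number of claims needed for full credibility (Poisson frequency)."""
    z = STD_NORMAL.inv_cdf((1 + p) / 2)
    return (z / r) ** 2


def coverage_probability(standard, r):
    """Invert the standard: the probability p implied by a given number of claims."""
    z = r * sqrt(standard)
    return 2 * STD_NORMAL.cdf(z) - 1


# First example: full credibility at 800 claims, within 8% of the true value.
p_800 = coverage_probability(800, 0.08)

# Second example: convert a claims standard into exposure units.
exposure_units = 1200 / 0.045

print(round(p_800, 4), round(exposure_units))
```

Feeding `p_800` back into `frequency_standard(p_800, 0.08)` returns approximately 800, which confirms that the two functions are inverses of each other.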

Full credibility for claim severity

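Here we assume an i.i.d. sample $\{X_1, \dots, X_n\}$ of severity random variables with mean $\mu$ and variance $\sigma^2$. In this case, we say that full credibility is attained if

$$p \le P\left(\left|\frac{\bar{X}-E(\bar{X})}{E(\bar{X})}\right| < r\right)$$

But $E(\bar{X}) = \mu$ and $s(\bar{X}) = \frac{\sigma}{\sqrt{n}}$. So we can write:

$$p \le P\left(\left|\frac{\bar{X}-E(\bar{X})}{E(\bar{X})}\right| < r\right) = P\left(|\bar{X}-E(\bar{X})| < rE(\bar{X})\right) = P\left(\left|\frac{\bar{X}-E(\bar{X})}{s(\bar{X})}\right| < \frac{rE(\bar{X})}{s(\bar{X})}\right) = P\left(|N(0,1)| < \frac{\sqrt{n}\,r\mu}{\sigma}\right)$$

$$\Rightarrow \quad z_{\frac{1+p}{2}} \le \frac{\sqrt{n}\,r\mu}{\sigma} \quad \Rightarrow \quad n \ge \left(\frac{z_{\frac{1+p}{2}}}{r}\right)^2\left(\frac{\sigma}{\mu}\right)^2$$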

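Example. Suppose that the estimates for the mean and variance of the severity are 1000 and 2,000,000 respectively. Find the standard for full credibility for $p = 0.99$ and $r = 0.05$.

Solution.

$$z_{\frac{1+p}{2}} = z_{0.995} = 2.5758$$

$$\text{standard} = \left(\frac{z_{\frac{1+p}{2}}}{r}\right)^2\left(\frac{\sigma}{\mu}\right)^2 = \left(\frac{2.5758}{0.05}\right)^2\frac{\sigma^2}{\mu^2} = \left(\frac{2.5758}{0.05}\right)^2\frac{2{,}000{,}000}{(1000)^2} = 5308$$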
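As a quick numerical check of the severity standard, the following Python sketch computes the same quantity with the exact normal quantile. The function name `severity_standard` is an assumption made for illustration.

```python
from math import ceil
from statistics import NormalDist


def severity_standard(p, r, mean, variance):
    """Number of claims needed for full credibility of the average severity."""
    z = NormalDist().inv_cdf((1 + p) / 2)
    # (z / r)^2 times the squared coefficient of variation sigma^2 / mu^2
    return (z / r) ** 2 * variance / mean ** 2


standard = severity_standard(p=0.99, r=0.05, mean=1000, variance=2_000_000)
claims_needed = ceil(standard)  # round up to a whole number of claims
print(round(standard, 2), claims_needed)
```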


Full credibility for aggregate loss

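Here we have $S = X_1 + \cdots + X_N$, where the $X_i$'s have common mean $\mu_X$ and common variance $\sigma_X^2$. If $N$ is Poisson, then

$$\mu_S = \mu_N\mu_X = \lambda\mu_X$$

$$\sigma_S^2 = \lambda(\mu_X^2 + \sigma_X^2)$$

The criterion for full credibility is:

$$p \le P\left(\left|\frac{S-E(S)}{E(S)}\right| < r\right)$$

Similar to the previous section, one gets:

$$p \le P\left(\left|\frac{S-E(S)}{E(S)}\right| < r\right) = P\left(|S-E(S)| < rE(S)\right) = P\left(\left|\frac{S-E(S)}{s(S)}\right| < \frac{rE(S)}{s(S)}\right)$$

$$= P\left(|N(0,1)| < \frac{r\mu_S}{\sigma_S}\right) = P\left(|N(0,1)| < \frac{r\lambda\mu_X}{\sqrt{\lambda(\mu_X^2 + \sigma_X^2)}}\right)$$

$$\Rightarrow \quad z_{\frac{1+p}{2}} \le \frac{r\lambda\mu_X}{\sqrt{\lambda(\mu_X^2 + \sigma_X^2)}} \quad \Rightarrow \quad \lambda \ge \left(\frac{z_{\frac{1+p}{2}}}{r}\right)^2\left\{1 + \left(\frac{\sigma_X}{\mu_X}\right)^2\right\}$$

In practice, if $n$ is the observed number of claims under the Poisson assumption, then for full credibility we check for

$$n \ge \left(\frac{z_{\frac{1+p}{2}}}{r}\right)^2\left\{1 + \left(\frac{\sigma_X}{\mu_X}\right)^2\right\}$$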

Note. The expression on the right-hand side is:

$$\left(\frac{z_{\frac{1+p}{2}}}{r}\right)^2\left\{1 + \left(\frac{\sigma_X}{\mu_X}\right)^2\right\} = \left(\frac{z_{\frac{1+p}{2}}}{r}\right)^2 + \left(\frac{z_{\frac{1+p}{2}}}{r}\right)^2\left(\frac{\sigma_X}{\mu_X}\right)^2$$

So for a Poisson claim distribution we have:

standard for full credibility of aggregate loss = standard for full credibility of claim frequency + standard for full credibility of claim severity

Full credibility for pure premium

Pure premium $P = \frac{S}{E}$, where the number of exposure units, $E$, is a constant, is the premium charged to cover losses before taking into consideration the profits and expenses. Since $P$ and $S$ differ by a constant factor only, we have $\frac{\mu_P}{\sigma_P} = \frac{\mu_S}{\sigma_S}$. Therefore in the expression

$$p \le P\left(\left|\frac{S-E(S)}{s(S)}\right| < \frac{rE(S)}{s(S)}\right)$$

one can substitute $\frac{\mu_P}{\sigma_P}$ for $\frac{\mu_S}{\sigma_S}$, and then the standard for full credibility of the pure premium is the same as that of the aggregate loss. See problem 2 of the SOA sample questions.

Partial Credibility

Assume that $W$ is any of the loss measures claim frequency, claim severity, or aggregate loss. When the risk group is not large enough, full credibility may not be achieved, in which case a combination $ZW + (1-Z)M$ is taken, where $0 < Z < 1$. The number $Z$ is determined such that the event $|ZW - ZE(W)| \le rE(W)$ occurs with probability $p$. For example, for the case of claim frequency $N$ we want to have

$$p = P\left(|ZN - ZE(N)| < rE(N)\right) = P\left(\left|\frac{N-E(N)}{s(N)}\right| < \frac{rE(N)}{Zs(N)}\right) = P\left(|N(0,1)| < \frac{r\mu}{Z\sigma}\right)$$

$$\Rightarrow \quad z_{\frac{1+p}{2}} = \frac{r\mu}{Z\sigma} = \frac{r\lambda}{Z\sqrt{\lambda}} = \frac{r\sqrt{\lambda}}{Z} \quad \Rightarrow \quad Z = \left(\frac{r}{z_{\frac{1+p}{2}}}\right)\sqrt{\lambda}$$

The other two cases are similar. In summary:

claim frequency: $Z = \left(\frac{r}{z_{\frac{1+p}{2}}}\right)\sqrt{\lambda} = \sqrt{\frac{\lambda}{\text{standard}}}$

claim severity: $Z = \left(\frac{r}{z_{\frac{1+p}{2}}}\right)\sqrt{\frac{N}{C_X^2}} = \sqrt{\frac{N}{\text{standard}}}$, with $C_X$ being the coefficient of variation

aggregate loss: $Z = \left(\frac{r}{z_{\frac{1+p}{2}}}\right)\sqrt{\frac{\lambda}{1 + C_X^2}} = \sqrt{\frac{\lambda}{\text{standard}}}$

In each case,

$$Z = \sqrt{\frac{\text{the quantity to be compared with the standard for full credibility}}{\text{standard}}}$$

Example (exercise 17.7 of the textbook ∗). The average claim size for a group of insureds is 1,500 with a standard deviation of 7,500. Assume that claim counts have the Poisson distribution. Determine the expected number of claims so that the total loss will be within 6% of the expected total loss with probability 0.90.

Solution.

$$z_{\frac{1+p}{2}} = z_{0.95} = 1.645$$

$$\left(\frac{z_{\frac{1+p}{2}}}{r}\right)^2\left\{1 + \left(\frac{\sigma}{\mu}\right)^2\right\} = \left(\frac{1.645}{0.06}\right)^2\left\{1 + \left(\frac{7500}{1500}\right)^2\right\} = 19543.51 \quad \Rightarrow \quad 19544 \text{ claims}$$

Example (exercise 17.8 of the textbook ∗). A group of insureds had 6,000 claims and a total loss of 15,600,000. The prior estimate of the total loss was 16,500,000. Determine the limited fluctuation credibility estimate of the total loss for the group. Use the standard for full credibility determined in the previous example.

Solution.

$$Z = \sqrt{\frac{\lambda}{\text{standard}}} = \sqrt{\frac{6000}{19543.51}} = 0.55408$$

$$ZW + (1-Z)M = (0.55408)(15{,}600{,}000) + (1 - 0.55408)(16{,}500{,}000) = 16{,}001{,}328$$
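The last two examples chain together: first a standard for full credibility of the aggregate loss, then a partial credibility factor from the square-root rule. The Python sketch below uses the table value z = 1.645 so that it reproduces the hand calculation; the function names are assumptions, and capping Z at 1 is an added convention for the full-credibility case.

```python
from math import sqrt


def aggregate_standard(z, r, mean, sd):
    """Expected claims for full credibility of aggregate loss (Poisson frequency)."""
    return (z / r) ** 2 * (1 + (sd / mean) ** 2)


def credibility_factor(observed, standard):
    """Square-root rule for partial credibility, capped at full credibility."""
    return min(1.0, sqrt(observed / standard))


# Exercise 17.7: severity mean 1,500, sd 7,500, within 6% with probability 0.90.
standard = aggregate_standard(z=1.645, r=0.06, mean=1500, sd=7500)

# Exercise 17.8: 6,000 observed claims, observed loss W, prior estimate M.
Z = credibility_factor(6000, standard)
W, M = 15_600_000, 16_500_000
estimate = Z * W + (1 - Z) * M

print(round(standard, 2), round(Z, 5), round(estimate))
```

Carrying Z at full precision gives an estimate a couple of units below 16,001,328, which was computed with Z rounded to 0.55408.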

Example. A portfolio of policies has 896 claims in the current period with a mean loss of 45 and a variance of 5067. Full credibility is based on a coverage probability of 98% for a range of within 10% of the true mean. The mean frequency of claims is 0.09 per policy and the portfolio has 18600 policies. Calculate $Z$ for the claim frequency, claim severity, and aggregate loss.

Solution.

Part 1.

expected claim frequency for the portfolio = (18600)(0.09) = 1674

$$z_{\frac{1+p}{2}} = z_{0.99} = 2.3263$$

Full credibility standard for claim frequency:

$$\left(\frac{z_{\frac{1+p}{2}}}{r}\right)^2 = \left(\frac{2.3263}{0.1}\right)^2 = 541.17 < \lambda = 1674 \quad \Rightarrow \quad \text{full credibility for claim frequency ✓}$$

Part 2.

Coefficient of variation for claim severity:

$$C_X = \frac{\sqrt{5067}}{45} = 1.5818$$

Full credibility standard for claim severity:

$$\text{standard for claim frequency} \times C_X^2 = (541.17)(1.5818)^2 = 1354.13 > 896 \quad \Rightarrow \quad \text{partial credibility for claim severity ✓}$$

Now we calculate the partial credibility factor for claim severity:

$$Z = \sqrt{\frac{896}{1354.13}} = 0.8134$$

Part 3.

The full credibility standard for aggregate loss:

$$\text{sum of the two standards found above} = 541.17 + 1354.13 = 1895.30 > \lambda = 1674 \quad \Rightarrow \quad \text{partial credibility for aggregate loss}$$

Partial credibility factor for aggregate loss:

$$Z = \sqrt{\frac{1674}{1895.30}} = 0.9398$$

Example (exercise 17.13 of the textbook ∗). The number of claims has the Poisson distribution. The number of claims and the claim severity are independent. Individual claim amounts can be 1, 2, or 10 with probabilities 0.5, 0.3, and 0.2, respectively. Determine the expected number of claims needed so that the total cost of claims is within 10% of the expected cost with 90% probability.

Solution.

$$E(X) = (0.5)(1) + (0.3)(2) + (0.2)(10) = 3.1$$

$$E(X^2) = (0.5)(1)^2 + (0.3)(2)^2 + (0.2)(10)^2 = 21.7$$

$$\mathrm{Var}(X) = E(X^2) - E(X)^2 = 21.7 - (3.1)^2 = 12.09$$

$$z_{\frac{1+p}{2}} = z_{0.95} = 1.645$$

$$\left(\frac{z_{\frac{1+p}{2}}}{r}\right)^2\left\{1 + \left(\frac{\sigma}{\mu}\right)^2\right\} = \left(\frac{1.645}{0.1}\right)^2\left\{1 + \left(\frac{\sqrt{12.09}}{3.1}\right)^2\right\} = 611.04 \quad \Rightarrow \quad 612 \text{ claims}$$
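For a discrete severity distribution the moments and the standard can be computed in a few lines. This Python sketch is illustrative only: the variable names are assumptions, and it uses the table value z = 1.645 as in the solution above (the exact quantile is slightly smaller and gives a marginally lower standard).

```python
from math import ceil

amounts = [1, 2, 10]
probs = [0.5, 0.3, 0.2]

# First two moments of the claim amount X.
mean = sum(p * x for p, x in zip(probs, amounts))
second_moment = sum(p * x ** 2 for p, x in zip(probs, amounts))
variance = second_moment - mean ** 2

z = 1.645  # table value of the 95th percentile of N(0, 1)
r = 0.10   # total cost within 10% of its expected value

# Full credibility standard for aggregate loss with Poisson frequency.
standard = (z / r) ** 2 * (1 + variance / mean ** 2)
claims_needed = ceil(standard)

print(round(mean, 2), round(variance, 2), round(standard, 2), claims_needed)
```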
