Insurance with riders

In this post, we will build a model for an insurance product with riders. The model will be built in Python using the cashflower package. An interesting part of this project will be that we will use two model point sets.


List of content:

  1. Data format
  2. Preparation
  3. Model
  4. Results

Data format

In this post, we will model an insurance product where a policyholder can have multiple coverages. For example, the policyholder can be insured against death and illness within the same policy.

In terms of model point data, the easiest approach would be to have additional columns in the main model point set. These columns would contain information on the additional coverages.

However, with many coverages, the model point file might grow out of proportion. Also, some policyholders can have one coverage and others multiple. If the policyholder is insured only against death then the remaining columns would remain empty.

The model point set could look like this.

print(main.data)

id  age    sex  premium sum_assured_death sum_assured_illness
 1   32   MALE    140.0        100 000.00           30 000.00
 2   48 FEMALE    150.0        120 000.00                 NaN

A more efficient solution is to store coverage data in the long format in a separate file. If the policyholder has 2 coverages, then there will be 1 record in the main model point set and 2 in the additional one. If the policyholder has only 1 coverage, there will be 1 record in both of the files.

print(main.data)

id  age    sex  premium
 1   32   MALE    140.0
 2   48 FEMALE    150.0

print(coverage.data)

id  sum_assured     type
 1   100 000.00    DEATH
 1    30 000.00  ILLNESS
 2   120 000.00    DEATH

The first policyholder has two coverages (against death and illness) and the second policyholder has one coverage (only against death).

Preparation

Install the cashflower package, if you don't have it yet.

terminal
pip install cashflower

Create a new model named riders.

python console
from cashflower import create_model

create_model("riders")

The new folder riders has been created with the initial files structure.

Let's prepare model point sets and assumptions for the model. For this model, we will use two model point sets: one for main policy data and the second for coverages.

riders/input.py
import pandas as pd
from cashflower import ModelPointSet

main = ModelPointSet(data=pd.DataFrame({
    "id": [1, 2],
    "age": [32, 42],
    "sex": ["MALE", "FEMALE"],
    "premium": [140, 150]
}))

coverage = ModelPointSet(data=pd.DataFrame({
    "id": [1, 1, 2],
    "sum_assured": [100_000, 30_000, 120_000],
    "type": ["DEATH", "ILLNESS", "DEATH"],
}))

The first policyholder is a 32-year-old male. The insurance company will pay a benefit of €100 000 in case of death and €30 000 in case of illness. The second policyholder is a 42-year-old female. She has only coverage for the event of death with a sum assured of €120 000. They pay €140 and €150 premiums respectively.

We will use 3 assumptions for the model: mortality rate, morbidity rate and interest rate. For simplicity, we will assume that each of these assumptions is a single value. In production models, these assumptions will form a table. For mortality and morbidity, the rates depend on age and sex. For the interest rate, the value depends on the term.

riders/model.py
DEATH_PROB = 0.003
MORB_PROB = 0.008
INTEREST_RATE = 0.005

We have all the input data that we need. Let's start modelling!

Model

Let's start with modelling survival rate and expected premium.

riders/model.py
from cashflower import variable

from input import main, coverage
from settings import settings

@variable()
def survival_rate(t):
    if t == 0:
        return 1 - DEATH_PROB
    return survival_rate(t-1) * (1 - DEATH_PROB)


@variable()
def expected_premium(t):
    if t == 0:
        return 0
    return main.get("premium") * survival_rate(t-1)

Now we will calculate the expected benefit.

We have multiple records in the coverage file, so we can model this variable as a list.

riders/model.py
@variable()
def expected_coverage_benefit(t):
    benefits = []
    for i in range(coverage.model_point_data.shape[0]):
      if t == 0:
          benefits.append(0)
      else:
        sum_assured = coverage.get("sum_assured, i)
        _type = coverage.get("type", i)

        if _type == "DEATH":
            benefits.append(survival_rate(t - 1) * DEATH_PROB * sum_assured)
        elif _type == "ILLNESS":
            benefits.append(survival_rate(t - 1) * DEATH_PROB * sum_assured)
        else:
            raise ValueError(f"Unknown coverage type {_type}.")
    return benefits

The output of this variable is a list.

The first policyholder has two coverages, so the list will contain two elements. The first element will be the expected benefit in case of death and the second element will be the expected benefit in case of illness.

The second policyholder has only one coverage, so the list will contain only one element. We will see it later on in the results.

We use coverage.model_point_data.shape[0] to get the number of records for the given model point.

Knowing the expected benefit for each of the coverages, we can calculate the overall expected benefit.

riders/model.py
@variable()
def expected_benefit(t):
    return sum(expected_coverage_benefit(t))

We sum up all the elements of the list.

We can now calculate the present value of expected cash flows and the best estimate of liabilities.

riders/model.py
@variable()
def pv_expected_premium(t):
    if t == settings["T_MAX_CALCULATION"]:
        return expected_premium(t)
    return expected_premium(t) + pv_expected_premium(t+1) * 1/(1+INTEREST_RATE)


@variable()
def pv_expected_benefit(t):
    if t == settings["T_MAX_CALCULATION"]:
        return expected_benefit(t)
    return expected_benefit(t) + pv_expected_benefit(t+1) * 1/(1+INTEREST_RATE)


@variable()
def best_estimate_liabilities(t):
    return pv_expected_benefit(t) - pv_expected_premium(t)

Results

We will produce the individual output, to see clearly how the model evaluated variables for each of the model points.

settings.py
settings = {
    "AGGREGATE": False,
    ...
}

To run the model, source the run.py script.

output/<timestamp>_output.csv
  t  survival_rate expected_coverage_benefit  expected_premium  pv_expected_premium  expected_benefit  pv_expected_benefit  best_estimate_liabilities
  0       0.997000                    [0, 0]              0.00             17392.26              0.00             48449.76                   31057.50
  1       0.994009            [299.1, 89.73]            139.58             17479.22            388.83             48692.01                   31212.79
  2       0.991027            [298.2, 89.46]            139.16             17426.34            387.66             48544.70                   31118.36
  3       0.988054           [297.31, 89.19]            138.74             17373.62            386.50             48397.83                   31024.21
  4       0.985090           [296.42, 88.92]            138.33             17321.05            385.34             48251.39                   30930.34
  5       0.982134           [295.53, 88.66]            137.91             17268.63            384.19             48105.38                   30836.75
  6       0.979188           [294.64, 88.39]            137.50             17216.37            383.03             47959.80                   30743.43
...
714       0.116691            [35.11, 10.53]             16.39               112.00             45.64               312.00                     200.00
715       0.116341             [35.01, 10.5]             16.34                96.09             45.51               267.69                     171.60
716       0.115992             [34.9, 10.47]             16.29                80.15             45.37               223.29                     143.14
717       0.115644             [34.8, 10.44]             16.24                64.18             45.24               178.81                     114.63
718       0.115297            [34.69, 10.41]             16.19                48.18             45.10               134.24                      86.06
719       0.114951            [34.59, 10.38]             16.14                32.15             44.97                89.59                      57.44
720       0.114606            [34.49, 10.35]             16.09                16.09             44.84                44.84                      28.75
  0       0.997000                       [0]              0.00             18634.56              0.00             44722.81                   26088.25
  1       0.994009                  [358.92]            149.55             18727.73            358.92             44946.42                   26218.69
  2       0.991027                  [357.84]            149.10             18671.07            357.84             44810.44                   26139.37
  3       0.988054                  [356.77]            148.65             18614.58            356.77             44674.86                   26060.28
  4       0.985090                   [355.7]            148.21             18558.26            355.70             44539.68                   25981.42
  5       0.982134                  [354.63]            147.76             18502.10            354.63             44404.90                   25902.80
  6       0.979188                  [353.57]            147.32             18446.11            353.57             44270.52                   25824.41
...
715       0.116341                   [42.01]             17.50               102.94             42.01               247.08                     144.14
716       0.115992                   [41.88]             17.45                85.87             41.88               206.10                     120.23
717       0.115644                   [41.76]             17.40                68.76             41.76               165.04                      96.28
718       0.115297                   [41.63]             17.35                51.62             41.63               123.90                      72.28
719       0.114951                   [41.51]             17.29                34.44             41.51                82.68                      48.24
720       0.114606                   [41.38]             17.24                17.24             41.38                41.38                      24.14

Notes:

  • there are two sets of results one under another because there are two model points,
  • expected_coverage_benefit is a list with two elements for the first policyholder and one element for the second policyholder,
  • sum of expected_coverage_benefit amounts to expected_benefit.

We have dealt with riders by creating a model point set with a long format and using lists are a result of model variables.


Thank you for reading! If you have any questions, feel free to leave a comment below or on the github repository.

Read also:

Log in to add your comment.