A predictive machine learning model for non-life insurance pricing
Competitive and regulatory context
In a context of ever-increasing competition and heightened regulatory pressure, actuarial accuracy, together with transparency and understanding of the tariff, are key issues in non-life insurance pricing. On the one hand, the traditionally used Generalized Linear Model (GLM) yields a multiplicative structure that favors interpretability and ease of use. On the other hand, machine learning and deep learning techniques capture complex non-linear relationships between variables and improve tariff performance. However, these black-box predictive models struggle to provide transparency into their decision-making processes. As a result, insurers often face a trade-off between performance and interpretability when selecting a predictive method.
Rise of Artificial Intelligence
Recent years have been marked by the development of explainable artificial intelligence techniques, in line with efforts to enhance the transparency, explainability and interpretability of models. Applied in the insurance context, these tools help to better understand the relationships between the target variable and the input features, both globally and locally. They also enable the ranking of features by importance as well as the assessment of interactions between features.
Explainable Boosting Machine model
In light of this recent work on the interpretability of machine learning models, we introduce the Explainable Boosting Machine (EBM) model for non-life insurance pricing. EBM builds on the Generalized Additive Model (GAM) framework and uses bagged and boosted decision trees as shape functions. More precisely, it is fitted by cyclic gradient boosting, a technique based on classical cyclic coordinate descent optimization, which ensures interpretability through feature-by-feature learning of the shape functions. This approach offers the advantage of being directly interpretable without relying on post-hoc interpretation techniques, making it better suited to decision-making processes.
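To illustrate the cyclic fitting idea, the following is a minimal NumPy sketch, not the actual EBM implementation: it assumes squared-error loss and piecewise-constant binned shape functions in place of bagged-boosted trees, and the function names are ours. Each boosting round cycles over the features, updating one shape function at a time on the current residuals, which is the coordinate-descent-style mechanism described above.

```python
import numpy as np

def fit_cyclic_gam(X, y, n_bins=8, n_rounds=200, lr=0.1):
    """Fit an additive model f(x) = intercept + sum_j f_j(x_j) by cyclic
    gradient boosting: each round updates one shape function at a time
    on the current residuals (squared-error loss, illustrative only)."""
    n, p = X.shape
    # Bin each feature at empirical quantiles; each shape function is
    # piecewise constant, with one value per bin.
    edges = [np.quantile(X[:, j], np.linspace(0, 1, n_bins + 1)[1:-1])
             for j in range(p)]
    bins = np.stack([np.digitize(X[:, j], edges[j]) for j in range(p)], axis=1)
    intercept = y.mean()
    shapes = np.zeros((p, n_bins))        # f_j values, one row per feature
    pred = np.full(n, intercept)
    for _ in range(n_rounds):
        for j in range(p):                # cyclic pass over the features
            resid = y - pred
            # Best piecewise-constant update: mean residual per bin.
            upd = np.zeros(n_bins)
            for b in range(n_bins):
                mask = bins[:, j] == b
                if mask.any():
                    upd[b] = resid[mask].mean()
            shapes[j] += lr * upd         # small step keeps updates balanced
            pred += lr * upd[bins[:, j]]  # refresh predictions in place
    return intercept, edges, shapes

def predict(intercept, edges, shapes, X):
    """Sum the intercept and each feature's shape-function value."""
    bins = np.stack([np.digitize(X[:, j], edges[j])
                     for j in range(X.shape[1])], axis=1)
    return intercept + sum(shapes[j][bins[:, j]] for j in range(X.shape[1]))
```

Because the small learning rate spreads each feature's contribution over many rounds, correlated features share credit gradually rather than one feature absorbing the whole signal, which is part of what makes the resulting shape functions readable.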
Ensuring the interpretability of machine learning models
The EBM model presented in this paper is classified as a glass-box model, i.e. a model that combines an intrinsically interpretable structure with high prediction performance. The use of shape functions makes it possible to visualize the local contribution of each feature to the final prediction and to compute feature importance scores, which is valuable for model interpretation and feature selection. Moreover, the GA2M selection algorithm used in the EBM method accounts for pairwise interaction terms, which remain interpretable through heatmap plots.
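The importance scores mentioned above can be derived directly from the shape functions. The sketch below is a hedged illustration, assuming a binned GAM whose shape functions are stored as per-bin value arrays (a hypothetical representation; the function name is ours): each term is first centered to mean zero over the training data, with the means folded into the intercept, and its importance is then the mean absolute contribution across the training samples.

```python
import numpy as np

def center_and_rank(intercept, shapes, bins):
    """Center each shape function to mean zero over the training data
    (folding the per-term means into the intercept) and compute an
    importance score per term: the mean absolute contribution.

    shapes: (p, n_bins) array of per-bin shape-function values.
    bins:   (n, p) integer array of each sample's bin per feature."""
    shapes = shapes.copy()
    p = shapes.shape[0]
    for j in range(p):
        contrib = shapes[j][bins[:, j]]   # term j's value for each sample
        shift = contrib.mean()
        intercept += shift                # move the mean into the intercept
        shapes[j] -= shift                # so each term averages to zero
    importances = np.array(
        [np.abs(shapes[j][bins[:, j]]).mean() for j in range(p)])
    return intercept, shapes, importances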
Our contribution thus aligns with a recent stream of literature focused on the development of both accurate and inherently interpretable machine learning models.
This paper examines the interpretability of the EBM model through an application to claim frequency and severity modeling in car insurance. To evaluate the robustness of the EBM model's results, we compare both its prediction performance and the explanations it provides with those of other benchmark machine learning models.