The truth about GAMs, GLMs and GBMs in insurance pricing
Introduction
AI-powered pricing sophistication is a relatively new journey for insurers, but one departure they definitely do not want to miss. The consistent application of Generalized Additive or Generalized Linear Model structures (GAMs or GLMs) across all risk models and product lines is commonly regarded as the prerequisite for starting this journey. But the VIP departure gate for forward-looking insurers is the use of AI-based and machine-learning pricing tools in production. This next frontier holds the potential to unlock game-changing advantages:
Higher predictive performance of risk and pricing models;
Accelerated time to market, thanks to the speed-to-accuracy properties of AI-powered modeling compared to cumbersome manual computation.
The race to such benefits is multi-faceted and hides off-the-beaten-path pitfalls. Insurance carriers' data labs have seen an explosion of blackbox algorithms over the past few years: Gradient Boosting Machines (GBMs) like XGBoost, Support Vector Machines, Random Forests, Neural Networks and Deep Learning, to name a few.
Yet the reality of the insurance industry remains that no regulator is likely to blindly support a rating plan that relies on blackbox models any time soon. This was reaffirmed as recently as August 2020, when the US National Association of Insurance Commissioners (NAIC) adopted five guiding principles regarding the use of AI in insurance: Fair & ethical, Accountable, Compliant, Transparent and Secure / safe / robust, summarized with the acronym 'FACTS'. Though these guidelines apply to the US, the largest P&C insurance market in the world, virtually all insurance regulators worldwide share a common distrust of blackbox modeling applied to rating plans.
As such, if AI benefits are to be allowed out of insurers' data labs, to go into production and deliver on their speed-to-accuracy promise at scale, they have to operate within such guidelines.
Akur8 GAMs and GBMs share some commonalities that form the basis of enhanced performance
Akur8 provides a tool designed to fit GAMs. The GAMs built by Akur8 combine the strength of classical GAMs found in the literature, for which the model can be decomposed into the effect of each explanatory variable, and that of GLMs, for which link functions and various loss distributions can be used.
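In standard textbook notation (shown here for illustration, not as a description of Akur8's internals), the two structures differ only in the term attached to each explanatory variable:

```latex
\text{GLM:}\quad g\big(\mathbb{E}[Y]\big) = \beta_0 + \sum_{j} \beta_j x_j
\qquad\qquad
\text{GAM:}\quad g\big(\mathbb{E}[Y]\big) = \beta_0 + \sum_{j} f_j(x_j)
```

Here g is the link function (typically the log, yielding multiplicative models) and Y follows an exponential-family distribution, for instance Poisson for claim frequency or Gamma for severity. The GAM keeps the additive, per-variable decomposition while letting each f_j bend freely.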
The different types of GAMs leveraged by Akur8 are listed below (in the blue insert) alongside other classes of models: Linear Models and Generalized Linear Models (which assume a linear relation between explanatory variables and the target) and blackbox models, which can only be expressed through extremely complex formulas.
GBMs fall in the blackbox category: these models cannot be expressed by a tractable formula.
Akur8 models and GBMs share several statistical properties that allow actuaries to build highly stable models with little data preparation, saving them significant time while generating models of superior predictive performance (a minimal illustration follows the list):
Both search for the model that best represents the observed portfolio by maximizing the statistical likelihood;
The final model is made robust and stable, for example by grouping together effects that do not carry significant signal, which introduces a notion of credibility into the model;
The required data preparation is purposely light: the flexibility built into the models allows them to capture non-linear effects with ease, and robustness is achieved through the fitting process rather than via manual work.
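As a concrete, simplified illustration of these shared properties, the sketch below fits a GAM-style frequency model on simulated data with scikit-learn: a spline basis supplies the flexible per-variable effect, and a penalized Poisson likelihood supplies the robustness. The data and parameters are invented for illustration, and this is not Akur8's algorithm:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer
from sklearn.linear_model import PoissonRegressor

rng = np.random.default_rng(1)
n = 50_000
age = rng.uniform(18, 80, size=(n, 1))              # single rating variable: driver age
# Simulated U-shaped claim frequency: high for young drivers, rising again with age
true_freq = 0.05 + 0.10 * np.exp(-(age[:, 0] - 18) / 8) + 2e-5 * (age[:, 0] - 40) ** 2
claims = rng.poisson(true_freq)                     # observed claim counts

# GAM-style fit: spline basis per variable + penalized Poisson likelihood.
# The penalty (alpha) smooths the effect, playing the credibility role described above.
gam = make_pipeline(
    SplineTransformer(n_knots=10, degree=3),
    PoissonRegressor(alpha=1e-3, max_iter=1000),
)
gam.fit(age, claims)

# Because the model is additive, the fitted age effect can be read off directly:
grid = np.linspace(18, 80, 63).reshape(-1, 1)
age_effect = gam.predict(grid)                      # plot this curve to review or edit it
```

Note that the non-linear age effect is captured without any manual banding of the variable: the data preparation is light, and the smoothing is part of the fit itself.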
They also differ in important ways:
In terms of structure: Gradient Boosting typically produces prediction models in the form of decision trees, which is a very different structure from GAMs;
With GBMs, it is impossible to precisely review the trade-off between the model's sensitivity and robustness and, if no good trade-off exists, it is impossible to edit the model;
GBMs do not provide variable selection - they use a greedy approach, trying to predict as well as possible, and do not allow additional constraints.
The key difference between GAMs (and their variations, like the ones used in Akur8) and GBMs though is that, while GAMs can be decomposed and directly analysed, GBMs cannot.
To make up for this limitation in the case of GBMs, explainability comes into play: various methods have been suggested to measure the effect of each variable. Yet these measures are not a direct view of the model - which would not be possible, due to the very high number of complex interactions between variables - but rather a simplified reverse-engineering of the model. As stated above: explainable differs from transparent. The situation is akin to a researcher trying to understand a complex physical phenomenon by simplifying it and creating a simple model of it, except that here the "physical phenomenon" is the GBM itself.
Among these explainability methods, one can cite PDP (Partial Dependence Plots), ICE (Individual Conditional Expectation), ALE (Accumulated Local Effects) and Shapley values; the simplified views of the variables' effects and importance that they output are very popular. They do not, however, provide any direct visualization of the models, but rather simplified trends that give some insight into their behavior. These insights are valuable but clearly insufficient in the context of insurance pricing. Again, explainability is not equal to transparency.
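To make the contrast concrete, the sketch below trains a small GBM on simulated data and draws its PDP and ICE curves with scikit-learn. The features and the interaction are hypothetical, chosen only to show how the averaged PDP hides individual behavior:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay

rng = np.random.default_rng(2)
n = 20_000
X = np.column_stack([
    rng.uniform(18, 80, n),   # feature 0: driver age
    rng.uniform(0, 30, n),    # feature 1: vehicle age
])
# Simulated frequency with a driver-age x vehicle-age interaction
freq = 0.05 + 0.002 * (80 - X[:, 0]) * (X[:, 1] > 15)
claims = rng.poisson(freq)

gbm = HistGradientBoostingRegressor(loss="poisson").fit(X, claims)

# kind="both" overlays ICE curves (one per sampled contract) on the PDP
# (their average). The spread of the ICE curves around the PDP is exactly
# the interaction signal that the averaged view hides.
PartialDependenceDisplay.from_estimator(
    gbm, X, features=[0], kind="both", subsample=50, random_state=0
)
plt.show()
```

Neither curve is the model itself: both are summaries computed by querying the GBM over the data, which is precisely the gap between explainability and transparency discussed above.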
Example of modeling with a GAM (top) or a GBM (bottom). The x-axis represents the driver's age; the purple curve represents the average claims frequency for each age; the blue bars represent the number of policies.
The dark-blue line on the GAM (top) directly displays the model itself; all the impact of the driver's age is derived from this curve. Looking at it provides all the information needed to understand the role played by this variable in the model, and it is possible to edit it directly (for instance, if the modeler fears the risk may be under-estimated for young drivers, the model value can be increased for these profiles and the predictions will be impacted in a straightforward way).
The dark-blue line on the GBM (bottom) displays the partial dependence (PDP) of the driver's age: it is the average impact of age on the predictions, computed from the data. For each contract, the actual individual impact (ICE plots, displayed by the black lines for a few contracts) will vary. In some cases, like the bold black line, the effect of age can be very counter-intuitive, with strong increases and decreases for no obvious reason.
GAMs, GLMs and GBMs are not created equal in the face of 'FACTS' principles: Fairness, Accountability, Compliance, Transparency and Security for insurers, regulators and policyholders
When it comes to compliance, transparency and safety in production, GBMs fall short due to their blackbox nature. GBMs are, by design, not transparent. It is not possible to split a GBM into the effect of each variable used, because all the effects rely on interactions and are therefore strongly intertwined.
This trait of GBMs - or Random Forests, Neural Networks and any other blackbox machine-learning approach - prevents the user from directly seeing and understanding the model they manipulate: reviewing a set of 500 trees could be feasible, but it would hardly provide any intuitive understanding of what the model does.
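The point is easy to verify. In the hypothetical scikit-learn sketch below (simulated data, arbitrary variables), any single tree in a boosted ensemble can be printed and read, but the model is the sum of all 500 of them:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.tree import export_text

rng = np.random.default_rng(4)
X = rng.normal(size=(5_000, 6))                      # six anonymous rating variables
y = rng.poisson(np.exp(0.1 * X[:, 0] - 0.2 * X[:, 1] * X[:, 2]))

gbm = GradientBoostingRegressor(n_estimators=500, max_depth=3).fit(X, y)

print(len(gbm.estimators_), "trees in the ensemble")
# Any single boosting stage is perfectly readable...
print(export_text(gbm.estimators_[0][0], feature_names=[f"x{i}" for i in range(6)]))
# ...but the prediction is the sum of all 500 such trees, and no human can
# read that sum back into per-variable effects.
```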
If used in production, GBMs entail adverse-selection risks that are unsafe for insurance carriers, making them non-compliant in the eyes of regulators. This is why their use remains largely limited to exploratory work in data labs. Explainable AI (XAI) has embraced the mission of making GBMs 'explainable', yet this can only be achieved by simplifying and reverse-engineering the inner workings of the algorithms, providing a level of understanding that not only remains unsatisfactory to regulators but also undermines the initial performance of GBMs.
The alignment of GAMs and GLMs with 'FACTS' principles can be verified and ensured. GAMs and GLMs are already safely used in production by countless insurers around the world. The qualities that GBMs lack are exactly the ones that make GAMs and GLMs the darlings of insurance regulators.
The current limitation of GAMs and GLMs does not lie in their structure, but in the challenges linked to their manual creation. To build a solid, predictive GAM, an actuary or pricing analyst needs to select the most relevant variables from a potentially very long list of more or less predictive ones, and simultaneously create the best effect for each selected variable within the model. When these tasks are carried out manually, as is the case with legacy market solutions, the undertaking becomes herculean as soon as the number of variables is significant: every time a variable is included or modified, all the others are impacted, generating ever more iterations and making the manual computation of GAMs extremely slow and cumbersome. Actuaries cannot possibly test all combinations of variables to create the perfect GAM: they can only do so much with the time they have. Given the practical impossibility of testing all combinations, it is no wonder that manually-built GAMs can underperform when compared to GBMs.
Wait, why are we making such a big deal out of Transparency in insurance ratemaking?
The need for actual transparency and understanding is motivated by the very nature of technical pricing: the goal is not only - as in many data science contexts, such as claims fraud detection or marketing - to build a model that predicts well "on average", but to make sure that the risk of every client is safely estimated. For instance, if a particular profile is rare in the portfolio used to build a risk model, mispricing it will have little impact on the model's performance as measured by a Gini coefficient or EDR (Expected Deviance Ratio), since these are measures of average model quality. It will however have a very strong impact on the final results: clients with similar profiles will also be under-priced, will underwrite contracts en masse and will anti-select the insurer, producing a massive multiplier effect when the claims occur several years down the road.
A detrimental side effect compounds the damage: the same models are also used to monitor the quality of the portfolio, the state and level of the reserves and the financial results of the company while claims are developing. In case of such a modeling mistake, not only is the carrier facing adverse selection, but it will believe it is making money until the claims have developed a good three years later, leaving it in a complete blind spot and utterly unprepared for the backfire.
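A small simulation makes the point about average metrics. In the hypothetical sketch below (invented portfolio and numbers), a model that completely ignores a rare high-risk segment loses little on a portfolio-wide Gini while underpricing that segment fourfold:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

age_factor = rng.choice([0.04, 0.08], size=n)   # a well-captured rating variable
rare = rng.random(n) < 0.01                     # a rare, poorly-captured segment (1%)
true_freq = age_factor * np.where(rare, 4.0, 1.0)
claims = rng.poisson(true_freq)

full_model = true_freq                          # prices the rare segment correctly
blind_model = age_factor                        # ignores the rare segment entirely

def normalized_gini(y, pred):
    """Gini of predictions vs observed claims, scaled by the perfect model's Gini."""
    def gini(y, ranker):
        order = np.argsort(ranker)
        lorenz = np.cumsum(y[order]) / y.sum()
        return 0.5 - lorenz.mean()              # area between diagonal and Lorenz curve
    return gini(y, pred) / gini(y, y)

print(f"Gini, full model:  {normalized_gini(claims, full_model):.3f}")
print(f"Gini, blind model: {normalized_gini(claims, blind_model):.3f}")
# In this simulation the portfolio-wide metric moves only modestly, yet the
# blind model underprices the rare segment by 4x: exactly the profiles the
# market will steer toward this insurer.
print(f"Underpricing on rare segment: "
      f"{true_freq[rare].mean() / blind_model[rare].mean():.1f}x")
```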
Akur8 allows insurers to safely apply the power of Transparent AI to the known and trusted GAM / GLM structure
Models following a GAM structure can easily be decomposed and analyzed piece by piece.
To make the most of this capability, Akur8 provides, on top of the additive (or multiplicative) structure of the models, parsimonious modeling capabilities. The models can be constrained to use a limited number of variables - the exact number being decided by the user, the actual variables then being chosen by the algorithm based on this number.
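One standard statistical scheme for this kind of constraint is L1 regularization, which drives uninformative coefficients exactly to zero; sweeping the penalty traces a path from parsimonious to complex models. The sketch below is illustrative only, on simulated data, and is not a description of Akur8's proprietary algorithm:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
n, p = 10_000, 40                                   # 40 candidate rating variables
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = [0.4, -0.3, 0.2, 0.2, -0.1]              # only 5 actually carry signal
y = X @ beta + rng.normal(scale=1.0, size=n)

# A stronger L1 penalty keeps fewer variables; a user-facing control such as
# "use at most k variables" can be implemented by picking the penalty that
# retains k non-zero coefficients.
for alpha in [0.2, 0.1, 0.05, 0.01, 0.001]:
    kept = int(np.sum(Lasso(alpha=alpha).fit(X, y).coef_ != 0))
    print(f"alpha={alpha:<6} -> {kept} variables selected")
```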
Models with a large number of variables are more complex and often more predictive, but are also much less interpretable, requiring further work by the actuaries or pricing teams in order to feel comfortable with their use.
Managing the trade-off between complexity and predictive power is, of course, key in a pricing context because of adverse-selection risk.
Achieving the best of two worlds: reconciling the Transparency and Compliance imperatives with Performance, in production
This is why GAMs / GLMs powered by Transparent AI are a match made in heaven, giving insurers access to the best of both worlds: automated rate modeling with state-of-the-art speed-to-accuracy performance that can safely go into production thanks to unquestioned transparency and compliance properties.
The ability to apply the automation power of AI to the generation of GAMs and GLMs is the cornerstone of Akur8. It is our deep belief that this is today the only way for insurers to safely apply AI to insurance pricing in production and generate substantial benefits at scale, without incurring blackbox-related risks.
Moreover, whilst it is true that GBMs can outperform GLMs in terms of pure predictive power measured on past data, a number of factors come into play that strongly qualify this statement:
The notion of pure performance of GBMs is to be handled with care, as it can be dangerously misleading. A case in point, witnessed first-hand at Akur8, involves a leading insurer that was somewhat astonished by the high performance of their GBMs compared to historical benchmarks and data points, without being able to comprehend why. Generating models with Akur8 enabled them to understand and decompose the factors explaining such performance, through the ability to decide and trace which interactions to include in the models. As a matter of fact, the GBMs they had generated were using a posteriori information hidden in the database, resulting in artificially high model accuracy that they were able neither to detect nor explain. Using Akur8, the modeler decided to rule out these specific interactions, considering them far too predictive. To quote them: "We were suspicious there was some data leakage behind our GBMs' performance, so we ran Akur8 models by specifically excluding some very predictive interactions - correctly identified by Akur8 - that would bias the results. These interactions were exploited in the GBMs, explaining the performance levels reached. Having such insights on the model-building process and such focus on causal inference is a huge strength of Akur8, efficiently preventing such biases and risks - and avoiding Kaggle-like competition on some metrics."
Beyond mere performance, the trade-off between performance and control / interpretability is what really matters here. Akur8 leverages statistical schemes, such as controlling the number of variables actually used in the model, that GBMs are not well-suited to handle. This is where its Transparency lies.
Conclusion
Regulators require Fairness & ethics, Accountability, Compliance, Transparency and Security / safety / robustness from insurers in order to protect end-consumers and businesses. Carriers are bound by this requirement, and it serves their original purpose: protecting their policyholders and compensating them fairly for unwanted eventualities.
At Akur8, we have yet to encounter an insurer that uses GBMs for technical pricing in production. While this may well evolve in the future, our role is to deliver pricing sophistication to insurers while protecting their bond with regulators and policyholders, which, for now, calls for Transparent AI applied to the automation of GAMs and GLMs.