
## Abstract

This paper presents a Bayesian assessment approach for loss estimation models that is based on user expectation. During model validation and review cycles, the quality of a loss estimate is often tested and reported in statistical terms that are not straightforward from a business practitioner's perspective. This approach provides a quantitative assessment that is directly linked to the impact of new observations on the prior assumption. The author starts the discussion with a description of the expected performance of the model estimate in practice and graphically demonstrates the modeling effort needed to achieve the development target. He then formulates the distributional properties of the mixture distribution in the Estimated Loss Ratio buckets and sets the prior belief accordingly. Next, he obtains the density distribution of the newly observed data and the post-observation adjusted distribution per Estimated Loss Ratio bucket by applying the Bayesian posterior update to the mixture distribution. The analysis shows that the quantified change between the prior and posterior belief, i.e. the initial and post-observation updated expectation, is a good measure of model performance and predictive power.

This approach bridges the practitioner's insight and the model performance, and provides an intuitive quantitative assessment of the change in model performance in monetary terms. The approach frames the development target and performance according to end user belief, and can be easily extended to quantify the best estimate of the post-observation loss ratio and gaps in the capital estimation. Meanwhile, it retains the benefits of Bayesian methods and provides a unified inference and measurement criterion for cross-comparing multiple models during both the development and review processes. Lastly, the approach can be easily applied to non-default or non-resolved portfolios to provide best effort loss estimates.

## Introduction

Loss ratio modeling is a hot topic in the finance industry, especially in the recent post-crisis period, for two main reasons. First, the loss ratio can be easily transformed into monetary terms. Second, estimates such as the Expected Loss or Risk Weighted Assets are linear in this ratio because it is used as a simple multiplier in the calculation. Together with probability^{2} and exposure models, these three key factors are widely applied in various areas such as pricing, hedging, valuation and capitalization.

Disclaimer: All Tables, Figures and Equations in this paper are original with this text.

Defined as a percentage of the total underlying exposure given default, the loss rate estimate comes in several different forms. For example, in pricing and hedging models the recovery rate is often modeled first, and the loss rate is then calculated as the difference between 100% and the recovery rate. In regulatory credit risk models, the empirical Loss Given Default (LGD) is modeled directly (see^{5} and^{6} for more details).

For LGD models covering corporate exposures, the observed loss ratio at portfolio level is known to follow a skewed “L”, “J” or “U” shaped distribution, which can vary over time and with economic conditions for the same portfolio. Meanwhile, as loss rates can be multiplied directly by the underlying exposure to provide a loss estimate in monetary terms, it is all the more important to link the modeling target with the expected performance during both the model development and review processes.

Generally, for mathematical models used in the finance industry, the accuracy of an estimate is determined by comparing the model estimate with the actual observation. This agrees with the minimization of mean squared errors required at the calibration stage for some models.

In this paper, we start the discussion with the expectation from the user's perspective. To avoid making specific assumptions about the underlying model, we skip model details such as input variables and technical specification.

## Initial Expectation

Many key performance indicators of models in the finance industry are closely linked to the comparison between estimated and observed values. We denote the Estimated Loss Ratio as ELR and the Observed Loss Ratio as OLR throughout this paper. The actual observed loss ratio can range from 0 to 100% (and sometimes exceeds these boundaries). With some tolerance of estimation error, one would hope that a lower OLR is paired with a lower ELR more often than with a higher ELR and, similarly, that a higher OLR is observed more often for higher ELRs than for lower ELRs.

More specifically, suppose the user expects the majority of ELRs in the 0 − 20% range to be realized as OLRs in the 0 − 20% range. We summarize this expectation in terms of statistical distributions conditioned on the initial predicted ELR in Table 1.

Imagine that a practitioner could express his tolerance of loss estimate error in 5 equal-sized buckets in the loss ratio range of 0 − 100%. With some tolerance, for all ELR values in each bucket, the majority of OLR values are expected to be in the corresponding bucket.

Table 1 is an example of the ELR bucket scale described above. Note that the model estimate is not required to be a bucketed estimate, i.e. an output of 15% falls into the ELR B1 bucket but is not required to be represented by the mean of the ELR bucket.
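The mapping from a continuous loss ratio to a bucket label can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `bucket_label`, the boundary convention (lower bound inclusive, 100% assigned to B5) and the clamping of out-of-range values are all assumptions, since Table 1 is not visible in this excerpt.

```python
import math


def bucket_label(loss_ratio: float, n_buckets: int = 5) -> str:
    """Map a loss ratio (0.15 means 15%) to a label B1..Bn.

    Assumptions (not taken from the paper): buckets are equal-sized,
    each bucket includes its lower bound, and values below 0% or at or
    above 100% are clamped into the first or last bucket.
    """
    idx = math.floor(loss_ratio * n_buckets)
    idx = min(max(idx, 0), n_buckets - 1)  # clamp out-of-range values
    return f"B{idx + 1}"


# A continuous model output keeps its own value; only its bucket
# membership is used for the comparison against the OLR.
for elr in (0.15, 0.20, 0.65, 1.00):
    print(f"ELR {elr:.0%} falls into bucket {bucket_label(elr)}")
```

Changing `n_buckets` to 20 or 100 gives the 5% or 1% buckets that a near-continuous comparison would use.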

Table 1: Example of a 5-bucket LGD Scale

illustration not visible in this excerpt

Here the discrete bucketing scheme can be changed to fit the distribution of the model estimated values, and the buckets do not have to be numerical in the case of categorical estimation. Moreover, for model reviews that compare the estimated and observed values in a continuous setup, one may wish to assume that the buckets are very small, e.g. 5% or 1% buckets.

Figure 1: Initial Expectation of OLR Distribution in ELR Buckets

illustration not visible in this excerpt

Figure 1 illustrates the expected model performance with user tolerance of prediction errors. The vertical dashed lines are the upper boundaries of each ELR bucket, and the horizontal dashed lines mark the expected OLR bucket given the ELR predictions.

As shown in the figure, although all OLR values are possible in every ELR bucket, the expectation on the model performance is that the model will help to reduce the probability of observing a 100% OLR in ELR buckets such as 0 − 20% and 20 − 40%, and do the same for a 0% OLR in ELR buckets such as 60 − 80% and 80 − 100%.

## Modelling Effort

Here we conceptually demonstrate the modeling efforts according to the OLR using graphs as shown in Figure 2.

Figure 2: Example of Loss Ratio Modelling Effort

illustration not visible in this excerpt

Sub-figure (a) shows the original observed data, randomly ordered, with the customer ID index on the X-axis and the OLR value on the Y-axis. One can see that the single OLR observations are scattered over the X-Y plane.

Sub-figure (b) is a simple frequency plot of the data points in sub-figure (a). We see that the OLR follows a bimodal distribution in which the density is high at the boundary values 0% and 100% and low in the middle of the range.

The modeling effort and impact between sub-figure (a) and sub-figure (c) is a simple clustering effort that groups similar customers by their common characteristics into the five ELR buckets described in Table 1. Comparing sub-figure (a) and sub-figure (c), it is clear that the model-ordered OLR data follow the user expected properties, i.e. the records in each bucket concentrate towards the center point of the bucket range.

One can see that the model-ordered results are not perfect, as there are OLR points outside the ELR buckets; these represent the estimation error. However, note that the OLR distribution is centered according to the ELR in each of the buckets.

Table 2 summarizes the percentage density between the OLR and ELR buckets. The table is structured so that the numerical values can be directly mapped to and compared with Figure 1.

Each row in Table 2 is an OLR bucket, and each column is an ELR bucket. Therefore, cell [OLR B1, ELR B1] is the likelihood of observing an OLR in the range 0 − 20% when the initial predicted ELR is in the range 0 − 20%; in our example it is 93.28%. The first column is the full OLR distribution for ELRs in 0 − 20% (ELR B1). Similarly, the value in cell [OLR B5, ELR B4], 25.79%, is the likelihood of observing an OLR in the range 80 − 100% when the initial predicted ELR is in the range 60 − 80%.

Table 2: Prior Density in ELR Buckets

illustration not visible in this excerpt
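A table with the layout of Table 2 (rows are OLR buckets, columns are ELR buckets, each column sums to 100%) could be built from paired estimates and observations roughly as below. This is a sketch under stated assumptions: the helper names `bucket_index` and `density_table`, the bucket boundary convention and the toy data are hypothetical and are not the paper's data or code.

```python
import math


def bucket_index(loss_ratio: float, n_buckets: int = 5) -> int:
    """Zero-based bucket index; out-of-range values are clamped (assumption)."""
    idx = math.floor(loss_ratio * n_buckets)
    return min(max(idx, 0), n_buckets - 1)


def density_table(elr, olr, n_buckets: int = 5):
    """Return table[olr_bucket][elr_bucket]: the share of observations in
    each OLR bucket, conditional on the ELR bucket (column-normalized)."""
    counts = [[0] * n_buckets for _ in range(n_buckets)]
    for e, o in zip(elr, olr):
        counts[bucket_index(o, n_buckets)][bucket_index(e, n_buckets)] += 1

    col_totals = [sum(counts[r][c] for r in range(n_buckets))
                  for c in range(n_buckets)]
    return [[counts[r][c] / col_totals[c] if col_totals[c] else 0.0
             for c in range(n_buckets)]
            for r in range(n_buckets)]


# Toy data, purely illustrative: four records predicted in ELR B1 and
# two predicted in ELR B4.
elr = [0.10, 0.10, 0.10, 0.10, 0.70, 0.70]
olr = [0.05, 0.15, 0.00, 1.00, 0.70, 0.95]
table = density_table(elr, olr)

# Column 0 is the full OLR distribution for ELR B1.
print([round(row[0], 2) for row in table])
```

Reading a cell as `table[olr_bucket][elr_bucket]` mirrors the [OLR, ELR] cell notation used in the text.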

## The Prior Distribution

In this section, we discuss the initial modelled/observed distribution and some properties of the mixture distribution across the ELR buckets.

To demonstrate the modeling and its impact in detail, we assume that the bucketed model output follows a mixture beta distribution, as shown in Figure 1. This complex distribution is chosen only to demonstrate the boundary densities and the different modality properties of the different ELR buckets.

Figure 1 demonstrates the distribution-based initial expectation using the ELR bucketing scheme defined in Table 1. The key difference here is that extreme losses are treated as “tail events” that occur rarely within each individual bucket, instead of simply being classified as inaccurate estimates.

The illustration in Figure 3 assumes a bucketed Inflated Double Beta distribution. Note that this assumption is on the model output; for more on this modeling approach, see^{8} and^{11}.

We summarize some properties of the mixture distribution below. Here b denotes the loss ratio bucket and d indexes the underlying single distributions within each bucket, i.e. each single beta distribution in our double beta example. Assume that there are m distributions in each loss ratio bucket and that N is the total number of buckets. The function F denotes the cumulative distribution of the bucket mixture distribution, and the function G is the overall mixture distribution for the whole loss ratio range across all buckets^{3}.
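As a rough numerical illustration of these objects, the sketch below builds a bucket-level mixture CDF (playing the role of F) as a weighted sum of single beta CDFs, and an overall CDF (playing the role of G) as a weighted sum of the bucket mixtures. Everything specific here is an assumption made for illustration: the function names, the beta parameters, the component weights and the bucket weights are hypothetical, and the point masses of an inflated distribution at 0% and 100% are left out.

```python
import math


def beta_pdf(x: float, a: float, b: float) -> float:
    """Density of a Beta(a, b) distribution for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def beta_cdf(x: float, a: float, b: float, steps: int = 2000) -> float:
    """Beta CDF by midpoint-rule integration (stdlib-only approximation)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    h = x / steps
    return h * sum(beta_pdf((i + 0.5) * h, a, b) for i in range(steps))


def bucket_cdf(x: float, components) -> float:
    """Bucket mixture CDF: weighted sum over the single distributions d.
    `components` is a list of (weight, a, b) with weights summing to 1."""
    return sum(w * beta_cdf(x, a, b) for w, a, b in components)


def overall_cdf(x: float, buckets) -> float:
    """Overall mixture CDF: weighted sum over the buckets b.
    `buckets` is a list of (bucket_weight, components)."""
    return sum(p * bucket_cdf(x, comps) for p, comps in buckets)


# Hypothetical double-beta buckets (m = 2): a dominant mode near the
# bucket's own range plus a small "tail event" component.
low_bucket = [(0.9, 2.0, 12.0), (0.1, 12.0, 2.0)]
high_bucket = [(0.1, 2.0, 12.0), (0.9, 12.0, 2.0)]
portfolio = [(0.5, low_bucket), (0.5, high_bucket)]

print(f"P(OLR <= 20% | low bucket)  = {bucket_cdf(0.2, low_bucket):.3f}")
print(f"P(OLR <= 20% | high bucket) = {bucket_cdf(0.2, high_bucket):.3f}")
print(f"P(OLR <= 50%) overall       = {overall_cdf(0.5, portfolio):.3f}")
```

The midpoint rule is used only to keep the example free of external dependencies; a library routine for the regularized incomplete beta function would be the usual choice in practice.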

**[...]**

^{2} More details are discussed in^{12}.

^{3} If the buckets represent independent business categories, this overall level mixture is not necessarily required, depending on the analytical target.

- Yang Liu (Author), 2017, Assessment of Loss Ratio Model Performance. A Bayesian Approach, Munich, GRIN Verlag, https://www.grin.com/document/372254
