Pure Premium in Auto Insurance. A Reproducible Case Study Using R

Term paper, 2025, 27 pages, Grade: 10.00

Author: Nabil Nakbi

Economics - Statistics and Methods

This work investigates the use of Generalized Linear Models (GLMs) to estimate the pure premium in auto insurance by modeling claim frequency and claim severity separately. Using an open dataset from Kaggle and reproducible R code, we show how transparent, data-driven methods can support fair and efficient pricing in a regulated sector.

Excerpt


Table of Contents

1. Introduction

1.1 Context and Importance

1.2 GLM-Based Pricing Models in Actuarial Science

1.3 Why R and Reproducible Modeling?

1.4 Problem Statement and Objectives

1.5 Relevance to Actuarial Practice and Regulation

1.6 Structure of the Article

2. Literature Review

2.1 Definition of Pure Premium

2.1.1 Concept and Fundamental Role

2.1.2 Exposure, Homogeneity, and Credibility

2.1.3 Two-Step vs. One-Step Modeling Approaches

2.1.4 From Pure Premium to Charged Rate

2.1.5 Practical Example

2.1.6 Advantages and Justification of Pure Premium Method

2.1.7 Limitations and Considerations

2.2 Overview of Generalized Linear Models (GLMs) in Actuarial Science

2.2.1 GLM Framework and Components

2.2.2 GLMs in Risk Classification and Pricing

2.2.3 Model Estimation and Interpretation

2.2.4 Model Diagnostics and Validation

2.2.5 Regression Models for Frequency and Severity with GLMs

2.2.6 Extensions Beyond Standard GLMs

2.2.7 GLMs in R: Applied Tools and Ecosystem

2.3 Typical Distributions for Frequency and Severity

2.3.1 Distributional Choices in Insurance Modelling

2.3.2 Poisson Distribution for Frequency

2.3.3 Addressing Overdispersion: Quasi-Poisson & Negative Binomial

2.3.4 Zero-Inflated and Hurdle Models

2.3.5 Gamma Distribution for Severity

2.3.6 Log-normal and Alternative Severity Models

2.3.7 Compound and Tweedie Models

2.4 Pricing Models Using GLMs in R

2.5 Advances in Actuarial Modeling and Reproducibility

2.5.1 Mixture-of-Experts and Segmented Models

2.5.2 Open Science Practices

2.5.3 Robust and Heavy-Tailed Models

2.5.4 Machine Learning vs. GLM: Explainability vs. Performance

3. Methodology

3.1 Data Description

3.1.1 Source and Structure

3.1.2 Features Overview

3.1.3 Data Cleaning and Preprocessing

3.2 Modeling Claim Frequency

3.2.1 Statistical Assumption

3.2.2 Selected Predictors

3.2.3 Model Fit and Diagnostics

3.3 Modeling Claim Severity

3.3.1 Assumption and Preprocessing

3.3.2 Variable Selection and Transformation

3.3.3 Estimation and Interpretation

3.4 Calculating the Pure Premium

3.5 Evaluation Metrics and Visualization

3.5.1 Statistical Evaluation

3.5.2 Visual Diagnostics

3.5.3 Reproducibility Measures

4. Results

4.1 Frequency Model Results (Poisson GLM)

4.2 Severity Model Results (Gamma GLM with Log Link)

4.3 Estimated Pure Premium by Segment

4.4 Visualizations

4.4.1 Boxplot – Predicted Severity by Occupation

4.4.2 Histogram – Residuals from Gamma Model

4.4.3 Heatmap – Average Pure Premium by Age and Auto Make

5. Discussion and Interpretation

5.1 Interpretation of Frequency and Severity Results

5.2 Segment-Based Premium Insights

5.4 Practical Implications for Actuarial Pricing

6. Conclusion

Research Objectives and Themes

This study aims to demonstrate a transparent, reproducible, and data-driven approach to estimating pure premiums in auto insurance using Generalized Linear Models (GLMs) and the R programming language. The primary research question explores how actuarial models can achieve both statistical rigor and interpretability when applied to real-world datasets.

  • Application of two-step frequency-severity modeling to calculate pure premium.
  • Implementation of Poisson and Gamma GLMs in R for transparent and auditable pricing.
  • Validation of model performance using statistical diagnostics and reproducible workflows.
  • Evaluation of policyholder risk factors (e.g., occupation, vehicle type) for segmented insurance pricing.

Excerpt from the Book

1.1 Context and Importance

Accurate pricing is a fundamental task in non-life (property and casualty) insurance. In auto insurance in particular, profitability and competitiveness depend on charging each policyholder an appropriate premium. Insurance pricing goes beyond allocating expenses: it involves projecting expected claims while accounting for risk classification, fairness, and regulatory constraints. A major component of this pricing structure is the pure premium, defined as the expected cost of claims associated with an insured risk, excluding any expenses, commissions, or profit loadings (Goldburd, Khare, & Tevet, 2021).

Auto insurance contracts expose insurers to two sources of risk: how often claims occur (frequency) and how large the loss is per claim (severity). The frequency-severity method splits the total expected loss cost into these two multiplicative components:

Pure Premium = E[Frequency] × E[Severity]
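The product form can be derived from the standard collective risk model. The derivation below is a sketch under an assumption the excerpt does not state explicitly: the claim count N per unit of exposure is independent of the individual claim sizes X_1, X_2, ..., which share a common distribution. The symbols N, X_i, and S (total claim cost) are introduced here for illustration only.

```latex
S = \sum_{i=1}^{N} X_i, \qquad
\mathbb{E}[S]
  = \mathbb{E}\big[\,\mathbb{E}[S \mid N]\,\big]
  = \mathbb{E}\big[\,N \,\mathbb{E}[X_1]\,\big]
  = \mathbb{E}[N]\,\mathbb{E}[X_1].
```

As a purely illustrative calculation, an expected frequency of 0.10 claims per policy-year combined with an expected severity of 2,000 per claim would give a pure premium of 0.10 × 2,000 = 200 per policy-year.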

This decomposition facilitates separate modeling strategies and provides greater interpretability. It also allows insurers to isolate the impact of different rating variables—such as driver age, vehicle type, location, and claims history—on each risk component (Ohlsson & Johansson, 2010).
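The separate modeling of the two components can be sketched in R as follows. This is a minimal illustration on simulated data, not the author's code or dataset: the column names (`exposure`, `age_group`, `n_claims`, `avg_cost`), the simulated parameter values, and the use of claim counts as weights in the severity model are all assumptions made for the example.

```r
# Illustrative two-step frequency-severity sketch on simulated data.
# All variable names and parameter values are hypothetical.
set.seed(42)
n <- 5000
policies <- data.frame(
  exposure  = runif(n, 0.25, 1),  # policy-years in force
  age_group = factor(sample(c("18-25", "26-60", "61+"), n, replace = TRUE))
)
grp <- as.character(policies$age_group)

# Simulate claim counts with a segment-specific annual claim rate.
true_rate <- c("18-25" = 0.20, "26-60" = 0.10, "61+" = 0.12)
policies$n_claims <- rpois(n, true_rate[grp] * policies$exposure)

# Simulate an average claim cost for policies with at least one claim.
true_cost <- c("18-25" = 3000, "26-60" = 2000, "61+" = 2500)
has_claim <- policies$n_claims > 0
policies$avg_cost <- NA_real_
policies$avg_cost[has_claim] <- rgamma(
  sum(has_claim), shape = 2, rate = 2 / true_cost[grp[has_claim]]
)

# Step 1: Poisson GLM for frequency, with log(exposure) as offset.
freq_fit <- glm(n_claims ~ age_group + offset(log(exposure)),
                family = poisson(link = "log"), data = policies)

# Step 2: Gamma GLM (log link) for severity, fitted on claimants only.
sev_fit <- glm(avg_cost ~ age_group, family = Gamma(link = "log"),
               data = policies[has_claim, ], weights = n_claims)

# Pure premium per segment for one full policy-year of exposure.
segments <- data.frame(
  age_group = factor(levels(policies$age_group),
                     levels = levels(policies$age_group)),
  exposure  = 1
)
segments$frequency    <- predict(freq_fit, newdata = segments, type = "response")
segments$severity     <- predict(sev_fit,  newdata = segments, type = "response")
segments$pure_premium <- segments$frequency * segments$severity
print(segments)
```

Because both models use a log link, each fitted coefficient acts multiplicatively on its component, which is what makes it possible to read off the effect of a rating variable on frequency and on severity separately.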

Summary of Chapters

1. Introduction: Presents the foundational concepts of pure premium, the rationale for GLM-based modeling, and the importance of reproducibility in contemporary actuarial practice.

2. Literature Review: Provides an overview of actuarial ratemaking, GLM frameworks, frequency/severity distribution choices, and recent trends in reproducible modeling.

3. Methodology: Details the statistical process, including data cleaning, variable selection, specific GLM implementations for frequency and severity, and diagnostic evaluation in R.

4. Results: Reports the outputs of the Poisson and Gamma GLMs, showing the impact of risk factors on claim frequency and cost, supported by segment-based premium tables and visualizations.

5. Discussion and Interpretation: Offers a critical evaluation of the findings, linking statistical results to practical insurance underwriting, and discusses the implications for regulatory compliance and model robustness.

6. Conclusion: Summarizes the key contributions of the study, reinforcing the value of combining statistical rigor with transparent, reproducible workflows for modern insurance analytics.
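One diagnostic commonly applied to a Poisson frequency model is a check for overdispersion. The sketch below is an assumption about what such a check might look like, not a reproduction of the paper's diagnostics; the data are simulated and the names `x`, `y`, `pois_fit`, and `quasi_fit` are hypothetical.

```r
# Illustrative overdispersion check for a Poisson frequency GLM.
# Simulated data; all names and parameter values are hypothetical.
set.seed(1)
n <- 2000
x  <- factor(sample(c("A", "B"), n, replace = TRUE))  # hypothetical rating factor
mu <- ifelse(x == "A", 0.3, 0.6)

# Negative binomial counts: the variance exceeds the mean,
# so a Poisson fit to these data should appear overdispersed.
y <- rnbinom(n, size = 0.5, mu = mu)

pois_fit <- glm(y ~ x, family = poisson(link = "log"))

# Pearson dispersion statistic: values near 1 are consistent with the
# Poisson assumption, values clearly above 1 point to overdispersion.
dispersion <- sum(residuals(pois_fit, type = "pearson")^2) / df.residual(pois_fit)
cat("Pearson dispersion:", dispersion, "\n")

# A quasi-Poisson refit keeps the same coefficients but scales the
# standard errors by the square root of the estimated dispersion.
quasi_fit <- glm(y ~ x, family = quasipoisson(link = "log"))
print(summary(quasi_fit)$coefficients)
```

If the statistic is well above 1, quasi-Poisson or negative binomial alternatives (both listed in Section 2.3.3 of the table of contents) are the usual next step.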

Keywords

Auto Insurance, Pure Premium, GLM, Frequency-Severity Modeling, R, Reproducibility, Risk Segmentation, Poisson Distribution, Gamma Distribution, Actuarial Science, Insurance Pricing, Data Modeling, Statistical Rigor, Transparency, Underwriting

Frequently Asked Questions

What is the core focus of this research?

The research focuses on the reproducible estimation of pure premiums in auto insurance by using Generalized Linear Models (GLMs) to separately model claim frequency and severity.

What are the primary themes addressed in the work?

Central themes include the application of statistical modeling in insurance, the shift towards open and reproducible research practices, and the balance between predictive performance and model transparency.

What is the main goal or research question?

The goal is to provide a transparent, end-to-end, auditable modeling case study that demonstrates how conventional statistical tools like GLMs remain robust and interpretable in a modern regulatory environment.

Which scientific methods are employed?

The study uses a two-step modeling approach: a Poisson GLM for claim frequency and a Gamma GLM for claim severity, implemented in the R programming language with extensive diagnostic verification.

What does the main body of the paper cover?

The main body covers the literature on actuarial ratemaking, the specific methodological steps for data cleaning and GLM fitting, the interpretation of results from real-world insurance datasets, and visual diagnostics.

Which keywords characterize this paper?

Key terms include Auto Insurance, Pure Premium, GLM, Frequency-Severity Modeling, R, Reproducibility, and Risk Segmentation.

Why is "reproducibility" so important in this actuarial case study?

Reproducibility is emphasized because it ensures that pricing models can be audited, validated by regulators, and peer-reviewed, which is essential for transparency and legal compliance in insurance markets.

How does this study help practitioners in the field?

It provides a pedagogical and practical R-based pipeline that actuarial students and professionals can adapt for their own portfolios to enhance modeling techniques and meet regulatory expectations.

End of excerpt from 27 pages

Details

Title
Pure Premium in Auto Insurance. A Reproducible Case Study Using R
University
Université Mohammed V Rabat
Course
Actuariat
Grade
10.00
Author
Nabil Nakbi
Year of publication
2025
Pages
27
Catalog number
V1595608
ISBN (PDF)
9783389144183
Language
English
Keywords
Auto insurance, Pure Premium, GLM, R, Frequency-Severity Modeling, Risk segmentation
Product safety
GRIN Publishing GmbH
Cite this work
Nabil Nakbi (Author), 2025, Pure Premium in Auto Insurance. A Reproducible Case Study Using R, Munich, GRIN Verlag, https://www.grin.com/document/1595608