Variational AutoEncoder for synthetic insurance data

This article explores the application of Variational AutoEncoders (VAEs) to insurance data. Previous research has demonstrated the successful implementation of generative models, especially VAEs, across various domains, such as image recognition, text classification, and recommender systems. However...

Bibliographic Details
Main Authors: Charlotte Jamotton, Donatien Hainaut
Format: Article
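Language: English
Published: Elsevier 2024-12-01
Series: Intelligent Systems with Applications
Subjects: Autoencoder; Variational inference; Synthetic data generation; Heterogeneous insurance data; Dimension reduction
Online Access: http://www.sciencedirect.com/science/article/pii/S2667305324001297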
author Charlotte Jamotton
Donatien Hainaut
collection DOAJ
description This article explores the application of Variational AutoEncoders (VAEs) to insurance data. Previous research has demonstrated the successful implementation of generative models, especially VAEs, across various domains, such as image recognition, text classification, and recommender systems. However, their application to insurance data, particularly to heterogeneous insurance portfolios with mixed continuous and discrete attributes, remains unexplored. This study introduces novel insights into utilising VAEs for unsupervised learning tasks in actuarial science, including dimension reduction and synthetic data generation. We propose a VAE model with a quantile transformation for continuous (latent) variables, a reconstruction loss that combines categorical cross-entropy and mean squared error, and a KL divergence-based regularisation term. Our VAE model’s architecture circumvents the need to pre-train and fine-tune a neural network to encode categorical variables into n-dimensional representative vectors in the continuous vector space R^n. We assess our VAE’s ability to reconstruct complex insurance data and generate synthetic insurance policies using a motor portfolio. Our experimental results and analysis highlight the potential of VAEs in addressing challenges related to data availability in the insurance industry.
format Article
id doaj-art-a7dc249cff694ebda650880954d12a41
institution OA Journals
issn 2667-3053
language English
publishDate 2024-12-01
publisher Elsevier
record_format Article
series Intelligent Systems with Applications
spelling doaj-art-a7dc249cff694ebda650880954d12a41 2025-08-20T02:38:06Z eng
Elsevier, Intelligent Systems with Applications, ISSN 2667-3053, 2024-12-01, volume 24, article 200455, DOI 10.1016/j.iswa.2024.200455
Variational AutoEncoder for synthetic insurance data
Charlotte Jamotton (corresponding author), Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA) of the Université catholique de Louvain (UCLouvain), Voie du Roman Pays 20/L1.04.01, Louvain-la-Neuve, 1348, Belgium
Donatien Hainaut, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA) of the Université catholique de Louvain (UCLouvain), Voie du Roman Pays 20/L1.04.01, Louvain-la-Neuve, 1348, Belgium
title Variational AutoEncoder for synthetic insurance data
topic Autoencoder
Variational inference
Synthetic data generation
Heterogeneous insurance data
Dimension reduction
url http://www.sciencedirect.com/science/article/pii/S2667305324001297
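
The abstract describes a loss with three parts: mean squared error on continuous attributes, categorical cross-entropy on discrete attributes, and a KL divergence-based regularisation term. The sketch below shows one plausible way these terms could be combined for a single policy. It is not the authors' implementation: the function names, the unweighted sum of the reconstruction terms, the `beta` weight on the KL term, the standard normal prior, and the example values are all assumptions made for illustration.

```python
import math


def mean_squared_error(x, x_hat):
    """MSE between true and reconstructed continuous attributes."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """Cross-entropy between a one-hot label and predicted class probabilities."""
    return -sum(t * math.log(p + eps) for t, p in zip(one_hot, probs))


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for one latent vector.

    Assumes a diagonal Gaussian encoder and a standard normal prior.
    """
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))


def vae_loss(x_cont, x_cont_hat, cats, cats_hat, mu, log_var, beta=1.0):
    """Reconstruction loss (MSE + cross-entropy) plus beta * KL.

    `cats` and `cats_hat` hold one one-hot vector and one probability
    vector per categorical attribute. `beta` is a hypothetical weight.
    """
    reconstruction = mean_squared_error(x_cont, x_cont_hat)
    reconstruction += sum(
        categorical_cross_entropy(t, p) for t, p in zip(cats, cats_hat)
    )
    return reconstruction + beta * kl_to_standard_normal(mu, log_var)


# Hypothetical policy: two scaled continuous attributes and one
# three-level categorical attribute, with made-up reconstructions.
loss = vae_loss(
    x_cont=[0.2, -1.0],
    x_cont_hat=[0.1, -0.8],
    cats=[[0, 1, 0]],
    cats_hat=[[0.2, 0.7, 0.1]],
    mu=[0.0, 0.0],
    log_var=[0.0, 0.0],
)
print(loss)
```

With `mu` and `log_var` at zero the encoder distribution equals the prior, so the KL term vanishes and the loss reduces to the reconstruction terms alone. Because the categorical attributes enter as one-hot vectors scored by cross-entropy, no separately trained embedding network is needed in this sketch, which matches the motivation stated in the abstract.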