Heavy-Tailed Linear Regression and K-Means

Bibliographic Details
Main Authors: Mario Sayde, Jihad Fahs, Ibrahim Abou-Faycal
Format: Article
Language: English
Published: MDPI AG 2025-02-01
Series: Information
Online Access: https://www.mdpi.com/2078-2489/16/3/184
Description
Summary: Most standard machine learning algorithms are formulated with the implicit assumption that empirical data are “well-behaved”. In this work, we consider heavy-tailed data whose underlying distribution does not necessarily possess finite moments. For such a scenario, classical linear regression techniques and the standard K-means algorithm fail. We formulate and validate heavy-tailed versions of these machine learning methods for both scalar and multidimensional settings. The new algorithms are based on recently defined appropriate location and power parameters. Additionally, we showcase the enhanced performance of the proposed methods in comparison to some other tailored ones found in the literature.
ISSN: 2078-2489