Heavy-Tailed Linear Regression and K-Means

Most standard machine learning algorithms are formulated with the implicit assumption that empirical data are “well-behaved”. In this work, we consider heavy-tailed data whose underlying distribution does not necessarily possess finite moments. For such a scenario, classical linear regression techni...

Bibliographic Details
Main Authors: Mario Sayde, Jihad Fahs, Ibrahim Abou-Faycal
Format: Article
Language: English
Published: MDPI AG 2025-02-01
Series: Information
Subjects: K-means; linear regression; heavy-tailed; alpha-stable; α-power
Online Access: https://www.mdpi.com/2078-2489/16/3/184
author Mario Sayde
Jihad Fahs
Ibrahim Abou-Faycal
collection DOAJ
description Most standard machine learning algorithms are formulated with the implicit assumption that empirical data are “well-behaved”. In this work, we consider heavy-tailed data whose underlying distribution does not necessarily possess finite moments. For such a scenario, classical linear regression techniques and the standard K-means algorithm fail. We formulate and validate heavy-tailed versions of these machine learning methods for both scalar and multidimensional settings. The new algorithms are based on recently defined appropriate location and power parameters. Additionally, we showcase the enhanced performance of the proposed methods in comparison to some other tailored ones found in the literature.
format Article
id doaj-art-ca8ba38964254d9bb983eca1386b8ce4
institution DOAJ
issn 2078-2489
language English
publishDate 2025-02-01
publisher MDPI AG
record_format Article
series Information
spelling doaj-art-ca8ba38964254d9bb983eca1386b8ce4
2025-08-20T02:42:34Z
eng
MDPI AG
Information
2078-2489
2025-02-01
volume 16, issue 3, article 184
10.3390/info16030184
Heavy-Tailed Linear Regression and K-Means
Mario Sayde (Electrical and Computer Engineering Department, American University of Beirut, Beirut 1107 2020, Lebanon)
Jihad Fahs (Electrical and Computer Engineering Department, American University of Beirut, Beirut 1107 2020, Lebanon)
Ibrahim Abou-Faycal (Electrical and Computer Engineering Department, American University of Beirut, Beirut 1107 2020, Lebanon)
https://www.mdpi.com/2078-2489/16/3/184
K-means
linear regression
heavy-tailed
alpha-stable
α-power
title Heavy-Tailed Linear Regression and K-Means
topic K-means
linear regression
heavy-tailed
alpha-stable
α-power
url https://www.mdpi.com/2078-2489/16/3/184