The effects of mismatched train and test data cleaning pipelines on regression models: lessons for practice

Data quality problems are present in all real-world, large-scale datasets. Each of these potential problems can be addressed in multiple ways through data cleaning. However, there is no single best data cleaning approach that always produces a perfect result, meaning that a choice needs to be made a...

Bibliographic Details
Main Authors: James Nevin, Michael Lees, Paul Groth
Format: Article
Language: English
Published: PeerJ Inc. 2025-04-01
Series: PeerJ Computer Science
Online Access: https://peerj.com/articles/cs-2793.pdf