Multiple Regression with Transformations and Variable Selection at the Industrial Scale

Bibliographic Details
Main Authors: Gus Greivel, Soutir Bandyopadhyay, Alexandra M. Newman, Brian G. Thomas
Format: Article
Language: English
Published: Taylor & Francis Group 2025-07-01
Series: Journal of Statistics and Data Science Education
Online Access: https://www.tandfonline.com/doi/10.1080/26939169.2025.2490027
Description
Summary: This article explores the development of a statistical model to predict roll forces during hot rolling in a commercial steel mill, based on a dataset from an industrial-scale operation; this dataset consists of 2255 individual coils, processed through five roll stands in the mill, for a total of 11,275 distinct data points. This work is informed by a physics-based equation and applies variable transformations, variable selection, and model comparisons; it illustrates several topics introduced throughout a regression modeling course, including data visualization, multiple linear regression, residual diagnostics and heteroscedasticity, variable transformations, collinearity and confounding due to strong correlations between predictor variables, the false correlation paradox, variable selection, and measures of model quality for predictive use. Extensions, including autocorrelation, mixed models, and resampling methods, are also suitable as more advanced topics and for open-ended courses. Finally, we add an interesting real dataset and associated analysis to the publicly available resources for statistics and data science education.
ISSN: 2693-9169
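
Illustrative note: the summary mentions variable transformations, multiple linear regression, and collinearity checks. The following is a minimal sketch of how those pieces can fit together, not the authors' analysis. It uses synthetic data, and the predictor names (x1, x2), the power-law form, and the coefficient values are all assumptions invented for illustration; none of them come from the article or its roll-force dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical positive predictors (NOT the article's variables).
x1 = rng.uniform(1.0, 5.0, n)
x2 = rng.uniform(10.0, 50.0, n)

# Synthetic multiplicative response with multiplicative noise (assumed form):
#   y = 3 * x1^0.5 * x2^1.2 * exp(eps)
eps = rng.normal(0.0, 0.05, n)
y = 3.0 * x1**0.5 * x2**1.2 * np.exp(eps)

# Taking logs makes the power law linear in its coefficients:
#   log y = log 3 + 0.5 * log x1 + 1.2 * log x2 + eps
X = np.column_stack([np.ones(n), np.log(x1), np.log(x2)])
log_y = np.log(y)
beta, *_ = np.linalg.lstsq(X, log_y, rcond=None)

# Residuals and R^2 on the log scale.
residuals = log_y - X @ beta
r2 = 1.0 - residuals.var() / log_y.var()


def vif(predictors, j):
    """Variance inflation factor of column j, regressed on the other columns."""
    others = np.delete(predictors, j, axis=1)
    A = np.column_stack([np.ones(len(predictors)), others])
    coef, *_ = np.linalg.lstsq(A, predictors[:, j], rcond=None)
    resid = predictors[:, j] - A @ coef
    r2_j = 1.0 - resid.var() / predictors[:, j].var()
    return 1.0 / (1.0 - r2_j)


# VIFs for the two log-transformed predictors (intercept column excluded).
vifs = [vif(X[:, 1:], j) for j in range(2)]

print("coefficients (intercept, log x1, log x2):", beta)
print("R^2 on log scale:", r2)
print("VIFs:", vifs)
```

Because the synthetic predictors are drawn independently, their variance inflation factors should sit near 1; strongly correlated predictors, such as those the summary describes, would produce much larger values.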