Hierarchical data modeling: A systematic comparison of statistical, tree-based, and neural network approaches

Hierarchical modeling approaches have evolved significantly, yet comprehensive comparisons between fundamentally different methodological paradigms remain limited. This research presents a systematic comparative analysis of three distinct hierarchical modeling approaches: statistical (Hierarchical M...

Full description

Saved in:
Bibliographic Details
Main Authors: Marzieh Amiri Shahbazi, Nasibeh Azadeh-Fard
Format: Article
Language:English
Published: Elsevier 2025-09-01
Series:Machine Learning with Applications
Subjects:
Online Access:http://www.sciencedirect.com/science/article/pii/S2666827025000714
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Hierarchical modeling approaches have evolved significantly, yet comprehensive comparisons between fundamentally different methodological paradigms remain limited. This research presents a systematic comparative analysis of three distinct hierarchical modeling approaches: statistical (Hierarchical Mixed Model), tree-based (Hierarchical Random Forest), and neural (Hierarchical Neural Network). Based on the 2019 National Inpatient Sample — comprising more than seven million records from 4568 hospitals across four U.S. regions — the models were assessed for their ability to predict length of stay at the patient, hospital, and regional levels. The evaluation framework integrated quantitative metrics and qualitative factors, including analyses across varying sample sizes, simplified hierarchies, and a separate intensive-care dataset. Results demonstrate that tree-based approaches consistently outperform alternatives in predictive accuracy and explanation of variance while maintaining computational efficiency. These performance patterns remain generally consistent across sample sizes, simplified hierarchies, and the external dataset. Neural approaches excel at capturing group-level distinctions but require substantial computational resources and exhibit prediction bias. Statistical approaches offer rapid inference and interpretability but underperform in accuracy at intermediate hierarchical levels. Each model exhibits distinctive hierarchical information processing: neural models favor bottom-up flow, statistical models emphasize top-down constraints, and tree-based models achieve balanced integration. This research establishes practical guidelines for selecting appropriate hierarchical modeling approaches based on data characteristics, computational constraints, and analytical requirements, thereby advancing understanding of fundamental trade-offs in multilevel analysis.
ISSN:2666-8270