PREDICTION INTERVALS IN MACHINE LEARNING: RESIDUAL BOOTSTRAP AND QUANTILE REGRESSION FOR CASH FLOW ANALYSIS
Time series forecasting often faces challenges in producing reliable predictions due to inherent uncertainty in dynamic systems. While point predictions are commonly used, they may not adequately capture this uncertainty, especially in financial systems where forecasting accuracy directly impacts de...
| Main Authors: | Wa Ode Rahmalia Safitri, Farit Mochamad Afendi, Budi Susetyo |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Universitas Pattimura, 2025-07-01 |
| Series: | Barekeng |
| Subjects: | bootstrap method; catboost; lightgbm; prediction intervals; quantile regression; xgboost |
| Online Access: | https://ojs3.unpatti.ac.id/index.php/barekeng/article/view/16433 |
| _version_ | 1849237900464488448 |
|---|---|
| author | Wa Ode Rahmalia Safitri; Farit Mochamad Afendi; Budi Susetyo |
| author_facet | Wa Ode Rahmalia Safitri; Farit Mochamad Afendi; Budi Susetyo |
| author_sort | Wa Ode Rahmalia Safitri |
| collection | DOAJ |
| description | Time series forecasting often faces challenges in producing reliable predictions due to inherent uncertainty in dynamic systems. While point predictions are commonly used, they may not adequately capture this uncertainty, especially in financial systems where forecasting accuracy directly impacts decision-making. Prediction intervals offer a solution by providing a range of likely outcomes rather than single-point estimates. This study implements multivariate time series forecasting using gradient boosting algorithms (XGBoost, CatBoost, and LightGBM) to predict cash flow patterns in banking transactions, focusing on constructing reliable prediction intervals. Using transaction data from Bank Rakyat Indonesia (BRI), the research analyzes both office and e-channel transactions with different lag structures based on Granger causality tests. Model performance was evaluated using RMSLE, MAE, and MAPE metrics, with RMSLE chosen as the primary metric due to its ability to handle skewed distributions. LightGBM achieved the best performance for office cash-in transactions (RMSLE: 0.2395), while CatBoost outperformed the others for office cash-out (RMSLE: 0.2848), e-channel cash-in (RMSLE: 0.3946), and e-channel cash-out (RMSLE: 0.4221). For prediction intervals, two methods were compared: Residual Bootstrap with 500 samples and Quantile Regression. Residual Bootstrap generally produced coverage probabilities closer to the 80% level (i.e., the 10–90% prediction interval), especially for office transactions, while maintaining narrower interval widths. In contrast, Quantile Regression tended to generate wider intervals and often overestimated uncertainty, resulting in overly high coverage in some cases. However, both methods showed clear limitations when applied to e-channel transactions, particularly e-channel cash-in, where coverage probabilities fell below 50% due to high volatility and irregular transaction patterns. Unlike previous work focused only on point forecasts, this study offers insights into forecast uncertainty by evaluating how well each method quantifies it, providing practical guidance for financial institutions aiming to improve risk management through interval-based forecasting. |
| format | Article |
| id | doaj-art-2a23d2ba21ff47288f8c3e459acebe21 |
| institution | Kabale University |
| issn | 1978-7227; 2615-3017 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Universitas Pattimura |
| record_format | Article |
| series | Barekeng |
| spelling | doaj-art-2a23d2ba21ff47288f8c3e459acebe21; 2025-08-20T04:01:48Z; eng; Universitas Pattimura; Barekeng; ISSN 1978-7227, 2615-3017; 2025-07-01; vol. 19, no. 3, pp. 1625-1636; DOI 10.30598/barekengvol19iss3pp1625-1636; 16433; Wa Ode Rahmalia Safitri, Farit Mochamad Afendi, Budi Susetyo (all: Statistics and Data Science Department, Faculty of Mathematics and Natural Sciences, IPB University, Indonesia); https://ojs3.unpatti.ac.id/index.php/barekeng/article/view/16433; bootstrap method; catboost; lightgbm; prediction intervals; quantile regression; xgboost |
| title | PREDICTION INTERVALS IN MACHINE LEARNING: RESIDUAL BOOTSTRAP AND QUANTILE REGRESSION FOR CASH FLOW ANALYSIS |
| topic | bootstrap method; catboost; lightgbm; prediction intervals; quantile regression; xgboost |
| url | https://ojs3.unpatti.ac.id/index.php/barekeng/article/view/16433 |
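
The description above compares interval methods by coverage probability and interval width at the 80% level (10–90% prediction interval), with Residual Bootstrap using 500 samples. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a simplified residual bootstrap that resamples a fixed pool of residuals around each point forecast (without refitting any model), and it uses synthetic Gaussian data in place of the BRI transactions. The function names `percentile`, `residual_bootstrap_interval`, and `coverage_and_width` are hypothetical.

```python
import random


def percentile(sorted_vals, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def residual_bootstrap_interval(point_forecast, residuals, n_boot=500,
                                lower_q=0.10, upper_q=0.90, rng=None):
    """Simplified residual bootstrap (assumption: no model refitting).

    Draws n_boot residuals with replacement, adds each to the point
    forecast, and returns the lower_q and upper_q percentiles.
    """
    if rng is None:
        rng = random.Random(0)
    sims = sorted(point_forecast + rng.choice(residuals) for _ in range(n_boot))
    return percentile(sims, lower_q), percentile(sims, upper_q)


def coverage_and_width(actuals, intervals):
    """Fraction of actuals inside their interval, and the mean interval width."""
    hits = sum(lo <= y <= hi for y, (lo, hi) in zip(actuals, intervals))
    widths = [hi - lo for lo, hi in intervals]
    return hits / len(actuals), sum(widths) / len(widths)


# Synthetic stand-in data (assumption): a linear trend with Gaussian noise.
rng = random.Random(42)
residuals = [rng.gauss(0, 10) for _ in range(300)]   # residual pool
forecasts = [100.0 + 5 * t for t in range(200)]      # point forecasts
actuals = [f + rng.gauss(0, 10) for f in forecasts]  # observed values

intervals = [residual_bootstrap_interval(f, residuals, rng=rng) for f in forecasts]
coverage, mean_width = coverage_and_width(actuals, intervals)
print(f"coverage = {coverage:.3f}, mean width = {mean_width:.2f}")
```

When the residual pool resembles the true forecast errors, as in this synthetic setup, coverage should land near the nominal 80%. Coverage well below that level, as reported for e-channel cash-in, would suggest that the resampled residuals do not reflect the volatility of the series being forecast.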