IMPROVING ACCURACY OF PREDICTION INTERVALS OF HOUSEHOLD INCOME USING QUANTILE REGRESSION FOREST AND SELECTION OF EXPLANATORY VARIABLES
Quantile regression forest (QRF) is a non-parametric method for estimating the distribution function of response by using the random forest algorithm and constructing conditional quantile prediction intervals. However, if the explanatory factors (covariates) are highly correlated, the quantile regre...
| Main Authors: | Asrirawan Asrirawan, Khairil Anwar Notodiputro, Bagus Sartono |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Universitas Pattimura, 2023-12-01 |
| Series: | Barekeng |
| Subjects: | household income; quantile regression forest; random forest; prediction interval |
| Online Access: | https://ojs3.unpatti.ac.id/index.php/barekeng/article/view/8974 |
| _version_ | 1849405635670573056 |
|---|---|
| author | Asrirawan Asrirawan; Khairil Anwar Notodiputro; Bagus Sartono |
| collection | DOAJ |
| description | Quantile regression forest (QRF) is a non-parametric method that uses the random forest algorithm to estimate the conditional distribution function of the response and to construct prediction intervals from conditional quantiles. However, if the explanatory variables (covariates) are highly correlated, the performance of QRF decreases, resulting in low accuracy of the prediction intervals for the outcome variable. This paper investigates the selection of explanatory variables in QRF under several selection scenarios (the full model, forward selection, LASSO, ridge regression, and random forest) to improve the accuracy of household income prediction. The data were obtained from the National Labour Force Survey in 2021. The results indicate that, in terms of RMSE, the random forest method outperforms the other methods for selecting explanatory variables. Judged by an average coverage just above the 95% target and by statistical test results, the RF-QRF and Forward-QRF methods outperform the QRF, LASSO-QRF, and Ridge-QRF methods for constructing prediction intervals. |
| format | Article |
| id | doaj-art-fa83d5d06cb9420886143e4f4c87c6fd |
| institution | Kabale University |
| issn | 1978-7227 2615-3017 |
| language | English |
| publishDate | 2023-12-01 |
| publisher | Universitas Pattimura |
| record_format | Article |
| series | Barekeng |
| spelling | Record: doaj-art-fa83d5d06cb9420886143e4f4c87c6fd (2025-08-20T03:36:37Z); language: eng; publisher: Universitas Pattimura; journal: Barekeng; ISSN: 1978-7227, 2615-3017; published: 2023-12-01; vol. 17, no. 4, pp. 1915-1926; DOI: 10.30598/barekengvol17iss4pp1915-1926; article ID: 8974; authors: Asrirawan Asrirawan (Department of Statistics, Faculty of Mathematics and Natural Sciences, University of West Sulawesi, Indonesia), Khairil Anwar Notodiputro (Department of Statistics and Data Science, Faculty of Mathematics and Natural Sciences, IPB University), Bagus Sartono (Department of Statistics and Data Science, Faculty of Mathematics and Natural Sciences, IPB University); URL: https://ojs3.unpatti.ac.id/index.php/barekeng/article/view/8974; keywords: household income; quantile regression forest; random forest; prediction interval |
| title | IMPROVING ACCURACY OF PREDICTION INTERVALS OF HOUSEHOLD INCOME USING QUANTILE REGRESSION FOREST AND SELECTION OF EXPLANATORY VARIABLES |
| topic | household income; quantile regression forest; random forest; prediction interval |
| url | https://ojs3.unpatti.ac.id/index.php/barekeng/article/view/8974 |
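
The description evaluates prediction intervals by their average coverage against a 95% target. As a minimal sketch of how such a criterion could be computed, the Python snippet below measures the share of observed values that fall inside their intervals, along with the mean interval width. The function names and the toy numbers are hypothetical illustrations, not taken from the article, and the article's own evaluation procedure may differ.

```python
from statistics import mean


def interval_coverage(y_true, lower, upper):
    """Return the fraction of observations that lie inside [lower, upper]."""
    if not (len(y_true) == len(lower) == len(upper)):
        raise ValueError("y_true, lower and upper must have equal length")
    if not y_true:
        raise ValueError("at least one observation is required")
    hits = [lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper)]
    return sum(hits) / len(hits)


def mean_interval_width(lower, upper):
    """Return the average width of the prediction intervals."""
    return mean(hi - lo for lo, hi in zip(lower, upper))


# Hypothetical toy values, chosen only to make the arithmetic easy to follow.
y_true = [2.0, 3.5, 5.0, 7.5]
lower = [1.0, 3.0, 5.5, 6.0]
upper = [3.0, 4.0, 6.5, 8.0]

coverage = interval_coverage(y_true, lower, upper)  # 3 of 4 values are covered
width = mean_interval_width(lower, upper)

print(f"coverage = {coverage:.2f}, mean width = {width:.2f}")
```

With nominal 95% intervals, an empirical coverage close to 0.95 on held-out data suggests the intervals are well calibrated; reporting the mean width alongside it helps, since very wide intervals can reach the target coverage trivially.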