Research on memory failure prediction based on ensemble learning.
Timely prediction of memory failures is crucial for the stable operation of data centers. However, existing methods often rely on a single classifier, which can lead to inaccurate or unstable predictions. To address this, we propose a new ensemble model for predicting CE-driven memory failures, wher...
| Main Authors: | Peng Zhang, Jialiang Zhang, Yi Li |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Public Library of Science (PLoS), 2025-01-01 |
| Series: | PLoS ONE |
| Online Access: | https://doi.org/10.1371/journal.pone.0321954 |
| _version_ | 1849315991666819072 |
|---|---|
| author | Peng Zhang Jialiang Zhang Yi Li |
| author_facet | Peng Zhang Jialiang Zhang Yi Li |
| author_sort | Peng Zhang |
| collection | DOAJ |
| description | Timely prediction of memory failures is crucial for the stable operation of data centers. However, existing methods often rely on a single classifier, which can lead to inaccurate or unstable predictions. To address this, we propose a new ensemble model for predicting CE-driven memory failures, where failures occur due to a surge of correctable errors (CEs) in memory, causing server downtime. Our model combines several strong-performing classifiers, such as Random Forest, LightGBM, and XGBoost, and assigns different weights to each based on its performance. By optimizing the decision-making process, the model improves prediction accuracy. We validate the model using in-memory data from Alibaba's data center, and the results show an accuracy of over 84%, outperforming existing single and dual-classifier models, further confirming its excellent predictive performance. |
| format | Article |
| id | doaj-art-792b2273fcce493ba907a834c3a8b3ab |
| institution | Kabale University |
| issn | 1932-6203 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | Public Library of Science (PLoS) |
| record_format | Article |
| series | PLoS ONE |
| spelling | doaj-art-792b2273fcce493ba907a834c3a8b3ab; 2025-08-20T03:51:59Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2025-01-01; 20; 4; e0321954; 10.1371/journal.pone.0321954; Research on memory failure prediction based on ensemble learning.; Peng Zhang; Jialiang Zhang; Yi Li; Timely prediction of memory failures is crucial for the stable operation of data centers. However, existing methods often rely on a single classifier, which can lead to inaccurate or unstable predictions. To address this, we propose a new ensemble model for predicting CE-driven memory failures, where failures occur due to a surge of correctable errors (CEs) in memory, causing server downtime. Our model combines several strong-performing classifiers, such as Random Forest, LightGBM, and XGBoost, and assigns different weights to each based on its performance. By optimizing the decision-making process, the model improves prediction accuracy. We validate the model using in-memory data from Alibaba's data center, and the results show an accuracy of over 84%, outperforming existing single and dual-classifier models, further confirming its excellent predictive performance.; https://doi.org/10.1371/journal.pone.0321954 |
| spellingShingle | Peng Zhang Jialiang Zhang Yi Li Research on memory failure prediction based on ensemble learning. PLoS ONE |
| title | Research on memory failure prediction based on ensemble learning. |
| title_full | Research on memory failure prediction based on ensemble learning. |
| title_fullStr | Research on memory failure prediction based on ensemble learning. |
| title_full_unstemmed | Research on memory failure prediction based on ensemble learning. |
| title_short | Research on memory failure prediction based on ensemble learning. |
| title_sort | research on memory failure prediction based on ensemble learning |
| url | https://doi.org/10.1371/journal.pone.0321954 |
| work_keys_str_mv | AT pengzhang researchonmemoryfailurepredictionbasedonensemblelearning AT jialiangzhang researchonmemoryfailurepredictionbasedonensemblelearning AT yili researchonmemoryfailurepredictionbasedonensemblelearning |
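
The description field above says the model assigns each classifier a weight based on its performance, but this record does not give the weighting rule. The following is a minimal Python sketch of one common approach, performance-weighted soft voting, and it may differ from the authors' method. The function names, the normalization rule, the validation scores, and the probabilities are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def performance_weights(val_scores: Dict[str, float]) -> Dict[str, float]:
    """Turn per-classifier validation scores into weights that sum to 1.

    Assumption: weights are proportional to the validation score. The
    catalog record does not state the rule used in the paper.
    """
    total = sum(val_scores.values())
    if total <= 0:
        raise ValueError("validation scores must sum to a positive value")
    return {name: score / total for name, score in val_scores.items()}


def weighted_vote(
    probs: Dict[str, List[float]],
    weights: Dict[str, float],
    threshold: float = 0.5,
) -> List[int]:
    """Combine per-classifier failure probabilities into 0/1 predictions."""
    n_samples = len(next(iter(probs.values())))
    combined = [0.0] * n_samples
    for name, model_probs in probs.items():
        if len(model_probs) != n_samples:
            raise ValueError(f"{name} returned a different number of samples")
        for i, p in enumerate(model_probs):
            # Each classifier contributes in proportion to its weight.
            combined[i] += weights[name] * p
    # Label 1 means "failure predicted" for that sample.
    return [1 if score >= threshold else 0 for score in combined]


# Hypothetical validation scores for the three classifiers named above.
val_scores = {"random_forest": 0.80, "lightgbm": 0.85, "xgboost": 0.75}
weights = performance_weights(val_scores)

# Hypothetical predicted failure probabilities for three memory modules.
probs = {
    "random_forest": [0.9, 0.2, 0.6],
    "lightgbm": [0.8, 0.1, 0.3],
    "xgboost": [0.7, 0.4, 0.4],
}
predictions = weighted_vote(probs, weights)
print(weights)
print(predictions)
```

In this sketch a classifier with a higher validation score pulls the combined probability further toward its own estimate, which is one way an ensemble can damp the instability of any single classifier.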