A new hybrid method for data analysis when a significant percentage of data is missing
This article compares the efficiency of different imputation methods for missing data. We use mean, median, Expectation-Maximization (EM), regression imputation (RI), and multiple imputation (MI) to replace missing data. We also employ three proposed combination methods, namely EM...
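The kind of comparison the abstract describes can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: it uses synthetic linear data rather than the Waterborne Container Trade dataset, covers only mean, median, and simple regression imputation (EM, MI, and the hybrid methods are not implemented), and scores each method by mean squared error against the known complete values. All function names are hypothetical.

```python
import numpy as np


def mask_at_random(y, frac, rng):
    """Return a float copy of y with a fraction `frac` of entries set to NaN."""
    y_missing = np.asarray(y, dtype=float).copy()
    n_missing = int(round(frac * y_missing.size))
    idx = rng.choice(y_missing.size, size=n_missing, replace=False)
    y_missing[idx] = np.nan
    return y_missing


def impute_mean(y_missing):
    """Replace every NaN with the mean of the observed values."""
    out = y_missing.copy()
    out[np.isnan(out)] = np.nanmean(y_missing)
    return out


def impute_median(y_missing):
    """Replace every NaN with the median of the observed values."""
    out = y_missing.copy()
    out[np.isnan(out)] = np.nanmedian(y_missing)
    return out


def impute_regression(y_missing, x):
    """Fit y = intercept + slope * x on observed pairs, then predict the NaNs."""
    observed = ~np.isnan(y_missing)
    slope, intercept = np.polyfit(x[observed], y_missing[observed], deg=1)
    out = y_missing.copy()
    out[~observed] = intercept + slope * x[~observed]
    return out


def mse(y_true, y_imputed):
    """Mean squared error over all entries (observed entries contribute zero)."""
    return float(np.mean((y_true - y_imputed) ** 2))


def compare(fracs=(0.1, 0.3, 0.5), n=200, seed=0):
    """Score each imputation method at several missing fractions."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 10.0, n)
    # Synthetic data (an assumption): linear trend plus Gaussian noise.
    y = 3.0 + 2.0 * x + rng.normal(scale=1.0, size=n)
    results = {}
    for frac in fracs:
        y_missing = mask_at_random(y, frac, rng)
        results[frac] = {
            "mean": mse(y, impute_mean(y_missing)),
            "median": mse(y, impute_median(y_missing)),
            "regression": mse(y, impute_regression(y_missing, x)),
        }
    return results


if __name__ == "__main__":
    for frac, scores in compare().items():
        line = ", ".join(f"{name}: {value:.3f}" for name, value in scores.items())
        print(f"missing {frac:.0%} -> {line}")
```

Because the synthetic data follow a strong linear trend, regression imputation should score a lower MSE than mean or median imputation here. That says nothing about how the methods rank on the article's data.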
| Main Authors: | Behrouz Fathi-Vajargah, Ahmad Nouraldin |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | University of Mohaghegh Ardabili, 2024-12-01 |
| Series: | Journal of Hyperstructures |
| Subjects: | missing data; imputation; mean square error; mean; standard deviation |
| Online Access: | https://jhs.uma.ac.ir/article_3534_e8b573ee79ad84dc2a9cd6f296b7afb8.pdf |

| _version_ | 1849427786866884608 |
|---|---|
| author | Behrouz Fathi-Vajargah, Ahmad Nouraldin |
| collection | DOAJ |
| description | This article compares the efficiency of different imputation methods for missing data. We use mean, median, Expectation-Maximization (EM), regression imputation (RI), and multiple imputation (MI) to replace missing data. We also employ three proposed combination methods, namely EM imputation with MI imputation (EMMI), EM imputation with regression imputation (EMR), and regression imputation with MI imputation (RMI). We compare these methods on an example study of Waterborne Container Trade by the US Customs Port (2000-2017), applying each method at different missing percentages. Several criteria are used to compare estimation efficiency: the mean, the standard deviation (SD), and the mean squared error (MSE). The results show that, in terms of MSE, the composite RMI imputation method outperforms the other methods in almost all situations. Nevertheless, when the missing percentage is small, the EMR imputation method performs better. In terms of the SD criterion, the MI method is better than the other methods, while the RMI method performs well when the missing percentage is large. When the missing percentage is in the range 40-50%, the EMR and RMI imputation methods give a better MSE. |
| format | Article |
| id | doaj-art-eb7eca76176842f79e7c40169f4d114c |
| institution | Kabale University |
| issn | 2251-8436 2322-1666 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | University of Mohaghegh Ardabili |
| record_format | Article |
| series | Journal of Hyperstructures |
| spelling | doaj-art-eb7eca76176842f79e7c40169f4d114c; 2025-08-20T03:28:54Z; eng; University of Mohaghegh Ardabili; Journal of Hyperstructures; 2251-8436; 2322-1666; 2024-12-01; 13; 2; 297; 304; 10.22098/jhs.2024.15095.1015; 3534; A new hybrid method for data analysis when a significant percentage of data is missing; Behrouz Fathi-Vajargah (Department of Statistics, Faculty of Mathematical Sciences, University of Guilan, Rasht, Iran); Ahmad Nouraldin (Dep. of Applied Maths, University of Guilan, Rasht, Iran); https://jhs.uma.ac.ir/article_3534_e8b573ee79ad84dc2a9cd6f296b7afb8.pdf; missing data; imputation; mean square error; mean; standard deviation |
| title | A new hybrid method for data analysis when a significant percentage of data is missing |
| topic | missing data; imputation; mean square error; mean; standard deviation |