Comparison between Statistical Approaches and Data Mining Algorithms for Outlier Detection

Bibliographic Details
Main Authors: Annisa Putri Utami, Anwar Fitrianto, Khairil Anwar Notodiputro
Format: Article
Language: English
Published: Mathematics Department UIN Maulana Malik Ibrahim Malang 2024-05-01
Series: Cauchy: Jurnal Matematika Murni dan Aplikasi
Subjects: distance-based methods, masking, outlier, outlier detection method, swamping
Online Access: https://ejournal.uin-malang.ac.id/index.php/Math/article/view/25450
collection DOAJ
description Outliers are observations that differ markedly from most of the data. Their presence can distort the results of one analysis yet carry important information for another, so identifying outliers before analyzing data is crucial. Outlier detection methods were pioneered by researchers in statistics. However, because rapid technological advances have made it easy to collect large volumes of data, the development of outlier detection techniques is now driven mainly by researchers in computer science (data mining), who rely on computing facilities. This research examines the results of simulation studies comparing methods for identifying multiple outliers using statistical approaches and data mining algorithms under various predetermined data scenarios. In the scenarios studied, the statistical methods generally performed better than the data mining-based methods. For further research, the authors suggest improving the data mining methods by giving more attention to statistical analysis, in addition to computing time, so that outlier detection becomes both faster and more precise.
id doaj-art-be03888a9f8f4be9bdf72eec3231da3b
institution OA Journals
issn 2086-0382
2477-3344
spelling Annisa Putri Utami, Anwar Fitrianto, Khairil Anwar Notodiputro (Department of Statistics, IPB University). Comparison between Statistical Approaches and Data Mining Algorithms for Outlier Detection. Cauchy: Jurnal Matematika Murni dan Aplikasi, vol. 9, no. 1, pp. 119-128, 2024-05-01. Mathematics Department UIN Maulana Malik Ibrahim Malang. DOI: 10.18860/ca.v9i1.25450. Keywords: distance-based methods, masking, outlier, outlier detection method, swamping.