PERFORMANCE COMPARISON OF DECISION TREE AND LOGISTIC REGRESSION METHODS FOR CLASSIFICATION OF SNP GENETIC DATA
This research compares the accuracy of the decision tree and logistic regression methods on a given data set. The decision tree is a classification technique in data mining. In the decision tree method, very large data samples are represented as a smaller set of rules, and logist...
| Main Authors: | Adi Setiawan, Febi Setivani, Tundjung Mahatma |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Universitas Pattimura, 2024-03-01 |
| Series: | Barekeng |
| Subjects: | accuracy; decision tree; logistic regression |
| Online Access: | https://ojs3.unpatti.ac.id/index.php/barekeng/article/view/10450 |
| _version_ | 1849402509261537280 |
|---|---|
| author | Adi Setiawan; Febi Setivani; Tundjung Mahatma |
| collection | DOAJ |
| description | This study compares the classification accuracy of the decision tree and logistic regression methods on a given data set. The decision tree is a classification technique in data mining that represents a very large data sample as a smaller set of rules. Logistic regression models the effect of independent variables on a dichotomous dependent variable. Both methods were implemented and analyzed in R to determine which performs better on SNP (Single Nucleotide Polymorphism) genetic data, namely the Asthma data. The data were obtained from the R package "SNPassoc" under the data set name "asthma". The Asthma data has 57 features: Country, Gender, Age, BMI, Smoke, Case control, and the SNP genetic codes. The two methods were compared on the accuracy values they obtained. The proportion of test data was varied over 40%, 30%, 20%, and 10%, and each setting was simulated 1000 times in order to obtain a better accuracy value. For test proportions of 40%, 30%, 20%, and 10%, the decision tree method obtained accuracies of 0.5793, 0.5777, 0.5745, and 0.5526, respectively, while the logistic regression method obtained 0.7696, 0.7729, 0.7763, and 0.7788. It can therefore be concluded that, in this case, logistic regression classifies the Asthma data better than the decision tree. |
| format | Article |
| id | doaj-art-abd8a15dc00945db8e5a2f34fc983475 |
| institution | Kabale University |
| issn | 1978-7227 2615-3017 |
| language | English |
| publishDate | 2024-03-01 |
| publisher | Universitas Pattimura |
| record_format | Article |
| series | Barekeng |
| spelling | doaj-art-abd8a15dc00945db8e5a2f34fc983475; 2025-08-20T03:37:31Z; eng; Universitas Pattimura; Barekeng; ISSN 1978-7227, 2615-3017; 2024-03-01; vol. 18, no. 1, pp. 0403-0412; DOI 10.30598/barekengvol18iss1pp0403-0412; 10450; Adi Setiawan (Department of Data Science, Faculty of Science and Mathematics, Satya Wacana Christian University, Indonesia); Febi Setivani (Department of Mathematics, Faculty of Science and Mathematics, Satya Wacana Christian University, Indonesia); Tundjung Mahatma (Department of Mathematics, Faculty of Science and Mathematics, Satya Wacana Christian University, Indonesia); https://ojs3.unpatti.ac.id/index.php/barekeng/article/view/10450; accuracy; decision tree; logistic regression |
| title | PERFORMANCE COMPARISON OF DECISION TREE AND LOGISTIC REGRESSION METHODS FOR CLASSIFICATION OF SNP GENETIC DATA |
| topic | accuracy; decision tree; logistic regression |
| url | https://ojs3.unpatti.ac.id/index.php/barekeng/article/view/10450 |
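
The evaluation protocol summarized in the description (repeated random train/test splits at several test proportions, with accuracy averaged over the repetitions) can be sketched as follows. This is only an illustrative sketch and not the authors' R code: it uses synthetic genotype-like data instead of the SNPassoc "asthma" data, a depth-1 decision stump as a stand-in for a full decision tree, a plain gradient-descent logistic regression, and far fewer repetitions than the 1000 reported. All function names and effect sizes are hypothetical.

```python
import math
import random


def make_synthetic(n, rng):
    """Hypothetical SNP-like data: three genotype codes (0/1/2) per sample."""
    X, y = [], []
    for _ in range(n):
        x = [rng.choice((0, 1, 2)) for _ in range(3)]
        z = 1.2 * x[0] - 0.8 * x[1] + 0.3  # made-up effect sizes (assumption)
        p = 1.0 / (1.0 + math.exp(-z))
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y


def fit_logistic(X, y, epochs=100, lr=0.3):
    """Fit logistic regression by batch gradient descent; return a predictor."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    # Predict class 1 when the linear score is positive (probability > 0.5).
    return lambda x: 1 if b + sum(wj * xj for wj, xj in zip(w, x)) > 0 else 0


def fit_stump(X, y):
    """Fit a depth-1 decision tree: the single threshold split with fewest errors."""
    best = None  # (training errors, feature index, threshold, label on the left)
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for left in (0, 1):
                errors = sum(
                    (left if x[j] <= t else 1 - left) != yi for x, yi in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, left)
    _, j, t, left = best
    return lambda x: left if x[j] <= t else 1 - left


def holdout_accuracy(fit, X, y, test_fraction, rng):
    """Accuracy on one random train/test split with the given test proportion."""
    idx = list(range(len(X)))
    rng.shuffle(idx)
    n_test = round(test_fraction * len(X))
    test, train = idx[:n_test], idx[n_test:]
    predict = fit([X[i] for i in train], [y[i] for i in train])
    return sum(predict(X[i]) == y[i] for i in test) / len(test)


def compare(n_samples=200, n_repeats=10, seed=0):
    """Mean accuracy per (method, test proportion) over repeated random splits."""
    rng = random.Random(seed)
    X, y = make_synthetic(n_samples, rng)
    results = {}
    for test_fraction in (0.4, 0.3, 0.2, 0.1):
        for name, fit in (("stump", fit_stump), ("logistic", fit_logistic)):
            accs = [
                holdout_accuracy(fit, X, y, test_fraction, rng)
                for _ in range(n_repeats)
            ]
            results[(name, test_fraction)] = sum(accs) / len(accs)
    return results


if __name__ == "__main__":
    for (name, frac), acc in compare().items():
        print(f"{name:8s} test={frac:.0%} mean accuracy={acc:.4f}")
```

Averaging over many random splits reduces the dependence of the accuracy estimate on any single partition, which is presumably why the study repeats each setting 1000 times. The numbers this sketch prints come from synthetic data and are not comparable to the accuracies reported in the record.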