IDENTIFYING IMPORTANT GENES IN OVARIAN CANCER FROM HIGH-DIMENSIONAL MICROARRAY DATA USING SIFS-CART METHOD

Bibliographic Details
Main Authors: Ni Kadek Emik Sapitri, Umu Sa'adah, Nur Shofianah
Format: Article
Language: English
Published: Universitas Pattimura 2024-07-01
Series: Barekeng
Subjects: CART, important genes, machine learning, microarray data, ovarian cancer, SIFS
Online Access: https://ojs3.unpatti.ac.id/index.php/barekeng/article/view/12580
author Ni Kadek Emik Sapitri
Umu Sa'adah
Nur Shofianah
collection DOAJ
description Ovarian cancer can be identified from microarray data using machine learning. Many studies focus only on improving classification algorithms to achieve higher performance. However, the purpose of classification is not only to obtain high performance but also to gain new knowledge from the results. This research addresses both goals. Using a hybrid of Supervised Infinite Feature Selection (SIFS) and Classification and Regression Tree (CART), called SIFS-CART, it aims to predict ovarian cancer and identify potentially important genes in ovarian cancer cases. The data used is the OVA_ovary dataset. In the best SIFS-CART model, SIFS reduced the 10935 genes in the initial OVA_ovary dataset to 1000 genes, and CART was then built on these 1000 genes. Measured by balanced accuracy (BA), a metric suited to imbalanced microarray data, the best SIFS-CART model achieves 85.7% BA in training and 83.2% in testing. The optimal CART in this model needs only four of the 1000 selected genes: STAR, WT1, PEG3, and ASPN. According to the medical literature reviewed, STAR, WT1, and PEG3 play an important role in ovarian cancer. The relationship between ASPN and ovarian cancer, however, has not yet been studied in detail by medical researchers.
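The balanced accuracy (BA) metric named in the description is commonly defined as the mean of the per-class recalls. The record does not confirm the exact formula the authors used, so the following is only a minimal sketch under that standard definition. The class labels and toy predictions are hypothetical and are not taken from the OVA_ovary dataset.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Plain accuracy: fraction of samples predicted correctly."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Balanced accuracy: mean of per-class recall (standard definition, assumed here)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced toy labels: 90 "other" samples and 10 "ovary" samples.
y_true = ["other"] * 90 + ["ovary"] * 10
# A trivial classifier that always predicts the majority class.
y_pred = ["other"] * 100

acc = accuracy(y_true, y_pred)
ba = balanced_accuracy(y_true, y_pred)
print(f"accuracy = {acc:.3f}, balanced accuracy = {ba:.3f}")
```

In this toy case the majority-class predictor looks strong on plain accuracy while its balanced accuracy is no better than chance, which is why BA is a reasonable choice for imbalanced data.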
format Article
id doaj-art-d5229479411243968d4a3a67bacbbfce
institution DOAJ
issn 1978-7227
2615-3017
language English
publishDate 2024-07-01
publisher Universitas Pattimura
record_format Article
series Barekeng
volume 18
issue 3
pages 1909-1918
doi 10.30598/barekengvol18iss3pp1909-1918
affiliation Department of Mathematics, Faculty of Mathematics and Natural Science, Universitas Brawijaya, Indonesia (all three authors)
title IDENTIFYING IMPORTANT GENES IN OVARIAN CANCER FROM HIGH-DIMENSIONAL MICROARRAY DATA USING SIFS-CART METHOD
topic cart
important genes
machine learning
microarray data
ovarian cancer
sifs
url https://ojs3.unpatti.ac.id/index.php/barekeng/article/view/12580