Big Data Analysis of Lung Cancer Dataset Using Classification

Among cancers, lung cancer is a major killer on a global scale. Because it is a common and fatal disease, accurate classification of cancer subtypes is essential for determining effective therapy options. The methods used in the study were classification algorithm...

Bibliographic Details
Main Authors: Tannady Hendy, Fernandes Andry Johanes, Susanto William, Tannady Tan Henny, Bin Rakiman Umol Syamsyul
Format: Article
Language: English
Published: EDP Sciences 2025-01-01
Series: E3S Web of Conferences
Subjects: big data; lung cancer; classification; decision tree
Online Access: https://www.e3s-conferences.org/articles/e3sconf/pdf/2025/19/e3sconf_icsget2025_03010.pdf
author Tannady Hendy
Fernandes Andry Johanes
Susanto William
Tannady Tan Henny
Bin Rakiman Umol Syamsyul
collection DOAJ
description Background: Among cancers, lung cancer is a major killer on a global scale. Because it is a common and fatal disease, accurate classification of cancer subtypes is essential for determining effective therapy options. Methods: The study applied classification algorithms to data on lung cancer cases. Lung cancer detection, treatment, and prevention have advanced considerably in recent years, and several previous studies have discussed the important role that improved big data methods and analysis play in the medical and health sector. This research was conducted to facilitate the detection of lung cancer based on the symptoms experienced by patients. Results: RapidMiner's decision tree algorithm achieved a high level of accuracy, with a Kappa score of 74.32%. This finding suggests that the study's data is reliable enough to identify lung cancer. The results also stress the need for habit- and symptom-based early detection and diagnosis of lung cancer.
format Article
id doaj-art-45f01e47ded746c2b3ace57a294b4b70
institution DOAJ
issn 2267-1242
language English
publishDate 2025-01-01
publisher EDP Sciences
record_format Article
series E3S Web of Conferences
spelling doaj-art-45f01e47ded746c2b3ace57a294b4b70; 2025-08-20T03:07:06Z; eng; EDP Sciences; E3S Web of Conferences; ISSN 2267-1242; 2025-01-01; vol. 619, art. 03010; doi 10.1051/e3sconf/202561903010; e3sconf_icsget2025_03010
Tannady Hendy (0): Department of Management, Universitas Esa Unggul
Fernandes Andry Johanes (1): Department of Information System, Universitas Bunda Mulia
Susanto William (2): Department of Information System, Universitas Bunda Mulia
Tannady Tan Henny (3): Department of Internal Medicine, Universitas Kristen Krida Wacana
Bin Rakiman Umol Syamsyul (4*): Department of Business Management, Universiti Teknologi MARA
title Big Data Analysis of Lung Cancer Dataset Using Classification
topic big data
lung cancer
classification
decision tree
url https://www.e3s-conferences.org/articles/e3sconf/pdf/2025/19/e3sconf_icsget2025_03010.pdf
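
The description in this record reports a Kappa score alongside a claim about accuracy. The two are different measures: accuracy is the share of correct predictions, while Cohen's kappa discounts the agreement expected by chance. The following is a minimal sketch of that distinction, assuming a binary confusion matrix. The counts are hypothetical and are not taken from the study, and the function names `accuracy` and `cohen_kappa` are illustrative, not RapidMiner's API.

```python
def accuracy(confusion):
    """Share of correct predictions (rows: actual class, columns: predicted class)."""
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def cohen_kappa(confusion):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    total = sum(sum(row) for row in confusion)
    observed = accuracy(confusion)
    # Chance agreement: for each class, (row total * column total) / total^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(len(confusion))
    ) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical counts for illustration only (not data from the study).
example = [
    [40, 10],  # actual positive: 40 predicted positive, 10 predicted negative
    [5, 45],   # actual negative: 5 predicted positive, 45 predicted negative
]

print(f"accuracy = {accuracy(example):.2f}")
print(f"kappa    = {cohen_kappa(example):.2f}")
```

With these made-up counts the accuracy is 0.85 while kappa is 0.70, which shows why a kappa value should not be read directly as an accuracy figure.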