A robust machine learning model based on ribosomal‐subunit‐derived piRNAs for diagnostic potential of nonsmall cell lung cancer across multicentre, large‐scale of sequencing data

Abstract Nonsmall cell lung cancer (NSCLC) is a lethal cancer and lacks robust biomarkers for noninvasive clinical diagnosis. Detecting NSCLC at the early stage can decrease the mortality rate and minimise harm caused by various treatments. We curated 2050 samples from public tissue and plasma datas...

Bibliographic Details
Main Authors: Zitong Gao, Masaki Nasu, Gehan Devendra, Ayman A. Abdul‐Ghani, Anthony J. Herrera, Jeffrey A. Borgia, Christopher W. Seder, Donna Lee Kuehu, Zhuokun Feng, Yu Chen, Ting Gong, Zao Zhang, Owen Chan, Hua Yang, Jianhua Yu, Yuanyuan Fu, Lang Wu, Youping Deng
Format: Article
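Language: English
Published: Wiley 2025-08-01
Series: Clinical and Translational Medicine
Subjects: machine learning; noninvasive diagnosis; nonsmall cell lung cancer; PIWI‐interacting RNA; small noncoding RNA
Online Access: https://doi.org/10.1002/ctm2.70418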
collection DOAJ
description Abstract Nonsmall cell lung cancer (NSCLC) is a lethal cancer and lacks robust biomarkers for noninvasive clinical diagnosis. Detecting NSCLC at an early stage can decrease the mortality rate and minimise harm caused by various treatments. We curated 2050 samples from public tissue and plasma datasets, covering both invasive and noninvasive sample types, and supplemented them with in‐house pooled plasma and exosome samples. Eleven independent transcriptome datasets were used to develop a new machine learning model that integrates PIWI‐interacting RNA (piRNA) to predict NSCLC. Five piRNA signatures derived from ribosomal subunits, identified as tumour‐specific, exhibited robust diagnostic ability and were combined into a piRNA‐Based Tumour Probability Index (pi‐TPI) risk evaluation model. pi‐TPI effectively distinguished NSCLC patients from healthy individuals and identified early‐stage cancers with area under the ROC curve (AUC) values above 0.80. In plasma cohorts, pi‐TPI showed diagnostic efficacy with an AUC of 0.85. Experimental exosomal data enhanced the accuracy of diagnosing noncancerous, benign, and cancer cases. In the noncancer/cancer subgroup, the pi‐TPI marker exhibited superior predictive performance with an AUC of 0.96. These findings underscore the significant clinical potential of the five piRNA signatures as a powerful diagnostic tool for NSCLC, particularly for noninvasive cancer diagnostics.
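The abstract reports a probability index built from five piRNA signatures and evaluated by AUC, but this record does not give the model's form or coefficients. The following is only a minimal sketch of the general idea, assuming a logistic combination of five features; the names `tumour_probability_index`, `roc_auc`, `WEIGHTS`, `INTERCEPT`, and `simulate` are hypothetical, the weights are invented, and the data are synthetic. It is not the authors' pi‐TPI implementation and its output says nothing about the published results.

```python
import math
import random


def tumour_probability_index(expression, weights, intercept):
    """Map five expression values to a 0-1 score with a logistic function.

    Assumption: a plain logistic combination; the real model may differ.
    """
    linear = intercept + sum(w * x for w, x in zip(weights, expression))
    return 1.0 / (1.0 + math.exp(-linear))


def roc_auc(labels, scores):
    """AUC as the probability that a random case outscores a random control.

    Ties count as one half (the Mann-Whitney U formulation of AUC).
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for c in cases:
        for h in controls:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical weights and intercept, chosen for illustration only.
WEIGHTS = [0.8, 0.6, 0.5, 0.4, 0.3]
INTERCEPT = -1.3


def simulate(n_per_group=200, seed=0):
    """Draw synthetic controls (label 0) and cases (label 1) and score them."""
    rng = random.Random(seed)
    labels, scores = [], []
    for label, mean in ((0, 0.0), (1, 1.0)):
        for _ in range(n_per_group):
            sample = [rng.gauss(mean, 1.0) for _ in WEIGHTS]
            labels.append(label)
            scores.append(tumour_probability_index(sample, WEIGHTS, INTERCEPT))
    return labels, scores


if __name__ == "__main__":
    labels, scores = simulate()
    print(f"AUC on synthetic data: {roc_auc(labels, scores):.2f}")
```

Read this way, an AUC of 0.85 means that a randomly chosen NSCLC sample receives a higher index than a randomly chosen control about 85% of the time.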
id doaj-art-2c128a9f05b340fca55c09b6cceed79e
institution Kabale University
issn 2001-1326
spelling doaj-art-2c128a9f05b340fca55c09b6cceed79e; 2025-08-25T18:28:44Z; eng; Wiley; Clinical and Translational Medicine; 2001-1326; 2025-08-01; vol. 15, no. 8; 10.1002/ctm2.70418
Author affiliations:
Zitong Gao: Department of Quantitative Health Sciences, John A. Burns School of Medicine, University of Hawaii at Manoa, Honolulu, Hawaiʻi, USA
Masaki Nasu: Department of Quantitative Health Sciences, John A. Burns School of Medicine, University of Hawaii at Manoa, Honolulu, Hawaiʻi, USA
Gehan Devendra: Department of Medicine, John A. Burns School of Medicine, University of Hawaii at Manoa, Honolulu, Hawaiʻi, USA
Ayman A. Abdul‐Ghani: Cardiothoracic Surgery, The Queen's Medical Center, Honolulu, Hawaiʻi, USA
Anthony J. Herrera: Interventional Radiology, The Queen's Medical Center, Honolulu, Hawaiʻi, USA
Jeffrey A. Borgia: Departments of Anatomy & Cell Biology and Pathology, RUSH University Cancer Center, Chicago, Illinois, USA
Christopher W. Seder: Cardiothoracic Residency Program, RUSH University, Chicago, Illinois, USA
Donna Lee Kuehu: Department of Quantitative Health Sciences, John A. Burns School of Medicine, University of Hawaii at Manoa, Honolulu, Hawaiʻi, USA
Zhuokun Feng: Department of Quantitative Health Sciences, John A. Burns School of Medicine, University of Hawaii at Manoa, Honolulu, Hawaiʻi, USA
Yu Chen: Department of Quantitative Health Sciences, John A. Burns School of Medicine, University of Hawaii at Manoa, Honolulu, Hawaiʻi, USA
Ting Gong: Department of Quantitative Health Sciences, John A. Burns School of Medicine, University of Hawaii at Manoa, Honolulu, Hawaiʻi, USA
Zao Zhang: Hospitalist Medicine, The Queen's Medical Center, Honolulu, Hawaiʻi, USA
Owen Chan: Pathology Core Shared Resource, University of Hawaii Cancer Center, Honolulu, Hawaiʻi, USA
Hua Yang: Department of Quantitative Health Sciences, John A. Burns School of Medicine, University of Hawaii at Manoa, Honolulu, Hawaiʻi, USA
Jianhua Yu: Institute for Precision Cancer Therapeutics and Immuno‐Oncology, Chao Family Comprehensive Cancer Center, University of California, Irvine, California, USA
Yuanyuan Fu: Department of Quantitative Health Sciences, John A. Burns School of Medicine, University of Hawaii at Manoa, Honolulu, Hawaiʻi, USA
Lang Wu: Pacific Center for Genome Research, University of Hawaii Cancer Center, Honolulu, Hawaiʻi, USA
Youping Deng: Department of Quantitative Health Sciences, John A. Burns School of Medicine, University of Hawaii at Manoa, Honolulu, Hawaiʻi, USA
topic machine learning
noninvasive diagnosis
nonsmall cell lung cancer
PIWI‐interacting RNA
small noncoding RNA