IDRdecoder: a machine learning approach for rational drug discovery toward intrinsically disordered regions

Introduction: Intrinsically disordered regions (IDRs) of proteins have traditionally been overlooked as drug targets. However, with growing recognition of their crucial role in biological activity and their involvement in various diseases, IDRs have emerged as promising targets for drug discovery. Des...

Bibliographic Details
Main Authors: Clara Shionyu-Mitusyama, Satoshi Ohmori, Subaru Hirata, Hirokazu Ishida, Tsuyoshi Shirai
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-07-01
Series: Frontiers in Bioinformatics
Subjects:
Online Access: https://www.frontiersin.org/articles/10.3389/fbinf.2025.1627836/full
author Clara Shionyu-Mitusyama
Satoshi Ohmori
Subaru Hirata
Hirokazu Ishida
Tsuyoshi Shirai
collection DOAJ
description Introduction: Intrinsically disordered regions (IDRs) of proteins have traditionally been overlooked as drug targets. However, with growing recognition of their crucial role in biological activity and their involvement in various diseases, IDRs have emerged as promising targets for drug discovery. Despite this potential, rational methodologies for IDR-targeted drug discovery remain underdeveloped, primarily due to a lack of reference experimental data. Methods: This study explores a machine learning approach to predict IDR functions, drug interaction sites, and interacting molecular substructures within IDR sequences. To address the data gap, stepwise transfer learning was employed. IDRdecoder sequentially generates predictions for IDR classification, interaction sites, and interacting ligand substructures. In the first step, the neural net was trained as an autoencoder on 26,480,862 predicted IDR sequences. It was then trained via transfer learning on 57,692 ligand-binding PDB sequences with higher IDR tendency to predict ligand-interacting sites and ligand types. Results: IDRdecoder was evaluated against 9 IDR sequences that have been experimentally characterized in detail as drug targets. In the encoding space, specific GO terms related to the hypothesized functions of the evaluation IDR sequences were highly enriched. The model’s predictions of drug-interacting sites and ligand types achieved areas under the curve (AUC) of 0.616 and 0.702, respectively. Compared with existing methods, including ProteinBERT, IDRdecoder showed moderately improved performance. Discussion: IDRdecoder is the first application for predicting drug interaction sites and ligands in IDR sequences. Analysis of the prediction results revealed characteristics beneficial for IDR-drug design; for instance, Tyr and Ala are preferred target sites, while flexible substructures, such as alkyl groups, are favored in ligand molecules.
format Article
id doaj-art-e78ce32e8bbd48a0ad0c1393d83bf39d
institution Kabale University
issn 2673-7647
language English
publishDate 2025-07-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Bioinformatics
spelling doaj-art-e78ce32e8bbd48a0ad0c1393d83bf39d
2025-08-20T03:28:01Z
vol. 5
doi 10.3389/fbinf.2025.1627836
article 1627836
Clara Shionyu-Mitusyama, Department of Bioscience, Nagahama Institute of Bio-Science and Technology, Nagahama, Shiga, Japan
Satoshi Ohmori, Department of Bioscience, Nagahama Institute of Bio-Science and Technology, Nagahama, Shiga, Japan
Subaru Hirata, Faculty of Data Science, Shiga University, 1-1-1 Banba, Hikone, Shiga, Japan
Hirokazu Ishida, Department of Bioscience, Nagahama Institute of Bio-Science and Technology, Nagahama, Shiga, Japan
Tsuyoshi Shirai, Department of Bioscience, Nagahama Institute of Bio-Science and Technology, Nagahama, Shiga, Japan
Tsuyoshi Shirai, Faculty of Data Science, Shiga University, 1-1-1 Banba, Hikone, Shiga, Japan
title IDRdecoder: a machine learning approach for rational drug discovery toward intrinsically disordered regions
topic intrinsically disordered proteins
neural net
sequence-based prediction method
structural bioinformatics
drug design
url https://www.frontiersin.org/articles/10.3389/fbinf.2025.1627836/full
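
The description reports prediction performance as area under the ROC curve (AUC): 0.616 for drug-interacting sites and 0.702 for ligand types. As a reading aid, the following is a minimal sketch of how an AUC can be computed from per-residue labels and scores using the rank-sum (Mann-Whitney U) identity. It is not the authors' evaluation code; the function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 1 for a positive (e.g. an interacting residue), 0 otherwise.
    scores: predicted scores, higher meaning "more likely positive".
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    # Walk through groups of tied scores and give each group its average
    # 1-based rank, accumulating the rank sum of the positive examples.
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        positives_in_group = sum(1 for k in range(i, j) if pairs[k][1] == 1)
        rank_sum_pos += avg_rank * positives_in_group
        i = j

    # U statistic for the positives, normalised by the number of
    # positive/negative pairs.
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


if __name__ == "__main__":
    # Hypothetical toy data: four residues, two of them true binding sites.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(toy_labels, toy_scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported values of 0.616 and 0.702 indicate better-than-chance ranking on the 9 evaluation sequences.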