Curated CYP450 Interaction Dataset: Covering the Majority of Phase I Drug Metabolism

Abstract We collected and organized a detailed dataset encompassing both substrates and non-substrates for six principal cytochrome P450 (CYP450) isozymes, responsible for 90% of Phase I drug metabolism in humans. These isozymes, specifically CYP1A2, CYP2C9, CYP2C19, CYP2D6, CYP2E1, and CYP3A4, play...

Bibliographic Details
Main Authors: Yu-Hao Ni, Yu-Wen Su, Shang-Chen Yang, Jia-Cheng Hong, Po-Wen Allen Du, Yu-Ting Hsu, Tien-Chueh Kuo, Yufeng Jane Tseng
Format: Article
Language: English
Published: Nature Portfolio 2025-08-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-025-05753-8
collection DOAJ
description Abstract We collected and organized a detailed dataset encompassing both substrates and non-substrates for six principal cytochrome P450 (CYP450) isozymes, responsible for 90% of Phase I drug metabolism in humans. These isozymes, specifically CYP1A2, CYP2C9, CYP2C19, CYP2D6, CYP2E1, and CYP3A4, play critical roles in the detoxification and metabolic processing of therapeutic compounds. The dataset, meticulously assembled, includes interactions with approximately 2000 compounds per enzyme, ensuring comprehensive coverage and high accuracy. Employing a combination of conventional machine learning techniques alongside advanced methodologies such as Graph Convolutional Networks (GCN), robust models have been developed to elucidate these drug-enzyme interactions. The dataset is poised to significantly contribute to fields requiring pharmacokinetic modeling, furthering drug development efforts and toxicological studies by providing an essential resource for the accurate prediction of metabolic pathways, thereby enhancing drug safety and efficacy assessments.
id doaj-art-a9896a41ccbf4b7e8bfa3952e4ae35fe
institution Kabale University
issn 2052-4463
affiliation Yu-Hao Ni: Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University
Yu-Wen Su: Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University
Shang-Chen Yang: School of Medicine, National Taiwan University
Jia-Cheng Hong: Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University
Po-Wen Allen Du: Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University
Yu-Ting Hsu: Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University
Tien-Chueh Kuo: Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University
Yufeng Jane Tseng: Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University