Cycle-configuration descriptors: a novel graph-theoretic approach to enhancing molecular inference
Abstract Inference of molecules with desired activities/properties is one of the key and challenging issues in cheminformatics and bioinformatics. For that purpose, our research group has recently developed a state-of-the-art framework mol-infer for molecular inference. This framework first construc...
| Main Authors: | Bowen Song, Jianshen Zhu, Naveed Ahmed Azam, Kazuya Haraguchi, Liang Zhao, Tatsuya Akutsu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-08-01 |
| Series: | Journal of Cheminformatics |
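| Subjects: | Inverse QSAR/QSPR; Molecular inference; Descriptor design; Mixed integer linear programming; Machine learning |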
| Online Access: | https://doi.org/10.1186/s13321-025-01042-z |
| _version_ | 1849735953130717184 |
|---|---|
| author | Bowen Song; Jianshen Zhu; Naveed Ahmed Azam; Kazuya Haraguchi; Liang Zhao; Tatsuya Akutsu |
| author_facet | Bowen Song; Jianshen Zhu; Naveed Ahmed Azam; Kazuya Haraguchi; Liang Zhao; Tatsuya Akutsu |
| author_sort | Bowen Song |
| collection | DOAJ |
| description | Abstract: Inference of molecules with desired activities/properties is one of the key and challenging issues in cheminformatics and bioinformatics. For that purpose, our research group has recently developed a state-of-the-art framework, mol-infer, for molecular inference. This framework first constructs a prediction function for a fixed property using machine learning models, which is then simulated by mixed-integer linear programming to infer desired molecules. The accuracy of the framework relies heavily on the representation power of the descriptors. In this study, we highlight a typical class of non-isomorphic chemical graphs with reasonably different property values that cannot be distinguished by the standard “two-layered (2L) model” of mol-infer. To address this distinguishability problem of the 2L model, we propose a novel family of descriptors, named cycle-configuration (CC), which captures the ortho/meta/para patterns that appear in aromatic rings, something the framework could not represent so far. Extensive computational experiments show that, with the new descriptors, we can construct prediction functions with performance similar to or better than that of our previous studies for all 44 tested chemical properties (27 regression datasets and 17 classification datasets), confirming the effectiveness of the CC descriptors. For inference, we also provide a system of linear constraints that formulates the CC descriptors. We demonstrate that a chemical graph with up to 50 non-hydrogen vertices can be inferred within a practical time frame. |
| format | Article |
| id | doaj-art-49636484070b4e34bb1e770e677f389f |
| institution | DOAJ |
| issn | 1758-2946 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | BMC |
| record_format | Article |
| series | Journal of Cheminformatics |
| spelling | doaj-art-49636484070b4e34bb1e770e677f389f; 2025-08-20T03:07:24Z; eng; BMC; Journal of Cheminformatics; 1758-2946; 2025-08-01; 171127; 10.1186/s13321-025-01042-z; Cycle-configuration descriptors: a novel graph-theoretic approach to enhancing molecular inference; Bowen Song (Graduate School of Informatics, Kyoto University); Jianshen Zhu (Graduate School of Informatics, Kyoto University); Naveed Ahmed Azam (Department of Mathematics, Quaid-i-Azam University); Kazuya Haraguchi (Graduate School of Informatics, Kyoto University); Liang Zhao (Graduate School of Advanced Integrated Studies in Human Survivability, Kyoto University); Tatsuya Akutsu (Bioinformatics Center, Institute for Chemical Research, Kyoto University); https://doi.org/10.1186/s13321-025-01042-z; Inverse QSAR/QSPR; Molecular inference; Descriptor design; Mixed integer linear programming; Machine learning |
| title | Cycle-configuration descriptors: a novel graph-theoretic approach to enhancing molecular inference |
| topic | Inverse QSAR/QSPR; Molecular inference; Descriptor design; Mixed integer linear programming; Machine learning |
| url | https://doi.org/10.1186/s13321-025-01042-z |