Osteosarcoma knowledge graph question answering system: deep learning-based knowledge graph and large language model fusion
Objective: Osteosarcoma is a prevalent primary malignant bone tumor in children and adolescents, accounting for approximately 5 % of childhood malignancies. Because of its rarity and biological complexity, treatment breakthroughs for osteosarcoma have been limited. To advance research in this field,...
| Main Authors: | Lulu Zhang, Weisong Zhao, Zhiwei Cheng, Yafei Jiang, Kai Tian, Jia Shi, Zhenyu Jiang, Yingqi Hua |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-05-01 |
| Series: | Intelligent Medicine |
| Subjects: | Osteosarcoma; Knowledge graph; Large language model; Text mining |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2667102625000269 |
| _version_ | 1850123603428769792 |
|---|---|
| author | Lulu Zhang Weisong Zhao Zhiwei Cheng Yafei Jiang Kai Tian Jia Shi Zhenyu Jiang Yingqi Hua |
| author_sort | Lulu Zhang |
| collection | DOAJ |
| description | Objective: Osteosarcoma is a prevalent primary malignant bone tumor in children and adolescents, accounting for approximately 5 % of childhood malignancies. Because of its rarity and biological complexity, treatment breakthroughs for osteosarcoma have been limited. To advance research in this field, we aimed to construct the first comprehensive osteosarcoma knowledge graph (OSKG) using the PubMed database. Methods: A systematic search of PubMed (2003–2023) using the keyword “osteosarcoma” yielded 25,415 abstracts. Leveraging BioBERT, pretrained on biomedical corpora and fine-tuned with osteosarcoma-specific manual annotations, we identified 16 entity types and 17 biological relationships. The extracted elements were synthesized to create the OSKG, resulting in a deep learning-based knowledge base to explore osteosarcoma pathogenesis and molecular mechanisms. We then developed a specialized question-answering system (knowledge graph question answering (KGQA)) powered by ChatGLM3. This system employs advanced natural language processing and incorporates the OSKG to ensure optimal response quality and accuracy. Results: The pretrained BioBERT averaged > 92 % accuracy in entity and relationship training. Evaluation using 100 pairs of gold-standard quizzes showed that the final quiz system outperformed other large language models in accuracy and robustness. Conclusion: The system is designed to provide accurate disease-related queries and answers, effectively facilitating knowledge acquisition and reasoning in medical research and clinical practice. This project offers a robust tool for osteosarcoma research and promotes the deep integration of knowledge graphs and artificial intelligence technologies in the medical field. |
| format | Article |
| id | doaj-art-3aa50d87b7b04f36bc66716d20d8e810 |
| institution | OA Journals |
| issn | 2667-1026 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Elsevier |
| record_format | Article |
| series | Intelligent Medicine |
| spelling | doaj-art-3aa50d87b7b04f36bc66716d20d8e810; 2025-08-20T02:34:33Z; eng; Elsevier; Intelligent Medicine; ISSN 2667-1026; 2025-05-01; vol. 5, no. 2, pp. 99-110; DOI 10.1016/j.imed.2024.12.001; Osteosarcoma knowledge graph question answering system: deep learning-based knowledge graph and large language model fusion; Authors: Lulu Zhang (a, b), Weisong Zhao (b), Zhiwei Cheng (b), Yafei Jiang (b), Kai Tian (b), Jia Shi (b), Zhenyu Jiang (a), Yingqi Hua (b, corresponding author); Affiliations: (a) School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; (b) Department of Orthopedic Oncology, Shanghai Bone Tumor Institute, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200080, China; http://www.sciencedirect.com/science/article/pii/S2667102625000269; Keywords: Osteosarcoma, Knowledge graph, Large language model, Text mining |
| title | Osteosarcoma knowledge graph question answering system: deep learning-based knowledge graph and large language model fusion |
| topic | Osteosarcoma Knowledge graph Large language model Text mining |
| url | http://www.sciencedirect.com/science/article/pii/S2667102625000269 |