Multimodal deep learning for cephalometric landmark detection and treatment prediction

Abstract In orthodontics and maxillofacial surgery, accurate cephalometric analysis and treatment outcome prediction are critical for clinical decision-making. Traditional approaches rely on manual landmark identification, which is time-consuming and subject to inter-observer variability, while exis...

Bibliographic Details
Main Authors:	Fei Gao, Yulong Tang
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-07-01
Series:	Scientific Reports
Subjects:	Cephalometric analysis; Multi-modal deep learning; Landmark detection; Treatment outcome prediction; Attention mechanism; Orthodontics
Online Access:	https://doi.org/10.1038/s41598-025-06229-w

_version_	1849333275480293376
author	Fei Gao Yulong Tang
author_facet	Fei Gao Yulong Tang
author_sort	Fei Gao
collection	DOAJ
description	Abstract In orthodontics and maxillofacial surgery, accurate cephalometric analysis and treatment outcome prediction are critical for clinical decision-making. Traditional approaches rely on manual landmark identification, which is time-consuming and subject to inter-observer variability, while existing automated methods typically utilize single imaging modalities with limited accuracy. This paper presents DeepFuse, a novel multi-modal deep learning framework that integrates information from lateral cephalograms, CBCT volumes, and digital dental models to simultaneously perform landmark detection and treatment outcome prediction. The framework employs modality-specific encoders, an attention-guided fusion mechanism, and dual-task decoders to leverage complementary information across imaging techniques. Extensive experiments on three clinical datasets demonstrate that DeepFuse achieves a mean radial error of 1.21 mm for landmark detection, representing a 13% improvement over state-of-the-art methods, with a clinical acceptability rate of 92.4% at the 2 mm threshold. For treatment outcome prediction, the framework attains an overall accuracy of 85.6%, significantly outperforming both conventional prediction models and experienced clinicians. The proposed approach enhances diagnostic precision and treatment planning while providing interpretable visualization of decision factors, demonstrating significant potential for clinical integration in orthodontic and maxillofacial practice.
format	Article
id	doaj-art-27f67df121b141f0b2aa3564c067cbe5
institution	Kabale University
issn	2045-2322
language	English
publishDate	2025-07-01
publisher	Nature Portfolio
record_format	Article
series	Scientific Reports
spelling	doaj-art-27f67df121b141f0b2aa3564c067cbe5; 2025-08-20T03:45:55Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-07-01; 151118; 10.1038/s41598-025-06229-w; Multimodal deep learning for cephalometric landmark detection and treatment prediction; Fei Gao [0]; Yulong Tang [1]; [0] Department of Stomatology, General Hospital of PLA Northern Theater Command; [1] Department of Stomatology, General Hospital of PLA Northern Theater Command; https://doi.org/10.1038/s41598-025-06229-w; Cephalometric analysis; Multi-modal deep learning; Landmark detection; Treatment outcome prediction; Attention mechanism; Orthodontics
title	Multimodal deep learning for cephalometric landmark detection and treatment prediction
topic	Cephalometric analysis; Multi-modal deep learning; Landmark detection; Treatment outcome prediction; Attention mechanism; Orthodontics
url	https://doi.org/10.1038/s41598-025-06229-w
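
The abstract in this record reports a mean radial error in millimetres and a clinical acceptability rate at the 2 mm threshold. The sketch below shows how such figures are commonly computed. It assumes the mean radial error is the average Euclidean distance between predicted and reference landmarks, and that the acceptability rate is the fraction of landmarks whose error is at or below the threshold. The record does not give the authors' exact definitions, so both are assumptions, and the function names and coordinates are hypothetical illustrations, not data from the article.

```python
import math


def radial_errors(predicted, reference):
    """Euclidean distance (in mm) between each predicted and reference landmark."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must contain the same number of landmarks")
    if not predicted:
        raise ValueError("at least one landmark is required")
    return [math.dist(p, r) for p, r in zip(predicted, reference)]


def mean_radial_error(predicted, reference):
    """Average radial error over all landmarks (assumed definition)."""
    errors = radial_errors(predicted, reference)
    return sum(errors) / len(errors)


def acceptability_rate(predicted, reference, threshold_mm=2.0):
    """Fraction of landmarks with radial error <= threshold_mm (assumed definition)."""
    errors = radial_errors(predicted, reference)
    return sum(1 for e in errors if e <= threshold_mm) / len(errors)


# Hypothetical landmark coordinates in millimetres, for illustration only.
reference_points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (5.0, 5.0)]
predicted_points = [(0.6, 0.8), (10.0, 0.0), (3.0, 14.0), (5.0, 6.5)]

mre = mean_radial_error(predicted_points, reference_points)
rate = acceptability_rate(predicted_points, reference_points, threshold_mm=2.0)
print(f"mean radial error: {mre:.3f} mm")
print(f"acceptability rate at 2 mm: {rate:.1%}")
```

The same functions accept 3D coordinates, since `math.dist` works for points of any matching dimension. That may matter for volumetric data such as CBCT, although the record does not say how the article measures error in that setting.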


Similar Items