Multimodal deep learning for cephalometric landmark detection and treatment prediction

Abstract In orthodontics and maxillofacial surgery, accurate cephalometric analysis and treatment outcome prediction are critical for clinical decision-making. Traditional approaches rely on manual landmark identification, which is time-consuming and subject to inter-observer variability, while exis...

Bibliographic Details
Main Authors: Fei Gao, Yulong Tang
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Subjects:
Online Access: https://doi.org/10.1038/s41598-025-06229-w
collection DOAJ
description Abstract In orthodontics and maxillofacial surgery, accurate cephalometric analysis and treatment outcome prediction are critical for clinical decision-making. Traditional approaches rely on manual landmark identification, which is time-consuming and subject to inter-observer variability, while existing automated methods typically utilize single imaging modalities with limited accuracy. This paper presents DeepFuse, a novel multi-modal deep learning framework that integrates information from lateral cephalograms, CBCT volumes, and digital dental models to simultaneously perform landmark detection and treatment outcome prediction. The framework employs modality-specific encoders, an attention-guided fusion mechanism, and dual-task decoders to leverage complementary information across imaging techniques. Extensive experiments on three clinical datasets demonstrate that DeepFuse achieves a mean radial error of 1.21 mm for landmark detection, representing a 13% improvement over state-of-the-art methods, with a clinical acceptability rate of 92.4% at the 2 mm threshold. For treatment outcome prediction, the framework attains an overall accuracy of 85.6%, significantly outperforming both conventional prediction models and experienced clinicians. The proposed approach enhances diagnostic precision and treatment planning while providing interpretable visualization of decision factors, demonstrating significant potential for clinical integration in orthodontic and maxillofacial practice.
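The two landmark-detection figures quoted in the abstract, mean radial error and the acceptability rate at the 2 mm threshold, can be illustrated with a minimal sketch. It assumes the usual definitions: radial error is the Euclidean distance between a predicted landmark and its reference position, and a landmark counts as acceptable when that distance does not exceed the threshold (whether the paper uses "less than" or "at most" is not stated in this record). The coordinates and function names below are hypothetical and are not taken from the paper or its datasets.

```python
import math

def radial_errors(pred, truth):
    """Euclidean distance (mm) between each predicted and reference landmark."""
    return [math.dist(p, t) for p, t in zip(pred, truth)]

def mean_radial_error(errors):
    """Average radial error over all landmarks, in mm."""
    return sum(errors) / len(errors)

def acceptability_rate(errors, threshold_mm=2.0):
    """Fraction of landmarks whose error is at most threshold_mm (assumed convention)."""
    return sum(e <= threshold_mm for e in errors) / len(errors)

# Hypothetical landmark coordinates in millimetres, for illustration only.
truth = [(10.0, 20.0), (35.0, 42.0), (50.0, 15.0), (61.0, 70.0)]
pred = [(10.0, 21.0), (38.0, 46.0), (50.0, 15.0), (61.0, 71.5)]

errors = radial_errors(pred, truth)   # [1.0, 5.0, 0.0, 1.5]
mre = mean_radial_error(errors)       # 1.875
rate = acceptability_rate(errors)     # 0.75

print(f"MRE = {mre:.3f} mm")                  # MRE = 1.875 mm
print(f"Acceptable at 2 mm = {rate:.1%}")     # Acceptable at 2 mm = 75.0%
```

In this toy case one landmark out of four misses by 5 mm, which raises the mean to 1.875 mm while the 2 mm rate stays at 75%; the two metrics are reported together because the mean alone is sensitive to a few large misses.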
id doaj-art-27f67df121b141f0b2aa3564c067cbe5
institution Kabale University
issn 2045-2322
affiliation Fei Gao: Department of Stomatology, General Hospital of PLA Northern Theater Command
Yulong Tang: Department of Stomatology, General Hospital of PLA Northern Theater Command
topic Cephalometric analysis
Multi-modal deep learning
Landmark detection
Treatment outcome prediction
Attention mechanism
Orthodontics