Multi-scale feature pyramid network with bidirectional attention for efficient mural image classification.

Bibliographic Details
Main Authors: Shulan Wang, Siyu Liu, Mengting Jin, Pingmei Fan
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0328507
collection DOAJ
description Mural image recognition plays a critical role in the digital preservation of cultural heritage; however, it faces cross-cultural and multi-period style generalization challenges, compounded by limited sample sizes and intricate details, such as losses caused by natural weathering of mural surfaces and complex artistic patterns. This paper proposes a deep learning model based on DenseNet201-FPN, incorporating a Bidirectional Convolutional Block Attention Module (Bi-CBAM), dynamic focal distillation loss, and convex regularization. First, a lightweight Feature Pyramid Network (FPN) is embedded into DenseNet201 to fuse multi-scale texture features (28 × 28 × 256, 14 × 14 × 512, 7 × 7 × 1024). Second, a bidirectional LSTM-driven attention module iteratively optimizes channel and spatial weights, enhancing detail perception for low-frequency categories. Third, a dynamic temperature distillation strategy (T = 3 → 1) balances supervision from the teacher model (ResNeXt101) and the ground truth, improving the F1-score of rare classes by 6.1%. Experimental results on a self-constructed mural dataset (2,000 images, 26 subcategories) demonstrate 87.9% accuracy (+3.7% over DenseNet201) and real-time inference on edge devices (63 ms/frame at 8.1 W on Jetson TX2). This study provides a cost-effective solution for large-scale mural digitization in resource-constrained environments.
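The dynamic temperature distillation step in the description (T = 3 → 1) can be illustrated with a minimal sketch. The abstract gives only the start and end temperatures, so everything else here is an assumption for illustration: the linear annealing schedule, the fixed mixing weight `alpha`, the standard temperature-squared scaling of the soft term, and the function names `temperature_at` and `distillation_loss`. The focal reweighting and convex regularization mentioned in the abstract are not modeled.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits at a given temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def temperature_at(epoch, total_epochs, t_start=3.0, t_end=1.0):
    """Anneal the temperature from t_start to t_end.

    Assumption: a linear schedule; the abstract only states T = 3 -> 1.
    """
    if total_epochs <= 1:
        return t_end
    frac = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return t_start + (t_end - t_start) * frac


def distillation_loss(student_logits, teacher_logits, label, temperature, alpha=0.5):
    """Blend hard-label cross-entropy with a softened teacher/student KL term.

    Assumption: alpha is a fixed mixing weight, and the soft term is
    KL(teacher || student) scaled by temperature squared.
    """
    eps = 1e-12  # guards against log(0) if a probability underflows

    # Hard-label term: cross-entropy against the ground-truth class at T = 1.
    p_student = softmax(student_logits)
    hard = -math.log(max(p_student[label], eps))

    # Soft term: KL divergence between softened teacher and student outputs.
    p_teacher_soft = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    soft = sum(
        pt * math.log(pt / max(ps, eps))
        for pt, ps in zip(p_teacher_soft, p_student_soft)
        if pt > 0.0
    )

    return alpha * hard + (1.0 - alpha) * (temperature ** 2) * soft


if __name__ == "__main__":
    student = [1.2, 0.3, -0.5]
    teacher = [2.0, 0.1, -1.0]
    for epoch in range(5):
        t = temperature_at(epoch, total_epochs=5)
        loss = distillation_loss(student, teacher, label=0, temperature=t)
        print(f"epoch {epoch}: T = {t:.2f}, loss = {loss:.4f}")
```

Under this sketch, early epochs use a high temperature so the student sees the teacher's softened class similarities, and later epochs approach T = 1, where the soft term compares unsoftened distributions.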
format Article
id doaj-art-95d4e3ae654d45e7acfb800cf1772837
institution Kabale University
issn 1932-6203