Multi-scale feature pyramid network with bidirectional attention for efficient mural image classification.
Mural image recognition plays a critical role in the digital preservation of cultural heritage; however, it faces cross-cultural and multi-period style generalization challenges, compounded by limited sample sizes and intricate details, such as losses caused by natural weathering of mural surfaces a...
| Main Authors: | Shulan Wang, Siyu Liu, Mengting Jin, Pingmei Fan |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Public Library of Science (PLoS), 2025-01-01 |
| Series: | PLoS ONE |
| Online Access: | https://doi.org/10.1371/journal.pone.0328507 |
| _version_ | 1849390956384616448 |
|---|---|
| author | Shulan Wang, Siyu Liu, Mengting Jin, Pingmei Fan |
| collection | DOAJ |
| description | Mural image recognition plays a critical role in the digital preservation of cultural heritage; however, it faces cross-cultural and multi-period style generalization challenges, compounded by limited sample sizes and intricate details, such as losses caused by natural weathering of mural surfaces and complex artistic patterns. This paper proposes a deep learning model based on DenseNet201-FPN, incorporating a Bidirectional Convolutional Block Attention Module (Bi-CBAM), dynamic focal distillation loss, and convex regularization. First, a lightweight Feature Pyramid Network (FPN) is embedded into DenseNet201 to fuse multi-scale texture features (28 × 28 × 256, 14 × 14 × 512, 7 × 7 × 1024). Second, a bidirectional LSTM-driven attention module iteratively optimizes channel and spatial weights, enhancing detail perception for low-frequency categories. Third, a dynamic temperature distillation strategy (T = 3 → 1) balances supervision from the teacher model (ResNeXt101) and the ground truth, improving the F1-score of rare classes by 6.1%. Experimental results on a self-constructed mural dataset (2,000 images, 26 subcategories) demonstrate 87.9% accuracy (+3.7% over DenseNet201) and real-time inference on edge devices (63 ms/frame at 8.1 W on Jetson TX2). This study provides a cost-effective solution for large-scale mural digitization in resource-constrained environments. |
| format | Article |
| id | doaj-art-95d4e3ae654d45e7acfb800cf1772837 |
| institution | Kabale University |
| issn | 1932-6203 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | Public Library of Science (PLoS) |
| record_format | Article |
| series | PLoS ONE |
| spelling | doaj-art-95d4e3ae654d45e7acfb800cf1772837; 2025-08-20T03:41:14Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2025-01-01; vol. 20, no. 8, e0328507; 10.1371/journal.pone.0328507; Multi-scale feature pyramid network with bidirectional attention for efficient mural image classification; Shulan Wang, Siyu Liu, Mengting Jin, Pingmei Fan; https://doi.org/10.1371/journal.pone.0328507 |
| title | Multi-scale feature pyramid network with bidirectional attention for efficient mural image classification. |
| url | https://doi.org/10.1371/journal.pone.0328507 |
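
The description above mentions a dynamic temperature distillation strategy (T = 3 → 1) that balances teacher supervision against the ground truth. The following is a minimal sketch of that general idea in plain Python, not the authors' implementation. The linear schedule, the mixing weight `alpha`, and all function names are assumptions made for illustration, and the focal term of the paper's "dynamic focal distillation loss" is not modeled here.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def temperature_at(epoch, num_epochs, t_start=3.0, t_end=1.0):
    """Anneal T from t_start to t_end; the linear shape is an assumption."""
    if num_epochs <= 1:
        return t_end
    frac = epoch / (num_epochs - 1)
    return t_start + (t_end - t_start) * frac


def distillation_loss(student_logits, teacher_logits, label, temperature, alpha=0.5):
    """alpha * T^2 * KL(teacher || student) + (1 - alpha) * CE(student, label).

    alpha is a hypothetical mixing weight; the source does not state one.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    cross_entropy = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * cross_entropy


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]   # made-up logits for three classes
    student = [1.0, 1.0, 0.0]
    for epoch in range(5):
        t = temperature_at(epoch, 5)
        loss = distillation_loss(student, teacher, label=0, temperature=t)
        print(f"epoch={epoch} T={t:.2f} loss={loss:.4f}")
```

A high temperature early in training softens the teacher distribution so the student sees more of the relative similarity between classes; as T approaches 1 the soft targets sharpen toward the teacher's ordinary predictions.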