Feature Extraction Model of SE-CMT Semantic Information Supplement
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | zho |
| Published: | Harbin University of Science and Technology Publications, 2024-12-01 |
| Series: | Journal of Harbin University of Science and Technology |
| Subjects: | |
| Online Access: | https://hlgxb.hrbust.edu.cn/#/digest?ArticleID=2384 |
| Summary: | In image classification, supplementing features with useful semantic information helps a model capture key regions and improves classification performance. To obtain such semantic information, an SE-CMT (SE-Networks CNN Meet Transformer) model is proposed. The model builds on basic CNN feature extraction: the SE-CMT Stem rescales the input image to the size of the previously extracted features, and the deep convolutional layer in the SE-CMT Block then enhances those features. SE-CNN (Squeeze-and-Excitation Networks-CNN) extracts low-level features and strengthens localization, while the Transformer establishes long-range dependencies; fusing the two structures improves feature extraction performance. Experiments on the ImageNet and CIFAR-10 datasets show that SE-CMT reaches top-1 accuracies of 85.47% and 87.16%, respectively, outperforming the baseline models CMT and Vision Transformer. The proposed SE-CMT model is therefore an effective method for image feature extraction. |
|---|---|
| ISSN: | 1007-2683 |
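
The summary names Squeeze-and-Excitation (SE) as the component that recalibrates CNN features before the Transformer stage. The following is a minimal NumPy sketch of a generic SE block only, not the authors' SE-CMT implementation: the function name `se_block`, the tensor sizes, the reduction ratio `r`, and the random weights are all assumptions chosen for illustration.

```python
import numpy as np


def se_block(x, w1, b1, w2, b2):
    """Generic Squeeze-and-Excitation recalibration of a (C, H, W) feature map."""
    # Squeeze: global average pooling yields one descriptor per channel.
    z = x.mean(axis=(1, 2))                    # shape (C,)
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate per channel.
    h = np.maximum(w1 @ z + b1, 0.0)           # shape (C // r,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))   # shape (C,), values in (0, 1)
    # Scale: reweight every channel of the input by its gate.
    return x * s[:, None, None], s


# Hypothetical sizes and reduction ratio, for illustration only.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 4
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
b1 = np.zeros(C // r)
w2 = rng.standard_normal((C, C // r)) * 0.1
b2 = np.zeros(C)

y, gates = se_block(x, w1, b1, w2, b2)
print(y.shape, gates.shape)
```

The output keeps the input shape, so a block like this can sit between convolutional layers without changing the surrounding architecture; how SE-CMT actually places it inside the Stem and Block is described in the article itself.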