Semantically-Based Animal Pose Estimation in the Wild

Bibliographic Details
Main Authors: M. N. Favorskaya, D. N. Natalenko
Format: Article
Language: English
Published: Copernicus Publications 2024-12-01
Series: The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Online Access: https://isprs-archives.copernicus.org/articles/XLVIII-2-W5-2024/33/2024/isprs-archives-XLVIII-2-W5-2024-33-2024.pdf
collection DOAJ
description Accurate animal pose estimation in the wild is potentially useful for many downstream applications such as wildlife conservation. Currently, the main approach to assessing animal poses is based on identifying keypoints of the body and constructing the skeleton. However, directly applying human pose estimation frameworks to animals is not successful because of differences in the skeletal structure of humans and other mammals. In this study, we propose a two-stage method: coarse-tuning with animal detection using a bounding box, as is done in most similar methods, and fine-tuning with semantic segmentation of the animal. The YOLOv8 Pose Estimation and Pose Keypoint Classification model was chosen as the base model for keypoint extraction. Extensive training experiments were conducted using the AwA2 dataset (with a small number of samples from our own dataset), the AP-10K dataset, and the Tiger-Pose dataset. The trained model was tested on our own dataset collected from camera traps in the Ergaki National Park, Russia. Experimental results show that the proposed algorithm with additional semantic segmentation increases the accuracy of animal pose estimation by 3.6–4.8% on samples of the Ergaki dataset.
id doaj-art-caca2fb2a185457997f5b2148ea6dca8
institution DOAJ
issn 1682-1750
2194-9034
doi 10.5194/isprs-archives-XLVIII-2-W5-2024-33-2024
volume XLVIII-2-W5-2024
pages 33–40
affiliation Reshetnev Siberian State University of Science and Technology, Institute of Informatics and Telecommunications, 31, Krasnoyarsky Rabochy ave., Krasnoyarsk, 660037, Russian Federation (both authors)
title Semantically-Based Animal Pose Estimation in the Wild