MPAR-RCNN: a multi-task network for multiple person detection with attribute recognition

Multi-label attribute recognition is a critical task in computer vision, with applications ranging across diverse fields. This problem often involves detecting objects with multiple attributes, necessitating sophisticated models capable of both high-level differentiation and fine-grained feature ext...

Bibliographic Details
Main Authors: S. Raghavendra, S. K. Abhilash, Venu Madhav Nookala, Jayashree Shetty, Praveen Gurunath Bharathi
Format: Article
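Language: English
Published: Frontiers Media S.A. 2025-02-01
Series: Frontiers in Artificial Intelligence
Subjects: attribute recognition; convolution neural network; human attribute recognition; multi-task learning; object detection
Online Access: https://www.frontiersin.org/articles/10.3389/frai.2025.1454488/full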
author S. Raghavendra
S. K. Abhilash
Venu Madhav Nookala
Jayashree Shetty
Praveen Gurunath Bharathi
author_sort S. Raghavendra
collection DOAJ
description Multi-label attribute recognition is a critical task in computer vision, with applications ranging across diverse fields. This problem often involves detecting objects with multiple attributes, necessitating sophisticated models capable of both high-level differentiation and fine-grained feature extraction. The integration of object detection and attribute recognition typically relies on approaches such as dual-stage networks, where accurate predictions depend on advanced feature extraction techniques, such as Region of Interest (RoI) pooling. To meet these demands, an efficient method that achieves both reliable detection and attribute classification in a unified framework is essential. This study introduces an innovative MTL framework designed to incorporate Multi-Person Attribute Recognition (MPAR) within a single-model architecture. Named MPAR-RCNN, this framework unifies object detection and attribute recognition tasks through a spatially aware, shared backbone, facilitating efficient and accurate multi-label prediction. Unlike the traditional Fast Region-based Convolutional Neural Network (R-CNN), which separately manages person detection and attribute classification with a dual-stage network, the MPAR-RCNN architecture optimizes both tasks within a single structure. Validated on the WIDER (Web Image Dataset for Event Recognition) dataset, the proposed model demonstrates an improvement over current state-of-the-art (SOTA) architectures, showcasing its potential in advancing multi-label attribute recognition.
format Article
id doaj-art-77d593a4310f427599239968b5052400
institution Kabale University
issn 2624-8212
language English
publishDate 2025-02-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Artificial Intelligence
spelling doaj-art-77d593a4310f427599239968b5052400
2025-02-07T06:49:49Z
eng
Frontiers Media S.A.
Frontiers in Artificial Intelligence
2624-8212
2025-02-01
8
10.3389/frai.2025.1454488
1454488
MPAR-RCNN: a multi-task network for multiple person detection with attribute recognition
S. Raghavendra (0)
S. K. Abhilash (1)
Venu Madhav Nookala (2)
Jayashree Shetty (3)
Praveen Gurunath Bharathi (4)
(0) Department of Information and Communication Technology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
(1) KPIT Technologies, Bengaluru, India
(2) KPIT Technologies, Bengaluru, India
(3) Department of Information and Communication Technology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
(4) Nuclear Medicine and Molecular Imaging, Department of Radiology, Stanford Medicine, Palo Alto, CA, United States
Multi-label attribute recognition is a critical task in computer vision, with applications ranging across diverse fields. This problem often involves detecting objects with multiple attributes, necessitating sophisticated models capable of both high-level differentiation and fine-grained feature extraction. The integration of object detection and attribute recognition typically relies on approaches such as dual-stage networks, where accurate predictions depend on advanced feature extraction techniques, such as Region of Interest (RoI) pooling. To meet these demands, an efficient method that achieves both reliable detection and attribute classification in a unified framework is essential. This study introduces an innovative MTL framework designed to incorporate Multi-Person Attribute Recognition (MPAR) within a single-model architecture. Named MPAR-RCNN, this framework unifies object detection and attribute recognition tasks through a spatially aware, shared backbone, facilitating efficient and accurate multi-label prediction. Unlike the traditional Fast Region-based Convolutional Neural Network (R-CNN), which separately manages person detection and attribute classification with a dual-stage network, the MPAR-RCNN architecture optimizes both tasks within a single structure. Validated on the WIDER (Web Image Dataset for Event Recognition) dataset, the proposed model demonstrates an improvement over current state-of-the-art (SOTA) architectures, showcasing its potential in advancing multi-label attribute recognition.
https://www.frontiersin.org/articles/10.3389/frai.2025.1454488/full
attribute recognition
convolution neural network
human attribute recognition
multi-task learning
object detection
title MPAR-RCNN: a multi-task network for multiple person detection with attribute recognition
topic attribute recognition
convolution neural network
human attribute recognition
multi-task learning
object detection
url https://www.frontiersin.org/articles/10.3389/frai.2025.1454488/full