GT-SRR: A Structured Method for Social Relation Recognition with GGNN-Based Transformer

Bibliographic Details
Main Authors: Dejiao Huang, Menglei Xia, Ruyi Chang, Xiaohan Kong, Shuai Guo
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series: Sensors
Online Access: https://www.mdpi.com/1424-8220/25/10/2992
Description
Summary: Social relationship recognition (SRR) holds significant value in fields such as behavior analysis and intelligent social systems. However, existing methods primarily focus on modeling individual visual traits, interaction patterns, and scene-level contextual cues, and often fail to capture the complex dependencies among these features and the hierarchical structure of social groups, both of which are crucial for effective reasoning. To overcome these limitations, this paper proposes an image-based SRR model that integrates a Gated Graph Neural Network (GGNN) with a Transformer. Specifically, a novel and robust hybrid feature extraction module captures individual characteristics, relative positional information, and group-level cues, which are used to construct relation nodes and group nodes. A modified GGNN then models the logical dependencies between features. GGNN alone, however, cannot dynamically adjust feature importance, which may result in ambiguous relationship representations. The Transformer’s multi-head self-attention (MSA) mechanism is therefore integrated to improve feature interaction modeling, allowing the model to capture global context and higher-order dependencies. The model fuses pairwise features, graph-structured features, and group-level information. Experimental results on public datasets such as PISC show that the proposed approach outperforms comparison models including Dual-Glance, GRM, GRRN, Graph-BERT, and SRT in accuracy and mean average precision (mAP), validating its effectiveness in multi-feature representation learning and global reasoning.
ISSN: 1424-8220