CB-MTE: Social Bot Detection via Multi-Source Heterogeneous Feature Fusion

Bibliographic Details
Main Authors: Meng Cheng, Yuzhi Xiao, Tao Huang, Chao Lei, Chuang Zhang
Format: Article
Language: English
Published: MDPI AG 2025-06-01
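Series: Sensors
Subjects: social bot detection; heterogeneous feature fusion; graph embedding; DistilBERT; manifold learning; CB-MTE
Online Access: https://www.mdpi.com/1424-8220/25/11/3549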
author Meng Cheng
Yuzhi Xiao
Tao Huang
Chao Lei
Chuang Zhang
collection DOAJ
description Social bots increasingly mimic real users and collaborate in large-scale influence campaigns, distorting public perception and making their detection both critical and challenging. Traditional bot detection methods, constrained by single-source features, often fail to capture the complete behavioral and contextual characteristics of social bots, especially their dynamic behavioral evolution and group coordination tactics, resulting in feature incompleteness and reduced detection performance. To address this challenge, we propose CB-MTE, a social bot detection framework based on multi-source heterogeneous feature fusion. CB-MTE adopts a hierarchical architecture: user metadata is used to construct behavioral portraits, deep semantic representations are extracted from textual content via DistilBERT, and community-aware graph embeddings are learned through a combination of random walk and Skip-gram modeling. To mitigate feature redundancy and preserve structural consistency, manifold learning is applied for nonlinear dimensionality reduction, ensuring both local and global topology are maintained. Finally, a CatBoost-based collaborative reasoning mechanism enhances model robustness through ordered target encoding and symmetric tree structures. Experiments on the TwiBot-22 benchmark dataset demonstrate that CB-MTE significantly outperforms mainstream detection models in recognizing dynamic behavioral traits and detecting collaborative bot activities. These results confirm the framework’s capability to capture the complete behavioral and contextual characteristics of social bots through multi-source feature integration.
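The description names random walks combined with Skip-gram modeling as the graph-embedding step. The block below is a minimal sketch of that general idea, assuming uniform DeepWalk-style walks over an unweighted graph; this record does not state the walk strategy the authors use or how community awareness is introduced. The function names `random_walks` and `skipgram_pairs`, the toy graph, and all parameter values are hypothetical and chosen only for illustration.

```python
import random
from typing import Dict, List, Tuple

# Adjacency-list graph: node -> list of neighbours (hypothetical toy format).
Graph = Dict[str, List[str]]


def random_walks(graph: Graph, walk_length: int, walks_per_node: int,
                 seed: int = 0) -> List[List[str]]:
    """Generate uniform random walks starting from every node."""
    rng = random.Random(seed)  # local RNG so runs are reproducible
    walks: List[List[str]] = []
    for _ in range(walks_per_node):
        nodes = list(graph)
        rng.shuffle(nodes)  # visit start nodes in random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = graph[walk[-1]]
                if not neighbours:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


def skipgram_pairs(walks: List[List[str]], window: int) -> List[Tuple[str, str]]:
    """Turn walks into (center, context) training pairs for Skip-gram."""
    pairs: List[Tuple[str, str]] = []
    for walk in walks:
        for i, center in enumerate(walk):
            lo = max(0, i - window)
            hi = min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((center, walk[j]))
    return pairs


# Toy graph with two disconnected groups of accounts (made-up node names).
toy_graph: Graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b"],
    "x": ["y"],
    "y": ["x"],
}

walks = random_walks(toy_graph, walk_length=5, walks_per_node=2, seed=42)
pairs = skipgram_pairs(walks, window=2)
```

In this sketch the walks play the role of sentences and the nodes the role of words, so the resulting pairs could be passed to any Skip-gram trainer to obtain node vectors. Because walks never leave a connected group, nodes in the same group co-occur often, which is the intuition behind embeddings that reflect community structure.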
format Article
id doaj-art-c99f32d3b618424ebe8a593717dac6cf
institution OA Journals
issn 1424-8220
language English
publishDate 2025-06-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj-art-c99f32d3b618424ebe8a593717dac6cf
Record date: 2025-08-20T02:33:00Z
Language: eng
Publisher: MDPI AG
Journal: Sensors (ISSN 1424-8220)
Published: 2025-06-01, vol. 25, no. 11, article 3549
DOI: 10.3390/s25113549
Title: CB-MTE: Social Bot Detection via Multi-Source Heterogeneous Feature Fusion
Authors: Meng Cheng, Yuzhi Xiao, Tao Huang, Chao Lei, Chuang Zhang
Affiliation (all authors): School of Computer Science, Qinghai Normal University, Xining 810008, China
URL: https://www.mdpi.com/1424-8220/25/11/3549
Keywords: social bot detection; heterogeneous feature fusion; graph embedding; DistilBERT; manifold learning; CB-MTE
title CB-MTE: Social Bot Detection via Multi-Source Heterogeneous Feature Fusion
topic social bot detection
heterogeneous feature fusion
graph embedding
DistilBERT
manifold learning
CB-MTE
url https://www.mdpi.com/1424-8220/25/11/3549