Clustering of unknown protocol messages based on format comparison

Bibliographic Details
Main Authors: ZHANG Mingyuan, LIU Xiaolei, WU Xiaohu, ZHANG Xiaojian
Format: Article
Language: Chinese (zho)
Published: Editorial Office of Command Control and Simulation 2025-06-01
Series:Zhihui kongzhi yu fangzhen
Online Access:https://www.zhkzyfz.cn/fileup/1673-3819/PDF/1748399454763-1178042456.pdf
Description
Summary: Protocol reverse engineering is an approach to detecting and analyzing unknown or proprietary protocols, and clustering packets by protocol format is the basic way to identify unknown protocol packets. This paper proposes an Unknown Protocol Packet Clustering Method Based on Format Matching (CUPFC). The method introduces Augmented Backus-Naur Form (ABNF), defines Token Format Distance (TFD) and Message Format Distance (MFD) to represent the format similarity of tokens and of packets, respectively, and uses the Jaccard distance and an optimized sequence alignment algorithm to calculate them. The MFD is then used to build a distance matrix, which is fed into a DBSCAN model to cluster unknown protocol packets into classes of different formats. On two simulation datasets, the clustering V-measure (the harmonic mean of homogeneity and completeness) is above 0.91, and the Fowlkes-Mallows index (FMI) and coverage are both at least 0.97, which the authors report as a substantial advantage over previous work.
ISSN:1673-3819
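
The pipeline outlined in the summary (token distance, message distance, distance matrix, DBSCAN) can be illustrated with a minimal sketch. This is not the authors' implementation: `token_distance` and `message_distance` are hypothetical stand-ins for TFD and MFD, the assignment of Jaccard distance to tokens and sequence alignment to messages is an assumption, the alignment is a plain Needleman-Wunsch-style recurrence rather than the paper's optimized variant, and the toy messages, `eps`, and `min_samples` values are made up for illustration.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |A & B| / |A | B|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def token_distance(t1, t2):
    # Hypothetical stand-in for TFD: Jaccard distance over byte values.
    return jaccard_distance(set(t1), set(t2))


def message_distance(m1, m2, gap=1.0):
    # Hypothetical stand-in for MFD: global alignment cost of two token
    # sequences, normalized by the longer length (in [0, 1] when gap <= 1).
    n, m = len(m1), len(m2)
    if max(n, m) == 0:
        return 0.0
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            cur[j] = min(
                prev[j - 1] + token_distance(m1[i - 1], m2[j - 1]),  # align
                prev[j] + gap,      # gap in m2
                cur[j - 1] + gap,   # gap in m1
            )
        prev = cur
    return prev[m] / max(n, m)


def distance_matrix(messages):
    """Build the symmetric pairwise message-distance matrix."""
    n = len(messages)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = message_distance(messages[i], messages[j])
    return dist


def dbscan(dist, eps, min_samples):
    """Minimal DBSCAN on a precomputed distance matrix; -1 marks noise."""
    n = len(dist)
    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for p in range(n):
        if labels[p] is not None:
            continue
        neighbors = [q for q in range(n) if dist[p][q] <= eps]
        if len(neighbors) < min_samples:
            labels[p] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[p] = cluster
        queue = list(neighbors)
        while queue:
            q = queue.pop()
            if labels[q] == -1:
                labels[q] = cluster  # former noise becomes a border point
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_neighbors = [r for r in range(n) if dist[q][r] <= eps]
            if len(q_neighbors) >= min_samples:
                queue.extend(q_neighbors)  # q is a core point: expand
    return labels


# Toy messages, already split into tokens (made-up data).
messages = [
    [b"GET", b"/a", b"HTTP"],
    [b"GET", b"/b", b"HTTP"],
    [b"GET", b"/c", b"HTTP"],
    [b"\x01\x02", b"\x00\x10"],
    [b"\x01\x02", b"\x00\x20"],
    [b"\x01\x02", b"\x00\x30"],
    [b"\xff"],
]
labels = dbscan(distance_matrix(messages), eps=0.4, min_samples=2)
print(labels)
```

In practice the precomputed matrix could instead be passed to a library DBSCAN that accepts precomputed distances; the pure-Python version here only keeps the sketch dependency-free.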