Robust Malware identification via deep temporal convolutional network with symmetric cross entropy learning
Abstract Recent developments in the field of Internet of things (IoT) have aroused growing attention to the security of smart devices. Specifically, there is an increasing number of malicious software (Malware) on IoT systems. Nowadays, researchers have made many efforts concerning supervised machin...
Main Authors: | Jiankun Sun, Xiong Luo, Weiping Wang, Yang Gao, Wenbing Zhao |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2023-08-01 |
Series: | IET Software |
Subjects: | learning (artificial intelligence); security of data; systems analysis |
Online Access: | https://doi.org/10.1049/sfw2.12137 |
_version_ | 1832547377003626496 |
---|---|
author | Jiankun Sun; Xiong Luo; Weiping Wang; Yang Gao; Wenbing Zhao |
collection | DOAJ |
description | Abstract Recent developments in the field of Internet of things (IoT) have aroused growing attention to the security of smart devices. Specifically, there is an increasing number of malicious software (Malware) on IoT systems. Nowadays, researchers have made many efforts concerning supervised machine learning methods to identify malicious attacks. High‐quality labels are of great importance for supervised machine learning, but label noise is widespread due to the non‐deterministic production environment. Therefore, learning from noisy labels is significant for machine learning‐enabled Malware identification. In this study, motivated by the satisfactory noise robustness of symmetric cross entropy, the authors propose a robust Malware identification method using a temporal convolutional network (TCN). Moreover, word embedding techniques are generally utilised to understand the contextual relationship between the input operation code (opcode) and application programming interface function names. Here, considering the numerous unlabelled samples in real‐world intelligent environments, the authors pre‐train the TCN model on an unlabelled set using a word embedding method, namely Word2Vec. In the experiments, the proposed method is compared with several traditional statistical methods and more recent neural networks on a synthetic Malware dataset and a real‐world dataset. The comparisons demonstrate the better performance and noise robustness of the proposed method; in particular, it yields the best identification accuracy of 98.75% in real‐world scenarios. |
format | Article |
id | doaj-art-9ac01bd31dd54f1a9379eb52a530acd8 |
institution | Kabale University |
issn | 1751-8806; 1751-8814 |
language | English |
publishDate | 2023-08-01 |
publisher | Wiley |
record_format | Article |
series | IET Software |
spelling | doaj-art-9ac01bd31dd54f1a9379eb52a530acd8; 2025-02-03T06:45:11Z; eng; Wiley; IET Software; 1751-8806; 1751-8814; 2023-08-01; 17; 4; 392; 404; 10.1049/sfw2.12137; Robust Malware identification via deep temporal convolutional network with symmetric cross entropy learning; Jiankun Sun [0]; Xiong Luo [1]; Weiping Wang [2]; Yang Gao [3]; Wenbing Zhao [4]; [0] School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China; [1] School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China; [2] School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China; [3] China Information Technology Security Evaluation Center, Beijing, China; [4] Department of Electrical Engineering and Computer Science, Cleveland State University, Cleveland, Ohio, USA; https://doi.org/10.1049/sfw2.12137; learning (artificial intelligence); security of data; systems analysis |
title | Robust Malware identification via deep temporal convolutional network with symmetric cross entropy learning |
topic | learning (artificial intelligence); security of data; systems analysis |
url | https://doi.org/10.1049/sfw2.12137 |
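
The description in this record names symmetric cross entropy as the source of the method's noise robustness but does not give the loss itself. As a rough illustration only, the sketch below follows the commonly used formulation, which adds a reverse cross entropy term to the standard cross entropy and replaces log(0) for the one-hot label with a finite constant. The function name, the weights alpha and beta, and the constant log_zero = -4.0 are assumptions made for this example; the record does not state the settings used in the article.

```python
import math


def symmetric_cross_entropy(probs, label, alpha=1.0, beta=1.0,
                            log_zero=-4.0, eps=1e-7):
    """Symmetric cross entropy for one sample (illustrative sketch).

    probs    -- predicted class probabilities, e.g. a softmax output
    label    -- index of the (possibly noisy) class label
    alpha    -- weight of the standard cross entropy term (assumed value)
    beta     -- weight of the reverse cross entropy term (assumed value)
    log_zero -- finite stand-in for log(0) on the one-hot label (assumed value)
    """
    # Clip predictions so that log() stays finite.
    p = [min(max(v, eps), 1.0) for v in probs]

    # Standard cross entropy: -sum_k q(k) log p(k), with q one-hot at `label`.
    ce = -math.log(p[label])

    # Reverse cross entropy: -sum_k p(k) log q(k).
    # log q(k) is 0 at the labelled class and log_zero everywhere else.
    rce = -sum(pk * log_zero for k, pk in enumerate(p) if k != label)

    return alpha * ce + beta * rce


if __name__ == "__main__":
    # A prediction that agrees with the label versus one that contradicts it.
    print(symmetric_cross_entropy([0.9, 0.1], 0))
    print(symmetric_cross_entropy([0.1, 0.9], 0))
```

Because the reverse term is bounded by -log_zero, a confidently predicted sample whose label disagrees with the prediction contributes a limited penalty from that term, which is the usual intuition for why this loss tolerates mislabelled samples better than cross entropy alone.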