Multimodal Deep Learning for Android Malware Classification

Bibliographic Details
Main Authors: James Arrowsmith, Teo Susnjak, Julian Jang-Jaccard
Format: Article
Language: English
Published: MDPI AG 2025-02-01
Series: Machine Learning and Knowledge Extraction
Online Access: https://www.mdpi.com/2504-4990/7/1/23
collection DOAJ
description This study investigates the integration of diverse data modalities within deep learning ensembles for Android malware classification. Android applications can be represented as binary images and function call graphs, each offering complementary perspectives on the executable. We synthesise these modalities by combining predictions from convolutional and graph neural networks with a multilayer perceptron. Empirical results demonstrate that multimodal models outperform their unimodal counterparts while remaining highly efficient. For instance, integrating a plain CNN with 83.1% accuracy and a GCN with 80.6% accuracy boosts overall accuracy to 88.3%. DenseNet-GIN achieves 90.6% accuracy, with no further improvement obtained by expanding this ensemble to four models. Based on our findings, we advocate for the flexible development of modalities to capture distinct aspects of applications and for the design of algorithms that effectively integrate this information.
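The late-fusion step named in the description (combining the predictions of convolutional and graph neural networks with a multilayer perceptron) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `LateFusionMLP`, the hidden size, the three-class setup, and the example probability vectors are all hypothetical, and the weights are random and untrained.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


class LateFusionMLP:
    """Hypothetical fusion head: concatenates the class-probability
    vectors of several unimodal models and maps them to fused class
    probabilities with a one-hidden-layer MLP (untrained here)."""

    def __init__(self, n_models, n_classes, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = n_models * n_classes
        self.W1 = rng.normal(0.0, 0.1, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (hidden, n_classes))
        self.b2 = np.zeros(n_classes)

    def forward(self, prob_list):
        # Late fusion: only model outputs are combined, not raw inputs.
        x = np.concatenate(prob_list, axis=1)
        h = np.maximum(0.0, x @ self.W1 + self.b1)  # ReLU hidden layer
        return softmax(h @ self.W2 + self.b2)


# Made-up predictions for two apps over three assumed classes.
cnn_probs = np.array([[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]])  # image model
gnn_probs = np.array([[0.5, 0.4, 0.1], [0.2, 0.2, 0.6]])  # call-graph model

fusion = LateFusionMLP(n_models=2, n_classes=3)
fused = fusion.forward([cnn_probs, gnn_probs])
print(fused.shape)
```

In practice the fusion weights would be fitted on held-out predictions of the unimodal models; adding a third or fourth model only changes `n_models` and the length of the list passed to `forward`.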
id doaj-art-7eec08de0ba3414f8cc18a72d6a3db11
institution Kabale University
issn 2504-4990
doi 10.3390/make7010023
volume 7
issue 1
article_number 23
affiliation James Arrowsmith: School of Mathematical and Computational Sciences, Massey University, Auckland 0632, New Zealand
Teo Susnjak: School of Mathematical and Computational Sciences, Massey University, Auckland 0632, New Zealand
Julian Jang-Jaccard: School of Mathematical and Computational Sciences, Massey University, Auckland 0632, New Zealand
topic multimodal deep learning for Android malware detection
enhanced malware analysis
graph neural networks
function call graphs (FCG)
efficient multimodal late fusion
CNN GNN Ensemble