Evaluating Lightweight Transformers With Local Explainability for Android Malware Detection
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/11028131/ |
| Summary: | Mobile phones have evolved into powerful handheld computers, fostering a vast application ecosystem but also increasing security and privacy risks. Traditional deep learning-based Android malware detection, reliant on Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), struggles to capture long-range dependencies, which are critical for identifying complex malware patterns. Transformers, with their self-attention mechanism, offer a promising alternative but are often too computationally intensive for mobile deployment. To address this gap, this study assesses ten models, five customized architectures and five fine-tuned lightweight transformers (DistilBERT, CodeBERT, TinyBERT, MobileBERT, ALBERT), on a real-world dataset of 100K Android applications from Koodous, with API calls and permissions as features. The fine-tuned DistilBERT achieves an accuracy of 91.6% and an AUC of 96.5%, outperforming the customized variants (up to 90.5% accuracy) and thereby highlighting the advantage of transfer learning. It also remains competitive with AutoGluon leaderboard models (90–92% accuracy). With an average inference time of 4.46 ± 0.43 ms and a 275 MB memory footprint, it balances efficiency and accuracy better than heavier transformers. Local Interpretable Model-Agnostic Explanations (LIME) are further integrated, and the resulting explanations align closely with VirusTotal’s malware descriptions. The findings demonstrate the viability of lightweight transformers for near-real-time Android malware detection that balances accuracy, efficiency, and interpretability. |
| ISSN: | 2169-3536 |