AppAuth: Authorship Attribution for Android App Clones

Android app clone detection has been extensively studied in our community, and a number of effective approaches and frameworks were proposed and released. However, there still remains one open challenge that has not been well addressed in previous work, i.e., ...

Bibliographic Details
Main Authors: Guoai Xu, Chengpeng Zhang, Bowen Sun, Xinyu Yang, Yanhui Guo, Chengze Li, Haoyu Wang
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects: Android; authorship attribution; app repackage; app clone
Online Access: https://ieeexplore.ieee.org/document/8853275/
collection DOAJ
description Android app clone detection has been extensively studied in our community, and a number of effective approaches and frameworks have been proposed and released. However, one open challenge has not been well addressed in previous work, i.e., the authorship attribution for the detected app clones. Although state-of-the-art approaches can accurately identify repackaged apps in one way or another, no convincing method has been proposed to identify the original app and the authentic author from the repackaged app pairs, which greatly limits the usage scenarios of app clone detection techniques. For example, app market maintainers have to manually confirm the identified repackaged app pairs, and in most cases it is challenging for them to make an accurate decision. In this paper, we propose AppAuth, a novel learning-based approach to predict the authorship of app clones. Specifically, for a given Android app clone pair (or a group of identified repackaged apps), AppAuth can accurately infer the original author of the plagiarized apps. Our approach is motivated by traditional authorship attribution studies on binary files. AppAuth first extracts a number of coding-style-related features from the executable .apk files, and then relies on machine learning techniques to train a classification model. We have conducted extensive experiments to evaluate the effectiveness of AppAuth. The experimental results suggest that we are able to infer the authorship of Android app clones with high precision. Our work is the first to tackle the problem systematically, and we believe our efforts can contribute positively to the research community and boost research on app repackaging detection and authorship attribution.
format Article
id doaj-art-59f7f62fb2be42c9ae082080f68e70df
institution Kabale University
issn 2169-3536
spelling doaj-art-59f7f62fb2be42c9ae082080f68e70df
2025-01-15T00:01:08Z
eng
IEEE
IEEE Access
2169-3536
2019-01-01
vol. 7, pp. 141850-141867
10.1109/ACCESS.2019.2944684
8853275
AppAuth: Authorship Attribution for Android App Clones
Guoai Xu (0): School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing, China
Chengpeng Zhang (1, https://orcid.org/0000-0001-9068-0828): School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing, China
Bowen Sun (2): School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing, China
Xinyu Yang (3): School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing, China
Yanhui Guo (4): School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing, China
Chengze Li (5): National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing, China
Haoyu Wang (6): School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, China
https://ieeexplore.ieee.org/document/8853275/
Android
authorship attribution
app repackage
app clone
topic Android
authorship attribution
app repackage
app clone
url https://ieeexplore.ieee.org/document/8853275/