Using Multimodal Foundation Models for Detecting Fake Images on the Internet with Explanations
Generative AI and multimodal foundation models have fueled a proliferation of fake content on the Internet. This paper investigates whether foundation models can help detect, and thereby contain the spread of, fake images. The task of detecting fake images is a formidable challenge owing to its visual nature a...
| Main Authors: | Vishnu S. Pendyala, Ashwin Chintalapati |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-11-01 |
| Series: | Future Internet |
| Subjects: | misinformation containment; large multimodal models; explainable AI; image processing |
| Online Access: | https://www.mdpi.com/1999-5903/16/12/432 |
| _version_ | 1850050252073074688 |
|---|---|
| author | Vishnu S. Pendyala; Ashwin Chintalapati |
| author_facet | Vishnu S. Pendyala; Ashwin Chintalapati |
| author_sort | Vishnu S. Pendyala |
| collection | DOAJ |
| description | Generative AI and multimodal foundation models have fueled a proliferation of fake content on the Internet. This paper investigates whether foundation models can help detect, and thereby contain the spread of, fake images. Detecting fake images is a formidable challenge because the task is visual in nature and requires intricate analysis. This paper details experiments using four multimodal foundation models, Llava, CLIP, Moondream2, and Gemini 1.5 Flash, to detect fake images. Explainable AI techniques such as Local Interpretable Model-Agnostic Explanations (LIME) and removal-based explanations are used to gain insights into the detection process. The dataset comprises real images and fake images generated by MidJourney, a generative artificial intelligence tool. Results show that the models can achieve up to 69% accuracy in detecting fake images in an intuitively explainable way, as confirmed by multiple techniques and metrics. |
| format | Article |
| id | doaj-art-6f067ae72f65480eb292ccae52e55cc3 |
| institution | DOAJ |
| issn | 1999-5903 |
| language | English |
| publishDate | 2024-11-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Future Internet |
| spelling | Record doaj-art-6f067ae72f65480eb292ccae52e55cc3 (2025-08-20T02:53:30Z); language: eng; publisher: MDPI AG; journal: Future Internet; ISSN: 1999-5903; date: 2024-11-01; volume 16, issue 12, article 432; DOI: 10.3390/fi16120432; title: Using Multimodal Foundation Models for Detecting Fake Images on the Internet with Explanations; authors: Vishnu S. Pendyala (Department of Applied Data Science, San Jose State University, San Jose, CA 95192, USA), Ashwin Chintalapati (Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA); URL: https://www.mdpi.com/1999-5903/16/12/432; keywords: misinformation containment, large multimodal models, explainable AI, image processing |
| title | Using Multimodal Foundation Models for Detecting Fake Images on the Internet with Explanations |
| topic | misinformation containment; large multimodal models; explainable AI; image processing |
| url | https://www.mdpi.com/1999-5903/16/12/432 |
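
Note on the description: the removal-based explanations it mentions can be illustrated with a minimal sketch. This is not the authors' code. It assumes a grayscale image stored as nested lists, and `toy_fake_score` is a hypothetical stand-in for a real detector's "fake" score (the record does not say how the four models were queried). The idea shown is only the general one: mask one region at a time and record how much the score drops.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def occlusion_map(image: Image, score_fn: Callable[[Image], float],
                  patch: int = 2, fill: float = 0.0) -> List[List[float]]:
    """Return a heat map of score drops caused by masking each patch."""
    base = score_fn(image)
    height, width = len(image), len(image[0])
    heat = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            # Copy the image and "remove" one patch by overwriting it.
            masked = [row[:] for row in image]
            for r in rows:
                for c in cols:
                    masked[r][c] = fill
            # A large drop means the patch mattered to the score.
            drop = base - score_fn(masked)
            for r in rows:
                for c in cols:
                    heat[r][c] = drop
    return heat


def toy_fake_score(image: Image) -> float:
    """Hypothetical detector: mean intensity of the top-left 2x2 block."""
    return sum(image[r][c] for r in range(2) for c in range(2)) / 4.0


if __name__ == "__main__":
    img = [[1.0] * 4 for _ in range(4)]
    for row in occlusion_map(img, toy_fake_score):
        print(row)
```

With a real model, `score_fn` would wrap the model call, and patches with the largest drops would be the regions the model relied on most.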