Examining embedded lies through computational text analysis

Abstract Verbal deception detection research relies on narratives and commonly assumes statements as truthful or deceptive. A more realistic perspective acknowledges that the veracity of statements exists on a continuum, with truthful and deceptive parts being embedded within the same statement. How...

Bibliographic Details
Main Authors: Riccardo Loconte, Bennett Kleinberg
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Subjects:
Online Access: https://doi.org/10.1038/s41598-025-11327-w
collection DOAJ
description Abstract Verbal deception detection research relies on narratives and commonly assumes statements as truthful or deceptive. A more realistic perspective acknowledges that the veracity of statements exists on a continuum, with truthful and deceptive parts being embedded within the same statement. However, research on embedded lies has been lagging behind. We collected a novel dataset of 2,088 truthful and deceptive statements with annotated embedded lies. Using a counterbalanced within-subjects design, participants provided two versions of an autobiographical event. One was described truthfully, and the other one deceptively by including embedded lies. Participants later highlighted those embedded lies and judged them on lie centrality, deceptiveness, and source. We show that a fine-tuned language model (Llama-3-8B) can classify truthful statements and those containing embedded lies significantly above the chance level (64% accuracy). Individual differences, linguistic properties, and explainability analysis suggest that the challenge of moving the dial towards embedded lies stems from their resemblance to truthful statements. Typical deceptive statements consisted of 2/3 truthful information and 1/3 embedded lies, largely derived from past personal experiences and with minimal linguistic differences from their truthful counterparts. We present this dataset as a novel resource to address this challenge and foster research on embedded lies in verbal deception detection.
id doaj-art-599e5a7d00eb45eb95bc83af01c430f4
institution Kabale University
issn 2045-2322
author_affiliation Riccardo Loconte: Molecular Mind Lab, IMT School of Advanced Studies Lucca
Bennett Kleinberg: Department of Methodology and Statistics, Tilburg University
volume 15
issue 1
pages 1-16
topic Deception
Embedded lies
Lying profile
Natural Language processing
Individual differences