Weaponizing cognitive bias in autonomous systems: a framework for black-box inference attacks


Bibliographic Details
Main Authors: Shiyong Chu, Yuwei Chen
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-08-01
Series: Frontiers in Artificial Intelligence
Online Access: https://www.frontiersin.org/articles/10.3389/frai.2025.1623573/full
Description
Summary: Autonomous systems operating in high-dimensional environments increasingly rely on prioritization heuristics to allocate attention and assess risk, yet these mechanisms can introduce cognitive biases such as salience, spatial framing, and temporal familiarity that influence decision-making without altering the input or accessing internal states. This study presents Priority Inversion via Operational Reasoning (PRIOR), a black-box, non-perturbative diagnostic framework that employs structurally biased but semantically neutral scenario cues to probe inference-level vulnerabilities without modifying pixel-level, statistical, or surface semantic properties. Given the limited accessibility of embodied vision-based systems, we evaluate PRIOR using large language models (LLMs) as abstract reasoning proxies to simulate cognitive prioritization in constrained textual surveillance scenarios inspired by Unmanned Aerial Vehicle (UAV) operations. Controlled experiments demonstrate that minimal structural cues can consistently induce priority inversions across multiple models, and joint analysis of model justifications and confidence estimates reveals systematic distortions in inferred threat relevance even when inputs are symmetrical. These findings expose the fragility of inference-level reasoning in black-box systems and motivate the development of evaluation strategies that extend beyond output correctness to interrogate internal prioritization logic, with implications for dynamic, embodied, and visually grounded agents in real-world deployments.
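The probing strategy the abstract describes can be illustrated with a minimal sketch. The helper names below (`make_scenario`, `priority_inverted`) are hypothetical and stand in for the paper's actual prompt templates and model queries; the idea is simply that two scenario variants differ only in a structural cue (here, mention order) while remaining content-symmetric, so any flip in the returned ranking signals a priority inversion.

```python
def make_scenario(cue_on_a: bool) -> str:
    """Build a textual surveillance scenario with two semantically
    symmetric targets; only a structural framing cue (mention order)
    differs between the two variants."""
    a = "Target A: stationary vehicle near checkpoint"
    b = "Target B: stationary vehicle near checkpoint"
    first, second = (a, b) if cue_on_a else (b, a)
    return f"UAV feed report.\n1. {first}\n2. {second}\nRank threat priority."


def priority_inverted(rank_base: tuple, rank_cued: tuple) -> bool:
    """Flag an inversion: the cued variant flips the ranking even
    though both scenarios are content-symmetric, so any difference
    is attributable to the structural cue alone."""
    return rank_base != rank_cued


# Stub rankings standing in for black-box model outputs:
base = ("A", "B")  # ranking elicited from the cue-on-A variant
cued = ("B", "A")  # ranking elicited from the cue-on-B variant
print(priority_inverted(base, cued))  # True: the cue changed priority
```

In an actual evaluation the two scenario strings would be sent to the model under test and the rankings parsed from its responses, with justifications and confidence estimates logged alongside, as the study's joint analysis requires.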
ISSN: 2624-8212