Machines that halt resolve the undecidability of artificial intelligence alignment

Abstract: The inner alignment problem, which asserts whether an arbitrary artificial intelligence (AI) model satisfices a non-trivial alignment function of its outputs given its inputs, is undecidable. This is rigorously proved by Rice’s theorem, which is also equivalent to a reduction to Turing’s Ha...

Bibliographic Details
Main Authors: Gabriel A. Melo, Marcos R. O. A. Máximo, Nei Y. Soma, Paulo A. L. Castro
Format: Article
Language: English
Published: Nature Portfolio 2025-05-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-99060-2