Machines that halt resolve the undecidability of artificial intelligence alignment

Abstract: The inner alignment problem, which asserts whether an arbitrary artificial intelligence (AI) model satisfices a non-trivial alignment function of its outputs given its inputs, is undecidable. This is rigorously proved by Rice’s theorem, which is also equivalent to a reduction to Turing’s Ha...

Bibliographic Details
Main Authors:	Gabriel A. Melo, Marcos R. O. A. Máximo, Nei Y. Soma, Paulo A. L. Castro
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-05-01
Series:	Scientific Reports
Subjects:	Artificial intelligence; AI safety; Computability; Decidability; Halting problem
Online Access:	https://doi.org/10.1038/s41598-025-99060-2

Similar Items