Reversing the logic of generative AI alignment: a pragmatic approach for public interest

Bibliographic Details
Main Author: Gleb Papyshev
Format: Article
Language: English
Published: Cambridge University Press, 2025-01-01
Series: Data & Policy
Online Access: https://www.cambridge.org/core/product/identifier/S2632324925000094/type/journal_article
Description
Summary: The alignment of artificial intelligence (AI) systems with societal values and the public interest is a critical challenge in AI ethics and governance. Traditional approaches, such as Reinforcement Learning from Human Feedback (RLHF) and Constitutional AI, often rely on pre-defined, high-level ethical principles. This article critiques these conventional alignment frameworks through the philosophical perspectives of pragmatism and public interest theory, arguing that they are rigid and disconnected from practical impacts. It proposes an alternative alignment strategy that reverses the traditional logic, grounding alignment in empirical evidence about the real-world effects of AI systems. By emphasizing practical outcomes and continuous adaptation, this pragmatic approach aims to ensure that AI technologies are developed according to principles derived from the observable impacts of their applications.
ISSN: 2632-3249