Alignment
Inner alignment
Risks from Learned Optimization in Advanced Machine Learning Systems
Goal Misgeneralization in Deep Reinforcement Learning
Goal Misgeneralization: Why Correct Specifications Aren’t Enough For Correct Goals
Agents and Devices: A Relative Definition of Agency