[general_dat] AISAR Seminar – The Theoretical Foundations of Reward Learning

Agustín Martinez Suñé agusmartinez92 at gmail.com
Wed Oct 8 13:11:51 -03 2025


From the AISAR Scholarship Program in AI Safety, we are pleased to
invite you to the next talk in our online seminar series, featuring
researchers in the field.

📌 Date and time: Monday, October 13, 10:00 (Argentina time).
🎤 Speaker: Joar Skalse – PhD @ University of Oxford | Director @ DEDUCTO
📖 Title: The Theoretical Foundations of Reward Learning

Online talk: To attend the talk, register here:
https://luma.com/nostkcsb

Abstract: In this talk, I will provide an overview of my research on how to
build a theoretical foundation for the field of reward learning, including
my main motivations for pursuing this research, and some of my core results.

This research agenda involves answering questions such as: What is the
right method for expressing goals and instructions to AI systems? How
similar must two different goal specifications be in order to not be
hackable? What is the right way to quantify the differences and
similarities between different goal specifications in a given specification
language? What happens if you execute a task specification that is not
close to the “ideal” specification? Which specification learning algorithms
are guaranteed to converge to a good specification? How sensitive are these
specification learning algorithms to misspecification? If we have a bound
on the error in a specification (under some metric), can we devise safe
methods for optimising it?
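As a pointer to the kind of quantification these questions gesture at,
here is a minimal sketch of a STARC-style pseudometric between reward
functions, along the lines of Skalse's published work on comparing
reward functions. The canonicalisation map c and the choice of norm are
assumptions of this sketch, not details of the talk:

% Sketch of a pseudometric on reward functions R_1, R_2, where
% c is a canonicalisation removing potential shaping (and similar
% transformations that leave the induced policy ordering unchanged),
% and \|\cdot\| is a norm on the space of reward functions:
% d(R_1, R_2) = \frac{1}{2}
%   \left\| \frac{c(R_1)}{\|c(R_1)\|} - \frac{c(R_2)}{\|c(R_2)\|} \right\|

Under such a construction, distance 0 is intended to mean that the two
rewards induce the same ordering over policies, so a small distance
bounds how badly optimising one reward can perform under the other.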

Find more details at: https://www.lesswrong.com/s/TEybbkyHpMEB2HTv3
Equipo AISAR
http://scholarship.aisafety.ar/


More information about the general_dat mailing list