[academica_dat] Reminder - AISAR Seminar – Real-Time Detection of Hallucinated Entities in Long-Form Generation
Agustín Martinez Suñé
agusmartinez92 at gmail.com
Mon Sep 29 07:42:04 -03 2025
Reminder: this is today!
On Sun, Sep 28, 2025 at 17:05 Agustín Martinez Suñé <
agusmartinez92 at gmail.com> wrote:
> Reminder:
>
> The AISAR Scholarship Program in AI Safety is pleased to invite you to
> the next talk in our online seminar series, featuring researchers from
> the field.
>
> 📌 Date and time: Monday, September 29, 9:00 AM (Argentina time).
> 🎤 Speaker: Oscar Balcells Obeso – Master’s student, ETH Zurich / Scholar,
> MATS.
> 📖 Title: Real-Time Detection of Hallucinated Entities in Long-Form
> Generation
>
> 🌐 👉 Online talk. Registration: to attend the talk, please enter your
> name in the following form (you do not need to fill it out if you
> already selected "I want to be notified by email when there are new
> AISAR talks" in a previous form): https://forms.gle/f127kJPZYDbhujaL8
>
> Abstract: Large language models are now routinely used in high-stakes
> applications where hallucinations can cause serious harm, such as medical
> consultations or legal advice. Existing hallucination detection methods,
> however, are impractical for real-world use, as they are either limited to
> short factual queries or require costly external verification. We present a
> cheap, scalable method for real-time identification of hallucinated tokens
> in long-form generations, and scale it effectively to 70B parameter models.
> Our approach targets *entity-level hallucinations* (e.g., fabricated
> names, dates, citations) rather than claim-level ones, which maps
> naturally to token-level labels and enables streaming detection. We develop
> an annotation methodology that leverages web search to annotate model
> responses with grounded labels indicating which tokens correspond to
> fabricated entities. This dataset enables us to train effective
> hallucination classifiers with simple and efficient methods such as linear
> probes. Evaluating across four model families, our classifiers consistently
> outperform baselines on long-form responses, including more expensive
> methods such as semantic entropy (e.g., AUC 0.90 vs 0.71 for
> Llama-3.3-70B), and also improve over baselines in short-form
> question-answering settings. Moreover, despite being trained only with
> entity-level labels, our probes effectively detect incorrect answers in
> mathematical reasoning tasks, indicating generalization beyond entities.
> While our annotation methodology is expensive, we find that annotated
> responses from one model can be used to train effective classifiers on
> other models; accordingly, we publicly release our datasets to facilitate
> reuse. Overall, our work suggests a promising new approach for scalable,
> real-world hallucination detection.
>
> Find the paper here: https://arxiv.org/abs/2509.03531
>
> AISAR Team
> http://scholarship.aisafety.ar/
>
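For those curious how the linear probes mentioned in the abstract might look in practice, below is a minimal sketch in Python. This is not the authors' code: the activations, dimensions, and labels are synthetic stand-ins so the example runs on its own. In the paper's setting, the features would be per-token hidden states from the generating model and the labels would come from the web-search annotation pipeline.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Stand-in for per-token hidden states (N tokens, hidden size D).
# In the real setup these would be activations from a chosen layer
# of the generating LLM.
N, D = 4000, 512
X = rng.normal(size=(N, D))

# Synthetic labels: 1 = token belongs to a fabricated entity, 0 = grounded.
# Generated from a planted direction plus noise so the probe has
# something to find.
w_true = rng.normal(size=D)
y = (X @ w_true + rng.normal(scale=4.0, size=N) > 0).astype(int)

# A linear probe is just logistic regression on the activations.
split = int(0.8 * N)
probe = LogisticRegression(max_iter=1000)
probe.fit(X[:split], y[:split])

# Streaming use: score each new token's activation as it is generated;
# tokens above a chosen threshold get flagged as likely hallucinated.
scores = probe.predict_proba(X[split:])[:, 1]
print(f"held-out AUC: {roc_auc_score(y[split:], scores):.3f}")

Because the probe is a single linear map per token, scoring adds negligible overhead on top of generation, which is what makes the real-time, streaming detection described in the abstract feasible.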
More information about the academica_dat mailing list