
24-25 October

Moscow, HSE University


Workshop "Understanding How LLMs Work: An Introduction to Interpretability", AIRI

Alexey Dontsov
AIRI

Elena Tutubalina
AIRI

We will cover the foundations of neural network interpretability, focusing on Sparse Autoencoders, a simple but powerful technique that has brought us closer to understanding the inner workings of large language models. We will explore why self-attention works, viewed through the lens of information flow; what circuits are and how models leverage them to solve problems; and how these insights are advancing our understanding of LLM architecture.
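To make the Sparse Autoencoder idea concrete, here is a minimal sketch (not the workshop's material): an SAE maps a model activation into a wider feature space with a ReLU encoder and reconstructs it with a linear decoder, trained with an L1 penalty that pushes most features to zero. All names, dimensions, and weights below are illustrative; a real SAE would be trained on activations from an LLM's residual stream.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; real SAEs use much larger hidden widths than inputs.
d_model, d_hidden = 8, 32

# Randomly initialised weights (training loop omitted for brevity).
W_enc = rng.normal(scale=0.1, size=(d_model, d_hidden))
b_enc = np.zeros(d_hidden)
W_dec = rng.normal(scale=0.1, size=(d_hidden, d_model))
b_dec = np.zeros(d_model)

def sae_forward(x):
    """Encode an activation into sparse features, then reconstruct it."""
    h = np.maximum(0.0, x @ W_enc + b_enc)  # ReLU zeroes out inactive features
    x_hat = h @ W_dec + b_dec               # linear decoder reconstructs input
    return h, x_hat

def sae_loss(x, h, x_hat, l1_coef=1e-3):
    """Reconstruction error plus an L1 term that encourages sparsity in h."""
    return np.sum((x - x_hat) ** 2) + l1_coef * np.sum(np.abs(h))

x = rng.normal(size=d_model)  # stand-in for a residual-stream activation
h, x_hat = sae_forward(x)
loss = sae_loss(x, h, x_hat)
```

Because the hidden layer is wider than the input but the L1 penalty keeps only a few features active per example, each feature tends to specialise, which is what makes the learned features interpretable.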

Slides