Abstract: In this talk, I will take you on a tour of large language models, tracing their evolution from Recurrent Neural Networks (RNNs) to the Transformer architecture. We will explore how Transformers elegantly sidestep the vanishing and exploding gradient issues that plagued RNNs. I will introduce neural scaling laws, empirical relationships reminiscent of scaling behaviors common in physics, which predict how model performance improves with increased computational investment. We will also discuss different training paradigms and the key stages a model undergoes, from pretraining through deployment. I will illustrate some of the primary complexities encountered when scaling up large-model training, focusing on performance, resilience, and correctness. Finally, zooming out, I will share my personal perspective on our trajectory toward artificial general intelligence (AGI) and what to expect in the near term.