

Virtual Event: Learning Powerful Models: From Transformers to Reasoners and Beyond
About the Talk:
Many exciting and wonderful things are happening in AI all the time — but how can we understand the bigger picture? Why are these breakthroughs happening now, and what core research is driving them?
In this talk, OpenAI Research Scientist Łukasz Kaiser presents a simple model for thinking about progress: how we are making increasingly computationally powerful systems more learnable. Like any model, it has its flaws, but it has guided Łukasz from co-authoring the Transformer architecture in Attention Is All You Need (2017), to advancing model-based reinforcement learning for Atari (2019), and most recently to co-developing contemporary reasoning models at OpenAI.
Łukasz will share how this framework helps him organize his thinking and research about AI — and what it may mean for the future.
Don’t miss the opportunity to learn alongside one of the most influential AI research scientists of our time as he explores how OpenAI is building tools to help humanity solve its hardest problems.
About the Speakers:
Lukasz Kaiser
Lukasz is a deep learning researcher at OpenAI and was previously part of the Google Brain team. He works on fundamental aspects of deep learning and natural language processing. He co-invented Transformers, reasoning models, and other neural sequence models, and co-authored the TensorFlow system and the Tensor2Tensor and Trax libraries. Before working on machine learning, Lukasz was a tenured researcher at Université Paris Diderot, where he worked on logic and automata theory. He received his PhD from RWTH Aachen University in 2008 and his MSc from the University of Wroclaw, Poland.
Notable Research:
Attention Is All You Need (Vaswani et al., 2017) — the original Transformer paper
Model-Based Reinforcement Learning for Atari (Kaiser, Babaeizadeh, Milos, Osinski et al., 2019)
Reformer: The Efficient Transformer (Kitaev, Kaiser & Levskaya, 2020) — an efficient long-sequence Transformer variant
Image Transformer (Parmar, Vaswani, Uszkoreit, Kaiser, Shazeer, Ku & Tran, 2018) — applying the Transformer architecture to image generation
Can Active Memory Replace Attention? (Kaiser & Bengio, NeurIPS 2016) — exploring alternatives to attention mechanisms in neural sequence models
Natalie Cone
Natalie Cone leads the OpenAI Forum, an interdisciplinary community that brings together thoughtful contributors from a wide range of backgrounds, skill sets, and domain expertise for discourse on the intersection of AI with academic, professional, and societal domains. Before joining OpenAI, Natalie managed and stewarded Scale's ML/AI community of practice. She has a background in the arts, with a degree in History of Art from UC Berkeley. She has served as Director of Operations and Programs and on the board of directors of the radical performing arts center CounterPulse, and led visitor experience at Yerba Buena Center for the Arts.