Stanford CS229 I Machine Learning I Building Large Language Models (LLMs)

For more information about Stanford’s Artificial Intelligence programs visit: https://stanford.io/ai

This lecture provides a concise overview of building a ChatGPT-like model, covering both pretraining (language modeling) and post-training (SFT/RLHF). For each component, it explores common practices in data collection, algorithms, and evaluation methods. This guest lecture was delivered by Yann Dubois in Stanford’s CS229: Machine Learning course, in Summer 2024.

Yann Dubois
PhD Student at Stanford
https://yanndubs.github.io/

About the speaker: Yann Dubois is a fourth-year CS PhD student advised by Percy Liang and Tatsu Hashimoto. His research focuses on improving the effectiveness of AI when resources are scarce. Most recently, he has been part of the Alpaca team, working on training and evaluating language models more efficiently using other LLMs.

To view all online courses and programs offered by Stanford, visit: http://online.stanford.edu

Watch/Read More