# NanoChat: The best ChatGPT that $100 can buy
## NanoChat: Andrej Karpathy’s new project
Andrej Karpathy, a prominent voice in the AI community, has unveiled his latest project, NanoChat, with a bold claim: you can train a ChatGPT-like model from scratch for just $100. This isn’t about creating the next GPT-4, but about demystifying the end-to-end process of building a large language model, making it accessible to developers and researchers without million-dollar budgets.
## The $100 Speedrun
The core idea behind NanoChat is the “speedrun”: a complete, four-stage pipeline that takes you from raw data to a functional chat interface in under four hours on an 8xH100 cloud instance. This entire workflow is orchestrated by a single script, `speedrun.sh`, which handles everything:
- **Tokenization:** First, it trains a custom, high-performance BPE tokenizer on the FineWeb-EDU dataset. As the comparison tables show, this custom tokenizer holds its own, outperforming the standard GPT-2 tokenizer on most text types and even competing with GPT-4’s tokenizer on the FineWeb dataset it was trained on. (A minimal training sketch follows this list.)
- **Pretraining:** This is the most compute-intensive phase, where the ~560M-parameter model is trained on 11.2 billion tokens of web text. The training progress, visualized in the provided charts, shows a steady decrease in the validation loss (`val/bpb`) and a corresponding increase in the `core_metric`, indicating the model is effectively learning from the data. (A back-of-the-envelope compute estimate follows this list.)
- **Instruction Tuning (Midtraining & SFT):** After pretraining, the model is a powerful text predictor but not yet a chatbot. It undergoes two stages of fine-tuning. “Midtraining” adapts it to a conversational format (sketched after this list) and teaches it specialized skills like solving multiple-choice questions (critical for benchmarks like MMLU) and using a Python interpreter for math problems. This is followed by Supervised Finetuning (SFT) on high-quality conversational data to further refine its chat capabilities. The benchmark graphs for `mmlu_acc`, `arc_easy_acc`, and `gsm8k_acc` illustrate the model’s growing competence in knowledge, reasoning, and math throughout these stages.
- **Reinforcement Learning (RL):** The final, optional stage uses reinforcement learning to improve performance on specific tasks with clear reward signals, like the GSM8K math problems. The `reward` and `pass@1`/`pass@8` charts show the model getting progressively better at solving these problems as it receives feedback. (A sketch of the pass@k metric follows this list.)
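To give a flavor of the tokenization step, here is a minimal sketch of training a byte-level BPE tokenizer with the HuggingFace `tokenizers` library. NanoChat ships its own tokenizer implementation, so the library choice, vocabulary size, special token, and file path below are all illustrative assumptions:

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import ByteLevel
from tokenizers.trainers import BpeTrainer

# Byte-level BPE: every input byte is representable, so no <unk> fallback.
tokenizer = Tokenizer(BPE())
tokenizer.pre_tokenizer = ByteLevel(add_prefix_space=False)

# Vocab size, special token, and training file are placeholders,
# not NanoChat's actual configuration.
trainer = BpeTrainer(vocab_size=65536, special_tokens=["<|bos|>"])
tokenizer.train(files=["fineweb_edu_sample.txt"], trainer=trainer)

print(tokenizer.encode("Why is the sky blue?").tokens)
```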
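The pretraining budget also checks out against the Chinchilla rule of thumb of roughly 20 training tokens per parameter (560M × 20 = 11.2B), and a back-of-the-envelope estimate with the standard 6·N·D FLOPs approximation lands under the four-hour mark. The per-GPU peak throughput and utilization below are assumptions, not measured NanoChat figures:

```python
# Back-of-the-envelope pretraining math; MFU and peak throughput
# are assumptions, not measured NanoChat numbers.
params = 560e6                 # ~560M-parameter model
tokens = 20 * params           # Chinchilla-style ~20 tokens/param -> 11.2B

flops = 6 * params * tokens    # standard 6*N*D training-FLOPs estimate
h100_bf16 = 990e12             # approximate peak dense bf16 FLOP/s per H100
mfu = 0.40                     # assumed model FLOPs utilization
cluster_flops = 8 * h100_bf16 * mfu

hours = flops / cluster_flops / 3600
print(f"~{hours:.1f} hours on 8xH100")  # ~3.3 hours, under the 4h budget
```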
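The conversational adaptation in midtraining amounts to rendering chat turns into a single token stream with special delimiters. The sketch below shows the general idea using hypothetical tokens, not NanoChat’s actual schema:

```python
# Hypothetical chat delimiters for illustration; NanoChat defines
# its own special tokens, which are not reproduced here.
def render_chat(messages: list[dict]) -> str:
    """Flatten {role, content} turns into one training string."""
    return "".join(
        f"<|{m['role']}|>{m['content']}<|end|>" for m in messages
    )

print(render_chat([
    {"role": "user", "content": "Why is the sky blue?"},
    {"role": "assistant", "content": "Because of Rayleigh scattering..."},
]))
```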
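Finally, the pass@1/pass@8 curves measure how often at least one of k sampled solutions is graded correct. A naive empirical version of that metric, with hypothetical stand-ins for the model’s sampler and the GSM8K grader, looks like this:

```python
import random

# Naive empirical pass@k: a problem counts as solved if any of k
# sampled completions is graded correct. `sample_solution` and
# `is_correct` are hypothetical stand-ins, not NanoChat APIs.
def pass_at_k(problems, sample_solution, is_correct, k: int) -> float:
    solved = sum(
        any(is_correct(p, sample_solution(p)) for _ in range(k))
        for p in problems
    )
    return solved / len(problems)

# Toy usage: a "model" that answers correctly 30% of the time.
problems = [("2+2", "4")] * 100
rate = pass_at_k(
    problems,
    sample_solution=lambda p: "4" if random.random() < 0.3 else "5",
    is_correct=lambda p, ans: ans == p[1],
    k=8,
)
print(f"pass@8 ≈ {rate:.2f}")  # expected around 1 - 0.7**8 ≈ 0.94
```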
## A Tool for Learning, Not Competing
The resulting model is, as Karpathy describes it, modest. The screenshot of the NanoChat UI shows it can answer factual questions (“Why is the sky blue?”) and even write a simple poem, but it’s not designed to compete with state-of-the-art foundation models.
The true value of NanoChat is educational. It’s a clean, readable, and self-contained codebase that exposes the entire LLM training stack. It empowers developers to tinker with every component, from the tokenizer to the RL feedback loop, to truly understand what goes into making these powerful systems.
## Get Your Hands Dirty
Ready to build your own model? The community has already made it easier than ever to get started.
- **Train it Yourself:** You can deploy a pre-configured environment on RunPod with a single click using this template. The template uses the `axiilay/nanochat` Docker image, which can be found on Docker Hub.
- **Try it Now:** If you want to skip the training and just chat with a pre-trained model, head over to the Nanochat Playground.