# NanoChat: The best ChatGPT that $100 can buy

NanoChat: Andrej Karpathy’s new project

Andrej Karpathy, a prominent voice in the AI community, has unveiled his latest project, NanoChat, with a bold claim: you can train a ChatGPT-like model from scratch for just $100. This isn’t about creating the next GPT-4, but about demystifying the end-to-end process of building a large language model, making it accessible to developers and researchers without million-dollar budgets.

## The $100 Speedrun

The core idea behind NanoChat is the “speedrun”: a complete, four-stage pipeline that takes you from raw data to a functional chat interface in under four hours on an 8xH100 cloud instance. The entire workflow is orchestrated by a single script, `speedrun.sh`, which handles everything:

  1. Tokenization: First, it trains a custom, high-performance BPE tokenizer on the FineWeb-EDU dataset. As the comparison tables show, this custom tokenizer holds its own: it outperforms the standard GPT-2 tokenizer on most text types and even competes with GPT-4’s tokenizer on the FineWeb data it was trained on. (A toy version of the BPE merge loop is sketched after this list.)

  2. Pretraining: This is the most compute-intensive phase, in which the ~560M-parameter model is trained on 11.2 billion tokens of web text. The training charts show a steady decrease in validation loss (val/bpb, bits per byte) and a corresponding rise in the core_metric, indicating the model is learning effectively from the data. (The loss-to-bpb conversion is sketched below.)

  3. Instruction Tuning (Midtraining & SFT): After pretraining, the model is a powerful text predictor but not yet a chatbot, so it undergoes two stages of fine-tuning. “Midtraining” adapts it to a conversational format (a toy rendering of that format follows this list) and teaches it specialized skills, such as answering multiple-choice questions (critical for benchmarks like MMLU) and using a Python interpreter for math problems. This is followed by Supervised Finetuning (SFT) on high-quality conversational data to further refine its chat abilities. The benchmark curves for mmlu_acc, arc_easy_acc, and gsm8k_acc show the model’s growing competence in knowledge, reasoning, and math across these stages.

  4. Reinforcement Learning (RL): The final, optional stage uses reinforcement learning to improve performance on tasks with clear reward signals, such as GSM8K math problems. The reward and pass@1/pass@8 charts show the model getting steadily better at solving them as it receives feedback. (The standard pass@k estimator is sketched below.)
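
To make the tokenization stage concrete: the heart of BPE training is a greedy loop that counts adjacent token pairs, merges the most frequent pair into a new token, and repeats. NanoChat’s actual tokenizer is a heavily optimized implementation trained on gigabytes of FineWeb-EDU text; the toy Python sketch below only illustrates the algorithm.

```python
from collections import Counter

def train_bpe(text: str, num_merges: int) -> dict:
    """Toy BPE trainer: greedily merge the most frequent adjacent pair."""
    ids = list(text.encode("utf-8"))  # start from raw bytes (tokens 0..255)
    merges = {}                       # (left, right) -> new token id
    next_id = 256
    for _ in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break
        top = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges[top] = next_id
        # Rewrite the sequence, replacing every occurrence of the pair.
        out, i = [], 0
        while i < len(ids):
            if i + 1 < len(ids) and (ids[i], ids[i + 1]) == top:
                out.append(next_id)
                i += 2
            else:
                out.append(ids[i])
                i += 1
        ids = out
        next_id += 1
    return merges

merges = train_bpe("the quick brown fox jumps over the lazy dog " * 100, 20)
print(f"learned {len(merges)} merges")
```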
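
The val/bpb metric mentioned above deserves a note: it is the validation loss converted to bits per byte, which normalizes away the tokenizer’s compression ratio so that runs with different tokenizers stay comparable. A minimal sketch of the conversion, assuming the loss is a mean cross-entropy in nats per token:

```python
import math

def bits_per_byte(loss_nats_per_token: float, n_tokens: int, n_bytes: int) -> float:
    """Convert mean cross-entropy (nats/token) to bits per raw UTF-8 byte."""
    nats_per_byte = loss_nats_per_token * n_tokens / n_bytes
    return nats_per_byte / math.log(2)  # nats -> bits

# Hypothetical numbers: loss of 1.2 nats/token, tokens averaging ~4.8 bytes each.
print(bits_per_byte(1.2, n_tokens=10_000, n_bytes=48_000))  # ~0.36 bpb
```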
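
Midtraining’s first job is teaching the model a conversational format. Conceptually, each chat is flattened into a single token stream with special delimiters around each turn; the token names below are illustrative placeholders, not NanoChat’s actual special tokens:

```python
# Illustrative delimiters only; NanoChat defines its own special tokens.
BOS, USER, ASSISTANT, END = "<|bos|>", "<|user|>", "<|assistant|>", "<|end|>"

def render_conversation(turns: list[dict]) -> str:
    """Flatten a chat into one training string with role delimiters."""
    parts = [BOS]
    for turn in turns:
        tag = USER if turn["role"] == "user" else ASSISTANT
        parts.append(f"{tag}{turn['content']}{END}")
    return "".join(parts)

print(render_conversation([
    {"role": "user", "content": "Why is the sky blue?"},
    {"role": "assistant", "content": "Because of Rayleigh scattering: ..."},
]))
```

During SFT, the loss is then typically masked so the model only learns to predict the assistant’s tokens, not the user’s.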
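
Finally, pass@k measures how often at least one of k sampled solutions to a problem is correct. The standard unbiased estimator (Chen et al., 2021), computed from n samples of which c are correct, fits in a few lines:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: P(at least one of k draws from n samples is correct)."""
    if n - c < k:  # fewer incorrect samples than draws -> guaranteed hit
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# E.g., 8 sampled solutions per GSM8K problem, 3 of which are correct:
print(pass_at_k(n=8, c=3, k=1))  # 0.375 (equals c/n when k=1)
print(pass_at_k(n=8, c=3, k=8))  # 1.0
```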

## A Tool for Learning, Not Competing

The resulting model is, as Karpathy describes it, modest. The screenshot of the NanoChat UI shows that it can answer factual questions (“Why is the sky blue?”) and even write a simple poem, but it is not designed to compete with state-of-the-art foundation models.

The true value of NanoChat is educational. It’s a clean, readable, and self-contained codebase that exposes the entire LLM training stack. It empowers developers to tinker with every component, from the tokenizer to the RL feedback loop, to truly understand what goes into making these powerful systems.

## Get Your Hands Dirty

Ready to build your own model? The community has already made it easier than ever to get started.

  - Train it Yourself: You can deploy a pre-configured environment on RunPod with a single click using this template. The template uses the `axiilay/nanochat` Docker image, which is available on Docker Hub.
  - Try it Now: If you want to skip the training and just chat with a pre-trained model, head over to the Nanochat Playground.


