Nanochat - The $100 ChatGPT clone

Experience the nanochat d20 model with 561M parameters

💬 Try Nanochat Now

Ask questions, get help, or have a conversation - all powered by advanced AI

📖 What is Nanochat?

Nanochat is a revolutionary open-source AI chatbot that demonstrates how modern language models can be trained efficiently and affordably. This project showcases that building sophisticated AI assistants doesn't require massive budgets or corporate resources. With a training cost of just $100, Nanochat proves that AI technology is accessible to researchers, students, and developers worldwide.

Built on cutting-edge machine learning techniques, Nanochat uses a 20-layer Transformer architecture with 560 million parameters. The model is pre-trained on the FineWeb-EDU dataset, a carefully curated collection of educational web content, and then fine-tuned specifically for conversational interactions. This training methodology ensures that Nanochat can understand context, provide helpful responses, and engage in meaningful dialogue across a wide range of topics.

💬 Natural Conversations

Experience fluid, context-aware conversations powered by advanced natural language processing. Nanochat understands nuance, maintains conversation history, and provides thoughtful responses that feel genuinely helpful and human-like.

🎓 Educational Tool

Perfect for learning about AI and machine learning. The complete training pipeline is documented and available, making it an invaluable resource for students, researchers, and anyone interested in understanding how large language models work.

Fast & Efficient

Optimized for performance with quick response times. The model achieves impressive results while maintaining efficiency, demonstrating that effective AI doesn't always require the largest models or most expensive infrastructure.

🔓 Open Source

Completely open source under the MIT license. Every aspect of the training process is transparent and accessible, from tokenizer implementation to final deployment, enabling anyone to learn from, modify, or build upon this work.

🚀 Why Choose Nanochat?

Accessible AI Technology

In an era where AI development often seems exclusive to large tech companies with massive resources, Nanochat stands out for its accessibility. The entire training process costs approximately $100, using 8 H100 GPUs for just 3-4 hours. This makes AI development feasible for researchers, students, and developers who lack corporate-scale budgets.

Complete Learning Resource

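Nanochat serves as a comprehensive educational platform for understanding modern AI systems. The project includes detailed documentation of every stage in the machine learning pipeline, from environment setup and data preparation through tokenizer training, pre-training, fine-tuning, and evaluation.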

💡 Did you know? Nanochat achieves competitive performance on standard AI benchmarks while using a fraction of the resources required by commercial models. It scores 38.76% on ARC-Easy, 31.51% on MMLU, and demonstrates functional coding ability on HumanEval - all with just 560 million parameters!

Real-World Applications

While Nanochat was created as an educational project, its applications extend beyond the classroom. The model demonstrates practical utility for tasks such as answering questions, homework and research help, brainstorming, and coding assistance.

Technical Excellence

Behind Nanochat's accessible facade lies sophisticated engineering. The model pairs a 20-layer Transformer with a BPE tokenizer (65,536-token vocabulary) and is trained in PyTorch with custom optimizers.

Community and Open Source

Nanochat thrives on community involvement and open collaboration. The project embraces the principles of open science and reproducible research. By making every detail publicly available, from training scripts to model weights, Nanochat enables the entire AI community to learn from, modify, and build upon this work.

🛠️ Technical Specifications

Model Type: Large Language Model (LLM)
Architecture: 20-layer Transformer, 560M parameters
Training Cost: ~$100 (8×H100 GPUs, 3-4 hours)
Dataset: FineWeb-EDU (100B tokens)
Tokenizer: BPE with a 65,536-token vocabulary
Benchmarks: ARC, MMLU, GSM8K, HumanEval
Framework: PyTorch with custom optimizers
License: MIT (open source)
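
The parameter count can be sanity-checked with a back-of-the-envelope estimate. This is a rough sketch, not the project's own accounting: only the layer count and vocabulary size come from the specifications above. The hidden width of 1280, the 4x MLP expansion, bias-free layers, and untied input and output embeddings are all assumptions made for illustration.

```python
def estimate_params(n_layers: int, d_model: int, vocab_size: int) -> int:
    """Rough parameter count for a bias-free, decoder-only Transformer."""
    attention = 4 * d_model * d_model      # Q, K, V and output projections
    mlp = 2 * d_model * (4 * d_model)      # up and down projections, 4x expansion
    blocks = n_layers * (attention + mlp)
    embeddings = 2 * vocab_size * d_model  # untied token embedding + output head
    return blocks + embeddings


# 20 layers and a 65,536-token vocabulary come from the specifications;
# d_model = 1280 is an assumed width, not stated on this page.
total = estimate_params(n_layers=20, d_model=1280, vocab_size=65_536)
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
# prints: 560,988,160 parameters (~561M)
```

Under these assumptions the estimate is consistent with both the 561M and the 560 million figures quoted on this page, which differ only in rounding.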

Frequently Asked Questions

What can I use Nanochat for?

Nanochat is a versatile AI assistant that can help with a wide variety of tasks. You can use it for general conversation, asking questions about various topics, getting help with homework or research, brainstorming ideas, learning new concepts, practicing language skills, getting coding assistance, and much more. While it's designed as an educational demonstration, it provides genuinely useful responses across many domains including science, mathematics, history, literature, technology, and everyday problem-solving.

Is Nanochat really free to use?

Yes, Nanochat is completely free to use with no hidden costs, subscriptions, or usage limits. You don't need to create an account, provide payment information, or download any software. Simply start chatting directly on this page. The project is open source and grows out of the AI research community's commitment to democratizing access to artificial intelligence technology.

How does Nanochat compare to other AI chatbots?

Nanochat is designed as a lightweight, educational AI model rather than a commercial product. While larger models like GPT-4 or Claude have more parameters and broader capabilities, Nanochat demonstrates impressive performance for its size. With 560 million parameters trained on high-quality educational data, it achieves competitive scores on standard benchmarks. The key advantage of Nanochat is its accessibility - the complete training process costs just $100 and takes only 3-4 hours, making it perfect for learning how modern language models work. It's particularly strong in educational contexts, clear explanations, and demonstrating fundamental AI capabilities.

Can I train my own version of Nanochat?

Absolutely! That's one of the primary goals of the Nanochat project. The complete source code, training scripts, and documentation are available under an MIT open source license. You can download everything you need, follow the step-by-step training guide, and create your own customized version. The process requires access to GPUs (8×H100 or equivalent) and costs approximately $100 for the full training pipeline. Many cloud providers offer GPU rentals at hourly rates, making this accessible to individuals and small organizations. The project includes detailed instructions for environment setup, data preparation, tokenizer training, pre-training, fine-tuning, and evaluation.

What makes Nanochat's training approach unique?

Nanochat's training methodology is remarkable for its efficiency and transparency. The project demonstrates that effective language models don't require massive corporate resources. Key innovations include:

  • Efficient scaling: Optimal data-to-parameter ratio following Chinchilla scaling laws
  • Quality over quantity: Training on the curated FineWeb-EDU dataset rather than raw web scrapes
  • Custom tooling: Purpose-built Rust tokenizer for training efficiency
  • Complete pipeline: From tokenizer training through reinforcement learning
  • Reproducibility: Every step documented with exact configurations and hyperparameters

This approach proves that thoughtful engineering and quality data can achieve impressive results even with limited resources.
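
To make the data-to-parameter ratio concrete: the Chinchilla result is commonly summarized as roughly 20 training tokens per parameter. The sketch below applies that rule of thumb to a 560M-parameter model. The ratio of 20 is the usual approximation rather than a value quoted on this page, and the function name is purely illustrative.

```python
TOKENS_PER_PARAM = 20  # common Chinchilla rule of thumb (assumption, not from this page)


def compute_optimal_tokens(n_params: int, ratio: int = TOKENS_PER_PARAM) -> int:
    """Approximate compute-optimal training-token budget for a given model size."""
    return n_params * ratio


n_params = 560_000_000            # model size quoted on this page
dataset_tokens = 100_000_000_000  # FineWeb-EDU size listed in the specifications

tokens = compute_optimal_tokens(n_params)
print(f"{tokens / 1e9:.1f}B training tokens")           # prints: 11.2B training tokens
print(f"{tokens / dataset_tokens:.1%} of the dataset")  # prints: 11.2% of the dataset
```

By this estimate, a compute-optimal run for a model of this size would consume only a fraction of the 100B tokens listed for the dataset.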

What are the limitations of Nanochat?

As an educational demonstration model, Nanochat has some limitations compared to larger commercial systems. It may occasionally produce inaccurate information, has a knowledge cutoff based on its training data, cannot access real-time information or browse the internet, and may struggle with very specialized or technical queries outside its training domain. The model has 560 million parameters, which is significantly smaller than frontier models, so its reasoning capabilities are more limited. However, these limitations are actually valuable for educational purposes, as they help users understand the current state and constraints of AI technology. Nanochat excels at demonstrating core language model capabilities and serving as a learning tool for AI education.

Who created Nanochat and why?

Nanochat was created as an open-source educational project to demonstrate that modern AI technology is accessible to everyone, not just large tech companies. The project aims to demystify large language model training by providing a complete, reproducible example that costs just $100. By making every aspect of the training process transparent and well-documented, the project serves as an invaluable learning resource for students, researchers, and developers worldwide. The goal is to advance AI education and enable more people to understand, experiment with, and contribute to the field of artificial intelligence.

How is my data handled when using Nanochat?

Your privacy is important. When you use Nanochat, your conversations are processed to generate responses but are not permanently stored or used for additional training. The chat interface operates through a secure connection, and no personal information is required or collected to use the service. As with any AI system, you should avoid sharing sensitive personal information, passwords, or confidential data in your conversations. The open-source nature of the project means you can review the code to understand exactly how data is handled, and you can even deploy your own private instance if you need guaranteed data isolation.

Can I use Nanochat for commercial purposes?

Yes! Nanochat is released under the MIT License, one of the most permissive open-source licenses. This means you can use the code, model, and training methodology for commercial purposes, modify it to suit your needs, integrate it into your products, and even create derivative works. The only requirement is that you include the original copyright notice and license text in your distribution. This makes Nanochat an excellent starting point for businesses wanting to build custom AI solutions, researchers developing new techniques, or developers creating AI-powered applications without licensing restrictions or ongoing fees.

What do I need to train my own language model?

Training your own Nanochat model requires several components:

  • Hardware: Access to 8 H100 GPUs (or equivalent) for approximately 3-4 hours. Cloud GPU rental services make this accessible.
  • Budget: Approximately $100 for GPU time, dataset storage, and related costs.
  • Software: Python environment with PyTorch, Rust compiler for the tokenizer, and standard ML libraries.
  • Data: The FineWeb-EDU dataset (automatically downloaded by the training scripts).
  • Knowledge: Basic understanding of Python and command-line operations. The detailed documentation guides you through each step.

The complete training pipeline includes tokenizer training (~1 minute), pre-training (~3 hours), mid-training (~7 minutes), and supervised fine-tuning (~7 minutes). All scripts are provided and well-documented.
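
As a quick check on how these figures fit together, the sketch below adds up the stage durations and derives the GPU-hour price implied by a $100 budget. The durations are the approximate values listed above; the per-GPU-hour rate is only what the numbers imply, not a quoted cloud price, and real rates vary by provider. Stages without a listed duration, such as evaluation, are left out.

```python
# Approximate stage durations in minutes, as listed above.
stages = {
    "tokenizer training": 1,
    "pre-training": 180,
    "mid-training": 7,
    "supervised fine-tuning": 7,
}
NUM_GPUS = 8      # 8×H100 node
BUDGET_USD = 100  # approximate total budget quoted on this page

total_minutes = sum(stages.values())
total_hours = total_minutes / 60
gpu_hours = NUM_GPUS * total_hours
implied_rate = BUDGET_USD / gpu_hours  # implied price, not a quoted one

print(f"{total_hours:.2f} h wall-clock, {gpu_hours:.0f} GPU-hours, "
      f"~${implied_rate:.2f} per GPU-hour")
# prints: 3.25 h wall-clock, 26 GPU-hours, ~$3.85 per GPU-hour
```

The total of about 3.25 hours falls inside the 3-4 hour range quoted for the full run.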