FreeBuff (Free)
Ad-supported free tier. No subscription, no credits, no configuration. Uses optimized models with built-in web research and browser capabilities.
A multi-agent coding assistant that coordinates specialized AI agents to understand, plan, edit, and review your codebase.
Codebuff is an open-source, multi-agent coding assistant that coordinates specialized AI sub-agents — File Picker, Planner, Editor, Reviewer, Thinker, and Basher — to understand, plan, edit, and review your codebase from the terminal. Built on a deep agent framework and backed by Y Combinator (Fall 2024), it beats single-model approaches like Claude Code on complex coding tasks, scoring 61% vs 53% across 175+ real-world evals in BuffBench.
Full access to all modes (Default, Max, Plan, Lite) with standard usage limits. Multi-agent orchestration with Claude Opus 4.7, GPT-5.1, Kimi K2.6.
Higher usage limits for teams and power users.
Highest usage tier for heavy usage and teams.
500 free credits on signup. Credits consumed based on task complexity. 500 credits ≈ a few hours of coding.
Codebuff is an open-source, multi-agent coding assistant that doesn’t just throw one model at your code: it coordinates a team of specialized AI agents to understand, plan, edit, and review your codebase. Launched in June 2025 by a Y Combinator-backed team (Fall 2024 batch) and hosted on GitHub under an Apache-2.0 license, Codebuff has amassed over 6,100 stars and 6,700+ commits.
The core insight behind Codebuff is simple but powerful: different parts of a coding task benefit from different models and different agent strategies. Instead of using one LLM for everything — file discovery, planning, editing, reviewing — Codebuff spawns purpose-built agents for each role. A File Picker Agent (powered by Gemini 2.0 Flash) scans your codebase and identifies relevant files. A Planner Agent maps out the changes needed. An Editor Agent (running Claude Opus 4.7, GPT-5.1, or Kimi K2.6) makes precise edits. A Reviewer Agent catches issues before you see the result. And in Max mode, multiple editors run in parallel with different strategies, and a selector picks the best output.
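The Max-mode idea of several editors running in parallel with a selector choosing among them can be sketched as follows. This is a minimal illustration, not Codebuff's implementation: the names Candidate, EditStrategy, Selector, and bestOfN are hypothetical, the strategies are stand-ins for model calls, and the length-based selector is only a placeholder.

```typescript
// Hypothetical types: none of these names come from Codebuff's codebase.
type Candidate = { strategy: string; patch: string };
type EditStrategy = (task: string) => Promise<Candidate>;
type Selector = (candidates: Candidate[]) => Candidate;

// Run every editor strategy concurrently, then let the selector pick one result.
async function bestOfN(
  task: string,
  strategies: EditStrategy[],
  select: Selector,
): Promise<Candidate> {
  if (strategies.length === 0) {
    throw new Error("bestOfN needs at least one strategy");
  }
  const candidates = await Promise.all(strategies.map((run) => run(task)));
  return select(candidates);
}

// Stand-in strategies; a real editor would call a model here.
const minimalEdit: EditStrategy = async (task) => ({
  strategy: "minimal",
  patch: `// ${task}\nfix();`,
});
const thoroughEdit: EditStrategy = async (task) => ({
  strategy: "thorough",
  patch: `// ${task}\nvalidate();\nfix();\naddTests();`,
});

// Toy selector that prefers the longest patch (an assumption for the demo;
// a real selector would presumably be another model call).
const pickLongest: Selector = (candidates) =>
  candidates.reduce((best, c) => (c.patch.length > best.patch.length ? c : best));
```

Calling bestOfN("add retries", [minimalEdit, thoroughEdit], pickLongest) resolves to the candidate the selector prefers. The cost of the pattern is N editor runs per task, which is why it suits larger changes rather than trivial ones.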
This multi-agent approach is validated by BuffBench, Codebuff’s custom eval suite that tests configurations across 175+ real implementation tasks from open-source repos. Codebuff beats Claude Code 61% vs 53% on these evals while completing tasks 100+ seconds faster on average. In one real-world test, a feature that took Claude Code 19 minutes and 37 seconds was completed by Codebuff in 6 minutes and 45 seconds.
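To put the two timings quoted above on the same footing, the arithmetic can be worked through in a few lines. The helper name toSeconds is made up for this example; the durations are the ones stated in the text.

```typescript
// Convert a "minutes:seconds" duration into seconds.
function toSeconds(duration: string): number {
  const [minutes, seconds] = duration.split(":").map(Number);
  return minutes * 60 + seconds;
}

const claudeCodeSeconds = toSeconds("19:37"); // 1177 seconds
const codebuffSeconds = toSeconds("6:45"); // 405 seconds

const savedSeconds = claudeCodeSeconds - codebuffSeconds; // 772 seconds
const speedup = claudeCodeSeconds / codebuffSeconds; // roughly 2.9x

console.log(`saved ${savedSeconds}s, ${speedup.toFixed(1)}x faster`);
// prints "saved 772s, 2.9x faster"
```

That single example is a much larger gap than the 100+ second average, so it reads as a favorable case rather than a typical one.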
Codebuff’s defining feature is its orchestrator-driven multi-agent system. The main orchestrator agent — named “Buffy” and running on Claude Opus 4.7 — reads your prompt, gathers context, and spawns specialized sub-agents:
| Agent | Model | Role |
|---|---|---|
| File Picker | Gemini 2.0 Flash | Scans codebase, finds relevant files |
| Code Searcher | — | Grep-style pattern matching |
| Researcher | Gemini 3.1 Flash Lite | Web and documentation lookup |
| Thinker | Claude Opus 4.7, GPT-5.4 | Works through hard problems |
| Editor | Claude Opus 4.7, GPT-5.1, Kimi K2.6 | Writes and modifies code |
| Reviewer | Claude Opus 4.7, Kimi K2.6 | Catches bugs and style issues |
| Basher | Gemini 3.1 Flash Lite | Runs terminal commands, tests, typechecks |
Each sub-agent has a narrow, focused toolset and purpose. The orchestrator keeps its own context clean by only incorporating the final output from spawned agents. Agents can spawn sub-agents with arbitrary nesting depth — unlike Claude Code, which only supports one level of sub-agents.
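The two properties described above, arbitrary nesting and a parent that only sees a child's final output, can be sketched with a small recursive spawner. Everything here is hypothetical (AgentDef, Spawn, spawnAt, and the toy agents are not Codebuff's real types), and the agents are synchronous string functions rather than model calls.

```typescript
// Hypothetical shapes for illustration; not Codebuff's real agent types.
interface AgentDef {
  name: string;
  // Receives a prompt plus a spawn function for delegating to child agents.
  run: (prompt: string, spawn: Spawn) => string;
}
type Spawn = (agent: AgentDef, prompt: string) => string;

// Log of every agent invocation, indented by nesting depth.
const trace: string[] = [];

function spawnAt(depth: number): Spawn {
  return (agent, prompt) => {
    trace.push(`${"  ".repeat(depth)}${agent.name}`);
    // Children get a spawner one level deeper, so nesting is not capped.
    // Only the returned string flows back to the parent.
    return agent.run(prompt, spawnAt(depth + 1));
  };
}

const codeSearcher: AgentDef = {
  name: "code-searcher",
  run: (prompt) => `matches for "${prompt}"`,
};
const filePicker: AgentDef = {
  name: "file-picker",
  run: (prompt, spawn) => `files from ${spawn(codeSearcher, prompt)}`,
};
const orchestrator: AgentDef = {
  name: "orchestrator",
  // The orchestrator does no work itself; it only delegates.
  run: (prompt, spawn) => spawn(filePicker, prompt),
};

const finalOutput = spawnAt(0)(orchestrator, "login handler");
```

After this runs, trace holds three entries at increasing indentation, while the orchestrator itself only ever received one string back. Intermediate work done by the children never enters the parent's context.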
Traditional coding agents like Claude Code spend minutes grep-ing and reading file excerpts one at a time. Codebuff takes a fundamentally different approach:
The entire process takes just a few seconds. Codebuff often understands your project better after 2 seconds of scanning than a single-model tool does after 5 minutes of exploration.
Codebuff’s development is guided by BuffBench, a custom eval suite that tests agent configurations across 175+ real implementation tasks from open-source repositories. Unlike benchmarks such as SWE-Bench, which check whether predefined tests pass, BuffBench challenges agents to reimplement real git commits through multi-turn conversations. An AI judge compares each implementation against the ground-truth commit and scores it on completion, efficiency, code quality, and overall correctness.
This data-driven approach means every agent configuration change is measured against real-world performance. Only the highest-scoring, fastest, most cost-effective configurations ship to users.
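One way to picture "highest-scoring, fastest, most cost-effective" as a selection rule is a sort with tie-breakers. This is a sketch under stated assumptions: the EvalResult shape, the 0 to 10 score scale, and the ordering of the tie-breakers are all invented for illustration, since the source does not say how the criteria are weighed.

```typescript
// Hypothetical record of one configuration's eval results.
interface EvalResult {
  config: string;
  judgeScores: number[]; // one overall score per task (assumed 0 to 10 scale)
  avgSeconds: number;
  avgCostUsd: number;
}

const mean = (xs: number[]): number =>
  xs.reduce((sum, x) => sum + x, 0) / xs.length;

// Rank by mean judge score (higher first), then speed, then cost (lower first).
function pickConfig(results: EvalResult[]): EvalResult {
  if (results.length === 0) {
    throw new Error("no eval results to compare");
  }
  const ranked = [...results].sort(
    (a, b) =>
      mean(b.judgeScores) - mean(a.judgeScores) ||
      a.avgSeconds - b.avgSeconds ||
      a.avgCostUsd - b.avgCostUsd,
  );
  return ranked[0];
}
```

Given two configurations with the same mean score, the faster one wins; cost only matters when score and speed are both tied. A weighted sum would be an equally plausible rule.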
Codebuff provides four modes, switchable mid-session with Shift+Tab or /mode: commands:
<PLAN> tags. No file writes. Use to scope work before implementing.

FreeBuff (npm install -g freebuff) is Codebuff’s ad-supported free variant: no subscription, no credits, no configuration. Just install and start coding. It uses models optimized for fast, high-quality assistance and includes built-in web research and browser capabilities. Ads appear above the input box, and each impression earns you credits you can spend on more usage. Turn ads off at any time in settings.
Codebuff’s agent framework is exposed through the @codebuff/sdk npm package, letting you embed coding agent capabilities into your own applications. The same code that powers Codebuff powers your custom agents:
import { CodebuffClient } from '@codebuff/sdk'

const client = new CodebuffClient({
  apiKey: 'your-api-key',
  cwd: '/path/to/your/project',
  onError: (error) => console.error('Codebuff error:', error.message),
})

// Run a coding task
const result = await client.run({
  agent: 'base',
  prompt: 'Add error handling to all API endpoints',
  handleEvent: (event) => {
    console.log('Progress', event)
  },
})

You can define custom agents with TypeScript generators, create custom tools, and integrate with CI/CD pipelines.
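For CI/CD use, the run call shown above can be wrapped so a pipeline step gets a simple pass or fail result. The sketch below deliberately avoids importing the SDK: CodingClient is a structural stand-in that models only the run options visible in the example (agent, prompt, handleEvent), and the event and result shapes are left as unknown because the source does not document them. The helper runTask and the fake client are hypothetical.

```typescript
// Structural stand-in for the client in the example above (an assumption,
// not the SDK's published typings).
interface RunOptions {
  agent: string;
  prompt: string;
  handleEvent?: (event: unknown) => void;
}
interface CodingClient {
  run(options: RunOptions): Promise<unknown>;
}

interface TaskReport {
  ok: boolean;
  events: number;
  error?: string;
}

// Hypothetical helper: run one task, count streamed events, never throw.
async function runTask(client: CodingClient, prompt: string): Promise<TaskReport> {
  let events = 0;
  try {
    await client.run({
      agent: "base",
      prompt,
      handleEvent: () => {
        events += 1;
      },
    });
    return { ok: true, events };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    return { ok: false, events, error };
  }
}

// Fake client for a dry run; swap in a real client in practice.
const fakeClient: CodingClient = {
  async run({ handleEvent }) {
    handleEvent?.("started");
    handleEvent?.("finished");
    return "done";
  },
};
```

In a pipeline, the ok flag of the report would typically decide the step's exit code, so a failed agent run fails the build instead of passing silently.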
Codebuff provides a full framework for creating and publishing your own agents. Running /init inside the CLI generates a project structure with agent definition files, TypeScript type definitions, and tool configurations. Agents are defined as TypeScript objects with:
Agents can compose other published agents from the Agent Store at codebuff.com/store, creating reusable, composable workflows.
Codebuff eliminates context window anxiety. After the prompt cache expires (5 minutes idle), the conversation is automatically compacted into non-lossy summaries that preserve 10-20 roundtrips with full details. After compaction, Codebuff re-reads any relevant files it needs. You never think about context limits — it just works.
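The compaction rule described above (wait for the cache to go cold, then fold older history into a summary while keeping recent roundtrips in full) can be sketched like this. All names are hypothetical, the threshold of 10 is the low end of the quoted 10-20 range, and the summarizer is a placeholder that simply joins prompts, so it does not reproduce the non-lossy behavior the text claims.

```typescript
// Hypothetical shapes; one roundtrip is a user prompt plus the reply.
interface Roundtrip {
  user: string;
  assistant: string;
}
interface Conversation {
  summary: string; // compacted older history
  recent: Roundtrip[]; // kept verbatim
  lastActivityMs: number;
}

const CACHE_TTL_MS = 5 * 60 * 1000; // 5 minutes idle, per the text above
const KEEP_ROUNDTRIPS = 10; // low end of the 10-20 range

// Placeholder summarizer; a real one would call a model.
const summarize = (turns: Roundtrip[]): string =>
  turns.map((t) => t.user).join("; ");

function maybeCompact(convo: Conversation, nowMs: number): Conversation {
  const idleMs = nowMs - convo.lastActivityMs;
  if (idleMs < CACHE_TTL_MS || convo.recent.length <= KEEP_ROUNDTRIPS) {
    return convo; // cache still warm, or nothing worth compacting
  }
  const cut = convo.recent.length - KEEP_ROUNDTRIPS;
  const older = convo.recent.slice(0, cut);
  const summary = [convo.summary, summarize(older)].filter(Boolean).join("; ");
  return { ...convo, summary, recent: convo.recent.slice(cut) };
}
```

Tying compaction to cache expiry makes sense on cost grounds: once the cached prefix is gone, the full history would have to be re-sent at full price anyway, so that is the cheapest moment to shrink it.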
Codebuff runs as a three-tier architecture: the CLI client, a stateless server, and the model providers.
The Pipeline:
The server is stateless — it streams requests to model providers (Anthropic, OpenAI, Google, xAI) over WebSockets. Your code stays local; only relevant context is sent to the APIs.
Key architectural innovation: Subagents can optionally inherit conversation history from their parent. Unlike Claude Code’s subagents (which always start with blank context), Codebuff agents can pick up where their parent left off. Combined with arbitrary nesting depth and the orchestrator pattern (an agent whose only tool is spawning other agents), this creates a uniquely flexible architecture.
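The difference between a blank-context subagent and one that inherits its parent's history comes down to how the child's starting message list is built. A minimal sketch, assuming a made-up inheritHistory flag and a simplified Message type (neither is taken from Codebuff's API):

```typescript
type Message = { role: "user" | "assistant"; content: string };

interface SpawnOptions {
  prompt: string;
  inheritHistory?: boolean; // hypothetical flag name
}

// Build the message list a subagent starts with.
function initialContext(parent: Message[], opts: SpawnOptions): Message[] {
  const task: Message = { role: "user", content: opts.prompt };
  // Copy rather than share, so the child cannot mutate the parent's history.
  return opts.inheritHistory ? [...parent, task] : [task];
}

const parentHistory: Message[] = [
  { role: "user", content: "Refactor the auth module" },
  { role: "assistant", content: "Plan: split token handling from session storage" },
];

const blank = initialContext(parentHistory, { prompt: "Write tests" });
const inherited = initialContext(parentHistory, {
  prompt: "Write tests",
  inheritHistory: true,
});
```

Here blank holds only the new task, while inherited holds the two parent messages followed by the task. The inherited variant costs more tokens but spares the child from rediscovering what the parent already worked out.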
npm install -g codebuff
# Verify installation
codebuff --version

# No subscription, no credits, no configuration
npm install -g freebuff

# Install as a dependency in your project
npm install @codebuff/sdk

# Navigate to your project
cd /path/to/your-project
# Launch Codebuff
codebuff
# On first launch, you'll be guided through authentication
# Then just describe what you want to build

# Inside Codebuff's CLI, run:
/init

This creates project-specific configuration files including knowledge.md (project context for Codebuff) and the .agents/ directory structure for custom agent definitions.
# Launch in the current directory
codebuff
# Launch with a specific mode
codebuff --mode max
# Launch with debug logging
codebuff --debug

| Action | Input |
|---|---|
| Switch modes | Shift+Tab or /mode: commands |
| Initialize project | /init |
| Suggest follow-ups | Click on suggested prompts after each response |
Once inside Codebuff, just describe what you want in natural language:
> "Add authentication to my API"
> "Fix the SQL injection vulnerability in user registration"
> "Add rate limiting to all API endpoints"
> "Refactor the database connection code for better performance"
> "Convert the entire codebase from JavaScript to TypeScript"
> "Set up a CI/CD pipeline with GitHub Actions"

Codebuff handles the rest: file discovery, planning, editing, running tests, and reviewing.
Switch modes mid-session depending on the task:
- /mode:plan, e.g. “What’s the best way to add WebSocket support to this app?” (no code changes)
- /mode:max, e.g. “Refactor the entire payment processing pipeline” (best-of-N editing)
- /mode:lite, e.g. “Fix this typo in the error message” (fast and cheap)
- /mode:default to return to standard mode for general development

# Just install and run
npm install -g freebuff
cd your-project
freebuff

FreeBuff works identically to Codebuff but uses more affordable models and shows contextual ads above the input box.
Codebuff occupies a unique position in the coding agent landscape, differentiated by its multi-agent architecture and research-driven approach.
| Dimension | Codebuff | Claude Code | Aider | Cursor |
|---|---|---|---|---|
| Architecture | Multi-agent orchestration | Single-model + sub-processes | Single-model | Single-model |
| File Discovery | Tree-based (~2s full scan) | Sequential grep + read | Manual file specification | Editor-integrated |
| Code Review | Automatic per-prompt | None | None | None |
| Max Mode | Best-of-N parallel editors | N/A | N/A | Composer |
| Model Choice | Any OpenRouter model | Claude only | Any (via config) | Claude + GPT + Custom |
| IDE Integration | CLI (works in any terminal) | CLI | CLI / VS Code plugin | Full IDE |
| Custom Agents | Full TypeScript framework | Basic sub-agent support | Limited | Limited |
| Pricing | $100/mo or 1¢/credit + free tier | $20/mo Pro + API costs | Free (BYO keys) | $20/mo Pro |
| SDK | ✅ | ❌ | ❌ | ❌ |
| Open Source | ✅ Apache-2.0 | ❌ Proprietary | ✅ Apache-2.0 | ❌ Proprietary |
| Evals | BuffBench (175+ tasks) | SWE-Bench | SWE-Bench | Internal |
Codebuff’s direct benchmark comparison shows meaningful advantages across the board:
Choose Codebuff over Claude Code when you want faster edits, lower cost per task, automatic code review, and the ability to define custom agent workflows. Choose Claude Code when you need enterprise controls (SSO, RBAC, compliance programs) or direct Anthropic procurement.
Codebuff and Aider both run in the terminal and support multi-model backends, but diverge significantly:
Choose Codebuff for complex, multi-file refactoring tasks where automatic file discovery and code review save significant time. Choose Aider for simpler, focused edits where you want to minimize overhead and cost.
Cursor is a full IDE with AI features; Codebuff is a CLI agent:
Choose Codebuff if you prefer terminal-centric workflows, need programmable agents for automation, or want a free tier. Choose Cursor if you want a polished IDE experience with inline completions and visual diff views.
Codebuff represents a genuine architectural leap in AI coding assistants. Where most tools (Claude Code, Cursor, Aider, GitHub Copilot) rely on a single LLM to handle everything from file discovery to code editing to quality assurance, Codebuff orchestrates a team of specialized agents, each purpose-built for its role.
The results speak for themselves. A 61% win rate against Claude Code on BuffBench, tasks completed 100+ seconds faster on average, automatic code review on every change, and a custom agent framework that lets you define, compose, and publish your own agent workflows. The tree-based file discovery alone — indexing your entire codebase in ~2 seconds — eliminates one of the most frustrating bottlenecks in AI-assisted coding: watching your tool slowly explore your project file by file.
Codebuff isn’t without trade-offs. The multi-agent architecture adds overhead on trivial tasks. The pricing model is more complex than a flat subscription (tiers, credits, ads, and a free tier). There’s no native IDE integration — you use it in a terminal, even if that terminal is inside VS Code or Cursor. And with a smaller community than Claude Code or Copilot, you’ll find fewer tutorials, blog posts, and community extensions.
For developers who work on complex, multi-file projects and want a coding assistant that thinks architecturally rather than operating file-by-file, Codebuff is a compelling choice. The agent framework alone opens up possibilities that single-model tools can’t match — automated refactoring pipelines, CI/CD-integrated code review, custom agents for domain-specific tasks. And with FreeBuff, there’s zero cost to try it.
The broader implication is clear: the future of AI coding assistants isn’t better single models — it’s better orchestration of multiple models working together. Codebuff is betting on that future and, based on the evidence so far, it’s a bet worth watching.
Initial public launch — multi-agent architecture with Default, Max, Plan, and Lite modes
BuffBench eval suite, FreeBuff free tier, SDK release
Tree-sitter based file discovery, multi-agent orchestrator
# Install Codebuff globally
npm install -g codebuff
# Navigate to your project
cd /path/to/your-project
# Launch Codebuff
codebuff
# Example prompts inside Codebuff:
# > "Add authentication to my API"
# > "Fix the SQL injection vulnerability in user registration"
# > "Refactor the database connection code for better performance"
# Switch modes mid-session with Shift+Tab or /mode:max
# > /mode:max
# > "Add rate limiting to all API endpoints"
# Use FreeBuff (free tier, no subscription)
npm install -g freebuff && freebuff