# Building Production-Ready Claude Code Skills | Artificialus


Claude Code Skills are filesystem-based modules that extend the agent with specialized capabilities, and they're not the same thing as CLAUDE.md. Here's how the progressive-disclosure architecture actually works, how to build a production-ready skill end-to-end, and why Simon Willison thinks they might be a bigger deal than MCP.

May 17, 2026


Written by

Doc | The Researcher


Claude Code's Skills are filesystem-based modules that extend the agent with specialized capabilities: workflows, tools, conventions, domain knowledge. Anthropic introduced the format on October 16, 2025, and released it as an open standard shortly after. Today, the same `SKILL.md` you write for Claude Code also runs in Claude.ai, the Claude API, and (importantly) in OpenAI Codex, Cursor, Gemini CLI, Antigravity, and Windsurf. Write a skill once, run it across the major agent platforms.

This guide walks through what Skills actually are, how the progressive-disclosure architecture works, and how to build one end-to-end.

## What Skills Are — and What They're Not

A common point of confusion: Skills are not the same thing as `CLAUDE.md`.

- `CLAUDE.md` is a project memory file. Claude Code reads it on startup and keeps its contents in context for the whole session. It's the right place for "always remember this": code style rules, framework conventions, testing preferences, the names of your private packages. It's always loaded, so everything in it pays a context-window cost on every turn.
- Skills are modular capabilities packaged as folders. Only their metadata loads on startup; the full instructions and any bundled resources load on demand, when Claude decides the skill is relevant to the task at hand. This is why you can install many skills without paying for their full contents on every turn.

You use `CLAUDE.md` for things Claude must always know about your project. You use Skills for things Claude should know how to do: repeatable tasks that may or may not come up in any given session.

It also helps to position Skills against two adjacent layers:

- MCP servers are the transport: they define how an agent connects to external systems, handles authentication, and discovers available tools.
- Tools are the individual functions an agent can invoke (`read_file`, `create_issue`, `query_database`).
- Skills are the behavior: what to do, in what order, with what guardrails, once the agent has the connections and tools it needs.

In production, all three layers run together. Skills don't replace MCP; they sit on top of it.

> As Simon Willison put it shortly after the launch: "Claude Skills are awesome, maybe a bigger deal than MCP."

## The Progressive-Disclosure Architecture

The single most important technical idea in Skills is progressive disclosure, and it's the part most "intro to Skills" tutorials skip past. Skills load in three stages:

- Metadata scan (~100 tokens per skill). At startup, Claude sees only each skill's `name` and `description` from its YAML frontmatter. The cost grows linearly: a few dozen skills add a few thousand tokens, and a few hundred add tens of thousands, which is still far less than loading every skill body.
- Full instructions (typically <5,000 tokens). When the model decides a skill is relevant to the current task, based purely on those descriptions, it loads the full `SKILL.md` body into context.
- Bundled resources (on demand). Reference docs, scripts, templates, and assets in the skill folder load only when the `SKILL.md` instructions point Claude to them.

This is what makes the model practical at scale. A 50KB reference document inside a skill costs zero context tokens until Claude actually reads it.

The implication for skill authors is direct: write your `SKILL.md` for what to do, and push everything Claude needs to know into reference files. A 2,000-word `SKILL.md` that mixes process with background context loads inefficiently and dilutes the model's attention; a 200-line `SKILL.md` that points to `references/api-guide.md` when needed is fast and clean.
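The three stages can be pictured with a small Python sketch. This is an illustration of the idea only, not how Claude Code is implemented: the function names are hypothetical, the frontmatter parser handles only flat `key: value` lines, and it assumes each skill's folder name matches its `name` field.

```python
from pathlib import Path


def parse_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a SKILL.md into (frontmatter fields, markdown body).

    Handles only flat `key: value` lines, which is all this sketch needs;
    a real loader would use a proper YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    fields: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the instruction body.
            return fields, "\n".join(lines[i + 1:]).strip()
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}, text  # no closing delimiter: treat the file as plain markdown


def scan_metadata(skills_dir: Path) -> dict[str, str]:
    """Stage 1: collect only name -> description for every installed skill."""
    index: dict[str, str] = {}
    for skill_md in skills_dir.glob("*/SKILL.md"):
        fields, _ = parse_frontmatter(skill_md.read_text(encoding="utf-8"))
        if "name" in fields and "description" in fields:
            index[fields["name"]] = fields["description"]
    return index


def load_instructions(skills_dir: Path, name: str) -> str:
    """Stage 2: read the full SKILL.md body once a skill is judged relevant."""
    # Assumption: the folder name matches the skill's `name` field.
    text = (skills_dir / name / "SKILL.md").read_text(encoding="utf-8")
    return parse_frontmatter(text)[1]


def load_resource(skills_dir: Path, name: str, relative_path: str) -> str:
    """Stage 3: read one bundled file only when the instructions point to it."""
    return (skills_dir / name / relative_path).read_text(encoding="utf-8")
```

Only `scan_metadata` runs for every skill; the other two functions touch the disk for a single skill, and only after something has decided that skill is relevant.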

The activation decision happens at the model level: no embeddings, no classifiers, no keyword matching. Claude reads the descriptions and decides, which means your `description` field is the single biggest lever for whether your skill actually gets used. Vague descriptions ("Code review tool") get ignored; specific ones with concrete trigger phrases get picked up.

## Anatomy of a Skill

The minimal skill is a folder with a single file:

```
my-skill/
└── SKILL.md
```

A full-featured skill might look like this:

```
code-reviewer/
├── SKILL.md                 # Required: frontmatter + process instructions
├── references/              # Optional: load-on-demand context
│   ├── security-checks.md
│   ├── perf-patterns.md
│   └── team-conventions.md
├── scripts/                 # Optional: deterministic executables
│   ├── run_linters.sh
│   └── check_types.py
└── templates/               # Optional: output templates
    └── review-report.md
```

The `SKILL.md` itself starts with YAML frontmatter, then markdown instructions:

```
---
name: code-reviewer
description: Review pull request diffs for performance, security, readability, and project conventions. Use when reviewing PRs, checking code quality, analyzing diffs, or when the user mentions "review", "PR", or "code quality".
---

# Code Review Skill

## Process

1. Read the diff using `git diff main...HEAD`.
2. Run `scripts/run_linters.sh` and capture output.
3. Run `scripts/check_types.py` and capture output.
4. For each changed file, check against `references/security-checks.md`
   and `references/perf-patterns.md`.
5. Compile findings using `templates/review-report.md`,
   tagging each with severity: blocker / major / minor / nit.

## Conventions

Read `references/team-conventions.md` before flagging style issues
to avoid contradicting team standards.
```

Notice the structure: the `SKILL.md` describes the process (what to do, in what order); the `references/` files contain the knowledge (what to look for). When Claude runs this skill, the process loads immediately, and the references load only when the process tells Claude to read them.
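Because activation depends entirely on the frontmatter, it can be worth linting it before you ship a skill. The sketch below is one possible check, not an official validator. The length limits and the character rule for `name` are assumptions based on Anthropic's published authoring guidance as recalled here, so confirm them against the current docs. The "Use when" test is a rough heuristic of my own, not a requirement of the format.

```python
import re

MAX_NAME_LEN = 64            # assumed limit; confirm against the current docs
MAX_DESCRIPTION_LEN = 1024   # assumed limit; confirm against the current docs
NAME_PATTERN = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def validate_frontmatter(name: str, description: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means it looks fine."""
    problems: list[str] = []

    if not name:
        problems.append("name is empty")
    elif len(name) > MAX_NAME_LEN:
        problems.append(f"name is longer than {MAX_NAME_LEN} characters")
    elif not NAME_PATTERN.fullmatch(name):
        problems.append("name must be lowercase letters, digits, and hyphens")

    if not description.strip():
        problems.append("description is empty")
    elif len(description) > MAX_DESCRIPTION_LEN:
        problems.append(f"description is longer than {MAX_DESCRIPTION_LEN} characters")
    elif "use when" not in description.lower():
        # Heuristic only: descriptions that state their triggers tend to activate better.
        problems.append("description has no 'Use when ...' trigger clause")

    return problems
```

Run against the example above, `validate_frontmatter("code-reviewer", ...)` returns an empty list, while a bare "Code review tool" description is flagged for lacking a trigger clause.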

## Building a Code Review Skill, End to End

Let's make the example above concrete. Start by scaffolding the directory in your project's `.claude/skills/` folder (or `~/.claude/skills/` for user-wide skills):

```
mkdir -p .claude/skills/code-reviewer/{references,scripts,templates}
cd .claude/skills/code-reviewer
```

Write the `SKILL.md` as shown above. Then build out the supporting files:

`scripts/run_linters.sh` — deterministic execution beats asking Claude to "run the linter mentally":

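```
#!/usr/bin/env bash
# Run the linters and print their reports for Claude to read.
set -euo pipefail
# Linters exit non-zero when they find issues; `|| true` keeps the script going.
ruff check . --output-format=json > /tmp/lint.json || true
prettier --check . > /tmp/format.txt 2>&1 || true
cat /tmp/lint.json /tmp/format.txt
```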
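The `SKILL.md` also calls `scripts/check_types.py`, which is not shown here. A minimal sketch of what it could look like follows. It assumes the project type-checks with mypy, which is purely an assumption, so swap in whatever checker you actually use. `run_checker` and `DEFAULT_COMMAND` are illustrative names.

```python
#!/usr/bin/env python3
"""Hypothetical scripts/check_types.py: run a type checker and report its output."""
import json
import subprocess
import sys

# Assumption: the project type-checks with mypy. Replace with your own command.
DEFAULT_COMMAND = [sys.executable, "-m", "mypy", "."]


def run_checker(command: list[str]) -> dict:
    """Run `command` and return its exit code plus combined stdout/stderr."""
    result = subprocess.run(command, capture_output=True, text=True)
    return {
        "command": command,
        "exit_code": result.returncode,
        "output": (result.stdout + result.stderr).strip(),
    }


if __name__ == "__main__":
    # Type errors are findings for the review, not a script failure,
    # so the report is printed and the script itself exits normally.
    print(json.dumps(run_checker(DEFAULT_COMMAND), indent=2))
```

Emitting JSON with an explicit `exit_code` gives Claude a stable shape to read, in the same spirit as the linter script above.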

`references/security-checks.md` — the actual checklist Claude consults when reviewing security-sensitive code (auth, input validation, SQL handling, secret management, etc.). Keep this comprehensive: it costs zero tokens until needed.

`references/team-conventions.md` — your team's idioms. Naming patterns, preferred error handling, approved dependencies. The kind of thing that would otherwise live in a senior engineer's head.

`templates/review-report.md` — a structured output template so reviews come back in a consistent shape your tooling (or your humans) can parse.
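To make "a consistent shape" concrete, here is a small Python sketch of the structure such a report might have, with findings grouped under the four severities the `SKILL.md` names. The headings, the `Finding` type, and `render_report` are hypothetical. In the skill itself Claude fills in the markdown template, so this only illustrates the target shape.

```python
from dataclasses import dataclass

# Severity order taken from the SKILL.md process step, most severe first.
SEVERITIES = ("blocker", "major", "minor", "nit")


@dataclass
class Finding:
    severity: str
    path: str
    message: str


def render_report(findings: list[Finding]) -> str:
    """Group findings under one heading per severity, most severe first."""
    lines = ["# Code Review Report", ""]
    for severity in SEVERITIES:
        matching = [f for f in findings if f.severity == severity]
        lines.append(f"## {severity.capitalize()} ({len(matching)})")
        lines.extend(f"- `{f.path}`: {f.message}" for f in matching)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

Keeping empty sections in the output (for example `## Major (0)`) means every report has the same headings, which makes it easier for downstream tooling to parse.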

Once installed, you don't need to invoke this skill explicitly. Ask Claude to review a PR, and the description's trigger phrases ("review", "PR", "code quality") lead Claude to load the skill on its own. The first time that works, progressive disclosure clicks.

## The Ecosystem

The community has been building skills aggressively since the October 2025 launch. A few landmarks:

- `anthropics/skills`: Anthropic's official skill repository. Document manipulation (docx, pdf, pptx, xlsx), brand guidelines, internal communications. The reference implementations.
- `obra/superpowers`: Jesse Vincent's collection of 20+ battle-tested workflow skills (TDD, debugging, planning). Widely cited as a quality baseline for what good skills look like.
- `karanb192/awesome-claude-skills`: 50+ community-verified skills with a verification badge system.
- `ComposioHQ/awesome-claude-skills`: A larger directory (1,000+ entries) with cross-platform support notes.
- `sickn33/antigravity-awesome-skills`: At 38k+ stars, the most prominent installable library: 1,460+ skills with a `npx antigravity-awesome-skills` CLI installer that places them in the right directory for Claude Code, Cursor, Codex CLI, Gemini CLI, Antigravity, and others.

First-party releases are also showing up from major platforms: Vercel Labs, Supabase, Microsoft, and others have published official skills for their stacks.

## Security: The One Slide Everyone Skips

> ⚠️ Skills can execute arbitrary code. Scripts bundled in a skill's `scripts/` directory run with the same privileges as the agent that loaded the skill. Installing a malicious skill is functionally equivalent to running an untrusted shell script on your machine.

Anthropic's own engineering post on Skills warns explicitly that malicious skills can introduce vulnerabilities or exfiltrate data. The community has converged on a small set of practices that mitigate the risk without giving up the convenience:

- Install only from sources you've personally vetted. The verification badges on curated lists help, but don't substitute for reading the code.
- Review every `SKILL.md` and every file in `scripts/` before installing. If you wouldn't `curl | bash` the script, don't install the skill.
- Pin to specific commits, not `main`. A skill that was safe yesterday isn't guaranteed to be safe today.
- Test new skills in a sandboxed project before adding them to your daily workflow.

The convenience of `npx some-installer add-everything` is real. So is the risk. The two scale together.
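One lightweight way to support the "review everything" and "pin what you reviewed" practices is to record a hash of every file at review time and compare again after any update. The Python sketch below does that. It is a hypothetical helper, not a feature of Claude Code or of any installer, and it complements reading the code and pinning commits without replacing either.

```python
import hashlib
from pathlib import Path


def audit_skill(skill_dir: Path) -> list[dict]:
    """List every file in a skill with its SHA-256 and whether it sits under scripts/."""
    entries = []
    for path in skill_dir.rglob("*"):
        if not path.is_file():
            continue
        relative = path.relative_to(skill_dir)
        entries.append({
            "path": relative.as_posix(),
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
            # Files under scripts/ are the ones the agent may execute: read these first.
            "in_scripts": relative.parts[0] == "scripts",
        })
    return sorted(entries, key=lambda e: e["path"])


def changed_files(before: list[dict], after: list[dict]) -> list[str]:
    """Return paths that were added, removed, or modified between two audits."""
    old = {e["path"]: e["sha256"] for e in before}
    new = {e["path"]: e["sha256"] for e in after}
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))
```

Save the first audit when you review a skill; if `changed_files` later returns anything, those are the files to re-read before you trust the update.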

## The Shift Skills Represent

Skills are quietly doing something interesting: they're encoding team and organizational knowledge in a format an agent can use directly. The senior engineer's "you should always check X before merging Y" stops being tribal knowledge passed in 1:1s and becomes a `references/team-conventions.md` that Claude consults every time it touches that part of the codebase.

The teams getting the most out of Claude Code right now aren't the ones using the most skills. They're the ones investing the time to convert their best processes — code review checklists, incident playbooks, release procedures — into well-structured skills that any developer on the team can trigger by name. Skills make the agent an extension of how your team actually works, not just a generic assistant pasted on top of it.

Further reading: the official Skills documentation, Anthropic's Skill authoring best practices, the Claude Code Skills docs, and Simon Willison's analysis of why Skills may matter more than MCP.
