Claude Code skills: why they exist when prompt templates already worked
Published 2026-05-11 by Owner
The first instinct when you want Claude Code to follow a specific workflow is to drop instructions into CLAUDE.md. That works. For a while, it works well. Then you add another workflow. And another. By the time CLAUDE.md has five workflows in it, the model is loading all of them on every single prompt — including prompts that have nothing to do with any of them.
Claude Code skills solve a specific version of this problem: the context-bloat that happens when you want the model to handle many different specialized behaviors, not all at once.
What a skill actually is
A skill is a directory containing a SKILL.md file. That’s the whole package. The SKILL.md has frontmatter with two required fields — name and description — and a body that describes what the model should do when the skill is active.
---
name: pr-checklist
description: Run this before creating any pull request. Verifies test coverage, checks for console.log statements, confirms branch is rebased on main, and formats the commit message.
---
Before creating a PR:
1. Run `bun run test` and confirm all tests pass.
2. Search for `console.log` in staged files. If any exist, ask whether to remove them.
3. Run `git log --oneline main..HEAD` — if the branch is more than 5 commits behind main, suggest rebasing.
4. The commit message should be imperative mood, under 72 characters on the first line.
Do not create the PR until all four checks are resolved.
That’s a complete, functional skill. The frontmatter tells Claude Code what the skill is and when it might apply. The body tells it what to do when invoked.
The description field does the real work
Here is the part that makes skills different from prompt templates: Claude Code reads the description field of every installed skill to decide whether to load it for a given task.
With a prompt template, you paste it in every time or you configure it to always load. Either you remember to use it, or it’s always in context.
With a skill, the model compares your request against skill descriptions at the start of each task. If your request matches a skill’s description, that skill’s body gets loaded into context. If it doesn’t match, the skill isn’t loaded at all.
The pr-checklist skill above will be considered any time you mention creating a pull request. For a prompt about writing a database migration, it won’t appear at all. This is the lazy-loading property — context tokens only get spent on a skill when there’s actual reason to think the skill applies.
The description needs to be precise enough that the model can make a good match decision. Vague descriptions — “useful for coding tasks” — match everything and load everywhere, which defeats the purpose. Specific descriptions — “run before creating any pull request” — load in exactly the right context.
A side effect of good descriptions: they act as documentation. When someone looks at your team’s skill list, the description tells them exactly when each skill applies. That’s something a pile of prompt templates can’t offer.
Here’s a before/after that illustrates the precision difference:
# Too vague — matches almost any task:
description: "Helps with code quality and best practices"
# Specific — matches exactly the right moment:
description: "Use when refactoring a module that has no tests yet.
Adds a test file alongside the module before touching any implementation."
The second description activates in a narrow window. The model only loads it when the task is specifically a refactor with absent tests. Seventeen other skills can sit installed and quiet while this one fires on its one use case.
How skills reach the model
Skills are discovered through plugins. A plugin is a directory with a plugins.json or manifest file that declares what skills it contains. When you install a plugin into Claude Code, all the skills in that plugin become available.
The distribution model this creates is:
- Marketplace plugins: Skills packaged and distributed via the Claude Code plugin marketplace. You install the plugin; its skills show up in your skill list.
- Team plugins: A git repo that your team uses as a plugin source. Everyone who clones the repo and installs it as a plugin gets the same set of skills. This is the main pattern for team-shared workflows.
- Local plugins: A directory on your local machine, not checked into anything. Good for personal workflows.
For a team, the practical setup is a git repo that holds skills for the team’s specific workflows — deploy procedures, PR formats, incident response checklists. You install that repo as a plugin. When someone updates a skill, everyone pulls and the new version is available.
This is a meaningful difference from CLAUDE.md. A CLAUDE.md is per-project. Skills can be installed across all your projects, shared across a team, or pulled from the marketplace.
Invoking a skill directly
Two ways to invoke a skill.
The first is auto-loading: if the model determines that your task matches a skill’s description, the skill gets loaded automatically. You don’t have to do anything.
The second is explicit invocation via the Skill tool. When Claude Code is running as an agent and has the Skill tool available, it can be called with a skill name. The name resolves to the installed skill’s body and loads it into the active context. In practice, you’d use this when you want to explicitly hand off to a skill rather than relying on description matching.
The auto-loading path is what most people use in daily work. You type “create a PR” and the pr-checklist skill activates on its own. The explicit invocation path shows up in agent compositions where one skill hands control to another.
What goes in CLAUDE.md instead
Skills are for modular, reusable behaviors that apply in specific contexts. CLAUDE.md is for things that always apply, project-wide.
The distinction in practice:
- Code style conventions, library preferences, test patterns → CLAUDE.md. These apply to every task in the project.
- PR creation workflow → skill. Only relevant when creating a PR.
- Deploy procedure → skill. Only relevant when deploying.
- “Use bun, not npm” → CLAUDE.md. Applies everywhere.
- Incident response runbook → skill. Applies only when responding to an incident.
If you find yourself putting conditional logic into CLAUDE.md — “if you’re creating a PR, do X; if you’re deploying, do Y” — that’s a signal the conditionals belong in skills.
There’s also a practical token argument. A CLAUDE.md with three embedded workflows might run 600-900 tokens of constant overhead on every prompt. Split those into three skills and you drop to near zero overhead on any single-purpose task, with the relevant skill loading when actually needed. On a long coding session with many distinct tasks, that adds up.
When not to write a skill
Three situations where a skill is the wrong tool:
One-off prompts. If you need something once and won’t need it again, a skill is overhead. Write the prompt inline, use it, move on.
Environmental setup. Things like “always use this API key format” or “the test database is at this URL” aren’t behaviors — they’re facts about the environment. Those belong in CLAUDE.md or in hooks, not in skills.
Things that should always be active. If a behavior should apply to every single task, putting it in a skill and relying on description-matching introduces uncertainty. The model might not match it when you need it to. For always-on rules, CLAUDE.md is more reliable.
The wrong mental model for skills: “a place to store prompts.” The right mental model: “a behavior that should activate in specific circumstances, which can be installed once and shared.”
The size and scope question
Skills work best when they’re scoped to one clear behavior. The pr-checklist skill does one thing: check the pre-PR state. It doesn’t also handle post-merge cleanup and dependency updates. Those would be separate skills.
The reason for tight scope is the description field. A narrowly scoped skill has a description that the model can match confidently. A broadly scoped skill needs a description that covers many cases — which starts to look like “applies to coding tasks” and loses its matching precision.
My working rule: if I can’t write the description in one sentence, the skill is probably doing too much.
Marketplace distribution and what it means
The skills marketplace means skills can be published by anyone and installed by anyone. A skilled consultant who builds a solid code-review workflow can distribute it as a plugin. A company can publish an internal tooling plugin to their private registry.
This matters because it changes the unit of sharing. Before skills, sharing AI workflows meant sharing prompt text — paste this into your CLAUDE.md, or use this prompt template. Skills make workflows installable artifacts with versioning, discoverability, and install/uninstall lifecycles.
In practice, for most teams today, the relevant distribution is the private team plugin in a git repo. The marketplace matters more as the skill ecosystem matures and there are more published packages worth installing.
For now: if your team has repeatable workflows that you want in Claude Code, put them in skills, put the skills in a git repo, install the repo as a plugin. That’s the current practical form of the distribution model.
A note on what this isn’t
Claude Code’s skill system is specific to Claude Code — the Anthropic CLI tool, not the API, not Claude.ai. It’s not the same thing as OpenAI’s GPTs, Anthropic’s broader “Skills” product branding, or any other agent framework’s concept of a skill. The mechanism described here — SKILL.md files, plugin discovery, description-based auto-loading — is specific to how Claude Code manages reusable agent behaviors.
If you’re working with the Claude API directly, this doesn’t apply. Skills are a Claude Code feature.
The skills system is still maturing. Description matching isn’t perfect, plugin management tooling is minimal, and the marketplace is early. But the core mechanism — dynamically loaded, description-matched behavior packages distributed via installable plugins — is already useful for teams with more than a handful of repeatable workflows. CLAUDE.md carries everything; skills carry only what’s needed, when it’s needed.
A reasonable starting point: audit your current CLAUDE.md for sections that only apply in specific circumstances. Anything conditional — “when creating migrations, always…”, “before any deploy, check…” — is a candidate to extract into a skill. The first extraction will make the split obvious. After a few, the pattern becomes second nature and you stop reaching for CLAUDE.md as the default home for every AI behavior.