
Aider chat history compression: keeping long sessions affordable

Published 2026-05-11 by Owner

Every Aider turn sends the entire conversation history to the model. That’s not a bug — the model needs the history to stay coherent across turns. But it means cost compounds in a way that isn’t visible until the bill arrives.

A session with 40 turns and a chat history that has grown to 25,000 tokens doesn’t cost 40 times a single turn. It costs roughly 1 + 2 + 3 + … + 40 turns’ worth of history tokens. The later turns are the most expensive because they carry the most accumulated context, and a long history can dwarf the tokens in the actual code change being made.

To put numbers on it: a fresh Aider turn with a single 200-line file in context might cost 2,000 input tokens. That same turn at session turn 50, with 20,000 tokens of accumulated conversation, costs 22,000 input tokens — eleven times as much for the same code change. The code change didn’t get harder. The history got longer.
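The growth pattern above can be sketched in a few lines. The numbers are illustrative, not measured: a 2,000-token per-turn baseline (system prompt, repo map, files) and an assumed ~400 tokens of new history per turn, which reproduces the rough shape of the example.

```python
BASE_TOKENS = 2_000        # system prompt + repo map + files (illustrative)
HISTORY_PER_TURN = 400     # history growth per turn (assumption)

def input_tokens_at_turn(turn: int) -> int:
    """Input tokens sent on a given turn (1-indexed)."""
    return BASE_TOKENS + (turn - 1) * HISTORY_PER_TURN

def session_total(turns: int) -> int:
    """Cumulative input tokens across a whole session."""
    return sum(input_tokens_at_turn(t) for t in range(1, turns + 1))

print(input_tokens_at_turn(1))    # 2000
print(input_tokens_at_turn(50))   # 21600 -- roughly 11x the first turn
print(session_total(50))          # 590000 cumulative input tokens
```

The cumulative total grows quadratically with turn count under these assumptions, which is why the bill surprises people: halving the session length cuts cost by much more than half.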

There are three tools for managing this: /clear, /drop, and mid-session summarization. There’s also a harder reset: end the session, commit the work, start fresh. Each approach has a different cost-vs-context tradeoff, and they’re not mutually exclusive — a well-managed long session might use all of them at different points.

The baseline check is /tokens, which Aider exposes as a slash command. Running it periodically makes the cost distribution visible and gives an early signal when history is starting to dominate the context budget.

The /clear and /drop commands

/clear removes all chat history in one operation. The files Aider has added to the session stay in context — the coding context is preserved — but the conversation log is wiped. The next turn costs only the system prompt, repo map, files, and the new message, not the 30-turn backlog.

/clear

That’s the blunt instrument. More surgical is /drop, which removes a specific file from the session. If three files were added for a task that’s now done and two new files are needed, drop the old ones.

/drop src/auth/old-service.ts

Files that aren’t in the session aren’t priced into the context. This matters more than it looks. A session that adds files as it goes — feature A touches four files, feature B needs three different files — can easily accumulate 10–15 open files by turn 30. Many of those files are no longer being edited; they’re just sitting in context, costing tokens on every turn.

A practical sequence for a long session that’s still ongoing:

  1. Check what’s currently in context with /tokens. This shows the breakdown: system prompt, repo map, chat history, files.
  2. /drop any files that were needed earlier but aren’t relevant to the current task.
  3. If the conversation log is bloated (many turns of back-and-forth on a now-resolved question), /clear.
  4. Re-add the files needed for the next task.

After this sequence, a session that costs 18k tokens per turn often drops to 4–6k. On Claude Sonnet at current pricing, that’s a meaningful difference per turn and even more so across 20 remaining turns. The saved context is the history that was answering questions already settled by the code.
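To make the per-turn difference concrete, here is a rough dollar estimate for trimming 18k per-turn context down to 5k across 20 remaining turns. The per-token price is an assumption (shown here as $3 per million input tokens); substitute the current rate for whatever model you’re running.

```python
# Assumed price; check your provider's current rates before relying on this.
PRICE_PER_INPUT_TOKEN = 3.00 / 1_000_000

def remaining_cost(tokens_per_turn: int, turns_left: int) -> float:
    """Input-token cost for the rest of the session at a steady context size."""
    return tokens_per_turn * turns_left * PRICE_PER_INPUT_TOKEN

before = remaining_cost(18_000, 20)   # unpruned session
after = remaining_cost(5_000, 20)     # after /drop + /clear
print(f"unpruned: ${before:.2f}  pruned: ${after:.2f}  saved: ${before - after:.2f}")
```

The absolute dollars look small for one short session; the point is the ratio. The pruned session pays for the work, the unpruned one pays mostly for settled history.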

The one thing to watch with /clear: don’t do it in the middle of an unsettled problem. If there are three open questions about how a feature should work, those questions live in the history. Clearing before they’re resolved means re-explaining context on the next turn, which can cost more than the tokens saved. Clear after a task is done, not before it’s finished.

One useful pattern is to use /drop aggressively throughout a session, and reserve /clear for the transitions between distinct tasks. Drop files as soon as a sub-task is done — they don’t need to stay open. Clear history when moving to a genuinely new problem. This keeps the session lean without losing the active working context.

Mid-session summarization

Aider includes automatic summarization for sessions that grow past a threshold. When the conversation history approaches the model’s context window limit, Aider compresses it: it asks the model to summarize what’s been discussed, then replaces the raw history with the summary.

This happens transparently — a note in the output indicates that summarization occurred. The context gets shorter, and subsequent turns are cheaper.

The tradeoff is real and worth understanding. A summary captures “we decided to use a repository pattern for data access” but loses “we tried a service-layer approach first and dropped it because of the circular dependency between UserService and AuthService.” The specific reasoning that led to a decision is exactly the kind of context that prevents the model from suggesting the rejected approach in turn 47.

Summarization is most useful when cost is the dominant constraint and the important decisions are already reflected in the code itself. It’s least useful in the middle of a nuanced debugging session where the chain of what was tried and ruled out actively shapes the next step.

There’s no manual trigger for summarization in Aider — it activates automatically based on context length. The way to influence it is to manage history before it gets long enough that auto-summarization triggers, so the choice of what gets compressed is yours (via /clear or session restart) rather than delegated to the model. A model-generated summary of your debugging session is better than hitting the context ceiling, but it’s worse than a deliberate context reset at a logical boundary.

Restarting from a commit

The cleanest way to reset context isn’t a command — it’s ending the session.

When a chunk of work is complete and committed, the new code is the ground truth. A fresh Aider session with that commit as the baseline gives the model the correct context without any accumulated conversation overhead. The code already holds the answers that history was carrying.

The workflow:

# in Aider
/commit                     # commits current working changes
/quit                       # exit Aider

# in terminal
git log --oneline -3        # verify the commit landed cleanly
aider src/feature-b.ts      # start fresh for the next task

The fresh session carries zero history tokens. The model reads the committed code and starts from a clean slate. For multi-hour sessions, the pattern of short session → commit → short session → commit is consistently cheaper than one long session with periodic /clear calls.

It also produces a cleaner git history. A session that runs for 80 turns typically produces a messy sequence of intermediate commits — some labeled “wip”, some unlabeled, some bundling unrelated changes together. Three sessions of 25–30 turns each, each starting from a clear intent, tend to produce three coherent and reviewable commits.
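The 80-turns-versus-three-sessions comparison can be put in numbers with the same illustrative growth assumptions as earlier (2,000-token baseline, ~400 tokens of history per turn, history reset to zero at each session boundary):

```python
BASE = 2_000      # per-turn baseline: system prompt + repo map + files
GROWTH = 400      # history added per turn (assumption)

def session_tokens(turns: int) -> int:
    """Cumulative input tokens for one session starting from empty history."""
    return sum(BASE + (t - 1) * GROWTH for t in range(1, turns + 1))

one_long = session_tokens(80)          # one 80-turn session
three_short = 3 * session_tokens(27)   # same work split across three sessions
print(one_long, three_short)           # 1424000 vs 583200
```

Under these assumptions the split sessions cost less than half as much, because each restart discards the quadratic history term and restarts it from zero.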

The mental model: each commit is a checkpoint. Starting a new session from a checkpoint isn’t admitting defeat on the previous session — it’s deploying the progress from that session as the new baseline.

One thing the restart approach enables that /clear doesn’t: model switching. Starting a fresh session lets you pass a different --model flag. If the complex architectural reasoning in the previous session established a clear structure, the next session — which just needs to fill in the implementation — might not need the same model. The restart is also a natural point to reconsider whether the model choice is right for the remaining work.

The anti-pattern: one session that should have been three

Here’s how a session grows past the point where any cleanup command fully helps.

A session starts with a well-scoped task: add a new API endpoint. The endpoint needs a new database model, a migration, a service layer, and a controller. That’s perhaps 20 turns, all related, all necessary context for each other.

Midway through, a bug surfaces in the auth middleware. The auth code gets added to the session, 8 turns of debugging follow. The bug is fixed but not committed separately — it’s bundled into the same working tree because the endpoint work isn’t done yet.

Then, while testing the endpoint, a performance issue appears in a query the endpoint relies on. That gets addressed in another 15 turns. The session is now at 43 turns. The history includes the auth debugging (resolved), the query optimization (resolved), and the endpoint work (in progress). Everything is entangled.

The history costs keep compounding. Each turn on the endpoint work drags along the auth debugging and performance history, which add no value — those tasks are done and the results are in the code. But /clear would also lose the endpoint context that’s still needed. There’s no clean trim available.

The point where this should have been split was obvious in retrospect: when the auth bug surfaced, the right move was to /commit the in-progress endpoint work as a WIP commit, start a fresh session for the auth fix, commit that separately, then return to the endpoint session. Three sessions with clean conceptual boundaries, each paying only for its own history. The total token cost would have been lower, and the git history would have been readable.

The signal to split isn’t “this session feels long.” It’s “a new task surfaced that’s conceptually separate from what I’m working on.” The auth bug was a separate task the moment it was identified. Treating it as such — with a separate session — is the discipline that keeps long-form Aider work affordable.

A secondary cost worth naming: entangled sessions produce entangled commits. Reviewing a PR where one commit fixes a middleware bug, optimizes a query, and adds a new endpoint is harder than reviewing three focused commits. The cost isn’t just monetary — it’s review friction and rollback risk. If the endpoint feature turns out to need changes, the clean separation makes it easy to revert or modify only that work without touching the auth fix.

This is the reason the “should have been three sessions” pattern is worth correcting even when the total cost difference is modest. The session structure maps to the work structure. Keeping them aligned produces better output on both dimensions.

Reading the /tokens output

Aider’s /tokens command shows a per-category breakdown before costs spiral:

Tokens: 2,156 sent, 0 received
  System: 1,204
  Chat history: 682
  Files: 270

A useful heuristic: chat history below 30–40% of total tokens is healthy. When it climbs past 50%, the conversation log has become the dominant cost driver and a reset of some kind is warranted — either /clear for history, /drop for files, or a session restart.
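The heuristic is easy to apply mechanically. A sketch, using a dict that mirrors the example breakdown above (the threshold values are the article’s rule of thumb, not anything Aider enforces):

```python
def history_share(breakdown: dict[str, int]) -> float:
    """Fraction of total context tokens taken up by chat history."""
    return breakdown["chat_history"] / sum(breakdown.values())

# Mirrors the /tokens example output above.
snapshot = {"system": 1_204, "chat_history": 682, "files": 270}

share = history_share(snapshot)
print(f"history is {share:.0%} of context")   # ~32% here: still healthy
if share > 0.5:
    print("history dominates -- /clear, /drop files, or restart the session")
```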

The system prompt and repo map are fixed overhead. Files are controlled by what you’ve added to the session. History is the one variable that compounds automatically with time. Checking /tokens every 10–15 turns takes a few seconds and makes the cost distribution legible before it becomes a problem.

What to look for specifically:

  • History approaching or exceeding Files: files are doing work; history is a record of past work. If history is larger than the active file context, the session is paying for overhead that no longer drives quality.
  • Total tokens above 15k–20k on a task that doesn’t require it: not every task needs large context. A refactor of a single function should stay under 5k total. If it’s at 18k, something accumulated that doesn’t need to be there.
  • Slow Aider responses: the model processes the entire input on every turn. A bloated context shows up as noticeably slower responses before the cost shows up in the bill. If Aider feels sluggish, check /tokens before assuming it’s a network or API issue.

The practical insight: Aider costs aren’t proportional to the number of turns. They’re proportional to the cumulative token area under the curve as history grows. A session of 50 turns with active history management can cost less than a session of 20 turns that was never pruned. The difference is visible in /tokens, turn by turn, if you look.
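The 50-turns-cheaper-than-20 claim holds whenever history grows fast relative to the fixed baseline. A sketch under one such set of assumptions — a chatty session adding ~1,000 tokens of history per turn on a 2,000-token baseline, with /clear modeled as resetting history every five turns (both numbers illustrative):

```python
BASE = 2_000       # per-turn baseline: system prompt + repo map + files
GROWTH = 1_000     # history added per turn (assumption: chatty session)

def unpruned(turns: int) -> int:
    """Total input tokens when history is never cleared."""
    return sum(BASE + (t - 1) * GROWTH for t in range(1, turns + 1))

def pruned(turns: int, clear_every: int) -> int:
    """Total input tokens when /clear resets history every N turns."""
    return sum(BASE + ((t - 1) % clear_every) * GROWTH
               for t in range(1, turns + 1))

print(pruned(50, 5))    # 200000 -- 50 managed turns
print(unpruned(20))     # 230000 -- 20 unmanaged turns cost more
```

The managed session does two and a half times the work for less money, which is the "area under the curve" point in concrete form.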