Tinker AI

I used AI to generate my git commit messages for about six months. Cursor’s “Generate Commit Message” feature, JetBrains AI Assistant’s equivalent, custom Claude prompts. The output was always grammatical, usually accurate to the diff, and consistently the wrong commit message.

It took me a while to figure out why. The output described what changed. Commit messages should describe why the change was made.

Once I noticed, I went back to writing my own. The change cost me about 30 seconds per commit and made my git history materially more useful. This is what I learned, and why I think most AI commit message tools are subtly miscalibrated.

The two things a commit message can do

Every commit message is doing one or both of:

  1. Describing what changed — the visible-in-the-diff modifications
  2. Describing why it changed — the reason the change was made

The first is mechanical. The second requires knowing things that aren’t in the diff: what the bug was that this fixes, what feature this enables, what tradeoff was made and why this side of the tradeoff was chosen.

AI tools, by their nature, can only see the diff. They write the first kind of commit message. They cannot write the second kind, except by inferring from naming and patterns, which is unreliable.

What AI commit messages typically look like

A real example from my old workflow. The diff: a change to a validateEmail function to allow plus-addressing (user+tag@example.com).

Cursor’s generated commit message:

Update validateEmail to allow + characters in local part

This is technically accurate. It describes the diff. It tells me literally nothing useful that I couldn’t get from git show.

The commit message I’d write by hand:

Allow plus-addressing in email validation

Customers were reporting that signups failed when they used + in their 
email addresses. This is valid per RFC 5321 and used widely (especially 
gmail filtering). The previous validator rejected these addresses, so 
those signups failed silently.

Tested with: user+tag@example.com, user+@example.com, user+a+b@example.com

The hand-written version answers questions the AI version doesn’t:

  • Why was this change made? (Customer reports of failed signups)
  • Was this an oversight or a deliberate choice that’s being revisited? (The validator was wrong; this fixes it)
  • What’s the failure mode that prompted the change? (Silent signup failure)
  • What was tested? (Specific edge cases)

Six months later, when someone is git-blaming this line to figure out whether they can change the validator further, the AI version is useless and the hand version contains the relevant context.
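For concreteness, the fix being discussed is small. The post never shows the validator itself, so this is a hypothetical sketch of what a plus-friendly check might look like in Python. The pattern is a deliberately loose approximation, not full RFC 5321 validation:

```python
import re

# Hypothetical sketch of the validator change discussed above; the
# original code isn't shown in the post. The "+" in the character
# class is the one-character fix that allows plus-addressing.
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

def validate_email(address: str) -> bool:
    """Return True if the address passes a basic plus-friendly check."""
    return EMAIL_RE.fullmatch(address) is not None
```

The point stands either way: nothing in a diff like this tells a future reader about the failed signups that motivated it.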

The asymmetry

Here’s the asymmetry that makes this consistently miscalibrated:

At commit time, an AI message is fine. It describes what you just did. You don’t need it to tell you anything; you remember the change. The message is just a label.

At read time, six months or two years later, you don’t remember the change. You’re reading the commit message because you need to know why the change was made. The AI message is useless to you because the AI couldn’t have known why.

The AI commit message tools optimize for write-time convenience and underweight read-time utility. The cost is invisible at write time. It’s expensive at read time, distributed across every developer who later needs to understand the codebase.

The “what” is in the diff anyway

The thing AI commit messages provide — a description of what changed — is information you can get from git show directly. The message is redundant.

A useful commit message provides information that isn’t in the diff:

  • The motivating context
  • The decision that was made
  • The alternatives considered
  • The tests that verify the change
  • The risks if this needs to be reverted

None of these is in the diff. None of these can be inferred by an AI looking at the diff. All of these are useful at read-time.

What AI is actually good at for commits

A few specific commit-related tasks where AI does help:

Formatting and structure. Once you’ve written your message, AI can format it consistently — wrap at 72 characters, reorder body paragraphs, add appropriate footers. This is mechanical and AI does it well.
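The re-wrapping part of this is mechanical enough that it barely needs a model at all. As a sketch, the conventional 72-column wrap can be done with Python’s standard library (wrap_commit_body is a hypothetical helper name, not a tool from the post):

```python
import textwrap

def wrap_commit_body(message: str, width: int = 72) -> str:
    """Re-wrap a commit message at the conventional 72 columns.

    The subject (first line) is left as written; each body paragraph
    is re-filled to the given width.
    """
    subject, _, body = message.partition("\n")
    paragraphs = [
        textwrap.fill(p, width=width)  # collapses internal whitespace, then wraps
        for p in body.split("\n\n")
        if p.strip()
    ]
    return "\n\n".join([subject.strip()] + paragraphs)
```

Handing this kind of step to AI (or a script) is fine precisely because no judgment about the change’s intent is involved.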

Catching missing information. A useful prompt: “Look at this diff and my commit message. What information that would be useful for future readers is missing?” The AI sometimes notices that you didn’t mention which tests were updated, didn’t reference the issue this fixes, didn’t mention the breaking change. This is the AI augmenting your message, not replacing it.
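A minimal sketch of that augmentation step, with the model call itself left out since it depends on whichever API or CLI you use (review_prompt is a hypothetical helper, not a tool the post names):

```python
def review_prompt(diff: str, draft_message: str) -> str:
    """Assemble the 'what's missing?' prompt described above.

    Sending the result to a model is deliberately omitted here;
    this only builds the text you would send.
    """
    return (
        "Look at this diff and my commit message. What information "
        "that would be useful for future readers is missing?\n\n"
        f"--- diff ---\n{diff}\n\n"
        f"--- commit message ---\n{draft_message}"
    )
```

The draft message, with its why, comes from you; the model only gets to critique it.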

Multilingual translation. If your team commits in English but you’re more comfortable thinking in another language, AI translation of your written message preserves the why and just translates the language. This is fine.

Drafting for very mechanical commits. Bumping a dependency version, formatting changes, lint-only fixes. These genuinely don’t have a “why” beyond “we needed to,” and AI’s mechanical message is sufficient.

The pattern: AI is good for mechanical commits and useful for augmenting human-written messages. It’s bad as a default for commits that have meaningful intent behind them.

The 30-second cost

The actual cost of writing my own commit messages is small. After six months of practice, my commit message writing is fast:

  • Read my own diff (5 seconds — I just wrote it)
  • Think about why I’m committing this (10 seconds)
  • Type the message (15-30 seconds)

That’s 30-45 seconds per commit. For someone committing 5-10 times a day, that’s roughly 3-8 minutes daily. Not free, but small.

The benefit of those minutes: a git history I can actually use. Six months from now, when I’m trying to remember why I changed the email validator, I can read my own message and remember. With AI messages, I’d read the message, learn nothing useful, and have to dig further.

What changed for me

Three things shifted after I went back to hand-written commit messages:

My commits got smaller. When I have to write the why, I notice when “the why” covers two unrelated things, and I split the commit. AI messages didn’t surface this — they could mush several reasons into one description without me noticing.

My PR descriptions got better. A PR description is essentially “all the commit messages, restructured.” Hand-written commit messages give me better raw material for PR descriptions. The PR-writing time dropped because I wasn’t reconstructing the rationale from scratch.

My code review felt more grounded. Reviewing a PR with thoughtful commit messages is faster than reviewing one with AI-generated descriptions. The reviewer can read the commit messages and know what each commit was supposed to do, then verify the diff achieves it. With AI messages, the reviewer has to figure out the intent from the diff alone.

What I’d recommend

For most engineers, my recommendation now:

  • Don’t use AI to generate commit messages from diffs. The output is technically accurate but not useful.
  • Do use AI to format or augment commit messages you’ve drafted. Mechanical work is fine.
  • For mechanical commits (deps bumps, formatting), AI messages are OK. These don’t have meaningful intent.
  • For real commits, write the message yourself. It takes 30 seconds and produces materially better output.

The exception is teams that don’t read their own commit messages anyway. Some teams use only PR-level descriptions and don’t read individual commit messages much. For those teams, AI commit messages are fine because nobody’s going to read them. The waste isn’t visible because the artifact isn’t used.

For teams that do git blame, do git log archaeology, do post-incident commit-history reviews, the small cost of hand-written commit messages pays for itself many times over. The discipline isn’t visible day-to-day; it shows up in the team’s ability to navigate its own history.

That ability matters more than the 30 seconds saved per commit. It’s a small lever with compounding returns. AI tools optimize for the wrong side of the trade-off here, and after six months I noticed.