Cursor's BugBot for PR review: where it earns its $40/month and where it doesn't
Published 2026-04-06 by Owner
Cursor launched BugBot in early 2025 — an automatic PR reviewer that posts inline comments on bugs it finds. It’s $40/month per active developer on top of Cursor Pro. Whether it’s worth it depends on your team’s existing review culture and what classes of bugs slip through your other gates.
I’ve been running BugBot on a small team’s repo for two months. Here’s the assessment.
What BugBot does
For each PR, BugBot:
- Reads the diff
- Reads the surrounding code in changed files
- Reads related files via the codebase index
- Posts inline comments where it suspects bugs
- Marks the PR with a high-level summary of confidence
It does not comment on style, naming, or patterns. The marketing claim is “real bugs only.” This is mostly true — I’ve seen one style comment in two months, and it was actually a real null-safety issue that I’d misclassified initially.
What kinds of bugs it catches
Across about 80 PRs:
Off-by-one errors. Caught reliably. The classic for (let i = 0; i < arr.length - 1; i++) when the loop is meant to cover every element (i < arr.length), silently skipping the last one. BugBot pointed out three real ones across the period.
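For concreteness, a minimal sketch of the shape, with made-up names:

```ts
// Intent: sum every element. The stray "- 1" silently drops the last one.
function sumAll(values: number[]): number {
  let total = 0;
  for (let i = 0; i < values.length - 1; i++) { // BUG: skips the final element
    total += values[i];
  }
  return total;
}
// Correct bound: i < values.length.
```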
Missing null/undefined checks. Caught reliably. Especially good at noticing when a function’s return type allows undefined and the caller doesn’t handle it. About 12 real catches.
Race conditions in async code. Caught about 60% of the time. Notable: BugBot once flagged a useEffect dependency that would cause an infinite loop under specific input conditions. That one was a real catch and would have been a 4am incident.
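I can’t share the real diff, but the general shape looked roughly like this (hypothetical component; fetchResults stands in for the real call):

```tsx
import { useEffect, useState } from 'react';

// Hypothetical API: returns the next page of results for a query.
declare function fetchResults(query: string, offset: number): Promise<string[]>;

function Results({ query }: { query: string }) {
  const [items, setItems] = useState<string[]>([]);

  useEffect(() => {
    fetchResults(query, items.length).then(page => {
      setItems(prev => [...prev, ...page]);
    });
    // BUG: the effect depends on `items` and also updates it, so any
    // non-empty page re-triggers the effect: an infinite fetch loop
    // exactly when the query has results.
  }, [query, items]);

  return <ul>{items.map(item => <li key={item}>{item}</li>)}</ul>;
}
```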
Wrong error handling. Caught reliably for thrown exceptions; less reliable for callback-style or signal-based error paths. About 8 real catches.
SQL injection patterns. Caught reliably for direct string interpolation; missed when interpolation went through a helper function. About 4 catches.
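The distinction, sketched with made-up table and helper names (using pg for illustration):

```ts
import { Pool } from 'pg';

const pool = new Pool();

// Direct interpolation: this shape got flagged reliably.
async function findDirect(name: string) {
  return pool.query(`SELECT * FROM users WHERE name = '${name}'`); // injectable
}

// Same bug, one hop away: the interpolation lives inside a helper.
// This is the shape that slipped through.
function byNameQuery(name: string): string {
  return `SELECT * FROM users WHERE name = '${name}'`; // still injectable
}

async function findViaHelper(name: string) {
  return pool.query(byNameQuery(name));
}

// The safe version parameterizes.
async function findSafe(name: string) {
  return pool.query('SELECT * FROM users WHERE name = $1', [name]);
}
```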
Missing branches in switch statements. Caught when the type system has explicit unions; less reliable when the cases are runtime values. About 5 catches.
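The contrast, sketched with a hypothetical payment type:

```ts
// With an explicit union, a missing branch is structurally visible
// (the shape that got caught). A never guard makes tsc enforce it too.
type PaymentState = 'pending' | 'settled' | 'refunded';

function describe(state: PaymentState): string {
  switch (state) {
    case 'pending': return 'awaiting settlement';
    case 'settled': return 'settled';
    case 'refunded': return 'refunded';
    default: {
      const unhandled: never = state; // compile error if a case goes missing
      throw new Error(`unhandled state: ${unhandled}`);
    }
  }
}

// When the cases are runtime values (say, strings from a config file),
// there is no union to check against, and misses were more common.
function describeRaw(state: string): string {
  switch (state) {
    case 'pending': return 'awaiting settlement';
    case 'settled': return 'settled';
    // 'refunded' is missing; nothing structural marks this as incomplete
    default: return 'unknown';
  }
}
```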
What it misses
Logic bugs that look correct. Code that does the wrong thing for the right syntactic reason. BugBot’s review is structural; it doesn’t understand the intended behavior beyond what’s written in the comments.
Concurrency bugs across services. A race between two services accessing the same database row. BugBot sees one service at a time; the cross-service race is invisible.
Performance regressions. A change that adds an N+1 query, doubles a hot path’s allocations, or accidentally turns an O(n) operation into O(n²). BugBot doesn’t reason about complexity and caught none of these.
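To make the N+1 shape concrete (hypothetical helpers; every line is locally fine, which is why a structural review sees nothing wrong):

```ts
// Hypothetical data-access helpers for illustration.
declare function getOrders(userId: string): Promise<{ id: string }[]>;
declare function getOrderItems(orderId: string): Promise<unknown[]>;
declare function getItemsForOrders(orderIds: string[]): Promise<unknown[][]>;

// One query becomes 1 + N queries as soon as the loop lands in a hot path.
async function loadOrderItems(userId: string) {
  const orders = await getOrders(userId);        // 1 query
  const items: unknown[][] = [];
  for (const order of orders) {
    items.push(await getOrderItems(order.id));   // N queries: the N+1
  }
  return items;
}

// The batched version issues 2 queries regardless of order count.
async function loadOrderItemsBatched(userId: string) {
  const orders = await getOrders(userId);
  return getItemsForOrders(orders.map(o => o.id));
}
```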
Business rule violations. “This change allows a user to set their balance to negative.” If the type system says number, BugBot doesn’t know your business has a “balance must be non-negative” rule unless it’s encoded in code.
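The flip side: if you do encode the rule, structural review has something to see. A sketch of one way to do that in TypeScript, with hypothetical names:

```ts
// A branded type plus a single guarded constructor puts the
// "balance must be non-negative" rule into the code itself.
type NonNegativeBalance = number & { readonly __brand: 'NonNegativeBalance' };

function toBalance(value: number): NonNegativeBalance {
  if (value < 0) {
    throw new RangeError(`balance must be non-negative, got ${value}`);
  }
  return value as NonNegativeBalance;
}

// Every path that produces a balance now has to pass the guard, so
// "sets balance to a raw, possibly negative number" becomes a visible
// structural mismatch instead of an unwritten business rule.
function setBalance(account: { balance: NonNegativeBalance }, value: number) {
  account.balance = toBalance(value);
}
```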
Test coverage gaps. BugBot doesn’t comment on missing tests. If a change adds new code paths and doesn’t add tests for them, BugBot is silent.
False positive rate
About 40% of BugBot’s comments don’t represent real bugs. The breakdown:
- ~15% are technically correct but pointing at intentional behavior
- ~15% are subtly wrong understanding of the code’s actual flow
- ~10% are correct in isolation but moot, because a higher-level invariant the model doesn’t know about makes the flagged path unreachable
This is high. It’s not “ignore everything BugBot says” high — the real catches are valuable enough that I read every comment — but it’s high enough that you can’t merge based purely on “BugBot is happy.” You still need a human reviewer.
Comparison to manual review
A serious human reviewer catches more bugs than BugBot. The catch rate isn’t even close. But:
- Human reviewers don’t review every PR. BugBot does.
- Human reviewers catch the bugs they’re predisposed to look for. BugBot is more uniform.
- Human reviewers’ attention varies with deadlines and fatigue. BugBot puts the same effort into every PR.
The right comparison isn’t “BugBot vs the best human reviewer.” It’s “BugBot vs no review on the PRs that don’t get human attention.” For teams where some PRs get cursory review, BugBot is filling a real gap.
For teams with rigorous review on every PR, BugBot’s marginal value is smaller.
The $40/month question
For me, the cost analysis comes out to:
- Real bugs caught per developer per month: ~3
- Time saved per real bug caught: ~30 minutes (catching post-merge would mean revert + redo + maybe customer impact)
- Total time saved per developer per month: ~90 minutes
- Time spent reading false positives: ~30 minutes
Net time saved: ~60 minutes per developer per month.
Sixty minutes is an hour, so at $40/month per developer the tool breaks even once fully-loaded developer cost reaches about $40/hour. For most teams, the math works.
The math doesn’t work for:
- Teams of 1-2 where every PR gets careful review anyway
- Teams that are extremely disciplined about CI gates and have low post-merge regression rates
- Teams where the bugs that ship aren’t the kind BugBot catches (more likely on infrastructure or distributed systems work)
What I’d configure differently
Out of the box, BugBot reviews every PR. For our team, we’d cut the noise by:
- Skipping draft PRs (BugBot reviews them, comments arrive, you push more, they all become stale)
- Skipping documentation-only PRs (rarely have bugs, lots of false positives on prose)
- Skipping config-only PRs (unfamiliar territory for the model)
These filters aren’t built in. You can mostly approximate them by ignoring BugBot’s comments on draft and docs PRs, but the noise still sits in the GitHub UI.
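If that bothers you enough, you can sweep it with a script. A rough sketch using Octokit: it assumes BugBot’s comments post from a dedicated bot login (‘cursor’ below is a placeholder; check what the login actually is in your repo) and collapses that login’s review comments on draft PRs:

```ts
import { Octokit } from 'octokit';

const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });
const owner = 'your-org';    // placeholder
const repo = 'your-repo';    // placeholder
const BOT_LOGIN = 'cursor';  // ASSUMPTION: verify the actual bot login in your repo

// Minimize the bot's review comments on draft PRs so they stop
// cluttering the GitHub UI (they stay visible behind a click).
async function minimizeBotCommentsOnDrafts() {
  const { data: prs } = await octokit.rest.pulls.list({ owner, repo, state: 'open' });
  for (const pr of prs.filter(p => p.draft)) {
    const { data: comments } = await octokit.rest.pulls.listReviewComments({
      owner,
      repo,
      pull_number: pr.number,
    });
    for (const comment of comments) {
      if (comment.user?.login === BOT_LOGIN) {
        // GraphQL is the only API surface that can minimize a comment.
        await octokit.graphql(
          `mutation($id: ID!) {
            minimizeComment(input: { subjectId: $id, classifier: OUTDATED }) {
              minimizedComment { isMinimized }
            }
          }`,
          { id: comment.node_id },
        );
      }
    }
  }
}

minimizeBotCommentsOnDrafts().catch(console.error);
```

You’d run this on a schedule or from a pull_request webhook; it leaves non-draft PRs untouched.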
A specific BugBot comment I’m glad we got
A real example, sanitized. A junior engineer added a feature that included this code:
```ts
const recentLogins = await getRecentLogins(userId);
const lastLogin = recentLogins[0];
trackEvent('user.return', {
  userId,
  daysSinceLastLogin: differenceInDays(new Date(), lastLogin.createdAt)
});
```
BugBot’s comment: “If the user has no recent logins, recentLogins[0] is undefined and accessing .createdAt will throw. Consider checking length first.”
This was a real bug. The function returned an empty array for first-time users. The engineer hadn’t thought about it. The reviewing engineer hadn’t either — they were focused on a different concern.
This bug would have shipped, fired only on first-time users, and produced a confusing analytics gap for a few weeks. Or longer. BugBot caught it before merge.
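The fix is the guard BugBot suggested; one way it could have looked:

```ts
const recentLogins = await getRecentLogins(userId);
if (recentLogins.length > 0) {
  const lastLogin = recentLogins[0];
  trackEvent('user.return', {
    userId,
    daysSinceLastLogin: differenceInDays(new Date(), lastLogin.createdAt)
  });
}
```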
That single catch represents about 2 hours of saved investigation time, plus the customer experience cost of users hitting a 500. At $40/month, the tool earned its keep for that PR alone.
The other catches are gravy.