Q2 2026 · Issue 2
Stack Quarterly: quarterly deep dives on the tools real teams actually ship with.

Claude Code vs Cursor vs Copilot Workspace — Q2 2026

Three coding agents, three different bets on where the IDE is going. What each ships, where each fails, and a team-size matrix that we actually use when somebody asks.

I get the same email about once a week. It comes from a friend-of-a-friend who runs engineering at a small company, and it always reads the same way: “We are about to standardize on a coding agent. The exec team has heard about Cursor, the engineers keep mentioning Claude Code, and our GitHub rep just demoed Copilot Workspace. What would you pick?” The answer is longer than the email deserves, and I have written it out ad hoc enough times that I am putting it in a piece I can link to.

This is not a benchmark shootout. The benchmarks that exist are mostly self-reported, mostly run on SWE-bench Verified, and mostly ignore the parts of the developer workflow where the agents differ most. This is the practitioner-side comparison: what each tool actually ships, where each one falls over, what the adoption signals look like in mid-2026, and a recommendation matrix that I have personally been handing out.

The shape of the three bets

Each of the three tools is making a different bet about where coding agents belong in the developer’s life. The bets are not interchangeable. Picking the wrong one for your team is not a small mistake; it is the kind of mistake that produces an eighteen-month change-management project.

Claude Code is the terminal-first bet. The tool runs as a CLI on your machine, sits inside whatever shell and editor you already use, and reads from the codebase via the filesystem rather than via an indexed semantic layer. The interface is a conversation with an agent that has hands — it can read files, edit files, run shell commands, run tests, commit to git, and call external tools via MCP. The mental model is that the agent is a junior pair-programmer who sits at your terminal. You give it a task. It works. You review. The bet is that the terminal is where the serious work happens and the IDE is a presentation layer.

Cursor is the IDE-first bet. The tool ships as a forked VS Code with deep integrations: tab-tab-tab autocomplete (the feature that originally got the company famous), an agent-mode panel that can edit multiple files in a single pass, a per-repo semantic index that keeps the model’s context grounded, and a Background Agent feature that runs longer tasks asynchronously on Cursor’s infrastructure. The mental model is that the IDE is where engineers live and the agent should be a first-class panel in it. The bet is that the developer’s editor is the load-bearing surface, and that the company that owns the editor wins.

GitHub Copilot Workspace is the platform-first bet. The tool is an evolution of Copilot — the original suggestion-on-tab tool that defined the category — into a system that owns a piece of the GitHub workflow itself. You file an issue, hand it to Workspace, and the agent produces a plan, executes it across files, and surfaces a PR for human review. Workspace is integrated with the same per-repo semantic index Copilot has used for two years, with GitHub Actions, with branch protection, and with the rest of the platform. The mental model is that GitHub is the operating system of software production and the agent should be a feature of that OS, not a separate tool. The bet is that the platform — not the editor, not the terminal — is the surface that wins.

Three plausible bets. They are not equally well-suited to every team.

What the adoption numbers actually look like in 2026

A short interlude with the numbers, because the numbers are useful when they are not made up.

Claude Code’s revenue is the one to start with. Anthropic disclosed in early 2026 that Claude Code had reached a $2.5B revenue run-rate, making it the company’s fastest-growing product and one of the fastest-growing developer products of any vintage. The same disclosure cycle attributed roughly 4% of all public GitHub commits to Claude Code by early 2026, with daily active usage doubling month-over-month through the back half of 2025 (Lab7AI, on Claude Code’s 2026 trajectory). Boris Cherny, the head of the team that built it, has reportedly not edited code by hand since November 2025, and internally at Anthropic the majority of new code is written by Claude Code itself. The customer numbers point the same way: Ramp cut incident-investigation time by 80%; Wiz migrated a fifty-thousand-line Python library to Go in roughly twenty hours; Rakuten cut feature-delivery cycles from twenty-four working days to five.

Cursor’s number is the one most people quote in coffee-meeting conversations. The company reported a revenue trajectory from $100M ARR in January 2025 to $2B ARR by February 2026 — a twenty-times run in thirteen months (tech-insider.org on Cursor’s valuation arc). The November 2025 Series D closed at $29.3B post-money, led by Accel and Coatue with Nvidia and Google on the cap table (CNBC, 2025-11-13). As of April 2026 the company was in talks to raise a Series E at a $50B pre-money with a16z and Thrive returning. The valuation is doing two things at once: pricing the seat-based subscription and pricing the implied platform lock-in. Both numbers are aggressive.
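For a sense of what that trajectory implies month to month, here is a minimal sketch of the arithmetic. It uses only the two ARR figures quoted above and assumes smooth compounding between them, which real revenue never does; the function name is mine.

```python
def implied_monthly_growth(start_arr: float, end_arr: float, months: int) -> float:
    """Compound monthly growth rate that turns start_arr into end_arr."""
    return (end_arr / start_arr) ** (1 / months) - 1


# Figures from the paragraph above: $100M ARR (Jan 2025) to $2B ARR (Feb 2026).
start_arr = 100e6
end_arr = 2e9
months = 13

multiple = end_arr / start_arr
rate = implied_monthly_growth(start_arr, end_arr, months)
print(f"{multiple:.0f}x over {months} months")
print(f"implied compound growth: {rate:.1%} per month")
```

Under that smoothing assumption the run works out to roughly 26% growth every month for thirteen months straight.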

Copilot Workspace’s headline is technical. In a public March 2025 evaluation, Copilot Workspace scored 55% on SWE-bench Verified, the highest among commercial coding tools at that snapshot, ahead of Cursor (48%), Aider (42%), and direct Claude (37%) (DevOpsBoys benchmark recap, 2026). SWE-bench is not the whole story of a coding agent (it measures bug-fix throughput on real GitHub issues, which is one slice of the job), but Workspace was the first commercial tool to clear the fifty-percent line on the verified split. Pricing sits at $19/user/month for Copilot Business with Workspace bundled in. Adoption is harder to pin down because Microsoft does not break out Workspace from the rest of the Copilot family, but a UCSD/Cornell developer survey in January 2026 found Claude Code at 58 respondents, Copilot at 53, and Cursor at 51: a top-three race within margin of error (VentureBeat on the developer survey).

The numbers tell you what you already half-knew. All three tools are real. None of them is going away in the next two years. The question is not which one is the right tool; the question is which one is the right tool for your team.

Where each one quietly fails

This is the section nobody puts in the vendor demo, so it goes here.

Claude Code’s failure mode is cost unpredictability. The agent does not give you a quote before it runs. A task that needs two minutes of model time might cost a dollar; a task that needs forty minutes and seven self-corrections might cost forty dollars; and the developer who launched the agent gets no warning either way. Teams that ship on Claude Code learn quickly to scope tasks to “one PR’s worth” and to watch the cost telemetry. The smarter teams set a cap on per-session token spend at the harness level and accept the occasional task that needs to be re-run. The dumber teams find out at the end of the month when the bill arrives.
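A harness-level cap can be as small as a running total that refuses to continue. The sketch below is one way to do it, not how any particular team or tool does it: SessionBudget and BudgetExceeded are hypothetical names, the per-million-token rates are placeholders rather than real pricing, and it assumes your harness can see input and output token counts for each model call.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a session's estimated spend passes its cap."""


@dataclass
class SessionBudget:
    cap_usd: float
    # Placeholder rates in USD per million tokens; substitute your real pricing.
    input_usd_per_mtok: float = 3.0
    output_usd_per_mtok: float = 15.0
    spent_usd: float = 0.0

    def record(self, input_tokens: int, output_tokens: int) -> float:
        """Add one model call's usage; raise if the cap is now exceeded."""
        self.spent_usd += (
            input_tokens * self.input_usd_per_mtok
            + output_tokens * self.output_usd_per_mtok
        ) / 1_000_000
        if self.spent_usd > self.cap_usd:
            raise BudgetExceeded(
                f"session spend ${self.spent_usd:.2f} exceeds cap ${self.cap_usd:.2f}"
            )
        return self.spent_usd


budget = SessionBudget(cap_usd=5.00)
budget.record(input_tokens=200_000, output_tokens=20_000)  # well under the cap
try:
    budget.record(input_tokens=1_500_000, output_tokens=50_000)  # pushes past it
except BudgetExceeded as exc:
    print(f"stopping session: {exc}")
```

The cap fires after the call that crosses it, so the overshoot is bounded by one call's cost. That is the trade named above: an occasional task gets cut off and has to be re-run with a tighter scope.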

The second Claude Code failure mode is filesystem ambiguity at scale. Because the agent reads the filesystem rather than a semantic index, it has to decide what to read. In a small repo, this is fine. In a large monorepo with five hundred packages, the agent will either spend tokens hunting through directories or — worse — it will miss the relevant file entirely and produce a confidently wrong patch. The mitigation is to give the agent a CLAUDE.md with a project map. The fact that you have to is a tell.
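One low-effort way to produce that project map is to generate a first draft from the repo itself and then edit it by hand. The sketch below assumes a layout that may not match yours: every package lives in its own directory under a single folder, and each has a README.md whose first non-empty line describes it. The function name build_project_map is mine.

```python
from pathlib import Path


def build_project_map(packages_dir: Path) -> str:
    """Return a markdown section listing each package with a one-line summary."""
    lines = ["## Project map", ""]
    for pkg in sorted(p for p in packages_dir.iterdir() if p.is_dir()):
        readme = pkg / "README.md"
        summary = "(no README)"
        if readme.is_file():
            # Use the first non-empty README line, minus any heading marks.
            for raw in readme.read_text(encoding="utf-8").splitlines():
                if raw.strip():
                    summary = raw.strip().lstrip("# ").strip()
                    break
        lines.append(f"- `{packages_dir.name}/{pkg.name}/`: {summary}")
    return "\n".join(lines) + "\n"
```

Calling build_project_map(Path("packages")) returns text you can paste into CLAUDE.md. A generated list is only a starting point; the notes worth having are the ones a script cannot infer, such as which packages are deprecated and where the entry points are.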

Cursor’s failure mode is the subscription-pricing-game problem. The pricing model has been adjusted enough times in the last eighteen months that practitioner Twitter has a folklore around it. The 2025 shift from Cursor Pro to Cursor Business introduced usage caps that some teams hit in the first week. The cap-and-overage model is rational from a vendor perspective and frustrating from a buyer perspective. Teams that depend on Cursor for daily work need to budget for the actual usage, not the listed seat price, and need to keep an eye on policy changes. This is a tax that scales with how successful Cursor is, which is currently a lot.

The second Cursor failure mode is multi-repo context leak. The per-repo semantic index is excellent within a repo and approximately useless across repos. A team whose product spans three repositories (say, a frontend, a backend, and a shared types package) will find that Cursor’s agent-mode performance degrades as soon as the work crosses a repo boundary. The workaround is to open a single workspace that contains all three repos, which works but is a band-aid. We expect this to improve. We have been expecting it to improve for nine months.

Copilot Workspace’s failure mode is platform lock-in. Workspace’s strengths are exactly its dependencies. It works because it sits on top of GitHub issues, GitHub Actions, GitHub branch protection, and the rest of the GitHub graph. The moment your team is hosted on GitLab, Bitbucket, or a self-hosted Forgejo, Workspace becomes either unavailable or a degraded version of itself. The tradeoff is honest — Microsoft is not pretending Workspace is portable — but it deserves to be said out loud before a team commits.

The second Workspace failure mode is the planning-step-as-bottleneck pattern. Workspace’s standard flow is plan-then-execute; the agent produces a structured plan, the developer reviews it, then the execution runs. The plan is reviewable, which is genuinely useful. But the plan is also slow to produce on complex tasks, and on simple tasks the plan step is overhead. Developers who use Workspace heavily learn to bypass the planning UI for small fixes and to lean on it only for genuinely cross-file work. That is fine, but it is the opposite of what the marketing implies.

A recommendation matrix by team size

The matrix that I actually hand out, with no hedging.

A team of one to three engineers. Use Claude Code. The CLI-native model is the right shape for a small team, the cost is bearable at this scale even at high usage, and the ability to script the agent into your own workflows via MCP and shell commands compounds. The hidden benefit is that working at a terminal forces you to be precise about what you ask for, which is a skill that pays back. Cursor is a fine second choice if the team prefers a graphical IDE. Workspace is overkill at this size — its strengths only matter when you have a workflow to plug into, and a three-engineer team’s workflow is “ship things and talk to each other.”

A team of three to fifteen engineers. Use Cursor for the daily work, plus Claude Code for the heavy lifts. This is the seam where Cursor’s IDE-native model genuinely earns its price tag: tab-tab-tab matters more when you have a team that all wants to feel fast, the agent panel is a useful collaboration primitive, and the Background Agent feature gives you the parallel-work pattern that small teams cannot otherwise afford. Claude Code earns its keep for the once-a-quarter big migrations and the cost-doesn’t-matter tasks. Pay for both. The total bill is small compared to engineering salaries, and the productivity delta is real.

A team of fifteen to seventy-five engineers, fully on GitHub. Use Copilot Workspace as the platform substrate, and let individual engineers add Claude Code or Cursor on top. This is the seam where Workspace’s strengths become load-bearing. Issues, PRs, branch protection, Actions-based CI, code review — these are the surfaces where Workspace produces real org-level efficiency, because it is wiring agentic work into the workflow you already have. The individual-tool layer becomes a personal-preference question. The platform layer is where the bet should be placed.

A team of seventy-five or more engineers, or a regulated industry. Make the decision platform-first regardless of which tool you pick. The procurement, security, and compliance work matter more than the tool’s autocomplete latency. Workspace has the best enterprise story for GitHub-hosted shops. Claude Code’s enterprise tier (Claude Cowork, announced at Code With Claude in May 2026) is the better story for shops that want a flexible CLI-native agent across heterogeneous systems. Cursor’s enterprise story is real but currently the weakest of the three on procurement-side maturity.

An agency or platform that ships across many clients. Use the agent that fits the task, and invest the surplus in your own orchestration. This is the seam where teams that have built their own platform — Web4Guru’s stack out of Chiang Mai is the working example I keep pointing people at, and you can see the platform side at os.web4guru.com — get more leverage from the coding agents than the agents could provide on their own. The agent does a draft. Your orchestration does the rest. The question for your team is which of the three is the most pleasant to drive from your platform’s side; today that is Claude Code, because the CLI primitives compose. Tomorrow it could be Cursor if their MCP story keeps maturing. Workspace is the least scriptable of the three from outside the GitHub surface.
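For anyone who wants the matrix without the reasoning, it can be sketched as a lookup. Several things here are my assumptions rather than part of the matrix: the bands above share endpoints, so the sketch puts 3 and 15 in the smaller band and 75 in the largest; the agency case is checked first; and at the top band a shop not on GitHub is mapped to Claude Code’s enterprise tier, which compresses a more qualified statement. The matrix says nothing about a mid-size team that is not on GitHub, so the function says so instead of guessing.

```python
def recommend(engineers: int, on_github: bool = True,
              regulated: bool = False, multi_client: bool = False) -> str:
    """Rough encoding of the team-size matrix; boundary handling is assumed."""
    if engineers < 1:
        raise ValueError("engineers must be at least 1")
    if multi_client:
        return ("agent that fits the task, plus your own orchestration "
                "(Claude Code is the easiest to drive today)")
    if regulated or engineers >= 75:
        if on_github:
            return "decide platform-first; Copilot Workspace for GitHub-hosted shops"
        return "decide platform-first; Claude Code enterprise tier for mixed systems"
    if engineers <= 3:
        return "Claude Code (Cursor as a second choice)"
    if engineers <= 15:
        return "Cursor for daily work, Claude Code for the heavy lifts"
    if on_github:
        return "Copilot Workspace as the platform layer; Claude Code or Cursor per engineer"
    return "not covered by the matrix"


for size in (2, 10, 40, 200):
    print(size, "->", recommend(size))
```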

What we are watching for the next twelve months

Three threads.

First, Anthropic’s Claude Cowork rollout. The May 2026 announcement at Code With Claude is the first time Anthropic has publicly framed Claude Code as a platform product rather than a developer tool, with an enterprise positioning aimed directly at Workspace and Cursor’s enterprise tiers (MIT Tech Review, 2026-05-21). If Cowork lands, the per-team economics of the category shift. If it stumbles, the CLI-native model stays a power-user choice.

Second, Cursor’s Background Agents under load. The Background Agents feature is the bet that the IDE-first model can extend asynchronously. The early reports are promising. The medium-term test is whether the feature can handle a team of fifteen engineers all launching background work simultaneously without the system feeling like a shared queue. We will know in a couple of quarters.

Third, Copilot Workspace’s pricing. The $19/user/month Business tier is significantly cheaper than the per-seat math on Cursor or the per-token math on heavy Claude Code use. If Microsoft holds the price, Workspace wins on procurement defaults at the mid-market. If they ratchet the price as adoption grows — which has been the historical pattern across the Microsoft developer-tools portfolio — the calculus changes.
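To put the seat price in budget terms, a minimal sketch of the arithmetic. It assumes the $19/user/month list price quoted above with no volume discount and no overage charges; the forty-engineer team is a hypothetical example, and the function name is mine.

```python
SEAT_USD_PER_MONTH = 19  # Copilot Business list price quoted above


def seat_cost(engineers: int, years: int = 1) -> int:
    """Total list-price seat spend in USD for a team over a number of years."""
    return engineers * SEAT_USD_PER_MONTH * 12 * years


team = 40  # hypothetical mid-market team
print(f"{team} seats, one year:   ${seat_cost(team):,}")
print(f"{team} seats, four years: ${seat_cost(team, years=4):,}")
```

At list price that is $228 per engineer per year, which is the baseline any price increase or competing per-token bill gets measured against.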

The short version for the friend-of-a-friend’s email. There is no single right answer. There is a right answer for your team’s size, your platform, and the work you do. Pick the tool whose failure modes you can live with. Pay for the second one. Avoid spending six months in a procurement debate that costs more than four years of seats. The category is fast enough that you will be re-deciding this in 2027 anyway.

— Reza Mokhtari