Cursor released Composer 2.5, upgrading the model that powers its agentic coding engine to Moonshot AI's Kimi K2.5. The result: benchmark performance that matches Claude Opus 4.7 on SWE-Bench at approximately one-fourteenth of the cost per task.

The benchmark story

Composer 2.5 achieves roughly 63% accuracy on SWE-Bench, the standard benchmark for agentic software engineering tasks. Claude Opus 4.7 also sits around 63%, which makes the two models functionally equivalent on this benchmark. The critical difference is price: Composer 2.5 runs at approximately $0.50 per task versus roughly $7 per task for Opus 4.7, a gap of about 14x. For teams running hundreds of automated coding tasks per day, that multiplier applies to the entire daily bill.
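To make the cost gap concrete, here is a minimal Python sketch that scales the per-task prices to a daily workload. The two prices are the approximate figures quoted above; the task volume of 300 per day and the function names are illustrative assumptions, not numbers or APIs from Cursor or Anthropic.

```python
# Approximate per-task prices quoted in the article (USD).
COMPOSER_COST_PER_TASK = 0.50
OPUS_COST_PER_TASK = 7.00


def daily_cost(cost_per_task: float, tasks_per_day: int) -> float:
    """Return the total daily spend for a given per-task price."""
    return cost_per_task * tasks_per_day


def compare(tasks_per_day: int) -> dict:
    """Compare daily spend for the two models at the same task volume."""
    composer = daily_cost(COMPOSER_COST_PER_TASK, tasks_per_day)
    opus = daily_cost(OPUS_COST_PER_TASK, tasks_per_day)
    return {
        "composer": composer,
        "opus": opus,
        "savings": opus - composer,
        "ratio": opus / composer,
    }


if __name__ == "__main__":
    # 300 tasks/day is a hypothetical workload, chosen only for illustration.
    result = compare(300)
    print(f"Composer 2.5: ${result['composer']:,.2f}/day")
    print(f"Opus 4.7:     ${result['opus']:,.2f}/day")
    print(f"Savings:      ${result['savings']:,.2f}/day ({result['ratio']:.0f}x cheaper)")
```

Because both prices are per-task averages, the ratio stays at 14x regardless of volume; only the absolute savings grow with the number of tasks.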

Infrastructure upgrades

Alongside the model upgrade, Cursor shipped three infrastructure changes. Multi-repo cloud agent environments run on remote infrastructure rather than the developer's local machine, removing hardware constraints from long-running agents. Dockerfile-based environment setup gives agents reproducible, version-controlled environments to work in. Enterprise audit logs record agent actions for compliance and debugging. Together, these additions extend Cursor's agentic capabilities from individual developers toward enterprise engineering team deployments.

Context: five days after Cursor 3.4

Cursor 3.4 shipped May 13 with agentic dev environments and multi-repo support. Composer 2.5 arrived five days later, on May 18, as a model-layer upgrade on top of the 3.4 infrastructure. The rapid pace reflects Cursor's current cadence: ship infrastructure, then immediately upgrade the AI layer on top of it.