composer 1.5 vs claude opus on the same task
I’ve been thinking about what actually drives productivity when coding with AI. The obvious answer is model intelligence: smarter models solve harder problems. But there’s another angle. Most of my daily work is not SWE-bench. It’s adding features, fixing bugs, writing tests, refactoring. The bottleneck might not be reasoning depth. It might be how fast I can iterate.
A simplified model: productivity scales with iteration speed times human guidance, not raw model intelligence. A fast model that gives you 10 attempts in the time a slow model gives you 1 might win in practice, because you pick the best output from many fast tries instead of waiting for one slow, perfect one.
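The best-of-N intuition can be made concrete with a toy simulation. All numbers here are invented for illustration: attempt quality is a uniform draw, the fast model gets ten draws around a lower mean, the slow model gets one draw around a higher mean, and you keep the fast model's best attempt.

```python
import random

random.seed(0)

def simulate(n_trials=10_000, fast_attempts=10, fast_mean=0.6, slow_mean=0.8):
    """Toy model: how often does the best of many mediocre attempts
    beat a single better attempt? Quality is uniform(mean - 0.3, mean + 0.3)."""
    fast_wins = 0
    for _ in range(n_trials):
        fast_best = max(
            random.uniform(fast_mean - 0.3, fast_mean + 0.3)
            for _ in range(fast_attempts)
        )
        slow = random.uniform(slow_mean - 0.3, slow_mean + 0.3)
        if fast_best > slow:
            fast_wins += 1
    return fast_wins / n_trials
```

Under these made-up parameters the fast model's best attempt wins a bit more than half the time, despite a clearly lower per-attempt mean. The point isn't the exact numbers; it's that selection over many cheap tries can compensate for lower per-try quality.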
So I ran a small experiment.
# The setup
Same prompt, same task, two different agents. The task: build a full-stack issue tracker from scratch.
Tech stack:
- Backend: Python, FastAPI, SQLite, SQLAlchemy, Pytest, Coverage.py
- Frontend: Next.js (App Router), TypeScript, Tailwind, Fetch API
Requirements: CRUD for issues, proper validation, database migrations, comprehensive pytest suite with at least 90% coverage, clean UI with list/create/edit/delete flows.
The twist: each agent had to record its own wall-clock time from start to finish and report metrics at the end. No clarification questions allowed; just implement.
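"Record wall-clock time" can be made concrete with a small wrapper like this; it's my illustration of the requirement, not code either agent emitted:

```python
import subprocess
import time

def timed_run(cmd: list[str]) -> tuple[int, float]:
    """Run a command (e.g. the test suite) and report its exit code
    plus elapsed wall-clock seconds."""
    start = time.perf_counter()
    proc = subprocess.run(cmd)
    return proc.returncode, time.perf_counter() - start

# e.g.: code, seconds = timed_run(["pytest", "--cov=app"])
```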
I ran the prompt twice: once with Cursor’s Composer 1.5, once with Claude Opus 4.6 (via Claude Code).
# Results
| Metric | Composer 1.5 | Claude Opus 4.6 |
|---|---|---|
| Completion time | ~2.5 min | ~4 min |
| Backend files | 11 | 9 |
| Frontend files | 12 | 8 |
| Total lines | ~800 | ~663 |
| Tests | 18 | 23 |
| Backend coverage | 94% | 96% |
| All tests passing | Yes | Yes |
Composer 1.5 finished faster. Opus produced more tests, higher coverage, and slightly leaner code (fewer files, fewer lines).
# What happened under the hood
Composer 1.5 went straight through. Prompt, generate, done. No visible debugging. The output was a working app with solid coverage.
Opus hit a snag. In-memory SQLite creates separate databases per connection, so the test fixtures were failing. Opus had to reason through the problem: use StaticPool so all connections share the same in-memory instance. That took extra time. But the fix was correct, and the resulting test setup was more robust.
So we have a tradeoff. Composer 1.5 optimized for speed: minimal reasoning overhead, get it done. Opus spent time on a correctness problem and came out with a slightly more thorough implementation.
# What this suggests
The speed thesis holds for wall-clock time. Composer 1.5, built for interactive latency, delivered a complete app in under three minutes. Opus, with deeper reasoning, took longer but produced more tests and higher coverage.
Neither outcome is “wrong.” It depends on what you value. For rapid prototyping or routine CRUD work, the fast model wins on time. For tasks where correctness and edge cases matter, the extra reasoning can pay off.
This aligns with how I think about the tools in practice. Cursor with Composer 1.5 is my default for flow-state coding: tab completions, inline edits, quick iterations. Claude Code with Opus is what I reach for when I need multi-file refactors, architecture decisions, or debugging that spans the whole stack.
The experiment was small. One task, one prompt, two runs. But it’s a concrete data point in a debate that usually stays abstract. Benchmarks like SWE-bench measure reasoning on hard bugs. Most of my work is not that. For the 80% of tasks that are routine, iteration speed might matter more than frontier intelligence.
