composer 1.5 vs claude opus on the same task
I’ve been thinking about what actually drives productivity when coding with AI. The obvious answer is model intelligence: smarter models solve harder problems. But there’s another angle. Most of my daily work is not SWE-bench. It’s adding features, fixing bugs, writing tests, refactoring. The bottleneck might not be reasoning depth. It might be how fast I can iterate.
A simplified model: productivity scales with iteration speed times human guidance, not raw model intelligence. A fast model that gives you 10 attempts in the time a slow model gives you 1 might win in practice, because you pick the best output from many fast tries instead of waiting for one slow, perfect one.
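The best-of-N intuition can be made concrete with a toy simulation. All numbers here are invented for illustration: attempt quality is a uniform draw, the fast model gets ten draws around a lower mean, the slow model gets one draw around a higher mean, and you keep the fast model's best attempt.

```python
import random

random.seed(0)

def simulate(n_trials=10_000, fast_attempts=10, fast_mean=0.6, slow_mean=0.8):
    """Toy model: how often does the best of many mediocre attempts
    beat a single better attempt? Quality is uniform(mean - 0.3, mean + 0.3)."""
    fast_wins = 0
    for _ in range(n_trials):
        fast_best = max(
            random.uniform(fast_mean - 0.3, fast_mean + 0.3)
            for _ in range(fast_attempts)
        )
        slow = random.uniform(slow_mean - 0.3, slow_mean + 0.3)
        if fast_best > slow:
            fast_wins += 1
    return fast_wins / n_trials
```

Under these made-up parameters the fast model's best attempt wins a bit more than half the time, despite a clearly lower per-attempt mean. The point isn't the exact numbers; it's that selection over many cheap tries can compensate for lower per-try quality.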
So I ran a small experiment.
# The setup
Same prompt, same task, two different agents. The task: build a full-stack issue tracker from scratch.
Tech stack:
- Backend: Python, FastAPI, SQLite, SQLAlchemy, Pytest, Coverage.py
- Frontend: Next.js (App Router), TypeScript, Tailwind, Fetch API
Requirements: CRUD for issues, proper validation, database migrations, comprehensive pytest suite with at least 90% coverage, clean UI with list/create/edit/delete flows.
The twist: each agent had to record its own wall-clock time from start to finish and report metrics at the end. No clarification questions allowed; just implement.
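"Record wall-clock time" can be made concrete with a small wrapper like this; it's my illustration of the requirement, not code either agent emitted:

```python
import subprocess
import time

def timed_run(cmd: list[str]) -> tuple[int, float]:
    """Run a command (e.g. the test suite) and report its exit code
    plus elapsed wall-clock seconds."""
    start = time.perf_counter()
    proc = subprocess.run(cmd)
    return proc.returncode, time.perf_counter() - start

# e.g.: code, seconds = timed_run(["pytest", "--cov=app"])
```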
I ran the prompt twice: once with Cursor’s Composer 1.5, once with Claude Opus 4.6 (via Claude Code).
# Results
| Metric | Composer 1.5 | Claude Opus 4.6 |
|---|---|---|
| Completion time | ~2.5 min | ~4 min |
| Backend files | 11 | 9 |
| Frontend files | 12 | 8 |
| Total lines | ~800 | ~663 |
| Tests | 18 | 23 |
| Backend coverage | 94% | 96% |
| All tests passing | Yes | Yes |
Composer 1.5 finished faster. Opus produced more tests, higher coverage, and slightly leaner code (fewer files, fewer lines).
# What happened under the hood
Composer 1.5 went straight through. Prompt, generate, done. No visible debugging. The output was a working app with solid coverage.
Opus hit a snag. In-memory SQLite creates separate databases per connection, so the test fixtures were failing. Opus had to reason through the problem: use StaticPool so all connections share the same in-memory instance. That took extra time. But the fix was correct, and the resulting test setup was more robust.
So we have a tradeoff. Composer 1.5 optimized for speed: minimal reasoning overhead, get it done. Opus spent time on a correctness problem and came out with a slightly more thorough implementation.
# What this suggests
The speed thesis holds for wall-clock time. Composer 1.5, built for interactive latency, delivered a complete app in under three minutes. Opus, with deeper reasoning, took longer but produced more tests and higher coverage.
Neither outcome is “wrong.” It depends on what you value. For rapid prototyping or routine CRUD work, the fast model wins on time. For tasks where correctness and edge cases matter, the extra reasoning can pay off.
This aligns with how I think about the tools in practice. Cursor with Composer 1.5 is my default for flow-state coding: tab completions, inline edits, quick iterations. Claude Code with Opus is what I reach for when I need multi-file refactors, architecture decisions, or debugging that spans the whole stack.
The experiment was small. One task, one prompt, two runs. But it’s a concrete data point in a debate that usually stays abstract. Benchmarks like SWE-bench measure reasoning on hard bugs. Most of my work is not that. For the 80% of tasks that are routine, iteration speed might matter more than frontier intelligence.
