John Collins

I’d been using Claude every day for over a year before December 2025, but that’s the month it stopped feeling like a clever toy and started feeling like a coworker who had been on the project the whole time.

Three things lined up. The context windows finally got long enough that I could hand over a whole feature without surgically extracting just the relevant files. Tool use stopped being a parlour trick — agents could read the repo, run the tests, see what failed, and fix it without me sitting on top of them. And the models stopped panicking when something went sideways; they’d just look at the error, form a theory, and try the next thing.

That’s the actual sweet spot — not raw intelligence, but composure. The shift isn’t “the model got smarter on benchmarks.” It’s “I can leave the room for ten minutes and come back to a working PR.” That’s a different category of useful.

The bar I now hold a tool to: can it finish the boring 80% so I can spend my attention on the 20% that’s actually a judgement call? In December, the answer became yes.