amine.dev

Microsoft didn't wait. Neither should you.



Claude Opus 4.7 dropped on April 16. I've been using Claude daily for over a year now, building internal tools at work, running automations, and generally treating it as a core part of how I get things done. So a new Opus release gets my attention. But honestly, the model itself wasn't what caught me off guard this time. It was what happened within 24 hours of the announcement.

Microsoft had it live in Copilot for Microsoft 365. Same day. And that tells you more about where enterprise AI is heading than any benchmark score.

What's actually new in Opus 4.7

The headline capability upgrade is vision. Maximum image resolution went from 1.15 megapixels to 3.75 megapixels, and on scientific figure interpretation tasks (CharXiv), that translated to a 13-point improvement. If you're doing anything with documents, charts, or dense visual content, that's a meaningful jump.
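If you want to put a chart or document page in front of the model, it goes into the request as an image content block. A minimal sketch of that shape, based on Anthropic's documented Messages API image format; the base64 data here is a placeholder, not a real chart:

```python
import base64

# Sketch of attaching a chart image to a Messages API request.
# The content-block shape follows Anthropic's documented image format;
# the image bytes below are a placeholder, not a real PNG.

fake_png = base64.b64encode(b"\x89PNG placeholder bytes").decode("ascii")

message = {
    "role": "user",
    "content": [
        {
            "type": "image",
            "source": {
                "type": "base64",
                "media_type": "image/png",
                "data": fake_png,
            },
        },
        {"type": "text", "text": "What trend does this chart show?"},
    ],
}
```

The image block comes before the text prompt so the question reads in context of the figure.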

The other change I find more interesting is self-verification. The model now checks its own outputs before reporting a task complete, catching logical errors during planning rather than at the end. For anyone using Claude in agentic workflows, this matters. I've had runs where a tool call quietly fails and the model just... carries on. Better self-correction mid-task reduces that.
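The pattern is easy to sketch. This is my own illustration of a self-verifying agent step, not Anthropic's implementation; `run_tool` and `looks_valid` are hypothetical stand-ins for a real tool dispatcher and a real validity check:

```python
# Sketch of a self-verifying agent step: check the tool result before
# moving on, instead of silently carrying on after a quiet failure.
# run_tool and looks_valid are hypothetical stand-ins.

def run_tool(name: str, args: dict) -> dict:
    # Hypothetical tool call; a real agent would dispatch to an actual tool.
    if name == "fetch":
        return {"ok": True, "data": args.get("url", "")}
    return {"ok": False, "error": f"unknown tool: {name}"}

def looks_valid(result: dict) -> bool:
    # Hypothetical check: did the call succeed and return non-empty data?
    return result.get("ok", False) and bool(result.get("data"))

def verified_step(name: str, args: dict, max_retries: int = 2) -> dict:
    """Run a tool call, verify the result, and retry on failure."""
    for _attempt in range(max_retries + 1):
        result = run_tool(name, args)
        if looks_valid(result):
            return result
    # Surface the failure to the planner rather than reporting success.
    return {"ok": False, "error": f"{name} failed after {max_retries + 1} attempts"}
```

The key line is the last one: a failed call gets reported as a failure, which is exactly the behavior I was missing in those runs where the model carried on.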

There's also a new effort parameter for tuning the speed-to-capability trade-off, improved memory for agentic file-based tasks, and a new tokenizer that may use up to 1.35x more tokens. Context window stays at 1M tokens with 128k max output.
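Putting those pieces together, a request body might look like the sketch below. The model id, the `effort` field name, and its values are my guesses based on the announcement, not confirmed API details:

```python
# Sketch of a Messages API request body using the new effort knob.
# The model id and the "effort" field name/values are assumptions
# from the announcement, not confirmed API details.

request_body = {
    "model": "claude-opus-4.7",  # assumed model id
    "max_tokens": 4096,          # well under the 128k output ceiling
    "effort": "high",            # assumed name/values for the new parameter
    "messages": [
        {"role": "user", "content": "Summarize this quarter's incident reports."}
    ],
}

# The new tokenizer may use up to 1.35x more tokens, so a prompt that
# previously tokenized to ~10,000 tokens could now cost up to ~13,500.
old_prompt_tokens = 10_000
worst_case_tokens = int(old_prompt_tokens * 1.35)
```

That tokenizer multiplier is worth keeping in mind when budgeting: the 1M context window is unchanged, but the same prompt may occupy more of it.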


How it benchmarks against the competition

On GPQA Diamond (graduate-level reasoning), Opus 4.7 scores 94.2%. GPT-5.4 is at 94.4%, Gemini 3.1 Pro at 94.3%. Essentially the same. Reasoning benchmarks at this level are approaching saturation, so reading too much into those numbers is a mistake.

Where Claude pulls ahead is coding. On SWE-bench Pro, Opus 4.7 hits 64.3% vs GPT-5.4 at 57.7% and Gemini at 54.2%. That's not a marginal difference. For software engineering tasks, Claude is the clear pick right now, and the self-verification work probably plays into that.

GPT-5.4 wins on web research (89.3% on BrowseComp vs Claude's 79.3%). Gemini is cheaper and often better on very high-resolution document tasks. So there's no single winner across everything. But for the kind of work I do, which leans heavily toward reasoning, writing, and building, Opus 4.7 is the model I'd reach for.


Microsoft moved in less than 24 hours. That's the story.

When Anthropic announced Opus 4.7 on April 16, Microsoft had it available in Copilot for Microsoft 365 the same day. Not days later. Not "rolling out over the next few weeks". Same day. Available in Copilot Cowork (Frontier), Copilot Studio, Copilot in Excel, GitHub Copilot, and Azure AI Foundry.

I've worked in tech long enough to know that enterprise software integrations don't usually move at that speed. Especially not at Microsoft's scale. The fact that they could ship this that fast tells you a few things.

First, they were ready. The infrastructure to plug in a new Anthropic model was already in place and tested. This wasn't a scramble.

Second, they want optionality. Microsoft is running GPT, Claude, and Gemini models inside Copilot simultaneously now. They're not betting the enterprise productivity suite on one model from one provider.

Third, and this is what I keep coming back to, the pace of model iteration is now so fast that the integrations have to be built for speed. Any enterprise AI product that takes weeks to adopt a new model is going to fall behind.


My take

I've been saying for a while that the model itself is becoming less of the differentiator. What matters is how fast you can get it into the hands of the people who need it, in the context where they actually work. Microsoft just demonstrated what that looks like at enterprise scale.

The Copilot team has built something that can absorb a new frontier model in under 24 hours. That's not a feature. That's an architectural decision they made months ago. And it's the kind of decision that compounds.

For Anthropic, the Microsoft partnership isn't just about revenue. It's distribution. Getting Claude in front of every Microsoft 365 user is a different kind of reach than API access. And for enterprise customers, having Claude available inside the tools they already use removes the biggest adoption barrier there is: the friction of going somewhere new.

Opus 4.7 is a genuinely good model. But the 24-hour Copilot integration is the thing I'll remember about this release.

If you're thinking about how to bring AI into your team's existing tools and workflows, get in touch.