Claude Opus 4.7 long-context capability regression test: 3 truths behind the halved MRCR benchmark score
Expert programmers have combed through the 232-page official Anthropic system card, and the conclusion is unanimous: Claude Opus 4.7’s long-context performance has seen a significant regression compared to 4.6. This finding stands in sharp contrast to the phrasing in Anthropic’s official blog, which claimed, "Opus 4.7 delivered the most consistent long-context performance of any model …