Claude Opus 4.7 long-context regression test: 3 truths behind the halved MRCR benchmark score


Expert programmers have combed through Anthropic’s 232-page official system card, and their conclusion is unanimous: Claude Opus 4.7’s long-context performance has regressed significantly compared to 4.6. This finding contrasts sharply with Anthropic’s official blog, which claimed, "Opus 4.7 delivered the most consistent long-context performance of any model …"