I personally suspect that their self-reported uplift numbers are inflated and that agent time horizons are still limited. But if the numbers are taken at face value, then even the most aggressive scenarios (e.g. AI 2027 or
https://blog.redwoodresearch.org/p/whats-up-with-anthropic-predicting…) would have underestimated progress.
Such dramatic and surprising progress (if it held up to scrutiny) would make me think that a 'software intelligence explosion' was much more likely in the next few years, and I worry we'd be far from ready for such an outcome.
https://www.forethought.org/research/how-quick-and-big-would-a-software-intelligence-explosion-be…
Mostly I think this indicates that we urgently need better evals and measurement. I suspect that a proper uplift RCT, and more challenging AI R&D evaluations, would reveal that Opus 4.5 is on a much more modest trend and still far away from the AI R&D 4 threshold (fully automating remote junior employees). But we need to build and run those assessments to know for sure, and the stakes are very high.
https://x.com/CFGeek/status/1985216285664539038…
Anthropic has decided to activate their mitigations for AI R&D 4 early, and so claims this assessment is no longer load-bearing. I'm glad they are getting started on their risk reports, but I hope this doesn't mean deprioritizing efforts to bridge this critical measurement gap.
Hoping to dig much more into this in the near future (e.g. get an Opus 4.5 time horizon
https://x.com/joel_bkr/status/1993479320262844911…). If you want to be on the ground floor of reconciling the contradictory evidence on this crucial question, consider applying to work with us!
https://metr.org/careers