❓📈🤦 Failing to Understand the Exponential, Again

🤖 AI Summary

  • 📈 AI discourse regarding a “bubble” parallels the failure to grasp Covid-19’s exponential spread.
  • 🦠 Commentators missed the pandemic’s scale, treating it as remote after exponential trends became obvious.
  • 🚫 Model mistakes prompt conclusions that AI will never reach human-level performance or will have only minor impact.
  • 📉 Commentators take the lack of noticeable conversational change across model releases as evidence that AI is plateauing and scaling is over.
  • 💻 The METR study documents a clear exponential trend in autonomous software engineering task length.
  • 🚀 Sonnet 3.7 achieved a 50% success rate on one-hour tasks; more recent models exceed this, completing tasks of over 2 hours.
  • 💼 OpenAI’s GDPval study measures performance across 44 occupations in 9 industries.
  • 📊 The GDPval evaluation shows a similar trend, with GPT-5 nearing human performance.
  • 🥇 Claude Opus 4.1 significantly outperforms GPT-5, almost matching industry expert performance.
  • ⏳ Extrapolating these trends, the article predicts that models will autonomously work 8-hour days by mid-2026.
  • 🧑‍🔬 It predicts that at least one model will match human experts across many industries before late 2026.
  • 🧠 It predicts that by late 2027, models will frequently outperform experts on many tasks.
  • ⚠️ Grok 4 and Gemini 2.5 Pro underperformance is notable given previous state-of-the-art claims.
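
The extrapolation from 2-hour tasks to 8-hour days can be sketched as simple doubling arithmetic. This is a minimal illustration, not the article's or METR's method: the function names are hypothetical, and the doubling time is left as a free parameter because this summary does not state one. The values tried in the loop are assumptions chosen only to show how the answer scales.

```python
import math


def doublings_needed(current_hours: float, target_hours: float) -> float:
    """Number of doublings to grow a task horizon from current to target."""
    if current_hours <= 0 or target_hours <= 0:
        raise ValueError("task horizons must be positive")
    return math.log2(target_hours / current_hours)


def months_until(current_hours: float, target_hours: float,
                 doubling_months: float) -> float:
    """Months until the target horizon, assuming pure exponential growth."""
    return doublings_needed(current_hours, target_hours) * doubling_months


def horizon_after(current_hours: float, months: float,
                  doubling_months: float) -> float:
    """Task horizon after a number of months, under the same assumption."""
    return current_hours * 2 ** (months / doubling_months)


if __name__ == "__main__":
    # Going from a 2-hour to an 8-hour horizon takes two doublings.
    print("doublings:", doublings_needed(2, 8))
    # Doubling times below are illustrative assumptions, not measured values.
    for doubling_months in (4, 6, 8):
        print(doubling_months, "month doubling ->",
              months_until(2, 8, doubling_months), "months")
```

Under this model the time to reach an 8-hour horizon is just twice the assumed doubling time, so the prediction is only as reliable as the assumption that the doubling time stays constant.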

🤔 Evaluation

  • ⚖️ Commentators contrast the Covid-19 analogy with the actual mechanism of AI progress.
  • 🦠 The spread of Covid-19 infection is a well-understood, deductive exponential process, unlike the fuzzier process underlying the AI boom.
  • ⚙️ AI improvement is considered closer to Moore’s law, which depends on the whole industry focusing on new innovations, suggesting improvements are not inevitable.
  • 📋 Commentators also contrast the METR and GDPval tasks with real-world work, characterizing the benchmark tasks as not “messy.”
  • 📝 METR’s benchmark tasks have a mean messiness score of about 3 out of 16, while a typical software engineering task scores 7-8, suggesting current evaluations do not capture the full variety of real-world work.
  • 🔮 One legitimate perspective holds that AI may be able to perform non-messy tasks for eight hours at a 50% success rate and outperform experts, yet still fail to replace anyone, much as the introduction of new technology did not replace radiologists.
  • 🤝 One topic to explore for a better understanding is the creation of evaluations that include both messier and longer-horizon tasks.
  • 🛠️ Another topic to explore is how to best structure the human-AI collaboration necessary for ultra-high productivity, where AI functions as a very smart tool rather than a replacement.

📚 Book Recommendations

💡 Similar

⚖️ Contrasting

  • 🧠🧠🧠🧠 A Thousand Brains: A New Theory of Intelligence by Jeff Hawkins. This work offers a biologically grounded theory of intelligence suggesting that current AI architectures may be fundamentally flawed or missing key elements, a potential plateauing mechanism that contrasts with the article’s optimism.
  • 🌍 Poverty, by America by Matthew Desmond. A deeply contrasting societal analysis that focuses on the distributional failures of a wealthy society, forcing a necessary consideration of where exponential technological gains might fail to solve fundamental human problems.
  • 🐌 Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence by Kate Crawford. This work offers a critical, structuralist perspective, arguing that AI is not a disembodied intelligence but a system built on vast resources and political choices, suggesting a slower, more complex path to ‘revolution’ that contrasts with the article’s simple extrapolation.
  • ⚫🦢🎲 The Black Swan: The Impact of the Highly Improbable by Nassim Nicholas Taleb. This book discusses the impact of highly improbable, high-impact events—like a sudden AGI breakthrough—that are inherently unpredictable but reshape history, illustrating why the future is not merely an extrapolation of the past.
  • 💥 The Shock of the New by Robert Hughes. A history of modern art that deals with how society adapts—or fails to adapt—to relentless, high-velocity change in culture and technology, linking the emotional and cultural impact of exponential progress.
  • 📉📈🌪️💪 Antifragile: Things That Gain from Disorder by Nassim Nicholas Taleb. This book discusses systems that not only withstand shocks but benefit from them, providing a framework for how individuals and institutions can prepare for a future shaped by unpredictable, exponentially growing technologies.