Last week, we featured Derek Thompson’s piece on how the AI boom is holding up the US economy and markets. The past few days have seen multiple setbacks to the AI narrative. First came the underwhelming response to OpenAI’s GPT-5. Then its CEO Sam Altman acknowledged that we might be in an AI bubble. Earlier this week, a report from MIT found that 95% of enterprises saw no success from their AI initiatives. In this piece, Cal Newport, a computer science professor and, more famously, the author of the brilliant book “Deep Work”, explains why progress in AI technology is stalling.
He begins with a famous research paper published by OpenAI in 2020, which claimed that generative AI models were subject to ‘scaling laws’: “… that these models would only get better as they grew, and indeed that such improvements might follow a power law—an aggressive curve that resembles a hockey stick. The implication: if you keep building larger language models, and you train them on larger data sets, they’ll start to get shockingly good. A few months after the paper, OpenAI seemed to validate the scaling law by releasing GPT-3, which was ten times larger—and leaps and bounds better—than its predecessor, GPT-2.”
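To make the “power law” claim concrete, the 2020 paper (Kaplan et al., “Scaling Laws for Neural Language Models”) reported that a model’s test loss L falls off as a power of model size N, dataset size D and training compute C. The rendering below is a schematic sketch of those relations, not a quotation from the paper; the constants N_c, D_c, C_c and the exponents are empirical fits, not theoretical quantities:

```latex
% Schematic form of the empirical scaling relations: loss falls as a
% power law in parameters N, data D, and compute C. The fitted exponents
% are small, so each further gain in loss demands a multiplicative
% (order-of-magnitude) jump in scale.
\[
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C}
\]
```

Because the exponents are well below one, a model must grow by orders of magnitude to keep improving at the same pace, which is one way to see why a “just build bigger” strategy can run into diminishing returns.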
At this rate, AI proponents claimed, we would very soon achieve Artificial General Intelligence (AGI) – a human-like intelligence that matches or surpasses humans in most cognitive tasks. Critics such as Gary Marcus, a professor of psychology at NYU, argued that “the so-called scaling laws aren’t universal laws like gravity but rather mere observations that might not hold forever”, only to be ridiculed by the likes of Altman and Elon Musk.
But recent versions of GPT and Grok (Musk’s AI model), which delivered only incremental gains over their predecessors, call the sanctity of the scaling laws into question: “If building ever-bigger models was yielding diminishing returns, the tech companies would need a new strategy to strengthen their A.I. products. They soon settled on what could be described as “post-training improvements.” The leading large language models all go through a process called pre-training in which they essentially digest the entire internet to become smart. But it is also possible to refine models later, to help them better make use of the knowledge and abilities they have absorbed. One post-training technique is to apply a machine-learning tool, reinforcement learning, to teach a pre-trained model to behave better on specific types of tasks. Another enables a model to spend more computing time generating responses to demanding queries.”
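For readers curious about the mechanics, here is a deliberately simplified Python sketch of the two post-training levers Newport describes: reward-driven fine-tuning and spending extra compute at answer time. Everything here (the toy model, the reward function, the names rl_fine_tune and best_of_n) is a hypothetical illustration of the general idea, not any lab’s actual pipeline:

```python
import random

random.seed(0)  # make the toy run reproducible

# Toy stand-in for a pre-trained model: a probability distribution
# over canned answers. Purely illustrative.
PRETRAINED = {"sloppy answer": 0.5, "decent answer": 0.3, "great answer": 0.2}

def reward(answer: str) -> float:
    """Hypothetical reward model: a stand-in for human feedback scores."""
    return {"sloppy answer": 0.0, "decent answer": 0.5, "great answer": 1.0}[answer]

def rl_fine_tune(model: dict, steps: int = 1000, lr: float = 0.01) -> dict:
    """Lever 1: shift probability mass toward high-reward outputs
    (a crude sketch of reinforcement-learning-style fine-tuning)."""
    model = dict(model)
    for _ in range(steps):
        answer = random.choices(list(model), weights=list(model.values()))[0]
        model[answer] += lr * reward(answer)  # reinforce rewarded behaviour
        total = sum(model.values())
        model = {a: p / total for a, p in model.items()}  # renormalize
    return model

def best_of_n(model: dict, n: int = 8) -> str:
    """Lever 2: spend more inference-time compute by sampling several
    candidate responses and keeping the best-scoring one."""
    samples = random.choices(list(model), weights=list(model.values()), k=n)
    return max(samples, key=reward)

tuned = rl_fine_tune(PRETRAINED)
print("tuned distribution:", {a: round(p, 2) for a, p in tuned.items()})
print("best-of-8 response:", best_of_n(tuned))
```

The point of the sketch is that neither lever teaches the model anything new; both squeeze better behaviour out of what pre-training has already absorbed, which is why their gains tend to be narrower than a fresh generation of scale.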
However, even the post-training approach, coupled with large reasoning models, doesn’t seem to put us back on the path to AGI anytime soon:
“OpenAI’s announcement for GPT-5 included more than two dozen charts and graphs, on measures such as “Aider Polyglot Multi-language code editing” and “ERQA Multimodal spatial reasoning,” to quantify how much the model outperforms its predecessors. Some A.I. benchmarks capture useful advances. GPT-5 scored higher than previous models on benchmarks focussed on programming, and early reviews seemed to agree that it produces better code. New models also write in a more natural and fluid way, and this is reflected in the benchmarks as well. But these changes now feel narrow—more like the targeted improvements you’d expect from a software update than like the broad expansion of capabilities in earlier generative-A.I. breakthroughs. You didn’t need a bar chart to recognize that GPT-4 had leaped ahead of anything that had come before.”
Hence, a more realistic expectation of AI’s progress might be warranted: “…If these moderate views of A.I. are right, then in the next few years A.I. tools will make steady but gradual advances. Many people will use A.I. on a regular but limited basis, whether to look up information or to speed up certain annoying tasks, such as summarizing a report or writing the rough draft of an event agenda. Certain fields, like programming and academia, will change dramatically. A minority of professions, such as voice acting and social-media copywriting, might essentially disappear. But A.I. may not massively disrupt the job market, and more hyperbolic ideas like superintelligence may come to seem unserious.”
Newport ends with this: “The appendices of the scaling-law paper, from 2020, included a section called “Caveats,” which subsequent coverage tended to miss. “At present we do not have a solid theoretical understanding for any of our proposed scaling laws,” the authors wrote. “The scaling relations with model size and compute are especially mysterious.” In practice, the scaling laws worked until they didn’t. The whole enterprise of teaching computers to think remains mysterious. We should proceed with less hubris and more care.”
If you want to read our other published material, please visit https://marcellus.in/blog/
Note: The above material is neither investment research, nor financial advice. Marcellus does not seek payment for or business from this publication in any shape or form. The information provided is intended for educational purposes only. Marcellus Investment Managers is regulated by the Securities and Exchange Board of India (SEBI) and is also an FME (Non-Retail) with the International Financial Services Centres Authority (IFSCA) as a provider of Portfolio Management Services. Additionally, Marcellus is registered with the US Securities and Exchange Commission (“US SEC”) as an Investment Advisor.