Gary Marcus is a bête noire of the Silicon Valley AI investment community. While he does think current LLM AIs can do lots of useful things, he doubts they are the road to AGI. He says AGI will need independent reasoning, and contrary to the claims of some, that is not emerging as a property of scaling up LLMs.
So far, he’s been proved right. On the other hand, Daniel Kokotajlo, who was involved with OpenAI when it was a non-profit, penned a 2021 essay called [“What 2026 looks like”](https://www.alignmentforum.org/posts/6Xgy6CAf2jqHhynHL/what-2026-looks-like). His track record for prediction looks good so far too. He says that 2027 is the year AI will start writing its own code to recursively improve itself.
Spara-Extreme
This is accurate IMO, though I think the usage of LLMs in normal products will improve them dramatically.
dftba-ftw
Uh huh… So we’re just going to ignore that Behemoth (before CoT) benchmarks close to a lot of reasoning models? Because that seems to me like scaling is working, and the other models’ poor performance can be explained away by the dumbing down caused by a poor implementation of MoE.
Edit: why the downvotes? Actually look at the published benchmarks for Llama 4’s biggest model: it is consistently within spitting distance of o3, o1, R1, QwQ, and other reasoning models, which suggests that when CoT is applied it will beat those models by a fair amount. This is all consistent with scaling laws.
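To be clear about what I mean by scaling laws: the empirical fits predict loss dropping smoothly as parameters and training tokens grow. A rough Chinchilla-style sketch, using ballpark constants from the published fit purely for illustration:

```python
# Rough Chinchilla-style scaling-law sketch: predicted training loss as a
# function of parameter count N and training tokens D.
# Constants are ballpark values from the published fit, for illustration only.

def predicted_loss(n_params: float, n_tokens: float) -> float:
    E = 1.69                  # irreducible loss
    A, alpha = 406.4, 0.34    # parameter-count term
    B, beta = 410.7, 0.28     # data term
    return E + A / n_params**alpha + B / n_tokens**beta

# Bigger model + more data -> lower predicted loss, with diminishing returns
# but no hard wall in the formula itself.
print(predicted_loss(70e9, 1.4e12))   # ~70B params, ~1.4T tokens
print(predicted_loss(2e12, 30e12))    # ~2T params, ~30T tokens
```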
ale_93113
Why do people forget that almost all AI labs know this and are working on multi-modality?
Not just that, but reasoning systems are 6 months old and they have already fixed the stagnation that the first half of 2024 was characterised by, by CHANGING ARCHITECTURE.
Wow, almost as if AI researchers know this and take action accordingly.
Meanwhile, the discussion online when these very obviously true headlines come out is not “I wonder what next step will be taken to continue improving these models and lead to the automation of all jobs, hopefully soon.”
But instead it is “Yeah, those dumb AIs will never progress, because look at what this expert said.”
foeffa
I just fucking knew it was gonna be about Gary Marcus
Fonzie1225
Internet Guy makes a prediction then brags about how he was right on his own substack… hmm…
Garden_Wizard
The problem is that not only do AI salespeople lie about how good AI is, but AI itself lies all the time. It absolutely is not a replacement for a human at this time.
Cubey42
“Scaling is over,” but the 2T model still shows that scaling up increases results, so… how did we draw that conclusion?
Sirisian
They have a point about generative text not having a moat, but generative images do have a small moat right now. OpenAI’s latest image generation handles “prompt following” (where it can relate complex descriptions to objects with relative placement) at a level that isn’t in any other system. It’s not clear when any other product will catch up. With more compute, multiple discovery is almost guaranteed among researchers, so pointing out that we shouldn’t see a big moat is a solid, albeit obvious, prediction. (Companies leapfrogging each other is fully expected.)
Looking at Nvidia’s stock price rather than actual output seems flawed. They’re selling everything they produce at massive profit. Also, datacenter creation (and investment in general) is rising as predicted, further increasing demand for their hardware. This investment isn’t just about size and scaling. It’s about enabling new ideas and faster/cheaper iteration. The architecture jumps that are predicted are directly related to this. (Companies will scale current methods that work, and that’s expected, but at the same time they’re tweaking and seeing whether ideas improve or hurt outputs.)
There’s a reason articles say “current LLMs” when talking about limitations. It’s pedantic, but a lot of the LLMs are really multi-modal language models, MLLMs. Their architectures continue to become more multi-modal and are training on more diverse data. The models are already becoming more flexible, in that I can upload various data sources, including images, and get responses about them. These approaches have years of continuous advancement ahead of them. (I don’t think anyone has seriously said a text-only LLM will lead to AGI. There’s a lot of discussion about multi-modal, embodied, and continual-learning models, though.)
In the big picture, articles like this seem to be looking at such an early window. I personally don’t think investment is at the level where “the financial bubble may be bursting” even means anything. If we assume that trends converge in the 2040s, then we’d expect investment in the 2030s to dwarf what we’re seeing, with hundreds of billions being a drop in the bucket. In that sense, any momentary blip for Nvidia is unimportant unless it affects their R&D or their relationship to TSMC or other foundries.
gordonjames62
This is a really great point about LLMs and the hope for AGI.
LLMs are a form of machine learning, and we don’t always know what goes on inside machine-learning processes. **People hoped that making LLMs bigger and giving them more data would lead to independent reasoning.** It still may, but it is looking less and less likely.
Independent reasoning is not something we can program.
Oddyssis
Do we even want AGI? LLMs are great, and they can do a lot of tasks that are very helpful to humans. We don’t really need the complications of creating something that has the ability to grow and learn and potentially develop consciousness. Furthermore, I don’t think we’re really mature enough as a culture to handle it.
Let’s be honest and admit that if we developed AGI tomorrow, it would be used for whatever the creators wanted, without any regard for its potential consciousness or sense of self. If there were issues with it, it would be shut down and wiped without a second thought.
Winter_Tension5432
I am pretty sure that even if transformers don’t lead to AGI, they will at least accelerate the architecture that does.
KidKilobyte
He literally says non-reasoning models are stalling out. Hello, that’s why we’re on to reasoning models now. You can make LLMs almost arbitrarily smart if you tell them to take long enough and really thoroughly check their answers. Trouble is, these kinds of queries take time, energy, and resources: something like $10,000 worth of compute per question just to take pole position in benchmarks.
But tricks like distillation will come along and eventually make these kinds of results available at much lower cost.
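For anyone wondering what distillation actually involves: you train a small “student” model to match a large “teacher” model’s softened output distribution, on top of the usual label loss. A minimal sketch, where the tiny linear layers are just stand-ins for real models and the hyperparameters are arbitrary:

```python
import torch
import torch.nn.functional as F

# Minimal knowledge-distillation sketch: the student learns from the teacher's
# softened outputs (soft targets) plus ordinary cross-entropy on the labels.
# The Linear layers below are placeholders, not real language models.
teacher = torch.nn.Linear(128, 1000)   # stand-in for a large, frozen model
student = torch.nn.Linear(128, 1000)   # stand-in for a small model being trained
optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)

def distill_step(x, labels, T=2.0, alpha=0.5):
    with torch.no_grad():
        teacher_logits = teacher(x)
    student_logits = student(x)

    # KL divergence between temperature-softened distributions, scaled by T^2
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

    # Ordinary cross-entropy against the hard labels
    hard_loss = F.cross_entropy(student_logits, labels)

    loss = alpha * soft_loss + (1 - alpha) * hard_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# One toy training step on random data
x = torch.randn(32, 128)
labels = torch.randint(0, 1000, (32,))
print(distill_step(x, labels))
```

The same idea at scale is how results from huge, expensive models eventually show up in much cheaper ones.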
There is no wall on AI, only on AI done with scaling alone. New chip architectures will make both training and inference far cheaper than they are now, until we can brute-force AGI and get the self-improvement cycle started.
TemetN
I mean, reasoning models already break that. His “GPT-4 level” bit is also wrong: even with the disappointing baseline results of GPT-4.5, it both demonstrated that scaling continues and surpassed GPT-4 (though it depends on what you mean by “level,” which is another reason to be dubious about this). Past that, hallucinations have dropped by… I think the record is something like two-thirds? I’d have to double-check, since hallucination-related benchmarks are not well adopted.
GnarlyNarwhalNoms
>says he’s been proved right that LLMs and scaling won’t lead to AGI
Well, shit, I could have told you that. They’re different animals. They operate in different ways. It’s like comparing a submarine to a spaceship. No matter how big or technologically advanced that submarine gets, it’ll never get to space.
LLMs, fundamentally, mimic human writing. They’re very good at it. But mimicry doesn’t lead to understanding or awareness.
sibylazure
I don’t take Gary Marcus very seriously, not because he is “a leading AI contrarian,” but because I don’t think symbolic AI will prove to be significant in the future. It is a hopelessly useless architecture for achieving AGI.
What we need is a way to generate high-quality data and to develop brand-new DNN architectures. Just because RNNs or CNNs have their shortcomings doesn’t mean we need symbolic AI to overcome them. I believe the same holds true this time as well.
I’m also highly suspicious of the so-called neuro-symbolic approach. Perhaps it may outperform pure deep-learning architectures in narrowly defined domains and prove somewhat useful, but no, it won’t be useful for creating AGI in any way.
azaathik
AI is nothing but pattern recognition and half-assed replication based on those patterns. AI cannot create, and unless heavily assisted by someone competently creative, its replications aren’t worth looking at 90% of the time. Even those lack the “soul” of something created without AI.
Beyond creative endeavors, AI can’t know everything, and it understands nothing. You need to know why a thing is done to do it in a properly nuanced way that actually works.