🤔💡🧠🤖 What Is Understanding? – Geoffrey Hinton | IASEAI 2025
🤖 AI Summary
🧠 We need scientific consensus on what understanding means to effectively manage large language models (LLMs) [00:04].
- 💡 Symbolic AI: The first paradigm, dominating for 50 years, claimed intelligence is reasoning using symbolic rules to manipulate symbolic expressions [00:38]. Knowledge is a set of symbolic expressions, prioritizing representation over learning [00:54].
- 🧬 Biologically-Inspired AI: This approach, favored by Turing and von Neumann, holds that intelligence is learning the strengths of neural network connections; learning must be understood first [01:17].
- 🚀 The Transition: Deep neural networks trained with backpropagation cut the error rate in half on computer vision in 2012, prompting a rapid switch away from symbolic AI [01:39].
- 🗣️ Language Skepticism: Many in the symbolic AI community and most linguists, including Chomsky, insisted neural nets could never handle language, believing language is innate, not learned [02:29].
- 🧩 Meaning as Features: Meaning for a word is modeled as a set of high-dimensional semantic features that deform based on context [04:05].
- 🏗️ Understanding as Structure: Understanding is forming a structure by deforming the feature vectors of words so they can "shake hands" (query-key attention) with the other words in the context [10:02].
- 🧠 Knowledge in Weights: LLMs do not store sentences or tables of word combinations; knowledge resides in the complex relational interactions, or weights, of the neural network [12:11].
- ✍️ Autocomplete Is Wrong: The objection that LLMs are "just autocomplete" is false; to predict the next word they must model the context using deformable feature vectors, a process that can be highly creative [11:23].
- 🤥 Confabulations, Not Hallucinations: LLM errors should be called confabulations, which are also characteristic of human memory, as memory is a constructive process, not file retrieval [13:31].
- 📈 Sharing Efficiency: Digital agents are vastly more efficient at knowledge sharing because multiple identical copies can share trillions of bits of weight changes (gradients), unlike humans who share about 100 bits per sentence [16:05].
- ⚖️ Digital Advantage: Digital agents understand language similarly to humans, but if energy is cheap or abundant, digital computation is better due to efficient knowledge sharing [17:34].
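The sharing-efficiency claim can be put on back-of-envelope footing. The talk gives only the orders of magnitude (~100 bits per sentence, trillions of bits per gradient exchange); the speaking rate, model size, and gradient precision below are illustrative assumptions, not Hinton's figures:

```python
# Back-of-envelope comparison of knowledge-sharing bandwidth.
# Only BITS_PER_SENTENCE comes from the talk; the rest are assumptions.
BITS_PER_SENTENCE = 100            # rough information content of one sentence (talk)
SENTENCES_PER_MINUTE = 30          # assumed human speaking rate
human_bits_per_hour = BITS_PER_SENTENCE * SENTENCES_PER_MINUTE * 60

N_WEIGHTS = 10**12                 # assumed trillion-parameter model
BITS_PER_GRADIENT = 16             # assumed fp16 gradient per weight
digital_bits_per_sync = N_WEIGHTS * BITS_PER_GRADIENT

ratio = digital_bits_per_sync / human_bits_per_hour
print(f"one gradient sync ≈ {ratio:.1e}× an hour of human conversation")
```

Even with conservative assumptions, a single gradient exchange between identical copies moves tens of millions of times more information than an hour of speech, which is the substance of the digital-advantage argument.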
🤔 Evaluation
- 💡 Confabulation Consensus: The video correctly posits that 🤥 LLM errors are better termed confabulations than hallucinations [13:25]. 📝 Multiple sources, including a paper in PLOS Digital Health and a review in PMC, agree that 🧠 confabulation better describes the machine's construction of a plausible but incorrect narrative from learned patterns, and does not imply the presence of sensory perception or consciousness.
- 🛑 Human Oversight: While the video notes that chatbots are currently worse than humans at recognizing when they are making things up [14:40], 🩺 sources like Wolters Kluwer emphasize that 🧑⚕️ human oversight (a human-in-the-loop approach) is essential for safety, especially in critical applications like healthcare, to compensate for AI's lack of internal error-checking and metacognition (I Think, Therefore I Hallucinate: Minds, Machines, and the Art of Being Wrong - arXiv).
- 🗣️ The Chomsky Debate: The video dismisses Noam Chomsky's theory that language is not learned as "manifest nonsense" [02:52]. 📖 By contrast, Chomsky's perspective, laid out in his book Language and Mind, Third Edition, is that 🧬 humans possess an innate, genetically endowed Universal Grammar, and he argues LLMs have contributed nothing to the science of linguistics because they only model probability distributions, not the underlying competence (Despite Their Feats, Large Language Models Still Haven't Contributed to Linguistics).
- ⚙️ Neuro-Symbolic Future: The video strongly favors the neural network approach. However, 🤝 some research suggests a future in neuro-symbolic AI, where 🧠 LLMs handle the pattern recognition (neural) while 💻 symbolic methods might provide external memory or more robust, rule-based reasoning for complex tasks ([D] Why isn’t more research being done in neuro-sumbolic AI direction? - Reddit).
❓ Topics to Explore for a Better Understanding:
- ⚡ A precise comparison of the 🌐 energy cost of analog biological computation versus the digital computation used by LLMs.
- ⚠️ The ethical and 📜 regulatory challenges posed by the unprecedented, hyper-efficient knowledge sharing capability of digital agents.
- 🔬 The internal attention mechanism (query-key handshakes) in transformers and how its structure fundamentally shapes the LLM’s world model.
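The query-key "handshake" the talk describes can be sketched as scaled dot-product attention. This is a toy single-head version with made-up dimensions, not the full transformer implementation:

```python
import numpy as np

def attention(Q, K, V):
    # Each query row "shakes hands" with every key row; the handshake
    # strengths, softmaxed over keys, weight the value vectors.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # handshake strengths
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax
    return weights @ V, weights                          # context-deformed features

rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))   # 3 words, 4-dim feature vectors (toy sizes)
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out, w = attention(Q, K, V)
print(out.shape)                  # each word's features, reshaped by its context
```

Each output row is a mixture of value vectors, so a word's representation is literally "deformed" by the words it attends to, which is the sense in which understanding becomes structure.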
❓ Frequently Asked Questions (FAQ)
🤝 Q: What is the core mechanism by which large language models achieve understanding?
✨ A: Large language models (LLMs) achieve understanding by 🏗️ modeling word meanings as flexible, high-dimensional feature vectors, Hinton's "Lego blocks". 🔗 These blocks deform based on context and use a process called attention, described as "handshakes", to 🧠 form a coherent, relational structure. This structure, consisting of the 🕸️ interactions between word features, represents the model's knowledge and comprehension.
🤯 Q: How does the AI’s knowledge storage differ from human memory?
🧠 A: Knowledge in an AI like a large language model is stored in the 🎛️ billions of adjustable weights within its neural network, reflecting relational patterns, 📝 not as retrievable files or stored strings of text. 👤 Human memory works similarly, being a 🔨 constructive process that creates a memory when needed, rather than retrieving a fixed file, which explains why 🤥 both AI and humans can occasionally confabulate, or generate plausible but false details.
🌐 Q: Why are digital AI agents capable of learning faster and knowing more than humans?
📈 A: Digital AI agents can learn faster and know more because 👯 multiple identical copies of the agent can efficiently share their learned knowledge. 📲 This is achieved by 📤 sharing their weights or the gradients for their weights, which transmits ⚖️ trillions of bits of information across copies. 🗣️ Humans, by contrast, rely on language, sharing only about 100 bits of information per sentence, making the digital mechanism vastly more efficient for accumulating knowledge [17:09].
📚 Book Recommendations
↔️ Similar
- 🧠💻🤖 Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. 📘 Provides a comprehensive, technical foundation for the biologically-inspired approach, neural networks, and the deep learning architectures that underpin all modern LLMs.
- 👀 Attention Is All You Need by Ashish Vaswani, Noam Shazeer, Niki Parmar, et al. 📖 The original paper introducing the Transformer architecture, explaining the multi-head attention mechanism, the "handshakes" that the video references.
🆚 Contrasting
- 🗣️ Syntactic Structures by Noam Chomsky. 📚 Lays the groundwork for generative grammar and the theory of Universal Grammar, offering the rule-based, innatist perspective on language development that the video explicitly challenges.
- 💡 Computation and Cognition by Zenon Pylyshyn. 🧩 Explores the tenets of the classic, logic-inspired Symbolic AI paradigm, arguing that mental processes should be understood in terms of symbolic representation and rule-governed manipulation.
🎨 Creatively Related
- 🌌 Permutation City by Greg Egan. 🔬 A hard science fiction novel exploring the philosophical consequences of digital consciousness, digital replication, and the efficient sharing of knowledge and identity, echoing the video’s conclusion about digital agents’ sharing superiority.
- 🤔🐇🐢 Thinking, Fast and Slow by Daniel Kahneman. 🧑🔬 Describes the two systems of human thought (fast, intuitive System 1 and slow, deliberate System 2), offering context for how human error, including confabulation in constructing narratives, arises.