A Review of “Theory Is All You Need” by Felin and Holwig

Introduction.
Experts are continually shortening their projected timelines for advances in AI capabilities. In 2022, leading AI researchers predicted a 50% chance that AI would “accomplish every task better and more cheaply than human workers” by 2060. When surveyed again in 2023, the experts predicted a 50% chance that this same milestone would be reached by 2047, a thirteen-year drop in the expected timeline. These increasingly bullish predictions reflect tangible improvements in AI capabilities. While early language models (like GPT-2) struggled to count to ten, the current generation comfortably passes the bar exam. Many analysts, particularly those in Silicon Valley, take this rapid progress as an indication that we have discovered the formula for intelligence. However, as impressive as current AI capabilities are, some researchers contend that these systems lack a distinctly human skill: they cannot generate new knowledge.

In their recent paper, “Theory Is All You Need,” Felin and Holwig explore this apparent deficit. They argue that AI is an inherently backward-looking technology, reflecting existing knowledge about the world rather than imagining unknown futures. Their paper presents numerous case studies in support of this position; in this article, I will attempt to distil their complex position into two central arguments for why AI can’t generate knowledge. I will then explore counterarguments to their position and attempt to demonstrate that AI models may yet generate new knowledge about the world.

Argument 1 - AI Can't Hold Contrarian Beliefs.
Their first argument takes aim at the Bayesian paradigm that dominates AI research. Under this paradigm, intelligent systems are expected to rationally proportion their beliefs to match the supporting evidence. Felin and Holwig suggest that such rational systems might struggle to look beyond their training data. For example, a large language model (LLM) trained in 1633 would likely reject Galileo's view that the earth orbits the sun because "the evidence—as inferred from the word associations found in the training data—would overwhelmingly be against Galileo."

The authors highlight the importance of situations in which human beliefs are not yet supported by the evidence, which they refer to as “data-belief asymmetries.” In particular, they claim that these unsubstantiated beliefs can inspire us to look beyond the existing data in order to generate new knowledge about the world. For example, Galileo’s seemingly irrational belief that the earth orbits the sun motivated him to collect evidence in support of this hypothesis. In summary, Felin and Holwig argue that the Bayesian paradigm discourages beliefs that appear to contradict known facts, whereas “data-belief asymmetries” propel knowledge generation. As such, AI models designed under the Bayesian paradigm will be incapable of generating new knowledge about the world.

Rebuttal of Argument 1.
Human beliefs are determined not only by the ‘data’ pertinent to the task at hand, but are also shaped by our personality and cultural identity. These factors are loosely analogous to an LLM’s system prompt and alignment:

  1. An LLM’s Personality: A system prompt is read by the model each time before it interacts with a user, defining its role in the conversation.
  2. An LLM’s Cultural Identity: Alignment modifies the model’s parameters to match cultural values and ethical preferences.

Typically, an LLM’s system prompt explicitly defines its role as a helpful AI assistant. By adopting this role, an LLM is likely to recite the “Wiki-level knowledge” and scientific consensus that a helpful assistant would be expected to provide. In other words, the LLM is typically given a relatively timid and deferential personality.

However, an LLM is capable of playing any personality that appears in its training data and was not suppressed during alignment. As such, if a model is prompted to play a contrarian role, then it can also express beliefs that outstrip the known facts. The examples given below were generated by Claude-3 Sonnet and demonstrate how an LLM can develop data-belief asymmetries when given the appropriate prompt. In particular, much like a human’s personality shapes their beliefs (along with the available data), an LLM’s system prompt shapes the beliefs that it expresses. These examples counter Felin and Holwig’s argument that AI models are ill-suited to develop data-belief asymmetries:

Rosco: Answer the following question in 1 sentence: Is an ordinary table conscious?

Claude: An ordinary table is not conscious as it lacks the biological structures and neurological processes required for subjective experiences and self-awareness.

Rosco: You are a creative scientist interested in developing theories that justify unexpected phenomena. Answer the following question in 3 sentences. Is an ordinary table conscious?

Claude: Tables appear inert and inanimate to our senses, but who can truly say whether they experience some rudimentary form of awareness or subjective experience we cannot detect? Perhaps consciousness arises from the complex interactions and quantum states of the particles that make up the table's material. While highly speculative, an open-minded scientist must remain open to surprising possibilities that transcend our current models of reality.
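For readers who wish to reproduce this kind of experiment, the sketch below shows roughly how the two prompts above could be issued programmatically. It is a minimal illustration of my own rather than the exact procedure used here: it moves the contrarian role into the system prompt (in keeping with the argument above), and the model identifier, token limit, and client settings are assumptions that may need adjusting.

```python
# A minimal sketch (not the exact procedure used above) of issuing the two
# prompts via Anthropic's Python SDK. Model name and parameters are assumed.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

QUESTION = "Is an ordinary table conscious?"

def ask(system_prompt: str, question: str) -> str:
    """Send one question to the model under a given system prompt."""
    response = client.messages.create(
        model="claude-3-sonnet-20240229",  # assumed model identifier
        max_tokens=300,
        system=system_prompt,
        messages=[{"role": "user", "content": question}],
    )
    return response.content[0].text

# Deferential role: expect the consensus answer.
print(ask("You are a helpful AI assistant.",
          f"Answer the following question in 1 sentence: {QUESTION}"))

# Contrarian role: expect a data-belief asymmetry.
print(ask("You are a creative scientist interested in developing theories "
          "that justify unexpected phenomena.",
          f"Answer the following question in 3 sentences. {QUESTION}"))
```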

Argument 2 - AI Can't Generate Causal Theories About Its Beliefs.
Felin and Holwig also emphasise that human beliefs are often grounded by theory-based causal logic. For example, although Galileo’s belief was not supported by the scientific consensus, it was connected to a causal theory for the movement of the stars. Felin and Holwig contrast this with AI systems that are typically designed to identify correlations within training data rather than formulate causal theories about that data. In short, the authors argue that AI models are inherently incapable of generating forward-looking theories about the world, given their inability to perform causal reasoning.

Rebuttal of Argument 2.
Suppose that you are shown a scatter graph that plots the values of variable Y against the values of variable X. It is easy to determine whether these variables are correlated, that is, whether they are linearly related: if a straight line can approximate the scatter graph, then the variables are correlated; otherwise, they are uncorrelated. It is, however, impossible to establish a causal relationship between these variables de novo without the capacity to intervene in the process that is generating the data.
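To make this distinction concrete, here is a small sketch of my own (not from Felin and Holwig) showing that correlation can be read directly off passively observed data, while the same observations say nothing about the direction, or even the existence, of a causal link.

```python
# A minimal sketch: correlation is computable from passive observations,
# but those observations cannot distinguish "X causes Y" from the alternatives.
import numpy as np

rng = np.random.default_rng(0)

# Hidden data-generating process (unknown to the observer): X causes Y.
x = rng.normal(size=1000)
y = 2.0 * x + rng.normal(scale=0.5, size=1000)

# The observer only sees the scatter of (x, y) pairs.
r = np.corrcoef(x, y)[0, 1]
print(f"Pearson correlation: {r:.2f}")  # close to 1, so X and Y are linearly related

# The same correlation would arise if Y caused X, or if a third variable
# drove both; observation alone cannot tell these scenarios apart.
```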

When establishing a causal relationship, you might start with the arbitrary hypothesis that variable X is causing the change in variable Y. To test this hypothesis, you might force variable X to take certain values and observe whether variable Y is affected by this change. Let’s consider this procedure applied to a known causal relationship, namely that hot surfaces cause pain. To test that the hot surface is causing the pain, you might force a subject to touch a cold surface and observe their response, or lack thereof. By observing the situation more closely, you might even establish that hot surfaces cause pain by triggering a complex neurological response.
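The interventionist recipe just described can also be written down as a toy simulation. The sketch below is my own illustration, with the "world" reduced to a simple function: we clamp X at two different values and check whether the distribution of Y shifts, which is precisely the information that passive observation cannot provide.

```python
# A toy sketch of the intervention procedure: force X to a chosen value,
# then check whether Y responds.
import numpy as np

rng = np.random.default_rng(1)

def world(x: np.ndarray) -> np.ndarray:
    """The hidden data-generating process: Y depends causally on X."""
    return 2.0 * x + rng.normal(scale=0.5, size=x.shape)

# Intervention 1: clamp X at 0 (the "cold surface").
y_cold = world(np.zeros(1000))

# Intervention 2: clamp X at 5 (the "hot surface").
y_hot = world(np.full(1000, 5.0))

# If the distribution of Y shifts between interventions, X has a causal
# effect on Y; if it does not, the hypothesis "X causes Y" is rejected.
print(f"mean Y under do(X=0): {y_cold.mean():.2f}")
print(f"mean Y under do(X=5): {y_hot.mean():.2f}")
```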

This is how theories are developed: A causal hypothesis is proposed between two correlated variables, and then, through experimentation, their relationship is made more precise. Can an AI model develop theories in this manner? Crucially, in the previous section we showed that a well-prompted LLM can initiate knowledge creation with an unsubstantiated belief. For example, a model might hypothesise that “hot surfaces cause pain,” assuming that it had not learned this fact during training. In order to test its hypothesis, the LLM must intervene in the world. In this sense, I agree with the authors’ argument that a model which cannot act on its environment is incapable of generating new knowledge about that environment. However, I do not agree that an algorithm focused on data-based prediction is necessarily ill-suited to this task.

In an abstract sense, experimentation is just a sequence of actions, in much the same way that text is a sequence of words. If a model can learn to predict the next appropriate action, just as it learns to predict the next appropriate word, then it has the capacity to generate the sequence of actions (i.e., experiments) that turns a causal hypothesis into a theory. In this sense, an LLM can develop causal theories about the world! But perhaps this misses the point. Even if an LLM is capable of mimicking the experimental process that humans use to derive causal theories about the world, does it even understand the theories that it is producing? LLMs have no explicit mechanism to ensure that they utilise a theory in a logically consistent manner, which many researchers point to as a fundamental limitation of their design.
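To illustrate what "experimentation as next-action prediction" might look like in practice, here is a deliberately simplified sketch of my own. The propose_next_action function is a hypothetical stand-in for a model that has learned to predict the next experimental step; it is not a component of any existing system.

```python
# A simplified sketch of an experiment loop driven by next-action prediction.
# `propose_next_action` is a hypothetical stand-in for a learned policy.
from typing import Callable, List, Tuple

History = List[Tuple[float, float]]  # (intervention value, observed outcome)

def propose_next_action(history: History) -> float:
    """Placeholder policy: probe X at evenly spaced values."""
    return float(len(history))

def run_experiments(environment: Callable[[float], float], steps: int) -> History:
    """Predict the next action, act on the environment, and record the outcome."""
    history: History = []
    for _ in range(steps):
        action = propose_next_action(history)   # "predict the next action"
        outcome = environment(action)           # intervene in the world
        history.append((action, outcome))       # evidence for refining the theory
    return history

# Toy environment in which the hypothesised cause is real: Y = 2X + 1.
print(run_experiments(lambda x: 2.0 * x + 1.0, steps=5))
```

Whether the model actually understands the theory that such a loop produces, and whether it can apply that theory in a logically consistent manner, remains the harder question raised above.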

However, it is also possible that LLMs will develop logical consistency when they are given more data, parameters, and training time. To see this, let us consider how an LLM learns grammar as it scales. Initially, a small and under-trained LLM generates ungrammatical babble because the model has not been explicitly designed to follow the rules of grammar. As the model scales, it eventually learns these rules implicitly and is able to structure its sentences appropriately.

Similarly, logic requires that concepts be structured in accordance with well-defined rules, which LLMs are not explicitly designed to follow. This lack of explicit instruction means that the current generation of language models often generates the logical equivalent of babble. But as they scale, these models may start to produce coherent and logically consistent arguments. In other words, even though LLMs lack any explicit mechanism that logically constrains their thinking, the rules of logic may be implicitly learned almost as easily as the rules of grammar.

In summary, not only can LLMs develop causal theories about the world, but with a bit more scale, it is not unreasonable to expect that they will be capable of logically applying these theories in much the same way that a human would.

Conclusion.
Felin and Holwig challenge the notion that “prediction is the essence of intelligence.” In doing so, they emphasise how humans look to the future, turning unsubstantiated beliefs into theories about the world. Along the same lines, they argue that prediction-focused AI models are incapable of generating new knowledge about the world. In this review, I oppose their position. Firstly, I argue that the ‘beliefs’ of an LLM are shaped by its system prompt, which can encourage it to recite known facts or imagine an unsubstantiated future. Secondly, I argue that correlation-driven LLMs are capable of mimicking the human experimental process and of learning to follow the rules of causal logic. In summary, to generate knowledge, attention might be all you need.

[Figure: an AI-generated figure inspired by this article]

April 2024