Managing The Limitations Of AI In Physics
In the halls of learning, in conference rooms and government offices, wherever people meet to talk about the real advancements in AI, there’s a growing concession, and a debate about the way forward, that centers on one particular deficit: AI really doesn’t understand real-world physics very well.
“Artificial intelligence can beat world champions at chess, generate stunning artwork, and write code that would take humans days to complete,” writes Dr. Tehseen Zia at Unite.ai. “Yet when it comes to understanding why a ball falls down instead of up, or predicting what happens when you push a glass off a table, AI systems often struggle in ways that would surprise a young child. This gap between AI’s computational prowess and its inability to understand basic physical intuition reveals key limitations about [the] current form of artificial intelligence. While AI excels at pattern matching and statistical analysis, it lacks a deep understanding of the physical world that humans develop naturally from birth.”
It’s one thing to plug a bunch of equations into an LLM or neural net and have it conceptually understand them; it’s quite another to have the model know what a ball or a pin or some other object is going to do in three-dimensional space. Why does AI need to know this? To render it in video, mainly, but also to fully evaluate real-world scenarios. A fuller understanding of those scenarios needs to be table stakes.
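To make that distinction concrete: predicting what a ball will do is, at bottom, the kind of computation a game or video engine performs explicitly, and a world model must learn implicitly. Below is a minimal sketch of a single physics step in Python; the function names and values are purely illustrative, not drawn from any production engine.

```python
GRAVITY = -9.81  # m/s^2, acceleration along the z-axis

def step(position, velocity, dt=0.01):
    """Advance a falling ball one time step with simple Euler integration."""
    x, y, z = position
    vx, vy, vz = velocity
    vz += GRAVITY * dt  # gravity changes only the vertical velocity
    return (x + vx * dt, y + vy * dt, z + vz * dt), (vx, vy, vz)

# Drop a ball from 2 m with some sideways motion and watch it reach the floor.
pos, vel = (0.0, 0.0, 2.0), (1.0, 0.0, 0.0)
while pos[2] > 0.0:
    pos, vel = step(pos, vel)
print(f"lands near x = {pos[0]:.2f} m")  # roughly 0.64 m downrange
```

An engine grinds through thousands of these updates per second; the open question is whether a model trained only on text and pixels ever internalizes the equivalent rule.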
Thinking About the Way Forward: Notes from a Boston Conference
At our Imagination in Action event April 9–10, we had a panel where my colleague Daniela Rus, director of MIT CSAIL, spoke with Aleksander Madry of OpenAI about these issues and how to move the ball forward, so to speak.
In explaining what AI’s “real powers” are when it comes to physics, Madry pointed to a “jaggedness” in what AI understands, an unevenness in capability, and described how that discrepancy affects the results the technology produces.
The way to understand the difference between the equations and the interactions, he suggested, is to ask: what is the data, and what is the objective?
Madry also had this interesting thought on human reactions to AI results:
“When the model kind of comes up with some interesting thoughts, then we say, ‘okay, the model is very smart,’” he said. “When the model comes up with interesting, wrong thoughts, then we say, ‘oh, it's hallucinating.’”
“If you think about how large language models learn, they essentially learn from correlations,” he said. “They just essentially learn that, okay, if factor x appears in text, and factor y appears next to it, the next time you say ‘x,’ it will say, ‘Oh, ‘y’ should happen.’ So this is kind of learning by correlation. This is how, essentially, most of the mainstream AI works right now at scale.”
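To make “learning by correlation” concrete, here is a toy-scale sketch of the idea: a bigram model that predicts the next token purely from co-occurrence counts. Production LLMs use transformers with billions of parameters rather than a lookup table, so this is an illustration of the statistical spirit Madry describes, not OpenAI’s method.

```python
from collections import Counter, defaultdict

def train_bigram(corpus: str):
    """Count how often each token is immediately followed by each other token."""
    tokens = corpus.split()
    counts = defaultdict(Counter)
    for x, y in zip(tokens, tokens[1:]):
        counts[x][y] += 1
    return counts

def predict_next(counts, token):
    """Return the most frequent follower of `token`: pure correlation,
    with no notion of why the pairing holds."""
    followers = counts.get(token)
    return followers.most_common(1)[0][0] if followers else None

model = train_bigram("the ball falls down the ball falls down the ball bounces")
print(predict_next(model, "falls"))  # -> 'down', simply because it co-occurred most
```

The model answers “down” not because it grasps gravity, but because “down” followed “falls” most often in its data, which is exactly the failure surface Madry goes on to describe.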
The Achilles’ Heel of Correlations
This method, Madry argued, leads to some profound limitations and problems. One, he explained, is that you can trick the model.
“You actually can fool the model into seeing something that is not there,” Madry said, “and in some sense, exactly like what these adversarial examples are doing, they're showing errors of modeling, saying, the way the model learns to solve some tasks is not the way we, as humans, solve them. We look at different patterns, and we get triggered by different features, and you are kind of using this duality.”
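Madry’s academic work at MIT focused heavily on adversarial robustness, and the canonical illustration of the trick he describes is a gradient-based perturbation. The sketch below uses the fast gradient sign method (Goodfellow et al., 2015) in PyTorch, a generic instance of the technique rather than any specific attack discussed on the panel.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, labels, eps=0.03):
    """Fast gradient sign method: nudge every input dimension slightly in
    whichever direction most increases the model's loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), labels)
    loss.backward()
    # The perturbation is tiny (eps) and usually invisible to a person, yet it
    # can flip the prediction, because the model keys on brittle statistical
    # features rather than the patterns humans use.
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

Given any differentiable image classifier, `fgsm_attack(classifier, images, labels)` returns inputs that look unchanged to a human but are routinely misread by the model, which is precisely the mismatch Madry raises next.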
That difference raises questions, again, about how humans interact with the technology.
“The model does not solve a task the way we do,” he continued. “So when it fails, it may fail differently than we, as humans, do. How do you catch this misalignment, in some sense, between how a model solves the task, and how a human solves this task, so the LLMs are more predictable, also, in failing?”
Later, Rus asked Madry about progress in LLMs, applied to robotics, framing it this way:
“In robotics, we often have this debate,” she said, “should you actually use a big AI model in order to get the robot to do a task that can be solved with a simple equation from first principles? And so my position is that if we have a simple solution, we should apply the simple solution. But there are tasks that cannot be modeled from first principles, and there are also tasks where adaptation in future settings is important. So these are the areas where it's good to have AI, and the simulation, and in particular, the sim-to-real packages are getting extremely good, much, much better than they were back in the day.”
“I would not take everything at face value,” Madry said in response, citing the example of GPT trying to cover the full distribution of some real-world task. “It's quite a long tail. So these kinds of things will still not be exactly right at the edges.”
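Rus’s point about first-principles solutions is easy to ground. Where a learned model would need many examples to predict where a thrown ball lands, a few lines of kinematics give an exact answer; here is a minimal sketch (ignoring air drag, with illustrative names and values):

```python
import math

G = 9.81  # m/s^2, standard gravity

def landing_distance(speed: float, angle_deg: float, height: float = 0.0) -> float:
    """Horizontal range of a projectile from first principles (no drag)."""
    theta = math.radians(angle_deg)
    vx, vy = speed * math.cos(theta), speed * math.sin(theta)
    # Time of flight: solve height + vy*t - 0.5*G*t^2 = 0 for t > 0.
    t = (vy + math.sqrt(vy**2 + 2 * G * height)) / G
    return vx * t

print(f"{landing_distance(10.0, 45.0):.2f} m")  # ~10.19 m for a level launch
```

This is the kind of task where, as Rus argues, the simple solution should win; the long tail Madry warns about lives in the scenarios no such closed-form equation covers.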
After a lot more discussion of AI in education, AI in fields like astrophysics, and considering whether science will be “ported” to AI, the two discussed the current state of scaling.
“There is a lot of innovation happening that enables scaling,” Madry said. “Reasoning models were definitely a step change. Honestly, when I'm hiring people at OpenAI, I'm just telling them, ‘by the way, much of your work will be engineering.’ It’s not just about having brilliant thoughts, it's also about executing them. And that's at scale. It takes a lot of grit and a lot of wrangling.”
They also discussed jobs, as many do when they try to imagine even the near future.
“I don't want to say that we all will be out of jobs, but it really means that other jobs might be different,” Madry said. “What do we do to make sure this transition happens? As you can imagine, there is a lot of angst about this.”
His takeaway seemed fairly positive about the prospects of working out that transition for fields like computer science, which will be important in imagining our shared world.