All laws of physics can be expressed mathematically as relationships between state variables. A set of state variables gives a complete and non-redundant description of the system at hand. Yet the process of identifying hidden state variables has resisted automation, despite the wide availability of processing power and artificial intelligence.

Most data-driven methods for modeling physical phenomena still rely on the assumption that the relevant state variables are already known. A longstanding question is whether state variables can be identified from high-dimensional observational data alone.

Scientists at Columbia Engineering have proposed a principle for determining how many state variables an observed system is likely to have and what these variables might be. They designed a new AI program that observes physical phenomena through a video camera and then searches for a minimal set of fundamental variables that fully describe the observed dynamics.
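As a rough illustration of what "counting state variables" can mean in practice, the sketch below estimates the intrinsic dimension of high-dimensional data with a generic nearest-neighbor maximum-likelihood estimator (Levina and Bickel). It is not presented here as the researchers' implementation. The function name `mle_intrinsic_dimension`, the neighbor count `k`, and the synthetic "two hidden angles" system are all assumptions made for this example.

```python
import numpy as np


def mle_intrinsic_dimension(points, k=10):
    """Levina-Bickel maximum-likelihood estimate of intrinsic dimension.

    `points` is an (n_samples, n_features) array. The name and the default
    k are illustrative choices, not taken from the article.
    """
    # Pairwise Euclidean distances via the Gram-matrix identity.
    sq_norms = np.sum(points ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (points @ points.T)
    dists = np.sqrt(np.maximum(sq_dists, 0.0))

    # Exclude each point from its own neighbor list.
    np.fill_diagonal(dists, np.inf)

    # Distances to the k nearest neighbors, sorted ascending per row.
    knn = np.sort(dists, axis=1)[:, :k]

    # Log ratio of the k-th neighbor distance to each nearer neighbor.
    log_ratios = np.log(knn[:, -1:] / knn[:, :-1])

    # Per-point estimate, then the average over all points.
    per_point = (k - 1) / np.sum(log_ratios, axis=1)
    return float(np.mean(per_point))


# Hypothetical system: two hidden angles observed through 50 mixed channels.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=(800, 2))
features = np.column_stack([np.cos(angles), np.sin(angles)])  # shape (800, 4)
projection = rng.normal(size=(4, 50))                         # random mixing
observations = features @ projection                          # shape (800, 50)

estimate = mle_intrinsic_dimension(observations, k=10)
print(f"estimated intrinsic dimension: {estimate:.2f}")
```

Although each observation has 50 numbers, the estimate should land near two, the number of hidden angles that generated the data. Estimators of this kind typically return non-integer values, so the result is read as an approximate count.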

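The researchers first tested the program on systems whose answer they already knew, such as a double pendulum, which has four state variables: the angle and angular velocity of each of its two arms. The program's estimate of the number of variables came close to that figure.

Hod Lipson, director of the Creative Machines Lab in the Department of Mechanical Engineering, said, “We thought this answer was close enough. Especially since all the AI had access to was raw video footage, without any knowledge of physics or geometry. But we wanted to know the variables, not just their number.”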

The scientists then set out to visualize the actual variables that the program had identified. Because the program cannot express the variables in any intuitive language accessible to humans, extracting them was challenging. After considerable investigation, it turned out that two of the variables the program selected matched the angles of the arms, but the other two remain unidentified.

Boyuan Chen Ph.D. ’22, an assistant professor at Duke University, said, “We tried correlating the other variables with anything and everything we could think of: angular and linear velocities, kinetic and potential energy, and combinations of known quantities. But nothing seemed to match perfectly. We were confident that the AI had found a good set of four variables since it was making good predictions, but we don’t yet understand the mathematical language it is speaking.”

After validating the approach on several other physical systems with known solutions, the researchers fed the program videos of systems for which they did not know the explicit answer. The first videos featured an “air dancer” undulating in front of a local used car lot. After a few hours of analysis, the program returned eight variables. A video of a lava lamp also produced eight variables. They then fed it a video clip of flames from a holiday fireplace loop, and the program returned 24 variables.

Lipson said, “I always wondered if we ever met an intelligent alien race, would they have discovered the same physics laws as we have, or might they describe the universe differently?”

“Perhaps some phenomena seem enigmatically complex because we are trying to understand them using the wrong set of variables. In the experiments, the number of variables was the same each time the AI restarted, but the specific variables differed. So yes, there are alternative ways to describe the universe, and it is quite possible that our choices aren’t perfect.”

This type of AI can help scientists unravel complex phenomena for which theoretical understanding is not keeping pace with the deluge of data—areas ranging from biology to cosmology.

Kuang Huang Ph.D. ’22, who co-authored the paper, said, “While we used video data in this work, any array data source could be used—radar arrays, or DNA arrays, for example.”

Lipson argues that “scientists may be misinterpreting or failing to understand many phenomena simply because they don’t have a good set of variables to describe them.”

“For millennia, people knew about objects moving quickly or slowly, but it was only when the notion of velocity and acceleration was formally quantified that Newton could discover his famous law of motion F=MA.”

“Variables describing temperature and pressure needed to be identified before laws of thermodynamics could be formalized, and so on for every corner of the scientific world. The variables are a precursor to any theory.”

Journal Reference:

  1. Chen, B., Huang, K., Raghupathi, S. et al. Automated discovery of fundamental variables hidden in experimental data. Nat Comput Sci 2, 433–442 (2022). DOI: 10.1038/s43588-022-00281-6