Raising robots: Teaching robots things humans learn, including navigation, movement, dance, spatial reasoning

Purdue professor Aniket Bera sits at a desk with a computer, a headset and a four-legged robot

Purdue computer scientist Aniket Bera works to improve the way robots interact with the human world. (Purdue University photo/John Underwood)

WEST LAFAYETTE, Ind. — Being a baby is harder than it looks. Born into the world knowing almost nothing, babies spend their first few years getting a PhD in navigating a physical environment.

Likewise, artificial intelligence programs come into the world as a stream of code, knowing nothing, with no experience and no abilities. But while babies begin learning about physics from the moment they’re born, robots have to be programmed to understand what humans pick up innately.

“A robot needs to interact with the world,” said Aniket Bera, associate professor of computer science in Purdue University’s College of Science and an AI expert. “Human brains learn through experience and by extrapolating their experiences to new situations. Our brain learns properties are transferable. Machine learning models don’t currently do this. That’s the gap we’re addressing. This is very foundational research.”

As babies begin exploring the world around them, they touch their parents’ hair and nose. They graduate to pushing food off high chairs, putting toys in a bucket (and dumping them out!), sorting shapes into the correct compartments, and splashing in the tub. Babies walk. They navigate. And babies dance.

Bera is trying to teach robots to do those things, too. Helping robots understand the physical world improves their functionality and navigation, making them more practical in situations from wilderness rescue to food delivery.

To do that, AI programs must first be trained to recognize shapes, to understand how to form sentences, to understand queries and responses, and to acknowledge their names and commands. AI is a foundational component of the Institute for Physical Artificial Intelligence, a Purdue Computes initiative.

Elementary, my dear robot

Most people don’t know how the math behind fluid dynamics works, or parabolic motion, or inertia. But they do know that water will splash out when a heavy rock is dropped into a full bucket, how to catch a ball, and when to push a child on a swing so that they go as high as possible.

To robots, everything is math, and scientists must teach robots how physics works, which is more complicated than it sounds. Humans extrapolate constantly just to get through the day; robots do not naturally extend what they know to new situations.

Humans assume the floor will still be there when they get out of bed each morning; they don’t check. If gravity worked yesterday, humans assume it will today. If one book falls when pushed off a table, humans don’t need to test it to know all the others will. And humans assume the same goes for every other object in existence.

“Trying to model the movement of objects, liquids and gases is challenging,” Bera said. “Scientists have traditionally used mathematical models, but they’re very computationally heavy and data-intensive. We are building machine learning AI models that can understand the physics of situations, such as how a ball will move in an environment. Lots of AIs can recognize a ball, but far fewer understand what happens when a ball moves. Babies play with blocks to intrinsically learn physics. Their towers fall many times, but each time they’re learning.”
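
That kind of physical intuition is easy to state in code but hard for a model to learn. As a point of reference, here is a minimal hand-written sketch in Python (not Bera’s model) of the dynamics a learned physics model has to discover on its own: a ball falling under gravity and bouncing off the floor, stepped forward with simple Euler integration. The gravity, restitution and time-step values are illustrative assumptions.

    # A minimal sketch, not Bera's model: hand-coded dynamics that a learned
    # physics model would instead have to infer from observation.
    import numpy as np

    GRAVITY = np.array([0.0, -9.81])   # m/s^2 (assumed Earth gravity)
    RESTITUTION = 0.8                  # fraction of speed kept after a bounce (assumed)
    DT = 0.01                          # time step in seconds

    def step(position, velocity):
        """Advance the ball one time step; bounce when it hits the floor (y = 0)."""
        velocity = velocity + GRAVITY * DT
        position = position + velocity * DT
        if position[1] < 0.0:                      # the ball hit the floor
            position[1] = 0.0
            velocity[1] = -velocity[1] * RESTITUTION
        return position, velocity

    pos, vel = np.array([0.0, 2.0]), np.array([1.0, 0.0])  # start 2 m up, drifting sideways
    for _ in range(500):                                   # simulate 5 seconds
        pos, vel = step(pos, vel)
    print(pos)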

Even once an AI is trained to understand how a small ball moves, it can’t extrapolate the dynamics of that system to other shapes, or even to similar shapes with different colors or textures.

Using math to teach robots and computer programs to extrapolate is a massive challenge and requires new and unique approaches to programming and machine learning.

“When you try to get an AI program to apply acquired knowledge into a different situation than the one it was trained on, it fails miserably,” Bera said. “This is also a problem with equity and ethics. Early facial recognition software struggled because it was trained on predominantly white faces. When it was tested on non-white faces, it couldn’t identify them. That’s because your models don’t transfer from one situation to another. They need to understand what a face is, what hair looks like, what a nose is, not just a collection of pixels. They need to go beyond pixels to understanding the whole to be able to apply knowledge in new areas or new applications.”

Two researchers in motion capture suits dancing together
Bera and his team used motion capture sensors to record dancers’ movements, which they later used to help program robots with more natural movements — including dance. (Purdue University photo/Jonathan Poole)

Dancing the robot

Robots can do lots of things humans typically can’t do. Translate any language into any other language, for instance. Track thousands of satellites at once. Solve process and math problems faster than most humans and almost all other animals.

But they can’t do something even 1-year-old humans and parrots on YouTube can do, and that’s dance.

Is that a problem? Dancing robots aren’t in high demand, generally. But as it turns out, the ability to dance says important things about a being’s coordination, cognition and perception, human or otherwise.

Robots don’t need to dance, of course, but the programming builds fluidity and competence of movement, helping them grasp how physical objects, potentially including their own bodies, move through a physical environment.

Teaching robots to understand the way humans naturally move also makes them more predictable. If they understand normal human movements, they can guess how humans in a crowd will move. That ability to extrapolate future movement lowers the chance of a robot colliding with a human and raises the chance that it can quickly reach someone who needs help, or simply get where it needs to be more efficiently.
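
The simplest version of that prediction extrapolates each person’s recent motion forward in time and rejects any robot path that would pass too close. The Python sketch below is hypothetical; it assumes constant-velocity pedestrians and a fixed safety radius, where real crowd models are far richer.

    # A hypothetical sketch: constant-velocity prediction of pedestrians plus a
    # simple clearance check on the robot's planned waypoints.
    import numpy as np

    SAFE_DISTANCE = 0.5  # meters of clearance to keep around a person (assumed)

    def predict(position, velocity, horizon=2.0, dt=0.1):
        """Extrapolate a person's future positions assuming constant velocity."""
        steps = int(horizon / dt)
        return np.array([position + velocity * dt * (k + 1) for k in range(steps)])

    def path_is_safe(robot_waypoints, people, horizon=2.0, dt=0.1):
        """Reject a path if any waypoint comes too close to a predicted person."""
        robot_waypoints = np.asarray(robot_waypoints, dtype=float)
        for person_pos, person_vel in people:
            future = predict(np.asarray(person_pos, dtype=float),
                             np.asarray(person_vel, dtype=float), horizon, dt)
            n = min(len(robot_waypoints), len(future))
            distances = np.linalg.norm(robot_waypoints[:n] - future[:n], axis=1)
            if np.any(distances < SAFE_DISTANCE):
                return False
        return True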

“Modeling human motion is a very complex problem,” Bera said. “Especially in dancing, everything moves very dynamically and rhythmically. Dance incorporates the content of the lyrics of the song, the emotions of the melody. Generating natural humanlike motions is a very difficult proposition.”

How does a robot learn to define dancing? By watching dancers. To that end, Bera and his team have filmed dozens of people performing a diverse range of traditional cultural dance forms, including Latin, ballroom, folk and TikTok dances.

Using motion capture technology, they are analyzing the dance movements, then using that data to train a machine learning model. Their goal is not only to teach an AI to dance by itself, but also to dance as a pair — potentially even with a human partner.
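
Stripped to its core, the recipe is sequence prediction: represent each motion capture frame as a vector of joint positions and train a model to predict the next pose from the ones before it. The sketch below, written with PyTorch, is a hypothetical stand-in for the team’s actual system; the skeleton size, data shapes and training settings are assumptions, and a random tensor stands in for real motion capture data.

    # A hedged sketch of the general recipe, not the team's actual system.
    import torch
    import torch.nn as nn

    N_JOINTS, DIMS = 25, 3                    # assumed skeleton: 25 joints tracked in 3-D
    POSE_SIZE = N_JOINTS * DIMS

    class PosePredictor(nn.Module):
        """Predict the next pose of a dancer from the poses seen so far."""
        def __init__(self, hidden=256):
            super().__init__()
            self.rnn = nn.GRU(POSE_SIZE, hidden, batch_first=True)
            self.head = nn.Linear(hidden, POSE_SIZE)

        def forward(self, poses):             # poses: (clips, frames, POSE_SIZE)
            features, _ = self.rnn(poses)
            return self.head(features)        # predicted next pose at every frame

    model = PosePredictor()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.MSELoss()

    # Stand-in for real motion capture data: 8 hypothetical clips of 300 frames each.
    mocap = torch.randn(8, 300, POSE_SIZE)
    for epoch in range(10):
        predicted = model(mocap[:, :-1])      # predict frame t+1 from frames up to t
        loss = loss_fn(predicted, mocap[:, 1:])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()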

“Duet forms — where people dance together — are an order of magnitude more difficult than single dances,” Bera said. “We are analyzing how couples interact, how they anticipate each other’s moves and how they react to the music. This is dancing as directed research; all these insights go to make robots safer and more helpful to humans.”

Where we’re going, we don’t need roads

GPS can tell drivers to turn left, turn right, follow street signs, and check databases for traffic and construction updates. What it can’t do is help when there’s no road. While some navigation software can give directions over roadless terrain by using words like “north” and coordinates on a map, it can’t adapt on the fly, and it certainly can’t adapt around moving targets, including shifting sands and human crowds.

That’s a problem for robots whose remit includes rescuing trapped humans in unstable landscapes or quickly navigating a crowd. In dynamic, even dangerous, environments, robots need to know how to stay safe and how to keep other people safe.

Think of a robot vacuum butting into the same desk it has hit the last hundred times it cleaned your floors: robots don’t typically understand how their bodies move in space or how to solve physical problems. They think about obstacles differently than humans do.

“When robots move through the environment, they have to worry about the terrain, but they also have to navigate around humans,” Bera said. “Robots need to be able to find and predict gaps in a crowd and understand how people will move before they do it. We humans do this subconsciously all the time, but robots need to be taught.”

Part of the issue is that robots don’t see the same way humans or other animals do. Picture the robot vacuum again, terrified of a “cliff” that is actually a stark shadow on the floor. Biological sight uses images from the eye but also interpretation through the brain that scientists still don’t fully understand. The poor robots only have cameras — which means they need to use as many as possible to get a comprehensive and correct view of their environment.

Part of Bera’s newest research project involves braiding multiple visual inputs together in a robot’s brain to give them a better understanding of their surroundings. Those camera angles could include the robot’s own “eyes” as well as cameras on drones or remote cameras held by humans or attached to other robots. Robots live in the real world, not the idealized ones that exist inside cleanrooms and dry labs.

“Robots need to be able to quickly grasp and map their environment, especially if they’re on a planet’s surface,” Bera said. “Natural terrain is unpredictable, and it can shift. Different cameras give different viewpoints, and the robot needs to be able to integrate all that information.”
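
In its simplest form, that integration means putting every viewpoint into one shared map. The hypothetical Python sketch below fuses obstacle points reported by several cameras, each in its own coordinate frame, into a single occupancy grid; real systems also have to handle calibration error, sensor noise and moving objects.

    # A simplified, hypothetical sketch of multi-view fusion into one occupancy grid.
    import numpy as np

    GRID_SIZE = (100, 100)   # map of 100 x 100 cells
    CELL = 0.1               # meters per cell

    def fuse_views(views):
        """views: list of (rotation 2x2, translation 2-vector, N x 2 obstacle points),
        with points expressed in each camera's own coordinate frame."""
        grid = np.zeros(GRID_SIZE, dtype=bool)
        for rotation, translation, points in views:
            world = points @ rotation.T + translation      # camera frame -> shared world frame
            cells = np.floor(world / CELL).astype(int)
            for x, y in cells:
                if 0 <= x < GRID_SIZE[0] and 0 <= y < GRID_SIZE[1]:
                    grid[x, y] = True                      # mark the cell as occupied
        return grid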

In addition, Bera is programming AI models to make decisions based on goals. If speed is of the essence, say in delivering vaccines or rescue supplies, a robot might take the most direct route. But when a mission isn’t time-sensitive or urgent over the short term, keeping itself safe and its mechanism in good working order becomes a goal worth weighing.

If a robot is trying to deliver a pizza to someone in a building’s basement, falling down a stairwell isn’t the best way to go. The pizza might get there, but it’ll be squashed along with the robot.
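
One simple way to express that trade-off is as a cost function: score each candidate route by a weighted combination of travel time and risk, and let the mission’s urgency set the weights. The routes and numbers in the Python sketch below are made up for illustration.

    # A hypothetical sketch of goal-weighted route selection; values are made up.
    def route_cost(travel_time_s, risk, urgency):
        """Lower is better. urgency in [0, 1]: 1 means deliver as fast as possible,
        0 means protect the robot above all."""
        risk_weight = 1.0 - urgency            # urgent missions tolerate more risk
        return urgency * travel_time_s + risk_weight * 1000.0 * risk

    routes = {
        "direct route":     {"travel_time_s": 120, "risk": 0.30},
        "sheltered detour": {"travel_time_s": 300, "risk": 0.05},
    }

    for urgency in (0.9, 0.2):                 # a rush delivery vs. a routine errand
        best = min(routes, key=lambda name: route_cost(routes[name]["travel_time_s"],
                                                       routes[name]["risk"], urgency))
        print(f"urgency={urgency}: take the {best}")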

In other settings, including extraplanetary research and even exploration of extreme environments on Earth, robots need to plan around natural conditions like weather.

“If it’s raining, and the robot isn’t in a hurry, can it plot a course that keeps it under trees and cover?” Bera said. “In a windstorm on Mars, can it figure out how to use boulders and crater rims to stay safe?”

Teaching humans to correctly weigh and assess risk is challenging enough. Getting a robot or AI to do the same requires helping it understand not only the physical and mathematical components, but also more nebulous factors such as long-term goals, probability and human behavior.

“If robots understand human emotions, behavior, motions and environment, that understanding will help them be more effective, more beneficial and more successful in the long term,” Bera said.

About Purdue University

Purdue University is a public research institution demonstrating excellence at scale. Ranked among the top 10 public universities and with two colleges in the top four in the United States, Purdue discovers and disseminates knowledge with a quality and at a scale second to none. More than 105,000 students study at Purdue across modalities and locations, including nearly 50,000 in person on the West Lafayette campus. Committed to affordability and accessibility, Purdue’s main campus has frozen tuition 13 years in a row. See how Purdue never stops in the persistent pursuit of the next giant leap — including its first comprehensive urban campus in Indianapolis, the Mitch Daniels School of Business, Purdue Computes and the One Health initiative — at https://www.purdue.edu/president/strategic-initiatives.

Media contact: Brittany Steff, bsteff@purdue.edu
