Grass and hiking trails are no problem for this robot, which learned to walk on the fly thanks to a machine learning algorithm.
26 Aug 2022
A robot dog can learn to walk on unfamiliar and hard-to-master terrain — like grass, bark and trails — in just 20 minutes, thanks to a machine learning algorithm.
Most autonomous robots must be carefully programmed by humans or extensively trained in simulation before they can perform real-world tasks, such as climbing a rocky hill or a slippery slope, and they tend to run into trouble when they encounter unfamiliar environments.
Now, Sergey Levine at the University of California, Berkeley, and colleagues have shown that a robot using a type of machine learning called deep reinforcement learning can figure out how to walk in about 20 minutes in several different environments, such as a lawn, a layer of bark, a memory foam mattress and a hiking trail.
The robot uses an algorithm called Q-learning, which does not require a working model of the target terrain. Such machine learning algorithms are generally used in simulations. “We don’t need to understand how the physics of an environment actually work, we just put the robot in an environment and turn it on,” says Levine.
Instead, the robot receives a reward for each action it performs, based on how successful that action was against predefined goals. It repeats this process continuously, comparing against its previous successes, until it learns to walk.
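The trial-and-error loop described above is the core of Q-learning. The sketch below is a toy, tabular version on a one-dimensional "walk to the goal" task — the Berkeley team's system is a deep, neural-network-based variant running on a real robot, and the environment, states and reward here are purely illustrative:

```python
import random

# Toy tabular Q-learning (illustrative only; the robot uses a deep variant).
# States: positions 0..4 on a line. Goal: reach position 4.
# Actions: 0 = step left, 1 = step right.
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # learning rate, discount, exploration
N_STATES, GOAL = 5, 4
q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action]

def step(state, action):
    """Hypothetical environment: reward 1 only on reaching the goal."""
    nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == GOAL else 0.0)

random.seed(0)
for _ in range(500):                     # episodes of trial and error
    s = 0
    while s != GOAL:
        # Mostly exploit the best-known action, sometimes explore at random.
        if random.random() < EPSILON:
            a = random.randrange(2)
        else:
            a = max((0, 1), key=lambda x: q[s][x])
        s2, r = step(s, a)
        # Q-update: nudge the estimate toward reward + discounted future value.
        q[s][a] += ALPHA * (r + GAMMA * max(q[s2]) - q[s][a])
        s = s2

# After training, the greedy policy steps right from every non-goal state.
policy = [max((0, 1), key=lambda x: q[s][x]) for s in range(N_STATES)]
print(policy)
```

Note that, as Levine says, no model of the environment's physics appears anywhere: the update rule only sees states, actions and rewards from direct interaction.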
“In a sense, it’s very similar to how people learn,” says team member Ilya Kostrikov, also at the University of California, Berkeley. “Interacting with some environment, receiving some utility, and basically just thinking about your past experience and trying to understand what could have been improved.”
Although the robot can learn to walk on each new surface it encounters, Levine says the team would need to fine-tune the model’s reward system for the robot to learn other abilities.
Making deep reinforcement learning work in the real world is hard, says Chris Watkins at Royal Holloway, University of London, because of the number of different variables and data that have to interact at the same time.
“I think it’s very impressive,” says Watkins. “Honestly, I’m a bit surprised that you can use something as simple as Q-learning to learn skills like walking on different surfaces with so little experience and so quickly in real time.”