AI Learns Human Movement From Unorganized Data 🏃‍♀️

Dear Fellow Scholars, this is Two Minute Papers
with Károly Zsolnai-Fehér. Last year, an amazing neural network-based technique appeared that was able to look at a bunch of unlabeled motion data and learn to weave these motions together to control quadrupeds, like this wolf here. It successfully addressed the shortcomings of previous works: for instance, the weird sliding motions were eliminated, and it was also capable of following predefined trajectories. This new paper continues research in this direction by proposing a technique whose characters can also interact with their environment and with each other; for instance, they can punch one another and then recover from the resulting undesirable positions, and more. The problem formulation is as follows: the method is given the current state of the character and a goal, and you see here, in blue, how it predicts the motion that should continue.
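To make the setup a little more concrete, here is a minimal sketch of that prediction loop in Python. Every name here is hypothetical, and the placeholder policy stands in for the paper's trained neural network; this is an illustration of the idea, not the authors' implementation:

```python
import numpy as np

def predict_next_state(policy, state, goal):
    """One step of goal-conditioned motion prediction (illustrative only).

    policy: any callable mapping an input vector to the next character state,
            standing in for the paper's trained neural network
    state:  the character's current pose and velocities, as a flat vector
    goal:   the target the character should reach, as a flat vector
    """
    # The network sees both where the character is and where it should go.
    return policy(np.concatenate([state, goal]))

def rollout(policy, state, goal, n_frames):
    """Apply the predictor frame by frame; this rollout is the motion shown in blue."""
    trajectory = [state]
    for _ in range(n_frames):
        trajectory.append(predict_next_state(policy, trajectory[-1], goal))
    return trajectory
```

Rolling the one-step prediction out over many frames is what produces a full motion from just a state and a goal.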

It understands that we have to walk towards the goal, that we are likely to fall when hit by a ball, and it knows that we then have to get up, continue our journey, and eventually reach our goal. Some amazing life advice from the AI right there. The goal here is also to learn something meaningful from lots of barely labeled human motion data. Barely labeled means that a bunch of videos are given almost as-is, without additional information on what movements are being performed in them. If we had labels for all the data you see here, they would tell us that this sequence shows a jump and that these ones show running. However, the labeling process takes a ton of time and effort, so if we can get away without it, that's glorious; but in return, we create an additional burden that the learning algorithm has to shoulder.

Unfortunately, the problem gets even worse: as you see here, the original dataset contains only a scarce number of frames. To alleviate this, the authors decided to augment the dataset, which means combining parts of this data to squeeze out as much information as possible. You can see some examples here of how a motion can be assembled from many small segments, and in the paper, they show that the augmentation helps create up to 10 to 30 times more training data for the neural networks.
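As a rough picture of what stitching segments together can look like, here is a small sketch that chains clips at frames where the poses roughly line up. The names and the pose-distance test are assumptions for illustration, not the paper's actual augmentation algorithm:

```python
import numpy as np

def stitch_segments(segments, pose_tolerance=0.1):
    """Chain motion segments whose endpoints roughly match (illustrative only).

    segments: list of motion clips, each an array of shape (n_frames, pose_dim)
    Returns one longer clip built from compatible segments, showing how many
    new training sequences can be squeezed out of relatively few frames.
    """
    stitched = [segments[0]]
    for segment in segments[1:]:
        last_pose = stitched[-1][-1]
        # Only append a segment if its first pose is close to where we left
        # off, so the combined motion stays plausible.
        if np.linalg.norm(segment[0] - last_pose) < pose_tolerance:
            stitched.append(segment[1:])  # drop the duplicated boundary frame
    return np.concatenate(stitched, axis=0)
```

Combining many small segments in different orders like this is what multiplies the amount of usable training data.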
As a result of this augmented dataset, the characters can learn to perform zombie and gorilla movements, chicken hopping, even dribbling a basketball, you name it. What's more, we can give the AI high-level commands interactively, and it will try to weave the motions together appropriately. They can also punch each other. Ow. And all this was learned from a bunch of unorganized
data. What a time to be alive! Thanks for watching and for your generous
support, and I'll see you next time!
