PartNet is a new semantic database of common objects that brings a new level of real-world understanding to robots
One of the things that makes humans so great at adapting to the world around us is our ability to understand entire categories of things all at once, and then use that general understanding to make sense of specific things that we’ve never seen before. For example, consider something like a lamp. We’ve all seen some lamps. Nobody has seen every single lamp there is. But in most cases, we can walk into someone’s house for the first time and easily identify all their lamps and how they work. Every once in a while, of course, there will be something incredibly weird that makes you ask, “Uh, is that a lamp? How do I turn it on?” But most of the time, our generalized mental model of lamps keeps us out of trouble.
It’s helpful that lamps, along with other categories of objects, have (by definition) lots of pieces in common with each other. Lamps usually have bulbs in them. They often have shades. There’s probably also a base to keep the lamp from falling over, a body to get it off the ground, and a power cord. If you see something with all of those characteristics, it’s probably a lamp, and once you know that, you can make educated guesses about how to usefully interact with it.
This level of understanding is something that robots tend to be particularly bad at, which is a real shame because of how useful it is. You might even argue that robots will have to understand objects on a level close to this if we’re ever going to trust them to operate autonomously in unstructured environments. At the 2019 Conference on Computer Vision and Pattern Recognition (CVPR) this week, a group of researchers from Stanford, UCSD, SFU, and Intel is announcing PartNet, a huge database of common 3D objects that are broken down and annotated at the level required to, they hope, teach a robot exactly what a lamp is.