Sanctuary’s Humanoid Robot Is for General-Purpose Autonomy

Humanoid robots, Robotics, Sanctuary, Teleoperation

We’ve been keeping track of Sanctuary AI for quite a while, mainly through the company’s YouTube videos that show the upper half of a dexterous humanoid performing a huge variety of complicated manipulation tasks, thanks to the teleoperation skills of a remote human pilot.

Despite a recent successful commercial deployment of the teleoperated system at a store in Canada (where it was able to complete 110 retail-related tasks), Sanctuary’s end goal is way, way past telepresence. The company describes itself as “on a mission to create the world’s first humanlike intelligence in general-purpose robots.” That sounds extremely ambitious, depending on what you believe “humanlike intelligence” and “general-purpose robots” actually mean. But today, Sanctuary is unveiling something that indicates a substantial amount of progress toward this goal: Phoenix, a new bipedal humanoid robot designed to do manual (in the sense of hand-dependent) labor.


Sanctuary’s teleoperated humanoid is very capable, but teleoperation is of course not scalable in the way that even partial autonomy is. What all of this teleop has allowed Sanctuary to do is to collect lots and lots of data about how humans do stuff. The long-term plan is that some of those human manipulation skills can eventually be transferred to a very humanlike robot, which is the design concept underlying Phoenix.

Some specs from the press release:

  • Humanlike form and function: standing at 5’ 7” and weighing 155 pounds (70.3 kilograms)
  • A maximum payload of 55 pounds (24.9 kg)
  • A maximum speed of 3 miles per hour (4.8 kilometers per hour)
  • Industry-leading robotic hands with increased degrees of freedom (20 in total) that rival human hand dexterity and fine manipulation with proprietary haptic technology that mimics the sense of touch

The hardware looks very impressive, but you should take the press release with a grain of salt, as it claims that the control system (called Carbon) “enables Phoenix to think and act to complete tasks like a person.” That may be the goal, but the company is certainly not there yet. For example, Phoenix is not currently walking, and is mobile thanks to a small wheeled autonomous base. We’ll get into the legs a bit more later on, but Phoenix has a ways to go in terms of functionality. This is by no means a criticism—robots are superhard, and a useful and reliable general-purpose bipedal humanoid is super-duper hard. For Sanctuary, there’s a long road ahead, but they’ve got a map, and some snacks, and experienced folks in the driver’s seat, to extend that metaphor just a little too far.


Sanctuary’s plan is to start with telepresence and use that as a foundation on which to iterate toward general-purpose autonomy. The first step actually doesn’t involve robots at all—it’s to sensorize humans and record their movements while they do useful stuff out in the world. The data collected that way are used to design effective teleoperated robots, and as those robots get pushed back out into the world to do a bunch of that same useful stuff under teleoperation, Sanctuary pays attention to what tasks or subtasks keep getting repeated over and over. Things like opening a door or grasping a handle are the first targets to transition from teleoperated to autonomous. Automating some of the human pilot’s duties significantly boosts their efficiency. From there, Sanctuary will combine those autonomous tasks into longer sequences to transition to more of a supervised autonomy model. Then, the company hopes, it will gradually achieve full autonomy.
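
As a rough illustration of that prioritization step, here is a minimal sketch (purely hypothetical: the log format and subtask labels are our assumptions, not Sanctuary’s tooling) of how repeated subtasks might be counted across teleoperation sessions to decide what to automate first.

```python
from collections import Counter

# Hypothetical teleoperation logs: each session is a list of labeled subtasks
# performed by the human pilot. Labels and structure are illustrative only.
teleop_sessions = [
    ["open_door", "grasp_handle", "pick_item", "place_item"],
    ["grasp_handle", "pick_item", "place_item", "close_door"],
    ["open_door", "grasp_handle", "scan_barcode", "place_item"],
]


def rank_subtasks_for_automation(sessions):
    """Rank subtasks by how often they recur across teleop sessions.

    Subtasks that keep showing up (grasping a handle, opening a door)
    are natural first candidates to move from teleoperation to autonomy.
    """
    counts = Counter(subtask for session in sessions for subtask in session)
    return counts.most_common()


if __name__ == "__main__":
    for subtask, count in rank_subtasks_for_automation(teleop_sessions):
        print(f"{subtask}: seen {count} times")
```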


What doesn’t really come through when you glance at Phoenix is just how unique Sanctuary’s philosophy on general-purpose humanoid robots is. All the talk about completing tasks like a person and humanlike intelligence—which honestly sounds a lot like the kind of meaningless hype you often find in breathless robotics press releases—is in fact a reflection of how Sanctuary thinks that humanoid robots should be designed and programmed to maximize their flexibility and usefulness.

To better understand this perspective, we spoke with Geordie Rose, Sanctuary AI founder and CEO.

Sanctuary has a unique approach to developing autonomous skills for humanoid robots. Can you describe what you’ve been working on for the past several years?

Geordie Rose: Our approach to general-purpose humanoid robots has two main steps. The first is high-quality teleoperation—a human pilot controlling a robot using a rig that transmits their physical movements to the robot, which moves in the same way. And the robot’s senses are transmitted back to the pilot as well. The reason why this is so important is that complex robots are very difficult to control, and if you want to get good data about accomplishing interesting tasks in the world, this is the gold-standard way to do that. We use that data in step two.
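
To make “that data” concrete: what gets captured during teleoperation is, at minimum, pairs of what the robot sensed and what the pilot commanded at each moment. The sketch below is a hypothetical logger (the field names, the 10-hertz rate, and the JSON format are our assumptions, not Sanctuary’s data pipeline) showing how such demonstrations could be recorded for later learning.

```python
import json
import time
from dataclasses import asdict, dataclass


@dataclass
class TeleopSample:
    """One synchronized (perception, action) pair from a teleop session."""
    timestamp: float
    camera_frame_id: str   # reference to a stored image frame
    joint_positions: list  # robot's sensed joint state
    haptic_readings: list  # fingertip force/tactile values
    pilot_command: list    # joint targets sent by the human pilot


def record_session(read_sensors, read_pilot, duration_s=5.0, rate_hz=10.0):
    """Record synchronized perception and pilot commands during teleoperation."""
    samples = []
    period = 1.0 / rate_hz
    end = time.time() + duration_s
    while time.time() < end:
        sensed = read_sensors()  # dict of current sensor readings
        samples.append(TeleopSample(
            timestamp=time.time(),
            camera_frame_id=sensed["frame_id"],
            joint_positions=sensed["joints"],
            haptic_readings=sensed["haptics"],
            pilot_command=read_pilot(),  # what the pilot is asking for right now
        ))
        time.sleep(period)
    return samples


def save_demonstrations(samples, path="teleop_demo.json"):
    """Persist a session so it can serve as training demonstrations later."""
    with open(path, "w") as f:
        json.dump([asdict(s) for s in samples], f)


if __name__ == "__main__":
    # Stand-in sensor and pilot streams, just to make the sketch runnable.
    def fake_sensors():
        return {"frame_id": "frame_000", "joints": [0.0] * 20, "haptics": [0.0] * 10}

    def fake_pilot():
        return [0.1] * 20

    demo = record_session(fake_sensors, fake_pilot, duration_s=1.0)
    save_demonstrations(demo)
    print(f"recorded {len(demo)} samples")
```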

Step two is the automation of things that humans can do. This is a process, not an event. The way that we do it is by using a construct called a cognitive architecture, which is borrowed from cognitive science. It’s the idea that the way the human mind controls a human body is decomposable into parts, such as memory, motor control, visual cortex, and so on. When you’re engineering a control system for a robot, one of the things you can do is try to replicate each of those pieces in software to essentially try to emulate what cognitive scientists believe the human brain is doing. So, our cognitive control system is based on that premise, and the data that is collected in the first step of this process becomes examples that the cognitive system can learn from, just like you would learn from a teacher through demonstration.
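
To give a sense of what “decomposable into parts” could look like in code, here is a deliberately toy sketch of a modular perception-to-action loop, where the motor module imitates the nearest recorded teleop demonstration. The module names and the nearest-neighbor imitation are our own illustrative choices; this is not the Carbon control system.

```python
import math


class Perception:
    """Reduce raw sensor readings to a compact feature vector."""
    def observe(self, raw_sensors):
        return tuple(raw_sensors["joints"]) + tuple(raw_sensors["haptics"])


class Memory:
    """Keep a record of what was perceived and what was done."""
    def __init__(self):
        self.episodes = []

    def store(self, features, action):
        self.episodes.append((features, action))


class Motor:
    """Choose actions by imitating the closest recorded demonstration."""
    def __init__(self, demonstrations):
        self.demonstrations = demonstrations  # list of (features, pilot_action)

    def act(self, features):
        _, action = min(self.demonstrations,
                        key=lambda demo: math.dist(features, demo[0]))
        return action


class CognitiveController:
    """A toy perception-to-motor loop assembled from separate modules."""
    def __init__(self, demonstrations):
        self.perception = Perception()
        self.memory = Memory()
        self.motor = Motor(demonstrations)

    def step(self, raw_sensors):
        features = self.perception.observe(raw_sensors)
        action = self.motor.act(features)
        self.memory.store(features, action)
        return action


if __name__ == "__main__":
    # Two fake demonstrations: (feature vector, pilot action), purely illustrative.
    demos = [((0.0, 0.0, 0.0), [0.1, 0.1]), ((1.0, 1.0, 1.0), [0.9, 0.9])]
    controller = CognitiveController(demos)
    print(controller.step({"joints": [0.9, 1.1], "haptics": [0.8]}))
```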

The way the human mind evolved, and what it’s for, is to convert perception data of a certain kind into actions of a certain kind. So, the mind is kind of a machine that translates perception into action. If you want to build a mind, the obvious thing to do is to build a physical thing that collects the same kinds of sensory data and outputs the same kind of actuator data, so that you’re solving the same problems as the human brain solves. Our central thesis is that the shortest way to get to general intelligence of the human kind is via building a control system for a robot that shares the same sensory and action modes that we have as people.

What made you decide on this cognitive approach, as opposed to one that’s more optimized for how robots have historically been designed and programmed?

Rose: Our previous company, Kindred, went down that road. We used essentially the same kinds of control tactics as we’re using at Sanctuary, but specialized for particular robot morphologies that we designed for specific tasks. What we found was that by doing so, you shave off all of the generality because you don’t need it. There’s nothing wrong with developing a specialized tool, but we decided that that’s not what we wanted to do—we wanted to go for a more ambitious goal.

What we’re trying to do is build a truly general-purpose technology; general purpose in the sense of being able to do the sorts of things that you’d expect a person to be able to do in the course of doing work. For that approach, human morphology is ideal, because all of our tools and environments are built for us.

How humanoid is the right amount of humanoid for a humanoid robot that will be leveraging your cognitive architecture approach and using human data as a model?

Rose: The place where we started is to focus on the things that are clearly the most valuable for delivering work. So, those are (roughly in order) the hands, the sensory apparatus like vision and haptics and sound and so on, and the ability to locomote to get the hands to work. There are a lot of different kinds of design decisions to make that are underneath those primary ones, but the primary ones are about the physical form that is necessary to actually deliver value in the world. It’s almost a truism that humans are defined by our brains and opposable thumbs, so we focus mostly on brains and hands.

What about adding sensing systems that humans don’t have to make things easier for your robot, like a wrist camera?

Rose: The main reason that we wouldn’t do that is to preserve our engineering clarity. Since we started the project five years ago, one of the things we’ve never wavered on is the model of what we’re trying to do, and that’s fidelity to the human form when it comes to delivering work. While there are gray areas, adding sensors like wrist cameras is not helpful, in the general case—it makes the machine worse. The kind of cognition that humans have is based on certain kinds of sensory arrays, so the way that we think about the world is built around the way that we sense and act in it. The thesis we’ve focused on is trying to build a humanlike intelligence in a humanlike body to do labor.

“We’re a technologically advanced civilization; why aren’t there more robots? We believe that robots have traditionally fallen into this specialization trap of building the simplest possible thing for the most specific possible job. But that’s not necessary. Technology has advanced to the point where it’s a legitimate thing to ask: Could you build a machine that can do everything a person can do? Our answer is yes.”
–Geordie Rose, Sanctuary founder and CEO

When you say artificial general intelligence or humanlike intelligence, how far would you extend that?

Rose: All the way. I’m not claiming anything about the difficulty of the problem, because I think nobody knows how difficult it will be. Our team has the stated intent of trying to build a control system for a robot that is in nearly all ways the same as the way the mind controls the body in a person. That is a very tall order, of course, but it was the fundamental motivation, under certain interpretations, for why the field of AI was started in the first place. This idea of building generality in problem solving, and being able to deal with unforeseen circumstances, is the central feature of living in the real world. All animals have to solve this problem, because the real world is dangerous and ever-changing and so on. So the control system for a squirrel or a human needs to be able to adapt to ever-changing and dangerous conditions, and a properly designed control system for a robot needs to do that as well.

And by the way, I’m not slighting animals, because animals like squirrels are massively more powerful in terms of what they can do than the best machines that we’ve ever built. There’s this idea that I think people might have, that there’s a lot of difference between a squirrel and a person. But if you can build a squirrel-like robot, you can layer all of the symbolic and other AI stuff on top of it so that it can react to the world and understand it while also doing useful labor.

So there’s a bigger gap right now between robots and squirrels than there is between squirrels and humans?

Rose: Right now, there’s a bigger gap between robots and squirrels, but it’s closing quickly.

Aside from your overall approach of using humans as a model for your system, what are the reasons to put legs on a robot that’s intended to do labor?

Rose: When you analyze the role of legs in work, you find that they contribute to a lot of what we do in ways that are not completely obvious. Legs are nowhere near as important as hands, so in our strategy for rolling out the product, we’re perfectly fine using wheels. And I think wheels are a better solution to certain kinds of problems than legs are. But there are certain things where you do need legs, and so there are certain kinds of customers who have been adamant that legs are a requirement.

The way that I think about this is that legs are ultimately where you want to be if you want to cover all of the human experience. My view is that legs are currently lagging behind some of the other robotic hardware, but they’ll catch up. At some point in the not-too-distant future, there will be multiple folks who have built walking algorithms and so on that we can then use in our platform. So, for example, I think you’re familiar with Apptronik; we own part of that company. Part of the reason we made that investment was to use their legs if and when they can solve that problem.

From the commercial side, we can get away with not using legs for a while, and just use wheeled base systems to deliver hands to work. But ultimately, I would like to have legs as well.

How much of a gap is there between building a machine that is physically capable of doing useful tasks, and building a robot with the intelligence to autonomously do those tasks?

Rose: Something about robotics that I’ve always believed is that the thing that you’re looking at, the machine, is actually not the important part of the robot. The important part is the software, and that’s the hardest part of all of this. Building control systems that have the thing that we call intelligence still involves many deep mysteries.

The way that we’ve approached this is a layered one, where we begin by using teleoperation of the robots, which is an established technology that we’ve been working on for roughly a decade. That’s our fallback layer, and we’re building increasing layers of autonomy on top of that, so that eventually the system gets to the point of being fully autonomous. But that doesn’t happen in one go; it happens by adding layers of autonomy over time.
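
A minimal sketch of that layering idea (our illustration, with made-up skill names, confidence scores, and threshold, not Sanctuary’s implementation): try an autonomous skill first, and hand the task back to the human pilot whenever no skill applies or the skill is not confident enough.

```python
# Confidence below which the system escalates to the human pilot (assumed value).
CONFIDENCE_THRESHOLD = 0.8

# Hypothetical autonomous skills: each returns a planned action and a confidence.
autonomous_skills = {
    "grasp_handle": lambda observation: ("grasp_handle_trajectory", 0.92),
    "open_door": lambda observation: ("open_door_trajectory", 0.65),
}


def request_teleop(task, observation):
    """Placeholder for escalating the task to the remote human pilot."""
    return f"pilot handles '{task}'"


def dispatch(task, observation):
    """Run the task autonomously when possible; otherwise fall back to teleop."""
    skill = autonomous_skills.get(task)
    if skill is None:
        return request_teleop(task, observation)
    action, confidence = skill(observation)
    if confidence < CONFIDENCE_THRESHOLD:
        return request_teleop(task, observation)
    return action


if __name__ == "__main__":
    for task in ["grasp_handle", "open_door", "fold_shirt"]:
        print(task, "->", dispatch(task, observation={}))
```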

The problems in building a human-level AI are very, very deep and profound. I think they’re intimately connected to the problem of embodiment. My perspective is that you don’t get to general humanlike intelligence in software—that’s not the way that intelligence works. Intelligence is part of a process that converts perception into action in an embodied agent in the real world. And that’s the way we think about it: Intelligence is actually a thing that makes a body move, and if you don’t look at intelligence that way, you’ll never get to it. So, all of the problems of building artificial general intelligence, humanlike intelligence, are manifest inside of this control problem.

Building a true intelligence of the sort that lives inside a robot is a grand challenge. It’s a civilization-level challenge, but it’s the challenge that we’ve set for ourselves. This is the reason for the existence of this organization: to solve that problem, and then apply that to delivering labor.