What does an ideal neural network chip look like? The most important part is to have oodles of memory on the chip itself, say engineers. That’s because data transfer (from main memory to the processor chip) generally uses the most energy and produces most of the system lag—even compared to the AI computation itself.
Cerebras Systems solved these problems, collectively called the memory wall, by making a computer consisting almost entirely of a single, large chip containing 18 gigabytes of memory. But researchers in France, Silicon Valley, and Singapore have come up with another way.
Called Illusion, it uses processors built with resistive RAM in a 3D stack above the silicon logic, so it costs little energy or time to fetch data. By itself, even this isn’t enough, because neural networks are increasingly too large to fit on one chip. So the scheme also requires multiple such hybrid processors and an algorithm that both intelligently slices up the network among the processors and rapidly turns processors off whenever they’re idle.
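To make the slicing idea concrete, here is a toy sketch in Python. It is not the researchers' algorithm, whose details are not given here. It assumes each layer of a network has a known memory footprint and greedily packs consecutive layers onto chips of a fixed capacity, then lists which chip would be powered on at each stage while the rest are switched off. The function names, the layer sizes, and the capacity figure are all hypothetical.

```python
def partition_layers(layer_sizes, chip_capacity):
    """Greedily pack consecutive layers onto chips without exceeding capacity.

    layer_sizes: memory needed by each layer (arbitrary units, made up here).
    chip_capacity: on-chip memory per chip (same units, also an assumption).
    Returns a list of chips, each a list of layer indices.
    """
    chips = [[]]
    used = 0
    for index, size in enumerate(layer_sizes):
        if size > chip_capacity:
            raise ValueError(f"layer {index} does not fit on a single chip")
        if used + size > chip_capacity:
            # Current chip is full: start filling the next one.
            chips.append([])
            used = 0
        chips[-1].append(index)
        used += size
    return chips


def power_schedule(chips):
    """For each stage, mark the one chip doing work as 'on' and the rest 'off'."""
    count = len(chips)
    return [
        ["on" if chip == active else "off" for chip in range(count)]
        for active in range(count)
    ]


if __name__ == "__main__":
    plan = partition_layers([4, 3, 5, 2, 6], chip_capacity=8)
    print(plan)
    for stage in power_schedule(plan):
        print(stage)
```

In this simplified picture only one chip works at a time, so every other chip can sit powered down until the data reaches it. A real scheme would also have to weigh the cost of passing data between chips, which this sketch ignores.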
In tests, an eight-chip version of Illusion came within 3 to 4 percent of the energy use and latency of a “dream” processor that had all the needed memory and processing on one chip.