The computer industry faces epic change, as the demands of “deep learning” forms of machine learning force new requirements upon silicon, at the same time that Moore’s Law, the decades-old rule of progress in the chip business, is collapsing.
This week, some of the best minds in the chip industry gathered in San Francisco to talk about what it means.
Applied Materials, the dominant maker of tools to fabricate transistors, sponsored a full day of keynotes and panel sessions on Tuesday, called the “A.I. Design Forum,” in conjunction with one of the chip industry’s big annual trade shows, Semicon West.
The presentations and discussions had good news and bad news. On the plus side, many tools are at the disposal of companies such as Advanced Micro Devices and Xilinx to make “heterogeneous” arrangements of chips to meet the demands of deep learning. On the downside, it’s not entirely clear that what they have in their kit bag will keep data centers from being exhausted under the weight of increased computing demand.
No new chips were shown at the Semicon show, those kinds of unveilings having long since passed to other trade shows and conferences. But the discussion at the A.I. forum gave a good sense of how the chip industry is thinking about the explosion of machine learning and what it means for computers.
Gary Dickerson, chief executive of Applied Materials, started his talk by noting the “dramatic slowdown of Moore’s Law,” citing data from UC Berkeley Professor David Patterson and Alphabet chairman John Hennessy showing that new processors are improving in performance by only 3.5% per year. (The figure is slightly outdated; an essay by Patterson and Hennessy back in February pegged the slowdown to 3% improvement per year.)
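To put those percentages in perspective, the short sketch below converts an annual improvement rate into the number of years it takes performance to double. It assumes a constant compound rate, which is a simplification; the function name `doubling_time_years` is made up for this illustration and does not come from the talk.

```python
import math


def doubling_time_years(annual_improvement: float) -> float:
    """Years for performance to double at a fixed compound annual rate.

    Assumption: the rate stays constant year over year.
    """
    return math.log(2) / math.log(1 + annual_improvement)


# The two rates mentioned above: 3.5% (Dickerson's slide) and 3% (the essay).
for rate in (0.035, 0.03):
    years = doubling_time_years(rate)
    print(f"{rate:.1%} per year -> performance doubles in about {years:.0f} years")
```

Under that assumption, performance would take roughly two decades or more to double at either rate.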
Dickerson went on to claim that A.I. workloads in data centers worldwide could come to represent as much as 80% of all compute cycles and 10% of global electricity use in the next decade or so.
That means the industry needs to seek many routes to solutions, said Dickerson, including “new architectures” for chip design and new kinds of memory chips. He cited several types of memory, including “MRAM,” “ReRAM” (resistive RAM), “PCRAM” (phase-change RAM), and “FeRAM.” The industry would also have to explore new kinds of materials beyond silicon, as well as analog chip designs, meaning chips that manipulate data as continuous, real-valued signals rather than discrete units.
Both Advanced Micro Devices chief Lisa Su and Xilinx CEO Victor Peng made a pitch for their respective roles in making heterogeneous types of computing possible.
Su talked about the company’s “Epyc” server chip, which works around the Moore’s Law bottleneck by gathering together multiple silicon dice, called “chiplets,” into a single package, with a high-speed memory bus connecting the chiplets, to build a kind of chip that is its own computer system.
Peng rehashed remarks from the company’s May investor day in New York, saying that Xilinx’s programmable chips, “FPGAs,” can handle not only the matrix multiplications of A.I. but also the parts of traditional software execution that need to happen before and after the machine learning operations.
A senior Google engineer, Cliff Young, went into the details of the Tensor Processing Unit, or “TPU” chip that Google developed starting in 2013. The effort was prompted, he said, by a kind of panic. The company saw that with more and more machine learning services running at Google, “matrix multiplications were becoming a noticeable fraction of fleet cycles,” in Google data centers. “What if everyone talks to their phones two minutes a day, or wants to analyze video clips for two minutes a day,” using machine learning, he asked rhetorically. “We don’t have enough computers.”
“There was potential in that for both success and disaster,” he said of the exploding demand for A.I. services. “We began a 15-month crash project to achieve a ten-X improvement in performance.”
Although Google is now on the third iteration of the TPU, Young implied the crisis is not over. Compute demand is increasing “cubically,” he said, speaking of matrix multiplications. Google has whole warehouse-sized buildings full of “pods,” containers that have multiple racks filled with TPUs. Still, it won’t be enough. “Even Google will reach limits to how we can scale data centers.”
Get ready for a warehouse bottleneck, in other words.
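One way to see why matrix multiplication and “cubic” growth go together: the textbook method for multiplying two n-by-n matrices takes n × n × n multiply-add steps, so doubling n means eight times the work. The sketch below simply counts those steps. Reading Young’s remark this way is an assumption, not something he spelled out, and the function name `matmul_counting` is invented for the illustration.

```python
def matmul_counting(a, b):
    """Textbook matrix product that also counts multiply-add steps.

    Illustrative only; production hardware and libraries use far more
    optimized schemes than this triple loop.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    ops = 0
    for i in range(rows):
        for j in range(cols):
            for p in range(inner):
                out[i][j] += a[i][p] * b[p][j]
                ops += 1  # one multiply-add per innermost step
    return out, ops


# Doubling the matrix size multiplies the step count by eight.
for n in (8, 16, 32):
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    _, ops = matmul_counting(identity, identity)
    print(f"n={n}: {ops} multiply-adds")
```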
Young said there will have to be a lot of collaboration between hardware designers and software programmers, what he called “co-design.” That co-design will also have to extend to materials physicists, he suggested.
“When you do co-design, it’s interdisciplinary work, and you are a stranger in a strange land,” he observed. “We have to get out of our comfort zone.”
“Can we use optical transceivers” to manipulate neural nets, he wondered. Optical computing is “awesome at matrix multiplication,” he observed, but it is not very good at another critical part of neural networks, the nonlinear activation functions of each artificial neuron.
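The two ingredients Young contrasts can be seen in a single neural-network layer: a matrix multiplication followed by a nonlinear activation applied to each neuron’s result. The minimal sketch below uses ReLU as the activation; that choice, the function names, and the toy numbers are assumptions made for illustration and were not part of the talk.

```python
def relu(x: float) -> float:
    """A common nonlinear activation: pass positives through, clamp negatives to zero."""
    return max(0.0, x)


def dense_layer(weights, inputs):
    """One layer of artificial neurons: matrix-vector product, then activation."""
    # Linear part: the matrix multiplication that optics is said to handle well.
    pre_activations = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
    # Nonlinear part: applied per neuron, the step the article says is harder.
    return [relu(z) for z in pre_activations]


weights = [[1.0, -2.0], [0.5, 0.5]]  # two neurons, two inputs each (toy values)
inputs = [1.0, 1.0]
print(dense_layer(weights, inputs))  # first neuron is clamped to zero
```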
“Packaging is a thing, what more can we do with packaging and chiplets?” he asked. The industry needs alternatives to CMOS, the basic transistor technology of silicon chips, he said, echoing Dickerson. In-memory computing will also be important, he said: performing computations close to the memory cells rather than shuttling data back and forth between memory and processor along a conventional memory bus.
Young offered that machine learning might open new opportunities for analog computing. “It’s weird that we have this digital layer between the real-numbered neural nets and the underlying analog devices,” he said, drawing a connection between the statistical or stochastic nature of both A.I. and silicon. “Maybe we don’t always need to go back into bits all the time,” mused Young.
Given all the challenges, “it’s a super-cool time to be guiding