AMD (AMD) just obtained a pretty big reference account as it tries to gain ground against Nvidia (NVDA) in a large and growing market for server GPUs used for AI and high-performance computing (HPC) workloads.
On Tuesday morning, AMD and supercomputer maker Cray (CRAY) announced that they've won a $600 million-plus deal to supply the Department of Energy's Oak Ridge National Laboratory (ORNL) with a supercomputer featuring over 1.5 exaflops (that's over 1.5 million teraflops) of processing power. That would make the system, which is known as Frontier and set to be delivered in 2021, more than 10 times as powerful as Oak Ridge's Summit system, which is the most powerful supercomputer on the Nov. 2018 Top500 supercomputer list.
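For readers who want to check the unit arithmetic behind those figures, here is a minimal sketch. It uses only the numbers quoted above; the variable and function names are illustrative, and the Summit value it prints is not a measured result, just the figure that would correspond to exactly a 10x gap at exactly 1.5 exaflops.

```python
# Unit arithmetic for the quoted performance figures (illustrative names only).
# 1 exaflop = 10^18 FLOPS, 1 petaflop = 10^15 FLOPS, 1 teraflop = 10^12 FLOPS.
TERAFLOPS_PER_EXAFLOP = 10**6
PETAFLOPS_PER_EXAFLOP = 10**3

# "Over 1.5 exaflops" is treated here as a lower bound, not an exact spec.
frontier_exaflops = 1.5


def exaflops_to_teraflops(exaflops):
    """Convert exaflops to teraflops."""
    return exaflops * TERAFLOPS_PER_EXAFLOP


def exaflops_to_petaflops(exaflops):
    """Convert exaflops to petaflops."""
    return exaflops * PETAFLOPS_PER_EXAFLOP


frontier_teraflops = exaflops_to_teraflops(frontier_exaflops)

# Assumption for illustration: Frontier at exactly 1.5 exaflops and a gap of
# exactly 10x. The article only says "over" and "more than", so this is a
# reference point rather than Summit's actual benchmark score.
summit_petaflops_at_10x = exaflops_to_petaflops(frontier_exaflops) / 10

print(f"Frontier: at least {frontier_teraflops:,.0f} teraflops")
print(f"Summit at an exact 10x gap: {summit_petaflops_at_10x:,.0f} petaflops")
```

The first figure matches the "1.5 million teraflops" in the announcement. The second is only a sanity check on the "more than 10 times" comparison.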
AMD and Cray go as far as to say that Frontier will be more powerful than the 160 fastest supercomputers on the planet today combined. They also note that the system, which is based on Cray's Shasta supercomputer architecture, will weigh over 1 million pounds and contain over 90 miles' worth of cables. It is meant to handle both AI/machine learning workloads and traditional simulation and modeling-related HPC workloads in fields such as weather, physics and genomics.
Whereas Summit is powered by IBM (IBM) Power9 server CPUs and Nvidia Tesla V100 server GPUs, Frontier will rely on a custom AMD Epyc server CPU featuring specialized instructions, along with a "purpose-built" server GPU for its Radeon Instinct family that (although designed with Frontier's needs in mind) will also be offered to third parties. Each CPU will be connected to four GPUs using a custom, high-speed fabric.
Notably, AMD says the Epyc CPU won't rely on the company's Zen 2 CPU core architecture, which will power the second-gen Epyc CPUs (codenamed Rome) that are launching in mid-2019, but on a "future" Zen design. The company's Zen 3 architecture, which is due in 2020, is a strong possibility.
Oak Ridge's Frontier supercomputer will have 4 GPUs for each CPU. Source: AMD.
During a press briefing about the deal, AMD CEO Lisa Su called Frontier "an inflection point" for supercomputing. It just might also become an inflection point for AMD's server GPU business, at least when it comes to HPC use cases.
Whereas Nvidia's GPUs are used by 127 systems on the Nov. 2018 Top500 list -- that's up from 86 systems on the Nov. 2017 list -- AMD's GPUs aren't used by a single one. For that matter, though it was arguably just a matter of time before this changed in light of its Epyc launches, there's only one system on the list that uses AMD server CPUs -- it relies on chips supplied by AMD's Chinese joint venture. The lion's share of the Top500 systems rely on Intel (INTC) Xeon server CPUs, with IBM CPUs powering many of the others.
In addition, Nvidia has remained the dominant player in the market for systems used to handle the demanding task of training AI models to do things such as decipher voice commands, translate text and detect objects within photos and videos. In both the AI training and traditional HPC markets, Nvidia's large R&D investments -- both in GPU design and in creating a software platform on top of its GPUs -- and the hardware and software ecosystem that these investments have produced remain big competitive strengths. Nvidia is also just a couple of months removed from announcing a $6.9 billion deal to buy Mellanox Technologies (MLNX), whose high-speed server interconnects are used by many HPC and AI training systems.
AMD, to be fair, has reported seeing strong growth for its server GPU business in recent quarters. But a lot of those sales are believed to involve graphics-related workloads such as powering virtual desktops and cloud gaming services (for example, Google's Stadia gaming service).
Now, however, AMD has a major customer win it can point to when battling for HPC deals in which GPU acceleration is involved. Though it's a safe bet that Nvidia will remain the dominant player in the market for the foreseeable future, just becoming a competitive #2 player would be a big development for AMD.
At the time of publication, Action Alerts PLUS, which Cramer co-manages as a charitable trust, was long NVDA.