
As Nvidia (NVDA) begins to see tougher competition in a booming AI training market that it has long dominated, the company is determined to make sure that the fight doesn't merely boil down to which chips have the best specs.

During a lengthy Tuesday keynote at Nvidia's annual GPU Technology Conference (GTC), CEO Jensen Huang unfurled a massive set of product and partnership announcements covering everything from servers to workstations to autonomous cars to robots/IoT devices.

The announcements covered almost every major market the company competes in save for PCs, where Nvidia maintains a high-end performance edge over AMD (AMD) courtesy of products launched in early 2017 that are based on its Pascal GPU architecture, and where it is taking its time to launch gaming GPUs based on a next-gen architecture.

Of all the announcements, the ones covering Nvidia's server GPU efforts are likely the most important for its top line in the short-to-intermediate term -- particularly the ones related to Nvidia's solutions for training AI/deep learning algorithms to do things such as understand voice commands, translate languages and detect objects within images.


With the help of large R&D investments and an unmatched developer ecosystem, as well as the architectural strengths GPUs claim for handling large numbers of small tasks in parallel, Nvidia's Tesla server GPU line claims a giant share of the training market. Cloud giants such as Facebook (FB), Amazon (AMZN) and Alibaba (BABA) are clients, as are numerous enterprises. In recent years, Nvidia has also launched a training server (the DGX-1) that packs eight high-end Tesla GPUs, along with an 8-GPU reference platform (the HGX-1) for third parties looking to design their own servers.

Thanks in part to strong training-related demand, Nvidia's Datacenter product segment saw its revenue rise 105% annually in the February quarter to $606 million. It now accounts for over a fifth of Nvidia's total sales, and is expected to account for about a third of them in a year's time.

Nvidia's Datacenter sales are also benefiting from strong Tesla shipments for virtual desktops and traditional high-performance computing (HPC), and from growing sales of GPUs used for inference -- the task of running trained AI algorithms against real-world data and content. Though a lot of inference work is still done using Intel (INTC) server CPUs, and some is done using programmable chips (FPGAs) from Intel and Xilinx (XLNX), Nvidia has signaled its inference sales are growing strongly off a relatively small base, and has reported having over 1,200 inference clients.
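To make the training/inference split concrete, here is a minimal, illustrative sketch (not Nvidia code, and far simpler than the deep learning workloads discussed here): training iteratively adjusts a model's weights against labeled data -- the compute-heavy phase GPUs excel at -- while inference is just a forward pass of the already-trained model over new data.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, lr=0.1, epochs=500):
    """Training: repeatedly adjust weights via gradient descent (compute-heavy)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

def infer(w, X):
    """Inference: a single forward pass of the trained model over new data."""
    return (sigmoid(X @ w) > 0.5).astype(int)

# Toy linearly separable data standing in for real-world training sets
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

w = train(X, y)          # expensive, done once (or periodically)
preds = infer(w, X)      # cheap, done constantly in production
accuracy = (preds == y).mean()
```

The asymmetry shown here -- many gradient passes to train, one pass to serve -- is why the article treats training and inference as distinct hardware markets.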

On Tuesday, Nvidia strengthened its inference push by unveiling TensorRT 4, a next-gen inference optimizer and runtime for its GPUs that arrives just six months after its predecessor and has been optimized for (among other things) Alphabet (GOOGL) unit Google's popular TensorFlow machine learning framework. And in a move that could help with larger inference deployments, Nvidia announced its GPUs will support Kubernetes, a popular software platform for orchestrating the deployment of app containers that can be quickly moved from one server (or one cloud) to another.

Meanwhile, for the training market, Nvidia unveiled the DGX-2, a next-gen AI training server that contains 16 of its Tesla V100 flagship GPUs (they're based on Nvidia's cutting-edge Volta architecture) along with a pair of Intel Xeon CPUs, and which will sell for a hefty $399,000. Notably, the DGX-2 contains 12 proprietary switching chips (known as NVSwitches) that combine to provide 2.4 terabytes per second of internal bandwidth for the GPUs. It also supports up to 30TB of flash memory and 1.5TB of RAM, both of which are large improvements relative to the DGX-1.


Nvidia also:

    Doubled the Tesla V100's HBM2 graphics memory to 32GB. Samsung is Nvidia's main HBM2 supplier, but Micron and SK Hynix also have to be pleased that this much graphics memory is being packed onto a server graphics card.

    Teamed with flash storage system vendor Pure Storage to launch the AIRI -- a solution for AI researchers that pairs four DGX-1 systems with Pure's FlashBlade storage array and 100-gig Arista Networks Ethernet switches.

    Saw its server partners unveil new modular V100-powered systems. Taiwan's Gigabyte, for example, launched the G190, a server that contains 4 Tesla V100 GPUs yet possesses a tiny 1U form factor (height of 1.75 inches).

All of these moves come as Intel and startups such as Graphcore and KnuEdge try to chip away at Nvidia's training dominance via processors built from the ground up to train deep learning algorithms. Intel's recently launched Nervana Neural Network Processor (NNP) appears to be no slouch in terms of either processing power/efficiency or memory bandwidth, and Intel has been working with Facebook to test and optimize the chip.

In addition to direct competition from other chipmakers, Nvidia has begun seeing some of its cloud clients take an interest in developing their own silicon for AI workloads. Last spring, Google unveiled its second-gen Tensor Processing Unit (TPU), which can handle both training and inference work involving TensorFlow. Google isn't directly selling the chip, believed to have been developed with Broadcom's (AVGO) help, but is using it for its own AI work and making it available through its cloud infrastructure platform.

Also: On Monday, Susquehanna's Chris Rolland reported another cloud giant "has designed and is in the process of manufacturing" a training chip that will enter full-scale production during the second half of 2018.

Against such a backdrop, Nvidia is partly relying on large chip engineering investments to keep rivals at bay -- the Tesla V100, it should be noted, is a monster of a chip that pairs 5,120 traditional GPU cores with 640 "Tensor cores" meant to handle deep learning operations. But it's also banking on the massive AI developer ecosystem it has enabled through its CUDA application programming interfaces (APIs). And as Tuesday's announcements show, Nvidia is also betting that creating powerful system-level solutions -- both on its own and in partnership with OEMs -- will give it an edge against rivals who have merely developed a good chip and related accelerator card.
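Per Nvidia's public Volta architecture materials, each of those Tensor cores performs a small fused matrix multiply-accumulate -- D = A x B + C on 4x4 tiles, with half-precision inputs accumulated in single precision -- and large deep learning matrix products are tiled across many such operations. A rough numerical sketch of the per-tile operation (illustrative only, not actual GPU code):

```python
import numpy as np

rng = np.random.default_rng(1)

# One Tensor-core-style tile operation: D = A @ B + C on 4x4 tiles.
A = rng.normal(size=(4, 4)).astype(np.float16)  # FP16 input tile
B = rng.normal(size=(4, 4)).astype(np.float16)  # FP16 input tile
C = rng.normal(size=(4, 4)).astype(np.float32)  # FP32 accumulator tile

# Multiply the low-precision inputs but accumulate the result in FP32,
# which preserves accuracy while keeping inputs cheap to move and store.
D = A.astype(np.float32) @ B.astype(np.float32) + C
```

The mixed-precision trick shown here (cheap FP16 operands, FP32 accumulation) is what lets such units deliver far higher deep learning throughput than traditional FP32 GPU cores without wrecking training accuracy.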

Together with Nvidia's other strengths, these hardware efforts leave the GPU giant well-positioned to hold onto a sizable chunk of the fast-growing AI training market, even if Intel, Google and others are bound to gain some share.

Jim Cramer and the AAP team hold positions in Nvidia, Facebook, Amazon, Alphabet and Broadcom for their Action Alerts PLUS Charitable Trust Portfolio. Want to be alerted before Cramer buys or sells NVDA, FB, AMZN, GOOGL or AVGO? Learn more now.