
Videos | May 23, 2023

DriveNets Network Cloud-AI

This is a data center. It has servers with CPUs and GPUs, each of which connects to a leaf, or top-of-rack, switch that connects to a centralized, chassis-based spine switch and ultimately to the Internet, where most of the queries come from, so there is not a lot of backend traffic.

This Clos architecture is the best one for data centers, but when it comes to AI clusters, compute workloads run in parallel on multiple GPUs.
To support that, these servers need backend communication between them, generating high volumes of east-west traffic.
Since one workload runs on multiple servers, high bandwidth, no jitter and no packet loss are a must to ensure the highest GPU utilization.
Any degradation in network performance will impact the Job Completion Time, or JCT.
By nature, the traditional non-scheduled, three-hop-or-higher Ethernet Clos architecture of data centers isn’t predictable or lossless, making it unsuitable for AI networking.
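
To make that unpredictability concrete, here is a rough, self-contained Python sketch. It is illustrative only, not DriveNets code, and the uplink counts, flow counts and rates are hypothetical: with per-flow ECMP hashing, several large GPU-to-GPU flows can land on the same leaf-to-spine uplink, congesting it while other uplinks sit underused.

# Illustrative sketch only (not DriveNets code): why per-flow ECMP hashing in a
# non-scheduled Clos fabric is unpredictable for a few large AI "elephant" flows.
# All numbers below are hypothetical.
import random
from collections import Counter

NUM_UPLINKS = 4           # hypothetical leaf-to-spine uplinks
NUM_FLOWS = 8             # hypothetical large, long-lived GPU-to-GPU flows
FLOW_RATE_GBPS = 200      # each flow tries to push 200 Gbps
LINK_CAPACITY_GBPS = 400  # per-uplink capacity

random.seed(7)

# Per-flow ECMP: every packet of a flow follows the same hashed uplink, so an
# uneven hash outcome typically overloads at least one link while others idle.
flow_to_uplink = {flow: random.randrange(NUM_UPLINKS) for flow in range(NUM_FLOWS)}
flows_per_uplink = Counter(flow_to_uplink.values())

for uplink in range(NUM_UPLINKS):
    offered = flows_per_uplink[uplink] * FLOW_RATE_GBPS
    status = "congested -> drops and jitter" if offered > LINK_CAPACITY_GBPS else "ok"
    print(f"uplink {uplink}: {offered} Gbps offered on a {LINK_CAPACITY_GBPS} Gbps link ({status})")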

Alternatively, you can use a high-performance proprietary networking solution, but then you’ll be locked into a specific GPU vendor and will need to plan, manage and maintain two different technologies for your front-end and back-end networks.
Ideally, you could flatten the networking infrastructure so that traffic takes just one hop between any two servers, improving application performance.
But this requires the world’s largest chassis, which would be an operational nightmare.

The ideal solution is a combination of the two: a standard Ethernet solution that is optimized for AI and matches the JCT of the best proprietary solutions.
DriveNets’ Network Cloud-AI is exactly that: an innovative Ethernet-based, disaggregated, distributed architecture that supports the world’s largest chassis, just in a distributed form, which solves the scale limitation of the metal enclosure.
Thanks to a cell-based fabric, it maintains high availability, low jitter and a single Ethernet hop between all servers, which ultimately leads to the best JCT for an Ethernet-based AI cluster.
And it’s similar to the familiar Clos topology you’re used to working with.
This Distributed Disaggregated Chassis (DDC) is the most effective Ethernet solution for AI networking.

DriveNets’ Network Cloud DDC has been validated by leading hyperscalers in recent trials and was found to be the most cost-effective, high-performance Ethernet solution for AI, reducing AI clusters’ idle time by up to 30%.
DriveNets’ Network Cloud solution is deployed in multiple high-scale networks around the world, supporting more than 50% of AT&T’s core backbone.
It can scale to connect up to 32,000 GPUs at speeds ranging from 100G to 800G in a single AI cluster, with perfect load balancing.
It equally distributes traffic across the AI network fabric, ensuring maximum network utilization and zero packet loss under the highest loads.
And it delivers the fastest JCT by supporting end-to-end traffic scheduling, which avoids flow collisions and jitter, and provides zero-impact failover with sub-10-millisecond automatic path convergence.
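
As a simplified illustration of the general cell-spraying idea behind such scheduled fabrics (not the actual DriveNets implementation; the link count, cell size and packet mix below are hypothetical), slicing packets into fixed-size cells and spraying them across every fabric link keeps the per-link load nearly identical, no matter how large any single flow is:

# Simplified sketch of the general cell-spraying idea (not the actual DriveNets
# implementation): packets are sliced into fixed-size cells and sprayed
# round-robin across all fabric links, so the load stays even regardless of
# how large any individual flow is. All numbers are hypothetical.
from collections import Counter
from itertools import cycle

NUM_FABRIC_LINKS = 4    # hypothetical links between a leaf and the fabric
CELL_SIZE_BYTES = 256   # hypothetical fixed cell size
packets = [9000] * 50 + [1500] * 200  # a mix of jumbo and regular frames

link_bytes = Counter()
next_link = cycle(range(NUM_FABRIC_LINKS))

for pkt_len in packets:
    # Slice the packet into cells, then spray the cells over all links.
    n_full, remainder = divmod(pkt_len, CELL_SIZE_BYTES)
    cells = [CELL_SIZE_BYTES] * n_full + ([remainder] if remainder else [])
    for cell in cells:
        link_bytes[next(next_link)] += cell

for link in range(NUM_FABRIC_LINKS):
    print(f"fabric link {link}: {link_bytes[link]} bytes")  # near-identical per link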

DriveNets’ Network Cloud-AI is the most innovative networking solution available today for AI. It maximizes the utilization of AI infrastructures and substantially lowers their cost in a standards-based implementation that doesn’t give up vendor interoperability.