DriveNets AI Fabric
DriveNets AI Fabric is a full-stack, Ethernet-based networking solution for the back-end fabric of AI clusters, as well as front-end and storage connectivity, all in one solution. It includes hardware, software, and services supporting any AI infrastructure networking use case and GPU type. Test results from production environments show that DriveNets AI Fabric consistently delivers higher efficiency and performance, and faster deployment and fine-tuning, than alternative InfiniBand or Ethernet fabric solutions.

Highest-Performance AI Networking

Hardware

  • DriveNets Fabric-Scheduled Ethernet switches: 9300F, 5300R and 5301R
  • DriveNets Endpoint-Scheduled Ethernet switches: 2500S, 2600SL, 2601S
  • Network Interface Cards (NICs): a choice of NICs from industry-leading vendors, including AMD and Broadcom, with performance optimization software

Use Cases

  • Scale-up networking: for AMD-based infrastructure
  • Scale-out networking: scheduled-Ethernet solutions, with fabric-scheduling or endpoint-scheduling technologies

Software

  • DNOS and DN-SONiC: Network operating systems
  • DriveNets AI Cluster Orchestrator: an orchestration and management suite with provisioning, benchmarking, and ongoing-operations engines

Services

  • DriveNets Infrastructure Services (DIS): full ownership of AI cluster lifecycle management, from design to first token, including 24/7 maintenance and support, kernel/ROCm optimization, NBI integration, and end-to-end (E2E) performance optimization software services

Network Platforms

DriveNets 2500S

51.2 Tbps full duplex
64 x 800 GbE OSFP800
Broadcom Tomahawk 5
Hardware Specifications
Interfaces
Network 64 x 800 GbE OSFP800
Inband Mgmt. 2 x 25G SFP28
OOB Mgmt. 1 x 1G RJ45
Performance
Switching Capacity 51.2 Tbps
Physical
ASIC Broadcom Tomahawk 5
Memory 2 x 16GB DDR4 SO-DIMM with ECC
Chassis 2RU
Typical / Max (with optics) 1100W / 1623W

 

DriveNets 2600SL

102.4Tbps full duplex
64 x 1600 GbE OSFP224
Broadcom Tomahawk 6
Liquid cooled
Hardware Specifications
Interfaces
Network 64 x 1600 GbE OSFP224
Inband Mgmt. 2 x 50G SFP56
OOB Mgmt. 1 x 1G RJ45
Performance
Switching Capacity 102.4 Tbps
Physical
ASIC Broadcom Tomahawk 6
Memory 2 x 32GB DDR4 2666 RDIMM/SODIMM w/ECC
Chassis 2RU
Typical 4000W

DriveNets 2601S

102.4Tbps full duplex
64 x 1600 GbE OSFP224
Broadcom Tomahawk 6
Air cooled
Hardware Specifications
Interfaces
Network 64 x 1600 GbE OSFP224
Inband Mgmt. 2 x 50G SFP56
OOB Mgmt. 1 x 1G RJ45
Performance
Switching Capacity 102.4 Tbps
Physical
ASIC Broadcom Tomahawk 6
Memory 2 x 32GB DDR4 2666 RDIMM/SODIMM w/ECC
Chassis 2RU
Typical 400W

 

DriveNets 5300R

30.4Tbps full duplex
18x800G OSFP network interface ports
20x800G OSFP fabric interface ports
Broadcom Jericho3-based
Hardware Specifications
Interfaces
Network 18 x 800G OSFP
Fabric 20 x 800G OSFP
Inband Mgmt. 2 x 25G SFP28
OOB Mgmt. 2 x 10G SFP, 1 x 1G RJ45
Performance
Switching Capacity 30.4 Tbps
HBM Deep Buffer 16GB
Physical
ASIC Broadcom Jericho3
Memory 64GB DDR4 (2 x 32GB) with ECC
Chassis 2RU
Typical / Max (with optics) 782W / 1615W (14.5W/port)

DriveNets 9300F

102.4Tbps full-duplex
128 x 800G OSFP fabric interface ports
Cell-based switching
Broadcom Ramon3-based
Hardware Specifications
Interfaces
Fabric 128 x 800G OSFP
Inband Mgmt. 2 x 25G SFP28
OOB Mgmt. 2 x 10G SFP, 1 x 1G RJ45
Performance
Switching Capacity 102.4 Tbps
Physical
ASIC 2x Broadcom Ramon3
Memory 32GB DDR4 SODIMM
Chassis 6RU
Typical / Max (with optics) 1135W / 4352W (14.5W/port)
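
The port radices above largely determine achievable cluster size. As a rough illustration only (not a DriveNets sizing guide), a non-blocking two-tier leaf-spine Clos built from switches of radix R supports R²/2 endpoint ports, e.g. 2,048 endpoints with a 64-port switch layer:

```python
def clos_two_tier_endpoints(radix: int) -> int:
    """Endpoint ports in a non-blocking two-tier leaf-spine Clos.

    Each leaf splits its radix: half the ports face endpoints, half
    face spines. Each of the radix/2 spines has one port per leaf,
    so up to `radix` leaves can be interconnected.
    """
    leaves = radix                      # one spine port per leaf
    down_ports_per_leaf = radix // 2    # half of each leaf's radix
    return leaves * down_ports_per_leaf

# Illustrative: a 64-port 800G switch layer (e.g., the 2500S radix)
print(clos_two_tier_endpoints(64))  # 2048 endpoint ports
```

The same arithmetic with a 128-port radix yields 8,192 endpoint ports per two-tier pod.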

Use Cases

DriveNets AI Fabric includes a solution for any part of the networking fabric, including:

  • Scale-up networking – an Ethernet-based (ESUN-SUE/T) solution for rack-scale scale-up networking; an open-standard, low-latency, high-efficiency option for AMD-based architectures.
  • Scale-out networking – fabric-scheduled Ethernet: a cell-based fabric with standard Ethernet connectivity to endpoints, delivering the highest-performance networking for any GPU and any NIC.
  • Scale-out networking – endpoint-scheduled Ethernet: a standard Ethernet Clos architecture with endpoint scheduling performed in the NIC, supporting multiple NIC providers (AMD, Broadcom) and standards-based packet spraying (Ultra Ethernet).
  • Scale-across – a connectivity solution for back-end networking across multiple data centers, suitable for distances of up to 100 km.
  • Front-end and storage networking – a unified fabric for storage-network and front-end connectivity.
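
For the scale-across case, the dominant fixed cost at 100 km is fiber propagation delay. A back-of-the-envelope estimate, assuming roughly 5 µs/km in standard single-mode fiber (a physics approximation, not a DriveNets figure):

```python
def fiber_delay_us(distance_km: float, us_per_km: float = 5.0) -> float:
    """One-way propagation delay in microseconds over optical fiber.

    ~5 us/km corresponds to light traveling at roughly 2e8 m/s
    in glass (refractive index ~1.5).
    """
    return distance_km * us_per_km

one_way = fiber_delay_us(100)  # ~500 us one way at 100 km
rtt = 2 * one_way              # ~1 ms round trip
print(one_way, rtt)
```

At that scale, propagation delay dwarfs switch forwarding latency, which is why cross-site traffic is typically limited to collectives that tolerate roughly a millisecond of added round-trip time.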

Software stack

DriveNets provides an end-to-end software stack, including:
DNOS: a network operating system that runs on multiple hardware options
AI Cluster Orchestrator: a lifecycle orchestration system with engines tailored for:

  • Provisioning: a bring-up, configuration and scaling tool that automatically discovers and provisions the entire AI infrastructure, including servers, NICs, and network components, ensuring complete visibility and seamless deployment of an optimized reference architecture.
  • Benchmarking: an end-to-end cluster performance validation engine that streamlines cluster benchmarking by executing RDMA, RCCL/NCCL, and representative AI workloads to ensure the environment operates at peak performance.
  • Ongoing management: a day-N cluster manager.
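
As context for the benchmarking engine: collective benchmarks such as nccl-tests/rccl-tests report a "bus bandwidth" that normalizes all-reduce time so results can be compared against link speed. A minimal sketch of that standard calculation; the 2*(n-1)/n factor is the all-reduce convention used by those tools, and the numbers below are illustrative, not measured results:

```python
def allreduce_bus_bw(size_bytes: float, seconds: float, n_ranks: int) -> float:
    """Bus bandwidth (bytes/s) for an all-reduce, per the nccl-tests convention.

    alg_bw = size / time; bus_bw scales it by 2*(n-1)/n, the data
    volume each rank must send relative to the message size in a
    ring all-reduce.
    """
    alg_bw = size_bytes / seconds
    return alg_bw * 2 * (n_ranks - 1) / n_ranks

# Illustrative: a 1 GB all-reduce across 8 ranks completing in 50 ms
bw = allreduce_bus_bw(1e9, 0.05, 8)
print(bw / 1e9)  # 35.0 GB/s bus bandwidth
```

Comparing this figure against the per-port line rate is how a validation run decides whether the fabric, rather than the collective algorithm, is the bottleneck.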

DriveNets Infrastructure Services (DIS)

End-to-end professional services for any GPU fabric deployment, including:
Bring-up services

  • Architecture design and validation
  • Deployment and integration of AMD GPUs
  • Performance optimization and troubleshooting
  • Knowledge transfer and enablement of the customer’s internal teams

Software services

  • Kernel optimization
  • Collectives optimization
  • ROCm development and enablement

Related Content
How AMD Instinct Shines in Real-World LLM Inference

AI workloads have moved beyond experimentation into deep production environments. Today, performance is no longer about peak FLOPS (floating-point operations...

NeoCloud Case Study: Highest-Performance Unified Fabric for Compute, Storage and Edge

This case study explores WhiteFiber’s successful deployment of a large-scale AI cluster using DriveNets Network Cloud-AI as its back-end network...

Scaling AI Clusters Across Multi-Site Deployments

This paper outlines the networking challenges and key considerations for expanding AI clusters across geographically distributed sites....
