CloudNets S5 E5: Orchestrating the world’s largest temporary network
Orchestrating the world’s largest temporary network
In this episode, Dudy Cohen and Brad Riapolov take a look behind the scenes of the world’s largest temporary network. Hear about the real challenges of orchestrating massive, multi-vendor networks—and how a new platform aims to simplify network operations end to end, with big news coming in 2026.
Chapters:
- 00:00 – Intro
- 00:30 – The world’s largest temporary network
- 00:58 – What are the main obstacles in orchestrating this network?
- 01:55 – Can we expect to work like hyperscalers, simple and quick, for setting up networks?
- 02:36 – Some big news in 2026
Key Takeaways
- Large-scale networks are still extremely complex to deploy and operate
Even with automation tools, building a massive, multi-vendor network, like the temporary supercomputing network at SuperCompute25, requires hundreds of experts, weeks of effort, and deep operational skill due to inconsistent standards and vendor implementations.
- Today’s networks reflect decades of accumulated operational debt
The temporary supercomputing network serves as a microcosm of real-world production networks that have evolved over 15–20 years, highlighting how manual processes and fragmented orchestration approaches continue to slow down operations.
- A shift toward full lifecycle network automation is coming
DriveNets is developing a platform designed to manage the entire network lifecycle end to end, significantly reducing operational time and complexity, with major announcements expected in 2026.
FAQs
-
What is network orchestration?
Network orchestration is the process of coordinating and managing different network components so they work together as a single system, across multiple vendors and technologies.
-
Why is network automation still challenging?
Standards differ, and vendors implement them differently, mixing open models with their own native models. Making everything work together across vendors still takes many people and a lot of time, even with automation tools.
-
What does “network lifecycle management” mean?
It refers to managing a network from initial setup through ongoing operations, with the goal of simplifying processes and reducing the time and effort required to operate the network.
-
How can network design reduce blast radius?
Blast radius can be reduced by splitting the network into smaller, isolated domains. Techniques include segmenting the network using routing protocols (such as IGP or BGP) or creating architectural “islands” that prevent failures from propagating beyond their local domain.
-
How does a distributed chassis architecture reduce blast radius?
In a distributed chassis, failures are isolated to individual elements rather than impacting the entire system. If a single box fails, only customers directly connected to that box are affected, while traffic for other customers is rerouted through redundant elements in the cluster.
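The failure isolation described in the last answer can be illustrated with a minimal sketch. It assumes a simplified model in which each customer attaches to exactly one box; the names `attachments`, `blast_radius`, `box-1`, and `cust-a` are hypothetical and do not come from any DriveNets product or API.

```python
def blast_radius(attachments, failed_box):
    """Return the customers directly attached to the failed box.

    attachments maps a customer name to the box it connects to
    (a simplifying assumption: one box per customer).
    """
    return {customer for customer, box in attachments.items() if box == failed_box}


# Hypothetical cluster layout: customer -> box.
attachments = {
    "cust-a": "box-1",
    "cust-b": "box-1",
    "cust-c": "box-2",
    "cust-d": "box-3",
}

affected = blast_radius(attachments, "box-1")
unaffected = set(attachments) - affected

print("affected:", sorted(affected))
print("unaffected:", sorted(unaffected))
```

In this toy model, only the customers on the failed box appear in `affected`. Rerouting through redundant elements is not modeled here.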
Read the full transcript
Hi and welcome to CloudNets, where networks meet cloud. This is a special edition, because we are here in St. Louis at the SuperCompute25 event, and we have Brad, our supercomputer expert. And Brad, it’s been a tough couple of weeks for you, right?
Yes.
The largest temporary network on earth
What are we seeing behind us?
I have to tell you, if you look behind us, you are seeing the largest temporary network on earth. It’s assembled for two weeks by hundreds of volunteers from multiple research and educational institutions.
It’s called SciNet.
SciNet, SuperCompute. Yeah. Okay, and what is your takeaway from this very busy couple of weeks? I understand you are here for two weeks bringing up this beast. Is there any takeaway as to how simple or how hard it is, or what the main obstacles are in orchestrating all this mad stuff?
What are the main obstacles in orchestrating large networks?
Yes, it’s a beast to set up: many vendors, hundreds of volunteers, and only a two-week timeframe in which we need to make sure that everything works together. I think that this network is a small microcosm of the networks that are running out there in the real world and have been around for 15, 20 years. And it goes to show how much skill, as well as how much time, it takes to set up these types of networks. In the past we used to do them manually. It took a long time. I’d tell you we’re using a third-party automation tool, but the problem still exists.
What about hyperscale-style orchestration for networks?
So the orchestration that’s supposed to make everything happen is not as simple as you.
Not simple, because the standards are different. Yeah, the way that vendors implement standards is different as well. They have open models, they have their native models. And to be able to plug it all together and make it work takes hundreds of people as well.
So Brad, any bright future ahead of us? You know, because I would expect, now that we’re in 2025, to work like hyperscalers, to have everything automated, everything very easy and quick. Can we expect something like that in the future?
Yes, we can.
DriveNets big news for 2026
We already have something. So DriveNets is working on a platform that will address the entire lifecycle of your network ecosystem from A to Z, completely simplifying the operation as well as reducing the time it takes operationally to drive that network. The biggest reduction is how much time it takes you to operate your network. And that’s the platform that we’re.
That sounds very promising. And we are going to have some big news during 2026 around this area. So it’s worth keeping up with the news from DriveNets. Thank you very much, Brad, for those two weeks and for this video. Thank you for watching. See you next time on CloudNets.
Bye.
Bye.