Season 3 CloudNets
Episode 1 Disaggregation Mythbusters
We’re talking about three myths about the disaggregated distributed chassis, or Network Cloud: the myth that Network Cloud and this architecture in general are just a one-off, the myth of the operational headache, and the myth that it’s a fairly complex architecture.
Listen on your favorite podcast platform
Listen on Apple Podcasts
Listen on Spotify
Full Transcript
Hi and welcome back to CloudNets, where networks meet cloud.
And this is the season three premiere.
Yeah, we didn’t believe we’d see the day. So today we’re going to talk about some myths about the disaggregated distributed chassis, or Network Cloud, or whatever.
And we have our famous mythbuster.
Yes, that’s me. Run.
Thank you for joining.
So Run, I’m going to throw three myths at you.
I’m supposed to bust them.
Yeah, you’re going to bust all three.
Yeah.
So the first myth about Network Cloud and this architecture in general is that it’s a one-off: it’s just DriveNets and it’s just AT&T, because AT&T is the main customer, and this is where it’s going to stop.
Okay, is that so?
Not really.
It goes well beyond AT&T.
AT&T is huge.
AT&T is over 50%.
That’s true.
But we also have tier-one customers based in India, in Japan, and in Europe.
We have the DDBR coming from TIP, which is kind of becoming a more predominant standard, and lots of operators have chimed in on this.
So it definitely goes beyond AT&T.
Besides that, there are also alternatives.
So it’s not just DriveNets deploying DDCs, there are other solutions, it’s just that they are not as good as ours and that’s why you don’t hear about them.
But they do exist.
And actually, in a recent report, I remember reading that out of the ten main SPs globally, four have already deployed DDC and four are in the testing phase.
Correct.
It’s a big thing.
80% of the tier-one market is, yeah, quite big.
So this is the first myth, thank you for busting.
The second one is the operational headache, and specifically the fact that we take the backplane of the chassis and distribute it, with all the cables connecting different boxes.
And there’s an operational fear: how do you approach this?
How do you troubleshoot this?
Okay, on the contrary, traditionally you would have a chassis, right, kind of a big black box.
Anything that goes wrong within the chassis, you’re dead in the water.
You don’t know what’s going on on the inside.
And it does go wrong.
Sometimes you insert a card and one of the pins is bent.
Yeah.
One of the pins is bent, and the whole chassis goes.
Yeah, something will go wrong, because something will just go wrong.
Things happen.
So also in DDC things go wrong.
Absolutely.
But while in a chassis, you don’t have any ability to troubleshoot the insides of the chassis, right?
In a DDC you can. All the interfaces are external, and you have probes and indicators as to what’s going on between these interface points. Therefore you can troubleshoot, in a better way, what’s going on inside a DDC, because the inside is in fact outside.
So while in a chassis you can’t go into the backplane and fix a single connection that went bad, and you have to replace the whole chassis, here you have visibility into the fabric and the chassis, and you can pretty much troubleshoot.
Just because you close your eyes doesn’t mean that the problems go away.
Here, you don’t close your eyes.
Okay, great.
And the last thing is that there are lots of boxes, and it’s a fairly complex architecture, at least in terms of the number of elements in a single entity or a single cluster.
And you will probably need more people.
People are talking about adding full time employees in order to handle this complexity.
Well, actually, also in this case, quite the opposite.
You can solve one big problem, and that takes a lot of expertise, or you can break this problem into many identical, small problems.
This is how you automate.
You can automate many small problems.
But automating one big, huge problem is a bigger challenge.
And the example sounds familiar.
Yeah, because you heard it before, because this is exactly what the hyperscalers have done when building these huge facilities, data centers.
They automated everything: they normalized everything into something which is replicated many, many times, and then applied automation onto this.
And this is how you get to four engineers managing a data center of 100,000 servers.
And wow.
How is that possible?
So, actually, less FTE than… A lot less FTE.
A lot less FTE.
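The "many identical, small problems" argument can be illustrated with a minimal Python sketch. The `Box` record, its field names, and the link counts are all hypothetical and are not taken from any DriveNets product; the point is only that one small check, written once, covers every identical box.

```python
from dataclasses import dataclass


@dataclass
class Box:
    # Hypothetical record for one white box; field names are illustrative only.
    name: str
    fabric_links_up: int
    fabric_links_expected: int


def box_is_healthy(box: Box) -> bool:
    """The one small problem: are all of this box's fabric links up?"""
    return box.fabric_links_up == box.fabric_links_expected


def unhealthy_boxes(boxes: list[Box]) -> list[str]:
    """Automation is the same small check repeated over every box."""
    return [box.name for box in boxes if not box_is_healthy(box)]


# Eight identical boxes with made-up link counts; one has lost a fabric link.
cluster = [Box(f"box-{i}", 13, 13) for i in range(8)]
cluster[3].fabric_links_up = 12

print(unhealthy_boxes(cluster))
```

Because every box is described the same way, growing the cluster from eight boxes to eight hundred changes the input list, not the code.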
Okay, so we promised three myths.
Three myths.
Busted.
And thank you very much, Run, for busting.
Thank you for watching.
Stay tuned for additional episodes of season three.
See you later.
Bye.
Episode 3 Service Provider Panel at MWC
We talked to the panelists from AT&T, Orange, and Telefonica about disaggregation in general and about Network Cloud. They shared their perspective on the benefits of disaggregation: total cost of ownership and cost reduction, the super scalable character of the disaggregated architecture, and that innovation is a major driver for disaggregation.
Full Transcript
Hi and welcome.
Let me.
Right, sorry.
Hi and welcome to DriveNets, a special episode in this season three.
Coming out of MWC, we had a great panel with several participants.
Buenas noches, Dudy.
You were there at MWC.
Yeah.
Give us your insight.
Okay, thank you.
Hi everyone.
So, yes, at MWC, we held a press event with multiple tier-one operators.
We had AT&T.
We had Telefonica.
We had Orange.
Three, as usual.
And they talked for an hour or so.
It was very interesting.
But I think I have three main takeaways.
Three main takeaways from this panel? Let’s hear it.
So we talked to them about disaggregation, about Network Cloud, but also disaggregation in general.
And I think that the main benefits those operators see from disaggregation in general and from Network Cloud in particular are the following:
One is total cost of ownership, cost reduction. At the end of the day, there are multiple reasons for that.
But Orange says, hey, this saved us 20 percent of the overall expenditure over our network.
AT&T, for a specific project, told us that the disaggregated model is ten times cheaper than the classic approach when you look at it holistically, from a TCO perspective.
That would then be a full TCO: not just the CapEx, but the entire CapEx, managing, operating, power, whatever.
Okay, so this is with regard to cost. But it’s not only a savings issue, it’s also a scalability issue when you want to grow, and AT&T mentioned that the average yearly growth in capacity requirements from their network is about 30 percent.
The DDC architecture that they use is super scalable, and because of that, they are now running more than 50 percent of their core IP/MPLS traffic in North America over our DDC solution.
They plan to make the full transition in the next couple of years and we’re talking 600 petabytes a day running through this network.
This is very scalable and this is one of the reasons they went for it in the first place.
And that would mean that the modularity of the DDC is kind of an enabler to any scale that you want.
There’s a high level of flexibility.
You can do basically whatever you want with this infrastructure.
Yeah, basically whatever scale and whatever applications you want to run on this infrastructure.
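To put the two figures quoted above (600 petabytes a day, about 30 percent yearly growth) into perspective, here is a small back-of-the-envelope sketch in Python. It assumes decimal petabytes (10^15 bytes), traffic spread evenly over the day, and a constant growth rate; none of these assumptions come from the panel, and real traffic peaks well above its daily average.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_tbps(petabytes_per_day: float) -> float:
    """Average throughput in terabits per second, assuming decimal petabytes
    and traffic spread evenly over the day (an assumption, not a measurement)."""
    bits_per_day = petabytes_per_day * 1e15 * 8
    return bits_per_day / SECONDS_PER_DAY / 1e12


def growth_factor(annual_growth: float, years: int) -> float:
    """Capacity multiplier after compounding a constant yearly growth rate."""
    return (1 + annual_growth) ** years


def doubling_time_years(annual_growth: float) -> float:
    """Years needed for capacity requirements to double at a constant rate."""
    return math.log(2) / math.log(1 + annual_growth)


print(f"average rate: {average_rate_tbps(600):.1f} Tb/s")
print(f"growth over 5 years: x{growth_factor(0.30, 5):.2f}")
print(f"doubling time: {doubling_time_years(0.30):.1f} years")
```

Under these assumptions, 600 petabytes a day averages out to roughly 55.6 Tb/s, and 30 percent yearly growth means requirements double about every 2.6 years, which is why the ability to keep adding capacity matters so much.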
And the last thing, and I think this is the most important and interesting angle is innovation.
Because all the operators mentioned that innovation is one of the drivers for disaggregation.
This seems a bit counterintuitive, but when you think about it, and this is what Telefonica mentioned, when you decouple hardware from software, you actually break apart the innovation cycles.
So you can focus on the software innovation.
And because software innovation is much, much faster than hardware innovation, this means that the time to market of new services and new capabilities for the network is much, much faster.
So this is a major driver for innovation.
This has a major impact on their top line or on their ability to be competitive, on their ability to launch new services.
And this is, I think, the big news from disaggregated chassis.
That would mean removing, say, the operator’s dependency on whichever vendor it is buying the equipment from.
It’s completely decoupled. And then you have more, I’d say, freedom for the operators?
More freedom.
And the pace at which it innovates depends only on software.
And as we know, it takes much less time to write new software than to develop new hardware.
So if I try to recap it and boil it down: it costs less to build such a network.
The network transformation will meet the needs in terms of scale and in terms of flexibility, and the network is able to evolve into something which I don’t even know today, but which will enable future capabilities.
Yeah.
One, two, three.
Yeah.
And if you want to see the whole event, it’s available on our website.
Wonderful.
Sounds like a great panel.
Yeah, I missed MWC, as you can imagine.
Thank you guys for tuning in today.
And we’ll see you soon again.
Cheers.
Episode 4 Fault Detection, Isolation, and Recovery
What are three things we need to know about how Network Cloud can make your life safer, easier, and quieter when it comes to recovery?
Full Transcript
Hi, and welcome back to CloudNets, where Networks meet Cloud.
Today we’re going to talk about recovering from major faults, those things that keep you up at night.
And we have Calin, our expert, all the way from Toronto, Canada, to talk about this.
So, Calin, what are three things we need to know about how Network Cloud can make your life safer, easier, and quieter when it comes to network failure recovery?
Let’s talk about it.
It boils down to three major things.
It always does.
What it boils down to is a small blast radius.
If something happens, it can be isolated very locally; you know where it is.
That speaks to the idea that the DDC has really simple failure detection.
We can look all the way into all the piece part components, all of the disaggregated components, and find out where the fault really is.
This is really hard to do in a chassis because it’s all inside one big black box and it’s hard to figure out what actually happened here sometimes without getting a lot of sleepless nights in the process.
And the third thing is really fast recovery because the DDC is one big orchestrated set of white boxes.
When we apply things like hot patches, we can apply them once from an operator perspective and have the software distribute them on its own to all of the leafs that we have in these really humongous routers, so that you don’t have to do it one at a time, router by router, maintenance window after maintenance window.
Okay, that’s great.
So let’s see if I got it right.
So the three things you need to know about how Network Cloud makes your life quieter at night.
One is that once you have a major fault, it’s very isolatable.
First of all, it’s smaller blast radius.
And second of all, you can isolate it and take care of it without it affecting the rest of the chassis, router, cluster, whatever, because everything is containerized: each network function resides in a container and you can isolate it.
So this is in terms of blast radius.
The second is that your ability to detect and to run a root cause analysis, et cetera, is much greater with Network Cloud, because you have visibility even into the fabric, which you don’t have in a chassis.
So you see all the internals of the cluster and you can isolate and identify the problem very easily.
And third is that when you need to apply a solution or hot patch or whatever, unlike in a Clos architecture, in which you need to go one by one, router by router, and apply it, here this is a single managed entity in which you can apply the patch to all the boxes in the cluster at once, automatically, and resolve the issue on your own time.
The operator does it once and then the software does it box by box on its own so you can sleep better at night.
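The "operator does it once, software does it box by box" idea can be sketched in a few lines of Python. `ManagedCluster`, `patch_one_by_one`, and the patch identifier are hypothetical names invented for this illustration; they do not describe the actual Network Cloud software or its interfaces.

```python
class ManagedCluster:
    """Hypothetical stand-in for a cluster managed as a single entity."""

    def __init__(self, box_names):
        # Every box starts without any patch applied.
        self.applied = {name: None for name in box_names}
        self.operator_actions = 0

    def apply_patch(self, patch_id):
        """One operator action; the loop plays the role of the software
        distributing the patch to each box on its own."""
        self.operator_actions += 1
        for name in self.applied:
            self.applied[name] = patch_id


def patch_one_by_one(router_names, patch_id):
    """Contrast case: each standalone router needs its own operator action."""
    applied = {}
    actions = 0
    for name in router_names:
        applied[name] = patch_id
        actions += 1
    return applied, actions


names = [f"box-{i}" for i in range(20)]
cluster = ManagedCluster(names)
cluster.apply_patch("hotfix-1")
_, manual_actions = patch_one_by_one(names, "hotfix-1")

# Operator actions: single managed entity vs. router by router.
print(cluster.operator_actions, manual_actions)
```

Both approaches end with the same patch on every box; what differs is how many times a person has to act, and therefore how many maintenance windows are needed.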
Yes, this is our main goal: to sleep better at night.
Thank you very much, Calin, for coming all this way.
Thank you for watching.
See you next time on CloudNets.
Bye.
Episode 5 Real-World Network Operations
What three things can we learn about Network Cloud in real-world network operations?
Full Transcript
Hello.
Welcome to CloudNets.
Today we have a special episode: we have a special guest, and you have me as a special host, too.
The topic of this session is real-life experience.
You’ve heard a lot of things from me and from Dudy about, in theory, how this is good, but you know that there is one big customer where the CloudNets or Network Cloud is being installed, and that customer is AT&T.
We have one guy who has been there from the very beginning and up until now.
That’s Abishek.
So Abishek, a few points from your perspective of your experience with Network Cloud.
Sure.
So with Network Cloud, basically, you get the ability to manage multiple devices as a single entity, which you don’t see in a Clos architecture.
And we can monitor the services running on the cluster and the health of the cluster, down to the resolution of each entity, in real time.
And this is one of the main highlights of the orchestrator that we have with DNOR.
And when we do have an issue, we get a looking-glass view into the cluster showing where this issue is.
And it’s very easy to isolate and fix the issue.
Cool.
So I think these are very important when you’re operating the cluster: it gives a real-time view of the system and it makes it even easier for the operations team to use.
Nice.
So if I bring this down to my perspective of looking at things, first off is that we have one network element eventually, which we are building, and it can scale up as much as we want, but DNOR makes it into one element which is being managed and then tracks everything that is going on on the inside.
So we have full visibility into the details of the services, the cluster health, all these kinds of issues.
And then when something goes wrong, we have DNOR to the rescue, and it helps us in terms of pinpointing the problem and patching in kind of a localized fix for it.
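The combination of "one managed element" and "visibility down to each entity" can be pictured as a health roll-up with drill-down. The Python sketch below is illustrative only: the `Status` levels, element names, and functions are hypothetical and are not DNOR’s data model or API.

```python
from enum import IntEnum


class Status(IntEnum):
    # Hypothetical severity levels, ordered so that a higher value is worse.
    OK = 0
    DEGRADED = 1
    DOWN = 2


def cluster_status(elements: dict[str, Status]) -> Status:
    """Roll-up: the cluster is only as healthy as its worst element."""
    return max(elements.values())


def worst_elements(elements: dict[str, Status]) -> list[str]:
    """Drill-down: which elements are responsible for the cluster status?"""
    worst = cluster_status(elements)
    if worst == Status.OK:
        return []
    return sorted(name for name, status in elements.items() if status == worst)


# Made-up snapshot of a small cluster.
snapshot = {
    "packet-box-1": Status.OK,
    "packet-box-2": Status.DEGRADED,
    "fabric-box-1": Status.OK,
    "fabric-box-2": Status.OK,
}

print(cluster_status(snapshot).name, worst_elements(snapshot))
```

The operator sees a single status for the whole cluster and only needs to look at individual elements when that status is not OK.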
Exactly.
Classic.
Wonderful.
Okay, Abishek, thank you very much for coming all the way for this session and for sharing your real-world network transformation experience.
And thank you guys for chiming in once again.
Cheers.
Thank you.
Episode 7 Total Cost of Ownership
While service providers often look at TCO in terms of 3-4 years, we’ll talk about TCO in longer terms, like 8-10 years.
Full Transcript
Hello.
Welcome to CloudNets.
Today we have a special episode and I’m taking Dudy’s part today because I’m doing the hosting, and the reason is we have a real expert from the real world who is coming in to bring his own experience.
The topic of the day is: how do we measure TCO – Total Cost of Ownership?
For this, we have Juan coming in from Spain.
Juan, give us your perspective on this.
Okay, thank you very much, Run.
So, talking about TCO, we know that service providers normally consider short-term TCO, which is three or four years.
I want to talk about longer terms, right?
Let’s talk about spans of eight to ten years.
So how can we provide benefits for service operators in that span?
The first thing is that the cluster can scale without limit, from 2.4 Teras to hundreds of Teras.
What does it mean?
All the investment that you are doing is protected, right?
You don’t need to change any of the pieces, you just keep adding boxes and all the investment that you did in the past, you don’t need to throw it away.
Cool.
That’s one.
Second point.
Our cluster is distributed.
We support different pieces from multiple vendors, which means that we have more flexible end-of-life policies, right?
If some of the components go end of life, you only need to replace that component and you can choose from a variety of providers to do that.
So that means that the investment can be postponed.
You only need to focus on the specific part of the cluster that has gone end of life.
Nice.
Cool.
And the third topic, because we are disaggregated, there is a decoupling between the hardware and software.
The lifespan is different, right?
So you don’t need to tackle software investments when you need to change the hardware.
So that also means that you can postpone that software investment and leave it for when it’s really needed.
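The decoupling argument can be made concrete with a toy cost model in Python. All costs and lifespans below are made-up placeholders, not figures from DriveNets or any operator, and the model ignores discounting, price changes, and operating costs. It compares a coupled system, where everything is replaced when the shortest-lived part expires, with a decoupled one, where each part follows its own lifespan.

```python
def purchases_needed(horizon_years: int, lifespan_years: int) -> int:
    """How many times an item is bought within the horizon (ceiling division)."""
    return -(-horizon_years // lifespan_years)


def decoupled_cost(horizon_years, items):
    """Each item is re-bought only when its own lifespan runs out."""
    return sum(
        cost * purchases_needed(horizon_years, lifespan)
        for _name, cost, lifespan in items
    )


def coupled_cost(horizon_years, items):
    """Everything is re-bought together whenever the shortest-lived item expires."""
    shortest = min(lifespan for _name, _cost, lifespan in items)
    total_cost = sum(cost for _name, cost, _lifespan in items)
    return total_cost * purchases_needed(horizon_years, shortest)


# Hypothetical (name, cost, lifespan in years) entries, for illustration only.
items = [("hardware", 70, 5), ("software", 30, 10)]

print(coupled_cost(10, items), decoupled_cost(10, items))
```

With these placeholder inputs the decoupled model spends less over ten years because the software purchase is not repeated at the hardware refresh; with different inputs the gap can shrink or disappear, so the sketch shows the mechanism rather than a result.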
Really nice.
Okay, let me see if I got it straight.
One thing is that the cluster will last longer in the network because it can scale, I wouldn’t say indefinitely, but really to large, very large sizes.
Second is that ours is a distributed cluster, as part of the DDC.
Every component can be upgraded or replaced on its own, completely independently from the rest of the components.
And third, another aspect of independence where the software, once you upgrade it, has no impact on the hardware and vice versa.
So in fact, we have distributed, disaggregated, and the fact that we are a chassis.
So that all boils down to DDC.
And the bottom line is that it lasts longer.
It lasts longer.
And there are additional benefits when you consider TCOs in the longer term.
In the longer term.
Wonderful.
Thank you very much, Juan.
Been a pleasure everybody.
Thank you, Juan.
Thank you.
Bye bye.
Episode 9 KDDI Deploys Network Cloud
We’re going to talk about KDDI, one of Japan’s major tier-1 operators. They recently announced that they have implemented the DriveNets Network Cloud solution in their network, and they are very satisfied. They are talking about huge TCO advantages, about 46% savings in power and 40% savings in rack space, etc.
Full Transcript
Hi, and welcome back to CloudNets, where networks meet cloud.
And today we’re going to talk about KDDI, one of Japan’s major tier-1 operators, that recently announced that it has implemented DriveNets’ Network Cloud solution in its network.
And they are very satisfied. They are talking about huge TCO advantages.
They’re talking about 46% savings in power and 40% savings in rack space, etc.
Correct.
And we have our Japan expert.
Yeah.
Run, thank you for coming all the way from Tokyo.
Yeah, Run, let’s talk about KDDI.
And why is this deployment of Network Cloud so important for network transformation?
It’s a major deployment, but it is also a unique one.
It is unique for three main reasons.
Well, it’s true.
All the benefits that KDDI gained from this, put those aside.
It’s very important, obviously, from their perspective.
What does it mean for DriveNets?
Well, the first thing is that we’re talking about a different use case.
Up until now, we’ve been talking about core as a use case, aggregation.
In this case, it’s a peering router.
It’s a third version or a third type of a router, all of which are imposed or deployed onto the same DDC infrastructure.
Same solution, new use case or new implementation method.
The classic disaggregation of software from hardware functionality is here.
Hardware is a different place.
The second item is that we are not using UfiSpace as the ODM. In this case, it’s a new ODM.
Yeah.
We’re using Delta in this case, the customer’s preference.
So the customer could actually choose which hardware vendor to work with, because in a disaggregated system, you select the software and pick any hardware vendor which is compatible.
Correct.
As you choose.
Correct.
And then the third item, and perhaps third and a half: this is a non-US customer.
There is talk, there is rumor that DDC has been built specifically for AT&T.
That’s not the case.
It’s a uniform solution.
It applies to multiple use cases.
And any customer, any tier-1 operator worldwide, can take this model and adopt it into their use case.
In this case, it’s a customer in Japan.
So everything to do with reliability, even though we passed AT&T, of course.
But in this case, Japan is kind of a higher limit of …
And it was a long and very detailed process of checking.
Definitely.
Okay.
One, two, three things.
Yeah.
So, just to recap the three reasons the DDBR or DDC implementation of the Network Cloud in KDDI is so important:
One, it’s a new use case.
It’s an additional use case to the ones we are used to when deploying DDC.
It’s a peering routing implementation with the same infrastructure.
The second is that it introduced a new flavor of hardware, a new ODM, in this case, Delta.
It could have been UfiSpace, it could have been Edgecore.
And this is the nice thing about disaggregation.
You choose the hardware vendor which is most suitable for you.
And the third is it’s a new geography.
It’s the first time in Japan for a commercial deployment, and again it just goes to show that this solution fits not only the North American carriers but carriers worldwide.
And we see it with other customers. It’s a generic solution.
Thank you very much, Run, for this.
Pleasure.
Thank you for joining and watching us.
See you next time on CloudNets.
Arigato.
Episode 10 DDC – the best of Chassis and Clos
From the perspective of scale, absolutely Clos topology outranks a chassis. From an operational standpoint, a Clos topology would mean that you will need to manage a lot of boxes and that’s a bigger headache than managing a single device, a chassis, one stop shop. But from a Capex standpoint, a Clos topology prevails over chassis. Looking at the alternative, Distributed Disaggregated Chassis (DDC), it takes the same scalability factor of a Clos, the operational aspects of a chassis and even when you look at the cost side of things, from an operational standpoint, DDC operates as a single device. Then from a Capex standpoint, DDC operates or works as white boxes.
Full Transcript
Hi, and welcome back to CloudNets, where networks meet cloud. Today we’re going to talk about something different.
You know, chassis and Clos?
Which is better?
And we have the ‘which is better’ expert, ‘Run Clos’, who is no relation to the original one.
Let’s break it down.
The chassis versus the Clos topology: from the perspective of scale, absolutely, a Clos topology outranks a chassis.
Yeah, right, easy enough: add tiers and fan out. That’s easy to do.
From an operational standpoint, a Clos topology would mean that you will need to manage a lot of boxes, okay?
That’s a bigger headache than managing a single device, a chassis, one stop shop, everything kind of boils down to one device.
Definitely. From an operational standpoint, it’s better to have fewer elements in your network, not only from an operational perspective but also for IGP convergence and the IGP routing protocols.
Yeah, the whole shebang.
That’s true.
And then when it comes to cost: on the operational expenditure, I’d say chassis.
Again, one box to manage is easier than managing thousands of them.
From a Capex perspective, a Clos topology usually associates itself with white boxes, which are coming from ODMs, meaning pay as you grow.
Pay as you grow as well, and then you start from a lower price point.
So I would say from a Capex standpoint, a Clos topology would prevail against the chassis.
But outside of these three points, scale, operations, and cost, let’s look at the alternative. Let’s look at DDC.
Oh, okay.
Because DDC takes the same scalability factor of a Clos and the same operational aspects of a chassis. And even when you look at the cost side of things, then again, from an operational standpoint, it operates as a single device, and from a Capex standpoint, it operates or works as white boxes.
So DDC kind of takes the best of both worlds when it comes to Clos versus chassis.
DDC wins.
When you come to think about it, it makes sense, because DDC is actually taking a chassis and breaking it up into white boxes that are in a Clos architecture.
No coincidence there.
Yeah, no coincidence there.
It’s just both.
It’s simply put…both.
Okay, so this is interesting because we wanted to understand which is better, chassis or Clos.
And we came to the understanding that you do not have to choose because if you go DDC, you can have the scale of Clos and the operational efficiency, lower number of elements in the network, et cetera, of a chassis and the cost advantages of both.
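The comparison made in this episode can be written down as a small lookup in Python. The `WINNERS` table only encodes the verdicts stated in the conversation (Clos for scale and Capex, chassis for operations, DDC matching the better option on each); it is a summary of the discussion, not measured data, and the names are chosen for this illustration.

```python
# Per criterion, the architectures the speakers rated as the better option.
WINNERS = {
    "scale": {"Clos", "DDC"},
    "operations": {"chassis", "DDC"},
    "capex": {"Clos", "DDC"},
}


def wins_everywhere(winners: dict[str, set[str]]) -> set[str]:
    """Architectures that are on the winning side of every criterion."""
    return set.intersection(*winners.values())


def criteria_won(winners: dict[str, set[str]], architecture: str) -> list[str]:
    """Criteria on which a given architecture is rated as the better option."""
    return sorted(name for name, best in winners.items() if architecture in best)


print(wins_everywhere(WINNERS))
print(criteria_won(WINNERS, "chassis"), criteria_won(WINNERS, "Clos"))
```

Chassis and Clos each win on different criteria, which is the "you do not have to choose" conclusion: only DDC appears in every row.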
So it’s a win win.
It’s a win-win-win, and above everything, it gives you, as the customer deploying this, maximum control over what’s inserted into your network.
Okay.
So I didn’t see this coming. Actually I did.
Thank you very much, Run.
Thank you for watching. See you next time on CloudNets.
Bye bye.