Podcasts | May 18, 2020
DriveNets Discusses Network Disaggregation with Packet Pushers
Welcome to Packet Pushers Heavy Networking, and boy do we have a heavy networking show today! We’re talking to DriveNets, the sponsor of this episode, and we’re getting into the full-blown software and hardware architecture of a clustered network operating system, and it’s more than that, you can do a whole lot with this thing. We’re speaking with Amir Krayden, VP of research and development, and Yuval Moshe, VP of products. And we get nerdy with it, don’t we Greg? We sure do. And you know, I’m not a huge fan of the chassis. I love me some hardware, but I’m really over the chassis, the complexity of chassis, and how difficult it is for vendors to make them work. I also want the reliability of a disaggregated solution, so lots of off-the-shelf components. “Can I build a chassis out of those disaggregated components?” is the question, and DriveNets is exactly the answer to that. Yes, they want to build a telco-class, communications-service-provider, chassis-based router, hundreds of terabits of router, out of standard off-the-shelf components, and in this show that’s what we’re diving into. It’s a chassis without the chassis would be a good way to put it, and it grows with you. Again, this is a nerdy show folks, so dig in and enjoy this sponsored episode with DriveNets. So we want to get into DriveNets here, but this is the first time you guys have been on the Packet Pushers Heavy Networking podcast. Can you give us the 10,000-foot overview of DriveNets? Can you explain what the DriveNets product does?
DriveNets: Yes, of course. The company was originally established to help CSPs shift their network business model, so they can overcome their increasing challenges and meet future business needs. The entire product was inspired by the hyperscaler model, which provided all kinds of answers to the unique business environments and challenging requirements that their industry is facing. If you think about it, what’s the major challenge that any service provider has today? It’s network capacity scaling. The growth is driven by all kinds of streaming and content-driven applications that we all know today. 5G, IoT, everything is leading to a giant growth in capacity demands, I would say about 50 to 100 percent in the core, year over year. Now, networks in the last 30 years probably haven’t changed much; they’ve just been built from larger chassis that grow bigger and bigger and get replaced every three to five years, but there’s nothing new there. It’s just that the costs are higher, and those operators or service providers need to pay or spend extra money every time there’s capacity growth, and that’s not a viable solution anymore. The cost of the network scales up with the actual demand. You need to find an alternative solution or really simplify the network operation, because you have both opex and capex that don’t scale linearly with the network.
So what you’re drilling into here is the problem in communication service providers, this is telcos or communication providers or network providers, whatever you want to call them: typically they go out and buy a 12-slot or an 18-slot or a 25-slot chassis, and then they try to fill it up with all the boxes, all the interfaces, so that they can get the maximum amount of throughput. But that’s not very scalable. It’s vertical scale. The boxes are expensive, because the bigger the vertical scale, the more expensive they are, and they’re actually not that stable, as it turns out most of the time. So your idea is: is there a way to break this down into smaller chunks, into disaggregated pieces, and then drive it as though it were a single unit, to make 20 different network devices act like a single unified piece, and throw out the chassis and replace it with a disaggregated bundle?
DriveNets: Yes, exactly. You need to create a model of distributed and disaggregated network elements that looks like a single chassis or a single router to the operator, but where the entire opex, the TCO, the capex that they have for the product itself is much simpler. It meets their needs and it brings additional value for their future needs. That could also be value-added services, and those are new opportunities for them.
So this is drilling into the idea of buying an ODM, a standard unit switch, a special switch. This is, I assume, the hardware, and we’ll talk about hardware later, but you’ve got a special switch that’s suited for the CSP market, the service provider market, and you’re basically stacking lots of them together to turn them into something like a chassis with your operating system and your controller, is that right?
DriveNets: Yes, exactly. The solution itself, the product, is based on two white boxes. One of them is the fabric, similar to the fabric of a monolithic chassis, and the other one is similar to a line card. Using those two white boxes you can scale from a single line card, which is four terabits, up to almost 768 terabits, which is the largest cluster that we have, with the same type of white box. You don’t need all those different routers, different hardware, different components for the various use cases in the network; you can use just a single type of white box for all use cases.
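To make that scaling arithmetic concrete, here is a quick back-of-the-envelope sketch in Python. The 4 Tb/s per line-card box and the 768 Tb/s maximum cluster figures come straight from the conversation; the function name is just for illustration.

```python
# Each white-box "line card" forwards 4 Tb/s; the largest cluster
# mentioned in the show is 768 Tb/s, built from the same box type.
LINE_CARD_TBPS = 4

def line_cards_needed(target_tbps):
    """Smallest number of identical line-card boxes covering the target capacity."""
    return -(-target_tbps // LINE_CARD_TBPS)  # ceiling division

print(line_cards_needed(4))    # a single standalone box: 1
print(line_cards_needed(768))  # the largest cluster mentioned: 192 boxes
```

The point of the exercise: capacity grows one 4 Tb/s box at a time, rather than in chassis-sized jumps.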
So in the same way that with a chassis I buy a couple of fabrics with a couple of fabric cards and then I buy one line card, this is saying I buy a disaggregated unit, a switch running a particular type of ASIC from whomever I buy it from, and I use as many of those as needed to reach the performance that I want. That would imply that you’re also using a disaggregated software strategy instead of putting it all inside one single chassis. Now you’ve got the components spread across all of the units that are in the cluster.
DriveNets: Yes, exactly. There are actually various levels of disaggregation. The first one is what you mentioned about hardware: naturally, you disaggregate software from hardware. Now, that’s not new; it has existed in the world for several years, mostly in the hyperscalers’ x86 environments, but when you’re talking about service providers, the scale and performance need to be much higher. That’s in terms of hardware versus software. When you’re talking about disaggregating the software, you’re separating the control plane from the data plane. We’re using off-the-shelf x86 servers to run our software, which is container-based, similar to the hyperscaler approach, and on the actual white boxes we’re running containers whose main function is to run the data plane features, if you take as an example quality of service, access lists for security, and those kinds of things.
Okay, so we’ve got the idea of disaggregation, and then the operating functions are distributed across all of the devices in a cluster. Let’s bring this around to why service providers would be interested in this. That again is because, since I’m building it from disaggregated pieces of ODM equipment, my capex is lower; I’m building with white boxes using merchant silicon, standard off-the-shelf components. I’m also able to buy it in chunks: I don’t have to buy one big chassis and go to town, I can scale horizontally, I can increase it piece by piece. Is that right?
DriveNets: Yes, exactly. Now, if you think about it, we spoke about the main challenge they have today, which is mostly the network capacity growth. I think it actually relates to three main topics, or three main issues, that they have to deal with. First, the installation they have in the network today has very low resource utilization. Their network is segmented into all kinds of dedicated physical network sections per type of service. If you think about it, mobile, broadband, enterprise: each one of them has its own segmented network using different types of routers, and there’s not a single white box or a single vendor that supplies all the different use cases. That’s the first one. The second one is complexity: multiple networks, multiple types of use cases. The operational complexity is huge: there are different router types, software releases, hundreds of components for each one of those routers. You need a different type of fan unit or PSU for each, which means you need to store them in inventory and save them for spare parts. Lastly, it’s a high-cost model: the entire model they have today, from both a capex and an opex perspective, as we mentioned, just scales up linearly.
So basically what you’re trying to do is stair-step the capital expenditure so that you’ve got just the capacity that you need. You’re only burning up space for exactly what you need; you’re not putting in a big multi-slot chassis for the capacity you’re planning for in ten years’ time, you’re just using up the space that you need for now. You separate the control plane and the data plane so that the routing protocols are separated from the forwarding planes. I guess that would be a requirement of the architecture, really, wouldn’t it?
DriveNets: Yes, exactly. If you want to support the disaggregated model, you have to have some kind of separation of software between the control plane and the data plane, because you want to run the compute resources externally, on dedicated or off-the-shelf x86 servers, and you want the white boxes to act as single entities that can run their own use cases.
There’s a point of clarification we need to make here. We’ve talked from a hardware perspective about white box switches, I believe you guys have a partnership with Broadcom, and we’ve talked about x86 machines. If I’m a service provider, of course I also have a huge investment in a ton of, I was going to say legacy hardware, but that’s probably a bit harsh. But you know, the big routers and switches, the big iron out there that we’ve been talking about that’s so expensive and such. Is the DriveNets solution completely separate from all of that hardware I already have installed? Is it kind of like a green-field thing where I can add capacity to this new thing while my existing network is something else? How does this all fit together?
DriveNets: So that’s a very good question. I think that’s one of the questions the operators or customers always ask us: how would they actually deal with that new network, because they don’t really have a green field. Our suggestion is always to use some kind of migration path. We install our solution, our clusters, in parallel, and if you think about it, in most service providers today, if you bundle the entire capacity of the network together, it doesn’t even reach a single cluster that we provide. That means when you install our solution, you start taking over the new growth in the network and slowly merge what you mentioned as legacy routers, legacy use cases, into the new network. Within, let’s say, a year or two, you’re pretty much done with the old network and you’ve moved everything to the new network, which is less equipment. It could be large clusters that handle the entire capacity of the network.
Adding one more thing to that: the beauty is that maybe it’s not a new problem. If you go out and buy a very large new router from an incumbent vendor, and you had an old one, they’re not speaking to one another, so you’re faced with that problem as well. Your migration plan, your move toward “how do I build a new network on top of, or alongside, another one,” will always be there when you make giant leaps in technology.
Right, because I invested in some particular chassis that solved a problem I had seven years ago. Nowadays I need to go to a completely different and new architecture, and that impacts my spending, that impacts my inventory, that impacts how I do operations. So you’re making the point that if I go DriveNets, it’s not like that’s creating a new problem for me, I have that problem anyway, but with the DriveNets approach, now I’m building out a disaggregated model, a model where I can unify my operational approach. I think another point here that was interesting, that kind of came up in passing as you guys were introducing this: I now have flexibility to build out new applications in a way that was maybe more difficult before. I think for service providers that’s a really big deal, because the network is the business if I am a service provider, so being able to build new applications that generate revenue for me is a really big deal. Can you talk about that a bit?
DriveNets: Yeah, maybe I can shed some light on how we actually support value-added services in our architecture. First of all, and we’ll get into it when we speak about the software architecture, the fact that we’ve completely disaggregated the control plane from the data plane gave them a few benefits they never had before, even before value-added services. If you think about what a service provider needs to do to enrich its own offering, it would mainly be to go into the control plane and add functionality. When you’re doing that on top of a chassis, you’re going and touching the device itself. When we took the control plane out, first of all you have a very nice room for innovation, in which you can update this control plane without touching any hardware. You’re upgrading a VM, you’re upgrading a Docker container; it’s a much easier approach than you’ve ever seen before. Again, when we speak about the software architecture, our view is not having yet another network operating system, but having a virtualization layer over the white box and a virtualization layer over the compute. When we look at it this way, we open up a very nice new field: if I have DDoS mitigation that I want to do, maybe part of it could be on what we call the line card, the white box, and maybe part of it can run on our control plane, much closer to the networking environment. If someone is going to do a 5G application which needs to be very, very low in terms of delay or jitter, we can run their Docker container on top of our architecture. And for sure, looking into the future, looking into hybrid clouds where a service provider can offer services not only in their big central offices but also at small central offices, there is great room for innovation.
I think the trick here is also that you’re using a modern approach to the operating systems on the devices as well. If you’ve been using the big chassis-based routers for capacity and forwarding, you’ve had a lot of things said to you about software quality and promises about how it operates. But what you’re doing is using the container-based model, and as far as I understand it, correct me if I’m wrong, you actually break down the components into small containers so that you can solve the problem of software compatibility and software interactions in smaller pieces. It’s not a monolith, it’s a microservices-oriented architecture. Does that improve the usability for customers?
DriveNets: I think there are two layers to what you ask, and first of all, you’ve described it completely correctly. One is that the price you have to pay to bring new features into the system is now much lower: you’re not going into a complete, full software upgrade of a very large chassis, you can upgrade that exact Docker container, which gives you a whole new world of how you do patch management in an operational environment. And exactly as you said, I think the most beautiful thing in the way we’ve solved it wasn’t looking at something like Cisco’s IOS XR or Juniper’s Junos and saying this is the way a NOS should be built, but, to a very large extent, looking at how web applications are built today. They’re very resilient, they’re very highly available; they just solved the problem in a completely different way.
Well, that might be a good question to lead us into a high-level overview of the software architecture then; we sort of laid out a bit of a pathway there. Yeah, we need to go there, because I was playing back what we’ve been talking about in my mind: okay, we’ve got this separation of control plane and data plane, but then we’re talking about where different containers could be running, and it could be in either place. So, Amir, I agree, I think we’ve got to back up a bit and talk about that software architecture again, so it becomes a little clearer in our minds where all these components are and how they interact.
DriveNets: At least for the beginning, try to visualize a chassis, and visualize a standalone box, and ask yourself one very important question: is there any difference? Do I, as a software vendor, really want to see a difference between how I operate my software on a 16-line-card router versus a single-line-card router? For us it came to mind very early that we absolutely do not want to see any difference; we want to see the exact same software stack functioning the same way. So the way to envision it is that a line card might just be a logical function: it’s a card that does forwarding. If it’s a standalone box, on that same standalone box I have an ASIC that does forwarding. It’s the same exact problem. My control plane running BGP, running IS-IS, running OSPF or a traffic engineering protocol can run on an RP if it’s a chassis, or it can run on the box if it’s a standalone box. It can run on both: if I have a very large cluster, I have BGP running on the server and I have the line card functionality doing the forwarding. So from our perspective, the beauty was to solve it the exact same way.
Yeah, and I can actually put your software on an x86 box as much as I can on an ODM Broadcom ASIC switch, or one of any sort; the same software applies in every case.
DriveNets: Completely correct. When DriveNets originally started, we were building a lot of VNFs, a lot of virtual network functions. We built our own forwarding layers, and today we keep the exact same interfaces. My solution running over native x86 and running over an ASIC looks exactly the same. Essentially, the way we measure how good our design is, the way we measure how flexible it is, is by the ability to move between platforms, to move between ASICs, doing exactly what you said. I think that’s the beauty of the solution; this is how we should be measured.
Yep, that flexibility is something that interests me, because I think increasingly the network doesn’t stop at the edge of the Ethernet port; it goes right into the VM or into the server itself, or you may actually want to use software as a forwarding plane for some reason. So, you mentioned cloud native along the way. What does that mean at DriveNets? Because a lot of companies talk about their application being cloud native, and it normally means they’ve got a container in it, and that’s usually enough to qualify as cloud native. What does DriveNets mean when you talk about cloud native?
DriveNets: We mean, first of all, that we started by looking at the problem as a microservice-based one. I’m not doing a line card, I’m doing a forwarding service. Whether this forwarding service resides on an ASIC or on an x86, it doesn’t matter; it has the functionality of forwarding packets, and I need to look at it as an orchestration problem. If I take you back to the problem of a router, take a protocol like LACP, for example. This protocol controls a single port. I don’t need to run it on the server; I need to know that this protocol is relevant at an exact point, which in our case is the white box.
You wouldn’t even want to run it on the server; there would be too much latency introduced into that specific process.
DriveNets: Exactly. But if you look at the inherent problem, it’s an orchestration problem: where do I put the service so it’s best fitted to the function it needs to serve? And this is cloud native, meaning it’s much more than running a Docker container; it’s having the intelligence to put the Docker container in the right place, to optimize, from a system perspective, what we’re trying to achieve.
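The placement logic being described can be sketched in a few lines of Python. This is a toy model of the idea, not DriveNets' actual orchestrator: the service names and attributes are illustrative, and the rule (latency-critical or port-scoped services run on the white box, the rest on the external x86 servers) is just the heuristic the conversation implies.

```python
# Toy sketch of the placement decision described above: latency-critical,
# port-local services run next to the ASIC; heavyweight router-wide
# control protocols run on the external x86 servers.
SERVICES = {
    "lacp": {"latency_critical": True,  "scope": "port"},
    "bfd":  {"latency_critical": True,  "scope": "port"},
    "bgp":  {"latency_critical": False, "scope": "router"},
    "isis": {"latency_critical": False, "scope": "router"},
}

def place(service):
    """Decide where a service container should run in the cluster."""
    attrs = SERVICES[service]
    if attrs["latency_critical"] or attrs["scope"] == "port":
        return "white-box"   # next to the port/ASIC it controls
    return "x86-server"      # centralized control-plane compute

print(place("lacp"))  # white-box
print(place("bgp"))   # x86-server
```

In other words, "cloud native" here is the scheduler's decision, not the container itself.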
I think you mentioned something quite important: the latency impact. If you remember, for several years everyone was talking, mostly the telco providers, about SDN and how SDN would change their life, because they wanted to adopt what they saw in the hyperscalers, but the latency actually killed it. You can’t really put something like RSVP fast reroute in an orchestrator or controller sitting externally, outside your network. The network has to be on-prem, and something has to control it locally so you can have fast reroute capability.
You need to be able to react in microseconds, or milliseconds at worst.
DriveNets: Exactly. That’s why SDN mostly moved to orchestration features, which can react at a later phase. We kept all the local capabilities on the actual x86 on-prem, on the cluster itself.
That actually brings up another question about this, as we’re talking about cloud and this disaggregated model: does my control plane live on premises somewhere, or is it actually up in the cloud somewhere?
DriveNets: So essentially we can do both. Usually, when we’re looking at this and speaking with our customers, from a DriveNets perspective you can run your control plane wherever you want, but if you think about where the right place to put it would be, we need to remember that there is interaction between the control plane and the data plane. Eventually a BGP packet coming from a line card, from a white box, into my control plane needs to traverse some networking environment until it reaches the protocol that handles it. From a customer perspective it’s beneficial to minimize this delay, so in that case you could call it on-prem. But another benefit we bring to the service provider market is that it doesn’t have to reside in the same room, which is also a leap forward. I can have my control plane running in an “IT environment,” on very strong servers residing in a dedicated server rack, and the rest of the white boxes in the telco space. So I get low delay, but I also get the benefit of handling the function in the right place.
It’s an out-of-band network, but for the control plane instead of just management traffic.
DriveNets: Exactly. And it’s open; it doesn’t have to sit within the chassis, it might be elsewhere in the room or on another floor [Laughter].
I’ve got another software architecture question for you guys, because you made the point very clear that I can run DriveNets on an x86 box, and I can also run it on a box that’s got Broadcom ASICs in it, let’s say. That tells me you’ve got an abstraction layer in there somewhere. So did you write your own abstraction layer, or is there one of the several that are available in the industry that you’re taking advantage of?
DriveNets: So we’re completely open in that regard. First of all, exactly as you said, we have a very strong abstraction layer. Nowhere inside our architecture, until we’re at a really low level of the white box, do we translate to any ASIC format. In Broadcom’s case, it’s only at a very low level of our architecture that we translate from the DriveNets “language” to the Broadcom language. The rest of the system is completely model-driven, meaning we’ve defined DriveNets YANGs, in strong community with the Open Compute Project, and essentially we speak the DriveNets language: we program a queue, we program an ACL, we program a route, and only at a later stage do we translate it to the ASIC itself. So for us, if someone brings in a new ASIC tomorrow, and it’s good enough and has all the capabilities the service provider needs, great, we will be more than willing to migrate to it and test how fast we can do it.
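The layering being described here, vendor-neutral intent at the top, ASIC-specific translation only at the bottom, can be sketched abstractly. All of the class and method names below are hypothetical; this is the general shape of a hardware abstraction layer, not DriveNets' actual code.

```python
# Sketch of a model-driven abstraction layer: intent is expressed in a
# vendor-neutral form, and only the lowest layer translates it into
# ASIC- or platform-specific programming.
class AsicBackend:
    def program_route(self, prefix, next_hop):
        raise NotImplementedError

class BroadcomBackend(AsicBackend):
    def program_route(self, prefix, next_hop):
        # a real backend would call the vendor SDK here;
        # we just record the translated form
        return f"bcm: route {prefix} via {next_hop}"

class X86Backend(AsicBackend):
    def program_route(self, prefix, next_hop):
        return f"x86: kernel route {prefix} via {next_hop}"

def install_route(backend, prefix, next_hop):
    """Everything above this call speaks the vendor-neutral 'language'."""
    return backend.program_route(prefix, next_hop)

print(install_route(BroadcomBackend(), "10.0.0.0/8", "192.0.2.1"))
```

Swapping in a new ASIC then means writing one new backend class; nothing above the `install_route` boundary changes, which is the point being made about migrating between chips.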
Because all you’ve got to deal with is the translation service; you don’t have to rewrite fundamental code.
Yeah, okay. It does sound like you wrote yours, then; you said “DriveNets language,” you’ve got your own YANG modeling and so on, so your abstraction layer is yours.
What I heard was: you wanted to be in control of your own destiny, you didn’t want to be beholden to the ASIC providers, and you wanted the flexibility. Is that a fair assumption?
DriveNets: That is the first assumption, but also, yes, we wrote our own abstraction layer. For the YANG modules, where we can be standard, where there’s an IEEE draft, where there’s an IETF draft, where OpenConfig defines it, great, we want to be as standard as we can, because we get the benefit of people adopting it much quicker, and it’s a win-win situation for everyone. But if you’ve lived in this industry long enough, you know that not everything is standard; not always did we see a YANG module which correctly defined RSVP or traffic engineering or segment routing or anything else we’re trying to do.
Yeah, I was going to be mean and say, “but there are so many standards in the YANG world,” but you just said it for me.
Drivenets: It is exactly a non-standard, standard world.
Yeah, the kind of thing we were all afraid was going to happen. But as you pointed out, there are certainly a number of YANG model libraries that folks are rallying around, and as you say, you’re supporting those as best you can, and that’s as expected, right? And you invoked the Open Compute Project, so I would anticipate they’d really insist on that anyway. Well, you’ve got an API in this thing, which again doesn’t shock me, but tell us about the API. What could I do with it if I’m a service provider trying to integrate DriveNets into my operational world, let’s say?
DriveNets: Excellent. So think of the API as encompassing more than one thing. One form of API would be the standard protocols we support in order to enrich the service provider environment with the ability to do software-defined networking: things like BGP link-state, which gives you the topology, or PCEP, in order to control your traffic engineering. These are standard APIs, or protocols, which let you speak with the DriveNets platform and configure it, or provision it, and as part of that, of course, things like NETCONF in order to do configuration. But more than that, we’ve developed a set of open APIs, which we expose to our customers both at the router level and at the orchestrator level, that let you control the system from an orchestration perspective: for example, deploy the DriveNets software, upgrade the DriveNets software, get metrics that you probably couldn’t otherwise get out of the systems. So we really enrich your environment in a way that lets you automate over the DriveNets solution, probably in a much better way than you could have done so far.
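To give a feel for what driving an orchestrator-level open API might look like from an automation script, here is a minimal sketch. The endpoint path, field names, and "rolling" strategy are invented for illustration; DriveNets' actual API is not public in this conversation, so treat this purely as the general REST-style shape of such a call.

```python
import json

# Hypothetical sketch: compose a software-upgrade request for an
# orchestrator-level REST API. Paths and fields are illustrative only.
def build_upgrade_request(cluster, version):
    """Build (but do not send) a REST-style software-upgrade call."""
    return {
        "method": "POST",
        "path": f"/api/clusters/{cluster}/software-upgrade",
        "body": json.dumps({"target_version": version, "strategy": "rolling"}),
    }

req = build_upgrade_request("nyc-core-1", "2.3.0")
print(req["path"])
```

The point is operational: upgrades and metrics collection become API calls a provider can script, rather than per-device CLI sessions.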
So there’s a northbound and a southbound API, I think I just heard you say, because I can talk to the orchestrator and get things done that way, but then there’s also southbound from the controller down into the data plane.
So there’s a question here. If I’m reading and writing to an API, and my DriveNets software is clustered across multiple devices, and I have this picture in my mind of, you know, 20 or 30 devices in a single cluster being equivalent to a chassis-based router, you must be able to synchronize that very quickly between all of those devices. Do you want to talk about that at all?
DriveNets: I think it’s a wonderful question, and it should have a very simple answer. For you, it should look exactly like programming any other router. The fact that there are 20 boxes there shouldn’t matter; the fact that there are 100 boxes there shouldn’t matter. If you configure an ACL over a LAG bundle, and that LAG bundle spans over 20 machines, why should you care? It’s DriveNets’ problem to make it look as seamless as we can, and to do it in a transactional way.
Because in the chassis, that’s often done with mystical magic: they actually have out-of-band buses and special cards that the communication between the control planes is carried on. You’re in a distributed, open environment, and you’re doing it in some way. All I really want is an assurance that, yes, when I program via the API, within milliseconds or less the cluster reaches a point of coherence and stability.
DriveNets: Well, I think it’s not that mystical if you think about it. Essentially what they have internally is just a management switch. Now we have the same power, just disaggregated. In our solution we have a dedicated switch, and its sole purpose in life is to forward your API calls to all of the devices in the cluster. What Amir mentioned about having a transactional API is exactly the kind of thing you would call magic that makes it happen: you forward those calls to all of the devices, and they all immediately acknowledge that they accept it, whether it’s FIB programming or a new configuration that needs to be applied to all of the line cards at once.
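The transactional, all-or-nothing behavior being described is essentially a two-phase commit fanned out over the management switch. Here is a minimal sketch of that pattern; the class and method names are illustrative, not DriveNets' internals.

```python
# Minimal two-phase-commit sketch of the transactional fan-out described
# above: a change is prepared on every line-card box, and committed only
# if every box acknowledges; otherwise everything rolls back.
class LineCard:
    def __init__(self, name, healthy=True):
        self.name, self.healthy = name, healthy
        self.pending, self.committed = None, None

    def prepare(self, config):
        """Phase 1: stage the change; a sick box refuses (returns False)."""
        if not self.healthy:
            return False
        self.pending = config
        return True

    def commit(self):
        """Phase 2: make the staged change live."""
        self.committed, self.pending = self.pending, None

    def rollback(self):
        self.pending = None

def apply_config(cards, config):
    """Prepare on all cards, then commit everywhere or roll back everywhere."""
    if all(card.prepare(config) for card in cards):
        for card in cards:
            card.commit()
        return True
    for card in cards:
        card.rollback()
    return False

cards = [LineCard(f"lc{i}") for i in range(20)]
print(apply_config(cards, "acl: permit 10.0.0.0/8"))  # True: all 20 commit
```

If even one box fails to acknowledge in phase 1, nothing is committed anywhere, which is what makes a 20-box cluster safe to program as if it were one router.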
I get it, right, so that makes perfect sense, because it’s straightforward: there’s a dedicated communication pathway, and your software will be coherent, because the only thing that is elastic now is the processing power of the coherence protocol that you’re using between the switches.
DriveNets: Yes, exactly.
We’re talking about the control plane, or, I don’t know, the orchestration layer specifically, right, as far as this cluster goes?
DriveNets: So if you think about it, Amir talked about how you actually program or configure a monolithic chassis. You just use the CLI today; there’s not a lot of orchestration or external management. But within our solution, we brought to the table something that didn’t exist in the past. When you insert a line card into a chassis, it’s already there: there’s a slot, it’s numbered, you just put the line card in, and everything works as planned. There’s nothing special you need to do in the field; it just goes live and you can start using that line card. In our solution, when it’s disaggregated, you have to connect it in some way to the disaggregated chassis, and something needs to manage that platform and make it part of the cluster, because otherwise it’s just another white box. So we introduced what we call DNOR, the DriveNets Orchestration system, and its purpose is to run the lifecycle management of the product. When you plug in a new white box, it calls home, what you would know as ZTP, and it immediately gets the software and gets validated. We know it’s part of the DriveNets solution, and it’s part of the cluster immediately. From that point on, deployment, upgrades, lifecycle management and everything, it looks essentially like a single router.
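The ZTP-style "call home" join flow just described can be sketched as a small state machine. The class and method names here are illustrative, not DNOR's real interface; the flow (announce, validate against provisioned inventory, image, admit) is the general zero-touch-provisioning pattern the conversation describes.

```python
# Sketch of a ZTP-style "call home" flow: a newly cabled white box
# announces its serial number, the orchestrator validates it against the
# provisioned inventory, and the box becomes another line card in the
# cluster. Names are hypothetical.
class Orchestrator:
    def __init__(self, expected_serials):
        self.expected = set(expected_serials)  # provisioned inventory
        self.cluster = []                      # boxes admitted so far

    def call_home(self, serial):
        """Handle a new box's ZTP request; return True if it joined."""
        if serial not in self.expected:
            return False             # unknown hardware: reject
        self.cluster.append(serial)  # validate, push software, admit
        return True

dnor = Orchestrator(expected_serials={"WB-001", "WB-002"})
print(dnor.call_home("WB-001"))  # True: box joins the cluster
print(dnor.call_home("WB-999"))  # False: not provisioned
```

From then on, the orchestrator, not a numbered chassis slot, is what makes a cabled-up white box "part of the router."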
So the cluster is distributed across all of my devices that are running DriveNets software. Every time I plug in a new x86 box or a new white box switch, they’re all participating in this control plane cluster.
DriveNets: Yes. If you take a service provider network, usually in a lot of COs you would have two large routers, just from a redundancy perspective, and each one of them could have, let’s say, 16 line cards. If you take our solution and put it in the same site, you would have the same thing: two clusters, each one of which could have, let’s say, even up to 48 line cards. And for every line card that you want to connect to one of the clusters, you just use the orchestration system to connect it, making it virtually part of the chassis, and you have two running routers at the same time.
Okay, this brings together an idea in my head. You said it earlier in the show, but this really cements it: I'm really dealing with one gargantuan switch the way this all goes together. Is that the only way I would architect it, one massive switch with as many line cards as I have devices, or would I actually split this up into different domains somehow?
Drivenets: It depends on the use case. If you look at the segments you have in the network as a service provider, it starts from the last mile and access, then metro aggregation, and then the peering and core network, and the main difference between those sites, other than the connectivity of the services, is the scale. If you're talking about New York City, naturally you'll have a large site there with a lot of capacity, and if you're talking about some remote site in North America, you're going to have very little capacity. So it's up to you, the provider, to decide what scale you want to use, but the benefit here is that you don't have to pick and choose up front and then be stuck with it for five years and have to do some kind of forklift replacement. You can take the same line card, which supports four terabits, use it as a single standalone pizza box, and put it in some remote site. If you later want to move that exact box to New York because now you need another line card there, just move it, connect it virtually to the cluster, and that's it, you're done. So you can use the same solution in any site or location you want; you don't have to replace the entire cluster.
Okay, my brain just broke a little bit. [Laughter] So with this architecture, the flexibility I have to design things gets really interesting. I could do something, and I'm not saying this would necessarily be a wise design, but I could do something like: all right, I'm going to stack up in this one POP a hundred devices with thousands of ports and have it appear as one BGP router on the network. That's a thing I could do, it sounds like?
Drivenets: Yes. Think about what happens today. Everyone's talking about Covid-19, right? We're watching our Netflix and YouTube, we're always streaming, we're watching live from home. What really happens in those POPs at those service providers? From supporting 10 terabits in a single location, suddenly they need to support 50 terabits in a single location, and the only way to do it is to start buying more chassis, because they're out of room. If you stack everything today and just add more and more bandwidth to the same cluster, you don't have an issue; just add those line cards.
Packet Pushers: And that chassis-without-a-chassis is still traditional ECMP, so the physical infrastructure in there is still a traditional two-tier ECMP design. It's not some radical loop or optical trickery; it's just a straight-up Ethernet fabric with ECMP. The software is where the focus is. I just wanted to cover that in case we hadn't mentioned it.
Packet Pushers: So do people stand this up in a leaf-spine sort of topology, typically, depending on what they're trying to achieve?
Drivenets: Essentially, the way to look at it from a physical standpoint is that you take the fabric white boxes and the line card white boxes and you put them in a Clos formation, but from the moment software is provisioned and deployed, this is one big router. And had you ripped open a chassis, behind the scenes that is a Clos network as well.
Right, yeah. We've actually got a presentation on our YouTube channel that explains exactly that: when you tear open a chassis switch and look at what's going on, there's a crossbar fabric in there connecting those line cards. So effectively, in other words, there's got to be a smart physical design that underpins the flexibility of this architecture, so that you're not subject to poorly thought-out cabling or some such. You really do need to think about those things; they matter.
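[Editor's note: for listeners sketching this out, the oversubscription Greg alludes to in a folded two-tier Clos is just host-facing capacity over fabric-facing capacity on each leaf. A generic back-of-the-envelope helper, not tied to any specific DriveNets box:]

```python
def leaf_oversubscription(downlinks: int, downlink_gbps: float,
                          uplinks: int, uplink_gbps: float) -> float:
    """Ratio of host-facing to fabric-facing capacity on one leaf.

    1.0 means non-blocking; 3.0 means 3:1 oversubscribed.
    """
    return (downlinks * downlink_gbps) / (uplinks * uplink_gbps)


# Example: 48 x 100G down, 8 x 400G up -> 4800 / 3200 = 1.5:1
print(leaf_oversubscription(48, 100, 8, 400))  # 1.5
```

In a chassis-replacement design like the one discussed here, you would typically aim for a 1:1 (non-blocking) fabric, since an internal chassis fabric is non-blocking too.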
Packet Pushers: But then once that is in place, leaf-spine, presumably with limited, or at least well-defined and understood, oversubscription between tiers, now I've got this massive switch that I can do things with. Well, router, switch, whatever we mean these days.
Let's talk about money, guys, because service provider networks are notoriously expensive. These are big-dollar networks: they're large, they're spidery, the bandwidth is expensive, it's all expensive. So how do you fit into this? I guess you're saying that you're fairly inexpensive to buy compared to a traditional model, but does that also mean cheap to operate? Because it doesn't necessarily, and to be fair, some of this sounds different, and part of my brain is going: is this more complicated, an external leaf-spine versus a collapsed chassis where it all tidily fits in the rack? So anyway, talk to us about money.
Drivenets: Yeah, so when you take a look at the overall TCO, ours is much cheaper than what you naturally have today with the incumbent vendors. Some of it is the operation, as you mentioned, which we'll talk about in a second. Some of it is naturally because of white boxes, because when you're not dependent on a single vendor and there's no vendor lock-in, and you have multiple ODMs competing to reduce their costs, the customer eventually wins: he gets to pay less. But let's talk about operation, because I think there's a myth there. For a lot of telco providers and customers, the first concern that comes to mind, mostly in engineering, is: it looks huge, it looks complex, it looks like we're going to need extra systems; I don't know if it works as a single router or as 20 routers that I now need to operate in my network. Our answer to them is: let's take a look at the bright side, at the entire solution, and see what it actually brings to the table in terms of operational complexity.
Take a look at a chassis today. When you're out of capacity, there are, say, eight slots and you need slot number nine, there are only two ways you can really solve the issue. One is to go get another chassis and do a forklift replacement. And forklift actually means forklift: you need to bring a forklift to take out those gigantic chassis, move them aside, and bring something else in. The other way is to introduce a new router to your network, with all the problems of introducing another node into your complex IGP or MPLS network. In our solution it's just a white box, a pizza box, exactly as it sounds. Just insert another small pizza box into one of the racks in the room, plug it in, and that's it, you're done. From a complexity perspective you're pretty much free, and something like 50% cheaper, not just for the service provider but for the company that builds it out for the service provider: they don't need to bring the forklifts anymore. All they have to do is store it in some warehouse and send some guy in the middle of the night with a simple pizza box he can carry in his car, open the back door, put it in the rack, and insert it into one of the routers, and that's it.
You don't need a hydraulic jack to lift it up, or three people. You don't need special screws, and you don't need to worry about the weight on the floor. I've had to get special dispensation from providers to validate that the weight distribution on the floor was correct, because one rack of network kit was so heavy that I had to have it placed in a particular spot in the data center where the stress wouldn't cause the floor to collapse. These things matter much more than we think.
Drivenets: They matter a lot. If you look at a chassis today, it's not really limited by the capacity the chip can handle or by what the vendor can manufacture. It's actually the cooling and the power. If you have a huge chassis, the entire room is practically empty; all you have is that gigantic chassis in the middle, because there's not enough power or cooling in the room for anything else. In our model, hyperscaler or data center style, you just distribute those boxes around the room and you have enough cooling and power for the entire solution.
Another point here is that management didn't get any more complicated just because I've added more devices. That is, I didn't take on another mouth to feed: the box comes up, ZTPs, joins the cluster, and now it's part of the collective that I'm managing centrally anyway. So I've added capacity, but I haven't added, like I said, another mouth to feed.
Drivenets: Exactly. From your point of view you see another line card in a virtual chassis. You operate it the way you always did, you configure it the way you always did. You get the same syslog and the same SNMP if you want to use the old methods, and you get streaming telemetry if you want to use the new method, but from your point of view this is a single entity. That is the exact beauty of our design.
But there is one key aspect here: we need to match the legacy network, as I mentioned, with things like SNMP and CLI, because you know how engineers are. We love the CLI. I still use CLI because I love it; you want to see the commands, you want to see the output, you want to see what's going on. Everyone still loves CLI, that's not going to go away, and we still develop CLI. But on the other hand, when you want to match the hyperscalers, when you want to see what's going on in cloud networks, you want orchestration. If you want value-added services, if you want DDoS protection, you're not going to do it with CLI and some package grabbed off the internet. It doesn't work like that today. You want an orchestration system, some kind of marketplace where I click a button and the entire cluster just gets upgraded. And that's how it's done today. From an operator's perspective that's mind-blowing; they say, wow, that's a whole different way to manage a router.
So your solution would snap into whatever OSS or BSS the communication provider is running. You're not telling them they have to come and buy your special magic multi-million-dollar orchestration platform and run it with your custom hardware and custom operating systems. You're saying: I'm going to give you a replacement for this massive router, this particular pain point, which is a multi-slot routing engine with very large forwarding performance and a large number of interfaces. I'm going to replace it with this disaggregated solution and give you an operating system that brings it all together and makes it look like a single thing. It runs just like your network does today: you still see one chassis, one API point, one control plane, even though it's clustered across multiple devices. And then you bring whatever OSS or BSS you want. If you've got an in-house developed one, you just send standardized API calls where the standard APIs exist, whether it's RPC or YANG or whatever, and you configure the system accordingly.
Packet Pushers: The vision I had when I was reading and prepping for the show was a duck paddling: it looks calm on top, and underneath there's all this stuff happening to keep the cluster going and bring those ODM boxes together into a unified whole. I think that's the trick here: instead of it being a chassis, it might be 20, 30, 40 ODM switches, but it looks and operates like a chassis-based router.
Drivenets: Yeah, I thought you were going to say it looks like a duck and operates like a duck.
Is there a scaling concern here? Greg, you just happened to mention in passing 10, 20, 30, 40 switches. How many switches could I join to the cluster and have it work?
Drivenets: Well, each white box today is four terabits, which is limited by the Broadcom silicon it uses, and we support up to 48 of those, so that's 192 terabits, which is much more than what a service provider needs today for a single router. And the architecture is actually designed to support almost 800 terabits, so that's almost 200 line cards under the same router. Admittedly that's not needed in every corner or every CO that the CSP has, but keep in mind that the same box can be used anywhere, using the same fabric and line cards, so you just buy one solution and place it wherever you want.
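[Editor's note: the arithmetic Yuval quotes is easy to check. Using the figures from the episode (4 Tbps per white box, up to 48 boxes per cluster today, an 800 Tbps design ceiling):]

```python
# Capacity figures as quoted in the episode.
TBPS_PER_BOX = 4       # per white box, limited by the Broadcom silicon
MAX_BOXES_TODAY = 48   # cluster size supported today
DESIGN_TARGET_TBPS = 800  # stated design ceiling

supported_today = TBPS_PER_BOX * MAX_BOXES_TODAY     # 4 * 48 = 192 Tbps
boxes_at_target = DESIGN_TARGET_TBPS // TBPS_PER_BOX  # 800 / 4 = 200 "line cards"

print(supported_today, boxes_at_target)  # 192 200
```

So the "almost 200 line cards" figure follows directly from the per-box throughput and the design ceiling.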
So is this a production solution at this point, or is it more like beta testing with key customers? Where are we at?
Drivenets: So today we actually have a deployment with one of the North American service providers. We're working with pretty much all the big Tier 1 providers in the world, because they all experience the same issue, and the solution and the product are well accepted by all of them. So we're in testing in several places, and we're ready now.
So if I were a customer and wanted to hear about real-world scenarios and talk to other customers, you've got people I can talk to?
Drivenets: Well, naturally you can contact us, whether through your podcast, through LinkedIn, or through our website, where you have pretty much all the information you'd want to see. There's an overview of our products and solution, references to all the product webinars, and various analyst reports, the kinds of things that talk about disaggregation and how our solution matches the future of IP networks.
Talk about licensing real quick. It sounds like, since you're already in with all the Tier 1 players, there's a cost model there that's attractive; that sounds fair to say, yes? No one expected the increase in capacity demands that happened because of the global pandemic.
Drivenets: Yeah, exactly.
Exactly, and I think one more comment here: if you're a large service provider, you can look at the beauty of seeing this as a software license and not a perpetual license. The payments you make do not grow at the rate at which you put in new routers and new hardware. We can go to a fixed model in which you pay us a fixed amount of money year over year for a certain number of years. No matter how large or how steep the jump in your traffic, there's no jump in the payments you make to us.
So you wouldn't have to pay a tax. I call it a tax: as you sell or offer more bandwidth to customers, you have to keep paying, and the trick here, of course, is that communication service providers don't necessarily generate more revenue from more bandwidth. So it's actually a trap for communication service providers to be caught in: the licensing cost increases as the capacity increases. It sounds perfectly reasonable until you realize that communication service providers don't get more money for more bandwidth; they get the same money for more bandwidth.
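[Editor's note: the "bandwidth tax" Greg describes is easy to model. Under a per-capacity license, cost tracks traffic growth even when revenue is flat; under a fixed-term license it doesn't. The prices below are made up purely for illustration, not DriveNets' actual pricing:]

```python
def per_capacity_cost(tbps: float, price_per_tbps: float) -> float:
    """Annual license cost that scales with deployed capacity (the 'tax' model)."""
    return tbps * price_per_tbps


def fixed_cost(annual_fee: float) -> float:
    """Flat annual fee, independent of deployed capacity."""
    return annual_fee


# Traffic doubles (10 -> 20 Tbps) while the provider's revenue stays flat:
year1 = per_capacity_cost(10, 100_000)  # 1,000,000
year2 = per_capacity_cost(20, 100_000)  # 2,000,000: cost doubled, revenue didn't
flat = fixed_cost(1_200_000)            # unchanged in either year
print(year1, year2, flat)
```

The squeeze is visible immediately: the per-capacity cost doubles with traffic while the fixed fee is constant, which is exactly the mismatch with flat revenue that the hosts are pointing at.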
Drivenets: Yeah, I can tell you a story about what happened here in Israel a few years ago in the cellular industry. We paid a lot of money, I think, like in the States, about 80 dollars a month just for our cellular bill, and some company came in and broke the entire market by offering something for 10 dollars. A year later everyone pays ten dollars, and those providers haven't changed anything; they still need to meet the demand. So the solution for them would be the same thing: they need to lower their opex and capex costs on the network.
People miss that. People forget to take into account that paying your vendors and suppliers on a taxation model, where the more you consume the more you pay, has a catch in the back end. As long as it's linked to generating more revenue it's fine, but in the communication service provider industry that's not true.
Drivenets: Yes, that's right.
Now, as I was doing my homework for the show, I noticed that you folks have some involvement with the Open Compute Project, and since we're talking about licensing and purchasing, this seems like a good time to bring that up. What are you offering to the Open Compute community?
Drivenets: So our software and our entire architecture are completely compliant with a concept called DDC, the Distributed Disaggregated Chassis, which was contributed by AT&T and accepted into the Open Compute Project. The beauty of it is not only that it materializes this design for service providers, but that it brought it into reality: more than one ODM vendor took it, and more than one ODM vendor was willing to build it. For us this opens a lot of possibilities, lowering the price because there is more than one vendor. But also, from a service provider point of view, it lays the ground for a bigger thing: running DriveNets and running other companies who would be willing to run on such an architecture, offering new services and extending existing ones. So overall this is a great move, and OCP is important to us not only because of this specific hardware but also for other things that matter for cost. Optics, for example: we do not brand optics, so any OCP-compliant optic will work here, will be certified, and will be part of the solution. In the grand scheme of things this lowers the cost in many, many ways.
Packet Pushers: Hey, just some business advice: if you guys make your own optics, brand them, mark them up like a thousand percent, maybe two thousand percent, and then make it so that's the only one your customers can run, there's so much money to be made there.
Drivenets: I think that's what other vendors thought at the beginning.
Ethan, I think somebody's already thought of that model. Oh, I'm behind. Good idea, though.
Drivenets: You do know there's a hidden command for every SFP to run in the network, right? That's kind of the magic that happens with every vendor.
Yeah, we've heard all about it. It's scary how little the differences are. So, okay, with Open Compute, I can buy off the Open Compute list, and I've got multiple vendors making hardware to that spec that you support. Like you said, that's important to you; it just makes it easier to integrate into that ecosystem.
Okay, well, let's talk about roadmap, if you can. I don't know if you've got anything big on the roadmap that you're able to share, but that's always a fun question to ask: what's coming next?
Drivenets: Okay, so we're talking about service providers, right? The service provider market is unique: it has its own demands and its own issues, and there are different segments within it. Today we support the core, some of the aggregation, and the peering functions or use cases within that market, and we're working to expand to other segments: metro, access, the last mile, those kinds of segments within the same market. That's one market. Naturally, like any other company, we're also trying to expand our market and our strategy to focus on more use cases that you see today, like 5G, IoT, and mobile backhaul, all of which experience similar issues that call for disaggregated architectures, and our software and orchestration fit there very well. So those are the two main paths we're working on in parallel.
And of course an important note here is the enterprise market. If you need a lot of ports, a lot of interfaces, here's a great solution for data center interconnect, for your pod, for your super spine, and a lot of other interesting elements that we're going to tackle in the near future with very much the same offering.
You mentioned OCP, by the way, which is a good point. There are not a lot of vendors today manufacturing commodity chips or commodity hardware to support that white box model, but we're working with the industry to expand that as part of our roadmap as well, whether it's new optics, new white boxes, or new chips that are going to come in the future from the vendors and partners we work with today. That's naturally part of our roadmap, because there are going to be new line cards and new fabrics within a few years as well.
All right, how do people find out more? If you've got a page to drive people to, or just a website or blogs, anything you want to let people know so they can find out more about DriveNets, how do they do that?
Drivenets: Yeah, you can access our website, www.drivenets.com. You have a complete overview of our products and solutions, references to pretty much every podcast and webinar, and various analyst reports on our solution and on how our customers see our products. You can just take a look at what we offer there, and there's always 'contact us' if you have any more questions.
Very good, and I've got a couple more URLs if you're out there listening: drivenets.com/resources, and there's a Network Cloud whitepaper you can consume as well at get.drivenets.com/network-cloud-white-paper, which you will not remember because you're driving in your car. I hope at this point you're going back to work, so don't worry about remembering it right now. Just go to www.packetpushers.net and do a search for DriveNets to find the blog post for this show, and the links for all of these resources will be there for you. You can find this show and many more free technical podcasts, our community blog, and so on, all at www.packetpushers.net. We're on Twitter, so you can keep up with the shows and communicate with us; we're @packetpushers, we do monitor that account, and we do respond to your questions and queries. We're on LinkedIn as well. Rate this show on Apple Podcasts if you would; that helps us out. And last but not least, remember that too much networking would never be enough.