India Mobile Congress 2023
Harnessing Automation & AI: Empowering the Networks for Tomorrow
Our IMC2023 session featuring DriveNets COO, Ryan Donnelly, as well as the following distinguished participants:
* Sonica Bajaj, KPMG in India
* Shamik Mishra, Vice President and CTO Connectivity, Capgemini
* TK Hitesh, CIO, Vi
* Nilekh Kumar, Product Owner, TCS Cognitive Network Operations Platform, Tata Consultancy Services Ltd
* Rajesh Thakur, Managing Director – Communications, Media and Technology, Accenture in India
* Ari Gentis, Head of Service Line Optimize, Ericsson
* Rajesh Mishra, Co-founder & CEO, A5G Networks
* Ryan Donnelly, Chief Operating Officer, DriveNets
Full Transcript
So, in the spirit of keeping everyone awake after lunch, let’s do a poll.
Who in the room is an operator or an aspiring network operator? Let’s see a show of hands.
All right, cool. All right, this is for you. So I think Rajesh kind of nailed it earlier when he said this is a leap, right?
This is not an iteration. This is not another generation of the same kind of tech. This is something that is different, right?
I mean, who as an operator doesn’t want to know that an optic is going to fail before it fails? Who doesn’t want to know where the pernicious piece of kinked fiber is in my network that’s inducing a 1% packet loss rate? These are all things that I think, as operators, we’ve wanted for a long time but haven’t been able to get. And as a result, it takes us a long time to find them and a lot of money to solve the problems. So let’s talk about the circular dependency that trying to do this creates. Right?
I want to leverage AI and ML against my infrastructure. The problem is, to do that, I have to build infrastructure, right? And a lot of it, and it’s going to be complex. And that’s kind of what I want to dive into, is what does that infrastructure look like? So I could talk about this for an hour. I will spare you that, but I could talk about it for an hour. But I’ll try and cover it at sort of a high level.
So there’s three big buckets that I think we have to focus on, right?
The first one is space and power. Everyone talks about GPUs and how many GPUs can I deploy and how fast, but the reality is there’s a space and power implication to that. So as you’re deciding how to go build out your, let’s call it your AI infrastructure, think about not just how many kilowatts of power and cooling you can cram into a rack, but how many you can cram into a row, right? If I build a ten kilowatt rack and the row can provide 15 kW, it’s not very helpful. And then think about that at the room level, at the facility level, at the campus level, and at the metro area level, right? There’s a lot of places where power and cooling can create a bottleneck in your overall deployment strategy. So making sure you think about what your scaling needs are up front, and partnering with the right infrastructure providers to make sure they can enable your project plan 12, 18, 24, even 36 months down the road, is really critical.
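To make the rack-versus-row point concrete, here is a minimal sketch of finding which level of the power hierarchy limits a deployment first. Only the 10 kW rack and 15 kW row come from the talk; the room and facility budgets, and the function names, are hypothetical and chosen purely for illustration. It also ignores cooling and redundancy margins, which a real plan would need.

```python
def racks_supported(rack_kw: float, budgets_kw: dict[str, float]) -> dict[str, int]:
    """For each level, how many whole racks of rack_kw its power budget could feed."""
    return {level: int(budget // rack_kw) for level, budget in budgets_kw.items()}


def binding_constraint(rack_kw: float, budgets_kw: dict[str, float]) -> tuple[str, int]:
    """Return the level that supports the fewest racks, and that rack count."""
    counts = racks_supported(rack_kw, budgets_kw)
    level = min(counts, key=counts.get)
    return level, counts[level]


# 10 kW rack and 15 kW row are the talk's numbers; room and facility are assumed.
budgets = {"row": 15.0, "room": 120.0, "facility": 600.0}
level, count = binding_constraint(10.0, budgets)
print(f"Limiting level: {level}, racks supported there: {count}")
```

With these figures the row is the bottleneck: it can power a single 10 kW rack, no matter how much the room or facility could supply.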
The second piece is supply chain. And I know we all love talking about GPUs, but the GPU story is only part of it, right? So as you’re going through and standing up your AI infrastructure, your application, whatever it is, you have to train those models, right? You have to have data to train those models from. And so chances are you’re going to need some sort of data lake, right, to store all this information you want to use to make your AI smarter. And that has implications for compute, storage, all kinds of other stuff that you need to be able to field that data lake. So as you’re thinking about how to be able to stand up these giant training farms, also think about all the other supporting compute accessories you’re going to need to be able to field, let’s call them the data lake elements, as well.
And last, which is by far the most interesting because I’m a network guy, is the network. Training farms create a lot of east-west data flows inside of a network, right? And so that drives a lot of interesting design decisions that you get to undertake as you design your infrastructure. Things like do you want to deploy Ethernet or InfiniBand? Do you want to deploy a spine-leaf network or a cluster? Do you want 100 gig, 400 gig, 800 gig interfaces? Right? And the reason this is important is because all these choices eventually drive your job completion time, which influences the ROI of all this infrastructure you just built.
And JCT, job completion time, is one of the things we talk about a lot inside of DriveNets as we’re designing our AI product portfolio. Because if I can optimize our load balancing tech and drive your JCT costs down by a double digit percentage, then I’m creating meaningful improvements in the TCO of the infrastructure you’ve just gone and built. And one of the really cool things about that is this sort of technological innovation isn’t just limited to AI, right? AI is driving the innovation by its own necessity. But let’s say you’re a large enterprise with large ETL workloads. This technology works for you too, right? And so AI is transforming this sector of the economy, this sector of technology, but it’s going to spill over into all kinds of other parts of commercial and high scale networking, and I think that is probably one of the most exciting pieces of all of this. Thank you.
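As a rough illustration of why job completion time (JCT) feeds into cost, the sketch below treats the cluster as having a fixed amortized hourly cost and running one training job at a time. Every number here (the hourly cost, the 100-hour baseline, the 10% improvement, the 720-hour month) is an assumption for illustration, not a figure from the talk, and the model ignores queueing, partial utilization, and shared clusters.

```python
def cost_per_job(cluster_cost_per_hour: float, jct_hours: float) -> float:
    """Infrastructure cost attributed to one job on a dedicated cluster."""
    return cluster_cost_per_hour * jct_hours


def jobs_per_month(jct_hours: float, hours_available: float = 720.0) -> float:
    """How many back-to-back jobs fit in the available hours."""
    return hours_available / jct_hours


# All values below are hypothetical.
hourly_cost = 500.0
baseline_jct = 100.0
improved_jct = baseline_jct * (1 - 0.10)  # assume a 10% JCT reduction

for label, jct in (("baseline", baseline_jct), ("improved", improved_jct)):
    print(f"{label}: cost per job = {cost_per_job(hourly_cost, jct):.0f}, "
          f"jobs per month = {jobs_per_month(jct):.2f}")
```

Under these assumptions, the same hardware spend either costs less per job or completes more jobs per month, which is the sense in which a shorter JCT improves the return on the infrastructure.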
Thanks Ryan.
A slightly out-of-syllabus question, but how do you, I mean, you’re talking about the GPUs and the consumption of infra. How does one balance that with the ESG kind of objectives that the organizations have? Does it come at a cost or compromise?
I think it’s an interesting question. So I think there’s two ways that users can kind of consume AI for a need like this. One of them is you wait for somebody else to build it and train their models and then you consume it through an API, right? You don’t try and do it yourself. And then I think the next step is to try and bite off the investment of really building out the actual infrastructure to train your own models. So unless you really know what you’re doing and what you want, and know that someone else’s model and service won’t work for you, I think it’s a step function, and that lets you sort of smooth the investment and decide what your real appetite for running that kind of infrastructure really is.
Got it. We all talked about.
CTO Conclave: Decoding the success of 5G’s Grand Debut
Our IMC2023 CTO panel featuring DriveNets COO, Ryan Donnelly, as well as the following distinguished participants:
* Girish Bhatia, SE Lead, Ciena Corporation
* Harpinder Matharu, Senior Director DCCG Business, AMD
* Nitin Shah, Partner, KPMG in India
* Ryan Donnelly, Chief Operating Officer, DriveNets
* Dr. Badri Gomatam, Group Chief Technology Officer, Sterlite Technologies Ltd.
* Jayanta Dey, Executive President – 5G Business, HFCL Ltd.
* Azhar Sayeed, Chief Technologist, Global Telecommunications, Red Hat
* Mike Fitton, VP & GM, Network Business Division, Programmable Solutions Group, Intel
* Dr. Magnus Ewerbring, Chief Technology Officer, Asia-Pacific, Ericsson India
Full Transcript
So with that, I will turn towards my colleague Ryan.
Ryan has fantastic experience in managing large infrastructure deployments. So what are some of the challenges, Ryan, that you see when you’re looking at such a large critical infrastructure deployed at such a large scale and at such a fast pace? What are those challenges that you can share with us?
Yeah, great question. Thank you for that. So first of all, thank you for having me. And secondly, I think we could probably have a panel just about this one topic itself. So I’ll try and cover this at a pretty high level. But if any of you want to find me after the session, I’m happy to talk about it in more detail.
I think there’s a couple of assumptions we have to make before we dive into this. One of them is that 5G is going to drive an increase in the number of physical devices on the network. And by that I don’t mean handsets. I mean physical, IP consuming devices that are sitting behind hotspots and FWA CPE. And I think we also have to assume that generally over time, the bandwidth consumption profile of all these devices is going to go up. Right. So if we assume all these things are true, then I think there’s three key areas we have to think about.
The first one is edge computing. And I know we talk about edge computing all the time and it’s this sort of amorphous concept that is different on an industry by industry basis. But let me give you an example of a place where I think this has a meaningful impact in the 5G world. So let’s say that you as an operator want to go train an AI, right? And you want to use anonymized, tokenized data from your user base to be able to do that. Moving that data around, especially to a data center where you’ve got your training farm, has an implicit cost in your infrastructure, right? So being able to anonymize and tokenize and summarize that data closer to the edge has meaningful benefits in the TCO of, let’s say, your backhaul and connectivity infrastructure. So that’s one piece of it.
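The idea of tokenizing and summarizing data at the edge before it crosses the backhaul can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the salt, and the function names are hypothetical, and a salted hash is pseudonymization rather than full anonymization, so a real deployment would need a proper privacy design.

```python
import hashlib
import json
from collections import Counter


def tokenize(subscriber_id: str, salt: str) -> str:
    """Replace a subscriber ID with a salted hash token (pseudonymization only)."""
    digest = hashlib.sha256((salt + subscriber_id).encode("utf-8")).hexdigest()
    return digest[:16]


def summarize_at_edge(records: list[dict], salt: str) -> dict[str, int]:
    """Collapse per-event records into total bytes per tokenized subscriber."""
    totals = Counter()
    for record in records:
        totals[tokenize(record["subscriber_id"], salt)] += record["bytes"]
    return dict(totals)


def encoded_size(payload) -> int:
    """Size in bytes of the payload once serialized as JSON."""
    return len(json.dumps(payload).encode("utf-8"))


# Hypothetical raw events collected at one edge site.
raw_records = [
    {"subscriber_id": f"user-{i % 2}", "cell": "cell-17", "bytes": 1500}
    for i in range(100)
]
summary = summarize_at_edge(raw_records, salt="hypothetical-salt")
print(encoded_size(raw_records), "bytes raw vs", encoded_size(summary), "bytes summarized")
```

Only the summary would need to travel to the training farm, which is where the backhaul saving in the example above comes from.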
The second is capacity, right? And this is one that has so many dimensions it’s almost incomprehensible. The first one is if you’re going to do edge computing, where do you put the compute, and how much can you put there? Right? Some of these cell sites, if that’s where you’re putting it, you may not have a ton of space and power to be able to accommodate growth. So understanding what your strategy is when you’re capacity constrained, what actions do you take? Do you deploy and then undeploy services based on what’s relatively available? Thinking through your responses to those kinds of constraint scenarios I think is really critical before you dive into that end of the pool. The other is the physical network capacity itself. Right. I could talk about D-RAN capacity. Actually I couldn’t because I’m not really a radio guy. I’m sure everyone else in this room could probably talk about RAN capacity planning better than I could.
But then there’s the backhaul piece, right, and we have to think more longitudinally about capacity. Don’t just think about backhaul and RAN, think about your edge, your aggregation layer, think about your core. Think about your NNI edge. Right. With all your peering and transit partners, 5G has the capability to impact your overall capacity utilization profile more rapidly and more severely than previous technologies. Right. So being able to quickly respond to capacity demands in all those dimensions of your network is really important. Right. How quickly can you add capacity with a peer? Do you know the answer to that question? Right, you probably should, right. And if you’ve got hundreds of peers, being able to address each one of those needs independently is pretty critical.
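One simple way to look at capacity across every layer, rather than only backhaul and RAN, is to flag any link whose peak utilization crosses a planning threshold. The sketch below assumes a 70% threshold and made-up link names and traffic figures; none of these come from the talk, and real planning would also weigh growth rates and upgrade lead times.

```python
def links_needing_upgrade(
    links: dict[str, tuple[float, float]], threshold: float = 0.7
) -> list[str]:
    """Return names of links whose peak/capacity ratio meets or exceeds threshold.

    links maps a link name to (peak_gbps, capacity_gbps).
    """
    return sorted(
        name
        for name, (peak_gbps, capacity_gbps) in links.items()
        if peak_gbps / capacity_gbps >= threshold
    )


# Hypothetical links spanning peering, transit, core, and aggregation.
links = {
    "peer-a": (80.0, 100.0),
    "transit-b": (30.0, 100.0),
    "core-1": (350.0, 400.0),
    "agg-7": (20.0, 40.0),
}
print(links_needing_upgrade(links))
```

Running the same check per peer is one way to get at the question of which of hundreds of peers needs capacity added first.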
And then the last is the operations and maintenance piece of this. So you assume you get this network deployed, it’s large and complicated and has a ton of utilization on it and you have to go do work, you have to add capacity or do upgrades or change devices. Your goal should be to sort of decrease the scope of the thing, the widget you need to touch, to the smallest possible piece. Right? Like if you can disaggregate your hardware and your software, that means that you can touch those two pieces of your infrastructure independently without having to interact with a single supply chain. You can interact with two or three or four or five. And then if you think about the compute side of things, a microservices approach I think is pretty critical. Right. The thought of having to reimage a server every time you want to go deploy a new service in your edge computing infrastructure is sort of crazy, right. What you want to do is destroy a container and recreate it. So being able to address those compute needs in a microservices way I think is critical. And then last is be open. Locking yourself into proprietary technology just means that you’re locking yourself into fewer options with regard to software and supply chain and all these different things that are relatively open and mature today. Right. So use those technologies where they make sense for you and I think you’ll end up with a much more scalable and smooth deployment experience over time. Thank you.
Great. Thanks, Ryan. So before I move on, I just wanted to cover another aspect. When you talk about data and you talk about data anonymization, I think one of the other important aspects is that there’s an amount of data that you capture, which is fine: when you’re taking the services, you have to provide certain information, and that’s the data. And I’m primarily focusing on the personal data, but there’s a huge amount of data which is getting generated, and that obviously is the new oil that we talk about, and there’s an amount of data which is getting inferred as well. So how do you manage that? It obviously depends on the complexity of the use cases, but that’s another important aspect, again impacting the deployment of 5G, not as a technology but as a service, when you’re talking about the use cases and keeping the cybersecurity aspect around that. So we’ll cover that, but that’s a very important aspect.