Transcript
00:00:00 Hey everyone, thank you for joining the session. Welcome, welcome everyone. I'm excited to present some AI stuff. That's all the hype, right? Before we get into the session, a couple of minutes about me. I'm Bharath Nallapeta. I work as a senior software engineer at Mirantis, and I've been in and around Kubernetes for a few years now. I've worked at companies ranging from near-startups to large enterprises, so my career path itself sort of mirrors the
00:00:39 platform engineering and Kubernetes-at-scale journey. When I moved on to another startup, a Swedish company, I was working on soft multi-tenancy. That's perfect for companies or people looking for solutions at the scale of, let's say, one, two, or three clusters, with small teams and a small number of people. Soft multi-tenancy is nothing but taking a single cluster and dividing it into isolated tenants, right? So you achieve isolation
00:01:17 using software while the hardware stays the same, so that's good. There have been quite a few solutions in this space, but it certainly has its limitations. That's where real multi-tenancy comes into the picture, and that's where a lot of companies and a lot of people are trying to find a good solution. So, k0rdent - I'll come to k0rdent in a bit, but before that, the problem statement is this: a multi-cluster/multi-cloud solution is really important. I know the title
00:01:54 says AI, and you're all waiting to see some AI apps run. I'll get to that soon, but I'm taking a few minutes at the beginning to lay the foundation: be it AI apps or any other application running at scale, you need a solid platform, a scalable, extensible platform, and that's what k0rdent provides. So let's get started. Like I said, platform engineering has been around for a while, and all of us have used one or more of these clouds at some point in
00:02:34 our lives, right? And companies usually have a mixture of these things: private data centers, private cloud, public clouds, edge networks, and so on. As Kubernetes has matured, we've found it more and more difficult to manage this entire estate. Although it was supposed to be just a single contract with a particular cloud and then we're done, it is not that. Like I said, the ecosystem we've built around Kubernetes requires us to
00:03:09 communicate with so many components at scale, whether they're services, components, companies, or products, right? The need for platform engineering in order to run scalable AI apps, data science apps, or any apps that require a lot of compute is quite strong. From a developer perspective, we're looking for a self-service platform, and from a managerial or DevOps perspective, we're looking for operational simplicity
00:03:42 combined with ease of customization, right? Let's say a particular team has asked for 16 GB of RAM, four vCPUs, and three nodes, and you provide that, but a different team has different requirements. That's where the customization requirements become quite strong. In order to support such a wide variety of audiences, you need the power to perform these customizations properly. Visibility and control are again really important; the cost can get out of hand really quickly. So
00:04:22 if you don't have a complete view and understanding of how many clusters you've provisioned, who's using what, and who's using how much, the control mechanisms become really important, and so do security and compliance. Security and compliance should form the bedrock of any enterprise platform engineering team or any infrastructure you're trying to build. This should always be there; it shouldn't be a peripheral thought but rather one of the core
00:04:58 values on which you build your platform. Now, in order to accomplish this, platform engineers also have certain needs - what exactly are they building for? Like I said, multi-tenancy, with the distinction I gave you between soft multi-tenancy and hard multi-tenancy, becomes all the more important here. Platform engineers and platform engineering teams are usually designed to handle things at scale, which means catering to hundreds of customers. Then there are the AI/ML and
00:05:31 edge and IoT applications, where it's quite important for the hardware to be able to run these things. So there's generic hardware, and then there's slight tweaking of that hardware to optimize running these applications. And then, of course, like we've already covered, there's hybrid multi-cloud and high availability. So just to reiterate, multi-cluster and multi-cloud are becoming the new norm whether you want to run your enterprise
00:06:08 applications, scalable AI/ML applications, data science applications, or anything else in any domain. This sort of becomes the norm. I've put up some stats here. As you can see, the survey reveals that 76% of organizations have implemented multi-cloud strategies, but only 29% of them have actually adopted proper platform engineering. Most of them want to run multi-cloud apps and want this multi-cloud, multi-cluster environment, but when it comes to actually putting a proper plan and architecture in place and adopting
00:06:43 platform engineering, there's a gap. That's why I think there are a few choices. We could start with nothing - when we want to build a platform engineering service from scratch, we understand that the cost of building the whole thing ourselves is expensive, and it is not scalable at all. Then the next immediate thought is DIY open source. This is great, right? Especially since
00:07:17 Kubernetes has embraced and accelerated open source development so much that we really can't imagine a world without it. So it's really good, but there are just too many options for any single thing you pick; you have to pick and choose between components for every particular result. Then there are the proprietary solutions. Of course, the big clouds - AWS, GCP, Azure - all provide solutions, but they provide solutions for their own cloud, and we've all seen cases recently of what
00:07:56 happens when you have vendor lock-in. It's never really good, especially when you're thinking of a huge infrastructure in a huge cloud. So this becomes a lock-in, and of course you're completely dependent on them: the feature set and the extensibility become completely dependent on that particular cloud. Then there's the fourth option, which is an enterprise-grade open source solution. What we're talking about here is taking existing open source tools and
00:08:28 combining them in a way that presents the best of both worlds: open source solutions combined with enterprise-grade support. Let me jump into k0rdent. In brief, this is what k0rdent does: it empowers you to do multi-cluster, multi-cloud provisioning and management of Kubernetes clusters, at scale. To work with it, you don't need to understand or learn any Infrastructure-as-Code (IaC) solutions. You don't have to use things like Terraform or Crossplane - you
00:09:09 know, that's a whole different world, right? You just work with YAML directly and Helm charts. That's the configurability and ease of use that k0rdent provides. So we'll talk about two things: cluster templates and service templates. I'll go deeper into them when we get to the demo section, but essentially cluster templates and service templates are how you operate your entire k0rdent setup. The architecture of k0rdent goes like this,
00:09:42 where you have KCM, the cluster manager; KSM, the state manager or service manager; and KOF, the observability and FinOps component. The k0rdent cluster manager is responsible for actually provisioning the clusters, which is the Day 0 operation. It interacts with all the upstream cloud components, whether that's AWS, Azure, GCP, vSphere, or OpenStack. The service manager is responsible for the Day 2 operations, where you actually want to
00:10:21 deploy applications. We'll see more of this in this session as well, because in order to run AI apps you have to prepare your cluster to be AI-ready, and that's where the service manager comes in handy. Just a quick note before we move on: essentially what we're talking about here is a complete internal developer platform. IDPs have been around for quite some time, and they've converged on specific domains. When we talk about Kubernetes and IDPs, what we want is a
00:11:01 repeatable platform engineering mechanism where you, as a platform engineer, can provide your team with a platform catered to their use case. At the top layer, as you can see, there's MLOps, AIOps, HPC, edge/IoT, and so on. This could be anything, and today we're going to focus on machine learning and AI. The entire stack underneath remains the same, the infrastructure provider remains the same, and the application layer for the workloads could be anything. That's the flexibility k0rdent provides in
00:11:43 order to create an IDP. Before I get into that, I briefly want to mention why AI is so well suited to Kubernetes. AI is moving so fast that we can literally run models on Android devices now, on our laptops, even on CPUs; we don't even need to depend on GPUs. So all sorts of hardware is getting enabled for AI-based applications day by day. But if you're talking about enterprise AI applications, and you want
00:12:24 to scale them and maintain availability, then Kubernetes presents itself as a very good platform. Let's say you have an AI app using PyTorch and you run it on your Linux machine, which has a GPU. That's good enough for development purposes, but when you think about serving it to users and optimizing the usage, there are limitations. Kubernetes presents itself here as a platform that
00:13:08 can do these things well. Kubernetes can help autoscale models: it can scale down to zero, it can scale up as requests come in, you can easily control the number of replicas, you can spread this across multiple servers and across clouds, and you can distribute the workloads. All of these things would be extremely difficult, if not impossible, with traditional server-based methods. Of course, I use the word impossible carefully here; anything is possible, but then there would be
00:13:47 a lot of manual intervention and a lot of shortcuts to make it work, whereas Kubernetes provides this as a native feature of the platform itself. Then comes the choice of compute, and how you make that compatible with Kubernetes. To step back for a moment: in order to run AI/ML applications you'll of course need a GPU node, and that GPU needs to be enabled by installing certain drivers
00:14:30 from NVIDIA. We'll focus exclusively on that here - the whole demo is built around it, using NVIDIA Tesla T4-based GPUs - and you also need to make sure the container runtime is set up so that the Kubernetes control plane and nodes understand that a particular node has GPUs and workloads can be scheduled there. I know this might sound a little confusing, but it will become clearer once we jump into the demo.
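To make the scheduling point concrete, here is a minimal sketch of what a GPU workload looks like once a node has been made GPU-ready: the pod simply requests the nvidia.com/gpu resource that the NVIDIA device plugin exposes. The names and image are illustrative, not taken from the demo.

```yaml
# Minimal GPU smoke test (illustrative). Once the drivers, container toolkit,
# and device plugin are in place, the scheduler places this pod on a GPU node.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04   # any CUDA base image works
      command: ["nvidia-smi"]                      # prints the visible GPU(s)
      resources:
        limits:
          nvidia.com/gpu: 1                        # one Tesla T4 in this demo's case
```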
00:15:09 If any of you have tried doing all of this manually, it's a lot of effort; there are just one too many things you have to do. In fact, there's a running joke that the NVIDIA drivers have been the same for five-plus years and nobody has gotten them to work smoothly on the first go - there's always some issue or other popping up. What k0rdent does, with the power of the service templates I mentioned, is make all of this possible
00:15:46 and help developers deploy these things quite easily. For this demo we've chosen KServe and Knative to handle the scaling up and down. So this is the second layer: the first thing I talked about was preparing the hardware so GPU-enabled workloads can run, and the second layer is installing KServe and Knative and letting them handle all the autoscaling. Once again, what I want to emphasize is that
00:16:28 the NVIDIA GPU Operator, KServe, Knative, Prometheus, Grafana - all of these tools are open source, most of them have Helm charts, and they're continuously being developed by different communities at different scales. All we're doing is bringing these things together into a single platform and making the experience as smooth as possible. So, a little bit of context-setting: this is my kind cluster, and this is where I've
00:17:05 installed k0rdent. These are the typical components you see after installing k0rdent. You have a bunch of controllers here: the upstream CAPI controller, and then the provider controllers - CAPA for AWS, plus the ones for OpenStack, vSphere, and Azure. Earlier I mentioned that k0rdent uses upstream open source projects, and that's CAPI: it utilizes the controllers and the entire Cluster API ecosystem to actually provision and deploy clusters on whatever cloud we want to deploy to. Then there's a
00:17:47 bunch of k0smotron controllers - k0smotron acts as the control plane manager for k0rdent, so to speak. k0smotron itself is an independent project that handles the provisioning of Kubernetes clusters. And then there are the KCM pods themselves. So those are the components. Now, let's first look at cluster templates. Like I mentioned, there are two things: cluster templates and service templates. Cluster templates are templates which
00:18:36 basically contain a reference to a Helm chart, and that Helm chart contains references to all the templates that have to be applied in order to provision a Kubernetes cluster. Right now we have support for all of these clouds; GCP was added quite recently too. Essentially, you reference the cluster template in your cluster deployment, and it provisions all the resources: your cluster and, to take Azure as an example, your Azure cluster, your Azure machines, and so on.
00:19:17 These are all upstream CAPI objects - that's cluster templates. This is what you need to tell k0rdent where and how you want to install a particular cluster. Then you have service templates. These cover the Day 2 operations, like I mentioned. We have a bunch of service templates, and we have a website, catalog.k0rdent.io/latest, where we've packaged all of them into our catalog. It's an ever-growing catalog, and obviously
00:19:52 we welcome open source contributions to it. Right now I think 20- or 30-plus services are already there, and you can see some of the well-known ones I installed here while testing for the AI demo: cert-manager, secrets, gateway, gpu-operator, ingress, and so on. And if you notice, there's ingress-nginx 4.11.0 and ingress-nginx 4.11.3; I installed both versions to showcase the versioning capability as well. Like I mentioned, this is
00:20:35 the kind cluster, which becomes your management cluster. While I'm using kind, this could be any Kubernetes cluster on any cloud. That becomes your base, and you use this cluster to provision multiple other clusters, with the service templates deployed in different versions as well. What that means is, if tomorrow you want to reference one version of NGINX in one cluster and a different version of NGINX in another cluster, you get that flexibility. That's why I keep multiple versions of cert-manager and ingress around, based on my requirements.
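As a rough illustration of that flexibility, two cluster deployments managed from the same kind cluster can pin different releases of the same service template. The template names below follow the catalog's name-version convention but are examples rather than values from the demo; the exact schema is worth checking against docs.k0rdent.io.

```yaml
# Fragment of a ClusterDeployment spec (the full object appears later in the walkthrough).
# Cluster A stays on the older ingress-nginx release...
  serviceSpec:
    services:
      - template: ingress-nginx-4-11-0
        name: ingress-nginx
        namespace: ingress-nginx
---
# ...while Cluster B, managed from the same management cluster, uses the newer one.
  serviceSpec:
    services:
      - template: ingress-nginx-4-11-3
        name: ingress-nginx
        namespace: ingress-nginx
```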
00:21:15 And finally, let's look at the cluster deployments - these are the available cluster deployments. Actually, before I explain that, let's go ahead and apply a cluster deployment right now for this demo. All right, that's started. We should be able to see a cluster in less than 10 minutes. That might be surprising to some of you, because unless you're really tied to a vendor, if you've ever done this kind of
00:22:03 self-service Kubernetes cluster provisioning yourself, you'll definitely have experienced the pain of getting a cluster provisioned end-to-end with everything just working. And this isn't just provisioning a cluster, it's installing the other AI services as well. While that's being provisioned, let me show you a long-running cluster that I already have. Actually, give me a second - let's look at the YAML I applied directly. So,
00:22:50 every cluster deployment has three or four main parts. The obviously important one is the template I mentioned, the cluster template: you have to tell k0rdent which template you want to use, and there are different versions. Then there's standalone versus hosted. Standalone is when you want the control plane and the worker nodes to reside in the same cluster; hosted is when you want your control plane in one central place and multiple child clusters, where each
00:23:20 child cluster has just the worker nodes - a central management kind of setup. Then you have a credential reference, which is nothing but a secret wrapped up in a credential object. These are the same credentials you'd use to interact with your Azure cloud. And then down here we have the services; in this particular one I've just installed the gpu-operator. We'll go deeper into that with the actual
00:23:56 cluster we just provisioned. The last bit is the config. The config is cloud-dependent: for Azure you have these variables and fields, and similarly for OpenStack, AWS, and so on there are different fields appropriate to that particular cloud. These again follow the upstream CAPI and provider controllers, and all the fields map to whatever is defined there. Now, let's check the status of our cluster - I think it's still coming up.
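Putting those parts together, here is a rough sketch of what an Azure cluster deployment along these lines might look like. The template name, credential name, and config values are illustrative assumptions rather than the exact ones from the demo, and the precise field names should be checked against docs.k0rdent.io; the services section is shown separately a bit further down.

```yaml
apiVersion: k0rdent.mirantis.com/v1alpha1
kind: ClusterDeployment
metadata:
  name: demo-azure
  namespace: kcm-system
spec:
  # Which cluster template to use; a standalone Azure control plane here.
  template: azure-standalone-cp-1-0-0
  # Credential object wrapping the Secret used to talk to Azure.
  credential: azure-cluster-credential
  # Cloud-specific knobs; these fields differ per provider.
  config:
    location: westeurope
    controlPlaneNumber: 1
    workersNumber: 3
    controlPlane:
      vmSize: Standard_A4_v2
    worker:
      vmSize: Standard_NC4as_T4_v3   # Tesla T4 GPU nodes for the AI workloads
```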
00:24:41 In the meantime, let's take a look at the services. For this particular demo about AI scaling, this is the important part: the entire manual process of installing all the dependencies, and finally Knative and KServe, is taken away and abstracted into a list of services like this. Earlier I installed the service templates from the catalog, and once they're installed - which I listed with "k get servicetemplates" - all you need to do is
00:25:30 reference them here. So let's go through them. The first one is NVIDIA's gpu-operator, which I already mentioned. It makes the cluster GPU-ready and tells Kubernetes that there are GPUs attached to this hardware instance so we can actually run AI workloads. Then there's cert-manager, and finally we have the Knative and KServe components. All of these are service templates being referenced, and each service template points to a Helm chart.
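For this demo, the services section of the cluster deployment looks roughly like the sketch below. The template names and versions are illustrative assumptions - in practice you use whatever "k get servicetemplates" shows on your management cluster - and cert-manager is listed because it's a prerequisite for the serving stack.

```yaml
  serviceSpec:
    services:
      # Makes the GPU nodes usable: driver, container toolkit, device plugin.
      - template: gpu-operator-v25-3-0
        name: gpu-operator
        namespace: gpu-operator
      # Prerequisite for KServe's webhooks and certificates.
      - template: cert-manager-1-16-2
        name: cert-manager
        namespace: cert-manager
      # Knative Serving provides the request-driven autoscaling.
      - template: knative-serving-1-17-0
        name: knative-serving
        namespace: knative-serving
      # KServe layers model serving (InferenceService) on top of Knative.
      - template: kserve-0-14-0
        name: kserve
        namespace: kserve
```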
00:26:12 Yeah, let's take another look. All right - while that's still getting provisioned, let me jump into the second part of the demo. To showcase how to deploy things, I've created a bunch of inference services, plus a central pipeline called neural-babel, which is nothing but a Deployment backed by a ConfigMap, a Service, and a VirtualService.
00:26:50 The VirtualService is a CR that comes with Knative's networking layer. And then there are these three inference services. To give a quick overview of what we're trying to accomplish: the demo shows end-to-end voice translation. I speak in English and get output in the desired language. This is done entirely with AI, it runs on the Kubernetes cluster, and I've split it into three distinct services. kube-whisperer takes
00:27:34 audio and produces text - that's the speech-to-text service. Then the translation inference service translates the text from one language to another. And then there's the tts-service, which does the final conversion from text back to speech. kube-whisperer uses OpenAI's Whisper model, the translation service uses Meta's NLLB model, and the tts-service uses Coqui's TTS model. All of these are open source and available for everybody to
00:28:18 use. What I've done is take those upstream models, build an app around each of them - three similar apps, one per purpose - and finally neural-babel is the piece that brings them all together. So let's take a look at each inference service to get a better idea. As you can see, this is an InferenceService from KServe, and we have the container, the environment variables, the image - a pretty standard Kubernetes spec, with volumes and everything.
00:29:00 Just to give some background: this is an image I built from my own repository, and these are the customizations the image exposes - you can set them to whatever you need. I'm currently using the Whisper tiny model; there's base, large, and several others as well, and similarly there are many other environment variables you can set based on your requirements. And now the important part: these
00:29:40 annotations. This is how we actually accomplish the scaling. KServe is smart enough to understand when it's getting too many requests, and it scales up to serve them. For demonstration purposes I've set the minimum scale to one and the maximum scale to two. In enterprises it's typically good practice to set the minimum to zero, because you don't want your GPUs running, consuming, and costing you money when there's no real
00:30:20 request coming your way. So this is typically set to zero, which means the GPU sleeps, and then you set whatever number you want for the max scale. What happens is that when the first request hits, KServe intercepts it, immediately scales up to one, and then serves that request.
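Putting that together, an InferenceService along these lines might look like the sketch below. The image, model selection, and environment variable names are specific to my setup and are only illustrative here; the Knative autoscaling annotations are the standard way to set the scale bounds.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: kube-whisperer
  annotations:
    # Knative autoscaling bounds; in production min-scale is usually "0"
    # so idle GPUs can scale to zero.
    autoscaling.knative.dev/min-scale: "1"
    autoscaling.knative.dev/max-scale: "2"
spec:
  predictor:
    containers:
      - name: kserve-container
        image: example.registry.io/kube-whisperer:latest   # hypothetical image
        env:
          - name: WHISPER_MODEL   # hypothetical variable name
            value: "tiny"         # base, large, etc. are also available
        resources:
          limits:
            nvidia.com/gpu: 1
```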
00:30:55 Let's take a look at the other services as well. This one isn't really that different: it's a different image doing a different job with exactly the same scaling configuration, plus its own set of environment variables. And the last one, the tts-service, is pretty much the same - a bunch of environment variables, the model you want, your image, and the same autoscaling annotations. Then we have neural-babel, the pipeline that combines all of these. I've built an image for it, and it retrieves the endpoints from a ConfigMap; otherwise it's a standard Deployment.
00:31:37 In the ConfigMap, I provide the endpoints. Since this is a local environment, I've used nip.io so that requests route back to my machine. I've set the source language to English and the target language to French, and I've provided the three endpoints. Where did I get those endpoints? From the external IP of the services. You might notice it's different - it was 52-something earlier and now it's 74-something - because I was working from another machine, and that's why I edited the file.
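As an illustration of that wiring, the ConfigMap might look something like the sketch below. The key names are assumptions made up for this sketch (the real ones are specific to my neural-babel image), and the EXTERNAL-IP placeholder stands in for whatever the cluster's ingress IP is.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: neural-babel-config
data:
  SOURCE_LANGUAGE: "en"
  TARGET_LANGUAGE: "fr"
  # nip.io resolves these hostnames straight back to the external IP,
  # so the pipeline can reach the three InferenceServices.
  TRANSCRIPTION_ENDPOINT: "http://kube-whisperer.default.<EXTERNAL-IP>.nip.io"
  TRANSLATION_ENDPOINT: "http://lexi-shift.default.<EXTERNAL-IP>.nip.io"
  TTS_ENDPOINT: "http://tts-service.default.<EXTERNAL-IP>.nip.io"
```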
00:32:19 Yep. So this is the machine with the 52.17229109 address, and the file contains that as well. Before running the actual test, let's go back and check. Awesome - our demo cluster is ready. The kubeconfig for this demo cluster is stored as a secret in the same namespace, named after the cluster with a "-kubeconfig" suffix. I'm going to grab it and export it to that path, open a new tab - and we named the cluster demo-azure, yeah. Then when I do this, here we go. So
00:33:13 this is the cluster we just provisioned, with one control plane and three worker nodes, as specified in the cluster deployment configuration. Let's look at the namespaces: there's cert-manager and so on. Finally, let's verify KServe - perfect - and Knative as well. Awesome. So you see what just happened, right? This is the cluster, and if we review the cluster deployment file again: because we listed all these service templates in the serviceSpec section of
00:34:07 the cluster deployment, it has automatically provisioned all of these services for us. Another check we can make - I think this one will take a little more time - yeah, as you can see, there are quite a few pods that still need to come up, and NVIDIA does a bunch of work here. So we have to give it a little more time; once all the pods have completed, we'll have a GPU-ready node. I'm going to switch back to the cluster that was already running, just
00:34:45 to show you what it looks like. Allocated resources - yeah, this is what I was trying to find. This cluster here has already been provisioned with the same configuration we just looked at; in the interest of time I provisioned it before the demo. Once the NVIDIA GPU Operator is done, this is what you'd expect to see: it recognizes the GPU and adds these entries - nvidia.com/gpu is one, along with the
00:35:28 rest of the usual stuff like CPU and memory. And you can also observe this in the labels section of the node, which has a whole bunch of them.
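If you ever want to pin a workload to a particular GPU model, those node labels can be used as a node selector. A small sketch, under the assumption that GPU feature discovery has labelled the node with the product name; the image is hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: t4-only-workload
spec:
  nodeSelector:
    nvidia.com/gpu.product: Tesla-T4    # label applied by GPU feature discovery
  containers:
    - name: app
      image: example.registry.io/my-gpu-app:latest   # hypothetical image
      resources:
        limits:
          nvidia.com/gpu: 1
```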
00:36:01 So finally, let's test this out. I've got a script here that runs the whole conversion end-to-end and then plays the result out loud - but I'm wearing earphones, so I'm not sure whether you'll actually be able to hear it. Let's see. We've got an audio file; the script takes it, converts it to text, translates the English text to French, and then converts the French text back to audio. All right - I really can't play it back for you right now because of the earphones, but I could hear it: while it said it was playing the translated audio, it actually did play it back. So that's the end-to-end pipeline. Now let me open the same thing in another tab. When we created these
00:36:39 three inference services, KServe deployed these pods. The pods are maintained and controlled by KServe; we only have to work with the InferenceService objects, and the pods are all spawned from them. So, yeah, it's roughly the same thing I wanted to show: it scaled up to two pods and is now terminating one. Now let me run another script that sends concurrent requests, and let's watch. There you go. If you remember, we set the minimum scale to one and the
00:37:32 maximum scale to two. In the script I'm just sending requests one after another with a two-second gap, because if you fire off something like 100 requests at once, I've seen the app fail - "crashes" isn't quite the right word; it accepts the request but then immediately fails it. So I give it a two-second gap, and as you can see, the first pod in the pipeline, kube-whisperer, comes up. It had one replica; now a second one goes pending,
00:38:04 then the container starts, and eventually it's running too, so the second replica is coming up as well. The reason lexi-shift isn't scaling up is that its workload simply doesn't require it: it uses Meta's NLLB model, and text-to-text translation is, as you'd expect, much lighter than dealing with audio files. With the kind of load we're sending - 30 requests - lexi-shift can handle everything with a single pod, whereas the
00:38:50 text-to-audio conversion and the audio-to-text conversion - the ____ and kube-whisperer - couldn't keep up with just one pod, so they spawned more. That's where you see the smartness of KServe and Knative: they're not blindly scaling up pods and costing us money; they look at the workload they're actually getting and scale up dynamically only when it's required. If I check right now, you can see there are two ____ pods, two kube-whisperer pods,
00:39:24 one lexi-shift, and one neural-babel - that last one is not an inference service, it's just the pipeline, so ignore it. And you can already see that, since we've stopped sending requests, within a few seconds the extra pods went from running to terminating, and a few seconds later even the ____ one is gone; everything is back to normal with the minimum scale of one replica each. I really had to push the number of requests to get lexi-shift itself to scale up to two, and I had much better luck doing that when I
00:39:58 was sending queries directly to the lexi-shift service instead of going through neural-babel - but anyway, the point was to showcase that other side as well, which is really nice. And as a final confirmation, everything is back to the minimum scale of one. I know that was a lot, so let me take a couple of minutes to summarize it all. Number one: the platform engineering base - the need for scalable, repeatable, customizable platforms, how complex that is, and how k0rdent helps there.
00:40:38 Number two: why Kubernetes is perfect for AI, despite all the challenges involved. First there's setting up the hardware required to run these workloads; then, once the hardware is in place, there are the challenges of scaling up to cater to a larger audience. And not just a larger audience - the other direction is just as true: scaling down, which is extremely difficult to do without Kubernetes. That's
00:41:12 why Kubernetes is a perfect fit, and Kubernetes with k0rdent, which provides the underlying platform engineering and accelerates AIOps or MLOps - whichever way you look at it - is what I tried to present here. The NVIDIA GPU Operator: mention it as a service in the services spec and you're done - your hardware is ready and GPU-enabled. Scaling up and down: install KServe and Knative as services; the prerequisite for those is cert-manager - no problem, just
00:41:48 mention it in the services spec and it's ready. If any of you want to feel the difference, I suggest trying to install KServe, Knative, and all their dependencies manually on a plain Kubernetes cluster - it's painful. Here, you get a single template to apply. Then the final bit: I had an AI application doing real-time translation between languages - it takes audio, converts it to text, translates the text, and then converts it back to audio in the target language.
00:42:28 I demonstrated English to French - unfortunately I wasn't able to play the audio back for you, but that's the entire flow. Then we scaled it up: I had a script sending continuous, concurrent requests, and as soon as it saw that many requests coming its way, it spun up another pod, because we had set the maximum scale to two for kube-whisperer and the _____, whereas lexi-shift, the text-to-text translation, just stayed at one because it was still able
00:43:01 to handle the load with a single pod. This is the catalog I mentioned: catalog.k0rdent.io/latest. As you can see, we already have a bunch of templates here. If I just open Kubecost, for example, this is what you'd need to do: install the Helm chart. That does two things - it installs the Helm chart and also creates the service template, so the Kubecost Helm chart and the Kubecost service template are installed together. With just that one command,
00:43:52 you're good to go, and then in your cluster deployment you reference it like any other service. And of course the values.yaml is configurable, so you can modify it based on your requirements.
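In the cluster deployment, that reference might look roughly like this; the template name-version is a placeholder, and the values block is just an example of passing chart overrides (any key the Kubecost chart's values.yaml exposes could go there).

```yaml
  serviceSpec:
    services:
      - template: kubecost-2-6-0          # exact name-version as listed in the catalog
        name: kubecost
        namespace: kubecost
        values: |
          # Anything from the chart's values.yaml can be overridden here.
          networkCosts:
            enabled: true
```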
00:44:27 So basically, what I wanted to show is that catalog.k0rdent.io/latest is there, and the list keeps expanding - we're constantly adding more charts. If you're interested in other ones - a scheduler is something I've heard requested multiple times - they can be added as long as there's a Helm chart, and I'm sure most of these projects already have Helm charts. And then there's the documentation site, docs.k0rdent.io/latest/, where we have a bunch of resources for trying this out on different clouds - the admin guide and so on. You can explore it, and I'll share these links after the session as well: docs.k0rdent.io/latest/ and catalog.k0rdent.io/latest/. All right everyone, thank you so much for staying and listening. I hope you enjoyed the
00:45:05 session and took some value out of it. Let me know if you end up doing some other crazy things with AI - every day there's just one new crazy thing. So cheers, thank you.