There's no greater frustration than troubleshooting a system for many hours (or days) while the solution was right under our nose the whole time.
I almost burned a 7K Euro GPU card (NVIDIA A100 PCIe) to understand how TorchServe could keep up with a growing number of on-demand inference requests at scale.
In a typical MLOps practice, among other things, we need to serve our AI models to users by exposing inference APIs. I tried a production-ready framework (TorchServe), installed it on Azure Kubernetes Service, and pushed it to its limits.
Let's see how Helm hooks work and what their real use cases are.
Docker Registry HTTP API V2 demystified once and for all.
Failures can be a step toward success only if they are understood. Here I report my experiences with two startups, with a summary of the main events and some personal post-mortem thoughts.
Delegating secret management to an external, specialized tool is important to keep in mind if you want to manage a fleet of Kubernetes clusters securely and professionally. Let's look at two main solutions in the Kubernetes ecosystem: Secrets Store CSI Driver and External Secrets!