Slai.io is changing the way software engineers use machine learning.
In May of 2022, the Cambridge, Massachusetts-based startup received investment to develop its vision for creating a fast self-serve prototyping platform for machine learning (ML) apps.
According to Slai Founder Luke Lombardi, the company’s mission is to manage the entire ML lifecycle, enabling developers to build production-ready models in minutes. "Models developed on our platform are guaranteed to run the same way in production," says Luke.
Slai's platform allows developers to focus on building machine learning applications by removing the tedious, time-consuming DevOps tasks that interfere with development. "We believe technical breakthroughs happen by giving developers access to better tools," says Luke. "We built Slai because we wanted to remove the friction developers feel when they want to leverage ML in a project."
But Luke knew that "better tools" were exactly what his own team needed: their reliance on local development environments slowed development and hampered onboarding. "In January, we moved our entire system to Kubernetes, and then we set up a local development environment using k3d. It was clunky, with many scripts, and not representative of our production or staging environments," says Luke. "In addition, the experience of getting a new engineer set up and onboarded with this environment wasn’t great."
The company's two founders quickly realized that without a development environment that mirrored production, they would continue to struggle with migration mishaps and the resulting scrap and rework. "If your development environment isn't a twin of your production, like a Kubernetes cluster, then things don't behave as you would expect them to when you deploy," says Luke. "You're going to have drift. This was especially true for us, considering what we're doing isn't as simple as an API with a database; we have eight different services running."
Making matters worse, laptops did not have the compute power to handle the dynamic provisioning of different tools. In short, local dev took forever to spin up, put the brakes on developer velocity, and made onboarding a day-long endurance test. A new development approach was needed.
To overcome the limitations of local development, Luke and his team decided to explore the benefits of Okteto's cloud development environment. They were impressed that Okteto spins up an instant pre-configured environment that mirrors production to mitigate the risk of drift showing up in production.
The spin-up speed reduced onboarding time from half a day to half an hour. "Onboarding has become dramatically easier," says Luke. "When a new engineer starts, and they go into their namespace, they click and import the single repo. It's nowhere near as many steps now to onboard."
The issue Slai faced with local dev environments was that to see a live version of a pull request, they needed to clone the branch on a local machine, redeploy the local environment, reconfigure their system, and then QA the changes. With Okteto, developers can now deploy application stacks and see changes live, precisely as they would look in production, without the complications of having to commit, build, push, or deploy. This streamlined deployment and eliminated most of the differences between development and production.
Because Okteto was built by developers for developers, Luke was able to lean on Okteto's customer support team to make sure the cloud development platform was configured to meet the company's unique ML development needs. "Every time I had a non-standard request, I would send it over to Okteto, who would immediately get us where we wanted to be," says Luke.
Okteto helped Slai overcome a common problem that plagues ML organizations: buying expensive laptops with GPUs that then sit idle. With Okteto, Slai could provision a cluster with GPUs in any public cloud, launch their cloud development environments in that cluster, and use the GPUs there to develop and train their models. Additionally, a timer checks whether an expensive GPU resource is still active; if not, the resource is put to sleep.
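The idle check described above can be pictured with a small sketch. This is not Okteto's implementation; the `DevEnvironment` class, the `sleep_idle_environments` function, and the one-hour timeout are all hypothetical names and values chosen for illustration. It only shows the general idea of comparing a last-activity timestamp against a timeout and putting inactive environments to sleep.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DevEnvironment:
    """Hypothetical record for one GPU-backed development environment."""
    name: str
    last_activity: datetime
    sleeping: bool = False


def sleep_idle_environments(envs, now, idle_timeout):
    """Mark every awake environment idle for longer than idle_timeout as sleeping.

    Returns the names of the environments that were put to sleep on this pass.
    """
    put_to_sleep = []
    for env in envs:
        if not env.sleeping and now - env.last_activity > idle_timeout:
            env.sleeping = True
            put_to_sleep.append(env.name)
    return put_to_sleep


if __name__ == "__main__":
    now = datetime(2022, 5, 1, 12, 0)
    envs = [
        DevEnvironment("gpu-training", now - timedelta(hours=3)),
        DevEnvironment("gpu-notebook", now - timedelta(minutes=5)),
    ]
    # Assumed policy: anything idle for more than one hour goes to sleep.
    print(sleep_idle_environments(envs, now, timedelta(hours=1)))
```

In a real cluster the timer would run periodically and would scale the underlying workloads down instead of flipping a flag, but the decision rule is the same.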
“Okteto will automatically scale your cluster needs, ensuring that your ML team has all the resources they need while they work, freeing them when they are idle to save money,” says Luke. “Cloud solves the need for heavy duty equipment to run GPUs and removes the stress factor on local machines.”
With Okteto, Slai saw environments spin up 10X faster, along with a 50% acceleration in development velocity. Additionally, the company used Okteto to reduce the number of bugs making their way to production by 50%.
Okteto also simplified how the company's developers worked with Kubernetes. For example, with a cloud development environment, Luke's team had access to a Kubernetes namespace in a shared developer cluster that mirrored their production environment. "I'm not a Kubernetes expert, so it was great having support from a team that understood our infrastructure and could help smoothly influence the design of our system to mirror production closely."
With Okteto, developers work in a true Kubernetes environment and enjoy the power of containers without needing to learn the intricacies that could otherwise hold up development. Okteto integrates with Kubernetes node autoscaling to make sure developers have the resources they need.
"We have a front-end engineer who knows nothing about Kubernetes, and he doesn't need to in order to do his job," says Luke. "Okteto shows you as much or as little of Kubernetes as you want; it doesn't force you to work on a system that doesn't directly interact with the cluster. But, on the other hand, when we need to work directly in the cluster, it’s critical to see how it behaves. For instance, AWS and Kubernetes may change how something performs, and we can accurately test against those changes."
Okteto also helped Slai speed up onboarding, streamline workflows, spin up faster and accelerate developer velocity. "At the highest level, the real reason we chose Okteto was to improve our developer sanity and help us ship things faster," says Luke.
Giving each developer their own development registry allows for experimentation without fear of breaking the main code. "With Okteto, I know there's an escape hatch where I can revert to a known good state of development in one minute," says Luke. "In the past, if a developer wanted to make a change to a Dockerfile, they'd push it as an engineering change request and tag it; unfortunately, that process might cause a developer to pull down the wrong image. Giving developers their own registry as part of their development environment has ultimately reduced shipping bugs by 50%."
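To see why a per-developer registry avoids the "wrong image" problem, consider a minimal sketch of image naming. The host `registry.example.com`, the `dev_image_ref` function, and the path layout are assumptions made for illustration, not Okteto's actual naming scheme. The point is only that when the developer's namespace is part of the image reference, two people building the same service under the same tag can no longer overwrite or pull each other's images.

```python
def dev_image_ref(registry_host, developer, service, tag="dev"):
    """Build an image reference scoped to one developer's namespace.

    Hypothetical layout: <registry_host>/<developer>/<service>:<tag>
    """
    namespace = developer.strip().lower()
    if not namespace:
        raise ValueError("developer namespace must not be empty")
    return f"{registry_host}/{namespace}/{service}:{tag}"


if __name__ == "__main__":
    host = "registry.example.com"  # placeholder host, not a real registry
    # Same service, same tag, two developers: the references stay distinct.
    print(dev_image_ref(host, "Alice", "api"))
    print(dev_image_ref(host, "bob", "api"))
```

With a single shared registry and tag, both builds would resolve to the same reference, which is the collision the quote above describes.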
With Okteto, developers no longer fly blind, resulting in a very positive developer experience. "Separating concerns between developers working on different development tasks is huge," says Luke. "Developers don't have to know all of the systems or services and can work on one script that replaces one existing service. It reduces the cognitive load, the number of things they need to think about, when working on one of these services. So being able to replace the service I'm working on and leave the rest untouched is critical. And Okteto makes this happen."