From minimal VMs to AI/ML GPU rigs with Kubernetes
A recurring theme throughout my career is frustration at not being able to easily utilise a remote machine or a local VM.
In some cases this was because I had a local VM running a stripped-down kernel with very little userland in it. Creating such an environment was useful for quickly building and redeploying the Linux kernel, but it left very little in userland to work with. That pushed a lot of the tooling outside the machine where the software was being deployed.
In others I had a full-blown Linux distro on a remote machine, but quickly getting updated versions of my software onto it was still a pain, often involving SSH’ing into the machine and doing a Git pull, or opening a file in Vim/Emacs and tweaking it.
Even worse were occasions where I had to upload a Docker image to a repository just so I could run the software in a Kubernetes cluster. In this case there are a bunch of tools to help (e.g. Skaffold). However, the mind boggles at how complicated anything involving Kubernetes can get.
In pretty much all cases I have a local source repository with some code in it. That could be a Linux Test Project test or an LLM example for the Prem Kubernetes operator. I then need to get that software built and onto a remote machine which has the appropriate hardware or OS-level software.
There are plenty of CI/CD options out there, but they lack the kind of tight feedback loop I want in any given situation. I certainly don’t want to involve GitHub or GitLab when tweaking one line of code in a series of experiments.
There are some tools that do a very good job, but they typically require a fair amount of setup. To the extent that I think it’s more effort than it’s worth, and I resort to some variation of logging in over serial and doing things manually.
Ayup: A solution for some of this
What I generally want in these cases is a tool I can easily install and connect to on these systems that does the full CI/CD in a very rapid manner. That means it has to be statically compiled with just the kernel as its dependency, it has to have an easy, secure connection mechanism, and it needs to cache builds.
Importantly, it needs to do the good stuff by default. It’s amazing what Nix, Docker or Kubernetes can do. The tools and the features are all there; it’s just that they’re not put together in the right way for the user experience I envision.
So here is Ayup, a build and deployment tool based on Buildkit and Containerd. Its initial focus is on AI/ML projects, but in theory there’s nothing stopping it from deploying an LTP test.
Presently it doesn’t bundle Buildkit or Containerd, but bundling them is what a bunch of Kubernetes distros do, and it results in a 200MB executable. That can be dumped on a system as a single file and off we go.
There are a bunch of other problems that Ayup is trying to tackle, which you can read about on Prem’s blog.