We can all be AI engineers – and we can do it with open source models
The barriers to AI engineering are crumbling fast
A couple of weeks ago, I gave a talk at Hannah Foxwell’s amazing AI for the Rest of Us conference about something that's been brewing in my mind after years of working in DevOps, MLOps, and now GenAI: the barriers to AI engineering are crumbling fast. The tools have gotten good enough that if you can handle an IDE and push some YAML to git, you're already qualified.
A Pattern We've Seen Before
Having lived through the evolution from DevOps (ClusterHQ) to MLOps (Dotscience) and now diving deep into GenAI with HelixML, I keep seeing the same pattern: complex tools get simpler, workflows get standardized, and suddenly what seemed like rocket science becomes just... more engineering.
Let's Break This Down
Building an AI application comes down to six building blocks:
Models: Just mathematical functions. Complex ones, sure, but at their core, they're just turning words into numbers and back again.
Prompts: Telling the model what to do in plain English. Sometimes you need to be really explicit - like talking to a toddler.
Knowledge: Your AI's personal knowledge base. Documents, websites, whatever you need it to learn from.
Integrations: This is where it gets interesting - connecting your AI to real business systems through APIs.
Tests: Because nobody wants their AI going sideways in production. Yes, you can test AI apps, and you absolutely should.
Deployment: Running it on a server. For example, versioning the entire app above in a YAML file and using Flux to manage deployment to Kubernetes and Helix (there's a sketch of the Flux side right after this list).
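To make that deployment step concrete, here's a minimal sketch of the GitOps side: a Flux GitRepository pointing at the repo that holds your versioned app YAML, plus a Kustomization that applies whatever manifests live under a deploy folder. The repo URL and path are placeholders; the reference architecture linked below is the authoritative version.

```yaml
# Flux watches the repo that holds the versioned app YAML...
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: ai-app
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/example-org/ai-app   # placeholder repo
  ref:
    branch: main
---
# ...and applies the manifests under ./deploy, so a git push rolls out the change.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: ai-app
  namespace: flux-system
spec:
  interval: 5m
  sourceRef:
    kind: GitRepository
    name: ai-app
  path: ./deploy
  prune: true
```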
Taking It to Production
Here's where my DevOps background kicks in. You know all those tools you're already using? Git? CI/CD pipelines? They work for AI apps too. I demonstrated this live in the talk by building a Jira integration that could write code based on ticket descriptions.
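To show what "your existing CI works for AI apps too" looks like in practice, here's a minimal GitHub Actions sketch that runs the app's test suite on every push. The script name and secret are placeholders, not the actual pipeline from the talk; that lives in the reference architecture linked below.

```yaml
# .github/workflows/ai-app.yaml (names are illustrative)
name: ai-app-ci
on:
  push:
    branches: [main]
  pull_request:
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Run the app's tests against a staging model endpoint.
      # Swap ./run-ai-tests.sh for whatever test runner your stack provides.
      - name: Run AI app tests
        run: ./run-ai-tests.sh
        env:
          MODEL_ENDPOINT: ${{ secrets.STAGING_MODEL_ENDPOINT }}
```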
The secret sauce? Something we're calling an "AISpec" - a YAML file that pulls all these pieces together in a way that feels natural to anyone who's worked with modern infrastructure tools.
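To give a feel for it, here's a hypothetical sketch of such a spec, one block per building block above. The field names are illustrative guesses, not the real schema; aispec.org and the reference architecture below have the actual format.

```yaml
# Illustrative only: field names are guesses at the shape of such a spec.
name: jira-coder
model: llama3:8b            # an open source model
prompt: |
  You are a helpful engineer. Given a Jira ticket description,
  produce a code change that addresses it.
knowledge:
  - url: https://docs.example.com                 # docs the assistant can draw on
integrations:
  - name: jira
    api: https://example.atlassian.net/rest/api/3 # placeholder endpoint
tests:
  - name: writes code for a simple ticket
    prompt: "Add a /healthz endpoint to the API"
    expect: "code that adds a health check route"
```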
Why Open Source Models Matter
This bit's important: when you use open source models, your data stays yours. It doesn't end up training someone else's model. For companies worried about GDPR or other national, regional, or corporate regulations, this is crucial. You can run everything locally, in your own infrastructure, with your data staying right where your legal team wants it.
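As one example of "run everything locally", here's a minimal Docker Compose sketch that serves an open source model behind an OpenAI-compatible API using vLLM. This is one option among several, and the model name and GPU settings are placeholders to adapt to your hardware.

```yaml
services:
  llm:
    image: vllm/vllm-openai:latest        # serves an OpenAI-compatible API
    command: ["--model", "mistralai/Mistral-7B-Instruct-v0.2"]  # any open model you're licensed to run
    ports:
      - "8000:8000"                       # your app points at http://localhost:8000/v1
    volumes:
      - ./models:/root/.cache/huggingface # keep the weights on your own disk
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```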
See It In Action
If you're reading this thinking "show me the code," I've got you covered. Here’s a complete reference architecture you can set up to do this stuff yourself: https://github.com/helixml/genai-cicd-ref
And here's a complete tutorial on setting up and using the reference architecture:
The main thing I want you to take away is this: if you can handle version control and basic deployment workflows, you can build production-ready AI applications. The tools are there. The models are there. And they're getting better every day.
What's Next?
Check out aispec.org if you want to dig into the standard format we're proposing for all this. Or if you want to chat about any of this, find me on LinkedIn or drop me an email at luke@helix.ml.
This is about making AI engineering accessible to everyone who knows how to ship code. No PhD required, no magic - just solid engineering practices applied to these powerful new tools.