Hi folks,
This is the first post for HelixML, a new project that I’m working on with some of my favourite people.
I plan to use weekly project reports to help us build in the open. I hope that by openly sharing our achievements and challenges in the form of a weekly R&D report, we’ll be able to involve you, our readers, in the journey. But first, some context.
What is happening in GenAI?
The Generative AI space is an interesting one. It feels like a new space with lots of scope for innovation and value creation, much as Docker and Kubernetes felt in 2015.
We’ve been working on MLOps for a while: first on Dotscience (where Kai, Chris and I worked), an end-to-end MLOps platform that covered everything from model training (with data and model versioning and provenance) through deployment and monitoring, and more recently through our client work at MLOps Consulting, advising major enterprises on the transformative power of Large Language Models (LLMs).
ChatGPT launched a year ago, and it changed everything. The innovation of fine-tuning a completion model on instructions (question-answering) suddenly gave language models the ability to interact with humans through an interface they were already deeply familiar with: chat. Stable Diffusion also caught people’s imaginations as a way for computers to create images. It was open source, so tons of people across the world collaborated on improving it and making it run on smaller hardware.
So obviously OpenAI is the 800-pound gorilla in the room. But the really interesting thing that’s happened this year is the Cambrian explosion of open source models, and the leaked Google memo suggesting “we have no moat” in the face of the rapid development of open models. Meta threw their hat into the ring with Llama 2, and suddenly there’s a path to a “stable diffusion moment” for open source large language models. Mistral launches, and suddenly you have a small open source language model that doesn’t suck. The open source community gets fine-tuning of that model running on a single 4090 (a consumer-grade GPU). More capable LLMs come out of China, and the open source community starts remixing them.
This thing isn’t slowing down. Our bet is that in 2024 open source LLMs will at least catch up with GPT-4 in capability, and that they’ll run on the kind of small hardware open source contributors actually have access to. What are the implications of that?
So, as 2023 draws to a close, the conditions look right to create a platform that takes the best of open source AI (under the assumption that it gets better, faster) and packages it the way companies need to consume it: a SaaS layer for easy experimentation, an ergonomic API with great developer docs, plus the ability to deploy it privately in your own cloud account or data center.
Add a UI layer on top that makes it accessible for everyone to fine-tune their own small LLMs, and we might have something interesting.
Hit subscribe to follow our progress!
Cheers,
Luke