Podcast Drop: Chatting Open Source With The Founders of Predibase
Fine-tuning and serving open-source models, founding an AI company pre-ChatGPT, and why bigger isn't always better!
Please subscribe, and share the love with your friends and colleagues who want to stay up to date on the latest in artificial intelligence and generative apps!🙏🤖
The other week we had a doubly special opportunity to speak with two of the three co-founders of the hot AI startup Predibase: Dev Rishi and Travis Addair. Predibase is a developer platform for fine-tuning and serving open-source models, so they are squarely in the middle of several big themes in the AI space.
Dev and Travis are both first-time founders who came from large tech companies, and they shared a gold mine of valuable information for AI builders and practitioners, including:
Their journeys from working on next-gen machine learning products at Google and Uber to founding Predibase in the early days of transformer models
Building and scaling open-source projects to thousands of users and maintaining an engaged developer community
Selling into the enterprise and deploying to customers like Adobe, Tencent, Apple, and a top-10 US bank
And much more. You can watch or listen to the podcast below and read the full transcript here, but in this post we’ve summarized our key learnings and takeaways.
What Does Predibase Do?
Predibase is a developer platform to fine-tune and serve open-source models. Two of the three co-founders - Travis and Piero Molino - had originally worked on developing and deploying machine learning models at Uber. In 2019, Piero introduced Ludwig, an open-source framework for creating deep learning models that now has 10K+ GitHub stars and thousands of monthly downloads. At Uber, Travis led the team behind Horovod, an open-source framework for efficiently scaling and distributing deep learning model training across massive amounts of data. Travis and Piero then teamed up with Dev, who was the first Google AI PM at Kaggle (a data science community with 6M+ users), to found Predibase in 2020.
Predibase’s platform is based on three key tenets:
Fine-tuning: Developers can easily configure model training using Predibase, which is built on top of Ludwig. Predibase automatically applies quantization, low-rank adaptation (LoRA), and other optimizations through best-practice templates, with built-in orchestration logic that finds the most cost-effective hardware to run training jobs on (see the fine-tuning sketch after this list).
Model Serving: Predibase offers serverless endpoints for open-source LLMs and, with LoRAX and horizontal scaling, can serve customers’ fine-tuned models cost-efficiently. LoRAX (which stands for LoRA Exchange) provides a series of optimizations like Dynamic Adapter Loading, Tiered Weight Caching, and Continuous Multi-Adapter Batching, right out of the box (see the serving sketch after this list).
Private Deployment: Developers can use Predibase to build in their own environments and deploy models within private clouds or the secure Predibase AI cloud. Users retain control of their IP and are guaranteed data separation, so all sensitive assets stay under their control.
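To make the fine-tuning tenet concrete, here is a minimal sketch using Ludwig’s declarative Python API, which Predibase builds on. The base model, dataset path, and column names are illustrative assumptions rather than details from the podcast:

```python
# A minimal sketch of LoRA fine-tuning with Ludwig, the open-source
# framework underneath Predibase. Base model, dataset path, and column
# names are illustrative assumptions.
from ludwig.api import LudwigModel

config = {
    "model_type": "llm",
    "base_model": "mistralai/Mistral-7B-v0.1",  # any Hugging Face base model
    "input_features": [{"name": "prompt", "type": "text"}],
    "output_features": [{"name": "response", "type": "text"}],
    "adapter": {"type": "lora"},   # train small low-rank adapter weights
    "quantization": {"bits": 4},   # load base weights in 4-bit to cut GPU memory
    "trainer": {"type": "finetune", "epochs": 3},
}

model = LudwigModel(config=config)
model.train(dataset="support_tickets.csv")  # CSV with prompt/response columns
```

On Predibase, this same kind of declarative config plugs into the best-practice templates and orchestration described above, which pick the hardware for you.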
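On the serving side, here is a sketch of what LoRAX’s multi-adapter serving looks like from the client: one deployment of a base model answers requests for many fine-tuned adapters, loaded on demand per request. The endpoint URL and adapter ID are placeholder assumptions:

```python
# A sketch of querying a running LoRAX server (pip install lorax-client).
# The endpoint URL and adapter ID below are placeholder assumptions.
from lorax import Client

client = Client("http://127.0.0.1:8080")  # a LoRAX deployment of one base model

prompt = "Classify this support ticket: 'My invoice total looks wrong.'"

# Query the base model directly: no adapter_id.
print(client.generate(prompt, max_new_tokens=32).generated_text)

# Same deployment, routed through a fine-tuned LoRA adapter. LoRAX loads
# the adapter dynamically and batches requests for different adapters
# together (Dynamic Adapter Loading + Continuous Multi-Adapter Batching).
resp = client.generate(
    prompt,
    adapter_id="my-org/ticket-classifier-lora",  # hypothetical adapter
    max_new_tokens=32,
)
print(resp.generated_text)
```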
Using Predibase’s platform, organizations like Adobe, Tencent, Apple, and many others can train models more quickly and save money on their LLM bills. Predibase acts as a “gateway” for businesses to start using open-source models. Predibase’s popularity has attracted interest from the VC community, and they’ve quickly raised $20M+ from investors like Felicis and Greylock.
Example use cases include:
Customer sentiment analysis to understand how customers feel about your products and services.
Customer service automation, where Predibase can be used to fine-tune open-source LLMs to automatically classify support issues and generate a customer response.
Document generation, such as using LLMs to automate manual docstring creation.
And much more. Watch the clip below for a fuller product overview:
Dev and Travis shared key learnings from their founding days, how to successfully commercialize open-source technology, and why bigger isn’t always better:
On being founded “pre-ChatGPT” and how that affects the vision:
Our vision initially was to be a platform for anyone to start building deep learning models. We initially started with data analysts as an audience in 2021. We had an entire interface built around a SQL-like engine that allowed them to use deep learning models immediately. That’s where the name Predibase came from: Predibase is short for predictive database, and the idea was, let’s take the same types of constructs we have for database systems and bring them to deep learning and machine learning.
In 2023, when large language models took off in a big way, we started to think about what we wanted our platform to be. One of our very first takes, and maybe one of my very first takes specifically, was that LLMs are just another dropdown item in the menu. We needed to recognize that the market had changed how it thought about machine learning. It was no longer thinking about training models first and getting results after; it was thinking about prompting, fine-tuning, and then re-prompting a fine-tuned model. Our tactics shifted significantly, and we made a product pivot in 2023 to better support large language models.
Funnily enough, it’s still in service of the original vision: making deep learning accessible for developers.
On how to balance maintaining an open-source community with selling to enterprises:
When it comes to open source, and particularly open-core business models, I think the easiest argument to make is that at Uber we had a team of 50 to 100 engineers working on building infrastructure for training and serving models. The cost of that is quite significant, even for a company like Uber. For companies that don’t consider this part of their core business - maybe they consider it core infrastructure, but it’s not differentiated IP for them - you could invest in building an entire team around it, or you could just pay a company like Predibase to help solve those challenges.
The way I think about it is, for us, Ludwig and LoRAX are the engine, and what we’re trying to do is sell the car. There are some people, maybe advanced auto manufacturers, who just want an engine to put in their tractor or some other kind of setup. Most people want to buy the fully functioning car - something that lets them unlock the doors, has a steering wheel, and other things along those lines. In our world, that means the ability to connect to enterprise data sources, deploy into a virtual private cloud, and get observability and deployments that you don’t necessarily get if you just run the open-source projects directly. And finally, you connect that engine to a gas line, which in our world is the underlying GPUs and cloud infrastructure this is all going to run on.
On why "bigger isn’t always better”:
My favorite customer quote is, “Generalized intelligence is great, but I don’t need my point-of-sale system to recite French poetry.” Today, they have a model that can do everything from their actual task to French poetry to writing code. There’s this intuition that when you’re using a large model like that, you’re paying for all that excess capacity - both in the literal dollars, but also in latency, reliability, and deferred ownership.
When I talk to customers, I think they’re very enamored with this idea of smaller, task-specific models that they can own and deploy, right-sized to their task. What they need is a good solution for something very, very narrow. The tricky question customers have is: can those small models do as well as the big models? It’s very fair that if you’ve played around with some of these open-source models, especially some of the base model versions, you have the intuition that they don’t do as well as the big models as soon as you start to prompt them. That’s where we’ve spent a lot of our time investing in research to figure out what actually allows a small model to punch above its weight and be as good as a large model.
What we found is that if you fine-tune a much smaller model - a seven-billion-parameter model, a two-billion-parameter model - you can reach parity with, or even outperform, the larger models, and you can do it far more cost-effectively and a lot faster, so you don’t have to wait for that spinner you often see with some of these larger models.
Listen here for the full recording. Thanks to Dev and Travis for joining us!
We hope you enjoyed this edition of Aspiring for Intelligence, and we will see you again in two weeks! This is a quickly evolving category, and we welcome any and all feedback around the viewpoints and theses expressed in this newsletter (as well as what you would like us to cover in future writeups). And it goes without saying but if you are building the next great intelligent application and want to chat, drop us a line!