Open (Source) Sesame
Why are open-source models gaining traction, and why might they win in the long term?
Thanks to our now 1.6K+ subscribers! Please continue to subscribe and share with colleagues, friends, or anyone you think would enjoy staying current on the latest trends in AI :)
A few weeks ago at Madrona’s Intelligent Applications Summit, we heard a keynote from Ali Farhadi (CEO of AI2) on “The State of Open Source Models and the Path to Foundation Models at the Edge”. While the title may be a bit of a mouthful, the central subject certainly wasn’t.
At its heart, Ali made a passionate case for why open-source AI models can be so powerful and how they can help advance the entire AI community, including their proliferation out of the domain of AI researchers and fully into production through real-world applications. In this week’s post, we thought it would be interesting to double-click into the history of open-source models and share why we believe this approach will lead to meaningful value generation in the long term, and why it is a necessary and powerful counterweight to closed-source models.
As this is an evolving market with a variety of vendors, consumers, and end goals, we expect this post to be the first in a mini-series where we explore different parts of the open source AI stack.
You can listen to Ali Farhadi’s keynote here:
A Brief History of How We Got Here
Open source refers to the practice of making a project’s or product’s source code publicly available so that anybody can inspect, modify, or re-use the software, which ultimately drives more collaboration, transparency, innovation, and accessibility. One of the earliest pioneers of the open-source movement was Linus Torvalds, who developed Linux in the early 1990s. Over the years, many AI-related open-source projects have emerged, often stewarded by organizations like the PyTorch Foundation, the Linux Foundation, and the Apache Software Foundation. There are also open-source projects managed by companies, like Databricks’ MLflow and Google’s TensorFlow.
The open-source model movement arguably began in 2018 when Google first released BERT; since then, we’ve seen the release of T5, OpenAI’s GPT-2, and EleutherAI’s GPT-J. In 2023, we really started to see an influx of new open-source models, including Meta’s Llama 2, Databricks’ Dolly 2.0, MosaicML’s MPT-7B, TII’s Falcon-40B, BLOOM, Stable Diffusion (for image generation), and many more. These open-source models have taken the world by storm given their improved performance and accuracy. Hugging Face has built its entire business model around open-source models, making them easy to access and understand.
So what about these models makes them “open”? Open-source models are ML models pre-trained on very large datasets (often billions to trillions of tokens), whose artifacts are released to the public for everyone to use. A release typically includes some combination of the model architecture and code, the weights, the hyperparameters and optimizer state, the dataset (in some cases the raw data), and configuration scripts, which ultimately allows developers to experiment for free. We have seen a significant uptick in the release of open-source models, along with applications being built on top of them.
Note that in the context of open-source models, some argue that “open science” is a more precise term than “open source”, since the full training code and data behind a model are not always published.
Open Source Models - Pros vs. Cons
What are some of the pros and cons of open-source models vs. closed-source models?
Pros of Open-Source:
Cost - Open-source models are pre-trained models that can be adapted more cheaply (and sometimes faster) than building on closed-source models. Vicuna is an example of a cost-efficient LLM that reportedly achieved roughly 90% of ChatGPT’s quality (as judged by GPT-4) despite a fine-tuning cost of only about $300.
Control - Flexibility for the developer to adjust the weights and parameters in the model, fine-tune, train, and ultimately enable better outputs from the models. For example, techniques like LoRA (Low-Rank Adaptation), a training method that freezes the pre-trained weights and trains only small low-rank update matrices, can accelerate fine-tuning of LLMs while consuming far less memory.
Security - Ability to control the data (inflow and outflow) and bring your own proprietary data. Models can be run on your own infrastructure, which is also more secure, especially for regulated industries like healthcare and financial services.
Transparency - A generally better understanding of what is inside the model, compared to closed-source models which are a bit more “black box”. Adopting open-source approaches for AI technology can also create more checks and balances, with more developers reviewing and deploying in a community-driven way.
Latency - Open-source models are typically smaller and run on your own dedicated instances which leads to lower end-to-end latencies.
Democratization of AI - Instead of having models be ruled by a few, models can be broadly distributed and understood by many.
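To make the LoRA idea mentioned above concrete, here is a minimal, self-contained sketch of a low-rank adapter in plain NumPy. All sizes and names here are illustrative; a real fine-tuning setup would use a library such as Hugging Face’s peft on an actual model.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16):
    """Forward pass of a linear layer with a LoRA adapter.

    W is the frozen pre-trained weight (d x d); only the small
    low-rank matrices A (d x r) and B (r x d) are trained.
    """
    r = A.shape[1]
    return x @ W + (x @ A @ B) * (alpha / r)

# Illustrative sizes: a 1024x1024 layer adapted with rank r=8.
d, r = 1024, 8
rng = np.random.default_rng(0)
W = rng.normal(size=(d, d))          # frozen: 1,048,576 params
A = rng.normal(size=(d, r)) * 0.01   # trainable
B = np.zeros((r, d))                 # trainable, zero-initialized so
                                     # the adapter starts as a no-op

x = rng.normal(size=(2, d))
# With B at zero, the adapted layer matches the frozen layer exactly.
assert np.allclose(lora_forward(x, W, A, B), x @ W)

trainable = A.size + B.size          # 16,384 params, ~1.6% of W
```

The payoff is in the last line: instead of updating all ~1M weights of the layer, training touches only the two small matrices, which is why LoRA cuts memory and compute so dramatically.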
Cons of Open-Source:
Requires Technical Expertise - Deploying open-source models is more cumbersome today than calling an OpenAI API. Open-source models are most performant when paired with techniques like retrieval-augmented generation (RAG) and fine-tuning, which require more time, understanding, and know-how.
Model Performance - Model performance of open-source models still isn’t as high as that of closed-source models, and they can be more prone to hallucination. Some industry observers make the case that open-source models are still a year or two behind closed-source models. It will take some time for open-source models to become as performant as closed-source models, but we are already starting to see significant improvements here.
Misuse of LLMs - These models can fall into the wrong hands and be used for malicious purposes. While this could technically be argued for both open-source and closed-source models, the open nature of these models can expose sensitive data and proprietary information. We may see more deepfakes, automated misinformation and disinformation, and additional security hacks.
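The RAG technique referenced in the list above can be sketched end to end in a few lines. The sketch below uses a toy bag-of-words retriever purely for illustration; a real pipeline would use an embedding model and a vector store, and the documents and query here are invented.

```python
import math
from collections import Counter

# Toy document store; in practice these would be chunks of your
# proprietary data, indexed with embeddings from a real model.
DOCS = [
    "Vicuna is an open-source chat model fine-tuned from LLaMA.",
    "LoRA reduces the memory needed to fine-tune large models.",
    "RAG retrieves relevant documents and adds them to the prompt.",
]

def bow(text):
    """Bag-of-words vector as a token-count dictionary."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=1):
    """Return the k documents most similar to the query."""
    q = bow(query)
    return sorted(DOCS, key=lambda d: cosine(q, bow(d)), reverse=True)[:k]

def build_prompt(query):
    """Prepend retrieved context to the question before calling an LLM."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

prompt = build_prompt("What does LoRA do?")
```

The resulting `prompt` would then be sent to whichever model you run (open or closed); the value of RAG is that the model answers from your retrieved context rather than from its parameters alone.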
Why Might Open-Source Models Win?
Why would open-source win over closed-source models in the long term?
LLMs become business-critical and a differentiating factor: If you are building an AI-native product, LLMs may be at the core of the business, and the ability to create more performant and accurate models may be the differentiating factor for generative-native applications. Would you trust OpenAI, or other closed-source model providers, to ingest your data and have it be used to re-train their models? Would you want your entire business to be at the mercy of one model provider? If you’re a generative-native business relying on LLMs as your core differentiator, will you want to “own” that LLM at the end of the day and have ultimate control over it? Today we see companies building and training their own LLMs (which requires expensive GPU capacity), but open-source models may be a strong alternative: they come at a much lower cost and alleviate concerns about relying solely on closed-source models.
Open-source models will allow for more efficient and niche use cases to emerge (e.g., Biology or Finance): We believe that open-source machine learning and open science can really accelerate advancements in industries that require a lot of research along with unique and large datasets. Open-source models may provide more decentralized and collaborative research avenues where researchers can discuss and develop together while sharing different methodologies and research sets.
Open-source models are not “black boxes” the way closed-source models are: Developers are able to tweak the weights and parameters of the model as well as fine-tune them for more specific use cases, which today is much harder to do with closed-source models.
We’re already starting to see signs of generative-native AI apps being built on top of open-source models and gaining significant traction: Open-source models are being deployed and implemented across AI applications like RunwayML and Midjourney. Beyond model deployment, companies like LangChain and LlamaIndex have become developers’ open-source haven for building and deploying intelligent and generative apps.
Conclusion
Today, the most popular way to access AI is through closed-source models (like OpenAI or Anthropic). Part of their success has come from developing incredibly powerful models and attaching those to killer consumer use cases (like ChatGPT). The magic lies in the ability for virtually anyone (technical or not) to call OpenAI’s API and immediately get up and running with their latest technology, along with the flexibility of building “out of the box”.
However, the proliferation of powerful open-source models, particularly ones like Llama 2 from Meta, has shown that high performance and low cost can coexist, and has elicited a highly positive response from the developer community. We fully expect that these models will continue to get better over time and ultimately overcome near-term obstacles (such as friction around implementation, ease-of-use issues, and required resources).
To be clear, we believe “open source vs. closed source” is a false dichotomy. These two are not mutually exclusive. Rather, we envision a future where companies, teams, and developers will mix and match which models they use, and employ both open-source and closed-source models depending on the task at hand (such as using GPT-4 for text summaries and Stable Diffusion for image generation).
The open-source software movement of the 1990s and early 2000s helped lead to the rise of massive OSS companies like Red Hat, HashiCorp, Confluent, and MongoDB. We are excited to see which new giants come out of the current open-source AI model movement!
We hope you enjoyed this edition of Aspiring for Intelligence, and we will see you again in two weeks! This is a quickly evolving category, and we welcome any and all feedback around the viewpoints and theses expressed in this newsletter (as well as what you would like us to cover in future writeups). And it goes without saying but if you are building the next great intelligent application and want to chat, drop us a line!