Good morning everyone!
Today, we’re diving into “the death of RAG.”
Many clients told us, “But why would I use RAG if Gemini can process millions of tokens as input?”
So, is RAG dead?
With the rapid advancements in LLMs, and especially their growing context windows (the amount of input text they can process), many people now think RAG is no longer necessary with long-context models. For example, OpenAI’s gpt-4-0314 model, released on March 14th, 2023, could only process up to 8k tokens. Now, gpt-4o can process up to 128k tokens, while gemini-1.5-pro can process up to 2M tokens, roughly 3,000 pages of text!
We'll demystify the differences between RAG and sending all your data directly in the prompt, and explain why we believe RAG will remain relevant for the foreseeable future. This will help you determine whether RAG is suitable for your application.
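To make the contrast concrete, here's a minimal sketch of the two approaches. The toy word-overlap scorer below is just a stand-in for a real embedding-based retriever, and the documents and query are invented for illustration; the point is only to show that RAG sends a small, relevant subset to the model, while the long-context approach sends everything.

```python
# Minimal sketch: "send everything" vs. RAG-style retrieval.
# The word-overlap scorer is a toy stand-in for a real embedding model.

def score(query: str, chunk: str) -> int:
    """Toy relevance score: how many query words appear in the chunk."""
    q_words = {w.strip(".,?!").lower() for w in query.split()}
    return sum(1 for w in chunk.split() if w.strip(".,?!").lower() in q_words)

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """RAG step: keep only the top-k most relevant chunks."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

# Hypothetical document collection.
docs = [
    "Paris is the capital of France.",
    "The Eiffel Tower was completed in 1889.",
    "Quarterly revenue grew 12% year over year.",
    "Employee onboarding takes two weeks.",
]
query = "What is the capital of France?"

# Long-context approach: the prompt contains every document.
long_context_prompt = "\n".join(docs) + "\nQuestion: " + query

# RAG approach: the prompt contains only the retrieved chunks.
rag_prompt = "\n".join(retrieve(query, docs)) + "\nQuestion: " + query

print(f"long-context: {len(long_context_prompt)} chars, "
      f"RAG: {len(rag_prompt)} chars")
```

With a 4-document corpus the savings are trivial, but with millions of tokens of company data, retrieval is what keeps the prompt small, cheap, and focused.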
About RAG…