Good morning everyone!
Here's the third video of 10 for the "From Beginner to Advanced LLM Developer" course by Towards AI. In this one, we talk about a hotly debated topic: RAG or long context? Many people ask us why we still use RAG when Gemini can take millions of tokens as input.
So... is it over for RAG pipelines? Let's discuss it...
But first, here's a quick word from our sponsor: Ryse!
1️⃣ Missed Out on Shark Tank’s Big Wins? Don’t Miss RYSE (sponsor)
The hit show Shark Tank has introduced the world to some of today’s most successful brands:
Bombas – raised $200M in follow-on investment
Scrub Daddy – over $300M in sales
Ring – valued at $7M on Shark Tank, later acquired by Amazon for $1.2B, after all the Sharks passed!
Now, Dragon’s Den (the international version of Shark Tank) is proving to be another launchpad for promising brands, and RYSE Smart Shades secured not just one, but two offers from the Dragons. For a limited time, you have the chance to invest alongside the Dragons in a brand that could become the next household name.
With their breakthrough smart shade technology and distribution already in 127 Best Buy stores, RYSE is poised to be a big thing in tech!
This is your chance to invest early in a smart home company with big momentum.
Invest in RYSE and learn how to earn bonus shares!
2️⃣ RAG vs. Long Context Models: Is Retrieval-Augmented Generation Dead?
Many people say that RAG is dead now that we see new models coming out with large context windows, like GPT-4o Mini, which can process up to 128,000 input tokens, or, even more strikingly, Gemini 1.5 Pro, which can process 2 million tokens. For context, 2 million tokens is equivalent to roughly 3,000 pages.
So, with better models continuing to emerge in the short and long term, each with a larger context window and greater capabilities, is retrieval-augmented generation obsolete?
I don’t think so. Here's why (or read the article here) 👇
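One core reason RAG stays relevant: instead of stuffing 3,000 pages into every prompt, you retrieve only the few chunks relevant to the question, which keeps prompts short, cheap, and focused. Here's a minimal, self-contained sketch of that retrieval step. It uses toy bag-of-words vectors and cosine similarity purely for illustration; a real pipeline would use an embedding model and a vector store (e.g., via LlamaIndex), and the corpus and query below are made up.

```python
# Toy sketch of the retrieval step in a RAG pipeline (illustrative only).
# Real systems use learned embeddings and a vector database instead of
# bag-of-words counts, but the retrieve-then-generate idea is the same.
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase word counts as a stand-in for an embedding vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query.

    Only these few chunks would be placed in the LLM prompt,
    instead of the entire document collection.
    """
    q = bag_of_words(query)
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(q, bag_of_words(c)),
        reverse=True,
    )
    return ranked[:k]


# A tiny made-up corpus standing in for a large document collection.
chunks = [
    "RAG retrieves only the most relevant documents before generation.",
    "Long context models can ingest millions of tokens in a single prompt.",
    "Retrieval keeps prompts short, which reduces cost and latency.",
    "Unrelated note: the cafeteria menu changes every Tuesday.",
]

top = retrieve("why does retrieval reduce prompt cost?", chunks, k=2)
print(top)
```

Even in this toy version, the irrelevant cafeteria chunk never reaches the model, which is the whole point: a long context window lets you send everything, but retrieval means you don't have to pay (in tokens, dollars, and latency) for everything.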
Follow the YouTube channel to see all the others!
3️⃣ From Beginner to Advanced LLM Developer Course
We did our best to create the most practical and in-depth LLM Developer course out there (~90 lessons) for software developers, machine learning engineers, data scientists, aspiring founders, and AI/Computer Science students.
We’ve gathered everything we’ve learned building products and AI systems and put it into one super practical, industry-focused course. Right now, this means working with Python, OpenAI, Llama 3, Gemini, Perplexity, LlamaIndex, Gradio, and many other amazing tools (we are unaffiliated and will introduce all the best LLM tool options). It also means learning many new non-technical skills and habits unique to the world of LLMs.
Check it out here or look at our book Building LLMs for Production to learn more!
And that's it for this iteration! I'm incredibly grateful that the What's AI newsletter is now read by over 20,000 incredible human beings. Click here to share this iteration with a friend if you learned something new!
Looking for more cool AI stuff? 👇
Looking for AI news, code, learning resources, papers, memes, and more? Follow our weekly newsletter at Towards AI!
Looking to connect with other AI enthusiasts? Join the Discord community: Learn AI Together!
Want to share a product, event or course with my AI community? Reply directly to this email, or visit my Passionfroot profile to see my offers.
Thank you for reading, and I wish you a fantastic week! Be sure to get enough sleep and physical activity next week!
Louis-François Bouchard