Retrieval Augmented Generation (RAG) for Production with LangChain & LlamaIndex Course
Our FREE advanced RAG course is out!
Good morning, fellow AI enthusiast! This week's iteration focuses on our new course with Towards AI, Activeloop, and the Intel Corporation Disruptor Initiative, with great support from LlamaIndex and Jerry Liu!
We are introducing this new comprehensive course on Retrieval Augmented Generation (RAG).
In this third course of the Gen AI 360 course series, we go beyond basic RAG applications, equipping you with the skills to create more complex, reliable products with tools like LangChain, LlamaIndex, and Deep Memory.
Whether you're a beginner or a seasoned professional, join us to elevate your AI skills! 😎
1️⃣ But what is RAG?
If you are experiencing issues with hallucinations or biases when using a large language model (LLM), you can try retrieval augmented generation, or RAG.
RAG is simple. It is the most straightforward approach to injecting your own content into your interactions with an LLM. Put simply: you need a dataset, and you use it to help the LLM answer user queries. It takes a few steps to make this work, but here is the gist of a RAG-based system:
user query -> search in a database for relevant information -> give query + info to LLM -> answer user
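For illustration, here's a minimal sketch of that flow in Python, assuming the openai package (v1 client, with OPENAI_API_KEY set in your environment) and a placeholder search_database() helper standing in for the retrieval step; the snippets further down flesh out each part:

```python
# A minimal sketch of the RAG loop: search -> query + info -> answer.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def search_database(query: str) -> str:
    # Placeholder retrieval step: swap in a real search over your dataset.
    return "Relevant snippets from your dataset."

def rag_answer(query: str) -> str:
    info = search_database(query)  # search in a database for relevant information
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # any chat model works here
        messages=[{
            "role": "user",
            "content": f"Context:\n{info}\n\nQuestion: {query}",
        }],  # give query + info to the LLM
    )
    return response.choices[0].message.content  # answer the user
```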
Jerry Liu, CEO of LlamaIndex, gave a very interesting take on how to think about RAG in my recent podcast episode with him:
If you think about it, RAG is basically prompt engineering, because you're basically figuring out a way to put context into the prompt. It's just a programmatic way of prompt engineering. It's a way of prompting so that you actually get back some context [from a database of yours].
With RAG, you first need documentation, and you only allow the LLM to answer a question if the answer can be found in that documentation.
This is (currently) the best way to control your outputs and make your model safer and more aligned.
For example, we recently built an AI tutor to answer AI-related questions. We wanted responses that were both accurate (giving the right answer) and relevant (based on up-to-date information). With RAG, you can simply update your database when things change. It's no big deal if the whole PyTorch library had a big update yesterday: scrape it again and update your dataset. Voilà! You don't have to retrain a whole model or wait for GPT-4 to finally update its knowledge cutoff date!
For the most technical of you:
To build a RAG-based chatbot or application like our AI tutor, we start by ingesting all our data into memory. This is done by splitting all our content into chunks of text (fixed or flexible parts, for example, 500-character chunks) and feeding them to an embedding model, like OpenAI’s text-embedding-ada-002. This produces embeddings, which are just vectors of numbers representing your text. They make your life easier by letting you compare pieces of text with each other easily. You can save those vectors in memory. Then, for a new query, such as a question from the user, you embed it with the same model and compare it with all the embeddings in your memory.
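To make that concrete, here's a minimal sketch of the ingestion step, again assuming the openai package (v1 client); the 500-character chunk size and the in-memory list are just illustrative choices:

```python
# Sketch of ingestion: chunk the text, embed each chunk, store the vectors.
from openai import OpenAI

client = OpenAI()

def chunk_text(text: str, size: int = 500) -> list[str]:
    """Split text into fixed-size chunks (500 characters here)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(texts: list[str]) -> list[list[float]]:
    """Turn each piece of text into a vector of numbers."""
    response = client.embeddings.create(
        model="text-embedding-ada-002", input=texts
    )
    return [item.embedding for item in response.data]

documents = ["...your scraped documentation goes here..."]
chunks = [c for doc in documents for c in chunk_text(doc)]
vector_store = list(zip(chunks, embed(chunks)))  # our in-memory "database"
```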
Once the system finds the most similar embeddings, ChatGPT is asked to understand the user’s question and intent and to use only the retrieved sources of knowledge to answer it. This is how RAG reduces hallucination risks and keeps information up to date: you can update your knowledge base as much as you want, and ChatGPT, or your current language model, simply picks information from it to answer.
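Continuing the sketch above (reusing client, embed, and vector_store from it), retrieval boils down to ranking the stored vectors by similarity to the question's embedding and handing the top chunks to the chat model; the prompt wording here is illustrative, not our exact prompt:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Measure how close two embedding vectors are (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(question: str, k: int = 3) -> list[str]:
    """Return the k chunks whose embeddings are most similar to the question's."""
    q_vec = embed([question])[0]
    ranked = sorted(
        vector_store,
        key=lambda pair: cosine_similarity(q_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:k]]

def answer(question: str) -> str:
    """Ask the chat model to answer using only the retrieved sources."""
    sources = "\n\n".join(retrieve(question))
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": (
                "Answer using ONLY the sources below. "
                "If the answer is not in them, say you don't know."
            )},
            {"role": "user", "content": f"Sources:\n{sources}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```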
Plus, it cites all the sources it found for the question so you can dive in and learn more, which is a real bonus when you are trying to learn and understand a new topic!
So the overall process is quite straightforward: we validate the question, ensuring it is related to AI and something our chatbot should answer; query our database to find good, relevant sources; and then use ChatGPT to digest those sources and give a good answer to the student.
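Here's one simple way that validation step could look, sketched with the same client as above; the yes/no prompt is a hypothetical example, not the exact prompt from our tutor:

```python
def is_ai_related(question: str) -> bool:
    """Ask the model to classify whether a question is on-topic before answering."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": (
                "Reply with exactly 'yes' or 'no': "
                "is the user's question about AI or machine learning?"
            )},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content.strip().lower().startswith("yes")
```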
Then, there are still many things to consider, like deciding whether to answer a question at all (is it relevant? is it covered by your documentation?), understanding new terms or acronyms that aren't in ChatGPT’s knowledge base, etc. These are all things we've addressed through different prompting techniques, which you can learn more about in the free advanced RAG course we've built with Towards AI and Activeloop, linked below.
A last alternative to RAG would be to fine-tune your model on your specific task! Basically, you further train a model on your own data to make it more specialized and to ingest your knowledge. I have a video comparing fine-tuning and RAG to show you when to consider each, but in short, RAG stays relevant: it is much cheaper to build, and it is better at reducing (undesired) hallucinations because you force the model to answer from documentation YOU control, rather than from things it ingested during training and will hopefully regurgitate correctly, as with fine-tuned models.
We discuss both topics with in-depth coding examples in our LLM and RAG courses, for those interested.
2️⃣ Introduction video: Retrieval Augmented Generation (RAG) for Production with LangChain & LlamaIndex Course
If you are unsure whether to follow this course or not, check out the short intro video we made for it:
There's also exclusive content in this course, such as unseen videos, discussions, articles, and projects that will give you lots of invaluable insights on building more reliable, powerful, and efficient RAG-based applications!
P.S. We have a special iteration of the newsletter coming with a new video for the course later this week!
3️⃣ More about our course...
Tl;dr: The course shows you everything about RAG and how to build advanced, reliable RAG-based applications, and it is completely free!
Is the course for you?
If you want to learn how to build better RAG-based applications and get more out of the LlamaIndex, LangChain, or OpenAI toolsets, and you have intermediate Python knowledge, you should be all set to take and complete the course.
This course is designed with a wide audience in mind, including beginners in AI, current machine learning engineers, students, and professionals considering a career transition to AI.
We aim to provide you with the tools you need to apply and tailor large language models, blending them with your own datasets across a wide range of industries to make AI more accessible and practical.
And that's it for this iteration! I'm incredibly grateful that the What's AI newsletter is now read by over 14,000 incredible human beings and counting. Share this iteration with a friend if you learned something new!
Looking for more cool AI stuff? 👇
Looking for AI news, code, learning resources, papers, memes, and more? Follow our weekly newsletter at Towards AI!
Looking for other AI enthusiasts? Join my Discord community: Learn AI Together!
If you need more content to go through your week, check out the podcast!
Please reach out with any questions or details on sponsorships, or visit my Passionfroot profile to see my offers.
Thank you for reading, and we wish you a fantastic week! Be sure to have enough rest and sleep!
Louis-François Bouchard