Good morning fellow AI enthusiast! This week's iteration focuses on how we built an open-source RAG-based ChatGPT Web App.
Let's dive into how to build with ChatGPT, how to introduce memory into LLMs, and the challenges (and our tips) of building RAG-based apps!
If that sounds interesting, you'll also love this completely free report I found online; its authors were kind enough to sponsor this iteration and the video!
Receive the weekly digest right in your emails ⬇️
1️⃣ Check out the AI Infrastructure Alliance’s Guide to Agents, LLMs and Smart Apps (sponsor)
This comprehensive guide covers everything from prompt engineering to major frameworks (LlamaIndex, LangChain, Haystack, and Semantic Kernel), vector databases, fine-tuning, open- and closed-source generative models, legal implications, common app design patterns, LLM logic and reasoning, and more. Last year, chat apps like ChatGPT sparked the current generative AI explosion, but it's apps that do real work for people that will unleash the coming wave of next-gen AI software, and this guide helps you get ready to write or use those apps today.
2️⃣ How we Built an Open-Source RAG-based ChatGPT Web App: Meet Our new AI Tutor!
This is Towards AI's new AI tutor, a question-answering chatbot built to answer anything about LLMs with up-to-date information!
Discover how Retrieval Augmented Generation (RAG) technology can provide precise, real-time answers to your questions with up-to-date information.
p.s. This is a work in progress. Please try it and give us feedback; it is free!
3️⃣ For the more technical among you... Here's how the AI tutor is built!
Our Generative AI retrieval augmented generation (RAG) application was developed by Towards AI Inc, building on top of Buster, an open-source question-answering documentation chatbot. We use OpenAI's ada-002 model for embeddings and the GPT-3.5 Turbo 16k large language model for answering queries. We experimented with several OpenAI models to find the best balance of cost, context length, and capability, and iterated on the implementation, prompts, sourcing, and chunking methods, guided by testing and user feedback, to arrive at our final setup.

We use Activeloop to store and query our data, along with their Deep Memory feature to improve accuracy when querying our embeddings. We currently have over 21,000 chunks of roughly 500 words each, drawn from: our ~100 lessons (including code projects) from the GenAI 360 "LangChain & Vector Databases in Production" and "Train & Fine-Tune LLMs for Production" courses; Towards AI's historic content library of ~4,000 AI tutorials and articles written by our contributor network; AI-related Wikipedia pages; and the Hugging Face Transformers library documentation. Many of these sources cover new models and techniques that did not exist at the knowledge cutoff dates of GPT-3.5 and GPT-4 training.
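To make the chunking step above concrete, here is a minimal sketch of how documents can be split into ~500-word chunks before embedding. This is an illustrative example, not our actual pipeline code: the `chunk_size` and `overlap` parameters, and the `chunk_text` helper itself, are assumptions chosen to mirror the ~500-word chunks described above.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into chunks of roughly `chunk_size` words.

    Consecutive chunks share `overlap` words so that sentences cut at a
    boundary still appear intact in at least one chunk. These parameter
    values are illustrative, not the exact ones used in production.
    """
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already covered the end of the text
    return chunks


# Example: a 1,200-word document yields three overlapping ~500-word chunks.
sample = " ".join(f"word{i}" for i in range(1200))
chunks = chunk_text(sample)
print(len(chunks))  # → 3
```

Each resulting chunk would then be embedded (e.g. with an OpenAI embeddings model) and stored in the vector database; at query time, the question is embedded the same way and the nearest chunks are retrieved and passed to the LLM as context.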
And that's it for this iteration! I'm incredibly grateful that the What's AI newsletter is now read by over 13,000 incredible human beings and counting. Share this iteration with a friend if you learned something new!
Looking for more cool AI stuff? 👇
Looking for AI news, code, learning resources, papers, memes, and more? Follow our newsletter at Towards AI, which goes out weekly!
Looking for other AI enthusiasts? Join my Discord community: Learn AI Together!
If you need more content to go through your week, check out the podcast!
Please reach out with any questions or details on sponsorships, or visit my Passionfroot profile to see the sponsorship offers.
Thank you for reading, and we wish you a fantastic week! Be sure to have enough rest and sleep!
Louis