RAG, the Practical Path to AI
When ChatGPT went viral in late 2022, people flocked to AI like crazy. By now we all know about AI hallucinations. One dreaded example is the lawyer who used ChatGPT without verifying its output, and ended up citing 6 non-existent cases in court.
RAG can easily fix this. RAG stands for Retrieval-Augmented Generation, and it allows you to "tilt" the AI in a direction by forcing it to use your data as its "context". If the lawyer in the story above had used RAG, he would have cited real cases instead of made-up ones. The reason is that a RAG database can easily be loaded with all relevant court cases, and as the lawyer asks the AI questions, relevant cases are found in the RAG database and handed to the AI as "context".
RAG almost completely eliminates AI hallucinations
RAG, an "IQ pill" for the AI
In addition to nearly eliminating AI hallucinations, RAG also increases the AI's "IQ". It allows the underlying LLM to learn new facts on the fly, without going through expensive training. Training can easily cost you thousands of dollars; providing RAG-based context to the LLM rarely costs you more than a few cents.
If you imagine the AI as Einstein, a super smart scientist - then you can imagine an AI with RAG as the equivalent of Einstein with a library.
The idea is that even Einstein makes mistakes every now and then. But if Einstein has a high-quality library where he can look up information, he will obviously be able to answer questions more accurately, and even answer questions he otherwise has no idea how to answer.
How RAG Works
The idea behind RAG is that you first create a database of knowledge, a "library" of knowledge. This library is a database of small facts, where each fact is one individual piece of information - such as individual court cases in our lawyer example above.
When you've got your library of knowledge, you use an AI model to create "embeddings". An embedding is a high-dimensional vector in what we computer scientists refer to as a vector space. A normal vector might have 3 dimensions, but an embedding can have thousands of dimensions. You then associate each embedding with its fact.
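To make the idea concrete, here is a minimal sketch of associating facts with embeddings. In a real system the embeddings come from an AI model (such as OpenAI's embeddings endpoint) and have thousands of dimensions; the `embed` function and the tiny vocabulary below are a toy stand-in, purely for illustration:

```python
# Toy stand-in for a real embedding model: each "fact" becomes a vector.
# A real embedding model produces vectors with thousands of dimensions;
# here we simply count word occurrences over a small, fixed vocabulary.

VOCAB = ["court", "case", "contract", "appeal", "ruling"]

def embed(text: str) -> list[float]:
    """Map text to a vector: one dimension per vocabulary word."""
    words = [w.strip(".,") for w in text.lower().split()]
    return [float(words.count(term)) for term in VOCAB]

facts = [
    "The appeal court upheld the ruling in the contract case.",
    "The contract was voided by the ruling.",
]

# Associate each fact with its embedding - this pairing is the RAG database.
database = [(fact, embed(fact)) for fact in facts]
```

Swapping the toy `embed` function for a real embedding model changes nothing about the overall structure: you still end up with a list of (fact, vector) pairs.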
When the AI is asked a question, you create an embedding for the question too. Once you've got embeddings for both your knowledge database and your question, the rest is literally linear algebra: finding the snippets in your knowledge database whose embeddings are closest to the embedding of your question.
From here the rest of the process is basic distance calculations, stuff we teach our kids in high school.
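The distance calculation is typically cosine similarity between the question's embedding and each fact's embedding. A minimal sketch of that search process follows; the vectors are made-up three-dimensional values, just to keep the example readable:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Each fact in the RAG database has a pre-computed embedding (made-up values).
database = [
    ("Case A: contract dispute", [0.9, 0.1, 0.0]),
    ("Case B: traffic violation", [0.1, 0.9, 0.2]),
    ("Case C: contract breach",  [0.8, 0.2, 0.1]),
]

# Embedding of the user's question (also made up for illustration).
question_embedding = [0.85, 0.15, 0.05]

# Rank all facts by similarity to the question, and keep the top matches.
ranked = sorted(
    database,
    key=lambda item: cosine_similarity(question_embedding, item[1]),
    reverse=True,
)
top_snippets = [fact for fact, _ in ranked[:2]]
```

Real RAG databases use specialised vector indexes to avoid comparing against every fact, but the principle is exactly this ranking by similarity.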
Then you extract the top 5 to 10 most relevant snippets from your RAG database and pass these to the AI, instructing it to use only your context to answer the question. Finally you add the question itself to the invocation and tell the AI to answer it.
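Assembling the snippets and the question into one prompt can be sketched as below. The exact instruction wording is an assumption on my part; every system phrases it slightly differently, but the shape is the same:

```python
def build_prompt(snippets: list[str], question: str) -> str:
    """Combine retrieved context and the user's question into one AI prompt."""
    context = "\n\n".join(snippets)
    return (
        "Answer the question using ONLY the context below. "
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_prompt(
    ["Case A: contract dispute, ruled in favour of the plaintiff."],
    "Who won Case A?",
)
```

The "say you do not know" clause is what keeps the AI from falling back on its training data and hallucinating when the context lacks an answer.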
The end result is that the AI will exclusively use information found in your context to answer your questions. In this way RAG actually has a lot in common with traditional web search, because the process of finding relevant context is a search process.
The process is similar to uploading a document to ChatGPT and having it answer questions about the document - except instead of a single document, the AI automatically has access to thousands of "micro documents" from your RAG database.
RAG-based AI Chatbots
At AINIRO we've delivered hundreds of such RAG-based AI chatbots. Some of our deliveries have been publicly available AI chatbots, such as the one in the bottom right corner of our website. Others have been private AI expert systems only accessible to users with the right username and password combination.
This allows us to deliver customised AI chatbots that are experts in one particular field - for instance legal AI chatbots, customer service AI chatbots, AI chatbots for real estate, etc.
The end result is that we can take your existing database of knowledge, couple it with OpenAI's ChatGPT, and deliver a high-quality AI chatbot without AI hallucinations that's 100x better on your particular domain - whatever your domain happens to be.
Wrapping Up
LLMs are amazing. However, they suffer from a lot of problems, such as AI hallucinations. By adding RAG, the AI becomes 100x "smarter". This allows you to use your existing company data as RAG information and couple it with the AI, so that the AI can easily solve your problems, almost regardless of what those problems are.
If you want to try it out for free, you can create a demo AI chatbot here. The demo is of course just a tiny taste of a real RAG-based AI chatbot, but it should give you an idea of what a fully fledged, professional-grade AI chatbot based upon RAG can do.
You will need a website to try it on, such as your company's website. If you do, you can easily create a free demo AI chatbot based upon the information found on your website, and play with RAG for a week without having to pay anything or commit to a professional plan. If you later want to turn it into a permanent AI chatbot, our prices start at $29 per month.
Need a Custom AI Solution?
At AINIRO we specialise in delivering custom AI solutions and AI chatbots with AI agent features. If you want to talk to us about how we can help you implement your next custom AI solution, you can reach out to us below.