AI Chatbots, to Hallucinate or Not to Hallucinate

With a high-quality RAG database you can almost completely eliminate AI hallucinations. This is a really big deal for a customer service AI chatbot, since you don't want it to make up facts about products, opening hours, or services. A couple of months ago, an airline in Canada was actually sued because of false information provided by its chatbot.

FYI, this was not an AINIRO chatbot 😏

AI Hallucinations

AI hallucinations are always a bad thing. They originate from the fact that the AI model doesn't know everything - but when it's asked a question it doesn't know the answer to, it will do its best to "guess" the answer. It's similar to how the autocomplete on your phone sometimes suggests a ridiculous word.

Avoiding hallucinations in an AI chatbot is actually very easy. Assuming we've got high-quality training data, we simply create embeddings for our training snippets, match each question against these snippets, and provide the matches as context together with an instruction to the LLM such as the following.

You MUST answer all my questions with information found in the context. If you cannot find the answer to the question in the context, inform me that you don't know the answer, and encourage me to provide some keywords and stay on the subject you're an expert in.
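To make this concrete, here is a minimal Python sketch of the flow, assuming the OpenAI Python SDK and a small in-memory list of training snippets. The snippet texts, model names, and helper functions are illustrative assumptions, not AINIRO's actual implementation.

```python
# Minimal RAG sketch: embed snippets, match the question, and send the
# strict "no hallucinations" instruction plus the matched context to the LLM.
# Assumes the OpenAI Python SDK (v1.x); snippets and model names are examples.
from openai import OpenAI
import numpy as np

client = OpenAI()

STRICT_INSTRUCTION = (
    "You MUST answer all my questions with information found in the context. "
    "If you cannot find the answer to the question in the context, inform me "
    "that you don't know the answer, and encourage me to provide some keywords "
    "and stay on the subject you're an expert in."
)

# Hypothetical training data; in practice this comes from your RAG database.
snippets = [
    {"text": "Opening hours: Monday to Friday, 09:00-17:00 CET.", "embedding": None},
    {"text": "Refunds are processed within 14 days of receiving a returned item.", "embedding": None},
]

def embed(text: str) -> np.ndarray:
    response = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(response.data[0].embedding)

# Embed the snippets once, up front.
for snippet in snippets:
    snippet["embedding"] = embed(snippet["text"])

def answer(question: str, top_k: int = 2) -> str:
    # Match the question against the snippets using cosine similarity.
    q = embed(question)
    scored = sorted(
        snippets,
        key=lambda s: float(
            np.dot(q, s["embedding"])
            / (np.linalg.norm(q) * np.linalg.norm(s["embedding"]))
        ),
        reverse=True,
    )
    context = "\n\n".join(s["text"] for s in scored[:top_k])

    # The matched snippets become the "context" the instruction refers to.
    completion = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": f"{STRICT_INSTRUCTION}\n\nContext:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return completion.choices[0].message.content

print(answer("When are you open?"))
```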

That's it, no more hallucinations. If you want automatic transitions to human service agents, you can add another rule.

If you don't know the answer to a question, encourage me to leave my email and name in the chatbot prompt, at which point one of your human colleagues will come back to me.

The last part allows for transitioning the entire conversation from the AI chatbot to a customer support email inbox, at which point a human support engineer can take over where the chatbot left off.
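One way to implement that handoff is to watch the user's messages for an email address and forward the transcript to a support inbox when one appears. The sketch below is only an illustration using Python's standard library; the addresses and the SMTP relay are assumptions, not AINIRO's implementation.

```python
# Sketch of a human handoff: if the user leaves an email address, forward
# the conversation transcript to a (hypothetical) support inbox.
import re
import smtplib
from email.message import EmailMessage

EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def maybe_hand_off(user_message: str, transcript: list[str]) -> bool:
    """Returns True if the conversation was forwarded to a human."""
    match = EMAIL_PATTERN.search(user_message)
    if not match:
        return False

    msg = EmailMessage()
    msg["Subject"] = "Chatbot conversation needs a human"
    msg["From"] = "chatbot@example.com"      # hypothetical sender address
    msg["To"] = "support@example.com"        # hypothetical support inbox
    msg["Reply-To"] = match.group(0)         # so the agent replies to the customer
    msg.set_content("\n".join(transcript))

    with smtplib.SMTP("localhost") as smtp:  # assumes a local mail relay
        smtp.send_message(msg)
    return True
```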

AI leaking

Sometimes the above is not optimal. We've got a partner, a startup in the shipping business. They've got only 6 pages on their website, yet they operate in a complex industry with a lot of technical knowledge - knowledge that GPT4 already knows well.

This allows us to turn off the hallucination guard rails, such that if the context doesn't explicitly provide an answer, GPT4's base knowledge takes over and the AI chatbot can still answer the question. This is accomplished by replacing the above instruction with something resembling the following.

You should answer all my questions with information found in the context as long as the context contains relevant information related to my question. If the context does not contain relevant information, then answer my question to the best of your abilities.

To avoid having people use the AI chatbot for completely unrelated questions, you can add one additional rule.

If I ask you about anything not related to shipping, then inform me that you're not configured to answer general questions and that you're only configured to answer questions about shipping and similar subjects.

The above two instructions result in an AI chatbot that "knows everything" about one particular subject, yet still will not be helpful if you ask it for dinner suggestions. That last point is a big deal for avoiding abuse.
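Wiring this up only requires swapping the system message; retrieval and the completion call stay the same as in the earlier sketch (the code below reuses the `client` object from there). The model name and function name are again assumptions for illustration.

```python
# The "leaky" variant: same retrieval, same completion call, different instruction.
LEAKY_INSTRUCTION = (
    "You should answer all my questions with information found in the context "
    "as long as the context contains relevant information related to my question. "
    "If the context does not contain relevant information, then answer my question "
    "to the best of your abilities. "
    "If I ask you about anything not related to shipping, then inform me that "
    "you're not configured to answer general questions and that you're only "
    "configured to answer questions about shipping and similar subjects."
)

def answer_leaky(question: str, context: str) -> str:
    completion = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": f"{LEAKY_INSTRUCTION}\n\nContext:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return completion.choices[0].message.content
```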

Wrapping up

Sometimes allowing your AI chatbot to "leak" back to the base model is a good thing. For these cases, you can use the above leaky instruction. Other times you do not want the chatbot to go outside its training data at all. For those cases, you can turn off leaking completely with the first instruction example above.

How you want to configure your AI chatbot depends upon a lot of different factors, such as the size of your training data, how well known your problem domain is, and how severe an AI hallucination would be.

If in doubt, our suggestion is to start out by turning off AI hallucinations and avoiding leaking - and only then try leaking if the results are not optimal. Just be careful, since if you allow the chatbot to leak, you do run the risk of AI hallucinations.

For the Canadian airline, not turning off AI hallucinations with a system message was (obviously) madness. For others, allowing leaking might be the right thing to do. Just keep in mind that there are pros and cons to both approaches.

FYI, we would never allow one of our airline customers to turn on leaking in their customer service AI chatbot - Just sayin' ... 😉

Have a Custom AI Solution

At AINIRO we specialise in delivering custom AI solutions and AI chatbots with AI agent features. If you want to talk to us about how we can help you implement your next custom AI solution, you can reach out to us below.

Thomas Hansen

I am the CEO and Founder of AINIRO.IO, Ltd. I am a software developer with more than 25 years of experience. I write about Machine Learning, AI, and how to help organizations adopt said technologies. You can follow me on LinkedIn if you want to read more of what I write.

Published 14. Apr 2024
