Keyword Search for Any SQL Database

Semantic filtering is really cool, but it assumes you know which columns to filter on. It also makes it difficult to search for multiple keywords and rank records containing all of them first. VSS, for its part, requires embeddings, which are resource intensive to generate, and it also requires complex changes to your existing database and logic. Meet distinct keyword density search.

Distinct keyword density search, however, can be wrapped around any SQL database you've got. It scores records containing more of the specified keywords higher, while still including records containing only one of them. This makes it far superior at finding the "needle in the haystack" when retrieving RAG data to seed your LLM invocation with context. The reason is that VSS will very often not prioritise records containing all keywords, and will often match, for instance, "foo 123" to the query "foo 567", which for some problem domains is simply wrong.
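To make the scoring idea concrete, here is a minimal Python sketch. The article does not spell out the exact formula, so this assumes the score is simply the number of distinct query keywords that appear as whole words in a record. The function names `distinct_keyword_score` and `rank_records` are made up for illustration.

```python
import re


def distinct_keyword_score(text, keywords):
    """Count how many distinct keywords occur as whole words in text."""
    words = set(re.findall(r"\w+", text.lower()))
    # Deduplicate the keywords so repeating one does not inflate the score.
    unique_keywords = {kw.lower() for kw in keywords}
    return sum(1 for kw in unique_keywords if kw in words)


def rank_records(records, keywords):
    """Return (score, record) pairs, best match first, dropping non-matches."""
    scored = [(distinct_keyword_score(rec, keywords), rec) for rec in records]
    matches = [pair for pair in scored if pair[0] > 0]
    # sorted() is stable, so records with equal scores keep their input order.
    return sorted(matches, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    records = ["foo 123 widget", "foo 567 widget", "bar 999"]
    for score, record in rank_records(records, ["foo", "567"]):
        print(score, record)
```

With the query "foo 567", the record containing both keywords is ranked above the one that only contains "foo", and the record containing neither is left out.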

From an AI agent point of view, it's simply superior at finding "the needle in the haystack".

For a Shopify AI chatbot, for instance, if I am searching for "Diesel watch for women", I really do not want to seed my LLM invocation with records matching "Diesel watch for men". VSS, however, will happily score such records very high, resulting in lots of useless context data being sent to the LLM, which it can't use for anything. Now consider a distinct keyword density search with these keywords:

  • watch
  • women
  • diesel

Such a search ensures that records containing the above keywords in any of their text fields score high in the result set, allowing the Shopify AI chatbot's LLM to work with relevant data instead of lots of irrelevant information originating from a VSS distance calculation.
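As a rough illustration of how such a search could be wrapped around an existing table, the sketch below uses Python's built-in `sqlite3` module. It is not the implementation described in this article. The `products` table, its columns, and the `keyword_density_search` helper are hypothetical, and it assumes the score is the count of distinct keywords found in any of the listed text columns.

```python
import sqlite3


def keyword_density_search(conn, table, text_columns, keywords, limit=10):
    """Return rows ranked by how many distinct keywords they contain.

    `table` and `text_columns` are interpolated into the SQL, so they must
    come from trusted configuration, never from user input.
    """
    # Lowercase and deduplicate the keywords while keeping their order.
    keywords = list(dict.fromkeys(kw.lower() for kw in keywords))
    if not keywords:
        return []
    # Join all text columns into one lowercase string so a keyword can
    # match in any of them.
    haystack = " || ' ' || ".join(
        f"lower(coalesce({col}, ''))" for col in text_columns
    )
    # One 0/1 term per keyword; the sum is the number of distinct keywords found.
    score = " + ".join(
        f"CASE WHEN instr({haystack}, ?) > 0 THEN 1 ELSE 0 END" for _ in keywords
    )
    sql = (
        f"SELECT * FROM (SELECT *, ({score}) AS score FROM {table}) "
        "WHERE score > 0 ORDER BY score DESC LIMIT ?"
    )
    return conn.execute(sql, keywords + [limit]).fetchall()


def build_demo_db():
    """Create a tiny in-memory catalogue (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT, description TEXT)"
    )
    conn.executemany(
        "INSERT INTO products (title, description) VALUES (?, ?)",
        [
            ("Diesel watch for women", "Steel strap"),
            ("Diesel watch for men", "Leather strap"),
            ("Fossil handbag", "Gift for women"),
            ("Nike shoes", "Running"),
        ],
    )
    return conn


if __name__ == "__main__":
    db = build_demo_db()
    hits = keyword_density_search(
        db, "products", ["title", "description"], ["watch", "women", "diesel"]
    )
    for row in hits:
        print(row)
```

In this sketch the women's watch matches all three keywords and ranks first, the men's watch matches two and comes second, and the handbag still appears with a single match. Note that `instr` does plain substring matching, so a query for "men" would also match "women"; a real implementation would likely want word-boundary handling.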

CRUD generator

Our CRUD generator does a lot more than simply generate CRUD HTTP endpoints. It also creates aggregate endpoints, distinct endpoints, count endpoints, etc. We're now in the process of adding distinct keyword density search to it as well.

This implies you can use RAG on any SQL database you already have, without importing or changing your existing code or logic in any way. To understand the benefits, let me illustrate how it works in a video for you.

Wrapping up

Distinct keyword density search comes as an addition to our already very high quality VSS database, but for some specific domains it can retrieve much higher quality data to use as context when you invoke the LLM to answer questions. In addition, it can be wrapped around any existing SQL database without requiring modifications.

This makes it easier to consume, in addition to using far fewer resources and requiring zero changes to your existing data structure or logic.

Thomas Hansen

I am the CEO and Founder of AINIRO.IO, Ltd. I am a software developer with more than 25 years of experience. I write about Machine Learning, AI, and how to help organizations adopt said technologies. You can follow me on LinkedIn if you want to read more of what I write.

Published 24. Jan 2025
