Add Deep Research to ANY LLM

This might come as a surprise to most, but Deep Research is really just a "party trick". It is very good at increasing a model's accuracy and quality, but it rests entirely on "seeding the model with context". That context is built by running web searches, scraping the pages listed on the search engine results page (SERP), and feeding the scraped content back to the model. Realising there's nothing magical about it, I took the time to implement it on top of "o3-mini", as you can see in the following video.

How it works

The process lets the LLM perform multiple searches, scrape the pages the SERP returns, and then repeat this recursively until it has found all the information it needs to answer the original question. Instead of me explaining it, though, let me show you the system instruction that adds "deep research" capabilities to our system.

## Deep Research

If the user asks you to research some topic then you are to:

1. **Initiate the Research:** Perform 2 to 5 initial searches you believe might return
   relevant information. Inform the user what you’re doing, and what search queries
   you will use. Then respond with a list of search function invocations.
2. **Scrape and Evaluate:** Once search results are returned, choose 2 to 5 relevant
   URLs to scrape from each search. Use the scrape URL function for these URLs and
   analyze the gathered content.
3. **Identify Additional Angles:** Carefully examine the scraped content for new
   keywords, topics, or details that were not covered in the initial search.
4. **Recursive Searching:** If any new topics or gaps are identified, immediately
   perform another round of search and scraping:
    - Inform the user that you are delving deeper into the topic.
    - Execute a new search with refined queries based on the uncovered details.
    - Scrape additional relevant URLs.
5. **Iterate Until Complete:** Continue the cycle of searching, scraping, and
   evaluating until you are confident that:
    - All facets of the topic have been exhaustively researched.
    - No new keywords or topics emerge that would require further research.
6. **Exclusively Use Gathered Context:** Do not answer the user's question until
   all relevant background information has been gathered exclusively from the
   searched and scraped content.
7. **Transparency:** Always keep the user informed about which step of the
   research process you are on before each function invocation.

**Remember:** Your final answer must be solely based on the context and information
acquired through this recursive research process. Do not make assumptions or
introduce external information that was not obtained via the search and scrape functions.
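Read as control flow, the instruction above amounts to a loop of searching, scraping, and re-evaluating. In the real system the model drives that loop itself through function invocations; the following is only a minimal sketch that makes the loop explicit. The names `deep_research`, `propose_queries`, `search`, and `scrape` are hypothetical stand-ins, and the `max_rounds` cap is an added safety assumption that the prompt itself does not contain.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ResearchState:
    """Everything gathered so far for one research question."""
    question: str
    context: list[str] = field(default_factory=list)   # scraped page content
    seen_queries: set[str] = field(default_factory=set)
    seen_urls: set[str] = field(default_factory=set)


def deep_research(
    question: str,
    propose_queries: Callable[[ResearchState], list[str]],  # stands in for the LLM
    search: Callable[[str], list[str]],                     # query -> result URLs
    scrape: Callable[[str], str],                           # URL -> page content
    max_rounds: int = 5,        # assumption: hard cap so the loop always ends
    urls_per_search: int = 3,   # the prompt asks for 2 to 5 URLs per search
) -> ResearchState:
    state = ResearchState(question)
    for _ in range(max_rounds):
        # Steps 1, 3 and 4: ask for new queries based on what is known so far.
        queries = [q for q in propose_queries(state) if q not in state.seen_queries]
        if not queries:
            break  # Step 5: no new angles emerged, so the research is complete.
        for query in queries:
            state.seen_queries.add(query)
            # Step 2: scrape a handful of results from each search.
            for url in search(query)[:urls_per_search]:
                if url in state.seen_urls:
                    continue  # never scrape the same page twice
                state.seen_urls.add(url)
                state.context.append(scrape(url))
    # Step 6: the final answer should be built only from state.context.
    return state
```

The returned `state.context` is what would be handed to the model as the sole basis for its final answer.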

Yup, in case you missed the point, most of the "Deep Research" implementation is prompt engineering. In addition, we've added two AI functions to the type, which are as follows:

  • Searching Google using SERP API
  • Scraping pages and transforming the HTML into Markdown

You can see how to do this in the following screenshot.

[Screenshot: Adding AI function to LLM]
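To give a feel for what the scraping function does, here is a minimal sketch of fetching a page and reducing its HTML to Markdown using only Python's standard library. It is not the implementation used in our system; `MarkdownExtractor`, `html_to_markdown`, and `scrape_url` are hypothetical names, and the converter only handles headings, paragraphs, lists, and links.

```python
import re
import urllib.request
from html.parser import HTMLParser


class MarkdownExtractor(HTMLParser):
    """Tiny HTML-to-Markdown converter: headings, paragraphs, lists, links."""
    SKIP = {"script", "style", "noscript"}
    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []       # output fragments, joined at the end
        self.skip_depth = 0   # > 0 while inside script/style/noscript
        self.href = None      # target of the link currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.HEADINGS:
            self.parts.append("\n\n" + self.HEADINGS[tag])
        elif tag in ("p", "ul", "ol"):
            self.parts.append("\n\n")
        elif tag == "li":
            self.parts.append("\n- ")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            if self.href:
                self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag == "a" and self.href:
            self.parts.append(f"]({self.href})")
            self.href = None

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Collapse whitespace runs but keep single spaces between words.
            self.parts.append(re.sub(r"\s+", " ", data))


def html_to_markdown(html: str) -> str:
    parser = MarkdownExtractor()
    parser.feed(html)
    parser.close()
    lines = [line.strip() for line in "".join(parser.parts).splitlines()]
    out = []
    for line in lines:
        # Keep text lines, and at most one blank line between them.
        if line or (out and out[-1]):
            out.append(line)
    return "\n".join(out).strip()


def scrape_url(url: str, timeout: float = 10.0) -> str:
    """Fetch a page and return its content as Markdown."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return html_to_markdown(html)
```

Converting to Markdown before handing the page to the model drops scripts, styles, and most markup, so far less of the context window is wasted on noise.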

Wrapping up

So Deep Research is basically just a "party trick" that lets the LLM reach out into the world and collect more information whenever it needs it. Even so, it drastically increases the capabilities of the LLM, and it can be applied to any LLM, including DeepSeek-R1.

At AINIRO we can add "deep research" capabilities on top of your existing databases, any APIs you already have, or any other type of information you've got access to, such as CSV files, XML files, PDF files, etc. If you're interested in creating similar AI agents with "Deep Research" capabilities, please reach out to us, and we can probably help you out.

Have a Custom AI Solution

At AINIRO we specialise in delivering custom AI solutions and AI chatbots with AI agent features. If you want to talk to us about how we can help you implement your next custom AI solution, you can reach out to us below.

Thomas Hansen

I am the CEO and Founder of AINIRO.IO, Ltd. I am a software developer with more than 25 years of experience. I write about Machine Learning, AI, and how to help organizations adopt said technologies. You can follow me on LinkedIn if you want to read more of what I write.

Published 5. Feb 2025
