Add Deep Research to ANY LLM

This might come as a surprise to most, but Deep Research is essentially just a "party trick". Deep Research is great at increasing a model's accuracy and quality, but it's entirely based upon seeding the model with context. That context is built by performing web searches and scraping the SERP, and the scraped content is then fed back to the model as context. Realising there's nothing magical about it, I took the time to implement it on top of o3-mini.
How it works
The process is based upon allowing the LLM to perform multiple searches, scrape the SERP results, and then continue this process recursively until it has found all the information it needs to answer the original question. However, instead of explaining it in the abstract, let me show you the system instruction that adds "deep research" capabilities to our system; a short sketch of the loop it drives follows the instruction.
## Deep Research
If the user asks you to research some topic then you are to:
1. **Initiate the Research:** Perform 2 to 5 initial searches you believe might return
relevant information. Inform the user what you’re doing, and what search queries
you will use. Then respond with a list of search function invocations.
2. **Scrape and Evaluate:** Once search results are returned, choose 2 to 5 relevant
URLs to scrape from each search. Use the scrape URL function for these URLs and
analyze the gathered content.
3. **Identify Additional Angles:** Carefully examine the scraped content for new
keywords, topics, or details that were not covered in the initial search.
4. **Recursive Searching:** If any new topics or gaps are identified, immediately
perform another round of search and scraping:
- Inform the user that you are delving deeper into the topic.
- Execute a new search with refined queries based on the uncovered details.
- Scrape additional relevant URLs.
5. **Iterate Until Complete:** Continue the cycle of searching, scraping, and
evaluating until you are confident that:
- All facets of the topic have been exhaustively researched.
- No new keywords or topics emerge that would require further research.
6. **Exclusively Use Gathered Context:** Do not answer the user's question until
all relevant background information has been gathered exclusively from the
searched and scraped content.
7. **Transparency:** Always keep the user informed about which step of the
research process you are on before each function invocation.
**Remember:** Your final answer must be solely based on the context and information
acquired through this recursive research process. Do not make assumptions or
introduce external information that was not obtained via the search and scrape functions.
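To make the flow concrete, here is a minimal sketch of the loop this instruction drives, assuming an OpenAI-compatible chat API with function calling. The model name, and the `TOOLS` and `call_tool` helpers (sketched further down), are illustrative assumptions, not a fixed implementation.

```python
# Minimal sketch of the recursive research loop, assuming an
# OpenAI-compatible chat API. TOOLS and call_tool are sketched
# further down; the names here are illustrative only.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def deep_research(question: str, system_instruction: str, tools: list) -> str:
    messages = [
        {"role": "system", "content": system_instruction},
        {"role": "user", "content": question},
    ]
    while True:
        response = client.chat.completions.create(
            model="o3-mini",
            messages=messages,
            tools=tools,
        )
        message = response.choices[0].message
        # No tool calls means the model believes it has gathered
        # enough context and is returning its final answer.
        if not message.tool_calls:
            return message.content
        messages.append(message)
        # Execute every search/scrape invocation and feed the results
        # back, so the model can decide whether to recurse further.
        for call in message.tool_calls:
            result = call_tool(call.function.name,
                               json.loads(call.function.arguments))
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": result,
            })
```

Notice how the recursion lives entirely in the model's behaviour: the loop simply keeps executing whatever searches and scrapes the model asks for until it stops asking.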
Yup, in case you missed the point: most of the "Deep Research" implementation is plain prompt engineering. In addition, we've added two AI functions the model can invoke:
- Search Google using the SERP API
- Scrape pages and transform the HTML into Markdown
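Here is a hedged sketch of how those two functions might look. The SerpApi endpoint, the markdownify library, and the exact schema shapes are my assumptions; any SERP provider and any HTML-to-Markdown converter would work the same way.

```python
# Hedged sketch of the two AI functions. SerpApi and markdownify
# are assumptions; substitute your own SERP provider and converter.
import requests
from markdownify import markdownify as md

SERPAPI_KEY = "your-serpapi-key"  # hypothetical credential

def search_google(query: str) -> str:
    """Search Google through SerpApi and return title/link pairs."""
    response = requests.get("https://serpapi.com/search", params={
        "q": query,
        "api_key": SERPAPI_KEY,
    })
    results = response.json().get("organic_results", [])
    return "\n".join(f"{r['title']} - {r['link']}" for r in results)

def scrape_url(url: str) -> str:
    """Fetch a page and transform its HTML into Markdown."""
    html = requests.get(url, timeout=30).text
    return md(html)

# The JSON schemas the LLM sees, so it knows how to invoke the functions.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "search_google",
            "description": "Search Google and return result titles and URLs.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "scrape_url",
            "description": "Scrape a URL and return its content as Markdown.",
            "parameters": {
                "type": "object",
                "properties": {"url": {"type": "string"}},
                "required": ["url"],
            },
        },
    },
]

def call_tool(name: str, arguments: dict) -> str:
    """Dispatch a tool call from the model to the matching function."""
    if name == "search_google":
        return search_google(arguments["query"])
    if name == "scrape_url":
        return scrape_url(arguments["url"])
    raise ValueError(f"Unknown tool: {name}")
```

With these in place, calling `deep_research(question, system_instruction, TOOLS)` from the earlier sketch runs the whole recursive process end to end.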
Wrapping up
So Deep Research is basically just a "party trick" that allows the LLM to reach out into the world and collect more information as it needs it. However, it drastically increases the capabilities of the LLM, and it can be applied to any LLM, including DeepSeek-R1.
At AINIRO we can add "deep research" capabilities on top of your existing databases and any APIs you already have, or any other type of information you have access to, such as CSV files, XML files, or PDF files. If you're interested in creating similar AI agents with Deep Research capabilities, please reach out to us, and we can probably help you out.
Need a Custom AI Solution?
At AINIRO we specialise in delivering custom AI solutions and AI chatbots with AI agent features. If you want to talk to us about how we can help you implement your next custom AI solution, you can reach out to us below.