Add Deep Research to ANY LLM

This might come as a surprise to most, but Deep Research is essentially just a "party trick". Deep Research is great at increasing a model's accuracy and quality, but it's entirely based upon seeding the model with context. That context is built by performing web searches and scraping the SERP, and the scraped content is then fed back to the model as context. Realising there's nothing magical about it, I took the time to implement it on top of o3-mini.
How it works
The process is based upon allowing the LLM to perform multiple searches, scrape the SERP results, and then continue this process recursively until it has found all the information it needs to answer the original question. However, instead of explaining it in the abstract, let me show you the system instruction that adds "deep research" capabilities to our system; a short sketch of the loop it drives follows the instruction.
## Deep Research
If the user asks you to research some topic then you are to:
1. **Initiate the Research:** Perform 2 to 5 initial searches you believe might return
relevant information. Inform the user what you’re doing, and what search queries
you will use. Then respond with a list of search function invocations.
2. **Scrape and Evaluate:** Once search results are returned, choose 2 to 5 relevant
URLs to scrape from each search. Use the scrape URL function for these URLs and
analyze the gathered content.
3. **Identify Additional Angles:** Carefully examine the scraped content for new
keywords, topics, or details that were not covered in the initial search.
4. **Recursive Searching:** If any new topics or gaps are identified, immediately
perform another round of search and scraping:
- Inform the user that you are delving deeper into the topic.
- Execute a new search with refined queries based on the uncovered details.
- Scrape additional relevant URLs.
5. **Iterate Until Complete:** Continue the cycle of searching, scraping, and
evaluating until you are confident that:
- All facets of the topic have been exhaustively researched.
- No new keywords or topics emerge that would require further research.
6. **Exclusively Use Gathered Context:** Do not answer the user's question until
all relevant background information has been gathered exclusively from the
searched and scraped content.
7. **Transparency:** Always keep the user informed about which step of the
research process you are on before each function invocation.
**Remember:** Your final answer must be solely based on the context and information
acquired through this recursive research process. Do not make assumptions or
introduce external information that was not obtained via the search and scrape functions.
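To make the flow concrete, here is a minimal sketch of the loop this instruction drives, assuming an OpenAI-compatible chat API with function calling. The model name, and the `TOOLS` and `call_tool` helpers (sketched further down), are illustrative assumptions, not a fixed implementation.

```python
# Minimal sketch of the recursive research loop, assuming an
# OpenAI-compatible chat API. TOOLS and call_tool are sketched
# further down; the names here are illustrative only.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def deep_research(question: str, system_instruction: str, tools: list) -> str:
    messages = [
        {"role": "system", "content": system_instruction},
        {"role": "user", "content": question},
    ]
    while True:
        response = client.chat.completions.create(
            model="o3-mini",
            messages=messages,
            tools=tools,
        )
        message = response.choices[0].message
        # No tool calls means the model believes it has gathered
        # enough context and is returning its final answer.
        if not message.tool_calls:
            return message.content
        messages.append(message)
        # Execute every search/scrape invocation and feed the results
        # back, so the model can decide whether to recurse further.
        for call in message.tool_calls:
            result = call_tool(call.function.name,
                               json.loads(call.function.arguments))
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": result,
            })
```

Notice how the recursion lives entirely in the model's behaviour: the loop simply keeps executing whatever searches and scrapes the model asks for until it stops asking.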
Yup, in case you missed the point: most of the "Deep Research" implementation is plain prompt engineering. In addition, we've added two AI functions the model can invoke:
- Search Google using the SERP API
- Scrape pages and transform the HTML into Markdown
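Here is a hedged sketch of how those two functions might look. The SerpApi endpoint, the markdownify library, and the exact schema shapes are my assumptions; any SERP provider and any HTML-to-Markdown converter would work the same way.

```python
# Hedged sketch of the two AI functions. SerpApi and markdownify
# are assumptions; substitute your own SERP provider and converter.
import requests
from markdownify import markdownify as md

SERPAPI_KEY = "your-serpapi-key"  # hypothetical credential

def search_google(query: str) -> str:
    """Search Google through SerpApi and return title/link pairs."""
    response = requests.get("https://serpapi.com/search", params={
        "q": query,
        "api_key": SERPAPI_KEY,
    })
    results = response.json().get("organic_results", [])
    return "\n".join(f"{r['title']} - {r['link']}" for r in results)

def scrape_url(url: str) -> str:
    """Fetch a page and transform its HTML into Markdown."""
    html = requests.get(url, timeout=30).text
    return md(html)

# The JSON schemas the LLM sees, so it knows how to invoke the functions.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "search_google",
            "description": "Search Google and return result titles and URLs.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "scrape_url",
            "description": "Scrape a URL and return its content as Markdown.",
            "parameters": {
                "type": "object",
                "properties": {"url": {"type": "string"}},
                "required": ["url"],
            },
        },
    },
]

def call_tool(name: str, arguments: dict) -> str:
    """Dispatch a tool call from the model to the matching function."""
    if name == "search_google":
        return search_google(arguments["query"])
    if name == "scrape_url":
        return scrape_url(arguments["url"])
    raise ValueError(f"Unknown tool: {name}")
```

With these in place, calling `deep_research(question, system_instruction, TOOLS)` from the earlier sketch runs the whole recursive process end to end.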
Wrapping up
So Deep Research is basically just a "party trick" that allows the LLM to reach out into the world and collect more information as it needs it. However, it drastically increases the capabilities of the LLM, and it can be applied to any LLM, including DeepSeek-R1.
At AINIRO we can add "deep research" capabilities on top of your existing databases and any APIs you already have, or any other type of information you have access to, such as CSV files, XML files, or PDF files. If you're interested in creating similar AI agents with Deep Research capabilities, please reach out to us, and we can probably help you out.
Need a Custom AI Solution?
At AINIRO we specialise in delivering custom AI solutions and AI chatbots with AI agent features. If you want to talk to us about how we can help you implement your next custom AI solution, you can reach out to us below.