I suspect we already had the best web scraper in the industry for AI and LLMs. However, we have somehow managed to increase its quality 10x as of today. If you want to try it out, you can create a demo chatbot here.
Why bother, you may ask? Well, the problem with AI, and ChatGPT specifically, is that it suffers from the exact same problem as everything else related to computing, which is as follows ...
Garbage in, garbage out!
Unless you can somehow provide super high quality input to it, it will simply return inferior quality back to you.
If you can provide it with super high quality input, it will return super high quality output to you, and it all starts with how your website is scraped. Hence, the quality of the scraper is crucial for creating high quality AI chatbots.
How it works
In the above link we've already explained all the basics, such as how our scraper chops up pages into multiple "training snippets". The new thing in the current release is that we're also able to use DIV elements from your page. In addition, we now create specific training snippets for images, allowing images to be loosely associated with queries. The latter is kind of a big deal, since the images most relevant to a query might not (only) be found where they physically appear on your website.
This allows our chatbot technology to find and display images much more frequently than before, and also typically display more relevant images.
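To make the idea of text snippets and separate image snippets more concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not the actual scraper implementation: the class name `SnippetParser`, the `prompt` and `completion` field names, and the rule "a heading starts a new snippet" are all hypothetical choices made for this example.

```python
from html.parser import HTMLParser


class SnippetParser(HTMLParser):
    """Hypothetical sketch: chops a page into text snippets and image snippets."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.snippets = []        # resulting snippets, as plain dicts
        self._heading = ""        # text of the most recent heading
        self._in_heading = False
        self._skip = False        # True while inside <script> or <style>
        self._buffer = []         # text collected since the last heading

    def _flush(self):
        # Emit the collected text (from P, DIV, etc.) as one text snippet.
        text = " ".join(" ".join(self._buffer).split())
        if text:
            self.snippets.append(
                {"type": "text", "prompt": self._heading, "completion": text}
            )
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._flush()
            self._heading = ""
            self._in_heading = True
        elif tag in self.SKIPPED:
            self._skip = True
        elif tag == "img":
            attributes = dict(attrs)
            if attributes.get("src"):
                # Each image gets its own snippet, keyed by its alt text
                # (or the nearest heading), so it can match queries
                # independently of where it sits on the page.
                self.snippets.append({
                    "type": "image",
                    "prompt": attributes.get("alt") or self._heading,
                    "completion": attributes["src"],
                })

    def handle_endtag(self, tag):
        if tag in self.HEADINGS:
            self._in_heading = False
            self._heading = self._heading.strip()
        elif tag in self.SKIPPED:
            self._skip = False

    def handle_data(self, data):
        if self._skip:
            return
        if self._in_heading:
            self._heading += data
        else:
            self._buffer.append(data)

    def close(self):
        super().close()
        self._flush()


page = """
<h1>Pricing</h1>
<div>Plans start at a fixed monthly fee.</div>
<img src="/img/plans.png" alt="Pricing plans overview">
<h2>Support</h2>
<p>Email support is included.</p>
"""

parser = SnippetParser()
parser.feed(page)
parser.close()
for snippet in parser.snippets:
    print(snippet)
```

Because the image ends up in its own snippet with its own descriptive text, a query about "pricing plans" can match the image even if the question is answered by text from a completely different page.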
In addition to this, we've updated the scraper to tolerate almost anything. As long as your site is not an SPA and is not blocking scrapers, our scraper will somehow be able to extract meaningful content from it. On top of this, our scraping technology will now respect your robots.txt file, and not scrape unless given permission. If you want to prevent our scraper from scraping your website, you can stop it similarly to how you stop OpenAI's scraper, except ours is named AINIRO. In addition to the above, we can now also scrape password protected websites, though this requires a small amount of manual work from our side.
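As a rough illustration of the robots.txt part, the Python sketch below uses the standard library's `urllib.robotparser` to show how a robots.txt file that disallows the `AINIRO` user agent would be interpreted by a well-behaved crawler. The robots.txt content, the `is_allowed` helper, and the `example.com` URL are assumptions made for this example, and the check is done by Python's parser, not by the scraper itself.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt (an assumption for this sketch): block the AINIRO
# user agent everywhere, and allow every other crawler.
ROBOTS_TXT = """\
User-agent: AINIRO
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "AINIRO", "https://example.com/pricing"))        # False
print(is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/pricing"))  # True
```

Replacing `Disallow: /` with a narrower path, such as a single folder, would block only that part of the site for the named user agent.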
Below is a screenshot of our website scraper while working.
New demos for all
To celebrate our new scraper, and to allow everybody to test its quality, we've decided to allow everybody who has previously created a demo chatbot to create one more demo chatbot. In case you tried our demo previously and weren't satisfied, you can now try it again to see the quality difference.
Notice: if you tried to create a demo chatbot previously and it didn't work, it will most likely work now. In fact, as far as we know, the only things our scraper doesn't tolerate are SPA sites and sites explicitly blocking scrapers. However, even when it doesn't work, the new scraper will give you feedback about why it doesn't work, allowing you to fix your site such that we can scrape it later.