Key Takeaways
1. Google has long operated the leading web crawler, but it now faces competition from AI firms such as OpenAI, which have built their own bots to gather internet data.
2. A study by Hostinger found that OpenAI’s GPTBot accessed 4.4 million websites, more than Googlebot, which accessed 3.9 million.
3. Other bots, including those from Ahrefs, Anthropic, Meta, TikTok, Bing, and Apple, collectively make about 1.4 billion requests per day across the 5 million sites studied.
4. Different bots target various sections of the web, allowing them to create a comprehensive map of the internet over time.
5. About 80% of crawler requests originate from U.S. tech firms, indicating that a few major companies control the indexing process and influence the content and answers provided by AI systems.
For many years, Google has been the default way to find information online, and its crawler is often cited as the textbook example of a web crawler. These automated tools explore the web and record what they find, helping search engines make sites easy to find. Google now faces rivals, however, because AI systems also need internet data. AI firms such as OpenAI have therefore built their own bots to scour the web for information.
Study on Website Crawlers
In late August 2025, the web hosting company Hostinger examined how accessible 5 million websites are to crawlers. Notably, OpenAI’s GPTBot reached 4.4 million of these websites, more than Googlebot, which reached only 3.9 million. Other, lesser-known bots were also quite active, including Ahrefs’ SEO crawler, Anthropic’s ClaudeBot, and crawlers from Meta, TikTok, Bing, and Apple. Together, they made roughly 1.4 billion requests per day across the 5 million sites.
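To put these figures in proportion, the short Python sketch below converts them into coverage shares and an average request rate. The constant and function names are illustrative and not part of the study. The sketch also assumes that the 1.4 billion daily requests can be averaged evenly over all 5 million sites, which real traffic almost certainly does not do.

```python
# Figures as reported in the article (Hostinger study, late August 2025).
SITES_STUDIED = 5_000_000
SITES_REACHED = {"GPTBot (OpenAI)": 4_400_000, "Googlebot": 3_900_000}
DAILY_REQUESTS_REPORTED = 1_400_000_000


def coverage_share(reached: int, total: int = SITES_STUDIED) -> float:
    """Fraction of the studied sites that a bot reached."""
    return reached / total


def average_requests_per_site(
    requests: int = DAILY_REQUESTS_REPORTED, total: int = SITES_STUDIED
) -> float:
    """Mean daily requests per site, assuming an even spread (an assumption)."""
    return requests / total


if __name__ == "__main__":
    for bot, reached in SITES_REACHED.items():
        print(f"{bot}: {coverage_share(reached):.0%} of studied sites")
    print(f"Average: {average_requests_per_site():.0f} requests per site per day")
```

Under these assumptions, GPTBot reached 88% of the studied sites and Googlebot 78%, and the reported request volume works out to an average of 280 requests per site per day.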
Coverage of the Internet by Bots
The fact that some bots cover fewer websites than others doesn’t mean they overlook certain areas of the web. Instead, these programs alternate their targets, which allows them to build a nearly complete map of the internet over time, usually in just a few weeks.
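The rotation idea can be illustrated with a toy model. The Python sketch below is not how any real crawler schedules its work. It assumes a hypothetical bot that visits a fixed-size window of site IDs on each pass and starts the next pass where the previous one ended, and it counts how many passes are needed before every site has been seen at least once.

```python
def passes_until_covered(
    total_sites: int, sites_per_pass: int, target_share: float = 1.0
) -> int:
    """Count the passes a rotating crawler needs to see target_share of all sites.

    Hypothetical model: each pass visits a contiguous window of site IDs, and
    the next pass starts where the previous one stopped, wrapping around.
    """
    if total_sites <= 0 or sites_per_pass <= 0:
        raise ValueError("total_sites and sites_per_pass must be positive")

    seen = set()
    start = 0
    passes = 0
    while len(seen) < target_share * total_sites:
        # Visit the next window of sites, wrapping around at the end.
        seen.update((start + i) % total_sites for i in range(sites_per_pass))
        start = (start + sites_per_pass) % total_sites
        passes += 1
    return passes


if __name__ == "__main__":
    # A bot that reaches 78% of the sites per pass (scaled down to 50 sites).
    print(passes_until_covered(total_sites=50, sites_per_pass=39))
    # A bot that reaches only 20% per pass still covers everything eventually.
    print(passes_until_covered(total_sites=50, sites_per_pass=10))
```

In this simplified model, a bot covering 78% of the sites per pass sees all of them after two passes, and one covering 20% per pass needs five. How long a pass takes in practice is not something the model addresses.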
The research also found that about 80% of crawler requests come from tech firms in the United States and around 10% from China, with other countries contributing only a small fraction. Indexing of the internet is therefore largely controlled by U.S. providers, particularly a few major tech companies. As a result, a small group of platforms significantly shapes what content is accessible and what answers AI systems provide.
Source: Hostinger via Presseportal

