Technology

“AI trackers are ruining the Internet”: bots are threatening even Wikipedia


The Internet Is Under Siege — Not by Cybercriminals, but by a Growing Wave of AI Bots Consuming Bandwidth Like Never Before

Their mission: to crawl and collect vast amounts of content — text, images, and video — to train language models and image generators. But the cost of this activity is being borne by key pillars of open knowledge, like Wikimedia, and thousands of open-source projects operating on limited resources.

The Wikimedia Foundation reports that bandwidth used to download multimedia content has grown by 50% since January 2024, with most of the growth hitting its media repository, Wikimedia Commons. With so much capacity already consumed, peak moments, such as the days following the death of former U.S. President Jimmy Carter, led to slow page loads and saturated connections for some readers.

What’s most concerning is that this isn’t due to increased human interest. The majority of this traffic comes from automated bots — many of them unidentified — scraping content to feed AI systems.

According to Wikimedia, roughly 65% of its most resource-intensive traffic, the requests that have to be served from its core data centers, comes from bots. Many of these crawlers ignore basic protocols like the robots.txt file, traditionally used to limit automated access to websites.
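For readers unfamiliar with robots.txt, the following minimal sketch shows how a well-behaved crawler is expected to consult it before fetching a page. It uses Python's standard `urllib.robotparser` module. The rules, the `ExampleAIBot` and `OtherBot` user agents, and the `example.org` URLs are illustrative assumptions, not Wikimedia's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

if __name__ == "__main__":
    for agent in ("ExampleAIBot", "OtherBot"):
        for url in ("https://example.org/wiki/Page",
                    "https://example.org/private/data"):
            print(agent, url, may_fetch(parser, agent, url))
```

The check is purely voluntary: nothing in the file prevents a crawler from skipping it, which is why the bots described here can simply ignore it.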

Wikimedia operates on a “knowledge as a service” model: its content is free and openly reusable — a cornerstone in the development of search engines, voice assistants, and now AI models. But that very openness is starting to work against it.

The situation is even more critical for small open-source projects maintained by communities or individual developers. Many are watching their limited resources being drained by AI bot traffic, causing operating costs to skyrocket — or worse, forcing projects offline altogether.

Gergely Orosz, developer and author of The Software Engineer’s Guidebook, experienced this firsthand: data usage on one of his projects increased sevenfold in a matter of weeks, forcing him to pay penalties for exceeding bandwidth limits.

In response, some developers are going on the offensive. Community-built tools like Nepenthes and corporate solutions like Cloudflare’s AI Labyrinth are deploying “tarpits” — traps filled with fake or irrelevant content (often also AI-generated) designed to confuse and exhaust bots, wasting their resources without providing useful data.
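The idea behind a tarpit can be sketched in a few lines. The example below is a toy illustration, not how Nepenthes or AI Labyrinth are actually implemented: the function names `maze_page` and `drip`, the word list, and all parameters are assumptions made for this sketch. Each page is generated deterministically from its path, contains only filler text, and links to further generated pages, while the response is released slowly in small chunks.

```python
import hashlib
import random
import time
from typing import Iterator

# Filler vocabulary (hypothetical); a real tarpit would vary this far more.
WORDS = ["archive", "index", "notes", "draft",
         "mirror", "catalog", "volume", "ledger"]


def maze_page(path: str, links: int = 5) -> str:
    """Build a meaningless HTML page whose links lead deeper into the maze."""
    # Seed the generator from the path so the same URL always yields the same page.
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    filler = " ".join(rng.choice(WORDS) for _ in range(40))
    base = path.rstrip("/")
    hrefs = [f"{base}/{rng.choice(WORDS)}-{rng.randrange(10**6)}"
             for _ in range(links)]
    anchors = "\n".join(f'<a href="{h}">{h}</a>' for h in hrefs)
    return f"<html><body><p>{filler}</p>\n{anchors}\n</body></html>"


def drip(body: str, chunk_size: int = 64, delay_s: float = 1.0) -> Iterator[str]:
    """Yield the body in small chunks, pausing before each one to waste crawler time."""
    for start in range(0, len(body), chunk_size):
        time.sleep(delay_s)
        yield body[start:start + chunk_size]


if __name__ == "__main__":
    page = maze_page("/maze")
    for chunk in drip(page, delay_s=0.1):
        print(chunk, end="")
```

Because every generated link resolves to another generated page, a crawler that follows them never reaches real content, and the per-chunk delay keeps its connection tied up in the meantime.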

At the heart of this crisis lies a fundamental contradiction: the same openness that enabled AI to flourish is now threatening the survival of the open platforms that made it possible. AI companies benefit from free and open content but contribute little to the infrastructure that sustains it. By shifting these costs onto the platforms they scrape, they put the sustainability of the whole ecosystem at serious risk.

If no new consensus is reached, the greatest threat isn’t that AI will run out of data — it’s that the open spaces feeding it may shut down from exhaustion.

 
