
“AI crawlers are ruining the Internet”: bots are threatening even Wikipedia


The Internet is under siege, not by cybercriminals but by a growing wave of AI bots consuming bandwidth at an unprecedented rate.

Their mission: to crawl and collect vast amounts of content — text, images, and video — to train language models and image generators. But the cost of this activity is being borne by key pillars of open knowledge, like Wikimedia, and thousands of open-source projects operating on limited resources.

Since early 2024, the Wikimedia Foundation has reported a 50% increase in the bandwidth used to download multimedia content, much of it from its media repository, Wikimedia Commons. During traffic peaks, such as the days following the death of former U.S. President Jimmy Carter, the added load saturated some network connections and slowed page loads for readers.

What’s most concerning is that this isn’t due to increased human interest. The majority of this traffic comes from automated bots — many of them unidentified — scraping content to feed AI systems.

According to Wikimedia, roughly 65% of the most resource-intensive traffic reaching its core data centers comes from bots, many of which ignore basic conventions such as the robots.txt file, which sites have traditionally used to tell automated clients which pages they may and may not fetch.
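To make the robots.txt convention concrete, here is a minimal sketch of the check a well-behaved crawler performs before requesting a page, using Python's standard `urllib.robotparser` module. The rules, the bot name `ExampleAIBot`, and the `example.org` URLs are hypothetical and are not taken from Wikimedia's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the bot name and paths are illustrative only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given rules permit this user agent to fetch the URL."""
    parser = RobotFileParser()
    # parse() takes the file's lines; a real crawler would download
    # robots.txt from the site first (set_url() followed by read()).
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed("ExampleAIBot", "https://example.org/wiki/Page"))
    print(is_allowed("SomeBrowser", "https://example.org/wiki/Page"))
```

Compliance is voluntary: nothing in the protocol stops a crawler from skipping this check, which is the behavior described above.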

Wikimedia operates on a “knowledge as a service” model: its content is free and openly reusable — a cornerstone in the development of search engines, voice assistants, and now AI models. But that very openness is starting to work against it.

The situation is even more critical for small open-source projects maintained by communities or individual developers. Many are watching their limited resources being drained by AI bot traffic, causing operating costs to skyrocket — or worse, forcing projects offline altogether.

Gergely Orosz, developer and author of The Software Engineer’s Guidebook, experienced this firsthand: data usage on one of his projects increased sevenfold in a matter of weeks, forcing him to pay penalties for exceeding bandwidth limits.

In response, some developers are going on the offensive. Community-built tools like Nepenthes and corporate solutions like Cloudflare’s AI Labyrinth are deploying “tarpits” — traps filled with fake or irrelevant content (often also AI-generated) designed to confuse and exhaust bots, wasting their resources without providing useful data.
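The general idea behind a tarpit can be sketched in a few lines. The example below is purely illustrative and does not reflect how Nepenthes or AI Labyrinth is actually implemented: a hypothetical `tarpit_page` function builds a page of filler text whose links all lead to more generated pages, so a crawler that follows them never reaches real content. The word list and the `/maze/` path prefix are assumptions made for the example.

```python
import hashlib
import random

# Illustrative filler vocabulary; a real tarpit would use far richer text.
WORDS = ["archive", "river", "signal", "lantern", "method", "garden", "index", "copper"]


def tarpit_page(path: str, n_links: int = 5, n_words: int = 40) -> str:
    """Build a deterministic page of filler text and links for a request path."""
    # Seed the generator from the path so the same URL always yields the same page.
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    text = " ".join(rng.choice(WORDS) for _ in range(n_words))
    # Every link points to another generated page, so the maze never ends.
    links = "".join(
        f'<a href="/maze/{rng.getrandbits(32):08x}">more</a>\n' for _ in range(n_links)
    )
    return f"<html><body><p>{text}</p>\n{links}</body></html>"


if __name__ == "__main__":
    print(tarpit_page("/maze/start"))
```

This sketch omits the HTTP server and any deliberate response delay, both of which a working tarpit would need in order to tie up a bot's time.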

At the heart of this crisis lies a fundamental contradiction: the same openness that enabled AI to flourish is now threatening the survival of the open platforms that made it possible. AI companies benefit from free and open content, but do not contribute to the infrastructure that sustains it. Shifting these costs onto the platforms puts the sustainability of the whole ecosystem at serious risk.

If no new consensus is reached, the greatest threat isn’t that AI will run out of data — it’s that the open spaces feeding it may shut down from exhaustion.

 
