AI bots are crawling websites and copying content to feed their language models. Not just headlines or meta descriptions, but whole blog posts, service pages, FAQs, anything they can reach. In most cases this happens without permission, without a visit that registers in your analytics, and without credit. The AI company gets the value. Your site gets nothing.
Cloudflare, which I use to protect and manage all the sites I host, introduced a significant response to this in 2025. If you have not come across Cloudflare before: it sits between your website and the wider internet, handling DNS, security, caching, and firewalls. Every visitor passes through it before reaching your server, which means you can filter traffic, block attacks, and improve speed before anything hits your hosting. The response came in two parts. First, there is now the option to block AI bots by default, covering crawlers from OpenAI and Anthropic, Google’s AI crawler, and several others. Second, Cloudflare has rolled out a pay-per-crawl feature that lets site owners charge AI companies for accessing their content at all. Pay-per-crawl uses the HTTP 402 status code, which in plain terms means “not without payment.” It is still a relatively new model, but it is an important one: it reasserts that the content on your site has value.
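To make the mechanics concrete, here is a minimal sketch of what a crawler sees when pay-per-crawl turns it away. The URL is a placeholder, and the exact headers and body Cloudflare returns will vary; the firm detail is the 402 status code itself.

```python
# Minimal sketch: how a crawler detects a pay-per-crawl refusal.
# The URL is a placeholder; the 402 status code is the signal that matters.
import requests

response = requests.get("https://example.com/blog/some-post")

if response.status_code == 402:
    # 402 Payment Required: the content exists, but access costs money
    print("Crawl refused: payment required")
else:
    print(f"Got {response.status_code} with {len(response.content)} bytes")
```

In other words, the server still answers, but with a price tag rather than the page.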
Cloudflare is not the only one reacting to this. The BBC sent a legal warning to AI startup Perplexity demanding deletion of all scraped BBC content, a record of what had been taken, and payment to cover the misuse. This is not a future policy debate. It is happening now, to real organisations, over real content.
I have updated every site I manage with a strict firewall rule. Unless a bot is on the allow list (things like Googlebot or Bingbot, which serve a legitimate purpose for your search visibility), it is blocked. AI bots, content scrapers, broken SEO tools, and fake browsers are denied at the Cloudflare edge and never reach the website itself. This is not a robots.txt file, which is essentially a polite request that well-behaved bots honour and bad actors simply ignore. It is a proper firewall. The results across the sites I manage have been consistent: spam traffic down, malicious login attempts almost eliminated, and server load reduced. None of it adds plugins or bloat to your WordPress install. It runs in the background and keeps things clean.
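For readers who run their own Cloudflare account, a simplified version of that kind of rule can be written in Cloudflare’s rule expression language. This is a sketch, not my production rule: the user agent list is illustrative rather than exhaustive, and cf.client.bot is Cloudflare’s built-in flag for verified good bots such as Googlebot and Bingbot.

```
(not cf.client.bot and (
  http.user_agent contains "GPTBot" or
  http.user_agent contains "ClaudeBot" or
  http.user_agent contains "CCBot" or
  http.user_agent contains "Bytespider"
))
```

Deployed as a custom firewall rule with the action set to Block, this denies the named AI crawlers at the edge while leaving verified search bots alone. A genuinely default-deny setup like the one described above needs more signals than a user agent list, but the shape of the logic is the same.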
Whether to block AI bots entirely is a decision worth thinking through rather than defaulting either way. For a professional services business, a financial planner, a local tradesperson, or a consultancy, there is a reasonable argument for allowing certain AI crawlers. If someone asks an AI assistant a relevant question and your content is referenced in the response, that can still lead to an enquiry, even without a direct click. For a content-heavy site, a blog, a portfolio, or anything where the writing itself is the value, there is much less reason to allow it. The AI system serves your content directly to the user, with no visit, no pageview, and no credit to you. That is your work being used for someone else’s product.
If you do not know what bots are visiting your site, it is worth finding out. Client logs I have reviewed have shown thousands of hits per day from AI tools and content scrapers that never generate a single real visit. All they do is consume bandwidth and lift content out of its context. Sorting this out properly is something I include as standard in my digital support service, and it is built into my website design process from day one. If you want a rough first look yourself, the sketch below shows the idea.
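A first pass does not need special tooling if you can reach your server’s access log. This is a rough sketch, assuming the common combined log format, where the user agent is the last quoted field on each line; the log path is a placeholder.

```python
# Rough sketch: count user agents in a web server access log.
# Assumes the combined log format, where the user agent is the
# final quoted field on each line. The path is a placeholder.
import re
from collections import Counter

LOG_PATH = "/var/log/nginx/access.log"  # adjust for your server

counts = Counter()
with open(LOG_PATH) as log:
    for line in log:
        quoted = re.findall(r'"([^"]*)"', line)
        if quoted:
            counts[quoted[-1]] += 1  # last quoted field: the user agent

for agent, hits in counts.most_common(15):
    print(f"{hits:6d}  {agent}")
```

Names like GPTBot, ClaudeBot, CCBot, and Bytespider near the top of that list are AI crawlers at work. If you want to know what is crawling your site and what to do about it, get in touch and I will take a look.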