Block AI Crawlers, Save Tears: Ditching Vercel for Cloudflare Pages
Have you tried self-hosting with Next.js? What was your experience like?
Cloudflare recently announced their decision to block AI crawlers for customers who host with them and offer pay-per-crawl in private beta, which is quite exciting given the declining usability and relevance of Google search.
As a small gesture of meaningless resistance under political and economic despair, I’m migrating my Next.js blog from Vercel to Cloudflare Pages.
I want my blog to be discovered and read by humans!
In principle, listing the bots you want to disallow in a robots.txt file is a first line of defense, but many crawlers don’t adhere to it.1 Last year, Perplexity was found to be using crawlers that spoof the user agent, so that their bots appear to be a human browsing the site.2
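For reference, a robots.txt that disallows specific crawlers is just a series of User-agent and Disallow pairs. Here is a minimal sketch of generating one. The bot names are illustrative examples, and `buildRobotsTxt` is a hypothetical helper of my own, not something provided by Cloudflare or Next.js.

```javascript
// Illustrative crawler user agents; extend with whichever bots you want to block.
const aiUserAgents = ["GPTBot", "CCBot", "ClaudeBot", "PerplexityBot"];

// Hypothetical helper: builds the robots.txt text for a list of blocked agents.
function buildRobotsTxt(blockedAgents) {
  // One "User-agent / Disallow" group per blocked crawler.
  const blocked = blockedAgents.map(
    (agent) => `User-agent: ${agent}\nDisallow: /`
  );
  // Everyone else is still allowed to crawl the whole site.
  const everyoneElse = "User-agent: *\nAllow: /";
  // Groups are separated by a blank line.
  return [...blocked, everyoneElse].join("\n\n") + "\n";
}

console.log(buildRobotsTxt(aiUserAgents));
```

This only states a preference. As noted above, a crawler that ignores robots.txt or spoofs its user agent will not be stopped by it.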
Even on a free plan with Workers or Pages, Cloudflare has multiple features that make it easy to block and monitor AI crawlers:
- the option to auto-generate a robots.txt file that covers the popular AI bots
- a dashboard that allows you to monitor AI crawler requests and block them
- an AI audit feature (currently in beta) that allows you to manage and monitor AI crawlers and register verified bots
It's been 2 days. Interesting to see what hits.
Next.js is fun and sexy when you work with boring technologies
Next.js is intoxicating. I was hooked on it up to version 12. At the time, no other meta-framework gave you the same control to alternate between static site generation, server-side rendering and client-side rendering, all within the same application. It was (and still is) quite easy to keep up with. (...Until they started confusing newcomers and experienced folks alike with the pages router and app router ... shakes fist)
Like Heroku, Vercel (previously Now) was a gateway platform for getting into CI/CD. Both Next and Vercel built a wonderful reputation in the community for Scandinavian web design, comprehensive documentation and, basically, the ability to go to market without worrying about infrastructure.
It also happens to get more expensive as you scale, especially if you want your site highly available in different regions and have high traffic.3
Check out these Reddit threads 🍿
- r/nextjs: "Vercel pricing 20m requests/month"
- r/devops: "Why use Vercel when you can go direct AWS ? especially for small projects"
- r/nextjs: "Vercel pro pricing is fubar"
- r/nextjs: "When does Vercel become expensive?"
I had heard about the proverbial hell of deploying Next.js outside Vercel. OpenNext was started in response to that pain, as a vendor-backed community project to adapt Next.js to Cloudflare, AWS and Netlify. I had to try it for myself.
I wanted to see how easy it would be to deploy my mostly static Next.js blog off Vercel, and I wasn't entirely disappointed – if you're patient with getting past the phantom build errors, prerendering issues and remembering which Node.js APIs are supported by Cloudflare.
Gotchas
I followed this guide for deploying an existing Next.js site to Cloudflare Pages, but ran into enough oddities that it took 6 hours to get to a manual deploy from the CLI, and another 6 to automate on GitHub Actions. And yes, I was troubleshooting with AI.
Bring your own domain
For a while, I had set up my DNS to proxy requests to Vercel's IP addresses (see GitHub Issue #7739). But if you intend to use Cloudflare for edge computing, analytics or monitoring, then you might get the most benefit out of transferring your DNS to Cloudflare.
I got mine for $15 CAD/year. Goodbye Dreamhost 😤!
Update your Content Security Policy
You'll need to update your Content Security Policy to allow Cloudflare to run analytics and track real user data.
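As a rough sketch of what that update can look like, the snippet below merges the extra sources into an existing policy and serializes the header value. The baseline directives are hypothetical, and the two Cloudflare hostnames are my assumption of what Web Analytics needs; check them against Cloudflare's documentation before relying on them.

```javascript
// Hypothetical baseline policy; replace with the directives your site already uses.
const baseDirectives = {
  "default-src": ["'self'"],
  "script-src": ["'self'"],
  "connect-src": ["'self'"],
};

// Assumed hosts for Cloudflare Web Analytics (beacon script and reporting endpoint).
const cloudflareAnalytics = {
  "script-src": ["https://static.cloudflareinsights.com"],
  "connect-src": ["https://cloudflareinsights.com"],
};

// Merge extra sources into a copy of the base policy, directive by directive.
function mergeDirectives(base, extra) {
  const merged = {};
  for (const [name, sources] of Object.entries(base)) {
    merged[name] = [...sources];
  }
  for (const [name, sources] of Object.entries(extra)) {
    // A Set drops duplicates if a source is already listed.
    merged[name] = [...new Set([...(merged[name] ?? []), ...sources])];
  }
  return merged;
}

// Serialize to the header format: "directive src src; directive src".
function toHeaderValue(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(" ")}`)
    .join("; ");
}

const cspHeader = toHeaderValue(
  mergeDirectives(baseDirectives, cloudflareAnalytics)
);
console.log(cspHeader);
```

The resulting string is what you would set as the value of the Content-Security-Policy header, wherever your site currently defines it.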
Prevent linters from corrupting your wrangler.jsonc with a dangling comma
cough, prettier, cough
Something in my boilerplate repo was causing the linter to add a dangling comma before the closing bracket of my wrangler.jsonc config file.
Your build will fail when it reaches Cloudflare if you have a dangling comma.
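If Prettier is the culprit, one possible way to stop it is a per-file override in the Prettier config. This is a sketch under that assumption: the project-wide `trailingComma` value shown is hypothetical, and only the `overrides` entry matters here.

```javascript
// Sketch of a Prettier config object with a per-file override (assumed setup).
const prettierConfig = {
  // Hypothetical project-wide setting that adds trailing commas everywhere.
  trailingComma: "all",
  overrides: [
    {
      // Exempt only the Wrangler config from trailing commas.
      files: ["wrangler.jsonc"],
      options: { trailingComma: "none" },
    },
  ],
};

// Export this object from your Prettier config file, for example with
// `export default prettierConfig` in an ESM project.
console.log(JSON.stringify(prettierConfig.overrides, null, 2));
```

If a different tool is rewriting the file, the same idea applies: exclude wrangler.jsonc from that tool or give it a file-specific rule.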
Route-based rendering adjustments
If you have any server functions or SSR-ed pages, routes that use these functions must target the edge runtime with:
export const runtime = 'edge'
Build Time Noise
Every half year or so I have to maintain this Next.js blog and I'll find some new build error after pulling upstream from the template repo. I don't know if it's because I'm using a template repo with a lot of dependencies, or if modern front end frameworks are less stable than a Jekyll site might be.
Prerendering errors may happen on views containing server components or dynamic routes that are statically generated but rely on server functions to populate data that would appear on the page.
19:34:24.622 ⚡️ Invalid prerender config for /blog/[...slug]
19:34:24.622 ⚡️ Invalid prerender config for /blog/[...slug].rsc
19:34:24.690 ⚡️ Invalid prerender config for /tags/[tag]/page/[page]
19:34:24.691 ⚡️ Invalid prerender config for /tags/[tag]/page/[page].rsc
19:34:24.691 ⚡️ Invalid prerender config for /tags/[tag]
19:34:24.691 ⚡️ Invalid prerender config for /tags/[tag].rsc
This was ultimately benign but annoying.
ESM conversion
I had to update my eslint, tailwind, and prettier configs and the package.json type property to support ESM imports and default exports. Then I renamed next.config.js to next.config.mjs and switched to ESM named imports instead of CommonJS' require.
If your Next plugins haven't been imported as ES modules, you'll have to convert them to ESM. Unlike the CommonJS version of next.config.js, there is no module.exports.plugins property, and you have to use each plugin as a higher-order function that takes the config object as an argument.
Before
// CommonJS: the plugin is pulled in with require
const { withContentlayer } = require('next-contentlayer2')
module.exports = withContentlayer({
  plugins: [withContentlayer, withBundleAnalyzer]
})
After
import { createContentlayerPlugin } from 'next-contentlayer2'

// Use createContentlayerPlugin instead of withContentlayer
const withContentlayer = createContentlayerPlugin({
  configPath: './contentlayer.config.ts',
})

const config = { /* ... */ }

// Wrap the config with the plugin and make the result the default export
export default withContentlayer(config)
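When more than one plugin is involved, the same higher-order-function idea can be chained by folding the config through each plugin in turn. The sketch below uses two made-up plugins (`withExampleHeaders` and `withExampleStrictMode`) and a hypothetical `composePlugins` helper to stand in for real ones such as the Contentlayer plugin.

```javascript
// Hypothetical plugins: each takes a Next.js config object and returns a new one.
const withExampleHeaders = (config) => ({ ...config, poweredByHeader: false });
const withExampleStrictMode = (config) => ({ ...config, reactStrictMode: true });

// Hypothetical helper: applies the plugins left to right, passing each
// plugin the config returned by the previous one.
function composePlugins(plugins, baseConfig) {
  return plugins.reduce((config, plugin) => plugin(config), baseConfig);
}

// Illustrative base config; yours will have its own options.
const baseConfig = { pageExtensions: ["ts", "tsx", "mdx"] };

const finalConfig = composePlugins(
  [withExampleHeaders, withExampleStrictMode],
  baseConfig
);

// In next.config.mjs this object would be the default export.
console.log(finalConfig);
```

Because each plugin returns a fresh object, the base config is left untouched and the order of the array is the order in which plugins wrap it.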
Add Cloudflare's GitHub App Integration
The GitHub app ["Cloudflare Workers and Pages"](https://github.com/apps/cloudflare-workers-and-pages) can be installed to enable automatic builds after pushing to main.
I had initially set up a GitHub Actions workflow with the CLOUDFLARE_API_TOKEN environment variable to build the project, then deploy the static site on Cloudflare. I found that the automatic builds on Cloudflare's side were much faster, while server-side calls were 429-ing on my GitHub Actions workflow.
Self-hosting Next.js full stack apps will require more legwork
If you're deploying a full stack app instead of a completely static site, DON'T use @cloudflare/next-on-pages. Use @opennextjs/cloudflare instead.
The Workers runtime supports a broad set of Node.js APIs, but the Next.js Edge Runtime code intentionally constrains this. As a result, only the following Node.js APIs will work in your Next.js app:
- buffer
- events
- assert
- util
- async_hooks
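One way to catch problems early is to compare the Node.js built-ins your code imports against that list. The sketch below does this for a hand-written array of import specifiers. The `findUnsupportedImports` helper and the short list of other built-ins are my own illustration, not part of any Cloudflare or Next.js tooling.

```javascript
// Modules from the list above that are expected to work.
const supportedNodeModules = new Set([
  "buffer",
  "events",
  "assert",
  "util",
  "async_hooks",
]);

// A few other Node.js built-ins, for illustration only (not exhaustive).
const knownBuiltins = new Set([
  ...supportedNodeModules,
  "fs",
  "path",
  "crypto",
  "child_process",
  "net",
  "os",
]);

// Hypothetical helper: returns the built-ins that are imported but not supported.
function findUnsupportedImports(specifiers) {
  return specifiers
    .map((specifier) => specifier.replace(/^node:/, "")) // "node:fs" -> "fs"
    .filter(
      (name) => knownBuiltins.has(name) && !supportedNodeModules.has(name)
    );
}

// Third-party packages such as "react" are ignored; only built-ins are checked.
console.log(
  findUnsupportedImports(["node:fs", "events", "react", "node:util", "path"])
);
```

In practice you would feed it the specifiers collected from your own source files or bundler output.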
Maybe I'll try this Next (ha ha).
Conclusion
Moving from Vercel to Cloudflare Pages was a petty decision driven by the need to scratch a CI itch in the dead heat of nuclear AI summer.
It's not impossible to deploy Next.js off Vercel, but it does take longer than usual because of the limited Node.js API support you get from Cloudflare and the Next.js Edge Runtime.
As the ecosystem of tools and platforms becomes increasingly subscription-driven and stack-based, developers and businesses alike will need to make deliberate choices to avoid vendor lock-in and bot-proof their intellectual property.
I could see challenges with applying this to a larger fullstack enterprise project where multiple services and edge-related features would need to be migrated or rewritten and re-configured on Cloudflare, but where you have an able developer, the cost savings could be substantial.