The 2MB Crawl Budget Limit: What Google Ignores Past the Cutoff

SPOKE 01 · BYTE-LEVEL SEO

Google stops fetching at 2MB per URL. Everything past that point is ignored, not rendered, not indexed, not citable by AI engines. Here is what that actually means for your site.

2MB

The hard byte cutoff per URL.

Including HTTP headers. Including all inline content. Once Google has read the first 2MB of your page response, the rest is silently dropped.

CONFIRMED BY GOOGLE · MARCH 31, 2026

Google’s own words on the 2MB limit

“Googlebot currently fetches up to 2MB for any individual URL (excluding PDFs). This means it crawls only the first 2MB of a resource, including the HTTP header.”

Gary Illyes, Google Search team, “Inside Googlebot,” March 31, 2026

Two additional confirmations from the same Google post:

PDFs get a 64MB limit, the only major exception. If you publish whitepapers, research reports, or product specifications as PDFs, you have far more headroom there than on HTML pages.
For other crawlers without a specified limit, the default is 15MB. But Googlebot’s 2MB is what applies to your regular HTML pages.
“Any bytes that exist after that 2MB threshold are entirely ignored. They aren’t fetched, they aren’t rendered, and they aren’t indexed.”

What counts toward the 2MB limit

Most engineers think of “page size” as just the visible HTML body. Google counts everything.

HTTP response headers

Server, cache, CSP, cookies, security headers

Raw HTML body

Every tag, every attribute, every comment

Inline CSS and JavaScript

Anything inside style or script tags

Schema markup

JSON-LD, microdata, RDFa, all of it

Inlined SVGs and base64 images

Heavy contributors most teams forget

Comments and whitespace

Unminified HTML can add 30 to 40 percent
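
To see how much of a page's weight comes from the inline items listed above, a rough tally helps. The sketch below uses Python's standard html.parser to add up the bytes inside script tags (JSON-LD included), style tags, and HTML comments. The names InlineWeight and inline_weight are hypothetical helpers for illustration, not part of any SEO tool.

```python
from html.parser import HTMLParser


class InlineWeight(HTMLParser):
    """Tally UTF-8 bytes of inline script, inline style, and comments."""

    def __init__(self):
        super().__init__()
        self.current = None  # "script" or "style" while inside one
        self.totals = {"script": 0, "style": 0, "comment": 0}

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.current = tag

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        # Only count text that sits inside <script> or <style>.
        if self.current:
            self.totals[self.current] += len(data.encode("utf-8"))

    def handle_comment(self, data):
        self.totals["comment"] += len(data.encode("utf-8"))


def inline_weight(html: str) -> dict:
    """Return byte totals for inline script, style, and comment content."""
    parser = InlineWeight()
    parser.feed(html)
    parser.close()
    return parser.totals


if __name__ == "__main__":
    sample = "<head><style>a{}</style></head><body><!-- hi --><script>var x=1;</script></body>"
    print(inline_weight(sample))
```

Scripts loaded through a src attribute have no inline body, so they add only the bytes of the tag itself to the HTML response.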

A typical D2C product page byte budget

Cumulative total comes to about 2,280 KB. Google stops at 2,048 KB. Everything after that is invisible.

Hero section and meta
180 KB · safe

Product description and images
420 KB · safe

Reviews and ratings widget
680 KB · safe

Cross-sell carousels
540 KB · safe

Tracking scripts
320 KB · partly cut off at 2MB

FAQ schema (placed at bottom by plugin)
140 KB · ignored

The first four sections total 1,820 KB, so the cutoff lands about 228 KB into the 320 KB of tracking scripts. The FAQ schema, which the SEO plugin placed near the footer, never gets read. Result: zero FAQ rich results in the SERP, despite the markup being correct.
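
The arithmetic behind this budget can be checked with a short sketch. It uses the section sizes from the example above and a 2,048 KB limit. The function name classify is a hypothetical helper, and HTTP header bytes are left out to match the example's figures.

```python
LIMIT_KB = 2048  # 2MB expressed in KB, as in the example above

# Section sizes in KB, in the order they appear in the page source.
SECTIONS = [
    ("Hero section and meta", 180),
    ("Product description and images", 420),
    ("Reviews and ratings widget", 680),
    ("Cross-sell carousels", 540),
    ("Tracking scripts", 320),
    ("FAQ schema", 140),
]


def classify(sections, limit_kb=LIMIT_KB):
    """Return (name, start_kb, end_kb, status) for each section."""
    rows = []
    start = 0
    for name, size in sections:
        end = start + size
        if end <= limit_kb:
            status = "safe"
        elif start >= limit_kb:
            status = "ignored"
        else:
            status = f"cut off after {limit_kb - start} KB"
        rows.append((name, start, end, status))
        start = end
    return rows


if __name__ == "__main__":
    for name, start, end, status in classify(SECTIONS):
        print(f"{name:32} {start:>5} to {end:>5} KB  {status}")
```

Reordering the list so the FAQ schema comes first puts it well inside the limit without removing a single byte of content.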

APURV’S TAKE ON THE 2MB RULE

The 2MB rule isn’t a license to write less. It’s a reminder to structure better.

A lot of folks read about the 2MB limit and immediately want to trim content. That’s the wrong takeaway. The byte limit doesn’t mean you stop creating valuable, in-depth content for your audiences. It means the page has to be structured the right way.

Say what you need to say in language that’s easy to understand and easily digestible. Both users and AI platforms have to be able to ingest the information in one pass and pull out the parts that matter. That stays critical, whether Google’s limit is 2MB or 20MB.

The byte budget is a structural constraint on how you arrange your page. It is not a content constraint on what you say.

Apurv Singh, Founder HQ Digital

What breaks when content sits past 2MB

1. Your structured data disappears from results

FAQ schema, Product schema, Article schema: if they sit past the cutoff, Google never parses them. You lose rich results, knowledge panels, and AI citation eligibility in one stroke. Most enterprise SEOs blame “Google not picking up our schema” when the real issue is byte position.

2. Canonical tags below cutoff are unread

If your canonical tag is injected by a plugin in the footer and your page is heavy, Google may never see it. You get duplicate content flags and indexation chaos, with no diagnostic in Search Console because the page technically loaded.

3. Internal links past the cutoff don’t pass authority

Your footer link to a critical category page? If it’s beyond 2MB on a heavy template, that link doesn’t transfer crawl signals. Internal linking strategies built around footer or sidebar placement quietly fail on bloated sites.

4. AI engines cite competitors instead of you

ChatGPT, Perplexity, and Google’s AI Overviews all rely on indexed content. If your best answer to a query sits past 2MB on your page, it doesn’t exist in the index that AI engines query. A leaner competitor with the same answer wins the citation.

PRACTITIONER NOTE

“During my time at Times Internet I saw this happen with a category page that had 12 widget zones below the fold. Schema was rendered by a plugin at the very bottom. Engineers thought everything was fine because Search Console showed the page indexed. But the rich results never appeared. Once we moved the JSON-LD into the head, the rich snippets came back in 14 days.”

Apurv Singh, Founder HQ Digital

How to measure your own byte position

You cannot fix what you cannot measure. Three simple ways to find out where your critical content sits.

METHOD 1

Use Chrome DevTools network tab

Load your page, open DevTools, go to the Network tab, and click the main document request. Check the response size. If it’s over 1.8MB, you are at risk: the 200KB buffer accounts for headers and rendering overhead.

METHOD 2

Use curl with byte counter

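Run this in a terminal to see the body bytes returned: curl -s -o /dev/null -w "%{size_download} bytes\n" https://yoursite.com/critical-page (use straight quotes, not curly ones). size_download counts only the response body; add %{size_header} to the format string to see the header bytes as well. Run it on your top 10 organic landing pages and sort by size.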
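
If you would rather script this check, the sketch below adds an approximate header size to the body size and compares the total with the 2MB limit. It is a minimal sketch under stated assumptions: header_bytes, budget_report, and fetch_report are hypothetical helper names, the status line is reconstructed rather than read off the wire, and the body is counted exactly as the client receives it.

```python
import urllib.request

LIMIT_BYTES = 2 * 1024 * 1024  # 2MB


def header_bytes(status_line, headers):
    """Approximate size of the status line plus header block, in bytes."""
    total = len(status_line.encode("utf-8")) + 2  # status line + CRLF
    for name, value in headers:
        total += len(f"{name}: {value}".encode("utf-8")) + 2  # header + CRLF
    return total + 2  # blank line that ends the header block


def budget_report(status_line, headers, body):
    """Summarize how much of the 2MB budget a response uses."""
    head = header_bytes(status_line, headers)
    used = head + len(body)
    return {
        "headers": head,
        "body": len(body),
        "total": used,
        "remaining": LIMIT_BYTES - used,
    }


def fetch_report(url):
    """Fetch a URL and report its byte budget (makes a network request)."""
    with urllib.request.urlopen(url) as resp:
        # Assumption: the real status line may differ slightly (HTTP version).
        status_line = f"HTTP/1.1 {resp.status} {resp.reason}"
        return budget_report(status_line, resp.headers.items(), resp.read())


if __name__ == "__main__":
    print(fetch_report("https://yoursite.com/critical-page"))
```

A negative "remaining" value means the tail of the page falls past the cutoff.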

METHOD 3

Use Search Console URL Inspection

Click “View Crawled Page” to see what Google actually fetched. Compare its byte size with what your browser loads. If Google’s version is missing tail content, that’s your evidence.
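
Total size tells you whether a page is at risk. Byte position tells you what is at risk. The sketch below looks for two markers in the raw HTML bytes, the canonical link and the first JSON-LD script, and reports where each one starts. The marker strings, the MARKERS table, and the marker_offsets name are illustrative assumptions. The check is deliberately simple: it only tests where a marker begins, so a block that starts before the limit and ends after it still needs a manual look.

```python
LIMIT_BYTES = 2 * 1024 * 1024  # 2MB

# Byte patterns to look for (assumed attribute quoting; adjust for your markup).
MARKERS = {
    "canonical": b'rel="canonical"',
    "json-ld": b"application/ld+json",
}


def marker_offsets(raw_html: bytes, header_bytes: int = 0, limit: int = LIMIT_BYTES):
    """Map each marker to (byte_offset, inside_limit), or None if absent."""
    report = {}
    for name, needle in MARKERS.items():
        pos = raw_html.find(needle)  # first occurrence only
        if pos == -1:
            report[name] = None
        else:
            offset = header_bytes + pos  # headers count toward the limit too
            report[name] = (offset, offset < limit)
    return report


if __name__ == "__main__":
    with open("page.html", "rb") as fh:  # hypothetical saved copy of the page
        for name, result in marker_offsets(fh.read()).items():
            print(name, result)
```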

What to fix first

Move schema into head

JSON-LD belongs above all body content. Never let a plugin place it at the bottom.
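
If the plugin cannot be configured, one option is to relocate the markup in a post-processing step. The following is a minimal sketch, assuming the page is available as a string and a regex is good enough to find the JSON-LD blocks. The hoist_jsonld name is hypothetical, and a production pipeline would more likely use a real HTML parser.

```python
import re

# Matches a complete <script type="application/ld+json">...</script> block.
JSONLD_RE = re.compile(
    r'<script\b[^>]*type=["\']application/ld\+json["\'][^>]*>.*?</script>',
    re.IGNORECASE | re.DOTALL,
)
HEAD_END_RE = re.compile(r"</head\s*>", re.IGNORECASE)


def hoist_jsonld(html: str) -> str:
    """Move JSON-LD blocks found after </head> to just before </head>."""
    match = HEAD_END_RE.search(html)
    if match is None:
        return html  # no head element to move into; leave the page untouched
    head, rest = html[: match.start()], html[match.start():]
    blocks = JSONLD_RE.findall(rest)  # no capture groups, so full matches
    if not blocks:
        return html
    rest = JSONLD_RE.sub("", rest)
    return head + "".join(blocks) + rest


if __name__ == "__main__":
    page = (
        "<html><head><title>t</title></head><body><p>x</p>"
        '<script type="application/ld+json">{"a":1}</script></body></html>'
    )
    print(hoist_jsonld(page))
```

JSON-LD blocks that already sit in the head are left where they are.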

Inline only what’s critical

Move third-party widgets, chat scripts, and trackers out of the inline HTML and load them asynchronously, so your critical content sits within roughly the first 1,500 KB.

Audit your inline CSS bloat

Page builders often inject 200KB+ of unused inline CSS into every page head.

Minify HTML in production

Whitespace, comments, and indentation add 30 to 40 percent. Strip them at the edge.
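
Before changing your build or edge configuration, you can estimate how much your own pages would save. The sketch below is a deliberately naive minifier meant for measurement only. It will damage pre blocks, textareas, and whitespace-sensitive inline scripts, so do not ship its output. The function names naive_minify and savings_percent are hypothetical.

```python
import re


def naive_minify(html: str) -> str:
    """Strip comments and collapse whitespace. For estimates only."""
    # Drop HTML comments, but keep IE conditional comments.
    html = re.sub(r"<!--(?!\[if).*?-->", "", html, flags=re.DOTALL)
    # Remove whitespace that sits purely between two tags.
    html = re.sub(r">\s+<", "><", html)
    # Collapse any remaining run of whitespace to a single space.
    html = re.sub(r"\s{2,}", " ", html)
    return html.strip()


def savings_percent(html: str) -> float:
    """Percentage of UTF-8 bytes removed by naive_minify."""
    before = len(html.encode("utf-8"))
    if before == 0:
        return 0.0
    after = len(naive_minify(html).encode("utf-8"))
    return round(100 * (before - after) / before, 1)


if __name__ == "__main__":
    sample = "<div>\n    <!-- note -->\n    <p>Hi   there</p>\n</div>\n"
    print(naive_minify(sample), savings_percent(sample))
```

Run savings_percent on the saved source of your heaviest templates to see whether your pages are near the 30 to 40 percent range.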

CONTINUE THE SERIES

Next: Googlebot Is an Ecosystem, Not One Crawler

Smartphone bot, Desktop bot, Image bot, AdsBot: each behaves differently. Understand the full ecosystem before you optimize.
