SPOKE 06 · PLAYBOOK

12 diagnostic checks and fixes that tie everything in this hub together. Built for engineers, senior SEOs, and consultants who need a concrete checklist to run on any site.

Use this checklist in order. The first 4 items diagnose. Items 5 through 9 are byte-level fixes. Items 10 through 12 are infrastructure improvements.

A full pass typically takes 6 to 10 hours per site, depending on tech stack complexity.

Phase 1: Diagnose (items 1 to 4)

01

Measure response size on your top 20 pages

Run curl with the %{size_download} write-out variable on your highest-traffic landing pages. Sort the results by size, largest first. Anything over 1.8MB is at immediate risk of byte cutoff.

curl -s -o /dev/null -w '%{size_download}\n' https://yoursite.com/page
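To run the same measurement across a batch of pages, a minimal Python sketch could look like the block below. The function names, the AT_RISK_BYTES constant, and the choice to request an uncompressed body (so the count reflects raw HTML bytes) are assumptions for illustration; only the 1.8MB threshold comes from the check above.

```python
from urllib.request import Request, urlopen

AT_RISK_BYTES = 1_800_000  # 1.8MB threshold from the check above


def fetch_size(url: str, timeout: float = 30.0) -> int:
    """Download one page and return its body size in bytes."""
    # Assumption: ask for an uncompressed body so we count raw HTML bytes.
    req = Request(url, headers={"Accept-Encoding": "identity"})
    with urlopen(req, timeout=timeout) as resp:
        return len(resp.read())


def rank_by_size(sizes: dict[str, int]) -> list[tuple[str, int, bool]]:
    """Return (url, bytes, at_risk) rows sorted by size, largest first."""
    rows = [(url, size, size > AT_RISK_BYTES) for url, size in sizes.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

Feed it your top 20 URLs, for example rank_by_size({u: fetch_size(u) for u in urls}), and start with the rows flagged at_risk.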

02

Compare browser HTML vs Search Console crawled HTML

Inspect the URL in Search Console, click “View Crawled Page” then HTML. Diff it against what you see in browser View Source. Any content present locally but missing in crawled view is a byte-level loss.
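If you save both versions to text, the comparison can be scripted. The sketch below uses Python's standard difflib module; the function name is hypothetical, and it assumes you have pasted the browser source and the crawled HTML into two strings. Treat the output as a list of candidates, because rendering differences can show up here as well as byte-level truncation.

```python
import difflib


def missing_in_crawled(browser_html: str, crawled_html: str) -> list[str]:
    """Return non-blank lines found in the browser HTML but not in the crawled HTML."""
    browser_lines = [line.strip() for line in browser_html.splitlines()]
    crawled_lines = [line.strip() for line in crawled_html.splitlines()]
    diff = difflib.ndiff(crawled_lines, browser_lines)
    # In ndiff output, "+ " marks lines that exist only in the second sequence.
    return [line[2:] for line in diff if line.startswith("+ ") and line[2:]]
```

A long run of missing lines at the very end of the document is the pattern to look for, since that is what a cutoff would produce.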

03

Audit crawl stats for response time spikes

Search Console → Settings → Crawl Stats. Plot average response time over 90 days. Flag any spike above your normal baseline. These correlate with throttled crawl periods.
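One way to make "above your normal baseline" concrete is sketched below. It assumes you have transcribed the daily average response times from the Crawl Stats report into a dictionary keyed by date. The median baseline and the 1.5x factor are illustrative assumptions, not values from Google, so tune them to your own data.

```python
from statistics import median


def flag_spikes(daily_ms: dict[str, float], factor: float = 1.5) -> list[str]:
    """Return the dates whose average response time exceeds factor x the median."""
    baseline = median(daily_ms.values())
    return [day for day, ms in daily_ms.items() if ms > baseline * factor]
```

Cross-reference the flagged dates against deploys, traffic peaks, and hosting incidents.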

04

Count external requests per page

Open DevTools → Network → disable cache → reload. Note total request count. Anything over 40 is high. Over 60 is critical. Document the heaviest 10 by transfer size.
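To document this without counting by hand, export the Network panel as a HAR file and summarise it. The sketch below is one possible approach: the function name is hypothetical, and it assumes a Chrome export, where each entry carries a non-standard _transferSize field (it falls back to the standard bodySize otherwise). The 40 and 60 thresholds are the ones from the check above.

```python
def summarize_har(har: dict, top_n: int = 10) -> dict:
    """Count requests in a parsed HAR document and list the heaviest by transfer size."""
    entries = har["log"]["entries"]

    def transfer_size(entry: dict) -> int:
        response = entry["response"]
        size = response.get("_transferSize")  # Chrome-specific field (assumption)
        if size is None:
            size = response.get("bodySize", 0)
        return max(int(size), 0)  # HAR uses -1 for "unknown"

    count = len(entries)
    if count > 60:
        level = "critical"
    elif count > 40:
        level = "high"
    else:
        level = "ok"

    heaviest = sorted(entries, key=transfer_size, reverse=True)[:top_n]
    return {
        "request_count": count,
        "level": level,
        "heaviest": [(e["request"]["url"], transfer_size(e)) for e in heaviest],
    }
```

Load the export with json.load and pass the resulting dictionary straight in.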

Phase 2: Byte-level fixes (items 5 to 9)

05

Move all JSON-LD schema to the head

Many WordPress plugins inject schema in the footer. Override this in functions.php or via your SEO plugin’s settings. Schema must sit before any heavy body content to survive byte limits.
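To confirm where a plugin actually injects its schema, you can scan the served HTML. The sketch below uses Python's standard html.parser; the class and function names are hypothetical, and it simply reports, in document order, whether each JSON-LD script tag opens inside the head or outside it.

```python
from html.parser import HTMLParser


class JsonLdLocator(HTMLParser):
    """Record whether each JSON-LD script tag opens inside <head> or outside it."""

    def __init__(self) -> None:
        super().__init__()
        self.in_head = False
        self.locations: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "script":
            script_type = (dict(attrs).get("type") or "").lower()
            if script_type == "application/ld+json":
                self.locations.append("head" if self.in_head else "outside_head")

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def jsonld_locations(html: str) -> list[str]:
    """Return 'head' or 'outside_head' for every JSON-LD block, in document order."""
    parser = JsonLdLocator()
    parser.feed(html)
    return parser.locations
```

Any "outside_head" entry points to a block that still needs moving.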

06

Move canonical and meta tags to the top of head

Canonical URL, robots meta, hreflang, title, and description should all sit in the first 50 lines of HTML. They are critical infrastructure, so never let them slip past the byte buffer.
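A quick way to audit placement is to record where each of these tags first appears. The sketch below reports byte offsets rather than line numbers, which is a deliberate substitution: once HTML is minified, line counts stop meaning much, while offsets map directly onto a byte limit. The patterns are intentionally loose regular expressions, not a full HTML parser, and all names are hypothetical.

```python
import re
from typing import Optional

# Loose patterns for the tags named above (assumption: conventional attribute syntax).
CRITICAL_TAGS = {
    "title": rb"<title[\s>]",
    "description": rb"<meta[^>]+name=[\"']description[\"']",
    "robots": rb"<meta[^>]+name=[\"']robots[\"']",
    "canonical": rb"<link[^>]+rel=[\"']canonical[\"']",
    "hreflang": rb"<link[^>]+hreflang=",
}


def tag_offsets(html: bytes) -> dict[str, Optional[int]]:
    """Map each critical tag to the byte offset of its first match, or None if absent."""
    offsets: dict[str, Optional[int]] = {}
    for name, pattern in CRITICAL_TAGS.items():
        match = re.search(pattern, html, re.IGNORECASE)
        offsets[name] = match.start() if match else None
    return offsets
```

A None value means the tag was not found at all. A large offset means it sits deep in the document and is worth moving up.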

07

Minify HTML in production

Strip whitespace, newlines, and comments. Use WP Rocket’s HTML minification, your build step, or an edge minifier (Cloudflare deprecated its built-in Auto Minify in 2024, so confirm what your plan still offers). This single change can save 25 to 35 percent on byte size.

08

Defer non-critical third-party scripts

This covers tracking pixels, GA, Hotjar, chat widgets, and social embed scripts. Add async or defer attributes. Better still, load them after the window load event via JavaScript or Tag Manager.
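To find which scripts still need attention, you can list every external script that loads without async or defer. The sketch below uses Python's standard html.parser; the class and function names are hypothetical. It skips inline scripts (the attributes only affect external ones) and type="module" scripts, which browsers defer by default.

```python
from html.parser import HTMLParser


class BlockingScriptFinder(HTMLParser):
    """Collect the src of external scripts that have neither async nor defer."""

    def __init__(self) -> None:
        super().__init__()
        self.blocking: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        attr_map = dict(attrs)  # valueless attributes such as async map to None
        src = attr_map.get("src")
        is_module = (attr_map.get("type") or "").lower() == "module"
        if src and not is_module and "async" not in attr_map and "defer" not in attr_map:
            self.blocking.append(src)


def blocking_scripts(html: str) -> list[str]:
    """Return the src values of render-blocking external scripts, in document order."""
    finder = BlockingScriptFinder()
    finder.feed(html)
    return finder.blocking
```

Run it against the served HTML, not your templates, so that plugin-injected scripts are included.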

09

Consolidate plugin CSS and JS

Use LiteSpeed Cache, WP Rocket, or Autoptimize to combine plugin assets into 2 to 3 bundles. This reduces the number of fetches Google’s Web Rendering Service (WRS) has to make and improves crawl efficiency.

Phase 3: Infrastructure (items 10 to 12)

10

Set up SSL auto-renewal with monitoring

Let’s Encrypt auto-renew is mandatory. Pair it with an uptime monitor (UptimeRobot, BetterStack) that alerts you 14 days before expiry. SSL failure is the single most common cause of crawl rate collapse.
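If you want a check of your own alongside the uptime monitor, certificate expiry can be read with Python's standard ssl module. This is a minimal sketch with hypothetical function names; the 14-day window mirrors the alert threshold above, and fetch_not_after needs network access to the host you pass in.

```python
import socket
import ssl
import time
from typing import Optional

ALERT_DAYS = 14  # alert window from the check above


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Connect over TLS and return the certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Convert a notAfter string into days remaining (negative if already expired)."""
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires - current) / 86400


def needs_alert(not_after: str, now: Optional[float] = None) -> bool:
    """True when the certificate expires within the alert window."""
    return days_until_expiry(not_after, now) < ALERT_DAYS
```

Schedule it daily and wire needs_alert(fetch_not_after("yoursite.com")) into whatever alerting you already use.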

11

Whitelist verified Googlebot in your WAF

Cloudflare Bot Fight Mode and similar tools can rate-limit Googlebot. Add an explicit allow rule for Google’s official IP ranges. Verify by inspecting your access logs for blocked Googlebot requests.
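When you review blocked requests in your logs, you can test each IP against Google's published ranges. The sketch below assumes the ranges arrive as a JSON document with a prefixes list whose entries carry either an ipv4Prefix or an ipv6Prefix key, which is the shape of the googlebot.json file as I understand it; verify that against the current file. The function names are hypothetical.

```python
import ipaddress


def load_networks(ranges: dict) -> list:
    """Parse a googlebot.json-style document into ip_network objects."""
    networks = []
    for prefix in ranges["prefixes"]:
        cidr = prefix.get("ipv4Prefix") or prefix.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr))
    return networks


def is_listed(ip: str, networks: list) -> bool:
    """True if the IP falls inside any of the listed networks."""
    address = ipaddress.ip_address(ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so a mixed list can be checked in one pass.
    return any(address in network for network in networks)
```

A blocked request that claims a Googlebot user agent and passes is_listed is a false positive in your WAF. One that fails is most likely an impersonator and should stay blocked.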

12

Implement full-page caching with edge CDN

Pair WordPress with LiteSpeed Cache or W3 Total Cache. Add Cloudflare or Bunny in front. Serve cached HTML from edge nodes near Google’s crawl regions. Target TTFB under 200ms.

Phase 4: What Google’s May 2026 AI guide tells you to ignore

On May 15, 2026, Google published its first official AI optimization guide. Four days later at Google I/O 2026, the team confirmed that “AEO and GEO are still SEO”: there is no separate optimization track for AI Mode and AI Overviews.

The guide also explicitly named four tactics that are wasting time and budget. Cross-check this list before you spend on anything labelled “AEO” or “GEO”.

13

Don’t waste time on llms.txt files

Google explicitly stated that its crawlers may discover llms.txt files but treat them like any other text file. There is no special indexing pathway. Other AI crawlers may use them, but don’t expect Google traffic from one.

14

Don’t pre-chunk your content for AI

Many AEO tools push you to break long-form content into bite-sized chunks “so AI can parse it.” Google says its systems already understand multi-topic pages and extract the relevant passage natively. Pre-fragmenting damages user experience without helping AI.

15

Don’t AI-rewrite for long-tail keyword variants

Google’s AI features understand synonyms and meanings natively. Rewriting the same article 40 different ways to capture every long-tail variant is wasted effort. One well-structured original passage outperforms 40 paraphrased variants.

16

Don’t build special schema variants for AI

There is no “AI schema” or “AI Markdown” version of your pages that Google rewards. Standard schema.org structured data is what AI Mode uses. Spending engineering hours on AI-specific variants is a 2024 leftover. Skip it.

What Google says to do instead

✓ PRIORITISE

Non-commodity content

Content built on first-hand experience, original research, and expert analysis: the things AI can’t synthesise on its own. Generic summaries face declining visibility.

✓ PRIORITISE

Byte-level crawlability

Items 1 through 12 of this playbook. AI Mode uses the same Google index, so if you’re not crawled, you’re not citable.

✓ PRIORITISE

Local + shopping + media data

Google Merchant Center feeds, Google Business Profile data, and structured image/video markup feed AI Mode’s commerce and local answers.

✓ PRIORITISE

Standard schema, placed early

Standard schema.org markup in the page head (before the 2MB byte threshold). No AI-specific variants needed, just the basics, executed correctly.

📚 Primary sources: Google’s “Optimizing your website for generative AI features on Google Search” (May 15, 2026) at developers.google.com/search/blog/2026/05/a-new-resource-for-optimizing, and “Inside Googlebot: demystifying crawling, fetching, and the bytes we process” (March 31, 2026) at developers.google.com/search/blog/2026/03/crawler-blog-post

Apurv Singh

PRACTITIONER NOTE

“I have run this exact checklist on 50+ D2C and SaaS sites. The pattern is consistent: items 1 through 4 take a day, but they always surface 3 to 5 unknown issues. Items 5 to 9 can be done in a single sprint by one engineer. Items 10 to 12 need infrastructure changes that require leadership buy-in. Most sites recover 25 to 40 percent of lost organic traffic within 60 days of completing this playbook. Run it once a quarter.”

Apurv Singh, Founder, HQ Digital

Playbook completion scorecard

Score yourself out of 12. Anything below 8 means there’s still measurable risk on your indexation.

01. Top 20 pages measured for byte size
02. Browser vs crawled HTML compared
03. 90-day crawl stats analyzed
04. External request count documented
05. Schema moved to head
06. Canonical and meta in top 50 lines
07. HTML minified in production
08. Third-party scripts deferred
09. Plugin CSS and JS consolidated
10. SSL auto-renew with monitoring
11. Googlebot whitelisted in WAF
12. Full-page caching plus edge CDN
