
Technical SEO — Complete guide and checklist for 2026
Crawling, indexing, Core Web Vitals, structured data, crawl budget and AI visibility — all in one guide with a prioritization framework and checklist.
Great content never reaches its full potential if the site has technical flaws. Technical SEO is the foundation everything else rests on — the layer that decides whether Google can find, understand and surface your site to the right people. This guide covers everything you need to know, in the right order.
Use the checklist directly. Further down you'll find a complete technical SEO checklist you can work through point by point. You can also handle your entire technical audit in SEOZ AuditWizard, which automatically scans the site and prioritizes fixes by impact.
What is technical SEO?
Technical SEO is about optimizing the site so search engines can crawl, render and index your content correctly. It's not about what you write — it's about how the site is built and how the search engine bots experience it.
The difference vs on-page SEO: on-page SEO is about the quality and relevance of content for a given keyword. Technical SEO is about structure and underlying infrastructure. The line isn't always sharp — page speed affects both technical and content UX, for example — but the distinction helps you assign the right resources to the right fixes.
The most important points — start here
Before diving into details, here are the five things you need to fix if you do nothing else:
1. Make sure the site is indexable: check robots.txt and Search Console. If Google can't index you, you don't rank at all.
2. Fix Core Web Vitals: LCP, INP and CLS are confirmed ranking factors and directly measurable in GSC.
3. Implement structured data: it helps Google and AI services understand and cite your content.
4. Build a logical internal link structure: important pages need more internal links with relevant anchor texts.
5. Clean up duplicate and thin content: it dilutes link equity and worsens Google's view of the entire domain.
How to prioritize: Impact vs. Effort
A checklist isn't a to-do list in priority order. Technical SEO is about prioritizing the right effort for the largest effect. Use this framework:
🔴 High impact, low effort — start here
• robots.txt blocking Google by accident
• Broken internal links
• Missing H1s on important pages
• Pages without canonical tags
• Basic schema (Article, Organization)
🟠 High impact, high effort — plan it in
• Full Core Web Vitals optimization
• URL structure migration on large sites
• Advanced structured data (Product, FAQ, HowTo)
• JavaScript rendering and SSR
🔵 Low impact, low effort — do when there's time
• Alt tags on images without SEO value
• Optimization of secondary category pages
• Tweaking meta descriptions without CTR problems
⚪ Low impact, high effort — skip
• Chasing 100/100 in PageSpeed if you already pass thresholds
• Optimizing pages without organic potential
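The four quadrants above can be expressed as a simple sort. The sketch below is purely illustrative: the `Issue` class, the 1-to-5 scoring scale and the cut-off of 3 are assumptions made for this example, not part of any tool mentioned in this guide.

```python
from dataclasses import dataclass

# Hypothetical scoring: impact and effort are each rated 1 (low) to 5 (high).
HIGH = 3  # assumed cut-off: scores of 3 and above count as "high"


@dataclass
class Issue:
    name: str
    impact: int
    effort: int


def quadrant(issue: Issue) -> int:
    """Return 0-3, where 0 means 'start here' and 3 means 'skip'."""
    high_impact = issue.impact >= HIGH
    high_effort = issue.effort >= HIGH
    if high_impact and not high_effort:
        return 0  # high impact, low effort
    if high_impact and high_effort:
        return 1  # high impact, high effort
    if not high_effort:
        return 2  # low impact, low effort
    return 3      # low impact, high effort


def prioritize(issues: list[Issue]) -> list[Issue]:
    # Sort by quadrant first, then by highest impact within a quadrant.
    return sorted(issues, key=lambda i: (quadrant(i), -i.impact))


issues = [
    Issue("Chase 100/100 in PageSpeed", impact=1, effort=5),
    Issue("robots.txt blocks Googlebot", impact=5, effort=1),
    Issue("Migrate URL structure", impact=4, effort=5),
    Issue("Tweak meta descriptions", impact=2, effort=1),
]
for issue in prioritize(issues):
    print(quadrant(issue), issue.name)
```

How you score impact and effort is a judgment call for your own site; the point is only that the quadrant, not the checklist order, decides what gets done first.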
Complete technical SEO checklist
Tick off each point as you work through it. Priorities: (P0) = critical, (P1) = important, (P2) = nice-to-have.
Crawling & indexing
• robots.txt doesn't block any important resources (P0)
• All important pages are indexable (check Indexing → Pages in GSC) (P0)
• XML sitemap submitted in Search Console and contains only indexable URLs (P1)
• Orphan pages identified and addressed (P1)
• Crawl budget optimized (thin pages, redirect chains, URL parameters handled) (P2)
Core Web Vitals & page speed
• LCP under 2.5 seconds (P0)
• INP under 200ms (P0)
• CLS under 0.1 (P1)
• Mobile experience tested on real devices or with Lighthouse (GSC retired its Mobile Usability report in December 2023) (P1)
• Images compressed and lazy-loaded below the fold (P1)
On-page technical
• Exactly one H1 per page, all important pages have an H1 (P0)
• No duplicate page titles or meta descriptions (P1)
• Canonical tags correctly implemented (P1)
• Keyword included in URL (close to the root) (P2)
• Alt tags on all images with SEO value (P2)
Internal links
• No broken internal links (404) (P0)
• Important pages have the most internal links with relevant anchor texts (P1)
• No internal redirects (link directly to the final destination) (P2)
• Navigation (desktop & mobile) links to your most important SEO pages (P2)
Structured data & schema
• No schema errors in GSC (Rich Results report) (P1)
• Article/BlogPosting implemented on all blog posts (P1)
• Organization/LocalBusiness implemented (P2)
• FAQPage schema on pages with Q&A (P2)
• Product/Service schema on product pages (P2)
Security & accessibility
• HTTPS enabled, HTTP redirects to HTTPS (P0)
• Only one host variant resolves (www or non-www); the other 301-redirects to it (P1)
• Hreflang correctly implemented (if multilingual site) (P1)
• URLs that return 404 but have a relevant replacement are 301-redirected to it (P2)
AI visibility & GEO (new in 2026)
• Logical H1–H2–H3 hierarchy for AI parsing (P1)
• Semantic HTML (no CSS-only structures) (P1)
• FAQPage schema for citability in AI answers (P1)
• Brand monitored in SEOZ AI Visibility Tracker (P2)
Core Web Vitals — what you need to pass
Core Web Vitals (CWV) are Google's metrics for page experience and have been a confirmed ranking factor since 2021. Three numbers to pass:
• LCP (Largest Contentful Paint): ✅ Good under 2.5s · ⚠️ OK 2.5–4s · ❌ Poor over 4s
• INP (Interaction to Next Paint): ✅ Good under 200ms · ⚠️ OK 200–500ms · ❌ Poor over 500ms
• CLS (Cumulative Layout Shift): ✅ Good under 0.1 · ⚠️ OK 0.1–0.25 · ❌ Poor over 0.25
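The thresholds above are easy to apply in a script when you export field data. A minimal sketch, assuming values exactly on a boundary count toward the better bucket; the function name `rate` and the lookup table are made up for this example.

```python
# Thresholds as listed above: (good_limit, poor_limit) per metric.
THRESHOLDS = {
    "LCP": (2.5, 4.0),    # seconds
    "INP": (200, 500),    # milliseconds
    "CLS": (0.1, 0.25),   # unitless layout-shift score
}


def rate(metric: str, value: float) -> str:
    """Classify a measurement as 'good', 'ok' or 'poor'."""
    good_limit, poor_limit = THRESHOLDS[metric]
    if value <= good_limit:
        return "good"
    if value <= poor_limit:
        return "ok"
    return "poor"


for metric, value in [("LCP", 1.9), ("INP", 350), ("CLS", 0.3)]:
    print(metric, value, rate(metric, value))
```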
How to improve LCP
LCP measures how long it takes for the largest visible content element to load. The most common culprit is a big hero image that loads late. Fixes:
• Preload the hero image with <link rel="preload" as="image">
• Serve images in WebP or AVIF
• Use a CDN with edge caching close to the user
• Lazy-load everything below the fold, but never the hero element
How to improve INP
INP replaced FID in 2024 and measures responsiveness to user interactions. Minimize JavaScript execution on the main thread:
• Break long tasks with setTimeout or scheduler.postTask()
• Move heavy computation to Web Workers
• Avoid unnecessary third-party code (chatbots, trackers) that blocks the main thread
How to improve CLS
Layout shift happens when elements move after the page has loaded. The simplest fix: always set explicit width and height on images and iframes.
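One way to audit this is to scan a page's HTML for images and iframes that lack the two attributes. The sketch below uses only Python's standard library; the class and function names are invented for the example, and it deliberately ignores space reserved through CSS (for instance `aspect-ratio`), so treat its output as candidates to review rather than confirmed problems.

```python
from html.parser import HTMLParser


class DimensionAudit(HTMLParser):
    """Collect <img> and <iframe> tags that lack a width or height attribute."""

    def __init__(self) -> None:
        super().__init__()
        self.missing: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("img", "iframe"):
            return
        names = {name for name, _ in attrs}
        if not {"width", "height"} <= names:
            src = dict(attrs).get("src") or "(no src)"
            self.missing.append(f"{tag}: {src}")


def find_unsized_media(html: str) -> list[str]:
    audit = DimensionAudit()
    audit.feed(html)
    return audit.missing


sample = """
<img src="/hero.webp" width="1200" height="600" alt="Hero">
<img src="/team.jpg" alt="Team">
<iframe src="https://example.com/embed"></iframe>
"""
print(find_unsized_media(sample))
```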
Crawling & indexing
There's a fundamental distinction: a page can be crawled (Googlebot has visited it) without being indexed (included in Google's search index). Check the status in GSC under Indexing → Pages.
robots.txt
robots.txt is a text file in the site root that tells search engines what they should not crawl. It's one of the first places we check when diagnosing serious ranking problems, because it's not unusual for an entire site to be blocked by accident.
Common mistake: blocking JS or CSS resources in robots.txt. Google needs to load these to render the page correctly. If they're blocked, Googlebot sees a broken version of the site.
robots.txt can also block AI crawlers like GPTBot (OpenAI) and CCBot (Common Crawl). If you don't want your content training AI models, you can add explicit blocking rules for those.
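A quick way to sanity-check rules like these is Python's built-in `urllib.robotparser`. The robots.txt content, the example.com host and the paths below are placeholders. Note that this parser is simpler than Googlebot's (it applies the first matching rule and does not understand wildcards), so it is a rough check, not a replacement for testing in Search Console.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """True if `agent` may fetch `path` on the placeholder host."""
    return parser.can_fetch(agent, "https://example.com" + path)


checks = [
    ("Googlebot", "/blog/technical-seo/"),
    ("Googlebot", "/assets/app.js"),   # JS and CSS must stay crawlable
    ("Googlebot", "/admin/login"),
    ("GPTBot", "/blog/technical-seo/"),
]
for agent, path in checks:
    print(agent, path, "allowed" if allowed(agent, path) else "blocked")
```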
XML sitemap
The sitemap should only contain pages you actually want indexed. In technical audits we often see sitemaps full of thin pages, 301 redirects and noindex pages — all of which lower Google's trust in the sitemap as a whole.
• Remove pages with noindex directives
• Remove pages with 3xx status codes
• Remove pages with parameters (unless you intentionally index them)
• Add the sitemap to GSC and to robots.txt
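The clean-up rules above can be applied when the sitemap is generated. A minimal sketch, assuming you already have crawl data per URL; the `Page` record and its fields are hypothetical, and the filter is slightly stricter than the list above in that it keeps only URLs returning 200.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class Page:
    url: str
    status: int    # HTTP status code from a crawl
    noindex: bool  # True if the page carries a noindex directive


def is_sitemap_worthy(page: Page) -> bool:
    """Keep only 200-status, indexable URLs without query parameters."""
    return page.status == 200 and not page.noindex and not urlsplit(page.url).query


def build_sitemap(pages: list[Page]) -> str:
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for page in pages:
        if is_sitemap_worthy(page):
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = page.url
    return ET.tostring(urlset, encoding="unicode")


pages = [
    Page("https://example.com/technical-seo/", 200, False),
    Page("https://example.com/old/", 301, False),              # redirect: excluded
    Page("https://example.com/thanks/", 200, True),            # noindex: excluded
    Page("https://example.com/shop/?sort=price", 200, False),  # parameter URL: excluded
]
print(build_sitemap(pages))
```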
Crawl budget
Crawl budget is how much time and resources Googlebot spends on your site. It's primarily critical for sites with thousands of pages. If Googlebot wastes time on thin and duplicate pages, important pages get crawled less often.
• Block admin, search and filter pages in robots.txt
• Fix redirect chains — every extra hop costs crawl budget
• Noindex or remove thin pages
• Make sure the XML sitemap is clean (see above)
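Redirect chains are straightforward to detect once you have a crawl export of source and target URLs. The sketch below works on an in-memory mapping; the function names and sample paths are invented for the example.

```python
def resolve(url: str, redirects: dict[str, str], max_hops: int = 10) -> tuple[str, int]:
    """Follow a redirect map and return (final_url, number_of_hops)."""
    hops = 0
    seen = {url}
    while url in redirects:
        url = redirects[url]
        hops += 1
        if url in seen or hops > max_hops:
            raise ValueError(f"redirect loop or overly long chain at {url}")
        seen.add(url)
    return url, hops


def find_chains(redirects: dict[str, str]) -> dict[str, tuple[str, int]]:
    """Return every source URL that needs more than one hop to reach its destination."""
    chains = {}
    for source in redirects:
        final, hops = resolve(source, redirects)
        if hops > 1:
            chains[source] = (final, hops)
    return chains


redirects = {
    "/old-page/": "/new-page/",
    "/new-page/": "/final-page/",
    "/promo/": "/offers/",
}
print(find_chains(redirects))
```

Each source URL reported here should be repointed straight at its final destination, which removes the extra hop.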
Thin pages and duplicate content
Thin pages are pages with little or no value to the user. It's not about page length per se — a brief, exact answer to a specific question isn't thin. The problem arises when pages exist without adding anything unique.
Thin pages dilute the domain's link equity, waste crawl budget, and can lower Google's overall view of the site. Auto-generated pages (common in WordPress and e-commerce systems) are the most common source.
Duplicate content
Duplicate content confuses Google about which page should rank for a given keyword. Common sources:
• URL variants: the site is reachable on both www and without, or via http and https
• URL parameters: filtering and sorting create identical pages on different URLs
• Pagination: page 2 of a category often has near-identical content to page 1
• Manual duplication: similar landing pages built for local variants
The solution is canonical tags. They tell Google which page is the original without 301-redirecting the user. On both the original and the duplicate page, add a <link rel="canonical"> tag in the <head> pointing to the original's URL.
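As an illustration, the helper below collapses the URL variants listed earlier into one canonical URL and renders the tag. It is a sketch under stated assumptions: the preferred variant is taken to be non-www HTTPS, and the parameters in `IGNORED_PARAMS` are assumed never to change page content. Both choices depend on your site.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these query parameters only sort or track, so they are dropped.
IGNORED_PARAMS = {"sort", "utm_source", "utm_medium", "utm_campaign"}


def canonical_url(url: str) -> str:
    """Normalize scheme, host and query string to the assumed preferred variant."""
    parts = urlsplit(url)
    host = parts.netloc.lower().removeprefix("www.")
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in IGNORED_PARAMS]
    return urlunsplit(("https", host, parts.path or "/", urlencode(kept), ""))


def canonical_tag(url: str) -> str:
    return f'<link rel="canonical" href="{escape(canonical_url(url))}">'


print(canonical_tag("http://www.example.com/shoes/?sort=price&utm_source=news"))
```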
Structured data and schema
Structured data (schema.org markup) is a way to give search engines clear, machine-readable information about your content. It can drive Rich Snippets in Google — and is an increasingly important signal for being surfaced in AI-generated answers.
The most important schema types to know:
• Article / BlogPosting: for all articles and blog posts. Powers Top stories and date snippets.
• FAQPage: for pages with Q&A. Since 2023 Google shows the FAQ accordion only for well-known government and health sites, but the markup still exposes clean question-and-answer pairs and is extra important for AI citability.
• HowTo: for step-by-step guides. Google retired the stepwise HowTo display in 2023; the markup still describes each step in machine-readable form.
• Product: for product pages. Drives price, availability and reviews.
• LocalBusiness: for physical businesses. Shows opening hours and address in the SERP.
• Organization: for company pages. Drives the knowledge panel.
• BreadcrumbList: for breadcrumbs. Shows URL hierarchy in search results.
Validate all structured data in Google's Rich Results Test and monitor errors in GSC under Search appearance → Rich results.
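Structured data is usually embedded as JSON-LD in a script element. The sketch below builds FAQPage markup from question-and-answer pairs using the schema.org `Question` and `Answer` types; the function name and sample text are made up for the example, and the output should still be run through the Rich Results Test.

```python
import json


def faq_jsonld(questions: list[tuple[str, str]]) -> str:
    """Render (question, answer) pairs as a FAQPage JSON-LD script element."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in questions
        ],
    }
    body = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so answer text can never close the script element early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(faq_jsonld([
    ("What is technical SEO?",
     "Optimizing a site so search engines can crawl, render and index it."),
]))
```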
Rendering and JavaScript
Google renders JavaScript, but with a delay. Critical content (headings, body text, internal links) must be in the HTML source, not only available after JS execution. To see how Googlebot sees your page, open the URL Inspection tool in GSC and click "Test live URL".
For Next.js sites like SEOZ we recommend Server-Side Rendering (SSR) or Static Site Generation (SSG) for SEO-critical pages. Client-side-only rendering (CSR) is a P1 problem if it affects visibility of important elements.
Internal links
Internal links do three things for your SEO: they steer link value (PageRank) toward important pages, they tell Google which context a page belongs to, and anchor texts are a strong relevance signal.
• Important pages should have the most internal links, preferably from high-traffic pages
• Use varied but relevant anchor texts, not always the exact-match keyword
• Links higher up on the page carry more weight than footer links
• Never link to URL parameters; always link to the canonical URL
• Find orphan pages with Screaming Frog (filter on "Inlinks = 0")
Tip: Google your most important keyword restricted to your own domain with a site: search and look at the pages that show up beyond the first page of results. They're great candidates to link internally from to your target page, because they already have relevance established with Google.
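If you have a crawl export, inlink counts and orphan pages can also be computed directly. A minimal sketch, assuming `links` maps each crawled page to the internal URLs it links to and `all_urls` comes from a separate source such as the sitemap; all names and paths are illustrative.

```python
from collections import Counter


def inlink_counts(links: dict[str, list[str]]) -> Counter:
    """Count how many distinct pages link to each URL."""
    counts: Counter = Counter()
    for source, targets in links.items():
        for target in set(targets):  # count each linking page once
            if target != source:     # ignore self-links
                counts[target] += 1
    return counts


def orphan_pages(all_urls: set[str], links: dict[str, list[str]], home: str = "/") -> set[str]:
    """Known URLs with zero internal inlinks, excluding the home page."""
    counts = inlink_counts(links)
    return {url for url in all_urls if counts[url] == 0 and url != home}


links = {
    "/": ["/blog/", "/pricing/"],
    "/blog/": ["/blog/technical-seo/", "/"],
    "/blog/technical-seo/": ["/pricing/"],
}
all_urls = {"/", "/blog/", "/pricing/", "/blog/technical-seo/", "/old-landing/"}
print(orphan_pages(all_urls, links))
print(inlink_counts(links).most_common(1))
```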
URL structure
The URL is one of the strongest on-page signals. Ground rules:
• Include the primary keyword in the URL, e.g. /technical-seo/
• Keep the URL close to the root — avoid deep hierarchies like /blog/category/subcategory/article/
• Avoid stop words like "for", "and", "to", "of"
• Never change URLs without adding 301 redirects
• Use hyphens, not underscores, as separators
A logical URL structure like /blog/technical-seo/ also helps in analytics — you can easily filter traffic per section.
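The ground rules above can be captured in a small slug helper. This is a sketch: the stop-word list contains only the four examples named above, and the accent-stripping step is an added assumption about how you want non-ASCII titles handled.

```python
import re
import unicodedata

STOP_WORDS = {"for", "and", "to", "of"}  # the examples named above; extend as needed


def slugify(title: str) -> str:
    """Build a root-level, hyphen-separated URL path from a page title."""
    # Strip accents, lowercase, then keep runs of letters and digits only.
    ascii_title = unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode()
    words = re.findall(r"[a-z0-9]+", ascii_title.lower())
    kept = [w for w in words if w not in STOP_WORDS] or words
    return "/" + "-".join(kept) + "/"


print(slugify("Technical SEO: Checklist for Crawling and Indexing"))
print(slugify("core_web_vitals"))
```

Run this only when a page is first published. For existing pages, a changed slug needs a 301 redirect from the old URL.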
Hreflang — multilingual sites
Hreflang tells Google which language and region variant of a page to show different users. On multilingual sites, always implement it with <link rel="alternate"> tags that carry an hreflang attribute, placed in the <head>.
Common mistakes: forgetting to point back (hreflang must be reciprocal), using wrong region codes, and including noindex pages in the hreflang implementation.
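The reciprocity mistake is the easiest one to check automatically. The sketch below assumes you have extracted each page's hreflang annotations into a dictionary; the data structure, function name and URLs are hypothetical.

```python
def missing_return_links(hreflang: dict[str, dict[str, str]]) -> list[tuple[str, str]]:
    """Find (page, alternate) pairs where the alternate does not link back.

    `hreflang` maps each page URL to its {language_code: alternate_url} annotations.
    """
    problems = []
    for page, alternates in hreflang.items():
        for alternate in alternates.values():
            if alternate == page:
                continue  # self-reference, nothing to verify
            back_links = hreflang.get(alternate, {}).values()
            if page not in back_links:
                problems.append((page, alternate))
    return problems


annotations = {
    "https://example.com/en/": {
        "en": "https://example.com/en/",
        "sv": "https://example.com/sv/",
    },
    # The Swedish page forgot to point back to the English one.
    "https://example.com/sv/": {"sv": "https://example.com/sv/"},
}
print(missing_return_links(annotations))
```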
Technical SEO for AI search and GEO (2026)
Google's AI Overviews, ChatGPT, Perplexity and other AI services are reshaping how you need to think about technical SEO. Ranking on page one is no longer enough: you also need to be visible and cited in AI-generated answers.
Structured data matters even more for AI
AI models prefer content with clear structure. FAQPage and HowTo schema make it easier for AI to understand context and cite you correctly. Prioritize these types on pages with high information value.
Clean HTML hierarchy
A logical hierarchy with H1 → H2 → H3 and semantic HTML elements (such as <article>, <section> and <nav>) makes it easier for AI models to parse and understand the content. Avoid building page structure with CSS and divs alone.
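A heading hierarchy can be checked mechanically. The sketch below flags pages without exactly one H1 and headings that skip a level; the class and function names are invented, and it inspects the raw HTML only, so headings injected by JavaScript are not seen.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Record the level of every h1-h6 tag in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.levels: list[int] = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_problems(html: str) -> list[str]:
    outline = HeadingOutline()
    outline.feed(html)
    problems = []
    h1_count = outline.levels.count(1)
    if h1_count != 1:
        problems.append(f"expected exactly one h1, found {h1_count}")
    for previous, current in zip(outline.levels, outline.levels[1:]):
        if current > previous + 1:
            problems.append(f"h{previous} is followed by h{current} (skipped level)")
    return problems


print(heading_problems("<h1>Guide</h1><h2>Crawling</h2><h4>robots.txt</h4>"))
```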
Page speed affects AI crawling
The faster a page loads, the more content AI crawlers can process within their budget. With more AI agents indexing the web — a phenomenon that accelerated after Adobe bought Semrush and positioned itself for GEO — a fast technical foundation matters more than ever.
SEOZ AI Visibility Tracker lets you see how your brand appears and is cited in ChatGPT, Perplexity and Google AI Overviews. It's the natural next step after a technical audit — measuring visibility where more and more buying decisions now begin.
Google Search Console — your most important tool
GSC is free and gives you data no third-party tool can match. In a technical audit, GSC is used for:
• Indexing → Pages: see which pages aren't indexed and why
• Experience → Core Web Vitals: data from real users, not lab tests
• Mobile usability: GSC retired this report in December 2023, so identify mobile-specific problems with Lighthouse or the URL Inspection tool instead
• Search appearance → Rich results: monitor schema errors and opportunities
• Settings → Crawl stats: understand how Googlebot behaves on the site
Tools for technical SEO
• SEOZ AuditWizard: automatic scanning, prioritization and tracking of technical issues
• Google Search Console: free, indispensable, always start here
• Screaming Frog: full crawl covering canonicals, hreflang, H1, internal and external links
• PageSpeed Insights: CWV data for specific URLs
• Rich Results Test: validate structured data
• Siteliner: find broken links and duplicate content
Summary
Good technical SEO isn't about chasing 100/100 in tools. It's about prioritizing the issues that actually prevent Google (and AI search engines) from understanding and valuing your site. Start with the P0 items in the checklist above and work down in priority order.