You already saw the number.
In the WCAG toolkit series I ran a sitemap audit across the full published surface of this portfolio. Thirty-five pages. 5,816 findings. Four CSS-level commits later: seven, all false positives from a subsystem I wrote myself. Zero SERIOUS. Zero AA failures.
That post was about what happened. People asked the better question afterwards: how does an audit find pages the homepage audit never touched? How does one tool turn 5,808 instances of the same bug into a single line in a report instead of 5,808 tickets?
This is the machine. No reveal, no twist - you know how the story ends. Part 1 is the architecture.
Single-page audit is a Lighthouse extension
I’ll say the uncomfortable part first, because it’s the whole reason this feature exists.
A single-URL accessibility audit - point it at a page, get a grade - is a solved problem. Lighthouse does it. axe does it. My own toolkit did it at v0.3. It’s useful, and it tells you almost nothing about whether your site is accessible.
Round 3 of the portfolio audit converged on the homepage. Three runs, zero new findings. “We’re done.” Then I pointed discovery at the router instead of a URL and found nine SERIOUS findings on three pages the homepage audit had no way to see. The homepage was clean. The article pages were not. The episode listings were not. The archive was not.
Convergence on one URL doesn’t mean the site converged. It means that URL did. That gap - between “this page passes” and “this site passes” - is the entire problem class multi-page audit exists to handle.
A flag, not a rewrite
The whole multi-page capability hangs off one flag: --multi-page.
Without it, the toolkit behaves exactly as it did at v0.3. Same single-page audit, byte-identical output. That was a hard design constraint, not a nice-to-have. The moment a tool silently changes what it does between versions, you’ve broken every CI pipeline that trusted it. So multi-page is strictly opt-in, and the old path is frozen.
What the flag plugs in is small to describe, and it is the reason the rest works: a discovery layer in front of the dynamic tester, and a deduper behind it.
single-page (v0.3): --url -> audit -> report
multi-page  (v0.4): --url -> discover routes -> audit each -> dedup -> report
                            [discovery layer]               [deduper]
The audit step in the middle is the same engine I already had. Multi-page doesn’t make the audit smarter. It makes the audit run against the right set of pages, and it makes the output legible when one bug shows up on forty of them.
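To make the "flag, not a rewrite" shape concrete, here is a minimal sketch of how the two paths could share one audit engine. Every name in it (`run`, `Stages`, `Finding`) is hypothetical and chosen for illustration, not taken from the toolkit's source, and the stages are synchronous only to keep the sketch short.

```typescript
// Hypothetical names throughout. Synchronous for brevity; a real audit is async.
type Finding = { ruleId: string; url: string };

interface Stages {
  discover: (baseUrl: string) => string[];   // discovery layer
  audit: (url: string) => Finding[];         // the unchanged single-page engine
  dedup: (findings: Finding[]) => Finding[]; // deduper
}

function run(url: string, multiPage: boolean, stages: Stages): Finding[] {
  // Flag off: the frozen path. Discovery and dedup are never touched,
  // so the output cannot drift from the single-page behaviour.
  if (!multiPage) {
    return stages.audit(url);
  }
  // Flag on: same engine, wrapped by the two new layers.
  const all: Finding[] = [];
  for (const page of stages.discover(url)) {
    all.push(...stages.audit(page));
  }
  return stages.dedup(all);
}
```

The point of the early return is that the single-page path never passes through the new code at all, which is one plausible way to keep its output identical between versions.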
Discovery: three strategies by default, AI on request
Here’s the decision I’d defend hardest, because it’s the one that looks wrong until you’ve paid an API bill.
The discovery dispatcher runs a fallback chain. Default order:
sitemap -> router-scan -> json-config
It tries the first. If that comes back empty, it falls to the next. You can pin one explicitly with --strategy=<name> and skip the chain.
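A fallback chain like this is a few lines of control flow. The sketch below assumes each strategy is a function that returns a list of routes; the strategy names come from this post, but the types, the `discoverRoutes` function, and its `pinned` parameter (standing in for --strategy=<name>) are illustrative, not the toolkit's real API.

```typescript
// Hypothetical shapes; only the strategy names are taken from the post.
type StrategyName = "sitemap" | "router-scan" | "json-config" | "ai";
type Strategy = (target: string) => string[]; // returns discovered routes

// "ai" exists as a strategy but is deliberately absent from the default chain.
const DEFAULT_CHAIN: StrategyName[] = ["sitemap", "router-scan", "json-config"];

function discoverRoutes(
  target: string,
  strategies: Record<StrategyName, Strategy>,
  pinned?: StrategyName, // stands in for --strategy=<name>
): { strategy: StrategyName; routes: string[] } | null {
  // A pinned strategy skips the chain entirely.
  const order = pinned ? [pinned] : DEFAULT_CHAIN;
  for (const name of order) {
    const routes = strategies[name](target);
    if (routes.length > 0) {
      return { strategy: name, routes }; // first non-empty result wins
    }
  }
  return null; // chain exhausted
}
```

Returning the winning strategy's name alongside the routes means a report can say where its page list came from.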
Notice what’s not in the default chain: the AI agent. There are four strategies, but only three run automatically.
Sitemap is the cheapest truth available. If the site ships a sitemap.xml, that’s the post-build reality of what’s actually published - 35 routes for this portfolio. One HTTP fetch, parse, filter out the noise (/og/, /api/, feeds). Confidence 1.0, because it’s not a guess, it’s the build output.
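The parse-and-filter step can be sketched as a pure function over the fetched XML. This is an illustration under assumptions, not the toolkit's parser: `routesFromSitemap` is a hypothetical name, the regex stands in for a real XML parser, and treating any path ending in `.xml` as a feed is my guess at how "feeds" might be recognised.

```typescript
// Prefixes named in the post as noise.
const NOISE_PREFIXES = ["/og/", "/api/"];
// Assumption: feeds are recognised by a .xml suffix (e.g. /rss.xml).
const FEED_SUFFIX = /\.xml$/;

function routesFromSitemap(xml: string): string[] {
  // Regex extraction is a shortcut for the sketch; a real parser should
  // also handle sitemap index files, CDATA, and entity escapes.
  const loc = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  const routes: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = loc.exec(xml)) !== null) {
    // Strip scheme and host, keeping only the path; a bare origin becomes "/".
    const path = match[1].replace(/^https?:\/\/[^/]+/, "") || "/";
    const isNoise =
      NOISE_PREFIXES.some((prefix) => path.startsWith(prefix)) ||
      FEED_SUFFIX.test(path);
    if (!isNoise) routes.push(path);
  }
  return routes;
}
```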
Router-scan is the deterministic fallback when there’s no sitemap or you’re running against a local dev server. It reads package.json, identifies the framework, and walks the source: src/pages/**/*.astro, App Router and Pages Router for Next, vite-plugin-pages config for Vue, and so on. No network, no model, no tokens. It found 11 routes from this portfolio’s source skeleton.
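For the Astro case, the source walk reduces to mapping file paths onto routes. Here is a minimal sketch of that mapping, assuming the file list has already been collected; `astroFileToRoute` is a hypothetical helper, and it only handles the `.astro` pattern mentioned above, not the other page file types Astro accepts.

```typescript
// Hypothetical helper: maps one source file to a route, or null if the
// file is not an Astro page under src/pages/.
function astroFileToRoute(file: string): string | null {
  const match = /^src\/pages\/(.*)\.astro$/.exec(file);
  if (match === null) return null;
  const segments = match[1].split("/");
  // index.astro represents its parent directory, so drop the segment.
  if (segments[segments.length - 1] === "index") segments.pop();
  return "/" + segments.join("/");
}

// Bracketed segments such as [slug] are dynamic: the route is a pattern,
// not a URL, and needs concrete parameters before it can be audited.
function isDynamicRoute(route: string): boolean {
  return /\[[^\]]+\]/.test(route);
}
```

Dynamic routes are one reason a source walk can find fewer routes than a sitemap: one `[slug]` file stands for many published pages.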
JSON config is the escape hatch. A wcag.config.json with an explicit page list and optional auth hooks, for when discovery can’t infer what you want - gated routes, a staging subset, a hand-picked critical path.
The AI agent is strategy four, and it’s deliberately out of the default chain. It dispatches a route-discovery agent through a Claude Code session, reads the framework configs, and returns a structured route list with confidence scoring. It’s the most flexible strategy and the only one that costs money to run. So it’s opt-in: --strategy=ai, or it activates if you already have AI enabled. Nobody gets a surprise token bill because a routine audit decided to think.
That’s the principle the whole series runs on, applied to one feature: deterministic by default, AI only where it earns its place. A sitemap parse and a source walk solve the discovery problem for most projects without a single model call. The agent is there for the projects that need it, not as the front door.
Frameworks: four that know, four that warn
The live teaser for this episode says “8 frameworks.” That’s true in the narrow sense that the detector recognises eight, and misleading in the sense that matters, so here’s the honest version.
Four have a real route-discovery detector: Astro, Next (both App Router and Pages Router), Vue (vite-plugin-pages), and Nuxt (which rides the Vue detector). Point router-scan at any of these and it walks the actual routing structure.
Four are recognised but not implemented: SvelteKit, Remix, Gatsby, React Router. The detector identifies them from package.json and then warns “no detector yet.” You don’t get a route list - you get a clear message telling you to use --strategy=ai or write a config.
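The "recognise, then warn" behaviour can be sketched as a lookup over package.json dependencies. The npm package names below are the real ones for each framework, but which package the toolkit actually keys on is an assumption, as are `detectFramework`, its return shape, and the exact wording after the "no detector implemented yet" phrase.

```typescript
type PackageJson = {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
};

type Detection = { framework: string; supported: boolean; warning?: string };

// Order matters: more specific frameworks come before the libraries they build on.
const FRAMEWORKS: ReadonlyArray<{ name: string; pkg: string; supported: boolean }> = [
  { name: "astro", pkg: "astro", supported: true },
  { name: "next", pkg: "next", supported: true },
  { name: "nuxt", pkg: "nuxt", supported: true }, // rides the Vue detector
  { name: "vue", pkg: "vite-plugin-pages", supported: true },
  { name: "sveltekit", pkg: "@sveltejs/kit", supported: false },
  { name: "remix", pkg: "@remix-run/react", supported: false },
  { name: "gatsby", pkg: "gatsby", supported: false },
  { name: "react-router", pkg: "react-router-dom", supported: false },
];

function detectFramework(pkg: PackageJson): Detection | null {
  const deps = { ...pkg.dependencies, ...pkg.devDependencies };
  for (const fw of FRAMEWORKS) {
    if (!(fw.pkg in deps)) continue;
    if (fw.supported) return { framework: fw.name, supported: true };
    // Recognised but not implemented: no route list, just a clear next step.
    return {
      framework: fw.name,
      supported: false,
      warning: `detected ${fw.name} but no detector implemented yet; use --strategy=ai or write a config`,
    };
  }
  return null; // nothing recognised
}
```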
I left a note to myself in the troubleshooting docs about this. If the tool ever tells you “detected next but no detector implemented yet,” that message is lying - Next has a detector. If you actually see it, you’ve hit one of the four that don’t. Future me will know what that means. Now you do too.
Eight recognised, four fully supported. If you’re a tester, you’d have caught the gap the first time you ran it on SvelteKit, so I’d rather say it up front.
The deduper is the part that matters
Discovery gets you the right pages. The deduper is what makes 5,816 findings survivable.
Run an audit across 35 pages and the naive output is 35 pages × findings per page. The same broken token in a shared component shows up on every page that renders it. The Shiki code-block theme leak in this portfolio appeared on every page with a code block - 5,808 instances of one bug. As 5,808 line items, that report is unreadable and unfixable. It looks like a catastrophe. It’s one CSS variable.
So findings don’t aggregate by count. They aggregate by cause. The deduper groups on a four-part key:
(ruleId, sourceFile, line, selector)
Same rule, same source location, same selector = same bug, regardless of how many URLs it surfaced on. The group collapses to a single canonical finding, and the URLs roll up into an affectedPages array hanging off it.
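Grouping by cause is a single pass over the raw findings. The sketch below uses the four-part key and the `affectedPages` name from above; everything else (`RawFinding`, `CanonicalFinding`, `dedupe`, and the `instances` counter) is a hypothetical shape for illustration rather than the toolkit's actual types.

```typescript
type RawFinding = {
  ruleId: string;
  sourceFile: string;
  line: number;
  selector: string;
  url: string; // the page this instance surfaced on
};

type CanonicalFinding = Omit<RawFinding, "url"> & {
  affectedPages: string[];
  instances: number; // hypothetical: total occurrences across all pages
};

function dedupe(findings: RawFinding[]): CanonicalFinding[] {
  const groups = new Map<string, CanonicalFinding>();
  for (const f of findings) {
    // JSON.stringify on a tuple gives an unambiguous composite key.
    const key = JSON.stringify([f.ruleId, f.sourceFile, f.line, f.selector]);
    let group = groups.get(key);
    if (group === undefined) {
      group = {
        ruleId: f.ruleId,
        sourceFile: f.sourceFile,
        line: f.line,
        selector: f.selector,
        affectedPages: [],
        instances: 0,
      };
      groups.set(key, group);
    }
    group.instances += 1;
    // A page can hit the same bug many times; list it once.
    if (!group.affectedPages.includes(f.url)) group.affectedPages.push(f.url);
  }
  return Array.from(groups.values());
}
```

Keeping an instance count separate from the page list is what lets a report distinguish "many occurrences" from "many pages", since one page can trigger the same finding repeatedly.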
5,808 instances become one finding that says “this appears on these 34 pages.” Fix the variable once, and the next audit shows all 34 green. That’s the line in the report that’s worth the whole feature: single fix -> many pages green. The dependency graph of your bugs, not a flat list of symptoms.
This is also why “5,816 findings” was never the disaster it sounds like. The right question isn’t how many findings - it’s how many distinct bugs. The answer was three: one Shiki config leak (5,808 instances), one badge color, and seven keyboard cycles that turned out to be false positives. Multi-page audit didn’t multiply the work. It surfaced the structure underneath it.
What shipped, and what didn’t
Public v0.4.1 (30 April) ships all of this: the route-discovery package, the three-plus-one strategies, the multi-page orchestrator, cross-page dedup, and heat-map reporting. AGPL-3.0, same as the rest of the toolkit. The discovery layer went in with 47 new hermetic tests on top of the existing suite - sitemap edge cases, dispatcher chain exhaustion, the dedup logic itself.
What did not ship, so I don’t oversell it: the Pro tier’s niche specialists - a modal-specialist and an ecommerce-journey agent - are stubs right now, marked “do not dispatch.” They’re on the roadmap, not in a release. The Pro tier features that are real - trace recording, screenshot sequences, authenticated routes, parallel execution - are the subject of a later part. I’ll draw that line clearly when I get there.
Tomorrow: what this is worth to the person signing off compliance
Part 1 was the engineering. Part 2 is the other half of the question, and it’s the one a CTO actually cares about: if your homepage passes its accessibility audit, what does that tell you about your site? (Less than you’d hope.) What does single-page-green actually cost you when the European Accessibility Act applies to the whole surface, not the landing page? And what does a discovery-driven audit change about that math?
That one’s tomorrow. No code, just the part that shows up on a budget.
The multi-page audit is open source: sdet-wcag-toolkit, AGPL-3.0. Part 2 covers the business case - what site-wide compliance is actually worth and where single-page audits leave you exposed.