Accessibility Testing in CI

Automated accessibility testing in your CI pipeline catches regressions before they reach production. Automated tools cannot catch every accessibility issue (commonly cited estimates put detection at around 30-40% of WCAG violations), but they excel at catching the issues that are most common and most easily introduced by accident: missing alt text, insufficient color contrast, improper heading hierarchy, missing form labels, and ARIA attribute errors.

By running accessibility checks on every pull request, you prevent new accessibility violations from being merged. Over time, this shifts accessibility from a periodic audit concern to a continuous quality standard that is enforced automatically.

Pa11y CI with axe Runner

Pa11y CI is a command-line tool that runs accessibility tests against multiple URLs as part of a CI pipeline. It wraps Pa11y (the underlying test runner) and adds features specifically for CI integration: batch URL testing, JSON and CLI output formats, a configurable failure threshold, and exit codes that CI systems understand.

Pa11y supports multiple testing runners. The axe runner uses the axe-core engine by Deque, which is the same accessibility engine used in the browser-based axe DevTools extension. The axe engine is well-maintained, accurate, and produces actionable results with links to remediation guidance for each violation.

Configuration: The .pa11yci File

Pa11y CI is configured via a .pa11yci JSON file in your project root. This file specifies which URLs to test, which standard to use, and how the runner should behave.

{
  "defaults": {
    "runners": ["axe"],
    "standard": "WCAG2AA",
    "timeout": 30000,
    "wait": 2000,
    "chromeLaunchConfig": {
      "args": ["--no-sandbox", "--disable-setuid-sandbox"]
    }
  },
  "urls": [
    "http://localhost:3000/",
    "http://localhost:3000/about",
    "http://localhost:3000/login",
    "http://localhost:3000/dashboard",
    "http://localhost:3000/contact"
  ]
}

Key configuration options explained:

runners: ["axe"] - Uses the axe-core engine instead of Pa11y's default, HTML CodeSniffer. The option name is plural and takes an array of runner names. axe is generally more accurate and produces better remediation guidance.
standard: "WCAG2AA" - Tests against WCAG 2.1 Level AA, which is the most commonly targeted standard and the level that most accessibility legislation and guidance (ADA, Section 508, EAA) points to. Level A alone leaves significant barriers in place; Level AAA is aspirational and often impractical to fully achieve.
timeout: 30000 - Allows 30 seconds for the whole test of a URL (page load plus the accessibility run) before Pa11y reports a timeout error. Set this high enough that pages with slow API calls or large datasets do not time out, but low enough that genuinely broken pages are caught.
wait: 2000 - Waits 2 seconds after the page loads before running tests. This gives client-side JavaScript time to render dynamic content. Increase this if your application has heavy client-side rendering.
chromeLaunchConfig - Passes launch options to the headless Chrome instance that Pa11y starts through Puppeteer. The --no-sandbox flag is often needed in CI environments (like GitHub Actions) where the Chrome sandbox cannot be used. It disables a security boundary, so use it only against pages you trust.

Practical tip: Start by testing your most important pages — the homepage, login, and primary user flows. Add pages incrementally as you fix existing violations. Trying to test every URL at once on a site with existing accessibility issues will produce an overwhelming number of failures.
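
The incremental approach in the tip above can be sketched as a small Node script that generates the .pa11yci content from a base URL and a short list of priority paths. This is a minimal sketch, not part of Pa11y CI: buildPa11yConfig, priorityPaths, and the PA11Y_BASE_URL environment variable are hypothetical names, and the defaults simply mirror the configuration shown earlier (using Pa11y's runners array option).

```javascript
// Hypothetical helper: build a Pa11y CI config object from a base URL and paths.
function buildPa11yConfig(baseUrl, paths) {
  // Drop trailing slashes so joining never produces "//".
  const origin = baseUrl.replace(/\/+$/, "");
  return {
    defaults: {
      runners: ["axe"],
      standard: "WCAG2AA",
      timeout: 30000,
      wait: 2000,
      chromeLaunchConfig: {
        args: ["--no-sandbox", "--disable-setuid-sandbox"],
      },
    },
    // Accept paths with or without a leading slash.
    urls: paths.map((path) => origin + (path.startsWith("/") ? path : "/" + path)),
  };
}

// Start with the most important pages; append paths as violations are fixed.
const priorityPaths = ["/", "/login", "/dashboard"];
const config = buildPa11yConfig(
  process.env.PA11Y_BASE_URL || "http://localhost:3000",
  priorityPaths
);

// Print JSON that can be redirected into the .pa11yci file.
console.log(JSON.stringify(config, null, 2));
```

Saved under a name of your choosing, the script could be run as `node build-pa11y-config.js > .pa11yci` before the test step, so that adding a page to the gate is a one-line change to priorityPaths.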

Running Against Localhost in CI

Pa11y CI tests against live URLs, which means your application needs to be running during the CI job. The typical pattern is to start your application as a background process, wait for it to be ready, run Pa11y CI, and then stop the application.

# GitHub Actions workflow
name: Accessibility Tests

on: [push, pull_request]

jobs:
  accessibility:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install dependencies
        run: npm ci

      - name: Build application
        run: npm run build

      - name: Start application
        run: npm start &
        env:
          NODE_ENV: test
          PORT: 3000

      - name: Wait for application
        run: npx wait-on http://localhost:3000 --timeout 60000

      - name: Install Pa11y CI
        run: npm install -g pa11y-ci

      - name: Run accessibility tests
        run: pa11y-ci

Important details about running in CI:

Background process: The & at the end of npm start runs the application in the background so subsequent steps can execute.
Wait for ready: The wait-on utility polls the URL until it responds with a 2xx status (for http:// resources it sends HEAD requests). Without this step, Pa11y might start testing before the application is ready, causing false failures.
Environment variables: Use a test or CI-specific environment to avoid hitting production APIs or databases during testing.
Headless Chrome: CI environments run without a display. Pa11y drives headless Chrome through Puppeteer by default, so no display server is needed, although some environments still require the sandbox flags set in chromeLaunchConfig.
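
To make the "wait for ready" step less of a black box, here is a rough sketch of the polling idea behind it. It is not wait-on's actual implementation: waitForReady and its option names are hypothetical, the default probe assumes Node 18 or later (for the global fetch), and the probe is injectable so the loop can be exercised without a running server.

```javascript
// Hypothetical readiness poll: resolves with the number of attempts once the
// probe reports success, or rejects when the deadline passes.
async function waitForReady(url, options = {}) {
  const {
    timeoutMs = 60000,
    intervalMs = 500,
    // Default probe: a HEAD request that treats any 2xx response as ready.
    probe = async (target) => (await fetch(target, { method: "HEAD" })).ok,
  } = options;

  const deadline = Date.now() + timeoutMs;
  let attempts = 0;
  while (Date.now() <= deadline) {
    attempts += 1;
    try {
      if (await probe(url)) return attempts;
    } catch {
      // A refused connection usually means the server is not listening yet.
    }
    // Pause before the next attempt.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Timed out after ${timeoutMs} ms waiting for ${url}`);
}
```

A script could call `await waitForReady("http://localhost:3000")` before launching the tests. In a workflow, the existing wait-on step is the simpler choice; the sketch only shows why a fixed sleep is not needed.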

Interpreting Results

Pa11y CI reports violations in a structured format. Each violation includes the rule that was violated, the severity level, the HTML element that triggered the violation, and a recommendation for how to fix it.

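Understanding severity levels (by default Pa11y reports only errors; warnings and notices appear when the includeWarnings and includeNotices options are enabled):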

Error: The element definitively violates a WCAG criterion. For example, an image with no alt attribute, a form input with no associated label, or text with insufficient contrast ratio. These must be fixed.
Warning: The element might violate a criterion, but the tool cannot determine definitively. For example, an image has an alt attribute, but the tool cannot verify that the alt text is meaningful. These should be manually reviewed.
Notice: The element has a characteristic that warrants manual inspection. For example, a link opens in a new window — the tool flags this for manual verification that the user is warned. These are informational and often do not require action.
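
When a report is long, a short script can condense it by severity and by rule. The sketch below assumes a report object shaped like Pa11y CI's JSON output, with a results map keyed by URL and issues carrying type and code fields; that shape is an assumption to verify against the output of your version, and summarizeReport and sampleReport are hypothetical names with made-up sample data.

```javascript
// Hypothetical helper: count issues per severity and per rule code.
function summarizeReport(report) {
  const byType = { error: 0, warning: 0, notice: 0 };
  const byRule = {};
  for (const issues of Object.values(report.results)) {
    for (const issue of issues) {
      // Entries without a rule code (for example page-load failures) are skipped.
      if (!issue.code) continue;
      if (issue.type in byType) byType[issue.type] += 1;
      byRule[issue.code] = (byRule[issue.code] || 0) + 1;
    }
  }
  return { byType, byRule };
}

// Illustrative data only, not real output.
const sampleReport = {
  results: {
    "http://localhost:3000/": [
      { type: "error", code: "image-alt", selector: "#hero > img" },
      { type: "error", code: "color-contrast", selector: "footer > p" },
    ],
    "http://localhost:3000/login": [
      { type: "error", code: "label", selector: "#email" },
      { type: "warning", code: "color-contrast", selector: "button.primary" },
    ],
  },
};

const summary = summarizeReport(sampleReport);
console.log(summary.byType);
console.log(summary.byRule);
```

Sorting byRule by count is a quick way to find the rule whose fix removes the most failures at once.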

Common violations and what they mean:

image-alt: An <img> element is missing an alt attribute. Screen reader users cannot understand what the image conveys. Fix: add descriptive alt text, or alt="" if the image is purely decorative.
color-contrast: Text does not have sufficient contrast against its background. The minimum ratio is 4.5:1 for normal text and 3:1 for large text (at least 18pt, about 24px, or at least 14pt bold, about 18.7px). Fix: darken the text or lighten the background.
label: A form input has no associated <label> element. Screen reader users cannot determine what the input is for. Fix: add a <label> with a matching for attribute, or use aria-label.
heading-order: Headings skip levels (e.g., <h1> followed by <h3> with no <h2>). Screen reader users navigate by heading level, so skipping levels breaks their navigation flow. axe classifies this rule as a best practice rather than a strict WCAG failure. Fix: use headings in sequential order.
link-name: A link has no discernible text content. Links with only an icon and no text or aria-label are announced by screen readers without a meaningful name, so users cannot tell where they lead. Fix: add descriptive text or an aria-label attribute.
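
The contrast thresholds behind the color-contrast rule come from a formula that is easy to compute by hand. The sketch below follows the relative luminance and contrast ratio definitions from WCAG 2.1; the function names are hypothetical, and it assumes opaque colors written as six-digit "#rrggbb" values, whereas a real checker such as axe-core also has to resolve transparency, background images, and computed font sizes.

```javascript
// Convert one 8-bit sRGB channel to its linear value (WCAG 2.1 definition).
function toLinear(channel) {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a "#rrggbb" color.
function relativeLuminance(hex) {
  const r = parseInt(hex.slice(1, 3), 16);
  const g = parseInt(hex.slice(3, 5), 16);
  const b = parseInt(hex.slice(5, 7), 16);
  return 0.2126 * toLinear(r) + 0.7152 * toLinear(g) + 0.0722 * toLinear(b);
}

// Contrast ratio: lighter luminance over darker, each offset by 0.05.
function contrastRatio(foreground, background) {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// Level AA thresholds: 4.5:1 for normal text, 3:1 for large text.
function meetsAA(ratio, isLargeText) {
  return ratio >= (isLargeText ? 3 : 4.5);
}

const ratio = contrastRatio("#767676", "#ffffff");
console.log(ratio.toFixed(2), meetsAA(ratio, false));
```

Black on white gives the maximum ratio of 21:1. Mid-gray text on white sits close to the 4.5:1 boundary, which is why small color tweaks so often trigger this rule.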

A Real Example: CodeFrog's Accessibility Pipeline

The CodeFrog project itself uses automated accessibility testing as part of its CI pipeline. This serves as both a quality gate for the product and a real-world demonstration of the practices described in this guide.

The CodeFrog accessibility pipeline:

Runs on every pull request so accessibility regressions are caught before merging.
Tests all public pages of the CodeFrog website against WCAG 2.1 Level AA.
Uses Pa11y CI with the axe runner for accurate, actionable results.
Enforces zero violations as a quality gate — any new accessibility violation causes the build to fail.
Reports results as GitHub Actions annotations so developers see violations directly in the pull request interface.

This approach means that every page on the CodeFrog website has passed the automated accessibility checks. New pages must pass accessibility testing before they can be merged, and existing pages are continuously verified to prevent regressions.

Key insight: Eating your own dog food — using the same quality practices you recommend — is the strongest form of credibility. The CodeFrog website is tested with the same tools and standards it helps other teams implement.

Combining with Lighthouse CI

Lighthouse CI is Google's tool for running Lighthouse audits in CI pipelines. While Pa11y CI focuses specifically on accessibility, Lighthouse CI provides a broader quality check that includes performance, accessibility, best practices, and SEO scores.

Running both tools together gives you comprehensive quality coverage:

Pa11y CI (with axe): Deep, focused accessibility testing with detailed violation reports and specific remediation guidance. Best for ensuring WCAG compliance.
Lighthouse CI: Broader quality metrics including performance scores, accessibility scores, SEO scores, and best practices. Best for tracking overall page quality trends over time.

The two tools overlap in accessibility testing, but they are not redundant. Lighthouse's accessibility audits are themselves built on axe-core, but Lighthouse runs only a subset of the axe rules and condenses the results into a weighted score, while Pa11y with the axe runner reports each violation individually. Running both gives you a detailed violation list plus a score you can track over time, which provides some defense in depth.

// lighthouserc.js configuration
module.exports = {
  ci: {
    collect: {
      url: [
        'http://localhost:3000/',
        'http://localhost:3000/about',
        'http://localhost:3000/login'
      ],
      numberOfRuns: 3
    },
    assert: {
      assertions: {
        'categories:accessibility': ['error', { minScore: 0.95 }],
        'categories:performance': ['warn', { minScore: 0.80 }],
        'categories:seo': ['warn', { minScore: 0.90 }]
      }
    }
  }
};

Setting Failure Thresholds

How strict should your accessibility CI gate be? There are two main approaches:

Zero violations (recommended for new projects): The build fails if any accessibility violation is detected. This is the strictest approach and ensures that no new accessibility issues are ever introduced. It works well for new projects or projects that have already resolved all existing violations.

Allowlisting known issues (practical for existing projects): If your project has existing accessibility violations that cannot be fixed immediately, you can configure Pa11y to ignore specific rules (the ignore option) or hide specific elements from testing (the hideElements option, which takes a CSS selector) while still catching new violations. This lets you adopt CI-based testing without being blocked by legacy issues that need more time to resolve.

.pa11yci with rule exceptions for legacy issues:
{
  "defaults": {
    "runners": ["axe"],
    "standard": "WCAG2AA",
    "ignore": [
      "color-contrast"
    ]
  },
  "urls": [
    {
      "url": "http://localhost:3000/legacy-page",
      "ignore": ["heading-order", "image-alt"]
    },
    "http://localhost:3000/new-page"
  ]
}

Important: If you use an allowlist, track it. Create tickets for every allowed violation, and treat the allowlist as technical debt to be paid down. An allowlist that grows over time defeats the purpose of automated testing. The goal is always to reach zero violations.

A practical adoption path for existing projects:

Baseline: Run Pa11y CI against your site and document all existing violations.
Allowlist: Configure Pa11y to ignore the existing violations.
Gate: Enable the CI gate so new violations are blocked immediately.
Remediate: Fix existing violations sprint by sprint, removing them from the allowlist as you go.
Strict mode: Once all violations are fixed, remove the allowlist entirely and enforce zero violations.
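
The Baseline and Allowlist steps above can be partly automated. The sketch below turns a baseline report into a urls array with per-URL ignore lists. It assumes the same report shape as Pa11y CI's JSON output (a results map keyed by URL, issues with type and code), which should be checked against your version; buildAllowlistUrls and the sample baseline are hypothetical.

```javascript
// Hypothetical helper: derive per-URL ignore lists from a baseline report.
function buildAllowlistUrls(baselineReport) {
  return Object.entries(baselineReport.results).map(([url, issues]) => {
    // Collect each violated rule once per page.
    const codes = new Set(
      issues
        .filter((issue) => issue.type === "error" && issue.code)
        .map((issue) => issue.code)
    );
    // Pages without known violations stay plain strings and are gated strictly.
    if (codes.size === 0) return url;
    return { url, ignore: [...codes].sort() };
  });
}

// Illustrative baseline only, not real output.
const baseline = {
  results: {
    "http://localhost:3000/legacy-page": [
      { type: "error", code: "image-alt" },
      { type: "error", code: "heading-order" },
      { type: "error", code: "image-alt" },
    ],
    "http://localhost:3000/new-page": [],
  },
};

const urls = buildAllowlistUrls(baseline);
console.log(JSON.stringify({ urls }, null, 2));
```

One trade-off to keep in mind: ignoring a rule for a page also hides new violations of that same rule on that page, so the allowlist is a coarse gate. Re-running the baseline after each remediation sprint and regenerating the list keeps it shrinking rather than growing.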

Resources

Pa11y CI Documentation — Configuration reference, runner options, and CI integration guides
axe-core — The accessibility testing engine used by Pa11y's axe runner, with rule descriptions and remediation guidance
Lighthouse CI — Run Lighthouse audits in CI for performance, accessibility, SEO, and best practices scoring