Automated Quality Gates
A quality gate is a set of pass/fail criteria that code must satisfy before it is allowed to proceed to the next stage of the pipeline — typically before merging into the main branch or before deploying to production. Without quality gates, your CI/CD pipeline is just a notification system: it tells you about problems but does not prevent them from reaching users. Quality gates turn notifications into enforcement.
This lesson covers how to define meaningful quality gate criteria across multiple quality dimensions, how to implement them in your pipeline, and how to gradually raise your quality standards without grinding development to a halt.
What Makes a Good Quality Gate?
An effective quality gate has several properties:
- Automated: It runs without human intervention on every commit or pull request. Manual approval steps have their place, but the core quality checks must be automated.
- Fast: A quality gate that takes 30 minutes discourages developers from running it. Aim for total pipeline time under 10 minutes for the checks that block merging.
- Deterministic: The same code should produce the same result every time. Flaky tests and intermittent failures erode trust in the gate, and teams start ignoring failures.
- Actionable: When a gate fails, the developer must be able to understand what failed and how to fix it. Cryptic error messages or opaque failure conditions defeat the purpose.
- Gradual: You can start with lenient thresholds and tighten them over time as the codebase improves. Trying to enforce perfection on day one leads to frustration.
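These properties map naturally onto a CI workflow where each gate is a separate, required job. The skeleton below is illustrative only; the job names, commands, and action versions are placeholder choices, not a prescribed setup:

```yaml
# .github/workflows/quality-gates.yml (illustrative skeleton)
name: Quality Gates
on: [pull_request]

jobs:
  tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test -- --coverage   # fails if coverage thresholds are not met

  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm audit --audit-level=high
```

Marking each job as a required status check in branch protection (covered below) is what turns it from a report into an enforced gate.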
Test Coverage Thresholds
Test coverage measures the percentage of your code that is exercised by automated tests. While coverage is not a perfect proxy for test quality, it is a useful baseline metric. A project with 0% coverage has no automated safety net; a project with 80% coverage has significantly more confidence in its changes.
Most test frameworks support coverage thresholds that cause the test run to fail if coverage drops below a specified level:
```javascript
// jest.config.js
module.exports = {
  coverageThreshold: {
    global: {
      branches: 80,
      functions: 80,
      lines: 80,
      statements: 80
    }
  }
};
```
In your CI pipeline, this translates to a simple step:
```yaml
- name: Run tests with coverage enforcement
  run: npm test -- --coverage
```
If coverage drops below 80% on any metric, the step fails, and the pull request is blocked from merging.
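For coverage tools without a built-in threshold option, the same gate is easy to enforce yourself by reading the machine-readable summary most reporters emit (Istanbul-style reporters, including Jest's, write coverage/coverage-summary.json). A minimal sketch, assuming that summary shape:

```javascript
// enforce-coverage.js — a sketch of a coverage gate for tools without
// native thresholds. Assumes an Istanbul-style summary with a "total"
// section containing per-metric percentages under "pct".
const THRESHOLD = 80;

function checkCoverage(summary, threshold = THRESHOLD) {
  const failures = [];
  for (const metric of ["branches", "functions", "lines", "statements"]) {
    const pct = summary.total[metric].pct;
    if (pct < threshold) {
      failures.push(`${metric}: ${pct}% < ${threshold}%`);
    }
  }
  return failures; // an empty array means the gate passes
}

// Example CLI usage (uncomment to run against a real summary file):
// const summary = JSON.parse(
//   require("fs").readFileSync("coverage/coverage-summary.json", "utf8"));
// const failures = checkCoverage(summary);
// if (failures.length) { console.error(failures.join("\n")); process.exit(1); }
```

In CI, exiting non-zero when `failures` is non-empty is all that is needed to block the merge.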
Zero Critical Accessibility Violations
Accessibility violations have varying severity. A missing alt attribute on a decorative image is a minor issue. A form without any labels is a critical issue that prevents some users from completing essential tasks. A reasonable quality gate enforces zero critical and serious violations while allowing minor issues to be addressed over time.
Pa11y CI supports threshold configuration:
```json
{
  "defaults": {
    "standard": "WCAG2AA",
    "timeout": 10000,
    "threshold": 10
  },
  "urls": [
    "http://localhost:3000/",
    "http://localhost:3000/signup",
    "http://localhost:3000/dashboard"
  ]
}
```
Setting "threshold": 10 allows up to 10 violations before failing. Teams should lower this threshold over time toward zero as violations are resolved. The goal is zero, but the path there should be incremental to avoid blocking all development.
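If you drive Pa11y programmatically rather than through Pa11y CI, the same threshold logic is straightforward to reproduce. The sketch below follows the documented Pa11y issue shape, where each issue has a type of "error", "warning", or "notice", but treat it as illustrative:

```javascript
// A sketch of Pa11y-CI-style threshold logic for programmatic runs.
// Only "error"-type issues count against the threshold here; warnings
// and notices are tracked but do not block.
function passesThreshold(issues, threshold) {
  const errors = issues.filter((issue) => issue.type === "error");
  return { pass: errors.length <= threshold, errorCount: errors.length };
}
```

Lowering `threshold` each sprint gives you the same incremental ratchet described above.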
For axe-core based tools, you can configure which rules are enabled. Note that axe-core exposes an impact field on each result (with values like "minor", "moderate", "serious", and "critical"), but severity-based failure logic is not a native axe-core runtime option. Instead, you must filter results after execution and apply your own failure threshold based on the impact field:
```javascript
// axe configuration (runtime options)
{
  "rules": {
    "color-contrast": { "enabled": true },
    "image-alt": { "enabled": true }
  },
  "resultTypes": ["violations"]
}

// After execution, filter results by impact level:
// violations.filter(v => v.impact === "serious" || v.impact === "critical")
```
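Wrapped in a small gate function, that post-run filter looks like this. The `violations` array shape follows what axe.run() returns, with each entry carrying an `impact` field:

```javascript
// A sketch of post-run severity filtering for axe-core results.
// Impact values are "minor", "moderate", "serious", or "critical";
// only the last two block the gate here.
const BLOCKING_IMPACTS = new Set(["serious", "critical"]);

function enforceAxeGate(violations) {
  const blocking = violations.filter((v) => BLOCKING_IMPACTS.has(v.impact));
  return { pass: blocking.length === 0, blocking };
}
```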
Zero High/Critical Security Findings
Security scanning should be a mandatory quality gate. The key is to set the severity threshold appropriately so that critical vulnerabilities block merges while informational findings are logged but do not break the build:
```yaml
# npm audit with severity threshold
- name: Security audit
  run: npm audit --audit-level=high

# Gitleaks for secrets detection (any leak is critical)
- name: Scan for secrets
  uses: gitleaks/gitleaks-action@v2
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

# OWASP Dependency-Check with failure threshold
- name: Dependency check
  run: |
    dependency-check --project myapp --scan . \
      --failOnCVSS 7 \
      --format HTML --format JSON
```
The --audit-level=high flag tells npm audit to only fail on high and critical vulnerabilities. The --failOnCVSS 7 flag tells OWASP Dependency-Check to fail on vulnerabilities with a CVSS score of 7 or higher (high and critical). Secrets detection should always block — there is no acceptable threshold for leaked credentials.
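When the built-in flag is not flexible enough (for example, to allowlist a known finding), you can gate on the JSON report instead. The sketch below assumes the npm 7+ report shape, where severity counts live under metadata.vulnerabilities; verify the shape against your npm version before relying on it:

```javascript
// A sketch of a custom security gate over `npm audit --json` output.
// The report shape (metadata.vulnerabilities with per-severity counts)
// is based on npm 7+ and should be verified for your toolchain.
function auditGate(report, blockingLevels = ["high", "critical"]) {
  const counts = report.metadata.vulnerabilities;
  const blocking = blockingLevels.reduce(
    (sum, level) => sum + (counts[level] || 0),
    0
  );
  return { pass: blocking === 0, blocking };
}
```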
HTML Validation
Invalid HTML can cause rendering inconsistencies, accessibility failures, and SEO problems. An HTML validation quality gate ensures that your pages are well-formed:
```yaml
- name: Validate HTML
  run: |
    npx html-validate "dist/**/*.html" --formatter stylish
```
The html-validate tool checks for common HTML errors: unclosed tags, invalid nesting, deprecated elements, missing required attributes, and more. Configure rules in an .htmlvalidate.json file:
```json
{
  "extends": ["html-validate:recommended"],
  "rules": {
    "no-trailing-whitespace": "off",
    "require-sri": "off"
  }
}
```
Branch Protection Rules
Quality checks only matter if they are enforced. GitHub branch protection rules (and their equivalents in GitLab, Bitbucket, and Azure DevOps) prevent merges when required status checks fail:
- Require status checks to pass: Select the CI jobs that must succeed before a PR can merge. This is the most critical setting.
- Require branches to be up to date: Ensures the PR branch includes the latest changes from main, so the checks ran against the current state of the codebase.
- Require pull request reviews: Require at least one approval (or more) from team members before merging. Combine automated checks with human review for comprehensive quality assurance.
- Include administrators: Even repository admins cannot bypass the quality gates. This prevents the "just this once" override that inevitably becomes the norm.
- Require linear history: Prevents merge commits, enforcing a cleaner git history that is easier to bisect when tracking down regressions.
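These settings can also be applied via GitHub's REST API (PUT /repos/{owner}/{repo}/branches/{branch}/protection), which is useful for keeping protection rules consistent across many repositories. The sketch below only builds the request body; the field names follow the REST API but should be double-checked against current GitHub documentation:

```javascript
// A sketch of the request body for GitHub's branch protection endpoint.
// Field names follow the REST API; verify against current GitHub docs.
function branchProtectionPayload(requiredChecks) {
  return {
    required_status_checks: {
      strict: true,               // "require branches to be up to date"
      contexts: requiredChecks,   // CI job names that must pass
    },
    enforce_admins: true,         // "include administrators"
    required_pull_request_reviews: {
      required_approving_review_count: 1,
    },
    restrictions: null,           // no push restrictions
    required_linear_history: true,
  };
}
```

Sending this payload for each repository from a small script makes the gates auditable and reproducible rather than hand-configured.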
SonarQube Quality Gates
SonarQube is a dedicated code quality platform that provides its own quality gate mechanism. A SonarQube quality gate evaluates code against a set of conditions and returns pass or fail. The default "Sonar Way" quality gate requires:
- No new bugs
- No new vulnerabilities
- No new security hotspots reviewed as unsafe
- New code coverage above 80%
- New code duplication below 3%
Notice that SonarQube focuses on new code by default. This is a pragmatic approach: it does not punish teams for legacy code quality issues but ensures that all new code meets a high standard. Over time, as legacy code is refactored, the overall quality improves.
Integrating SonarQube into your pipeline looks like this in GitHub Actions:
```yaml
- name: SonarQube Scan
  uses: sonarsource/sonarqube-scan-action@v7.0.0
  env:
    SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
    SONAR_HOST_URL: ${{ secrets.SONAR_HOST_URL }}

- name: SonarQube Quality Gate
  uses: sonarsource/sonarqube-quality-gate-action@v1.2.0
  timeout-minutes: 5
  env:
    SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
```
The quality gate action polls SonarQube for the analysis result and fails the pipeline step if the quality gate does not pass.
CodeFrog Mega Report Grades
CodeFrog generates a comprehensive quality report (the Mega Report) that grades a website across multiple dimensions: accessibility, security, performance, SEO, HTML validation, and more. Each dimension receives a letter grade from A to F, and the overall report provides a composite grade.
You can use these grades as quality gate criteria. For example, you might require that no dimension scores below a B before a deployment is allowed, or that the overall grade must be A or B. This approach provides a holistic quality gate that considers all dimensions simultaneously rather than checking them individually.
The advantage of a composite quality gate is that it prevents teams from optimizing one dimension at the expense of others. A site that has perfect accessibility but critical security vulnerabilities should not pass the gate, and vice versa.
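A composite gate of this kind reduces to a simple check over per-dimension grades. The sketch below is illustrative only: the grade scale and the "no dimension below B" rule are assumptions for the example, not CodeFrog's API:

```javascript
// A sketch of a composite gate over per-dimension letter grades.
// The scale and minimum-grade rule are illustrative assumptions.
const GRADE_ORDER = { A: 4, B: 3, C: 2, D: 1, F: 0 };

function compositeGate(grades, minimumGrade = "B") {
  const floor = GRADE_ORDER[minimumGrade];
  const failing = Object.entries(grades)
    .filter(([, grade]) => GRADE_ORDER[grade] < floor)
    .map(([dimension, grade]) => `${dimension}: ${grade}`);
  return { pass: failing.length === 0, failing };
}
```

Note that the gate fails on the *worst* dimension, not the average, which is exactly what prevents trading one quality dimension off against another.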
Gradually Raising Quality Standards
One of the biggest mistakes teams make with quality gates is trying to enforce everything at once. If your project currently has 30% test coverage, 47 accessibility violations, and several known security issues, setting the gate to require 80% coverage, zero accessibility violations, and zero security findings will block all development.
Instead, use a phased approach:
- Phase 1 — Establish baselines: Run all quality checks but do not block merges. Collect data on the current state. Share the results with the team.
- Phase 2 — Prevent regression: Set thresholds at the current level. The gate ensures things do not get worse, even if they are not yet good.
- Phase 3 — Incremental improvement: Each sprint, tighten one threshold slightly. Increase coverage by 2%, reduce the accessibility violation threshold by 5, move the security audit level from critical to high.
- Phase 4 — Target state: Over several months, reach your target quality standards. The team has adapted to the gates, the codebase has improved, and the quality bar feels natural rather than oppressive.
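Phases 2 and 3 can be automated as a "ratchet": store the current level as a baseline (e.g., in a committed file), block any regression below it, and raise it whenever the metric improves. A minimal sketch for a single metric such as coverage, with the baseline passed in directly for illustration:

```javascript
// A sketch of a prevent-regression, ratchet-upward gate for one metric.
// In practice the baseline would be read from and written back to a
// committed file so improvements are locked in automatically.
function ratchetGate(current, baseline) {
  if (current < baseline) {
    return { pass: false, newBaseline: baseline }; // regression: block the merge
  }
  // Pass, and lock in any improvement as the new floor.
  return { pass: true, newBaseline: Math.max(baseline, current) };
}
```

With this in place, the threshold tightens itself as the codebase improves, and no one has to remember to update it.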
Common Quality Gate Criteria Summary
Here is a reference table of quality gate criteria organized by dimension:
- Tests: All tests pass. Coverage above threshold (e.g., 80%). No test regressions.
- Accessibility: Zero critical/serious WCAG 2.1 AA violations detectable by automated tools (note: automated checks cover only a subset of WCAG criteria; full conformance requires manual testing). Pa11y CI exits cleanly.
- Security: Zero high/critical dependency vulnerabilities. Zero leaked secrets. Security headers present.
- Code Quality: Zero linting errors. Formatting consistent. HTML valid. No new code smells above threshold.
- Performance: Lighthouse performance score above threshold (e.g., 90). Core Web Vitals within "Good" range.
- SEO: Lighthouse SEO score above threshold. Required meta tags present. Structured data valid.
Not every project needs every gate. Start with the dimensions that matter most to your users and your business, and expand from there.
Resources
- SonarQube Quality Gates Documentation — How to define, configure, and enforce quality gates in SonarQube
- GitHub Branch Protection Documentation — How to configure required status checks, reviews, and other branch protection settings