Analytics and Real User Monitoring

Performance testing in a lab environment tells you how fast your site could be. Analytics and Real User Monitoring (RUM) tell you how fast it actually is for the people who use it. The gap between lab results and real-world performance is often significant, because lab tests run on powerful machines with fast connections, while real users browse on mid-range phones over cellular networks from locations far from your servers.

Understanding real user behavior and real user performance is essential for quality engineering. It shifts your perspective from "does this work on my machine?" to "does this work for my users?" — and those are very different questions.

Google Analytics 4 (GA4)

Google Analytics 4 is the current generation of Google's analytics platform. It replaced Universal Analytics and introduced an event-based data model where every interaction is tracked as an event rather than fitting into the older session-and-pageview paradigm. GA4 is free for most use cases and provides insights into how users find, navigate, and interact with your website.

Key capabilities of GA4 for quality engineering:

  • User flow analysis: Understand the paths users take through your site. Identify pages where users drop off, which often indicates usability problems, confusing navigation, or performance issues.
  • Engagement metrics: GA4 tracks engaged sessions (sessions lasting more than 10 seconds, with a conversion event, or with 2 or more page views). Low engagement rates on specific pages can signal quality problems.
  • Event tracking: Custom events let you measure specific interactions like form submissions, button clicks, video plays, and error occurrences. You can track JavaScript errors as custom events to correlate error rates with user behavior (a sketch of this follows the list).
  • Audience segmentation: Break down data by device type, browser, geography, and other dimensions. If your bounce rate is high on mobile but fine on desktop, that tells you exactly where to focus your quality improvement efforts.
  • Core Web Vitals integration: GA4 can receive Core Web Vitals data, giving you performance metrics alongside behavioral data in a single dashboard.
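
As an illustration of the error-tracking idea above, the sketch below reports uncaught JavaScript errors to GA4 as custom events through gtag.js. It assumes the standard gtag.js snippet is already installed on the page; the event name js_error and its parameter names are illustrative choices, not names required by GA4.

// Report uncaught errors to GA4 as a custom event (sketch; assumes gtag.js is loaded).
window.addEventListener('error', (event) => {
  gtag('event', 'js_error', {
    error_message: event.message,
    error_source: event.filename,
    page_path: location.pathname
  });
});

// Unhandled promise rejections are reported separately.
window.addEventListener('unhandledrejection', (event) => {
  gtag('event', 'js_error', {
    error_message: String(event.reason),
    page_path: location.pathname
  });
});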

Real User Monitoring vs. Synthetic Monitoring

These two approaches to performance monitoring are complementary, not competing. Understanding when to use each is important:

Real User Monitoring (RUM) collects performance data from actual users as they interact with your site. A small JavaScript snippet in your pages measures loading times, Core Web Vitals, and other metrics for every page load, then sends that data to a collection endpoint. The result is a complete picture of how your site performs across the full spectrum of user devices, networks, locations, and behaviors.

Strengths of RUM:

  • Reflects actual user experience, including the long tail of slow connections and old devices.
  • Captures the full distribution of performance — not just the median, but the 75th and 95th percentiles where real problems hide.
  • Shows performance by segment: you can see that users in South America have 3x slower load times than users in North America, or that Chrome performs differently from Safari.
  • Measures metrics that lab tests cannot capture faithfully, such as Interaction to Next Paint, which requires real user interactions.

Synthetic monitoring runs scripted tests from controlled locations on a regular schedule. A monitoring service loads your pages from its servers (typically using headless Chrome) and measures the results. Think of it like an automated quality inspector running the same test every 5 minutes from 10 locations around the world.

Strengths of synthetic monitoring:

  • Consistent baseline for tracking changes over time. Since the test conditions are controlled, you can confidently attribute performance changes to code or infrastructure changes rather than shifts in user mix.
  • Detects regressions before users encounter them. If a deployment makes your checkout page 2 seconds slower, synthetic monitoring catches it immediately — even at 3 AM when no real users are active.
  • Can test pages that have no traffic yet (pre-launch, staging environments).
  • Can test complex user flows (login, add to cart, checkout) end-to-end on a schedule; a minimal scripted check is sketched after this list.
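
As a minimal example of a scripted check, the sketch below uses Playwright to load a page, read its Navigation Timing data, and fail the run when load time exceeds a threshold. The URL and the 3-second threshold are placeholders; in practice you would run this from a scheduler or CI job and alert on failure.

// Minimal synthetic check sketch using Playwright; values are placeholders.
const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com/checkout', { waitUntil: 'load' });

  // Read load time from the Navigation Timing API inside the page.
  const loadTimeMs = await page.evaluate(() => {
    const [nav] = performance.getEntriesByType('navigation');
    return nav.loadEventEnd - nav.startTime;
  });

  console.log(`Load time: ${Math.round(loadTimeMs)} ms`);
  if (loadTimeMs > 3000) {
    process.exitCode = 1; // non-zero exit lets the scheduler raise an alert
  }
  await browser.close();
})();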

The best monitoring strategy uses both: synthetic monitoring for consistent regression detection and baseline tracking, and RUM for understanding the real user experience.

Core Web Vitals in the Field

Core Web Vitals are Google's standardized metrics for measuring user experience on the web. While lab tools like Lighthouse measure these in a simulated environment, the values that matter most are the ones measured from real users in the field.

The web-vitals JavaScript library is Google's official library for measuring Core Web Vitals in the browser. It is lightweight (under 2 KB gzipped) and provides accurate measurements of:

  • Largest Contentful Paint (LCP): How long it takes for the largest visible content element (image, heading, text block) to render. Good scores are under 2.5 seconds.
  • Interaction to Next Paint (INP): How responsive the page is to user interactions. It measures the latency from a user input (click, tap, key press) to the next visual update. Good scores are under 200 milliseconds.
  • Cumulative Layout Shift (CLS): How much the page layout shifts unexpectedly during loading. Good scores are under 0.1.

To use the web-vitals library, install it via npm or load it from a CDN, then call the metric functions with a callback that sends the data to your analytics endpoint:

import {onLCP, onINP, onCLS} from 'web-vitals';

// Each on* function registers a callback that fires once the metric's value
// is ready to report (for INP and CLS, typically when the page is hidden).
onLCP(sendToAnalytics);
onINP(sendToAnalytics);
onCLS(sendToAnalytics);

function sendToAnalytics(metric) {
  // Send to your analytics collection endpoint; sendBeacon queues the
  // request so it is delivered even if the page is unloading.
  navigator.sendBeacon('/analytics', JSON.stringify({
    name: metric.name,
    value: metric.value,
    id: metric.id
  }));
}
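
The /analytics endpoint above is whatever you set up to collect the beacons. A minimal sketch using Express (an assumption; any server stack works) might look like the following. Note that sendBeacon sends string payloads with a text/plain content type, so the body is parsed manually.

// Minimal collection endpoint sketch using Express (an assumption, not part of web-vitals).
const express = require('express');
const app = express();

// sendBeacon string payloads arrive as text/plain, so accept any content type as text.
app.use(express.text({ type: '*/*' }));

app.post('/analytics', (req, res) => {
  const metric = JSON.parse(req.body);
  console.log(`${metric.name}: ${metric.value} (${metric.id})`);
  // In production, forward to a database or analytics pipeline instead of logging.
  res.status(204).end();
});

app.listen(3000);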

Chrome User Experience Report (CrUX)

The Chrome User Experience Report is a public dataset of real user performance data collected from Chrome users who have opted in to sharing usage statistics. It provides field data for millions of websites, aggregated at the origin level (and for individual URLs with sufficient traffic).

CrUX data is significant because it is the dataset Google uses to assess Core Web Vitals for search ranking purposes. If your CrUX data shows poor performance, it can affect your search visibility. You can access CrUX data through several channels:

  • PageSpeed Insights: Enter any URL to see its CrUX data alongside lab results from Lighthouse.
  • Google Search Console: The Core Web Vitals report shows CrUX data for all pages on your site, grouped by status (good, needs improvement, poor).
  • CrUX API: Programmatic access to CrUX data for integration into your own dashboards and monitoring tools (a query sketch follows this list).
  • BigQuery: The full CrUX dataset is available in BigQuery for advanced analysis and comparisons across sites.
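
As a sketch of the programmatic access mentioned in the CrUX API bullet, the snippet below queries the API for an origin's 75th-percentile Core Web Vitals. It assumes you have created a CrUX API key in Google Cloud; the metric names follow the public API's naming.

// Query the CrUX API for p75 field metrics of an origin (sketch; requires an API key).
const API_KEY = 'YOUR_CRUX_API_KEY';

async function fetchCruxMetrics(origin) {
  const response = await fetch(
    `https://chromeuxreport.googleapis.com/v1/records:queryRecord?key=${API_KEY}`,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        origin,
        metrics: [
          'largest_contentful_paint',
          'interaction_to_next_paint',
          'cumulative_layout_shift'
        ]
      })
    }
  );
  const { record } = await response.json();
  for (const [name, data] of Object.entries(record.metrics)) {
    console.log(`${name}: p75 = ${data.percentiles.p75}`);
  }
}

fetchCruxMetrics('https://example.com');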

Using RUM Data to Prioritize Quality Improvements

RUM data is most powerful when you use it to make decisions. Here is a practical approach to turning data into action:

  1. Identify your worst-performing pages: Sort pages by their 75th percentile LCP or INP. The pages at the top of this list are where your users are suffering the most.
  2. Segment by device and connection: A page that loads in 1.5 seconds on desktop might take 8 seconds on a mobile device over a 3G connection. Look at the segments where performance is worst, not just the overall average.
  3. Correlate performance with business metrics: Compare conversion rates, bounce rates, and revenue for fast page loads versus slow page loads. This data helps you build the business case for performance investments. Research consistently shows that faster pages convert better.
  4. Set performance budgets: Based on your RUM data, establish target metrics for each page type. For example, "product pages should have LCP under 2.5 seconds at the 75th percentile." Monitor these budgets over time and investigate regressions (a budget-check sketch follows this list).
  5. Measure the impact of changes: After optimizing a page, compare the before-and-after RUM data to verify the improvement reached real users. Lab improvements do not always translate to field improvements if the bottleneck was something the lab test did not capture (like slow third-party scripts).
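
To make step 4 concrete, the sketch below computes a 75th-percentile value from collected RUM samples and compares it to a budget. The sample values and the 2.5-second LCP budget are illustrative.

// Compute a percentile from RUM samples and compare it against a budget (sketch).
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const index = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, index)];
}

const lcpBudgetMs = 2500; // "good" LCP threshold at the 75th percentile
const lcpSamplesMs = [1800, 2100, 3400, 2600, 1900, 4200, 2300]; // collected via RUM

const p75 = percentile(lcpSamplesMs, 75);
console.log(`LCP p75: ${p75} ms (budget: ${lcpBudgetMs} ms)`);
if (p75 > lcpBudgetMs) {
  console.warn('LCP budget exceeded; investigate recent changes.');
}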

Privacy Considerations

Analytics and RUM collect data about user behavior, which brings privacy responsibilities. Regulations like GDPR (European Union), CCPA (California), and similar laws worldwide require that you handle user data transparently and with consent.

  • Cookie consent: Google Analytics uses cookies, which means you need a cookie consent mechanism for users in jurisdictions that require it. Do not load GA4 until the user has given consent (a consent-gated loading sketch follows this list). This means your analytics data will be incomplete for users who decline, which is a trade-off you must accept.
  • Data minimization: Collect only the data you need. Avoid tracking personally identifiable information (PII) like names, email addresses, or IP addresses in your analytics events. GA4 has IP anonymization enabled by default, but you should verify your configuration.
  • Data retention: Configure your analytics to delete data after a reasonable period. GA4 allows you to set data retention to 2 or 14 months. Shorter retention periods reduce your exposure if a data breach occurs.
  • Privacy policy: Your privacy policy must disclose what data you collect, why you collect it, how long you keep it, and who you share it with. This is a legal requirement, not just a best practice.
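
One way to honor the cookie consent point above is to defer loading gtag.js until the user opts in. The sketch below assumes a showConsentBanner helper (hypothetical) that resolves with the user's choice and uses a placeholder measurement ID; Google's Consent Mode is an alternative approach with more granular signals.

// Load GA4 only after the user consents (sketch; showConsentBanner is a hypothetical helper).
const GA_MEASUREMENT_ID = 'G-XXXXXXXXXX'; // placeholder

async function initAnalytics() {
  const consented = await showConsentBanner(); // assumed to resolve true/false
  if (!consented) return; // no consent, no analytics

  // Inject the standard gtag.js loader only after consent is given.
  const script = document.createElement('script');
  script.async = true;
  script.src = `https://www.googletagmanager.com/gtag/js?id=${GA_MEASUREMENT_ID}`;
  document.head.appendChild(script);

  window.dataLayer = window.dataLayer || [];
  window.gtag = function () { dataLayer.push(arguments); };
  gtag('js', new Date());
  gtag('config', GA_MEASUREMENT_ID);
}

initAnalytics();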

Privacy-Focused Analytics Alternatives

If privacy is a primary concern for your project, or if you want analytics that are less likely to be blocked by ad blockers, privacy-focused tools you can self-host offer a compelling alternative to Google Analytics:

  • Plausible Analytics is a lightweight, open-source, privacy-friendly analytics tool. It does not use cookies, does not collect personal data, and is fully compliant with GDPR without requiring a cookie consent banner. The script is under 1 KB, compared to GA4's much larger footprint. You can self-host it or use their cloud service.
  • Umami is another open-source, privacy-focused analytics platform. Like Plausible, it is cookie-free and GDPR compliant by default. Umami provides a clean interface with real-time dashboards, event tracking, and multi-site support. It is designed to be self-hosted, giving you complete control over your data.
  • Matomo (formerly Piwik) is the most feature-rich open-source analytics platform. It offers functionality comparable to Google Analytics, including heatmaps, session recordings, and A/B testing. Matomo can be configured to work without cookies and can be self-hosted for full data ownership. It is widely used by government organizations and companies with strict data sovereignty requirements.

These tools trade the depth and ecosystem of Google Analytics for simplicity, privacy, and independence. For many sites, they provide all the analytics you actually need without the privacy baggage.

Tip: Start with the web-vitals library to measure Core Web Vitals from real users. It takes less than 10 minutes to integrate and gives you immediately actionable data about your site's real-world performance. Combine it with a privacy-friendly analytics tool like Plausible for a lightweight, compliant monitoring setup.

Resources