Caching Strategies

The fastest HTTP request is the one that never happens. Caching stores copies of resources at various points between your server and the user's browser so that subsequent requests can be served without going back to the origin. A well-designed caching strategy can eliminate redundant network requests, reduce server load, and make your site feel almost instant for returning visitors.

Browser Caching with Cache-Control

The Cache-Control HTTP response header is the primary mechanism for controlling how browsers and intermediate caches store your resources. It supersedes the older Expires header (when both are present, max-age takes precedence) and gives you fine-grained control over caching behavior.

Key Cache-Control Directives

  • max-age=N — tells the browser to consider the resource fresh for N seconds. During this window, the browser serves the cached copy without contacting the server at all. For example, max-age=31536000 caches for one year.
  • immutable — tells the browser that the resource will never change at this URL. Browsers that support the directive can skip revalidation while the resource is still fresh, even when the user explicitly refreshes the page. This is ideal for versioned assets like app.a1b2c3.js where the filename changes when the content changes.
  • no-cache — despite the confusing name, this does not prevent caching. It tells the browser to cache the resource but always revalidate with the server before using it. The server can respond with a 304 Not Modified if the resource hasn't changed, avoiding a full download.
  • no-store — this truly prevents caching. The browser must fetch the resource from the server every time. Use this for sensitive data that should never be stored on disk.
  • public — allows the response to be cached by shared caches (CDNs, proxy servers), not just the user's browser.
  • private — restricts caching to the user's browser only. Use this for responses that contain user-specific data (like a personalized dashboard) that should not be stored by CDN edge servers.
  • s-maxage=N — like max-age but applies only to shared caches (CDNs). This lets you set different expiration times for the CDN and the browser.
  • stale-while-revalidate=N — allows the cache to keep serving a stale response for up to N seconds after it expires while it revalidates in the background. This provides instant responses even when the cached version has just expired.
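
To make the directive semantics above concrete, here is a minimal sketch of how a browser-style private cache might decide what to do with a stored response. The names parse_cache_control and browser_action are illustrative and not part of any library. The model deliberately ignores shared-cache directives (public, private, s-maxage), immutable, and the heuristics real browsers apply.

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control value into {directive: value}; flags map to True."""
    directives = {}
    for part in header.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, sep, value = part.partition("=")
        value = value.strip('"')
        directives[name] = int(value) if sep and value.isdigit() else True
    return directives


def browser_action(header: str, age_seconds: int) -> str:
    """Decide what a private cache may do with a stored response of this age."""
    d = parse_cache_control(header)
    if "no-store" in d:
        return "fetch"  # nothing may be stored, so always go to the server
    if "no-cache" in d:
        return "revalidate"  # stored, but must be checked before every use
    max_age = d.get("max-age", 0)
    if age_seconds < max_age:
        return "use-cache"  # still fresh: no network request at all
    swr = d.get("stale-while-revalidate", 0)
    if age_seconds < max_age + swr:
        return "use-stale-and-revalidate"  # answer now, refresh in background
    return "revalidate"
```

For example, browser_action("max-age=60, stale-while-revalidate=30", 75) falls into the stale-while-revalidate window, while the same header at an age of 120 seconds requires revalidation first.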

Recommended Cache-Control Strategies

Different types of resources call for different caching strategies:

  • Versioned static assets (JS, CSS with content hashes in filenames): Cache-Control: public, max-age=31536000, immutable — cache for a year and never revalidate, since the filename changes when the content changes
  • HTML documents: Cache-Control: no-cache — always check with the server to ensure the latest version is served, but allow caching to avoid full downloads when nothing has changed
  • API responses: Cache-Control: private, no-cache or private, max-age=0, must-revalidate — prevent CDN caching of user-specific data while still allowing conditional requests
  • Sensitive data: Cache-Control: no-store — never cache authentication tokens, personal data exports, or other sensitive content

Tip: The combination of content-hashed filenames and immutable is the gold standard for static assets. When you deploy a new version, the filename changes (e.g., app.a1b2c3.js becomes app.d4e5f6.js), so the browser fetches the new file. The old version stays cached harmlessly until it's evicted.
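
The recommendations above can be sketched as a small server-side helper that picks a header value per response. This is only an illustration: cache_control_for, its sensitive and user_specific flags, and the filename pattern are assumptions, and the pattern must be adapted to whatever your build tool actually emits.

```python
import re

# Assumed convention: a content hash is 6+ lowercase hex characters right
# before the extension, e.g. app.a1b2c3.js. Adjust to your build output.
HASHED_ASSET = re.compile(r"\.[0-9a-f]{6,}\.(js|css|woff2|png|jpg|svg)$")


def cache_control_for(path: str, *, sensitive: bool = False,
                      user_specific: bool = False) -> str:
    """Pick a Cache-Control value following the strategies listed above."""
    if sensitive:
        return "no-store"  # never written to any cache
    if HASHED_ASSET.search(path):
        return "public, max-age=31536000, immutable"  # URL changes with content
    if user_specific:
        return "private, no-cache"  # browser only, always revalidated
    return "no-cache"  # HTML and other unversioned resources
```

Checking the sensitive flag first is a deliberate choice in this sketch: a misclassified response should fail toward not being cached at all.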

ETags for Validation

An ETag (Entity Tag) is a unique identifier that the server assigns to a specific version of a resource. When the browser has a cached copy with an ETag, it sends that ETag in subsequent requests via the If-None-Match header. If the resource hasn't changed, the server responds with 304 Not Modified (no body), saving bandwidth. If it has changed, the server sends the new version with a new ETag.

ETags work alongside Cache-Control: no-cache to implement a "cache but always validate" pattern. This is perfect for HTML pages and API responses that change unpredictably — the user gets the latest content, but the server avoids sending the full response when nothing has changed.

There are two types of ETags:

  • Strong ETags (e.g., "abc123") — indicate byte-for-byte equivalence. Two responses with the same strong ETag are identical.
  • Weak ETags (e.g., W/"abc123") — indicate semantic equivalence. The responses might differ in insignificant ways (like whitespace or headers) but are functionally the same.
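
The validation round trip described above might look like this on the server side. This is a minimal sketch under stated assumptions: make_etag and respond are hypothetical helpers, the ETag is derived from a SHA-256 hash of the body (one common choice among several), and the comma split assumes tags that do not themselves contain commas. If-None-Match is compared weakly, so a W/ prefix is ignored.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Strong ETag: a quoted hash of the exact bytes being sent."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def _opaque(tag: str) -> str:
    """Drop the weak prefix so W/"x" and "x" compare equal."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def respond(body: bytes, if_none_match: Optional[str] = None):
    """Return (status, headers, payload) for a GET of the current body."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match is not None:
        candidates = [_opaque(t) for t in if_none_match.split(",")]
        if "*" in candidates or _opaque(etag) in candidates:
            return 304, headers, b""  # client copy is current: send no body
    return 200, headers, body
```

A first request returns 200 with an ETag. Replaying that ETag in If-None-Match yields 304 with an empty payload until the body changes.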

CDN Caching Layers

A Content Delivery Network (CDN) adds a caching layer between your origin server and your users. When a user requests a resource, the CDN edge server closest to them checks its cache. If the resource is cached (a "cache hit"), the edge server responds directly. If not (a "cache miss"), the edge server fetches from your origin, caches the response, and then serves it to the user.

CDN caching introduces a multi-tier architecture:

  1. Browser cache — controlled by Cache-Control and ETag headers in the response
  2. CDN edge cache — respects s-maxage (or max-age if s-maxage is not set) and can be purged independently
  3. CDN origin shield (optional) — a centralized cache between the edge and your origin that reduces origin load by consolidating cache misses
  4. Origin server — your actual web server, which should only handle requests when all cache layers miss

CDNs also support cache purging (also called invalidation), which lets you force all edge servers to discard a cached resource and fetch a fresh copy from the origin. This is essential when you deploy updates and need them visible immediately without waiting for the cache to expire.
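
The tiered lookup and purge behavior can be modeled in a few lines. This is a simplified sketch, not how any particular CDN is implemented: Tier, fetch_through, and purge are hypothetical names, and expiration (max-age, s-maxage) is left out so that only the hit, miss, and fill logic is visible.

```python
from typing import Callable, Dict, List


class Tier:
    """One cache layer (browser, edge, or origin shield) keyed by URL."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, bytes] = {}


def fetch_through(tiers: List[Tier], url: str,
                  origin: Callable[[str], bytes]) -> tuple:
    """Return (body, served_by), filling every tier that missed on the way."""
    for i, tier in enumerate(tiers):
        if url in tier.store:
            body = tier.store[url]
            for missed in tiers[:i]:
                missed.store[url] = body  # fill the layers nearer the user
            return body, tier.name
    body = origin(url)  # every layer missed: only now is the origin contacted
    for tier in tiers:
        tier.store[url] = body
    return body, "origin"


def purge(tiers: List[Tier], url: str) -> None:
    """Discard a URL from the given tiers, as a CDN invalidation would."""
    for tier in tiers:
        tier.store.pop(url, None)
```

Note that a purge is typically applied to the CDN tiers only. In this model, a browser tier that already holds the resource keeps serving its copy, which is one reason long browser lifetimes are best reserved for versioned URLs.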

Service Workers for Offline-First

A service worker is a JavaScript file that runs in a separate thread from your web page and can intercept network requests. This gives you programmatic control over caching that goes far beyond what HTTP headers can achieve.

Common service worker caching strategies:

  • Cache First — check the cache first, fall back to the network if not found. Best for static assets that rarely change.
  • Network First — try the network first, fall back to the cache if offline. Best for content that should be as fresh as possible but still available offline.
  • Stale While Revalidate — serve from cache immediately for a fast response, then update the cache from the network in the background. The user sees the cached version now and gets the updated version on the next visit.
  • Cache Only — only serve from cache, never go to the network. Used for resources that are explicitly precached during service worker installation.
  • Network Only — bypass the cache entirely. Used for requests that must always be fresh, like analytics pings.
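
Real service workers are written in JavaScript against the browser's fetch event and Cache API, so the following is only a language-neutral model of the first three strategies, written in Python for consistency with the other sketches here. The cache is a plain dict, fetch is any callable that raises OSError when the network is unavailable, and all function names are illustrative.

```python
def cache_first(cache: dict, fetch, url: str) -> str:
    """Serve from cache; only touch the network on a miss."""
    if url not in cache:
        cache[url] = fetch(url)
    return cache[url]


def network_first(cache: dict, fetch, url: str) -> str:
    """Prefer a fresh response; fall back to the cached copy when offline."""
    try:
        cache[url] = fetch(url)
    except OSError:
        if url not in cache:
            raise  # offline and nothing cached: the request fails
    return cache[url]


def stale_while_revalidate(cache: dict, fetch, url: str) -> str:
    """Answer with the cached copy, then refresh it for the next request."""
    if url not in cache:
        cache[url] = fetch(url)
        return cache[url]
    stale = cache[url]
    try:
        # A real worker performs this update asynchronously, after responding.
        cache[url] = fetch(url)
    except OSError:
        pass  # keep the old copy if the refresh fails
    return stale
```

The difference between the strategies is which copy the caller receives: stale_while_revalidate returns the old value even though the cache already holds the new one afterwards.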

Libraries like Workbox (from Google) simplify service worker implementation with pre-built caching strategies, precaching support, and routing helpers.

Warning: Service workers are powerful but add complexity. A bug in your service worker can serve stale content indefinitely or prevent updates from reaching users. Always include an update mechanism and test thoroughly. Service workers require HTTPS (except on localhost).

Cache Busting

The biggest challenge with caching is cache invalidation — making sure users get the latest version when you deploy changes. The most reliable technique is cache busting through content hashes.

Instead of serving styles.css, your build tool generates a filename that includes a hash of the file's contents: styles.a1b2c3d4.css. When you change the CSS, the hash changes, producing a new filename like styles.e5f6a7b8.css. Since the browser has never seen this URL before, it fetches the new file. The old file remains cached but is harmless since nothing references it anymore.

This pattern requires your HTML to reference the hashed filenames dynamically. Build tools like Webpack, Vite, and Parcel handle this automatically through their manifest or HTML plugin systems.
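
What such a build step does can be sketched in a few lines. This is an assumption-laden illustration rather than how Webpack, Vite, or Parcel work internally: hashed_name and build_manifest are hypothetical helpers, SHA-256 truncated to 8 hex characters is an arbitrary choice, and a real tool would also rewrite the references in your HTML.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, length: int = 8) -> str:
    """Insert a content hash before the extension: styles.css -> styles.<hash>.css."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


def build_manifest(files: dict) -> dict:
    """Map each logical filename to its hashed output name."""
    return {name: hashed_name(name, content) for name, content in files.items()}
```

Templates would then look up "styles.css" in the manifest instead of hard-coding a URL. Because the name depends only on the bytes, an unchanged file keeps its name across deploys and stays cached.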

Other cache busting approaches (less recommended):

  • Query string versioning (styles.css?v=2) — some CDNs and proxies ignore query strings when caching, making this unreliable
  • Manual version numbers (styles.v2.css) — requires manual updating and is error-prone
  • Cache purging via CDN API — useful as a complement to content hashing but shouldn't be your primary strategy

Server-Side Caching

Not all caching happens in the browser or CDN. Server-side caching reduces the work your application needs to do for each request.

Redis

Redis is an in-memory data store commonly used for caching database query results, session data, and computed values. Because Redis stores data in RAM, reads are extremely fast (sub-millisecond). Redis supports data structures like strings, hashes, lists, and sorted sets, along with built-in expiration (TTL).

Common uses:

  • Caching expensive database queries
  • Storing session data
  • Rate limiting
  • Pub/sub messaging for real-time features
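
The first use case, caching an expensive query with an expiration time, usually follows the cache-aside pattern. The sketch below uses an in-process TTLCache class as a stand-in so that it runs without a Redis server. The class, get_or_compute, and the injectable clock are all assumptions made for illustration. With a real Redis client the get and set calls would go over the network and values would need to be serialized.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-process stand-in for a Redis-style cache with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: behave as if it was never stored
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._data[key] = (self._clock() + ttl_seconds, value)


def get_or_compute(cache: TTLCache, key: str, ttl_seconds: float,
                   compute: Callable[[], Any]) -> Any:
    """Cache-aside: return the cached value, or compute and store it on a miss."""
    value = cache.get(key)
    if value is None:  # note: a computed None is treated as a miss every time
        value = compute()
        cache.set(key, value, ttl_seconds)
    return value
```

Passing the clock in as a parameter is purely a testing convenience: it lets expiry be exercised without sleeping.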

Memcached

Memcached is a simpler in-memory caching system designed purely for key-value storage. It is lighter weight than Redis and excels at caching large numbers of small, simple values. Memcached servers do not coordinate with each other; instead, client libraries distribute keys across multiple servers, typically using consistent hashing, which makes it easy to scale horizontally.

Choose Redis when you need data structures, persistence, or pub/sub. Choose Memcached when you need a simple, high-throughput key-value cache for a large number of objects.
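
The consistent hashing mentioned for Memcached can be illustrated with a small hash ring. This is a generic sketch of the technique, not the algorithm of any specific client library: HashRing, the SHA-256-based point function, the replica count of 100, and the server names are all assumptions.

```python
import bisect
import hashlib


def _point(value: str) -> int:
    """Map a string to a position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    """Assign keys to servers so that removing a server moves only its keys."""

    def __init__(self, servers, replicas: int = 100) -> None:
        # Each server gets many virtual points to spread its share evenly.
        self._ring = sorted(
            (_point(f"{server}#{i}"), server)
            for server in servers
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def server_for(self, key: str) -> str:
        # The first point clockwise from the key's position owns the key.
        index = bisect.bisect(self._points, _point(key)) % len(self._ring)
        return self._ring[index][1]
```

The property that matters for caching is stability: if one of three servers is removed, keys that lived on the other two stay where they were, so only a fraction of the cache goes cold.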

Cache Invalidation Strategies

Phil Karlton famously said, "There are only two hard things in Computer Science: cache invalidation and naming things." Here are the main approaches:

  • Time-based expiration (TTL) — set a max-age and accept that users might see stale content for that duration. Simple but imprecise.
  • Event-driven invalidation — when data changes, actively purge or update the cache. More complex but ensures freshness.
  • Content-addressed caching — use content hashes as cache keys (like the filename hashing described above). The cache never needs to be invalidated because changed content gets a new key.
  • Versioned keys — include a version number in cache keys (e.g., user:42:profile:v3). Increment the version when data changes; old entries are never read again and age out through their TTL or eviction.

Tip: When in doubt, prefer content-addressed caching for static assets and short TTLs with revalidation for dynamic content. This avoids the complexity of active invalidation while keeping content reasonably fresh.
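
The versioned-key approach from the list above can be sketched as follows. VersionedCache and its methods are hypothetical, and both the version counters and the values live in plain dicts here. In a shared cache the counter would be stored alongside the data and each entry would carry a TTL so that orphaned versions eventually disappear.

```python
class VersionedCache:
    """Invalidate by bumping a version number instead of deleting entries."""

    def __init__(self) -> None:
        self._versions = {}  # entity -> current version number
        self._data = {}      # full versioned key -> cached value

    def key_for(self, entity: str) -> str:
        return f"{entity}:v{self._versions.get(entity, 1)}"

    def get(self, entity: str):
        return self._data.get(self.key_for(entity))

    def set(self, entity: str, value) -> None:
        self._data[self.key_for(entity)] = value

    def invalidate(self, entity: str) -> None:
        # Call this from the code path that changes the underlying data.
        self._versions[entity] = self._versions.get(entity, 1) + 1
```

After invalidate("user:42:profile"), lookups use the key user:42:profile:v2, so the old v1 entry is simply never read again.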

Resources