CRL Double-Gate in mTLS: Revoking a Cert When the Client Is Already Connected

In the previous article, we saw how to serve three mTLS audiences on a single port with SNI routing, and how cert binding protects against session replay. But there's still a gap: revocation.

You revoke a client certificate. You update your CRL. The problem: the client is already connected over a persistent (keep-alive) connection, and its TLS handshake happened 10 minutes ago. tls.Config.VerifyConnection only runs during the handshake, so the client keeps sending requests with a revoked cert, and your server keeps accepting them.

Why VerifyConnection isn't enough

In Go, tls.Config offers two validation hooks:

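  • VerifyPeerCertificate: called during the handshake, after the standard chain verification; not invoked on resumed sessions
  • VerifyConnection: called once per handshake, after the standard verification and after VerifyPeerCertificate, including on resumed sessions; a non-nil error aborts the handshake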

Both only run at handshake time. With HTTP/1.1 keep-alive or HTTP/2 multiplexing, a single handshake can serve hundreds of requests over several minutes. During that time, the CRL can change.

// This check only runs once per TLS connection
tlsConfig := &tls.Config{
    VerifyConnection: func(cs tls.ConnectionState) error {
        if len(cs.PeerCertificates) == 0 {
            return nil
        }
        serial := cs.PeerCertificates[0].SerialNumber
        if crlStore.IsRevoked(serial) {
            return fmt.Errorf("certificate %s is revoked", serial)
        }
        return nil
    },
}

This code blocks new connections with a revoked cert. It doesn't block existing ones.

The double-gate pattern

The solution: check the CRL in two places.

  1. Gate 1 (handshake time, via VerifyConnection): blocks new connections
  2. Gate 2 (request time, via HTTP middleware): checks the peer cert serial on every request

// Gate 2: HTTP middleware
func crlMiddleware(crlStore *CRLStore) func(http.Handler) http.Handler {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
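            // No client certificate on this connection: nothing to check here.
            // Passing through is only safe if the TLS layer already requires one
            // (for example ClientAuth: tls.RequireAndVerifyClientCert).
            if r.TLS == nil || len(r.TLS.PeerCertificates) == 0 {
                next.ServeHTTP(w, r)
                return
            }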

            serial := r.TLS.PeerCertificates[0].SerialNumber
            if crlStore.IsRevoked(serial) {
                http.Error(w, "Certificate revoked", http.StatusForbidden)
                return
            }

            next.ServeHTTP(w, r)
        })
    }
}

The middleware accesses r.TLS.PeerCertificates — the certificates presented during the handshake of the connection carrying this request. Even if the handshake was 10 minutes ago, the serial is still accessible.
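
The middleware above rejects each request but leaves the revoked client's connection open. A possible variant, sketched below, also asks net/http to drop the connection after the rejection by setting Connection: close on the response. The names rejectAndClose, isRevoked (a stand-in for crlStore.IsRevoked) and probe are hypothetical, and probe exists only to exercise the handler without a real TLS listener. How the header is honored on HTTP/2 is worth verifying against your Go version.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"math/big"
	"net/http"
	"net/http/httptest"
)

// rejectAndClose is a variant of the request-time gate: on a revoked serial it
// answers 403 and asks the server to close the underlying connection.
// isRevoked is a hypothetical stand-in for crlStore.IsRevoked.
func rejectAndClose(isRevoked func(*big.Int) bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.TLS != nil && len(r.TLS.PeerCertificates) > 0 &&
			isRevoked(r.TLS.PeerCertificates[0].SerialNumber) {
			// Must be set before the status line is written.
			w.Header().Set("Connection", "close")
			http.Error(w, "Certificate revoked", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// probe pushes one fake request through the gate, pretending the peer
// presented a certificate with the given serial. Serial 7 is treated as
// revoked. It returns the response status and the Connection header.
func probe(serial int64) (status int, connection string) {
	isRevoked := func(s *big.Int) bool { return s.Cmp(big.NewInt(7)) == 0 }
	ok := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})

	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.TLS = &tls.ConnectionState{
		PeerCertificates: []*x509.Certificate{{SerialNumber: big.NewInt(serial)}},
	}
	rec := httptest.NewRecorder()
	rejectAndClose(isRevoked, ok).ServeHTTP(rec, req)
	return rec.Code, rec.Header().Get("Connection")
}
```

Closing the connection forces the client back through Gate 1, where the handshake itself is refused.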

Cost: one CRL store lookup per request. If the store is in-memory (a map[string]bool behind a sync.RWMutex), that is typically well under a microsecond, negligible next to the rest of request handling.

CRL hot-reload

For the double-gate to be effective, the in-memory CRL must be current. Two approaches:

Periodic polling

func (s *CRLStore) startPolling(ctx context.Context, url string, interval time.Duration) {
    ticker := time.NewTicker(interval)
    defer ticker.Stop()

    for {
        select {
        case <-ticker.C:
            if err := s.reload(url); err != nil {
                slog.Error("CRL reload failed", "error", err)
            }
        case <-ctx.Done():
            return
        }
    }
}

Simple, but enforcement can lag revocation by up to one polling interval, plus the time the issuer takes to publish the new CRL. For a financial service, even a 30-second interval might be too long.

Internal pubsub

The CRL issuer publishes an event on an internal channel (NATS, Redis Pub/Sub, PostgreSQL NOTIFY). The CRL store subscribes and reloads immediately. Sub-second latency between revocation and first request rejection.

The CRL rollback trap

A trap I found during audit: the CRL's HTTP source sometimes responds with stale content (CDN cache, deployment rollback, file race condition). If your CRL store naively replaces the in-memory CRL with the downloaded one, a rollback reactivates revoked certificates.

The solution: verify that the CRL Number is monotonically increasing.

func (s *CRLStore) reload(url string) error {
    newCRL, err := fetchCRL(url)
    if err != nil {
        return err
    }

    s.mu.Lock()
    defer s.mu.Unlock()

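    // A CRL without a CRL Number cannot be ordered: reject it
    if newCRL.Number == nil {
        return fmt.Errorf("CRL has no CRL Number: rejected")
    }

    // Monotonic check: never accept a CRL Number older than the current one
    if s.currentNumber != nil {
        switch newCRL.Number.Cmp(s.currentNumber) {
        case 0:
            return nil // unchanged since the last reload: nothing to do
        case -1:
            slog.Warn("CRL rollback detected",
                "current", s.currentNumber,
                "received", newCRL.Number,
            )
            return fmt.Errorf("CRL number %s < current %s: rollback rejected",
                newCRL.Number, s.currentNumber)
        }
    }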

    s.revokedSerials = buildRevokedMap(newCRL)
    s.currentNumber = newCRL.Number
    return nil
}

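Comparing newCRL.Number against s.currentNumber is the only protection against a rollback of the CRL. Without it, anyone who can influence what the CRL source (or an upstream cache) serves can replay an older, validly signed CRL and reactivate a revoked certificate. The check complements signature verification rather than replacing it: if the CRL's signature is not verified, an attacker can simply forge a CRL with a higher number.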

Summary: both gates and their roles

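Gate                | When          | Protects against
--------------------+---------------+------------------------------------------------------------------
VerifyConnection    | TLS handshake | New connections with a revoked cert
HTTP middleware     | Every request | Keep-alive connections whose cert was revoked after the handshake
CRL monotonic check | CRL reload    | Rollback attack / stale cache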

Conclusion

Revocation in mTLS is a topic where "it seems to work" often hides a gap of several minutes. The double-gate — handshake + middleware — is the minimal pattern for effective revocation. Hot-reload with monotonic check is the pattern for fast revocation without rollback risk.

We've covered the network and transport layers of the service. The next article dives into the application architecture: how to handle side effects in a CQRS/Event Sourcing system, with the pubsub bridge for command outcomes and atomic audit logging.
