Go middleware best practices in production

The service had run for six months without a hiccup. Then one morning: 2% of requests returning a 500 — no log, no trace, no panic surfaced. The kind of bug that never reproduces locally. Half a day later, the cause: a panic in a handler, caught by a recover middleware… placed inside the logging middleware. The recover swallowed the panic after the logger had already written its line — but before the request ID was set. The result: a ghost 500, invisible in the dashboards.

A Go middleware is ten trivial lines. A chain of middleware in production is where the real problems hide: ordering, panic recovery, timeouts, ResponseWriter wrapping, context propagation. Here's what 14 years of running APIs taught me about the middleware that holds up — and the kind that falls over.

A middleware is just a function that wraps another

No magic, no framework required. A Go middleware is a function that takes an http.Handler and returns another. The canonical pattern, the one everything else follows:

type Middleware func(http.Handler) http.Handler

func Logger(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        start := time.Now()
        next.ServeHTTP(w, r)
        slog.Info("request",
            "method", r.Method,
            "path", r.URL.Path,
            "dur", time.Since(start),
        )
    })
}

Everything before next.ServeHTTP runs on the way in (on the request). Everything after runs on the way out (on the response). That's the onion model: the request travels through each layer toward the handler, and the response climbs back out in reverse.

[Figure: the layers from outermost to innermost are Recover (panic), Request ID + Tracing, Logger, Timeout (context), Auth + Rate limit, Handler. The request descends through each middleware to the handler; the response climbs back out in reverse.]
The onion model: declaration order = traversal order.

To chain them without nesting by hand (A(B(C(handler))) gets unreadable fast), a small helper does the job:

func Chain(h http.Handler, mw ...Middleware) http.Handler {
    // apply in reverse so mw[0] becomes the outermost layer
    for i := len(mw) - 1; i >= 0; i-- {
        h = mw[i](h)
    }
    return h
}

// usage: Recover is the outermost layer, it wraps everything else
handler := Chain(mux,
    Recover,
    RequestID,
    Logger,
    Timeout(5*time.Second),
    Auth,
)

Note the loop direction: we apply mw from last to first so that mw[0] stays the outermost layer. Get the direction wrong and the whole order inverts — which is exactly the first trap.
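To see the direction for yourself, here is a minimal sketch. The tag middleware and the traverse function are hypothetical names made up for this demo; Chain is the helper from above. Each layer records when it is entered and left, and the trace shows that mw[0] really is the outermost one.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

type Middleware func(http.Handler) http.Handler

// Chain applies mw in reverse so that mw[0] ends up as the outermost layer.
func Chain(h http.Handler, mw ...Middleware) http.Handler {
	for i := len(mw) - 1; i >= 0; i-- {
		h = mw[i](h)
	}
	return h
}

// tag is a hypothetical middleware that records when it is entered and left.
func tag(name string, trace *[]string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*trace = append(*trace, name+" in")
			next.ServeHTTP(w, r)
			*trace = append(*trace, name+" out")
		})
	}
}

// traverse sends one request through Chain(handler, A, B, C) and returns the trace.
func traverse() string {
	var trace []string
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		trace = append(trace, "handler")
	})
	h := Chain(handler, tag("A", &trace), tag("B", &trace), tag("C", &trace))
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return strings.Join(trace, " > ")
}

func main() {
	// prints: A in > B in > C in > handler > C out > B out > A out
	fmt.Println(traverse())
}
```

Flip the loop to run from 0 upward and the same call prints C first: nothing fails to compile, the order just silently inverts.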

Order isn't cosmetic, it's functional

The most-asked question about Go middleware in production: what order do I declare them in? The answer isn't a style convention, it's a matter of correctness. Three non-negotiable rules:

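1. Recover must be the outermost layer. If it sits inside the logger, a panic in the logger itself isn't caught. The price of sitting outside request ID is that Recover's own request context doesn't hold the correlation ID; it can still read the ID from the X-Request-ID response header, which RequestID sets on the shared ResponseWriter before calling the next handler. Recover wraps everything.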

2. Request ID before the logger. Otherwise your log line has no ID to correlate with the rest of the trace. Obvious when written down — yet it's the most common inversion I see in review.

3. Auth and rate-limit closest to the handler, but after observability. You want to log and trace even rejected requests (a spike of 401s or 429s is a signal). If auth sits above the logger, your rejections are invisible.

// ❌ BAD: recover inside (a panic in Logger or Auth escapes it), request ID set last (logs and rejections carry no ID)
handler := Chain(mux, Logger, Auth, Recover, RequestID)

// ✅ GOOD: recover outside, ID then log, auth near the handler
handler := Chain(mux, Recover, RequestID, Logger, Timeout(5*time.Second), Auth, RateLimit)

The rule of thumb: from most defensive to most domain-specific. What protects the server (recover) on top, what identifies and observes (ID, log, trace) next, what filters (timeout, auth, rate-limit) just before the business logic.
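Rule 3 is easy to check with a toy chain. In this sketch, observe stands in for the logger, auth applies a made-up rule (reject anything without an Authorization header), and statusWriter is a bare-bones status wrapper; all three are illustrative names, not the middleware from a real service. The only thing that changes between the two runs is who wraps whom.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// statusWriter is a minimal wrapper that remembers the first status code written.
type statusWriter struct {
	http.ResponseWriter
	status int
}

func (s *statusWriter) WriteHeader(code int) {
	if s.status == 0 {
		s.status = code
	}
	s.ResponseWriter.WriteHeader(code)
}

// observe stands in for the logger: it records the status of every request it sees.
func observe(seen *[]int) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			sw := &statusWriter{ResponseWriter: w}
			next.ServeHTTP(sw, r)
			*seen = append(*seen, sw.status)
		})
	}
}

// auth rejects requests without an Authorization header (hypothetical rule).
func auth(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") == "" {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// observedStatuses sends one anonymous request and returns what the observer saw.
func observedStatuses(observerOutside bool) []int {
	var seen []int
	var h http.Handler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	if observerOutside {
		h = observe(&seen)(auth(h)) // observer wraps auth: the 401 is recorded
	} else {
		h = auth(observe(&seen)(h)) // auth wraps observer: the request never reaches it
	}
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return seen
}

func main() {
	fmt.Println(observedStatuses(true))  // prints: [401]
	fmt.Println(observedStatuses(false)) // prints: []
}
```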

Recover: the middleware that keeps the whole process alive

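In Go, a panic that nothing recovers kills the whole process, not just the request. The standard net/http server protects you from that for the goroutine serving the request: it recovers the panic, prints a stack trace to the server's error log, and drops the connection. The client gets no 500, and the log entry is unstructured, with no request ID to correlate. A panic in a goroutine your handler spawned itself is not covered, either by net/http or by a middleware; it needs its own recover. An explicit Recover is therefore essential: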

func Recover(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        defer func() {
            if err := recover(); err != nil {
                slog.Error("panic recovered",
                    "err", err,
                    "path", r.URL.Path,
                    "stack", string(debug.Stack()),
                )
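                // the body is JSON, so say so before the header is sent
                w.Header().Set("Content-Type", "application/json")
                w.WriteHeader(http.StatusInternalServerError)
                _, _ = w.Write([]byte(`{"error":"internal"}`))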
            }
        }()
        next.ServeHTTP(w, r)
    })
}

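Two real traps here. First: don't let the deferred function itself panic, or the recovery fails exactly when you need it. The one deliberate exception is http.ErrAbortHandler, the sentinel value net/http uses to abort a response, which is meant to be passed along with panic(err) rather than logged as a crash. Second, subtler: if the handler or a downstream middleware has already written a status code before the panic, your w.WriteHeader(500) is ignored and the server logs "http: superfluous response.WriteHeader call". The client receives the original status and a truncated body with your error JSON appended to it. Handling that case is exactly why you need to know whether the response has already started, which brings us to wrapping.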
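Here is one way to handle both traps, as a sketch rather than a definitive implementation. startedWriter, SafeRecover, and statusAfterPanic are names I'm introducing for illustration. The wrapper remembers whether anything has been written; the recover re-panics http.ErrAbortHandler and only sends a 500 when the response hasn't started. What to do when it has started (here: log and give up) is a policy choice you may want to make differently.

```go
package main

import (
	"fmt"
	"log/slog"
	"net/http"
	"net/http/httptest"
	"runtime/debug"
)

// startedWriter remembers whether the response has started (hypothetical helper).
type startedWriter struct {
	http.ResponseWriter
	started bool
}

func (s *startedWriter) WriteHeader(code int) {
	s.started = true
	s.ResponseWriter.WriteHeader(code)
}

func (s *startedWriter) Write(b []byte) (int, error) {
	s.started = true
	return s.ResponseWriter.Write(b)
}

// Unwrap lets http.ResponseController reach the underlying writer.
func (s *startedWriter) Unwrap() http.ResponseWriter { return s.ResponseWriter }

// SafeRecover only writes the emergency 500 if nothing has been sent yet.
func SafeRecover(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		sw := &startedWriter{ResponseWriter: w}
		defer func() {
			v := recover()
			if v == nil {
				return
			}
			if v == http.ErrAbortHandler {
				panic(v) // deliberate abort: hand it back to net/http
			}
			slog.Error("panic recovered",
				"err", v,
				"path", r.URL.Path,
				"response_started", sw.started,
				"stack", string(debug.Stack()),
			)
			if sw.started {
				return // too late for a clean 500: the status line is already gone
			}
			sw.Header().Set("Content-Type", "application/json")
			sw.WriteHeader(http.StatusInternalServerError)
			_, _ = sw.Write([]byte(`{"error":"internal"}`))
		}()
		next.ServeHTTP(sw, r)
	})
}

// statusAfterPanic runs a panicking handler and reports the status the client would see.
func statusAfterPanic(writeFirst bool) int {
	boom := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if writeFirst {
			w.WriteHeader(http.StatusAccepted)
		}
		panic("boom")
	})
	rec := httptest.NewRecorder()
	SafeRecover(boom).ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code
}

func main() {
	fmt.Println(statusAfterPanic(false)) // prints: 500
	fmt.Println(statusAfterPanic(true))  // prints: 202
}
```

The response_started field in the log line is the part that pays off during an incident: it tells you whether the client saw your 500 or a half-written answer.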

Wrapping http.ResponseWriter without breaking Flush or Hijack

To log the status code or the response size, you have to intercept calls to WriteHeader and Write. The naive solution: a wrapper that remembers the code.

type statusRecorder struct {
    http.ResponseWriter
    status  int
    written int
}

func (r *statusRecorder) WriteHeader(code int) {
    if r.status == 0 {
        r.status = code // keep the first code: a repeated call is ignored by net/http
    }
    r.ResponseWriter.WriteHeader(code)
}

func (r *statusRecorder) Write(b []byte) (int, error) {
    if r.status == 0 {
        r.status = http.StatusOK // implicit Write = 200
    }
    n, err := r.ResponseWriter.Write(b)
    r.written += n
    return n, err
}

It works… until the day you serve Server-Sent Events or streaming, and everything breaks. The classic wrapping trap: by embedding http.ResponseWriter in a struct, you lose the optional interfaces the original writer implemented — http.Flusher, http.Hijacker, http.Pusher. Your SSE handler type-asserts to http.Flusher, fails, and streaming never flushes again.

// ✅ Re-expose Flush so you don't break streaming behind the wrapper
func (r *statusRecorder) Flush() {
    if f, ok := r.ResponseWriter.(http.Flusher); ok {
        f.Flush()
    }
}

// same for Hijack if you serve WebSocket behind this middleware
func (r *statusRecorder) Hijack() (net.Conn, *bufio.ReadWriter, error) {
    if h, ok := r.ResponseWriter.(http.Hijacker); ok {
        return h.Hijack()
    }
    return nil, nil, http.ErrNotSupported // stdlib sentinel for an unsupported feature
}

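Since Go 1.20, http.ResponseController exists precisely to call Flush/SetWriteDeadline through nested wrappers without re-exposing each interface by hand. It reaches the underlying writer by calling an Unwrap() http.ResponseWriter method on each wrapper, so your wrapper has to provide one; without it, the controller stops at your struct and returns an error. On a fresh codebase, use it. If you're maintaining an existing wrapper, re-expose at least Flush: it's the case that breaks most often in production.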
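A small sketch of the http.ResponseController route, assuming Go 1.20 or later. The two wrapper types and flushThrough are hypothetical names for this demo; the only difference between the wrappers is the Unwrap method, and that is what decides whether Flush reaches the real writer.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// opaqueWriter embeds the writer but hides Flusher and offers no Unwrap.
type opaqueWriter struct{ http.ResponseWriter }

// unwrappableWriter is the same wrapper plus the Unwrap method that
// http.ResponseController looks for.
type unwrappableWriter struct{ http.ResponseWriter }

func (u unwrappableWriter) Unwrap() http.ResponseWriter { return u.ResponseWriter }

// flushThrough writes one SSE-style event through a wrapper and tries to flush it.
// It reports whether the underlying recorder was flushed, and the Flush error if any.
func flushThrough(withUnwrap bool) (flushed bool, err error) {
	rec := httptest.NewRecorder()
	var w http.ResponseWriter = opaqueWriter{rec}
	if withUnwrap {
		w = unwrappableWriter{rec}
	}
	if _, err = fmt.Fprint(w, "data: tick\n\n"); err != nil {
		return false, err
	}
	// No type assertion to http.Flusher: the controller walks the Unwrap chain.
	err = http.NewResponseController(w).Flush()
	return rec.Flushed, err
}

func main() {
	flushed, err := flushThrough(true)
	fmt.Println(flushed, err) // prints: true <nil>

	flushed, err = flushThrough(false)
	fmt.Println(flushed, err != nil) // prints: false true
}
```

In a handler, this means replacing the w.(http.Flusher) assertion with http.NewResponseController(w).Flush() and checking the returned error.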

Timeouts: cut the request before it overflows

A handler that calls a slow database or a third-party API with no timeout is a time bomb: under load, goroutines pile up, memory climbs, and the service ends up OOM-killed. The right reflex: a middleware that sets a deadline on the request context, propagated to every downstream call.

func Timeout(d time.Duration) Middleware {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            ctx, cancel := context.WithTimeout(r.Context(), d)
            defer cancel()
            next.ServeHTTP(w, r.WithContext(ctx))
        })
    }
}

Careful: this middleware doesn't kill the handler, it signals cancellation via ctx.Done(). Your downstream code still has to honor the context — a db.QueryContext(ctx, ...), not a db.Query(...). A timeout on a context no one listens to is useless. It's the same discipline as avoiding goroutine leaks: if nothing listens to ctx.Done(), the goroutine stays blocked.
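To make "honor the context" concrete, here is a sketch with a simulated slow dependency. slowLookup, lookupHandler, and statusFor are hypothetical; the choice of 504 for an expired deadline is mine, not a rule. The handler, not the middleware, is the one that notices the deadline and writes the response.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

type Middleware func(http.Handler) http.Handler

// Timeout sets a deadline on the request context, as in the middleware above.
func Timeout(d time.Duration) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			ctx, cancel := context.WithTimeout(r.Context(), d)
			defer cancel()
			next.ServeHTTP(w, r.WithContext(ctx))
		})
	}
}

// slowLookup simulates a downstream call that takes `work` to finish,
// unless the context is cancelled first.
func slowLookup(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// lookupHandler turns an expired deadline into an explicit 504.
func lookupHandler(work time.Duration) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		err := slowLookup(r.Context(), work)
		if errors.Is(err, context.DeadlineExceeded) {
			http.Error(w, "upstream timeout", http.StatusGatewayTimeout)
			return
		}
		if err != nil {
			http.Error(w, "internal error", http.StatusInternalServerError)
			return
		}
		fmt.Fprintln(w, "ok")
	})
}

// statusFor runs one request with the given budget and simulated work, in milliseconds.
func statusFor(budgetMS, workMS int) int {
	h := Timeout(time.Duration(budgetMS) * time.Millisecond)(
		lookupHandler(time.Duration(workMS) * time.Millisecond))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code
}

func main() {
	fmt.Println(statusFor(20, 500)) // prints: 504
	fmt.Println(statusFor(500, 1))  // prints: 200
}
```

Swap slowLookup for a call that ignores ctx and the first request takes the full 500 ms and returns 200: the deadline fired, nobody was listening.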

Avoid the stdlib http.TimeoutHandler for this specific need: it writes its own 503 response when the deadline expires, which conflicts with your status wrapper and your recover. A context timeout keeps you in control of the response.

Observability: request ID, structured logs, and the context trap

The request ID is the thread that ties a log line, a distributed trace, and a support ticket together. The middleware generates it (or reads it from the incoming header) and puts it in the context:

type ctxKey string

const requestIDKey ctxKey = "request_id"

func RequestID(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        id := r.Header.Get("X-Request-ID")
        if id == "" {
            id = uuid.NewString()
        }
        ctx := context.WithValue(r.Context(), requestIDKey, id)
        w.Header().Set("X-Request-ID", id)
        next.ServeHTTP(w, r.WithContext(ctx))
    })
}

Two best practices many people miss. One: the context key must be a private type (type ctxKey string), never a bare string — otherwise two packages both using "request_id" clobber each other. That's exactly the spirit of type rigor in Go: make the mistake impossible to compile. Two: enrich the logger with the ID once, via a slog.Logger stored in the context, rather than threading the ID through every log call:

// in the Logger middleware, once you have the ID:
logger := slog.With("request_id", RequestIDFrom(r.Context()))
ctx := context.WithValue(r.Context(), loggerKey, logger)
next.ServeHTTP(w, r.WithContext(ctx))

// everywhere downstream:
LoggerFrom(r.Context()).Info("user created", "user_id", u.ID)

Rate-limiting follows the same middleware mechanics; if you're building a serious per-IP one with golang.org/x/time/rate, I covered the full implementation in this dedicated token-bucket article.

Conclusion

A middleware in isolation is a textbook exercise. A chain of middleware in production is a matter of order and contracts: who wraps whom, who sees the panic first, who sets the context the others will read. None of these decisions is aesthetic — each one changes how the service behaves under load or during an incident.

The test that doesn't lie: trigger a panic in a handler, in a simulated production run, and watch what lands in your logs. If you get a clean 500, with the request ID, the stack, and the correlated log line — your chain is sound. If you get a ghost 500 with no trace, you now know exactly which order to revisit.
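That drill fits in a few lines with httptest. The sketch below is deliberately reduced: requestID and recoverWith are simplified stand-ins for the middleware in this article, the fixed "generated-id" replaces a real ID generator, and reading the ID back from the response header inside the recover is one possible approach among others.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
	"net/http"
	"net/http/httptest"
	"strings"
)

// requestID echoes the incoming X-Request-ID on the response (simplified).
func requestID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := r.Header.Get("X-Request-ID")
		if id == "" {
			id = "generated-id" // placeholder: a real service would generate a unique ID
		}
		w.Header().Set("X-Request-ID", id)
		next.ServeHTTP(w, r)
	})
}

// recoverWith logs the panic with the ID found on the response header, then sends a 500.
func recoverWith(logger *slog.Logger) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			defer func() {
				if v := recover(); v != nil {
					logger.Error("panic recovered",
						"err", v,
						"request_id", w.Header().Get("X-Request-ID"),
					)
					w.WriteHeader(http.StatusInternalServerError)
				}
			}()
			next.ServeHTTP(w, r)
		})
	}
}

// panicDrill triggers a panic behind the chain and reports what a client and
// the logs would show.
func panicDrill() (status int, headerID string, logged bool) {
	var buf bytes.Buffer
	logger := slog.New(slog.NewTextHandler(&buf, nil))

	boom := http.HandlerFunc(func(http.ResponseWriter, *http.Request) { panic("boom") })
	h := recoverWith(logger)(requestID(boom)) // recover outermost, as recommended

	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.Header.Set("X-Request-ID", "drill-1")
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)

	return rec.Code, rec.Header().Get("X-Request-ID"),
		strings.Contains(buf.String(), "request_id=drill-1")
}

func main() {
	fmt.Println(panicDrill()) // prints: 500 drill-1 true
}
```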
