ClaudeGate: an HTTP gateway to expose Claude Code as a REST API

Claude Code is a great CLI tool. You run it, it reads your codebase, it thinks, it writes code. But the moment you want to integrate it into a pipeline — a CI step, a backend service, a custom dashboard — you hit a wall. The CLI is a black box: you feed it via stdin, you read its stdout, and there is no HTTP interface, no job queue, no way to poll status, no streaming endpoint. Nothing.

I built ClaudeGate to solve exactly that. It wraps the Claude Code CLI into a proper REST API with an async job queue, SSE streaming, webhook callbacks, and SQLite persistence. The whole thing ships as a single static Go binary. No CGO, no runtime dependencies, no surprises.

Why Go, and why a gateway pattern

Claude Code already knows how to run code. My job was not to reimplement intelligence — it was to build the plumbing around it. A gateway pattern is the right fit here: take an opaque subprocess, wrap it with a proper async API, add persistence and observability, expose clean HTTP endpoints.

Go was the obvious choice. It gives you goroutines for free, channels for fan-out, and exec.CommandContext for subprocess management with cancellation baked in. The standard library is good enough that you end up with a small, self-contained binary that runs anywhere. No dependency hell. modernc.org/sqlite is pure Go — no CGO — so the binary stays static and works in scratch or Alpine containers.

The lifecycle of a job

Every request follows the same path. Here is the full flow from HTTP call to final result:

POST /api/v1/jobs
    → validate request body (1MB limit)
    → INSERT INTO jobs (status = "queued")
    → Enqueue(job.ID)           ← only the ID, not the struct
          |
          ↓
    buffered chan string
          |
          ↓
    worker goroutine
    → load job from SQLite
    → exec claude --output-format stream-json
    → parse stdout line by line
         "assistant" lines → SSE chunk events → subscribers
         "result" line    → SSE result event  → webhook callback
    → UPDATE jobs (status = "completed")

A deliberate design choice: only the job ID travels through the channel, not the full job struct. This matters for cancellation. If a job is cancelled mid-queue — before a worker picks it up — the worker loads it from SQLite, sees the cancelled status, and exits immediately. No stale data, no race conditions from a struct that was mutated after being enqueued.

The job queue: backpressure without drama

The queue is a buffered chan string. Simple, effective. Here is the Enqueue logic:

func (q *Queue) Enqueue(id string) error {
    select {
    case q.jobs <- id:
        return nil
    default:
        return ErrQueueFull
    }
}

The select/default pattern gives you non-blocking backpressure for free. If the channel is full, the caller gets an immediate error. No goroutine created, no blocking, no silent queue growth. The caller can decide what to do — retry later, surface an error to the user, whatever.

N worker goroutines all read from the same channel. Starting them is one of my favorite Go 1.22 patterns:

for range q.cfg.Concurrency {
    go q.runWorker(ctx)
}

range over an integer. No index variable, no boilerplate. Added in Go 1.22, and I use it every time I start a worker pool now.

For cancellation, the queue maintains a cancels map[string]context.CancelFunc. When a worker starts processing a job, it registers its cancel function under the job ID. The HTTP CancelJob handler looks up the running job by ID and calls the cancel — which propagates into the exec.CommandContext and kills the subprocess cleanly.

SSE streaming: fan-out without blocking

Claude Code outputs results progressively. Streaming those chunks to clients via Server-Sent Events is a much better experience than polling. The fan-out design:

type SSEEvent struct {
    Type    string // "status", "chunk", "result"
    Payload string
}

// Each subscriber gets its own buffered channel
subs map[string][]chan SSEEvent

When a chunk arrives from the subprocess, the worker broadcasts to all channels in subs[jobID]. Each channel has a capacity of 64. The send is non-blocking — if a subscriber's channel is full, the event is silently dropped for that subscriber. A slow client does not slow down the worker.

There is one edge case worth handling: what if a client connects after the job is already finished? In that case, the SSE handler checks the job status in SQLite first. If it is in a terminal state (completed, failed, cancelled), it immediately sends the final result and closes the connection — no subscription created, no waiting.

Webhook delivery: async, resilient, and SSRF-aware

When a job completes, ClaudeGate can POST a callback to any URL you configure on the job. The delivery is fire-and-forget: the worker hands it off to a goroutine and moves on.

The webhook module handles three things I always forget when I first implement webhooks:

  • Retries with exponential backoff: 3 attempts, 1s → 2s → 4s, 30s timeout per request.
  • SSRF protection: before sending, the target hostname is resolved via DNS. If the resolved IP is private (RFC 1918), loopback, or link-local, the request is rejected. This prevents a user from pointing the callback at http://169.254.169.254/ or your internal services.
  • Consistent payload: every callback has the same shape regardless of outcome.
{
  "job_id": "01HZQ...",
  "status": "completed",
  "result": "Here is the refactored function:\n\n...",
  "error": ""
}

Security: constant-time auth and a mandatory system prompt

Authentication uses the X-API-Key header. Keys are compared with crypto/subtle.ConstantTimeCompare — the standard way to prevent timing attacks on string comparison. ClaudeGate supports multiple API keys simultaneously, which makes key rotation possible without downtime: add the new key, deploy, remove the old key, deploy again.

The more interesting security feature is the mandatory system prompt. Every job submitted through ClaudeGate gets a security prompt prepended automatically, before the user's prompt reaches Claude. This prompt forbids shell execution, filesystem writes, and network access. The idea: ClaudeGate is an API that arbitrary callers can hit — you don't want to give them an unrestricted code execution environment. The security prompt is the guardrail.

If you are running ClaudeGate in a trusted environment where you control all callers, you can disable this with CLAUDEGATE_UNSAFE_NO_SECURITY_PROMPT=true. The name makes the trade-off explicit.

One more thing: the worker strips all CLAUDE* environment variables before exec-ing the subprocess. This prevents environment leakage from the parent process into Claude.

SQLite: simple persistence and crash recovery

SQLite with WAL mode is the right database for this use case. There is no need for Postgres here. Jobs are sequential, writes are frequent but small, the dataset is bounded.

The problem WAL + busy_timeout solves: SQLite uses file-level locks. Without configuration, when a writer holds the lock (e.g. a worker saving a result), any other concurrent access immediately fails with SQLITE_BUSY — an error, not a wait:

Worker 1: UPDATE jobs SET status='completed'...  ← lock held
Worker 2: UPDATE jobs SET status='processing'... ← SQLITE_BUSY immediately, error
API:      SELECT * FROM jobs...                  ← SQLITE_BUSY immediately, error

busy_timeout = 10000 changes this behavior: SQLite retries for up to 10 seconds before returning the error. For the vast majority of contentions (a few milliseconds between two workers), this silently resolves the problem.

WAL alone is not enough. WAL fixes concurrent reads alongside writes — readers and writers no longer block each other. But two simultaneous writers are still exclusive. With CLAUDEGATE_CONCURRENCY > 1, two workers can finish at the same time and both attempt an UPDATE simultaneously.

WAL     → readers and writers no longer block each other
timeout → two simultaneous writers: the second waits instead of failing

The crash recovery logic is the part I am most satisfied with. On startup, ClaudeGate queries for any jobs stuck in processing status:

// On startup: recover jobs that were interrupted by a crash
stuck, err := store.FindByStatus(ctx, "processing")
if err != nil {
    log.Fatalf("recovery scan failed: %v", err)
}
for _, job := range stuck {
    store.UpdateStatus(ctx, job.ID, "queued")
    queue.Enqueue(job.ID)
}

If the process was killed mid-job — power loss, OOM, container restart — those jobs are re-queued on the next startup and processed normally. No manual intervention, no lost work.

Schema migrations are handled with idempotent ALTER TABLE ADD COLUMN statements. If the column already exists, SQLite returns an error, which is silently ignored. Not elegant, but it works, and it keeps the migration code dead simple.

A TTL cleanup goroutine runs periodically and deletes jobs older than the configured retention period. The SQLite file stays bounded in size even if you run ClaudeGate for months.

Modern Go patterns worth noting

The project uses Go 1.22's enhanced stdlib routing. No external router needed:

mux.HandleFunc("GET /api/v1/jobs/{id}/sse", h.StreamSSE)
mux.HandleFunc("POST /api/v1/jobs/{id}/cancel", h.CancelJob)
mux.HandleFunc("DELETE /api/v1/jobs/{id}", h.DeleteJob)

// In the handler:
id := r.PathValue("id")

Method + path in a single string, path parameters with PathValue. This is genuinely good enough for most APIs. I reach for a third-party router only when I need middleware chaining or route groups that the stdlib does not give me cleanly.

The Store interface keeps the job repository testable without touching SQLite in unit tests:

type Store interface {
    Insert(ctx context.Context, job *Job) error
    GetByID(ctx context.Context, id string) (*Job, error)
    UpdateStatus(ctx context.Context, id, status string) error
    FindByStatus(ctx context.Context, status string) ([]*Job, error)
    Delete(ctx context.Context, id string) error
}

The web playground is a single HTML file embedded at compile time via //go:embed static/index.html and served at GET / without authentication. It gives you a UI to submit jobs, browse history, and read the API docs without any external tool.

Get started in 5 minutes

git clone https://github.com/ohugonnot/claudegate.git
cd claudegate
cp .env.example .env
make build
./bin/claudegate

The .env.example documents every configuration option. Set your API key, optionally configure concurrency and TTL, point it at your Claude installation. That is it.

The full API surface:

Method   Path                        Description
POST     /api/v1/jobs                Submit a job
GET      /api/v1/jobs                List jobs (limit, offset)
GET      /api/v1/jobs/{id}           Poll job status
DELETE   /api/v1/jobs/{id}           Delete a job
GET      /api/v1/jobs/{id}/sse       Stream output via SSE
POST     /api/v1/jobs/{id}/cancel    Cancel a running job
GET      /api/v1/health              Healthcheck (no auth)
GET      /                           Embedded web playground

Conclusion

ClaudeGate is not trying to be a platform. It is a thin, pragmatic wrapper that turns a CLI tool into something you can actually integrate with. The design is intentionally simple: one binary, one SQLite file, one config file. If you need to run it in Docker, it fits in a scratch container. If you need to run it on a bare VM, it is a single binary copy.

The interesting engineering here is not in the individual pieces — queues, SSE, webhooks are all well-understood patterns. It is in how they compose: the ID-only channel for safe cancellation, the crash recovery on startup, the SSRF check before every webhook delivery. The details that make the difference between "works in a demo" and "works in production at 3am".

The project is open source. If you are building something on top of Claude Code and want a proper HTTP interface, give it a try: github.com/ohugonnot/claudegate.
