Claude Code is a great CLI tool. You run it, it reads your codebase, it thinks, it writes code. But the moment you want to integrate it into a pipeline — a CI step, a backend service, a custom dashboard — you hit a wall. The CLI is a black box: you feed it via stdin, you read its stdout, and there is no HTTP interface, no job queue, no way to poll status, no streaming endpoint. Nothing.
I built ClaudeGate to solve exactly that. It wraps the Claude Code CLI in a proper REST API with an async job queue, SSE streaming, webhook callbacks, and SQLite persistence. The whole thing ships as a single static Go binary. No CGO, no runtime dependencies, no surprises.
Why Go, and why a gateway pattern
Claude Code already knows how to run code. My job was not to reimplement intelligence — it was to build the plumbing around it. A gateway pattern is the right fit here: take an opaque subprocess, wrap it with a proper async API, add persistence and observability, expose clean HTTP endpoints.
Go was the obvious choice. It gives you goroutines for free, channels for fan-out, and
exec.CommandContext for subprocess management with cancellation baked in. The standard library
is good enough that you end up with a small, self-contained binary that runs anywhere.
No dependency hell. modernc.org/sqlite is pure Go — no CGO — so the binary stays static
and works in scratch or Alpine containers.
The lifecycle of a job
Every request follows the same path. Here is the full flow from HTTP call to final result:
POST /api/v1/jobs
  → validate request body (1MB limit)
  → INSERT INTO jobs (status = "queued")
  → Enqueue(job.ID)   ← only the ID, not the struct
        |
        ↓
  buffered chan string
        |
        ↓
  worker goroutine
  → load job from SQLite
  → exec claude --output-format stream-json
  → parse stdout line by line
      "assistant" lines → SSE chunk events → subscribers
      "result" line     → SSE result event → webhook callback
  → UPDATE jobs (status = "completed")
A deliberate design choice: only the job ID travels through the channel, not the full job struct.
This matters for cancellation. If a job is cancelled mid-queue — before a worker picks it up —
the worker loads it from SQLite, sees the cancelled status, and exits immediately.
No stale data, no race conditions from a struct that was mutated after being enqueued.
The job queue: backpressure without drama
The queue is a buffered chan string. Simple, effective. Here is the Enqueue logic:
func (q *Queue) Enqueue(id string) error {
	select {
	case q.jobs <- id:
		return nil
	default:
		return ErrQueueFull
	}
}
The select/default pattern gives you non-blocking backpressure for free. If the channel
is full, the caller gets an immediate error. No goroutine created, no blocking, no silent queue growth.
The caller can decide what to do — retry later, surface an error to the user, whatever.
N worker goroutines all read from the same channel. Starting them is one of my favorite Go 1.22 patterns:
for range q.cfg.Concurrency {
	go q.runWorker(ctx)
}
range over an integer. No index variable, no boilerplate. Added in Go 1.22, and I use it
every time I start a worker pool now.
For cancellation, the queue maintains a cancels map[string]context.CancelFunc. When a worker
starts processing a job, it registers its cancel function under the job ID. The HTTP
CancelJob handler looks up the running job by ID and calls the cancel — which propagates
into the exec.CommandContext and kills the subprocess cleanly.
SSE streaming: fan-out without blocking
Claude Code outputs results progressively. Streaming those chunks to clients via Server-Sent Events is a much better experience than polling. The fan-out design:
type SSEEvent struct {
	Type    string // "status", "chunk", "result"
	Payload string
}

// Each subscriber gets its own buffered channel
subs map[string][]chan SSEEvent
When a chunk arrives from the subprocess, the worker broadcasts to all channels in
subs[jobID]. Each channel has a capacity of 64. The send is non-blocking —
if a subscriber's channel is full, the event is silently dropped for that subscriber.
A slow client does not slow down the worker.
There is one edge case worth handling: what if a client connects after the job is already finished?
In that case, the SSE handler checks the job status in SQLite first. If it is in a terminal state
(completed, failed, cancelled), it immediately sends the
final result and closes the connection — no subscription created, no waiting.
Webhook delivery: async, resilient, and SSRF-aware
When a job completes, ClaudeGate can POST a callback to any URL you configure on the job. The delivery is fire-and-forget: the worker hands it off to a goroutine and moves on.
The webhook module handles three things I always forget when I first implement webhooks:
- Retries with exponential backoff: 3 attempts, 1s → 2s → 4s, 30s timeout per request.
- SSRF protection: before sending, the target hostname is resolved via DNS. If the resolved
IP is private (RFC 1918), loopback, or link-local, the request is rejected. This prevents a user from
pointing the callback at http://169.254.169.254/ or your internal services.
- Consistent payload: every callback has the same shape regardless of outcome.
{
  "job_id": "01HZQ...",
  "status": "completed",
  "result": "Here is the refactored function:\n\n...",
  "error": ""
}
Security: constant-time auth and a mandatory system prompt
Authentication uses the X-API-Key header. Keys are compared with
crypto/subtle.ConstantTimeCompare — the standard way to prevent timing attacks on
string comparison. ClaudeGate supports multiple API keys simultaneously, which makes key rotation
possible without downtime: add the new key, deploy, remove the old key, deploy again.
The more interesting security feature is the mandatory system prompt. Every job submitted through ClaudeGate gets a security prompt prepended automatically, before the user's prompt reaches Claude. This prompt forbids shell execution, filesystem writes, and network access. The idea: ClaudeGate is an API that arbitrary callers can hit — you don't want to give them an unrestricted code execution environment. The security prompt is the guardrail.
If you are running ClaudeGate in a trusted environment where you control all callers, you can
disable this with CLAUDEGATE_UNSAFE_NO_SECURITY_PROMPT=true. The name makes the
trade-off explicit.
One more thing: the worker strips all CLAUDE* environment variables before
exec-ing the subprocess. This prevents environment leakage from the parent process into Claude.
SQLite: simple persistence and crash recovery
SQLite with WAL mode is the right database for this use case. There is no need for Postgres here. Jobs are sequential, writes are frequent but small, the dataset is bounded.
The problem WAL + busy_timeout solves: SQLite uses file-level locks. Without
configuration, when a writer holds the lock (e.g. a worker saving a result), any other concurrent
access immediately gets SQLITE_BUSY — an error, not a wait:
Worker 1: UPDATE jobs SET status='completed'... ← lock held
Worker 2: UPDATE jobs SET status='processing'... ← SQLITE_BUSY immediately, error
API: SELECT * FROM jobs... ← SQLITE_BUSY immediately, error
busy_timeout = 10000 changes this behavior: SQLite retries for up to 10 seconds
before returning the error. For the vast majority of contentions (a few milliseconds between
two workers), this silently resolves the problem.
WAL alone is not enough. WAL fixes concurrent reads alongside writes — readers and writers
no longer block each other. But two simultaneous writers are still exclusive. With
CLAUDEGATE_CONCURRENCY > 1, two workers can finish at the same time and both
attempt an UPDATE simultaneously.
WAL → readers and writers no longer block each other
timeout → two simultaneous writers: the second waits instead of failing
The crash recovery logic is the part I am most satisfied with. On startup, ClaudeGate queries
for any jobs stuck in processing status:
// On startup: recover jobs that were interrupted by a crash
stuck, err := store.FindByStatus(ctx, "processing")
if err != nil {
	return fmt.Errorf("recover stuck jobs: %w", err)
}
for _, job := range stuck {
	store.UpdateStatus(ctx, job.ID, "queued")
	queue.Enqueue(job.ID)
}
If the process was killed mid-job — power loss, OOM, container restart — those jobs are re-queued on the next startup and processed normally. No manual intervention, no lost work.
Schema migrations are handled with ALTER TABLE ADD COLUMN statements made idempotent by error handling:
if the column already exists, SQLite returns a "duplicate column name" error, which is silently ignored. Not elegant,
but it works, and it keeps the migration code dead simple.
A TTL cleanup goroutine runs periodically and deletes jobs older than the configured retention period. The SQLite file stays bounded in size even if you run ClaudeGate for months.
Modern Go patterns worth noting
The project uses Go 1.22's enhanced stdlib routing. No external router needed:
mux.HandleFunc("GET /api/v1/jobs/{id}/sse", h.StreamSSE)
mux.HandleFunc("POST /api/v1/jobs/{id}/cancel", h.CancelJob)
mux.HandleFunc("DELETE /api/v1/jobs/{id}", h.DeleteJob)
// In the handler:
id := r.PathValue("id")
Method + path in a single string, path parameters with PathValue. This is genuinely
good enough for most APIs. I reach for a third-party router only when I need middleware chaining
or route groups that the stdlib does not give me cleanly.
The Store interface keeps the job repository testable without touching SQLite in unit tests:
type Store interface {
	Insert(ctx context.Context, job *Job) error
	GetByID(ctx context.Context, id string) (*Job, error)
	UpdateStatus(ctx context.Context, id, status string) error
	FindByStatus(ctx context.Context, status string) ([]*Job, error)
	Delete(ctx context.Context, id string) error
}
The web playground is a single HTML file embedded at compile time via //go:embed static/index.html
and served at GET / without authentication. It gives you a UI to submit jobs, browse history,
and read the API docs without any external tool.
Get started in 5 minutes
git clone https://github.com/ohugonnot/claudegate.git
cd claudegate
cp .env.example .env
make build
./bin/claudegate
The .env.example documents every configuration option. Set your API key, optionally
configure concurrency and TTL, point it at your Claude installation. That is it.
The full API surface:
| Method | Path | Description |
|---|---|---|
| POST | /api/v1/jobs | Submit a job |
| GET | /api/v1/jobs | List jobs (limit, offset) |
| GET | /api/v1/jobs/{id} | Poll job status |
| DELETE | /api/v1/jobs/{id} | Delete a job |
| GET | /api/v1/jobs/{id}/sse | Stream output via SSE |
| POST | /api/v1/jobs/{id}/cancel | Cancel a running job |
| GET | /api/v1/health | Healthcheck (no auth) |
| GET | / | Embedded web playground |
Conclusion
ClaudeGate is not trying to be a platform. It is a thin, pragmatic wrapper that turns a CLI tool into something you can actually integrate with. The design is intentionally simple: one binary, one SQLite file, one config file. If you need to run it in Docker, it fits in a scratch container. If you need to run it on a bare VM, it is a single binary copy.
The interesting engineering here is not in the individual pieces — queues, SSE, webhooks are all well-understood patterns. It is in how they compose: the ID-only channel for safe cancellation, the crash recovery on startup, the SSRF check before every webhook delivery. The details that make the difference between "works in a demo" and "works in production at 3am".
The project is open source. If you are building something on top of Claude Code and want a proper HTTP interface, give it a try: github.com/ohugonnot/claudegate.