Adding comments to a PHP blog without a database

The first article on this blog explained how it was built in 30 minutes with Claude Code. Naturally, a blog needs comments. Same constraints: no database, no external dependencies, no Disqus tracking visitors. Just PHP + JSON files. Built in one session with Claude Code — the interesting part wasn't the code, it was the security audit that followed.

A comment system without a database seems trivial. It almost is. But "almost" hides a few classic pitfalls — some of them introduced directly by the speed of AI agents. The final result fits in ~300 lines total. What follows is the journey, not just the destination.

The requirements

  • Comments stored in JSON files (1 per article)
  • Immediate publication — no moderation queue for a personal blog
  • Anti-spam without captcha (zero friction for humans)
  • GDPR: no email, no full IP stored
  • Design integrated with the existing blog (same CSS system)
  • Zero external dependencies

The architecture

Three places in the codebase:

blog/
├── comments/
│   └── {slug}.json          # 1 file per article, created on first comment
├── posts/
│   └── comment-handler.php  # Single POST endpoint
└── template.php             # Modified blog_footer(): comments + form

The JSON storage format per article looks like this:

[
  {
    "id": "a3f2b1c4",
    "author": "Jean Dupont",
    "date": "2026-02-22T14:32:10+01:00",
    "content": "Great article, thanks.",
    "ip_hash": "9f86d081884c7d65"
  }
]

Each field has a reason. id: the first 8 hex characters of sha256(uniqid()), unique enough for a personal blog with no need for a full UUID v4. ip_hash: a truncated hash of the network prefix only, meaning the first two octets for IPv4 or the first three groups for IPv6 (see next section). content: stored raw and escaped on display with htmlspecialchars(); never store HTML.
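
As a rough illustration of that record format, here is a minimal sketch. The helper names make_comment() and escape_html() are hypothetical and do not come from the blog's code; only the field layout, the truncated sha256(uniqid()) id, and the htmlspecialchars() escaping are taken from the description above.

```php
<?php
// Hypothetical helper: builds one comment record in the format shown above.
function make_comment(string $author, string $content, string $ipHash): array
{
    return [
        // First 8 hex characters of a SHA-256 over uniqid()
        'id'      => substr(hash('sha256', uniqid()), 0, 8),
        'author'  => $author,
        'date'    => date('c'),  // ISO 8601 with UTC offset
        'content' => $content,   // stored raw, never as HTML
        'ip_hash' => $ipHash,
    ];
}

// Hypothetical helper: escapes a stored value at display time only.
function escape_html(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}
```

In the template, every stored field would pass through escape_html() just before being echoed, so a comment containing markup is shown as text rather than interpreted.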

Anti-spam: honeypot + rate limiting

Why honeypot over captcha: zero friction, no external service, and it stops the vast majority of bots. The form contains a hidden field that humans neither see nor fill in, while naive bots fill in every field they find:

<!-- Invisible to humans, irresistible to bots -->
<div class="hp-field">
    <input type="text" name="website" tabindex="-1" autocomplete="off">
</div>

/* In the stylesheet */
.hp-field { display: none !important; }

Important detail: rejection returns a normal redirect to the article page, with the #comments anchor. No 403, no error message — nothing that confirms the filter exists. The bot receives a 302, as if it had succeeded.
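
On the server side, the check might look like the sketch below. The function names are hypothetical, and the slug passed to comments_url() is assumed to have been sanitized already; the commented lines show where the 302 would be sent in the handler.

```php
<?php
// Hypothetical helper: true when the hidden "website" field was filled in.
function is_honeypot_triggered(array $post): bool
{
    $value = $post['website'] ?? '';
    // A non-string value (e.g. website[]=x) is treated as a bot too.
    return !is_string($value) || trim($value) !== '';
}

// Hypothetical helper: the article URL a rejected submission is sent back to.
function comments_url(string $safeSlug): string
{
    return '/blog/' . $safeSlug . '#comments';
}

// In the handler (sketch):
// if (is_honeypot_triggered($_POST)) {
//     header('Location: ' . comments_url($slug), true, 302);
//     exit;
// }
```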

Rate limiting: max 3 comments per IP prefix per 10-minute window, checked by scanning the existing JSON. The IP hash calculation handles both IPv4 and IPv6:

if (str_contains($ip, ':')) {
    // IPv6: take the first 3 groups (network /48)
    $groups = explode(':', $ip, 4);
    $ipPrefix = implode(':', array_slice($groups, 0, 3));
} else {
    // IPv4: take the first 2 octets
    $octets = explode('.', $ip, 3);
    $ipPrefix = ($octets[0] ?? '0') . '.' . ($octets[1] ?? '0');
}
$ipHash = substr(hash('sha256', $ipPrefix), 0, 16);

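Only the network prefix is hashed, not the full IP. The individual address cannot be recovered from the stored value, yet abuse from the same network can still be detected. The hash is not anonymization in the strict sense (an IPv4 /16 prefix has only 65,536 possible values, so the hash can be brute-forced), but the prefix is all that would be revealed. For a personal blog this is a reasonable balance for GDPR: protection without over-collection.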
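
The scan of the existing JSON for rate limiting could be sketched as follows. The function names and the decision to pass the current time as a parameter are assumptions made for illustration; the limit of 3 comments and the 600-second window come from the text above.

```php
<?php
// Hypothetical helper: counts comments from $ipHash newer than $windowSeconds.
function count_recent_comments(array $comments, string $ipHash, int $now, int $windowSeconds = 600): int
{
    $count = 0;
    foreach ($comments as $comment) {
        if (($comment['ip_hash'] ?? '') !== $ipHash) {
            continue;
        }
        // Stored dates are ISO 8601 strings; unparseable ones are ignored.
        $timestamp = strtotime((string) ($comment['date'] ?? ''));
        if ($timestamp !== false && $now - $timestamp < $windowSeconds) {
            $count++;
        }
    }
    return $count;
}

// Hypothetical helper: true once the prefix has reached the allowed maximum.
function is_rate_limited(array $comments, string $ipHash, int $now, int $max = 3): bool
{
    return count_recent_comments($comments, $ipHash, $now) >= $max;
}
```

Taking $now as an argument rather than calling time() inside the function keeps the check easy to test with fixed dates.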

The POST endpoint

comment-handler.php is the sole endpoint that receives submissions. The validation chain in order:

  1. GET method → redirect to /blog/ (no direct access)
  2. Honeypot not empty → silent redirect (fake confirmation)
  3. CSRF token — timing-safe comparison with hash_equals()
  4. Slug validated by regex [a-z0-9-]+ (path traversal prevention)
  5. Corresponding article file exists
  6. Author: 2–50 characters, content: 10–2000 characters
  7. Rate limit: < 3 comments from the same IP prefix in the last 10 minutes
  8. JSON write with LOCK_EX
  9. Post-Redirect-Get
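
Steps 4 and 6 of the chain above can be sketched as a single validation function. The names char_length() and validate_comment() are hypothetical, as are the error codes; the regex and the length bounds are the ones listed. The sketch assumes the mbstring extension for character counting and falls back to byte length when it is missing.

```php
<?php
// Hypothetical helper: character count, falling back to bytes without mbstring.
function char_length(string $text): int
{
    return function_exists('mb_strlen') ? mb_strlen($text, 'UTF-8') : strlen($text);
}

// Hypothetical helper: returns a list of error codes, empty when input is valid.
function validate_comment(string $slug, string $author, string $content): array
{
    $errors = [];
    // Anchored, with /D so a trailing newline cannot slip through.
    if (preg_match('/^[a-z0-9-]+$/D', $slug) !== 1) {
        $errors[] = 'slug';
    }
    $authorLength = char_length(trim($author));
    if ($authorLength < 2 || $authorLength > 50) {
        $errors[] = 'author';
    }
    $contentLength = char_length(trim($content));
    if ($contentLength < 10 || $contentLength > 2000) {
        $errors[] = 'content';
    }
    return $errors;
}
```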

The PRG (Post-Redirect-Get) pattern at the last step: after writing, redirect to /blog/{slug}?comment_ok=1#comments. The user lands on the page anchored to the comments section with a confirmation message. And most importantly: F5 doesn't resubmit the form.
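
For the write itself, here is one possible sketch. Note that it is a variation, not the blog's code: instead of a single write with LOCK_EX, it holds an flock() exclusive lock across the whole read-modify-write cycle, so two simultaneous submissions cannot overwrite each other. The function name append_comment() is hypothetical.

```php
<?php
// Hypothetical helper: appends one comment under an exclusive lock.
function append_comment(string $file, array $comment): bool
{
    $handle = fopen($file, 'c+'); // create if missing, never truncate on open
    if ($handle === false) {
        return false;
    }
    try {
        if (!flock($handle, LOCK_EX)) {
            return false;
        }
        $raw = stream_get_contents($handle);
        $comments = json_decode(($raw !== false && $raw !== '') ? $raw : '[]', true);
        if (!is_array($comments)) {
            return false; // refuse to overwrite a file that is not a JSON array
        }
        $comments[] = $comment;
        $json = json_encode($comments, JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE);
        if ($json === false) {
            return false;
        }
        ftruncate($handle, 0);
        rewind($handle);
        $ok = fwrite($handle, $json) !== false;
        fflush($handle);
        return $ok;
    } finally {
        flock($handle, LOCK_UN);
        fclose($handle);
    }
}
```

Once append_comment() returns true, the handler would send the redirect to /blog/{slug}?comment_ok=1#comments and exit, which completes the PRG cycle.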

What the security audit found

The initial implementation worked. The security audit's job was to break it. Five problems.

Missing CSRF

The most serious one. Without a CSRF token, any site can embed a hidden form that posts to /blog/comment-handler.php. A visitor clicks on a malicious link, their browser sends the request along with any cookies it holds for the blog, and a comment is posted from their browser and IP prefix without their knowledge.

Fix: generate a token when the form is created, store it in the session, verify with hash_equals() on submission. The detail that matters:

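// ❌ Vulnerable to timing attacks, and a missing token raises a warning
if ($_POST['csrf_token'] !== $_SESSION['csrf_token']) { /* reject */ }

// ✅ Constant-time comparison; an absent or empty token is always rejected
$expected = $_SESSION['csrf_token'] ?? '';
$provided = $_POST['csrf_token'] ?? '';
if (!is_string($expected) || $expected === '' || !is_string($provided)
    || !hash_equals($expected, $provided)) {
    // Reject
}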
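
The generation side of the fix could look like this sketch. The helper names are hypothetical, and the use of random_bytes(32) is an assumption about how the token is produced; the article only states that a token is generated with the form and stored in the session.

```php
<?php
// Hypothetical helper: returns the session's token, creating it on first use.
function csrf_token(array &$session): string
{
    if (empty($session['csrf_token']) || !is_string($session['csrf_token'])) {
        // 32 random bytes from the CSPRNG, hex-encoded to 64 characters
        $session['csrf_token'] = bin2hex(random_bytes(32));
    }
    return $session['csrf_token'];
}

// Hypothetical helper: the hidden input to embed in the comment form.
function csrf_field(array &$session): string
{
    $token = htmlspecialchars(csrf_token($session), ENT_QUOTES, 'UTF-8');
    return '<input type="hidden" name="csrf_token" value="' . $token . '">';
}
```

In the template, this would be called as csrf_field($_SESSION) after session_start(), so the same token is reused for every form rendered during the session.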

Open redirect

The honeypot rejection used $slug in the redirect before the slug was validated. A crafted slug like //evil.com could redirect to an external domain. Fix: preg_replace('/[^a-z0-9-]/', '', $slug) applied immediately after reading from $_POST, before any use.
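
A small sketch of that sanitization step, wrapped in a hypothetical sanitize_slug() function so the effect on hostile input is easy to see:

```php
<?php
// Hypothetical helper: strips everything outside [a-z0-9-] from the raw slug.
function sanitize_slug($raw): string
{
    if (!is_string($raw)) {
        return ''; // e.g. slug[]=x arrives as an array
    }
    return preg_replace('/[^a-z0-9-]/', '', $raw) ?? '';
}

// In the handler (sketch), before the honeypot check or any redirect:
// $slug = sanitize_slug($_POST['slug'] ?? '');
```

With this in place, //evil.com becomes evilcom, which can only ever resolve to a path under /blog/. An empty result should still be rejected by the later slug validation step.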

Session started on all pages

session_start() was at the top of template.php. Result: the blog listing page (which has no form) was setting a session cookie on every visitor. GDPR problem: a cookie that identifies the visitor was being set on pages where nothing required it.

Fix: move session_start() into the if ($slug !== null) block in blog_footer() — only pages that display a comment form start a session.

Broken IPv6

explode('.', $ip) on an IPv6 address like 2001:db8::1 returns a single-element array. All IPv6 visitors therefore shared a single rate limiting bucket. The bug produced no error — just completely ineffective rate limiting for half the web. Fix: the str_contains($ip, ':') detection shown above.
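
A quick regression check for this fix might look like the sketch below, which wraps the prefix logic shown earlier in a hypothetical ip_prefix() function. It assumes PHP 8 for str_contains().

```php
<?php
// Hypothetical wrapper around the prefix logic from the anti-spam section.
function ip_prefix(string $ip): string
{
    if (str_contains($ip, ':')) {
        // IPv6: keep the first 3 groups
        $groups = explode(':', $ip, 4);
        return implode(':', array_slice($groups, 0, 3));
    }
    // IPv4: keep the first 2 octets
    $octets = explode('.', $ip, 3);
    return ($octets[0] ?? '0') . '.' . ($octets[1] ?? '0');
}

// Hypothetical helper: the 16-character hash stored in ip_hash.
function ip_prefix_hash(string $ip): string
{
    return substr(hash('sha256', ip_prefix($ip)), 0, 16);
}
```

One caveat: the split works on the textual form, so a compressed address such as 2001:db8::1 yields the prefix "2001:db8:" rather than a normalized /48. Expanding the address first (for example via inet_pton()) would be a possible refinement, though the article does not say whether the blog does this.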

Dates in English

date('j M Y') outputs "22 Feb 2026" on a French-language blog, because date() ignores the locale entirely. The locale-aware route was setlocale() combined with strftime(), but strftime() is deprecated since PHP 8.1 and its output depends on the locales installed on the server, which is unreliable on shared hosting. More robust solution: a static lookup array.

$months = [
    1 => 'janvier', 2 => 'février', 3 => 'mars',
    4 => 'avril', 5 => 'mai', 6 => 'juin',
    7 => 'juillet', 8 => 'août', 9 => 'septembre',
    10 => 'octobre', 11 => 'novembre', 12 => 'décembre'
];
$date = new DateTime($comment['date']);
$formatted = $date->format('j') . ' ' . $months[(int)$date->format('n')] . ' ' . $date->format('Y');

Editorial note: 3 of these 5 problems were introduced by the initial parallel implementation (Sonnet agents working simultaneously on different parts of the code). The audit found all of them. The takeaway: AI-assisted code requires the same rigor of review as human-written code. Maybe more, because it arrives fast and looks correct.

Conclusion

The complete system is ~120 lines of PHP for the handler, ~60 lines of additions in the template, ~130 lines of CSS. No database, no npm install, no build step. Deployed by copying files.

The real value wasn't in writing the code — any PHP developer can write this in an afternoon. It was in the iterative security review: implement quickly, then audit methodically by trying to break things. With Claude Code, both fit into the same session — write, then immediately switch to adversarial mode.

For a personal blog, that's the right level of engineering. No more.
