Dynamic related articles in PHP without a database

The "related articles" section at the bottom of a blog post is exactly the kind of thing you hardcode at first — two links picked by hand — and then forget to maintain. The result: stale suggestions pointing to unrelated articles, or worse, articles that no longer exist.

On this blog, all content is described in a single posts.json file, with each article's slug, category, and tags. That's enough to automatically calculate relevant suggestions without a database.

The initial version passed an array of links to blog_footer():

<?php blog_footer([
    ['url' => '/blog/commentaires-sans-bdd-php', 'title' => 'Adding comments...'],
    ['url' => '/blog/creer-un-blog-avec-claude-code', 'title' => 'Building a blog...'],
]); ?>

Functional, but it requires manual maintenance in every article file. With 10 articles it's manageable; with 50 it becomes unworkable, and the list never improves on its own when you publish a new article that would be more relevant.

The source of truth: posts.json

Each article is described in posts.json with its metadata:

{
  "slug": "analytics-php-sans-cookies-rgpd",
  "title": "Analytics PHP sans cookies ni base de données",
  "date": "2026-02-25",
  "category": "Retour d'expérience",
  "tags": ["php", "analytics", "rgpd", "sécurité", "no-database"]
}

This is already used for the blog listing and the sitemap. Might as well use it for suggestions too.
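As an illustration, loading that file is a few lines. This is a sketch, not the blog's actual code: the helper name load_posts() is hypothetical, and the path is passed as a parameter for testability (the real code reads __DIR__ . '/posts.json' directly).

```php
<?php
// Hypothetical helper: decode all article metadata from a posts.json file.
// Returns an empty array if the file is missing or contains invalid JSON.
function load_posts(string $path): array {
    $json = @file_get_contents($path);          // false if unreadable
    if ($json === false) {
        return [];
    }
    return json_decode($json, true) ?? [];      // null on invalid JSON -> []
}
```

Because invalid input degrades to an empty array, every consumer (listing, sitemap, suggestions) can iterate without defensive checks.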

The scoring algorithm

The principle is straightforward: for each candidate article (all articles except the current one), we calculate a relevance score based on two criteria:

  • +2 per shared tag — tags are the most precise signal
  • +1 if same category — a broader contextual signal

We sort by descending score, break ties by most recent, and keep the top 3. Articles with a score of 0 (nothing in common) are excluded.

function find_related_posts(string $slug, int $limit = 3): array {
    $json = @file_get_contents(__DIR__ . '/posts.json'); // false if unreadable
    $all  = $json !== false ? (json_decode($json, true) ?? []) : [];

    // Find the current article
    $current = null;
    foreach ($all as $p) {
        if ($p['slug'] === $slug) { $current = $p; break; }
    }
    if (!$current) return [];

    $current_tags = $current['tags'] ?? [];
    $current_cat  = $current['category'] ?? '';

    // Score the candidates
    $scored = [];
    foreach ($all as $p) {
        if ($p['slug'] === $slug) continue;

        $score = count(array_intersect($current_tags, $p['tags'] ?? [])) * 2
               + ($p['category'] === $current_cat ? 1 : 0);

        if ($score > 0) {
            $scored[] = ['score' => $score, 'post' => $p];
        }
    }

    // Sort by score desc, then date desc on ties
    usort($scored, fn($a, $b) =>
        $b['score'] <=> $a['score'] ?: strcmp($b['post']['date'], $a['post']['date'])
    );

    return array_map(fn($s) => [
        'url'   => '/blog/' . $s['post']['slug'],
        'title' => $s['post']['title'],
    ], array_slice($scored, 0, $limit));
}
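The `?:` after the spaceship operator is what does the tie-breaking: when scores are equal, `<=>` yields 0, which is falsy, so the short ternary falls through to the date comparison. A tiny isolated demo of that idiom, on invented data:

```php
<?php
$items = [
    ['score' => 2, 'date' => '2026-01-10'],
    ['score' => 3, 'date' => '2026-01-05'],
    ['score' => 2, 'date' => '2026-02-01'],
];

// Score descending; on ties, most recent ISO date first.
// strcmp works here because YYYY-MM-DD sorts lexicographically.
usort($items, fn($a, $b) =>
    $b['score'] <=> $a['score'] ?: strcmp($b['date'], $a['date'])
);

// Resulting order: the score-3 item first, then the two
// score-2 items with the newest date before the older one.
```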

blog_footer() already accepted a $slug parameter (used to load comments). It just needs to auto-trigger the calculation when no explicit list is passed:

function blog_footer($related_posts = [], $slug = null) {
    if (empty($related_posts) && $slug !== null) {
        $related_posts = find_related_posts($slug);
    }
    // ...
}

On the article side, the call becomes:

<?php blog_footer([], 'my-article-slug'); ?>

That's it. Existing articles that were already passing their slug didn't need to be modified — they inherit the automatic behavior.

Why weight tags x2

A category groups articles with similar context but not necessarily similar subjects: "Retour d'expérience" can cover PHP just as well as Bash. Tags are more precise: two articles sharing the tags php and no-database are probably discussing the same problem. The x2 weighting on tags ensures that topical proximity takes precedence over categorical proximity.
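To make the weighting concrete, here is the same scoring rule applied to two invented candidates (the article data below is purely illustrative):

```php
<?php
// Same rule as in find_related_posts(): +2 per shared tag, +1 for same category.
function score(array $current, array $candidate): int {
    return count(array_intersect($current['tags'], $candidate['tags'])) * 2
         + ($candidate['category'] === $current['category'] ? 1 : 0);
}

$current = ['tags' => ['php', 'no-database'], 'category' => "Retour d'expérience"];

// Candidate A: same category, no shared tag -> score 1.
$a = ['tags' => ['bash'], 'category' => "Retour d'expérience"];

// Candidate B: two shared tags, different category -> score 4, ranked first.
$b = ['tags' => ['php', 'no-database', 'json'], 'category' => 'Tutoriel'];
```

A single strong tag match (score 2) already outranks a category match alone (score 1), which is exactly the intended behavior.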

Limitations

The posts.json file is read and decoded on every article page load. On a low-traffic blog with a dozen articles, this is negligible. If volume grows, a simple in-memory cache (a static variable within the request, or APCu across requests) would be sufficient. But that's not today's problem.
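As a sketch of the cheapest caching option, a static variable memoizes the decoded file for the duration of one request, so several calls on the same page hit the disk only once. The name cached_posts() is hypothetical, not part of the blog's code:

```php
<?php
// Sketch: per-request memoization of posts.json, keyed by path.
// The file is read and decoded at most once per request.
function cached_posts(string $path): array {
    static $cache = [];
    if (!isset($cache[$path])) {
        $json = @file_get_contents($path);
        $cache[$path] = $json !== false ? (json_decode($json, true) ?? []) : [];
    }
    return $cache[$path];
}
```

The trade-off: within one request, changes to the file are invisible, which is exactly what makes it safe and fast here.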

The algorithm doesn't consider the actual content of the articles, only their metadata. The quality of suggestions therefore depends directly on the quality of the tagging. Which is a good reason to be deliberate with tags rather than slapping them on at random.
