Automated multi-topic intelligence dashboard: concrete architecture with Claude and PHP

I started with a single crypto-veille.js script. 400 lines, everything hardcoded: categories, the prompt, FTP logic. It worked. Then I wanted the same thing to track the Epstein case, retro console releases, and tech/AI news. Copy-pasting a 400-line script four times with variations? Not my idea of a good time.

The exercise took three days of iterative development (with Claude Code as a pair programmer) and produced something interesting: a generic architecture driven by a central configuration file, a reusable Node.js runner, and a PHP rendering pattern that cleanly separates data from presentation. What I learned along the way — especially the bugs that cost the most time — is worth documenting.

The architecture in one sentence

A Node.js daemon polls registry.json every minute, decides which topics are due, launches run-veille.js --slug <slug>, which calls Claude CLI with WebSearch, deduplicates results, persists them as JSON, then calls Claude again (without WebSearch) to patch a structured article.json file. PHP reads these JSON files and generates HTML on the fly.


[veille-daemon.js]   ← runs continuously (systemd service)
    ↓ every minute: compare frequency_hours
    ↓ if cycle due → launch run-veille.js --slug <slug>

[run-veille.js]
    1. load registry.json (central config)
    2. claude --print --allowedTools WebSearch,WebFetch → items[] JSON
    3. deduplication via SHA256 of normalized title
    4. merge + prune → updates.json (atomic write)
    5. if render_prompt → claude --print (no WebSearch) → patch article.json
    6. if summary_weekly_day → summaries.json

[PHP veille/<slug>/index.php]
    include _veille-page.php → reads updates.json + article.json → HTML
        

The key element: everything that varies between topics is in registry.json. The Node.js code is generic. Adding a new topic requires zero changes to the runner.
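The daemon's scheduling decision can be sketched as a pair of pure functions over registry entries. The state shape (a `last_run` timestamp per slug) and the function names here are assumptions, not the actual veille-daemon.js internals:

```javascript
// Sketch of the daemon's due-check as pure functions.
// The state shape ({ <slug>: { last_run } }) is an assumption.
function isDue(lastRunISO, frequencyHours, nowMs) {
  const last = lastRunISO ? Date.parse(lastRunISO) : 0; // never run → due immediately
  return nowMs >= last + frequencyHours * 3600 * 1000;
}

function dueSlugs(registry, state, nowMs) {
  return Object.entries(registry.veilles)
    .filter(([slug, cfg]) => isDue((state[slug] || {}).last_run, cfg.frequency_hours, nowMs))
    .map(([slug]) => slug);
}
```

The daemon would call dueSlugs() once a minute and spawn run-veille.js --slug <slug> for each result, recording the new last_run only after a successful cycle.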

registry.json — the heart of the system

Each topic is an entry in uploads/veille/registry.json:


{
  "veilles": {
    "retro": {
      "slug": "retro",
      "label": "Retro Consoles",
      "frequency_hours": 168,
      "prune_days": 180,
      "categories": ["NOUVELLE_CONSOLE", "FIRMWARE", "BON_PLAN"],
      "ftp_files": [
        { "local": "uploads/veille/retro/updates.json",  "remote": "/www/uploads/veille/retro/updates.json" },
        { "local": "uploads/veille/retro/article.json",  "remote": "/www/uploads/veille/retro/article.json" }
      ],
      "prompt": "You are an expert on the retro console market...",
      "render_prompt": "Update the JSON guide. Return only a patch..."
    }
  }
}
        

ftp_files[0] is always the raw news feed (updates.json). ftp_files[1+] are secondary files. render_prompt is optional — only for topics that have a structured article in addition to the news feed.
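Since the whole system hinges on these entries being well-formed, a defensive check at load time is cheap insurance. The rules in this sketch are inferred from the example entry above, not taken from the actual runner:

```javascript
// Hypothetical validation of a registry entry at load time.
// Field rules are inferred from the example config, not from run-veille.js.
function validateEntry(slug, cfg) {
  const errors = [];
  if (cfg.slug !== slug) errors.push(`slug "${cfg.slug}" does not match key "${slug}"`);
  if (!Number.isFinite(cfg.frequency_hours) || cfg.frequency_hours <= 0)
    errors.push('frequency_hours must be a positive number');
  if (!Array.isArray(cfg.ftp_files) || cfg.ftp_files.length === 0)
    errors.push('ftp_files[0] (the updates.json feed) is required');
  if (typeof cfg.prompt !== 'string' || cfg.prompt.length === 0)
    errors.push('prompt is required');
  if (cfg.render_prompt && (!Array.isArray(cfg.ftp_files) || cfg.ftp_files.length < 2))
    errors.push('render_prompt implies a secondary article.json in ftp_files');
  return errors; // empty array → entry is usable
}
```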

The article.json pattern — two Claude passes

The system distinguishes between two types of output files:

  • updates.json: a chronological feed of items (news, events). Append-only, with automatic pruning.
  • article.json: a structured document enriched each cycle. Contains rankings, prices, biographies, analyses — everything that can't be reduced to "here is the latest news".

For article.json, pass 1 (WebSearch) collects raw data. Pass 2 (without WebSearch) receives that data plus the current state of article.json and returns a partial JSON patch — only the changed fields. The runner merges this patch with the existing document.


// renderArticle() — simplified
function renderArticle(renderPrompt, inputData) {
    const currentArticle = readJSON('article.json');

    const claudeInput = renderPrompt
        + '\n\nNEW INFORMATION:\n' + JSON.stringify(inputData)
        + '\n\nCURRENT DATA (article.json):\n' + JSON.stringify(currentArticle);

    // Claude without WebSearch — works on data passed in context
    const patch = callClaude(claudeInput, { noTools: true });

    // Shallow merge: top-level fields present in the patch replace the old
    // values wholesale; fields the patch omits survive from the current article
    const merged = { ...currentArticle, ...patch, last_updated: new Date().toISOString() };
    writeAtomically('article.json', merged);
}
        

The PHP side: 3 lines per topic

Each veille/<slug>/index.php does exactly this:


<?php
$veille_slug = 'retro';
include __DIR__ . '/../_veille-page.php';
        

_veille-page.php handles the shared structure (header, filters, news feed). Each topic has its own _article.php that reads article.json and displays the rich section: console rankings for retro, judicial timeline for epstein, AI model benchmarks for techno.

The bugs that cost the most time

Bug 1 — The secondary files loop was overwriting article.json

After each cycle, run-veille.js loops over ftp_files[1+] to write top-level fields from the current cycle (e.g. current_prices, market_note). The problem: this loop ran before renderArticle(). It was overwriting article.json with partial cycle data — stripping the complete console ranking that renderArticle was about to enrich 30 seconds later.

The symptom: the retro page rendered empty after every cycle. The diagnosis came from reading the log timestamps:


[17:48:42] ✓ Secondary file written → uploads/veille/retro/article.json
[17:49:06] [render:NEW INFORMATION] ✓ Done → uploads/veille/retro/article.json
        

Secondary file at 17:48, render at 17:49. The render writes from what it received as context (the already-overwritten version), not from the full version. Fix in one line:


// Line 220 of run-veille.js
if (f.local.endsWith('/article.json')) continue; // managed by renderArticle(), not here
        

Bug 2 — The daemon was overwriting registry.json

veille-daemon.js compares updated_at (local vs OVH). If the remote version is more recent or equal, it overwrites the local file.

Consequence: I modify registry.json locally, deploy it to OVH, but both versions have the same updated_at. At the next polling cycle, the daemon fetches the OVH version and overwrites the local one — losing my changes.

Fix: always bump updated_at to new Date().toISOString() on every local modification before deploying. This is an operational constraint that must never be forgotten.

Bug 3 — The render_prompt wasn't updating prices

The retro topic collects current console prices each cycle (current_prices). The render_prompt was supposed to apply them to the ranking. In practice, Claude returned a short patch with only market_note and highlights — without the consoles[] array.

Cause: the render_prompt was ambiguous. Claude optimized its output and omitted "unchanged" fields. Fix: add an explicit rule in the prompt:

ABSOLUTE RULE — PRICES: If current_prices is present in the NEW INFORMATION (even partially), you MUST MANDATORILY include the "consoles" field in the patch with the COMPLETE array of consoles and updated prices.

The render_prompt has no access to WebSearch. It works only with data passed in context. What Claude can't find in pass 1, it can't find in pass 2. This must be clearly stated in the prompt to avoid incorrect expectations.

Bug 4 — Invented image URLs

The render_prompt asked Claude to find "a direct URL to an official product photo". Claude fabricated plausible-looking URLs, all of which returned 404, and nothing in the pipeline validated them.

Manual fix: search on official sites (anbernic.com, trimui.com, goretroid.com), verify each URL with curl -I, inject directly into article.json. The real lesson: don't delegate asset lookup to the render pass (which has no WebSearch). If an image URL is critical, it must be collected during pass 1 (with WebSearch).

Planned improvements

  • Image URL validation. Nothing currently verifies that URLs in article.json return a real image. An HTTP 200 check at merge time would eliminate silent 404s.
  • Price history. For the retro topic, storing a {date, price} array per console would allow displaying a price evolution chart. The data is collected every cycle — it's just not persisted yet.
  • Significant delta alerts. If a console drops more than 20% between cycles, trigger a notification. The data is there, the diff logic exists — the trigger is missing.
  • Admin UI for adding topics. Currently adding a topic means editing registry.json, creating three PHP files, and adding two routes. A simple form → file generation UI would make this accessible without touching code.
  • Retry on render pass. Pass 1 has a retry. Pass 2 doesn't — if Claude returns invalid JSON, the article isn't updated and no alert is triggered.
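The first item on that list is straightforward to sketch. This assumes Node 18+ (for the global fetch) and a per-console image_url field; both are assumptions about the data shape, not the current pipeline:

```javascript
// Validate image URLs at merge time instead of shipping silent 404s.
// Assumes Node 18+ global fetch and a per-console image_url field.
function looksLikeImage(status, contentType) {
  return status >= 200 && status < 300 && (contentType || '').startsWith('image/');
}

async function isLiveImage(url) {
  try {
    // HEAD checks status and Content-Type without downloading the image
    const res = await fetch(url, { method: 'HEAD', redirect: 'follow' });
    return looksLikeImage(res.status, res.headers.get('content-type'));
  } catch {
    return false; // DNS failure, timeout, TLS error → treat as dead
  }
}

async function validateImages(consoles) {
  for (const c of consoles) {
    if (c.image_url && !(await isLiveImage(c.image_url))) c.image_url = null;
  }
  return consoles;
}
```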

Conclusion

The final architecture fits in a few files: a JSON registry, a generic Node.js runner (~450 lines), a polling daemon, and minimal PHP templates. The power comes from the separation between configuration (registry), collection (pass 1 with WebSearch), structured enrichment (pass 2 without WebSearch), and rendering (pure PHP).

What started as a crypto-specific script became a platform where any monitoring topic can be deployed in 30 minutes: create the registry entry, write a prompt, define the article.json structure, and write the PHP template. The rest — deduplication, retry, FTP sync, pruning, periodic summaries — is provided by the runner for free.

The dashboards are live at web-developpeur.com/veille/.
