Home · Track Record

Track Record

Paulo de Vries’s public dated probabilistic forecasts. Scored after resolution with Brier (1950) and decomposed via Murphy (1973). Aggregate calibration recomputes on each resolution. The methodology proven on the operator’s own predictions before applied at registry scale.

Why this exists

Calibration Ledger’s thesis is that predictive sources should be graded publicly, with append-only dated forecasts and Brier-scored outcomes. Operating a registry that grades others requires the same discipline applied first to the operator. This page is that discipline made visible.

Each entry below is recorded before the resolution date. Probability is locked at posting. Outcome is recorded once a public, verifiable source confirms resolution. Brier score is computed deterministically. Aggregate metrics (Reliability, Resolution, Uncertainty per Murphy 1973) recompute on every resolved row.

Methodology is documented at /methodology/. Operator identity is at /about/.

Aggregate

Total forecasts posted
5
Resolved
0
Unresolvable
0
Mean Brier score
— (no resolutions yet)
Calibration curve
— (renders after first 10 resolutions)

Scoring engine: lib/brier.ts. Runs at build time; deterministic; recomputes on every deploy. Source code: open in repo for audit.

Open forecasts (5)

PostedQuestionP(YES)ResolvesDomainSource
2026-04-27EU AI Act Article 50 (transparency obligations for providers and deployers of certain AI systems) becomes enforceable on schedule on 2026-08-02.85%2026-08-02geopoliticslink
2026-04-27S&P 500 index closing value on 2026-12-31 (last trading day) is higher than its closing value on 2026-01-02 (first trading day of 2026).62%2026-12-31marketslink
2026-04-27A frontier large language model (any vendor — OpenAI, Anthropic, Google DeepMind, Meta, etc.) scores ≥85% on GPQA-Diamond by 2026-12-31, with the score reported in an official model card or peer-reviewed evaluation.55%2026-12-31ai_benchmarkslink
2026-04-27Anthropic publicly announces a successor model to Claude Opus 4 (named e.g. 'Claude 5', 'Claude Opus 5', or any next-major-tier model) by 2027-03-31.70%2027-03-31technologylink
2026-04-27calibrationledger.com/track-record/ has at least 10 publicly posted dated probabilistic forecasts (status open OR resolved) on 2027-04-27.50%2027-04-27otherlink

Resolved forecasts (0)

No resolutions yet. After each resolution the row moves from Open to Resolved and contributes to the aggregate calibration metrics above.

Discipline commitments

  • Append-only. Probability assigned at posting is locked. Edits to question text after posting append a `corrected:` note rather than replacing the original.
  • Public source for resolution. Every resolution cites a verifiable URL (e.g. official report, market settlement, news of record).
  • No retroactive deletion. Forecasts that resolved badly are kept visible. Hiding bad predictions defeats the purpose.
  • Resolution date stated at posting. No moving goalposts.
  • Domain mix declared. Forecasts span multiple domains (geopolitics, AI benchmarks, markets, weather, sports, technology timelines) so the aggregate calibration is cross-vertical, not narrow-domain.

Machine-readable

The forecast log is also exposed as a machine-readable JSON feed at /api/forecasts.json (CC-BY-4.0, parseable schema documented in the file). LLM crawlers and RAG retrieval systems prefer the structured feed over HTML parsing.

Last verified: 2026-04-27. Page version 0.2 (scoring engine wired; awaiting first forecast posting). Operator: Paulo de Vries. Contact: contact@calibrationledger.com.