GPT-4 (OpenAI) — pre-RLHF vs post-RLHF calibration
- Source class: AI models
- Metric: Expected Calibration Error (ECE) on multiple-choice benchmarks
- Reported value: pre-RLHF well-calibrated; post-RLHF degraded calibration (per OpenAI's own measurement)
- Measured: 2023-03-15
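ECE, the metric named above, is the confidence-weighted gap between a model's stated confidence and its realized accuracy. A minimal sketch of the standard equal-width-bin computation follows; the function name, bin count, and binning scheme are illustrative assumptions, not Calibration Ledger's or OpenAI's implementation:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: sum over confidence bins of (bin weight) x |mean confidence - accuracy|.

    confidences: predicted probabilities in [0, 1] for the chosen answer.
    correct: 1 if the chosen answer was right, else 0.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Half-open bins (lo, hi]; the first bin also catches exact zeros.
        mask = (confidences > lo) & (confidences <= hi)
        if lo == 0.0:
            mask |= confidences == 0.0
        if not mask.any():
            continue
        gap = abs(confidences[mask].mean() - correct[mask].mean())
        ece += mask.mean() * gap  # mask.mean() is the fraction of samples in the bin
    return ece
```

A well-calibrated source (90% confidence, 90% accuracy) yields an ECE near zero; systematic overconfidence inflates it.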
Context
OpenAI's GPT-4 Technical Report explicitly states that the base (pre-RLHF) GPT-4 model is well-calibrated on multiple-choice benchmarks (calibration plot in §3.2, "Calibration"), and that RLHF post-training degraded that calibration. This is a rare publisher-acknowledged calibration finding for a frontier LLM.
Citation
OpenAI (2023). GPT-4 Technical Report. arXiv:2303.08774. §3.2 "Calibration".
https://arxiv.org/abs/2303.08774
What Phase 1 launch will add
Calibration Ledger has not independently recomputed the value above. Phase 1 launch (target Q3 2027, gated on prerequisites) will add the following for this source class:
- Independent recomputation from the original outcome data, under data-licensing agreement
- Time-windowed breakdown (rolling 3-month, 12-month, lifetime)
- Cross-domain calibration (does this source calibrate uniformly across topical verticals?)
- Append-only timestamp anchoring of every score so retroactive revisions are visible
- Per-source citation page with full Murphy decomposition (Reliability − Resolution + Uncertainty)
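The Murphy decomposition named in the last bullet splits the Brier score into Reliability − Resolution + Uncertainty. A minimal binned sketch follows; the function name and bin edges are illustrative assumptions, and the decomposition is exact only when all forecasts within a bin share one value:

```python
import numpy as np

def murphy_decomposition(forecasts, outcomes, n_bins=10):
    """Return (reliability, resolution, uncertainty) for binary forecasts.

    Brier score = reliability - resolution + uncertainty (Murphy, 1973).
    forecasts: probabilities in [0, 1]; outcomes: 0/1.
    """
    f = np.asarray(forecasts, dtype=float)
    o = np.asarray(outcomes, dtype=float)
    n = len(f)
    base_rate = o.mean()
    uncertainty = base_rate * (1.0 - base_rate)  # variance of the outcomes
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    reliability = resolution = 0.0
    for i, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        # Half-open bins [lo, hi); the last bin also includes 1.0.
        mask = (f >= lo) & ((f < hi) if i < n_bins - 1 else (f <= hi))
        if not mask.any():
            continue
        w = mask.sum() / n          # fraction of forecasts in this bin
        f_bar = f[mask].mean()      # mean forecast in the bin
        o_bar = o[mask].mean()      # observed frequency in the bin
        reliability += w * (f_bar - o_bar) ** 2  # miscalibration penalty
        resolution += w * (o_bar - base_rate) ** 2  # discrimination reward
    return reliability, resolution, uncertainty
```

Reliability near zero means the source's stated probabilities match observed frequencies; resolution rewards forecasts that sort events away from the base rate; uncertainty is a property of the events, not the forecaster.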
Other findings in the same source class
- Anthropic — Claude language-model self-knowledge: P(IK), the probability the model assigns to "I know the answer", and P(True), the calibration of its confidence in its own answers
All other findings
- Good Judgment Project Superforecasters (Human forecasters)
- Metaculus community-prediction aggregate (Forecaster aggregator platform)
- Manifold Markets — platform calibration (Prediction market)
- Sell-side equity analysts — earnings forecast accuracy (Analyst firms)
- Open Science Collaboration — psychological science replication rate (Scientific papers)
- Camerer et al. — social science experiment replication (Nature/Science 2010-2015) (Scientific papers)
- Federal Reserve Survey of Professional Forecasters — GDP / inflation accuracy (Analyst firms)
- Hausfather et al. — climate model projections vs. observed warming (Scientific papers)
Related
- All beta findings — at-a-glance + JSON + BibTeX exports
- Methodology v1.1 — full Brier + Murphy + append-only framework
- Operator track record — methodology applied to Paulo de Vries’s own dated forecasts
- Source classes — what each of the 6 source classes will score at Phase 1
- Roadmap — milestone status + Q3 2027 launch gate + kill criterion
Last verified: 2026-04-28. Cited; Calibration Ledger has not independently recomputed this finding. Independent recomputation in Phase 1 (Q3 2027). Operator: Paulo de Vries. Contact: contact@calibrationledger.com.