Beta · Scientific papers · cited; not independently recomputed
Camerer et al. — social science experiment replication (Nature/Science 2010-2015)
- Source class: Scientific papers
- Metric: Replication rate + effect-size shrinkage
- Reported value: 13 of 21 social science experiments replicated (62%); replication effect sizes averaged about 50% of the originals
- Measured: 2018-08-27
Context
Companion study to the Open Science Collaboration's 2015 effort, focused on the 21 social-behavioral experiments published in Nature and Science between 2010 and 2015 that met the inclusion criteria. The replication rate was higher than for psychology overall (62% vs. 36%), but effect sizes still shrank systematically. This finding serves as the base rate for any per-paper Phase 1 scoring of social-science publications.
Citation
Camerer, C. F., Dreber, A., Holzmeister, F., et al. (2018). Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour 2, 637–644.
https://doi.org/10.1038/s41562-018-0399-z
What Phase 1 launch will add
Calibration Ledger has not independently recomputed the value above. Phase 1 launch (target Q3 2027, gated on prerequisites) will add the following for this source class:
- Independent recomputation from the original outcome data, under data-licensing agreement
- Time-windowed breakdown (rolling 3-month, 12-month, lifetime)
- Cross-domain calibration (does this source calibrate uniformly across topical verticals?)
- Append-only timestamp anchoring of every score so retroactive revisions are visible (sketched below)
- Per-source citation page with full Murphy decomposition (Reliability − Resolution + Uncertainty), also sketched below
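The Murphy decomposition named in the last item partitions the Brier score as Reliability − Resolution + Uncertainty over forecasts grouped by predicted probability. The sketch below is a minimal illustration of that partition for binary outcomes; the function name, bin count, and equal-width binning are assumptions for illustration, not the Phase 1 implementation.

```python
import numpy as np

def murphy_decomposition(probs, outcomes, n_bins=10):
    """Partition the Brier score as Reliability - Resolution + Uncertainty
    (Murphy 1973) for binary outcomes, with forecasts grouped into bins.

    probs    : forecast probabilities in [0, 1]
    outcomes : binary outcomes (0 or 1)
    """
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    n = len(probs)
    base_rate = outcomes.mean()

    # Assign each forecast to one of n_bins equal-width probability bins.
    bins = np.minimum((probs * n_bins).astype(int), n_bins - 1)

    reliability = 0.0
    resolution = 0.0
    for k in range(n_bins):
        mask = bins == k
        n_k = int(mask.sum())
        if n_k == 0:
            continue
        f_k = probs[mask].mean()      # mean forecast within the bin
        o_k = outcomes[mask].mean()   # observed frequency within the bin
        reliability += n_k * (f_k - o_k) ** 2
        resolution += n_k * (o_k - base_rate) ** 2
    reliability /= n
    resolution /= n
    uncertainty = base_rate * (1.0 - base_rate)

    return {
        "brier": float(np.mean((probs - outcomes) ** 2)),
        "reliability": reliability,
        "resolution": resolution,
        "uncertainty": uncertainty,
        # Equals the Brier score exactly when every forecast in a bin shares
        # the same value; with heterogeneous bins a small residual remains.
        "recomposed": reliability - resolution + uncertainty,
    }
```

A well-calibrated source drives Reliability toward zero, while Resolution rewards forecasts that separate outcomes away from the base rate.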
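For the append-only timestamp anchoring item, one common way to make retroactive revisions visible is a hash chain: each new entry commits to the hash of the previous entry, so editing an earlier score changes every later hash and is detectable. The sketch below is illustrative only; the record fields and storage are assumptions here, and the Ledger's actual mechanism belongs to the Methodology v1.1 append-only framework and may differ.

```python
import hashlib
import json
import time

def append_score(ledger, record):
    """Append a score record to an in-memory hash-chained log.

    Each entry stores a UTC timestamp, the record, and the previous entry's
    hash; a revision is always a new appended entry, never an in-place edit.
    """
    prev_hash = ledger[-1]["entry_hash"] if ledger else "0" * 64
    entry = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "record": record,        # hypothetical fields, e.g. {"source": ..., "brier": ...}
        "prev_hash": prev_hash,
    }
    entry["entry_hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    ledger.append(entry)
    return entry

# Hypothetical usage: scores are only ever appended, never edited in place.
ledger = []
append_score(ledger, {"source": "example-paper", "brier": 0.21})
append_score(ledger, {"source": "example-paper", "brier": 0.19})
```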
Other findings in the same source class
- Open Science Collaboration — psychological science replication rate — Replication rate + effect-size shrinkage
- Hausfather et al. — climate model projections vs. observed warming — Implied transient climate response error; observed-vs-projected warming
All other findings
- Good Judgment Project Superforecasters (Human forecasters)
- Metaculus community-prediction aggregate (Forecaster aggregator platform)
- Manifold Markets — platform calibration (Prediction market)
- GPT-4 (OpenAI) — pre-RLHF vs post-RLHF calibration (AI models)
- Sell-side equity analysts — earnings forecast accuracy (Analyst firms)
- Anthropic — Claude / language model self-knowledge (AI models)
- Federal Reserve Survey of Professional Forecasters — GDP / inflation accuracy (Analyst firms)
Related
- All beta findings — at-a-glance + JSON + BibTeX exports
- Methodology v1.1 — full Brier + Murphy + append-only framework
- Operator track record — methodology applied to Paulo de Vries’s own dated forecasts
- Source classes — what each of the 6 source classes will score at Phase 1
- Roadmap — milestone status + Q3 2027 launch gate + kill criterion
Last verified: 2026-04-28. Cited; Calibration Ledger has not independently recomputed this finding. Independent recomputation is planned for Phase 1 (target Q3 2027). Operator: Paulo de Vries. Contact: contact@calibrationledger.com.