Beta · Scientific papers · cited; not independently recomputed
Camerer et al. — social science experiment replication (Nature/Science 2010-2015)
- Source class: Scientific papers
- Metric: Replication rate + effect-size shrinkage
- Reported value: 13 of 21 social science experiments replicated (62%); replication effect sizes averaged about 50% of the originals
- Measured: 2018-08-27
Context
Companion study to the Open Science Collaboration's 2015 effort, focused on the 21 social-behavioral experiments published in Nature and Science between 2010 and 2015 that met the inclusion criteria. The replication rate was higher than for psychology overall (62% vs. 36%), but effect sizes still shrank systematically. This finding serves as the base rate for any per-paper Phase 1 scoring of social-science publications.
Citation
Camerer, C. F., Dreber, A., Holzmeister, F., et al. (2018). Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour 2, 637–644.
https://doi.org/10.1038/s41562-018-0399-z
What Phase 1 launch will add
Calibration Ledger has not independently recomputed the value above. Phase 1 launch (target Q3 2027, gated on prerequisites) will add the following for this source class:
- Independent recomputation from the original outcome data, under data-licensing agreement
- Time-windowed breakdown (rolling 3-month, 12-month, lifetime)
- Cross-domain calibration (does this source calibrate uniformly across topical verticals?)
- Append-only timestamp anchoring of every score so retroactive revisions are visible (sketched below)
- Per-source citation page with full Murphy decomposition (Reliability − Resolution + Uncertainty), also sketched below
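The Murphy decomposition named in the last item partitions the Brier score as Reliability − Resolution + Uncertainty over forecasts grouped by predicted probability. The sketch below is a minimal illustration of that partition for binary outcomes; the function name, bin count, and equal-width binning are assumptions for illustration, not the Phase 1 implementation.

```python
import numpy as np

def murphy_decomposition(probs, outcomes, n_bins=10):
    """Partition the Brier score as Reliability - Resolution + Uncertainty
    (Murphy 1973) for binary outcomes, with forecasts grouped into bins.

    probs    : forecast probabilities in [0, 1]
    outcomes : binary outcomes (0 or 1)
    """
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    n = len(probs)
    base_rate = outcomes.mean()

    # Assign each forecast to one of n_bins equal-width probability bins.
    bins = np.minimum((probs * n_bins).astype(int), n_bins - 1)

    reliability = 0.0
    resolution = 0.0
    for k in range(n_bins):
        mask = bins == k
        n_k = int(mask.sum())
        if n_k == 0:
            continue
        f_k = probs[mask].mean()      # mean forecast within the bin
        o_k = outcomes[mask].mean()   # observed frequency within the bin
        reliability += n_k * (f_k - o_k) ** 2
        resolution += n_k * (o_k - base_rate) ** 2
    reliability /= n
    resolution /= n
    uncertainty = base_rate * (1.0 - base_rate)

    return {
        "brier": float(np.mean((probs - outcomes) ** 2)),
        "reliability": reliability,
        "resolution": resolution,
        "uncertainty": uncertainty,
        # Equals the Brier score exactly when every forecast in a bin shares
        # the same value; with heterogeneous bins a small residual remains.
        "recomposed": reliability - resolution + uncertainty,
    }
```

A well-calibrated source drives Reliability toward zero, while Resolution rewards forecasts that separate outcomes away from the base rate.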
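For the append-only timestamp anchoring item, one common way to make retroactive revisions visible is a hash chain: each new entry commits to the hash of the previous entry, so editing an earlier score changes every later hash and is detectable. The sketch below is illustrative only; the record fields and storage are assumptions here, and the Ledger's actual mechanism belongs to the Methodology v1.1 append-only framework and may differ.

```python
import hashlib
import json
import time

def append_score(ledger, record):
    """Append a score record to an in-memory hash-chained log.

    Each entry stores a UTC timestamp, the record, and the previous entry's
    hash; a revision is always a new appended entry, never an in-place edit.
    """
    prev_hash = ledger[-1]["entry_hash"] if ledger else "0" * 64
    entry = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "record": record,        # hypothetical fields, e.g. {"source": ..., "brier": ...}
        "prev_hash": prev_hash,
    }
    entry["entry_hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    ledger.append(entry)
    return entry

# Hypothetical usage: scores are only ever appended, never edited in place.
ledger = []
append_score(ledger, {"source": "example-paper", "brier": 0.21})
append_score(ledger, {"source": "example-paper", "brier": 0.19})
```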
Other findings in the same source class
- Open Science Collaboration — psychological science replication rate — Replication rate + effect-size shrinkage
- Hausfather et al. — climate model projections vs. observed warming — Implied transient climate response error; observed-vs-projected warming
All other findings
- Good Judgment Project Superforecasters (Human forecasters)
- Metaculus community-prediction aggregate (Forecaster aggregator platform)
- Manifold Markets — platform calibration (Prediction market)
- GPT-4 (OpenAI) — pre-RLHF vs post-RLHF calibration (AI models)
- Sell-side equity analysts — earnings forecast accuracy (Analyst firms)
- Anthropic — Claude / language model self-knowledge (AI models)
- Federal Reserve Survey of Professional Forecasters — GDP / inflation accuracy (Analyst firms)
Related
- All beta findings — at-a-glance + JSON + BibTeX exports
- Methodology v1.1 — full Brier + Murphy + append-only framework
- Operator track record — methodology applied to Paulo de Vries’s own dated forecasts
- Source classes — what each of the 6 source classes will score at Phase 1
- Roadmap — milestone status + Q3 2027 launch gate + kill criterion
Last verified: 2026-04-28. Cited; Calibration Ledger has not independently recomputed this finding. Independent recomputation is planned for Phase 1 (target Q3 2027). Operator: Paulo de Vries. Contact: contact@calibrationledger.com.