A Bayesian ELO Inference Metric for CAMS II

Updated: Oct 7, 2025, 12:02 PM

CAMS II — Bayesian ELO Metric

Good morning, detectives. Today I have an update on a new metric that we are going to incorporate into CAMS II.

CAMS II comes with a Bayesian metric to infer player rating strength (ELO) from move quality. The approach combines a discrete reference model of centipawn-loss (CPL) distributions by ELO bin with a likelihood-based posterior over bins, complemented by a one-sided floor test that answers: “What is the minimum ELO we can statistically sustain?” We validate on a large-scale Lichess corpus of approximately 2,500,000 games, aggregated into fixed-size batches within the 700–2799 range, and report aggregate performance of MAE = 77.3 ELO, RMSE = 104.6 ELO, and exact-bin accuracy ≈ 0.4.
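To make the idea concrete, here is a minimal Python sketch of a likelihood-based posterior over ELO bins with a one-sided floor on top of it. It is not the CAMS II implementation: the exponential CPL model, the `MEAN_CPL` values, the uniform prior, the 95% confidence level, and all function names are assumptions made purely for illustration.

```python
import math

# Hypothetical reference model: one mean centipawn loss (CPL) per 100-point ELO bin.
# These values are invented for illustration, not fitted on real data.
BIN_EDGES = list(range(700, 2800, 100))  # lower edge of each bin: 700, 800, ..., 2700
MEAN_CPL = [140.0 - 5.0 * i for i in range(len(BIN_EDGES))]  # 140 at 700 down to 40 at 2700


def log_likelihood(cpls, mean_cpl):
    """Log-likelihood of the observed CPL values under an exponential model (assumed)."""
    rate = 1.0 / mean_cpl
    return sum(math.log(rate) - rate * c for c in cpls)


def posterior_over_bins(cpls):
    """Posterior probability of each ELO bin, assuming a uniform prior over bins."""
    logs = [log_likelihood(cpls, m) for m in MEAN_CPL]
    top = max(logs)  # subtract the maximum before exponentiating, for numerical stability
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]


def elo_floor(posterior, confidence=0.95):
    """Highest bin lower edge b such that P(ELO >= b) is at least `confidence`."""
    tail = 0.0
    # Walk from the strongest bin downward, accumulating the upper-tail probability.
    for i in range(len(posterior) - 1, -1, -1):
        tail += posterior[i]
        if tail >= confidence:
            return BIN_EDGES[i]
    return BIN_EDGES[0]


# Toy batch of per-move CPL values (made up).
cpls = [90.0] * 200
post = posterior_over_bins(cpls)
map_bin = BIN_EDGES[post.index(max(post))]
print("most likely bin starts at", map_bin)
print("floor at 95% confidence:", elo_floor(post))
```

The floor is deliberately more conservative than the most likely bin: it only reports a rating that the posterior supports with the chosen confidence.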

(You can click on the image or here to go to the full 4-page document.)

For those of you who are not interested in the mathematics, here are the results.

Overall Performance

MAE ≈ 77 ELO → On average, the estimate is about 77 points away from the actual rating.
--> For example, for a player rated 1800 ELO, a typical CAMS II estimate falls between ~1720 and ~1880.

RMSE ≈ 105 ELO → This metric penalizes large errors more heavily, but it is still low, which indicates overall consistency.

Exact-bin accuracy ≈ 0.38 → The estimate lands in the exact 100-point bin almost 4 out of 10 times.

These values come from ~2.5 million Lichess games, grouped into fixed-size batches.
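For readers who want to see how these three numbers are defined, the following Python sketch computes them on a handful of made-up ratings. The function names, the 700 lower bound and 100-point bin width used in `bin_index`, and the toy data are assumptions for illustration; the real evaluation runs on batches of Lichess games.

```python
import math


def bin_index(elo, low=700, width=100):
    """Index of the 100-point bin that contains `elo` (bin 0 starts at 700)."""
    return int((elo - low) // width)


def evaluate(true_elos, predicted_elos):
    """Return (MAE, RMSE, exact-bin accuracy) for paired true and predicted ratings."""
    pairs = list(zip(true_elos, predicted_elos))
    n = len(pairs)
    errors = [p - t for t, p in pairs]
    mae = sum(abs(e) for e in errors) / n               # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)    # squaring weights large misses more
    exact = sum(bin_index(t) == bin_index(p) for t, p in pairs) / n
    return mae, rmse, exact


# Toy data (invented): four true ratings and four estimates.
true_elos = [1800, 1500, 2100, 1250]
predicted_elos = [1850, 1420, 2130, 1400]
mae, rmse, exact = evaluate(true_elos, predicted_elos)
print(f"MAE={mae:.1f}  RMSE={rmse:.1f}  exact-bin={exact:.2f}")
```

Because RMSE squares each error before averaging, a single large miss raises it more than it raises MAE, which is why the two are reported side by side.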

An average error of ±77 ELO is more than acceptable for a purely positional estimator: it doesn't consider wins, tempo, or openings, only the quality of the moves.

The consistency by bin shows that the method doesn't "collapse" at any range: the curve is smooth and stable.

CAMS II doesn't have a release date yet, as it's still in beta, but I hope it represents a substantial improvement over some of the initial limitations of the first version.

See you next time!

Jordis Blog