An ELO number alone may not be enough; more parameters would be needed to modify the evaluation. From my own games I find it depends on the position and on how much strategy versus tactics the position demands. A player would also need to play enough games for the algorithm to learn their style, and even then some games are played fully focused while others are rushed because the kids are banging on the bathroom door.
An interesting format I have seen adjusts the starting material based on the ELO mismatch so that both players have a fair chance of winning, which in effect predicts how the game will play out based on the players' ELO.
Stockfish (as well as many other engines) can provide a win-draw-loss (or expected game score) evaluation of a position. This is of course based on the engine playing against itself; a human player's success rate would differ depending on their skill level.
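For reference, Stockfish exposes these statistics through its UCI_ShowWDL option. Here is a minimal sketch using python-chess, assuming the library is installed and a Stockfish binary is available (the path below is just an example):

```python
# Minimal sketch, assuming python-chess is installed and the Stockfish
# binary lives at the path below (the path is illustrative).
import chess
import chess.engine

STOCKFISH_PATH = "/usr/local/bin/stockfish"  # adjust for your system

with chess.engine.SimpleEngine.popen_uci(STOCKFISH_PATH) as engine:
    # Ask the engine to report win/draw/loss statistics with each score.
    engine.configure({"UCI_ShowWDL": True})

    board = chess.Board()  # starting position; any FEN works here
    info = engine.analyse(board, chess.engine.Limit(depth=20))

    score = info["score"].white()   # centipawn score from White's view
    wdl = info["wdl"].white()       # engine self-play W/D/L, per mille

    print("score:", score)
    print("win/draw/loss (per 1000):", wdl.wins, wdl.draws, wdl.losses)
    print("expected game score:", wdl.expectation())
```

The numbers reported this way reflect the engine's own playing strength, which is exactly the limitation mentioned above.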
It would be very interesting to obtain statistical data showing win-draw-loss probabilities (or the expected game score) for a human player with a given ELO rating as a function of the position evaluation obtained from Stockfish or another strong engine. For low rating levels the confidence of such an estimate would be very low, and it would improve substantially at higher ratings.
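To make the idea concrete, here is a rough sketch of the tallying I have in mind, written in Python with python-chess. It assumes a PGN database whose moves already carry [%eval ...] annotations (the Lichess open database is one such source); the file name, rating bands, and centipawn buckets are purely illustrative, and for simplicity it only looks at the game from White's side:

```python
# Rough sketch of the proposed analysis, not a finished tool. Assumes the
# PGN file already contains [%eval ...] annotations on the moves.
import chess.pgn
from collections import defaultdict

RESULT_SCORE = {"1-0": 1.0, "1/2-1/2": 0.5, "0-1": 0.0}  # from White's view

def bucket(elo, eval_cp, elo_step=200, eval_step=50):
    """Group positions into (rating band, centipawn band) cells."""
    return (elo // elo_step * elo_step, eval_cp // eval_step * eval_step)

def tally(pgn_path):
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [sum of scores, count]
    with open(pgn_path) as f:
        while (game := chess.pgn.read_game(f)) is not None:
            result = RESULT_SCORE.get(game.headers.get("Result", "*"))
            elo = game.headers.get("WhiteElo")
            if result is None or elo is None:
                continue
            for node in game.mainline():
                ev = node.eval()  # parsed from [%eval ...] comments, if any
                if ev is None or ev.is_mate():
                    continue
                cell = bucket(int(elo), ev.white().score())
                totals[cell][0] += result  # credit the final game result
                totals[cell][1] += 1
    return totals

# Expected score for a ~1600 player in positions evaluated around +1.00:
# counts = tally("games_with_evals.pgn")
# s, n = counts[(1600, 100)]
# print(f"expected score {s / n:.2f} over {n} positions")
```

Each (rating band, evaluation band) cell would then give an empirical expected score, which could also be split into separate win/draw/loss counts if finer detail is wanted.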
The idea seems to lie on the surface and is not difficult to implement (although it likely requires appreciable computing resources to run the analysis over a large database of games). I have not been able to find such an analysis in the public domain, perhaps because I am not formulating my search queries correctly. Does anyone have references, or can anyone suggest why this is not a valid problem to solve?
I strongly suspect that the expected points model that chess.com describes in general terms as the basis of their game review is built on something similar. Even if so, the data behind this model is not publicly available (or I don't know where to look).