#2109
"you had 2 data points"
++ Yes, I only have 2 and try to make best use of those.
"let's say the error rate is not in the single digits for either engine"
++ Well, the error rate is that low, and it gets lower with more time per move.
"today it doesn't require 100s of games to determine which engine is best"
In the TCEC superfinals they have to impose slightly unbalanced openings to avoid every game ending in a draw.
They also publish all the information about evaluation, nodes/s, and 7-man endgame tablebase hits:
https://tcec-chess.com/
#2106
"why error rate is a linear function of time"
++ I did not assume a linear dependence.
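On the contrary, I assumed a logarithmic-type dependence, i.e. each x60 in time per move multiplies the number of positions per error by the same factor (3478 / 679, about 5.1):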
at 1 s/move: 1 error / 679 positions
at 1 min/move: 1 error / 3478 positions
at 1 h/move: 1 error / (3478 * 3478 / 679) positions = 1 error / 17815 positions
at 60 h/move: 1 error / (17815 * 17815 / 3478) positions = 1 error / 91251 positions
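The extrapolation above can be reproduced with a short sketch. It assumes, as in the table, that only the two measured points (1 s/move and 1 min/move) are known and that the same ratio applies to every further x60 in time, which amounts to a power law in time per move. The function name `positions_per_error` is made up for illustration, and the last value can differ from the hand-computed one by a position or two because the table rounds the intermediate result.

```python
import math

# The two data points quoted above: (seconds per move, positions per error).
T1, N1 = 1.0, 679.0
T2, N2 = 60.0, 3478.0

# Assumption: positions per error follow N(t) = N1 * (t / T1) ** k,
# so every x60 in time multiplies N by the same ratio N2 / N1.
k = math.log(N2 / N1) / math.log(T2 / T1)


def positions_per_error(seconds_per_move: float) -> float:
    """Extrapolated number of positions per error (hypothetical helper)."""
    return N1 * (seconds_per_move / T1) ** k


for label, t in [("1 s", 1), ("1 min", 60), ("1 h", 3600), ("60 h", 216000)]:
    print(f"{label:>6}/move: 1 error / {positions_per_error(t):.0f} positions")
```

With only two data points the exponent k is fitted exactly, so the sketch cannot say whether the power law itself holds; it only makes the assumed rule explicit.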
"this could only be a minimum error rate "
++ As the error rate is that low, the occurrence of two or more errors can be neglected.
P(2 errors) = P(2 errors | 1 error) * P(1 error) ~= P(1 error)^2 << P(1 error)
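A rough numerical check of this inequality, as a sketch only: it assumes errors occur independently with the same probability in every position (a binomial model, which the argument above does not state explicitly), and the figure of 100 positions per game is a made-up illustration, not a number from this thread.

```python
def error_probabilities(positions_per_error: float, positions_per_game: int):
    """Return (P(0 errors), P(exactly 1 error), P(2 or more errors)) per game.

    Assumption: each position independently contains an error with
    probability p = 1 / positions_per_error (binomial model).
    """
    p = 1.0 / positions_per_error
    n = positions_per_game
    p0 = (1.0 - p) ** n
    p1 = n * p * (1.0 - p) ** (n - 1)
    p2_plus = 1.0 - p0 - p1
    return p0, p1, p2_plus


# Hypothetical game length of 100 positions, at the extrapolated 1 h/move rate.
p0, p1, p2_plus = error_probabilities(17815, 100)
print(f"P(0 errors)  = {p0:.6f}")
print(f"P(1 error)   = {p1:.6f}")
print(f"P(2+ errors) = {p2_plus:.8f}")
```

Under these assumptions P(2+ errors) comes out far smaller than P(1 error), which is the sense in which games with multiple errors are neglected. If errors are correlated, the model no longer applies.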
"this could only be a relative error rate"
++ By the generally accepted hypothesis that chess is a draw, each decisive game must contain at least 1 absolute error: a move that turns a drawn position into a lost one.
Eh, I don't know why I said that. Yeah, the rule you used is when the input is x60 the output is x5... that's obviously not linear... but I still don't know the logic for why that works other than you had 2 data points and are just playing with ratios.
As for draws, oh I see, you're saying draws that happen after zero errors are much more likely than draws that happen after 2, so we can just ignore games with multiple errors.
Eh... isn't this making a lot of assumptions? For example let's say the error rate is not in the single digits for either engine, and the winner is routinely committing several fewer errors than the loser. Why is this scenario unlikely? Current SF (Stockfish) is something like 300 points stronger (link below) than the one that played AZ (AlphaZero), which was SF8. Is it really sensible that the error rate was so low 5 years ago when today it doesn't require 100s of games to determine which engine is best for e.g. CCRL 40/40?
https://www.chessprogramming.org/images/0/04/SfElo.png
https://ccrl.chessdom.com/ccrl/4040/
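To put the 300-point gap in perspective, here is a minimal sketch using the standard Elo expected-score formula. The 300-point figure is the rough estimate from the post above, not a measured value, and `expected_score` is a name chosen for illustration.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score (win = 1, draw = 0.5) for the side rated elo_diff higher,
    using the standard logistic Elo formula."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# Rough 300-point gap between current Stockfish and SF8, as estimated above.
print(f"Expected score at +300 Elo: {expected_score(300):.3f}")
```

The formula only gives an expected score; it does not say how that score splits between wins and draws, so by itself it does not settle how many errors each side makes.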