Yes, but Q-learning needs a table to store the Q(state, action) values. For chess, though, we cannot provide this table beforehand.
Nor in any discrete Q-learning problem. It is the action values that get learnt.
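To make that concrete, here is a minimal sketch of tabular Q-learning on a made-up five-square corridor (not chess). The environment, the function name `q_learning` and all parameter values are assumptions chosen for illustration. The point is that the table starts empty and its entries are learnt as states are visited.

```python
import random
from collections import defaultdict

def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor (not chess).

    States are 0..n_states-1; actions are -1 (left) and +1 (right).
    Reaching the last state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    actions = (-1, 1)
    goal = n_states - 1
    # The table starts empty: entries default to 0.0 and appear on first access.
    q = defaultdict(float)

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection, ties broken at random.
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                best = max(q[(state, a)] for a in actions)
                action = rng.choice([a for a in actions if q[(state, a)] == best])
            # Moves off the left edge leave the agent where it is.
            next_state = min(max(state + action, 0), goal)
            reward = 1.0 if next_state == goal else 0.0
            # The terminal state has no future value.
            future = 0.0 if next_state == goal else max(q[(next_state, a)] for a in actions)
            # Standard Q-learning update towards reward + discounted best next value.
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = next_state
    return q

q = q_learning()
for s in range(4):
    print(s, round(q[(s, -1)], 3), round(q[(s, 1)], 3))
```

Nothing is supplied in advance except the default value of zero. For chess the same scheme would need one entry per (position, move) pair, which is where it stops being practical.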
Yeah, I meant that for chess we don't even know the size of that (huge) matrix, so storing it in a program would be impractical.
Well, this is what some of the DeepMind work has done, and it is close to state of the art.
A way to think of it is that the network learns to evaluate a general position at a high level of abstraction, using millions of parameters.
OK, but what I meant is that the NN is not a simple two-dimensional array (something up to 10³⁷ × the average number of legal moves) that stores the Q(state, action) values as in simple Q-learning; otherwise there would be no difference. The NN is much smaller and more efficient, but it is still an approximator, so they modified the temporal difference equation to deal with the resulting instability.
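As a rough illustration of that kind of modification, here is a sketch of one Q-learning step with a function approximator and a frozen "target" copy of the parameters, which is one of the stabilising changes used in the DQN work. I am assuming that is the modification meant above. The linear model, the `features` function and the toy transitions are all hypothetical stand-ins for a real neural network and real data.

```python
def features(state, action):
    """Hypothetical feature vector; a real system would use a neural network."""
    right = 1.0 if action == 1 else 0.0
    return [1.0, state, right, state * right]

def q_value(weights, state, action):
    """Approximate Q(state, action) as a weighted sum of the features."""
    return sum(w * f for w, f in zip(weights, features(state, action)))

def td_update(weights, target_weights, transition, actions=(-1, 1), alpha=0.1, gamma=0.9):
    """One semi-gradient Q-learning step using a frozen target copy.

    transition is (state, action, reward, next_state, done).
    The bootstrap term is computed from target_weights, not weights,
    so the target does not shift with every single update.
    """
    state, action, reward, next_state, done = transition
    future = 0.0 if done else max(q_value(target_weights, next_state, a) for a in actions)
    td_error = reward + gamma * future - q_value(weights, state, action)
    # For a linear model the gradient of Q w.r.t. the weights is the feature vector.
    return [w + alpha * td_error * f for w, f in zip(weights, features(state, action))]

weights = [0.0, 0.0, 0.0, 0.0]        # online parameters, changed on every step
target_weights = list(weights)        # frozen copy, refreshed only occasionally
toy_transitions = [(0.5, 1, 1.0, 1.0, True), (0.0, 1, 0.0, 0.5, False)]
for step, transition in enumerate(toy_transitions, start=1):
    weights = td_update(weights, target_weights, transition)
    if step % 2 == 0:                 # sync period of 2 is arbitrary for this demo
        target_weights = list(weights)
print(weights)
```

The parameter vector here has four numbers instead of one table entry per (state, action) pair, which is the size advantage, and the price is that every update also changes the estimates for other states.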
@playerafar
In either of your positions in #2566, either player could be to move.
The tablebases do contain illegal positions. The number of these is unknown, but in most cases it appears to be negligible, probably smaller than the number of omitted positions with castling rights.
Determining legality is an as yet unsolved problem.
This would obviously represent a potential source of error with the extrapolation in the graph I posted, but as @Elroch already pointed out, that would be the least of my problems in that respect.
@MARattigan

Yes, you're right. With KPK lined up on the d-file, the pawn could have captured with check, followed by Kd8.
So either player could be to move, and the x2 stands.
I'll therefore delete my invalid posts.
So what that means is that the messiness in how the total position counts are arrived at in the KKP tables doesn't come from stalemates;
it comes from checks that couldn't have happened,
like you said.
Now, if we simply trust the tablebases and don't worry about illegal positions affecting the ratio of wins to draws ...
then that leaves the irksome fact that there seems to be no trend in the ratio of wins for odd numbers of pieces on the board.
To make a pun:
It's odd, that!
And does the player who is to move have an edge in the stats?
I don't know if that would help much, even if it does.
I would think that would even out.