Progress of 8 piece tablebase?

Avatar of MARattigan
cobra91 wrote:
...

Okay, I assume no 4-man tables were used here, as the blunder 6. Nf3?? would surely never be played by a machine that can look up the KNN vs. K ending and instantly see it's drawn. ...

That assumes that it doesn't think the position is drawn in the first place.

The game I originally posted was against Tarrasch/Stockfish, which by default doesn't display evaluations or store them in the game record. I re-ran it as SF15/on its tod v Rybka/Nalimov in Arena, which does.

Notice that SF's evaluation remains at 0.00 throughout. The only difference a 4 man tablebase would make is that it would be a hard rather than soft 0.00 when it considered conversion to KNNvK.

Irrelevantly, notice the variability in (objective) accuracy which drops from over 80% to around 20% between the two games, but at a practical level it does give Black more scope to screw up.

Edit: My last sentence was ill-advised. White blunders into a draw on move 9 in the second game (he can no longer make it under the 50 move rule), so all his subsequent moves were objectively accurate. The percentage of objectively accurate moves is, in fact, similar in the two games.

(If SF15 really wanted to improve its % accuracy it should have moved its king to the fourth rank at the outset, which draws straight away.) 

Avatar of MARattigan
bigD521 wrote:
MARattigan wrote:
bigD521 wrote:

@44

Yes 8 men are in consideration/progress

Either you did not comprehend my question, or you gave me an obscure reply.

But I think so far only DTC without the 50 move rule taken into account. I may be wrong.

Your second sentence is no doubt correct, but you don't sketch any method for producing the tablebases in question. You'd need some estimate of how long it would take, and with a lot of pawns and goals that could probably be forever.

But I think there is already software that will start from goals and produce mini tablebases (or same curtailed at a specific position). It would obviously be possible given well defined goals. I should try an internet search if you don't get any answers.

What did you mean by "(This seems unlikely to me)" by the way?

I did some searching and did not come up with anything.

My second sentence was Pawns only. I am a poor basic computer user and know nothing about programming. Therefore I cannot sketch anything.

My third sentence, with minor and major pieces, I did not think would work. Perhaps too many pieces? Perhaps not being able to both mate and promote? The code could only be written to either mate or promote, not both? Limiting the number of promotions cannot be done?

What got me thinking a while back was that perhaps it really isn't about pieces, but about how many positions are achievable. 7 man can equal 5 queens, which could give approx. 50 first moves along with King moves added. Therefore if one reduces the number of moves, then more pieces could be added. With 7 pawns each side, even all on the 2nd and 7th ranks, it still amounts to a max of 14 moves, along with perhaps 8 more with the king. The same applies to the third line.

Perhaps instead of a setup as we have now, a list of options for the user to select from. Then it would be all presented as a package.

7 piece

2K and 14 Pawns, up to 7 pawns each color.

2K 1Q and 2 ( R,N. B) and x pawns

My apologies. I did some searching and didn't come up with anything either.

I think I was thinking of Freezer, which I'd previously noticed in browsing, but it's not exactly what you want and limited to 8 men in any case.

The 8 man limit is probably because anything higher would involve impracticably long computation times. The kind of tablebase you propose with the number of pieces you propose might also take geological time to complete.
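For a rough sense of scale, here is a minimal Python sketch that counts raw piece placements. It is only a crude upper bound under stated assumptions: it ignores legality, side to move, symmetry and en passant, so real tablebase index sizes will differ, and both function names are made up for illustration.

```python
from math import comb

def placements_kings_and_pawns(white_pawns: int, black_pawns: int) -> int:
    """Raw placements of two kings plus pawns (pawns restricted to ranks 2-7)."""
    # Pawns of one colour are interchangeable, so count combinations on 48 squares.
    pawn_ways = comb(48, white_pawns) * comb(48 - white_pawns, black_pawns)
    # The two kings are distinguishable and take any two remaining squares.
    free = 64 - white_pawns - black_pawns
    return pawn_ways * free * (free - 1)

def placements_kings_and_identical_pieces(n: int) -> int:
    """Raw placements of two kings plus n identical pieces (e.g. n queens)."""
    return 64 * 63 * comb(62, n)

seven_man_queens = placements_kings_and_identical_pieces(5)   # 2K + 5 queens
sixteen_man_pawns = placements_kings_and_pawns(7, 7)          # 2K + 14 pawns
print(f"2K + 5Q : about {seven_man_queens:.1e} placements")
print(f"2K + 14P: about {sixteen_man_pawns:.1e} placements")
```

On this naive count the pawns-only case is many orders of magnitude larger than the 7 man case, even though each position has far fewer legal moves.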

That is not to say partial tablebases along the same lines would not be useful in specific circumstances.

In constructing a tablebase there is no obstacle in principle to having alternative goals or negative goals. The latter aren't used in the generally available tablebases, but a DTC tablebase, for example, will have alternative goals of mate or winning conversion to a child endgame.
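To illustrate what alternative goals mean in practice, here is a minimal sketch of retrograde analysis on a toy game graph. It is not how any real generator is written; the position names, the `solve` function and the graph itself are invented for the example, and depth is simply counted in plies to the nearest goal.

```python
def solve(moves, attacker_to_move, goals):
    """Return {position: plies to a goal} for positions the attacker can force.

    moves:            dict mapping each position to its list of successors
    attacker_to_move: set of positions where the attacking side moves
    goals:            set of positions counted as already won (any goal type)
    """
    depth = {g: 0 for g in goals}
    d = 0
    while True:
        d += 1
        new = {}
        for pos, succs in moves.items():
            if pos in depth or not succs:
                continue  # already solved, or a terminal non-goal (a draw)
            if pos in attacker_to_move:
                # Attacker needs just one move into a position solved last round.
                if any(depth.get(s) == d - 1 for s in succs):
                    new[pos] = d
            else:
                # Defender is lost only if every move leads to a solved position.
                if all(s in depth for s in succs):
                    new[pos] = d
        if not new:
            return depth
        depth.update(new)

# Toy graph: "mate" and "conv" stand for mate and a winning conversion.
moves = {
    "a1": ["d1", "d2"], "d1": ["a2", "a3"], "d2": ["draw"],
    "a2": ["mate"], "a3": ["conv"],
    "mate": [], "conv": [], "draw": [],
}
attacker = {"a1", "a2", "a3"}

both_goals = solve(moves, attacker, {"mate", "conv"})
mate_only = solve(moves, attacker, {"mate"})
print(both_goals.get("a1"), mate_only.get("a1"))
```

With both goals allowed, "a1" is a forced win; with mate as the only goal the defender can steer towards "a3", so "a1" stays unresolved in this toy table. Adding or removing goals changes only the seed set, not the procedure.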

Avatar of bigD521

@MARattigan -  No worries on the goose chase, but thanks.

As far as pieces go, of course you may be correct. Primarily I posted on the far outside chance that a programmer/coder (?) would see and take interest in these different viewpoints. Following different lines of thought could result in surpassing even the current 8 piece block right now (based on how long it takes to write).

Avatar of cobra91
MARattigan wrote:
cobra91 wrote:

Can you show an example for KQP vs. KQ or KRPP vs. KRP? Unlike KNN vs. KP, these endings are reasonably common in serious play, and I want to see exactly what is meant by "badly" and "difficult" (I do have a rough idea, but am still very curious). If such positions really can't be handled adequately without full tablebase support (by directly looking up the best move), chess could be significantly further from solved than I'd previously thought.

There's a couple of examples of KRPPvKRP here, the first a draw (with the 50 move rule in force) and the second a win. They're SF15 v SF15 rather than SF15 v Syzygy because I don't have any Syzygy tablebases (or the room to incorporate 7 men).

Would have replied much sooner, but you were the first to step up and bring hard data to the discussion. So rather than fill this thread with idle chatter, I made the tougher, more time-consuming choice to actually crunch some real numbers. I only had time to go over 5 or 6 games a day, but I've now compiled an error rate table covering every game you posted.

As I expected, SF15 performs significantly better in the more practical endings with KRPP vs. KRP, despite the obvious fact that these starting positions were deliberately chosen to be as sharp as possible (with a draw by 50 move rule barely achievable in one case and nearly achievable in the other). I suspect that, if SF15 was instead tested on a large random sample of similar endings (i.e. those having the same material composition) taken from real games, the resulting error rates would be noticeably lower than those above.

Avatar of MARattigan

Congratulations on those! 

The compilation is a painstaking process - part of the reason I didn't include the figures myself. 

I did produce a table for the KNNvKP example:

Unfortunately I appear to have missed a pair of competition rules blunders in both the 2 second game and the 128 second game. (The blunders are indicated in the comments. The key is: "blunder" - blunder under both basic and competition rules, "BRblunder" - blunder under basic rules only, "CRblunder" - blunder under competition rules only. There are also some numbers indicating current ply count and indications of current status under the two sets of rules mixed in.)
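The key above can be written down as a small Python sketch. It assumes a "blunder" means a move that lowers the mover's theoretical result (1 win, 0 draw, -1 loss) under the rule set in question; that definition and the function name are assumptions for illustration, not something taken from the game comments.

```python
def classify_move(before_br: int, after_br: int,
                  before_cr: int, after_cr: int) -> str:
    """Label a move using the key: blunder / BRblunder / CRblunder.

    Each argument is the mover's theoretical result (1, 0 or -1) before or
    after the move, under basic rules (br) or competition rules (cr).
    """
    br_blunder = after_br < before_br
    cr_blunder = after_cr < before_cr
    if br_blunder and cr_blunder:
        return "blunder"
    if br_blunder:
        return "BRblunder"
    if cr_blunder:
        return "CRblunder"
    return ""

# A basic rules win that can no longer beat the 50 move rule after the move:
print(classify_move(before_br=1, after_br=1, before_cr=1, after_cr=0))
```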

If you still have the move numbers, I would appreciate it if you could tell me the omissions to save me another trawl. For competition rules blunders I currently have:

2 sec. game; 5W, 23B, 25W, 31B, 36W, 40B, 72W, 73B, 74W

128 sec game: 6W, 10B, 31W

That also invalidates the corresponding graph I posted; I'll re-work it.  

I said the time-consuming nature of tracking the blunders was part of the reason for not attempting the remaining tables. The other part was that I'm not competent enough in the other endgames to be confident of arriving at the right results. (This is questionable even in KNNvKP.)

The reason for this can be illustrated by the game between SF15 and Rybka/Nalimov I posted earlier (reproduced).

Consider the game under the rules I believed (possibly incorrectly) were in force when I first began looking at this endgame. I understood that at that time a 75 move rule would be in force.

With White's 28th move he can no longer win under the 75 move rule. Before that move he could, if there were no triple repetition rule in force. But could he with the triple repetition rule in force, or is he already drawn on move 27? If I were to think about that question I could give an answer. The point is that no practicable tablebase could.

With the KRPPvKRP or KRBNvKQB I couldn't answer that type of question (under a 50 move rule) and I can't get the answers from a tablebase.

Given that Syzygy only necessarily gives correct moves in positions that have no repeated "positions" considered the same for the purposes of the triple repetition rule, how did you get round that problem in arriving at your blunder rates?

As far as your comments go, yes and no.

I picked endgame classifications and positions to give what I thought would give SF problems; they're not typical positions with the number of men. On the other hand I limited the positions to ply count 0.

SF finds positions more difficult as the ply counts increase for positions with the same diagram. Where it can reliably win from a position with ply count 0, it will often fail from the same diagram with a higher ply count, even though enough moves (and some to spare) remain under the 50 move rule to accommodate the mate it would play from the corresponding ply count 0 position.
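The "room to spare" point is just arithmetic on the ply count. Here is a minimal sketch, assuming a line still fits if the mate or zeroing move arrives no later than the 100th ply since the last capture or pawn move; the function names are hypothetical.

```python
FIFTY_MOVE_PLIES = 100  # 50 moves by each side

def spare_plies(ply_count: int, plies_needed: int) -> int:
    """Plies left over if the line of length plies_needed is played from here.

    ply_count:    plies already played since the last capture or pawn move
    plies_needed: plies to mate (or to the next zeroing move) in the line
                  the engine would play from the same diagram at ply count 0
    """
    return FIFTY_MOVE_PLIES - ply_count - plies_needed

def fits_within_fifty_move_rule(ply_count: int, plies_needed: int) -> bool:
    return spare_plies(ply_count, plies_needed) >= 0

# A 61 ply mate still fits at ply count 30, with 9 plies to spare,
# but the same line no longer fits at ply count 40.
print(spare_plies(30, 61), fits_within_fifty_move_rule(40, 61))
```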

The ply count 0 positions also necessarily have no previous repetitions and even the tablebases can't handle positions that do.

I promised earlier to expand on what I meant by "bad play" and "difficult positions". This has a bearing on your comments, but I'll still defer it because I need to do a little more work to give reasonable comments.   

Avatar of cobra91
MARattigan wrote:

Congratulations on those! 

The compilation is a painstaking process - part of the reason I didn't include the figures myself. 

I did produce a table for the KNNvKP example:

Unfortunately I appear to have missed a pair of competition rules blunders in both the 2 second game and the 128 second game. (The blunders are indicated in the comments. The key is: "blunder" - blunder under both basic and competition rules, "BRblunder" - blunder under basic rules only, "CRblunder" - blunder under competition rules only. There are also some numbers indicating current ply count and indications of current status under the two sets of rules mixed in.)

Note that I used only one rule set when compiling the error rate table -- the 50-move rule was considered, while the 3-fold rule (see below) was not. It is a common convention, in more theoretical settings, to disregard both rules. For instance, ICCF regulations allow wins to be claimed in tablebase positions that would otherwise be drawn under the 50-move rule. However, engines are generally aware of the 50-move rule, and it was very clear in my analysis of the games that ply count considerations had a major impact on the course of play; "cursed wins" (as defined by Syzygy) were often thrown away by apparent blunders as soon as SF15 recognized it couldn't win quickly enough to prevent a future 50-move draw claim. SF15 happily throws away "BR draws" in favor of "CR draws" (as you defined above), too.
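As a sketch of how the two rule sets differ, the five Syzygy WDL values (-2 loss, -1 blessed loss, 0 draw, 1 cursed win, 2 win) can be collapsed into a plain result either way. The function names below are made up, and the mapping assumes the WDL value is probed at ply count 0.

```python
def result_with_50_move_rule(wdl: int) -> int:
    """Result for the side to move when the 50-move rule applies."""
    if wdl == 2:
        return 1      # win
    if wdl == -2:
        return -1     # loss
    return 0          # draw, cursed win and blessed loss all count as draws

def result_without_50_move_rule(wdl: int) -> int:
    """Result for the side to move when the 50-move rule is disregarded."""
    return (wdl > 0) - (wdl < 0)  # sign of the WDL value

# A cursed win is a win only when the 50-move rule is ignored.
print(result_with_50_move_rule(1), result_without_50_move_rule(1))
```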



 2-second game: 75B, 78W

128-second game: The table I posted lists 3 total errors for this game, matching your result. I'm not sure why you think you missed any.

The 50-move rule is what nearly every engine (that I know of, at least) is programmed to recognize and adjudicate. The ICCF also recognizes the 50-move rule, albeit for unsolved positions only. Fifty moves are the relevant FIDE threshold, too, although a 75-move rule was added some years ago to close a loophole that still existed regarding unclaimed draws (if neither player claims after 50 moves, the 75-move rule can automatically terminate the game, even without a claim). Therefore, it was the 50-move rule I used to calculate the error rates.
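The difference between the two thresholds can be sketched in a few lines of Python. This is a simplification under stated assumptions: it works only from the halfmove clock and ignores the exception for a final move that delivers checkmate; the function name is hypothetical.

```python
def move_rule_status(halfmove_clock: int) -> str:
    """Status from plies played since the last capture or pawn move."""
    if halfmove_clock >= 150:   # 75 moves by each side: no claim needed
        return "automatic draw"
    if halfmove_clock >= 100:   # 50 moves by each side: a player may claim
        return "draw claimable"
    return "play continues"

for clock in (99, 100, 150):
    print(clock, move_rule_status(clock))
```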

With the 3-fold rule taken into account, it's already a draw before White's 28th move.

By ignoring 3-fold repetitions altogether, basically. That may not sound like a great option, but there are a number of good reasons for this choice. As you pointed out, it's very computationally demanding to track every repeated position and continuously prove there exists (or does not exist) a winning line circumventing both the 50-move rule and the entire set of previously repeated positions.
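Detecting a repetition in a single game is the easy part, as the minimal sketch below shows; the expensive part is proving that a win exists which avoids every previously repeated position. The sketch assumes each position is given as a key that already encodes piece placement, side to move, castling rights and en passant availability (for example the first four FEN fields), and the function name is made up.

```python
from collections import Counter

def first_threefold(position_keys):
    """Return the index at which some position occurs for the third time.

    Returns None if no position in the sequence occurs three times.
    """
    seen = Counter()
    for index, key in enumerate(position_keys):
        seen[key] += 1
        if seen[key] == 3:
            return index
    return None

# Position "a" recurs for the third time at index 4.
print(first_threefold(["a", "b", "a", "b", "a"]))
```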

Another issue is that such analysis is of little theoretical interest, due to game theory considerations. It is proven that in any winning position, a win can always be forced without ever repeating either the current or any subsequent position. This means any position that does repeat in an optimal winning line must have appeared earlier and would necessarily have been winning the first time it was reached. Therefore, the 3-fold rule does not change the theoretical value of the initial position, any alternative starting positions that may be considered, or any position occurring in a perfectly played game.

Finally, the 3-fold rule is unlikely to significantly alter the error rates. For triple repetitions to come into play at all, several things need to happen, and in practice it's quite rare for such a game to be played by a 3500-elo machine. First, a winning position must be repeated, and it must still be winning under the 50-move rule (despite the higher ply count) when it is reached the second time. Then, either (A) a further inaccuracy by the stronger side or (B) a blunder by the stronger side followed by a blunder by the weaker side must happen, leading to a position farther from victory than the repeated one which is still winning under the 50-move rule despite an even higher ply count. Additionally, the previously repeated position(s) must then be crucial to all winning variations, which is extremely unlikely unless the weaker side is already very close to escaping into "blessed loss" territory (in which case there is a high probability the stronger side will err and overstep the move limit anyway, and the only thing that changes is the specific move that gets labelled as the final blunder).

That's only because SF wins the vast majority of winnable positions easily and struggles only when the exact number of moves taken to force a mate or zeroing move becomes very important to the final game outcome. Higher ply counts can turn an otherwise straightforward win into a precise computational task requiring 30-move-deep calculations.

In the positions you chose, though, such positions (ply count > 0) occur naturally and are often as sharp as possible due to the specific calibration of the starting configurations.

When I said I was unsure what was meant by "bad" and "difficult", I was merely asking to see some games to provide context for what was subjectively meant -- how frequently SF misplays these endings and in what positions does it fail to play correctly. A formal definition of these terms is probably beyond the scope of this thread. :)

Avatar of DLPB
TheWombat5 wrote:

Considering the incredible rate at which computational power and storage have increased, at all levels, over the last 50 years, the idea that an 8 man tablebase would somehow be a difficult task to either complete or store soon is a bit odd. By 2030 this will be seen as a rather easy task. I actually dread quantum computing getting involved in chess: an engine using a rather powerful quantum computer will very likely, this century, be able to tell you during the middlegame that you have lost, won or drawn with correct play.

Quantum computing doesn't work like that. You should really remain quiet when you don't understand basic computer science.