How many games in a database to believe the stats?

Casual_Joe

I use a database to guide me through the openings in online chess.  Sometimes it shows what appears to be pretty lopsided statistics from positions that look very balanced.  I wonder if it's just because the number of games of that position in the database is low.  How many games would it take for you to believe the W/D/L statistics for a given position?

chasm1995

Maybe a thousand games for a nice, large pool.  I know on chess.com, though, they say how many games were played with a particular opening.
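To put a rough number on "a nice, large pool", here is a minimal sketch that estimates White's average score and an approximate 95% margin of error from the W/D/L counts. It uses the normal approximation and treats every game as an independent sample of equal weight, which real databases don't really satisfy. The function name and the two W/D/L splits are made up for illustration; they are not from any actual database.

```python
from math import sqrt


def score_with_margin(wins, draws, losses, z=1.96):
    """Return (mean score, approximate 95% margin) from White's point of view.

    A win counts 1, a draw 0.5, a loss 0. The margin is z standard errors
    of the mean per-game score (normal approximation, an assumption).
    """
    n = wins + draws + losses
    if n == 0:
        raise ValueError("need at least one game")
    mean = (wins + 0.5 * draws) / n
    # Variance of the per-game score around the mean.
    variance = (wins * (1.0 - mean) ** 2
                + draws * (0.5 - mean) ** 2
                + losses * (0.0 - mean) ** 2) / n
    return mean, z * sqrt(variance / n)


# Hypothetical splits with the same proportions at two sample sizes.
for w, d, l in [(19, 17, 14), (380, 340, 280)]:
    mean, margin = score_with_margin(w, d, l)
    print(f"{w + d + l:5d} games: {mean:.1%} +/- {margin:.1%}")
```

With these made-up numbers the margin comes out at roughly 11 percentage points for 50 games and roughly 2.5 points for 1,000, so a lopsided score from a few dozen games can easily be noise. This only measures sampling error; it says nothing about who played the games or when.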

NimzoRoy

A bazillion at least, if you're going to ignore every other relevant factor, such as the strongest players who have played the line you're looking at, the last time the line was played, how many games were played by amateurs, how many game scores are corrupted, how many were "GM draws", the make/model/brand/year/quality/size of the DB itself, etc.

BTW, if the line has been refuted (or even badly hammered) recently, then all the statistics in the universe are irrelevant if you aren't aware of the refutation or the strongest current line. Worse yet, they are probably misleading if the line had decent results according to the DB being used.

I rely on a 5.5 million game DB myself to get thru many openings and never cease to be amazed at how many times I've let myself be steered into an obscure line that is only good for my opponent. I think an appropriate word for this phenomenon is stupidity - or laziness in some cases. Unlike many other members, I actually don't like studying openings more than once every blue moon.

End of transmission...

rigamagician

It takes just one game to bust a line that has been popular for years, so in general, you probably shouldn't trust the stats.  You can look at the games though, and try to figure out where the losing player went wrong.

Casual_Joe

Keep in mind my alternative to using the DB is to rely solely on my own brain (always dangerous).  I agree with your point that games in a DB should be viewed with some suspicion, but I still think a DB is useful for giving good guidance in the opening (say the first 10 moves or so).

MrDamonSmith

A gazillion bagillion or so. Somewhere around there.

SmyslovFan

Stats are more useful for determining how playable a system is than a variation. If a variation depends on a single critical line, then Riga's point is completely valid. But if you are using statistics to determine whether a system is playable, then you can be reasonably certain that there is some truth to the stats.

But don't just use winning percentages. Look at who plays them and when. You may find that Magnus Carlsen has won games playing 1.a3, but then look to see whether he played 1.a3 in any serious event, or just blitz and blindfold games.

Don't rely on statistics, use them. You can use ChessBase to see where the pieces are most often placed in a certain opening, and from there develop plans. Then check those plans against the games in the database to see what move-order issues arise.

You don't even have to limit yourself to opening positions. You can check out endgames, middle games with isolated queen pawns, and all sorts of common positions and pawn structures!

But again, it is only one tool, not the ultimate tool.

MrDamonSmith

Hey, that's funny. I hadn't even read post 3 when I posted.

EricFleet
Casual_Joe wrote:

I use a database to guide me through the openings in online chess.  Sometimes it shows what appears to be pretty lopsided statistics from positions that look very balanced.  I wonder if it's just because the number of games of that position in the database is low.  How many games would it take for you to believe the W/D/L statistics for a given position?

You cannot rely solely on any database without risk. Imagine a scenario where a line has been played for years and years with White having a healthy advantage in score. Then someone realizes that Black has a nice continuation. Leading players learn about the continuation and stop playing the line as White. You might have 1,000 games played where White wins 500 times, draws 200 and loses 300. Suddenly in 50 games, Black has a nice advantage but not enough to tip the scales.

So, you blindly follow this line and your opponent is booked up. See the problem?
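To make the arithmetic concrete, here is a small sketch of that scenario. The 500/200/300 split is the one given above; the breakdown of the 50 recent games (10 wins, 15 draws, 25 losses for White) is an assumed example, since the scenario only says Black does well in them.

```python
def white_score(wins, draws, losses):
    """White's score as a fraction: win = 1, draw = 0.5, loss = 0."""
    total = wins + draws + losses
    return (wins + 0.5 * draws) / total


old_games = (500, 200, 300)     # split from the scenario above
recent_games = (10, 15, 25)     # assumed split for the 50 recent games

# What the database shows: old and recent games lumped together.
combined = tuple(a + b for a, b in zip(old_games, recent_games))

print(f"old games:    {white_score(*old_games):.1%}")
print(f"recent games: {white_score(*recent_games):.1%}")
print(f"combined:     {white_score(*combined):.1%}")
```

Under these assumptions White scores 60% in the old games and 35% in the recent ones, yet the combined figure is still close to 59%. If the database lets you filter by year, comparing the recent slice against the total is one cheap way to spot this.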

chasm1995

I blindly follow the DB, not knowing that there is a refutation that is much better than what I'm playing.

NimzoRoy
Casual_Joe wrote:

Keep in mind my alternative to using the DB is to rely solely on my own brain (always dangerous).  I agree with your point that games in a DB should be viewed with some suspicion, but I still think a DB is useful for giving good guidance in the opening (say the first 10 moves or so).

I empathize with you totally (well maybe mostly) here. BUT the 10 move limit strikes me as irrelevant - some openings are rock solid "book" lines for many more moves (the CKD Main Line, for example) and others disappear from "book" theory in under 10 moves.

Personally, I like thumbing thru MCO-15 to see which lines get the most/least columns and which ones appear (or don't appear) only in footnotes. Plus I usually prefer GM de Firmian's evaluations, remarks and/or opinions to those of a real patzer - like myself for instance.

NimzoRoy
MrDamonSmith wrote:

Hey, that's funny. I hadn't even read post 3 when I posted.

I'm suing you for plagiarism! You'll be hearing from my atty - just as soon as he's released on bail...

Casual_Joe
NimzoRoy wrote:
Casual_Joe wrote:

Keep in mind my alternative to using the DB is to rely solely on my own brain (always dangerous).  I agree with your point that games in a DB should be viewed with some suspicion, but I still think a DB is useful for giving good guidance in the opening (say the first 10 moves or so).

I empathize with you totally (well maybe mostly) here. BUT the 10 move limit strikes me as irrelevant - some openings are rock solid "book" lines for many more moves (the CKD Main Line, for example) and others disappear from "book" theory in under 10 moves.

Personally, I like thumbing thru MCO-15 to see which lines get the most/least columns and which ones appear (or don't appear) only in footnotes. Plus I usually prefer GM de Firmian's evaluations, remarks and/or opinions to those of a real patzer - like myself for instance.

I use the 10 move limit because I'd have to subscribe to the DB to get more than 10 moves.  (I use chesstempo.com)

Casual_Joe
EricFleet wrote:
Casual_Joe wrote:

I use a database to guide me through the openings in online chess.  Sometimes it shows what appears to be pretty lopsided statistics from positions that look very balanced.  I wonder if it's just because the number of games of that position in the database is low.  How many games would it take for you to believe the W/D/L statistics for a given position?

You cannot rely solely on any database without risk. Imagine a scenario where a line has been played for years and years with White having a healthy advantage in score. Then someone realizes that Black has a nice continuation. Leading players learn about the continuation and stop playing the line as White. You might have 1,000 games played where White wins 500 times, draws 200 and loses 300. Suddenly in 50 games, Black has a nice advantage but not enough to tip the scales.

So, you blindly follow this line and your opponent is booked up. See the problem?

In your scenario, I would have zero chance of knowing this imagined refutation on my own.  Plus at my low skill level, emerging from the opening with a slightly better or worse position probably isn't enough to decide the game.

I mostly use the DB to avoid opening traps and get ideas for the most common moves from the given position.

EricFleet
Casual_Joe wrote:
EricFleet wrote:
 

So, you blindly follow this line and your opponent is booked up. See the problem?

In your scenario, I would have zero chance of knowing this imagined refutation on my own.  Plus at my low skill level, emerging from the opening with a slightly better or worse position probably isn't enough to decide the game.

I mostly use the DB to avoid opening traps and get ideas for the most common moves from the given position.

With all due respect, you asked a question and then baited and switched. You asked how to know when to trust a percentage breakdown and the answer is you cannot without analyzing deeper. If you feel that you don't have that expertise, then just go ahead and trust the percentage as the truth.

rigamagician

Books about openings are useful for understanding the ideas behind the moves you are playing.  Recent books by GMs will also point out some of the traps and things to watch out for.

NimzoRoy

Casual_Joe wrote:

I use the 10 move limit because I'd have to subscribe to the DB to get more than 10 moves.  (I use chesstempo.com)

I think for $10/yr you get full use of a 3.5 million game DB at www.365chess.com. Check it out. I know it's easy to spend other people's money, but it looks like a good deal to me. I'd go for it if I didn't already have a much bigger, costlier (and possibly better) DB. Well, mine is probably better because you can use it to sort games by all kinds of criteria, make up specialized DBs and generate "Opening Reports" for any opening you want, but it's definitely costlier!

You should also check out Scid vs. PC (freeware DB), which looks pretty good IF you have the time and patience to learn how to use it. A pal of mine here got it and from what he's told me so far it's not that hard to work with.

blueemu
rigamagician wrote:

It takes just one game to bust a line that has been popular for years, so in general, you probably shouldn't trust the stats.

Correct. In the (fairly good) 365chess data-base, the line:

1. e4 e5 2. Nf3 Nf6 3. Nxe5 d6 4. Nf3 Nxe4 5. d4 d5 6. Bd3 Be7 7. c4 Nc6 8. O-O Nb4 9. cxd5 Nxd3 10. Qxd3 Qxd5 11. Re1 Bf5 12. Nc3 Nxc3 13. Qxc3 c6
... is still shown as favoring Black (White wins 35.7%, Draws 21.4%, Black wins 42.9%), even though that position has been known to be dead lost for Black ever since the mid-1970s.

Ubik42
rigamagician wrote:

It takes just one game to bust a line that has been popular for years, so in general, you probably shouldn't trust the stats.  You can look at the games though, and try to figure out where the losing player went wrong.

Exactly, if lots of people have played it, and one game busts it so no one plays it anymore, the stats may look very misleading.

blueemu
Ubik42 wrote:

Exactly, if lots of people have played it, and one game busts it so no one plays it anymore, the stats may look very misleading.

Case in point: the position shown in my post above. The data-base claims that it favors Black. But White has a forced win with 14. Bh6.
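As a side note on sample size, the percentages quoted earlier for this position (White 35.7%, draws 21.4%, Black 42.9%) can be checked against small game counts. The sketch below searches for the smallest number of games whose whole-number W/D/L counts round to those one-decimal figures. The function name and the rounding tolerance are my own assumptions, and the actual number of games in the database may be a multiple of whatever it finds.

```python
def smallest_sample(percentages, max_games=1000, tol=0.05):
    """Find the smallest game count consistent with the given percentages.

    Returns (total_games, counts), or None if nothing fits up to max_games.
    A count fits if its percentage is within `tol` of the reported figure,
    i.e. it would round to the same one-decimal value.
    """
    for n in range(1, max_games + 1):
        # Nearest whole-number count for each reported percentage.
        counts = tuple(round(p * n / 100) for p in percentages)
        fits = all(abs(100 * k / n - p) <= tol + 1e-9
                   for k, p in zip(counts, percentages))
        if fits and sum(counts) == n:
            return n, counts
    return None


# White wins, draws, Black wins, as reported for the position above.
print(smallest_sample((35.7, 21.4, 42.9)))
```

The search lands on 14 games (5 White wins, 3 draws, 6 Black wins), so those percentages could be resting on a very small sample.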