Data analysis: Difference between Male/Female ratings

Avatar of Sqod

Very nice work, Flash. At least someone in this forum is capable of some serious, unbiased, sound work on this popular topic.

Avatar of wilford-n
0110001101101000 wrote:

It should not be possible to measure this anyway... if I'm not aware of calculating or planning any differently compared to previous games, and when I analyze the games I don't see a difference, then how will someone who studies my games see a difference?

"Your knight takes pawn was risky"

No, it was a miscalculation, I simply missed the reply.

"But you were trying to win because they were female"

No, knight takes pawn intended to force a draw, as I said, I miscalculated.

You've just struck upon the reason the majority of studies are blinded... to prevent either unconscious or intentional bias from coloring the results. In this study, risk assessment was made based on a standardized analysis of openings by 8 chess players (5 male, 3 female; rated from 2000-2600). Each made their assessments individually and none were aware of the assessments made by each other. For a move to be considered as "risky" or "safe," there had to be agreement between 6 of the 8 experts.

The study also controlled for factors such as age, rating, and frequency of play, in order to eliminate the likelihood of false correlation. In other words, by design, gender was the only independent variable.

Likewise, to prevent bias in the selection of games to be analyzed, the researchers used the ChessBase10 database. They eliminated a set of games (again, based on an objective set of rules designed to remove selection bias and incomplete/unreliable data) and were left with 1.4 million games by about 15,000 players.

In other words, neither the players knew how (or indeed, even whether) their games would be assessed, nor did the experts know during assessment whose games they were assessing. This is what is meant by the phrase double-blind.

This is all standard stuff in the world of scientific publication. It isn't a world of conjecture, despite what the poor quality of mainstream science reporting might lead you to believe. The methodology of a study is published along with data and results as an absolute requirement, and this paper is no different. If you're interested, you can read the full paper here. (EDIT: Replaced original link with one that contains the tables and appendices.)

Avatar of Elubas
wilford-n wrote:

Elubas: I'm not talking about foolhardy risk here, but things like moving into complications when the outcome is less than clear. Think Kasparov or Petrosian or Mickey Adams.

But even there it seems common for chess players to go for a safe continuation, even if it means giving up a certain amount of winning chances. Or trying to draw against a certain opening when it could be refuted by sharper play. Granted, Kasparov might not be that type of player, but maybe people like Giri, Anand, Kramnik, at least when playing similarly rated players, seem to do this stuff occasionally. But then it depends on what's classified as "foolhardy" risk. "Risks" like, continuing to play for a win in a much better position, are sometimes rejected by the weaker player, even if it's a "safe" position, because they're afraid if they do anything they'll make a mistake and lose. This seems to happen at all levels of chess -- recently I have in mind that game in the last Millionaire Open where Caruana was probably losing against a young IM or something, with no counterplay, but nonetheless, the IM just took a repetition. Stuff like this doesn't seem all that uncommon in chess.

Avatar of DjonniDerevnja
Elubas wrote:
wilford-n wrote:

Elubas: I'm not talking about foolhardy risk here, but things like moving into complications when the outcome is less than clear. Think Kasparov or Petrosian or Mickey Adams.

But even there it seems common for chess players to go for a safe continuation, even if it means giving up a certain amount of winning chances. Or trying to draw against a certain opening when it could be refuted by sharper play. Granted, Kasparov might not be that type of player, but maybe people like Giri, Anand, Kramnik, at least when playing similarly rated players, seem to do this stuff occasionally. But then it depends on what's classified as "foolhardy" risk. "Risks" like, continuing to play for a win in a much better position, are sometimes rejected by the weaker player, even if it's a "safe" position, because they're afraid if they do anything they'll make a mistake and lose. This seems to happen at all levels of chess -- recently I have in mind that game in the last Millionaire Open where Caruana was probably losing against a young IM or something, with no counterplay, but nonetheless, the IM just took a repetition. Stuff like this doesn't seem all that uncommon in chess.

Elubas, I really like what you write here.

Avatar of wilford-n

@Elubas and DjonniDerevnja: I've already cited and linked a scientific paper that proves my statement about greater risk tolerance among men than among women in chess. That paper shows this trend exists across all levels of play. It happens; analysis of 1.4 million games proves it.

If you want to deny that fact, kindly point out where that paper goes wrong, or else provide similar scientific evidence supporting an opposing view. I'm not going to argue against speculation and anecdote.

Avatar of Elubas

It's hard to "prove" things, though, even in studies. It seems like a psychological study, and psychological studies, I don't know, it's just not the kind of study where I'd automatically say it "proves" something as much as it points in an interesting direction. I'd wonder a bit about how they classified risk; it doesn't seem like an easy thing to measure, and as 00011100 pointed out, it can easily be that a loss of a piece could either be called "risky" or just plain bad. You did say that experts had to agree on each decision, but still did not point out the actual criteria used.

But, yes, it's good that you cited a source. There certainly seems to be at least some kind of difference regarding how men and women play. But it's always hard to know everything you want to know when it comes to psychological studies. For example, it might be that women play in a more risk averse way merely because of the psychology surrounding playing a male opponent. That is, maybe the woman is more cautious, not because she is a more cautious player in general, but because she has a subconscious fear of a male opponent. Which would be saying something a little different.

So proof, well, I wouldn't use that sort of word, although I don't use that word very liberally. I feel like you often find one study getting certain results, and then suddenly you find another one getting totally different results; there just seems to be a certain fallibility when there are too many variables that you don't have control over. But does it point to something interesting, yeah.

Avatar of Elubas

Oh, well to be fair, I was only going by the summary. I see you later posted the full paper, which may well have a robust method of risk classification (I don't know because personally I don't want to take the time to dutifully read through it all at this time, for better or worse).

Avatar of watcha

I think female performance in chess is best understood in terms of the phenomenon of emergence.

The nice thing about emergence is that it is a collective property of agents that act individually and selfishly following their own interests and rules. They can be as individual and have as much free will as they wish yet together they generate something that is a stable common property, one that may have been never the goal of any individual actor.

If you look at gas molecules they fly in space, some move slowly, some are fast, some bump into each other, some fly in free space, some go in this direction some in that direction. Looking at the individual molecule it seems ridiculous to think that it has a temperature.

Yet when you try to reach into a burning gas ring, you all of a sudden realize that indeed it has a temperature.

Avatar of jambyvedar

http://www.fitbrains.com/blog/women-men-brains/

Avatar of watcha

Statistics deals with measuring emergent properties.

Avatar of u0110001101101000
wilford-n wrote:
0110001101101000 wrote:

It should not be possible to measure this anyway... if I'm not aware of calculating or planning any differently compared to previous games, and when I analyze the games I don't see a difference, then how will someone who studies my games see a difference?

"Your knight takes pawn was risky"

No, it was a miscalculation, I simply missed the reply.

"But you were trying to win because they were female"

No, knight takes pawn intended to force a draw, as I said, I miscalculated.

You've just struck upon the reason the majority of studies are blinded... to prevent either unconscious or intentional bias from coloring the results. In this study, risk assessment was made based on a standardized analysis of openings by 8 chess players (5 male, 3 female; rated from 2000-2600). Each made their assessments individually and none were aware of the assessments made by each other. For a move to be considered as "risky" or "safe," there had to be agreement between 6 of the 8 experts.

The study also controlled for factors such as age, rating, and frequency of play, in order to eliminate the likelihood of false correlation. In other words, by design, gender was the only independent variable.

Likewise, to prevent bias in the selection of games to be analyzed, the researchers used the ChessBase10 database. They eliminated a set of games (again, based on an objective set of rules designed to remove selection bias and incomplete/unreliable data) and were left with 1.4 million games by about 15,000 players.

In other words, neither the players knew how (or indeed, even whether) their games would be assessed, nor did the experts know during assessment whose games they were assessing. This is what is meant by the phrase double-blind.

This is all standard stuff in the world of scientific publication. It isn't a world of conjecture, despite what the poor quality of mainstream science reporting might lead you to believe. The methodology of a study is published along with data and results as an absolute requirement, and this paper is no different. If you're interested, you can read the full paper here. (Unfortunately, I couldn't find a freely available version that included the tables. If I find one, I'll edit the post and include it here.)

Oh, I didn't see it was actually done well in the first link... maybe I didn't look hard enough.

Thanks for the detailed explanation.

Avatar of wilford-n
Elubas wrote:

I feel like you often find one study getting certain results, and then suddenly you find another one getting totally different results; there just seems to be a certain fallibility when there are too many variables that you don't have control over. But does it point to something interesting, yeah.

A quick review of a few points, but the quote above is telling and will be the main focus of this response.

This wasn't a psychological study; it's a statistical analysis. Psychology doesn't enter into it, and indeed, whenever I read psychology journals, I find an awful lot of speculative and unfalsifiable BS. Psychology is to neuroscience as witch doctoring is to medicine, in my opinion.

A 6 out of 8 consensus among chess experts looking primarily at openings seems to be a good way to assess risk. Admittedly, it's left to their opinions, and each uses their own criteria... which is why the researchers didn't just use a simple majority. The fact that the assessors made their decisions independently also increases reliability. In any case, the system used ensures that evaluation errors will be rare and symmetrical, so will average out when a large number of games are analyzed.
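As a rough illustration of why a 6-of-8 threshold makes misclassification rare, here is a minimal sketch. It assumes each expert independently mislabels a given opening with some probability `p`; both the independence and the value `p = 0.2` are assumptions made for illustration, not figures from the paper. It only addresses how rare a wrong consensus would be, not whether errors are symmetrical.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability that at least k of n independent experts make the same
    mistake, when each one errs with probability p (binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical per-expert error rate; not taken from the study.
p = 0.2

# Chance that a wrong label reaches the 6-of-8 consensus the study required.
six_of_eight = prob_at_least(6, 8, p)

# Same chance if a simple majority (5 of 8) had been enough.
five_of_eight = prob_at_least(5, 8, p)

print(f"6 of 8 wrong: {six_of_eight:.5f}")
print(f"5 of 8 wrong: {five_of_eight:.5f}")
```

Under these assumptions, raising the bar from five to six agreeing experts cuts the chance of a wrong consensus by roughly a factor of eight.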

And now on to the main point.

You may "feel like" you get disparate results in scientific studies, but this is due to the aforementioned poor science reporting in popular media. Science is often regarded as "dry," and in an attempt to generate interest, media takes its normal approach: add some sensationalism. New findings are often described as "overturning established science," when in fact they are usually the opposite. When the subject of study is any well-trodden field, they usually confirm predictions of established theory. Sometimes that confirmation carries an unexpected wrinkle, and that's when mainstream sources start trumpeting the result as somehow revolutionary. Discoveries of new hominid fossils are often presented this way in mainstream reports. Popular science figures who talk a lot but rarely publish (such as Michio Kaku) knowingly capitalize on this trend, lending fringe ideas some degree of credibility (but only among the credulous).

Another problem is that mainstream science reporting has little regard for the peer review process. They'll present reports from dubious blogs with the same level of certainty as studies from highly cited peer-reviewed journals. This is especially true in health matters, such as disease treatment or nutrition. The anti-vaccination and anti-GMO movements are of this nature.

A third problem is that mainstream science reporting often expands conclusions unjustifiably. Again, health is where this is worst. A treatment that might partially interfere with a single viral process (with substantial unwanted side effects) is often presented as if a cure is just around the corner.

Fourth and finally, news outlets self-censor social conclusions that they worry might offend some part of their audience. In an era of political correctness, this means that almost any study that shows legitimate differences between almost any demographic groups will not get coverage, and if it does reach a threshold that demands coverage, it will be countered by interest group talking heads who prefer to shout down facts with opinions.

Considering these deficiencies, it's no wonder that so many average people seem to distrust science. They have no experience with it aside from highly distorted hyped coverage designed to attract viewers and advertising dollars.

Avatar of DjonniDerevnja
wilford-n wrote:

Elubas: I'm not talking about foolhardy risk here, but things like moving into complications when the outcome is less than clear. Think Kasparov or Petrosian or Mickey Adams. Female chess players tend to be more likely to seek simplification, when the result of a particular decision is a bit more concrete.

That said, males are more likely to make speculative sacrifices that are less than sound, too. Tal or Grischuk would be of this latter category. (Well, Grischuk tends to sacrifice his clock more than material, but it's in the same category of risk.)

Sacrificing is playful and childish. It's action and fun. Men are usually more childish than women, and therefore we have a slight advantage. The best chess players have played around with all styles of chess, sound and solid, sharp and risky.

Avatar of Elubas

I guess I just have a natural scepticism towards studies that rely on evaluations rather than a truly objective observation. Granted, sometimes evaluation is your only choice with certain issues, so I get that, but nevertheless, I sometimes wonder about how well we can put a vague concept like "aggressiveness/solidity" of a move into objective terms. Perhaps it's analogous, in some way, to the idea that we can measure something like intelligence with this "objective" number, IQ, because we ask a bunch of people what they think means someone is intelligent, do some stats etc, and then make up some test questions.

So I guess that was sort of where I was coming from. But they seemed to have done a pretty good job here. I mean they are chess players, analyzing these positions. And it was done gender-blind. Still seems weird to turn "aggressiveness" into some kind of objective measure but in any case, they have found some kind of gender difference one way or another. It's interesting for sure.

Avatar of AngeloPardi

There are a few objective parameters of aggressiveness and risk in chess. I think nobody will disagree with the fact that the Najdorf, East Indian, King's Gambit and Larsen openings are more risky than the Caro-Kann or the English.

Avatar of SmyslovFan
wilford-n wrote:

@Elubas and DjonniDerevnja: I've already cited and linked a scientific paper that proves my statement about greater risk tolerance among men than among women in chess. That paper shows this trend exists across all levels of play. It happens; analysis of 1.4 million games proves it.

If you want to deny that fact, kindly point out where that paper goes wrong, or else provide similar scientific evidence supporting an opposing view. I'm not going to argue against speculation and anecdote.

You misrepresent the findings of the paper, and the source you gave was just the conclusions, not the full paper. 

The authors of the paper conclude that when women play against other women, they tend to be aggressive. Others have found that women's tournaments tend to be more aggressive than men's tournaments. This is consistent with the findings you posted. 

Avatar of SmyslovFan

Ah, I see you've posted a link to the full paper in the last 5 hours. I'll peruse that before making further comments.

Avatar of SmyslovFan

I just perused the academic article. There are some serious problems with how "aggressive" and "solid" were determined. The authors asked eight experts (rated 2000-2600) to assign an A, S or 0 to various ECO codes. When 5 out of 8 agreed, the opening was assigned a designation.

1.e4 is S, 1...Nf6 is A and so on. Ask eight different experts, and I'm sure that many openings would be considered differently. Also, time changes assessments. The Berlin variation of the Spanish, for example, may not have been considered quite as solid in 1999 as it is today.

But even with that reservation, it's interesting that people change their repertoire based on whether their opponent is male or female. My first thought is to see if there's another explanation for people changing their repertoires. I couldn't tell based on the information provided whether players in the same rating range (say, 2200), have different repertoires. 

In short, the paper raised many questions. 

Even so, the findings of others that when women play against other women, the players tend to be more aggressive still appears to be true. The measure I read about was: the incidence of 1.e4 (viewed as more aggressive than 1.d4), the number of decisive results, and the number of decisive games under 30 moves. Unfortunately, I don't have a link to that study at the moment. 

Still, those measures appear to be more reliable than 5 out of 8 experts agreeing on whether an opening move is considered more solid or aggressive.

Avatar of wilford-n
SmyslovFan wrote:
wilford-n wrote:

@Elubas and DjonniDerevnja: I've already cited and linked a scientific paper that proves my statement about greater risk tolerance among men than among women in chess. That paper shows this trend exists across all levels of play. It happens; analysis of 1.4 million games proves it.

If you want to deny that fact, kindly point out where that paper goes wrong, or else provide similar scientific evidence supporting an opposing view. I'm not going to argue against speculation and anecdote.

You misrepresent the findings of the paper, and the source you gave was just the conclusions, not the full paper. 

The authors of the paper conclude that when women play against other women, they tend to be aggressive. Others have found that women's tournaments tend to be more aggressive than men's tournaments. This is consistent with the findings you posted. 

Maybe you should follow the whole discussion (and actually read the study) before making accusations of "misrepresentation." First of all, in my second post on the study, I did link the full paper, so that's your first misrepresentation. [Edit: I note that in a later comment you corrected this deficiency.] Secondly, your accusation reflects the fact that you only skimmed, and picked up what the authors themselves called a "secondary finding": the fact that, while on the whole, men tend to play lines assessed by experts as "riskier" than lines played by women, members of both genders tend to play more aggressive lines when their opponent is female.

This in no way detracts from the primary finding, that "there is strong statistical support for the claim that female players prefer opening strategies that are more risk-averse than their male counterparts." This is the first result discussed in the Estimation Results section of the paper (p. 14). And again, the lead-off result in the Discussion of Results section (p. 19) reads: "We have found that women choose more cautious strategies than men, which we interpret as women being more risk averse on average."

You have ignored this primary result and misrepresented one novel finding of the paper as though it was the entire conclusion. So there's your second misrepresentation. I, on the other hand, misrepresented nothing, despite your claim otherwise.

Avatar of wilford-n
SmyslovFan wrote:

I just perused the academic article. There are some serious problems with how "aggressive" and "solid" were determined. The authors asked eight experts (rated 2000-2600) to assign an A, S or 0 to various ECO codes. When 5 out of 8 agreed, the opening was assigned a designation.

And now you've made another misrepresentation, or else your reading comprehension is low. Six out of eight were required to designate a move as aggressive or solid. Page 10:

"In more detail, they were instructed to define each opening as either aggressive or solid. We then compare the opinions of the experts and declare an Eco code to be solid [aggressive] if at least six out of the eight experts define it as solid [aggressive]. In cases when there are five or fewer votes for solid or aggressive, the opening is considered to be unclear."

Keep going, because I'd love to point out more of your misrepresentations. Especially since your similar accusation leveled at me turns out to be false.
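For anyone who wants the rule quoted from page 10 spelled out, here is a minimal sketch of it. The vote letters "A", "S" and "0" follow the description earlier in the thread; the function name and the sample ballots are made up for illustration and are not data from the paper.

```python
from collections import Counter

def classify_opening(votes, threshold=6):
    """Apply the consensus rule quoted above to one opening.

    votes: the eight expert labels, each "A" (aggressive), "S" (solid)
           or "0" (no clear view).
    Returns "solid" or "aggressive" when at least `threshold` experts
    agree, otherwise "unclear".
    """
    counts = Counter(votes)
    if counts["S"] >= threshold:
        return "solid"
    if counts["A"] >= threshold:
        return "aggressive"
    return "unclear"

# Hypothetical ballots, not real expert assessments.
print(classify_opening(["S", "S", "S", "S", "S", "S", "A", "0"]))  # six S votes
print(classify_opening(["A", "A", "A", "A", "A", "S", "S", "0"]))  # only five A votes
```

With the default threshold of six, a 5-3 split is left as "unclear" rather than being assigned to the majority side.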