I haven't been trying to ask you a whole lot of questions because I know how time-consuming and painful doing analysis can be lol.
I saw a flaw in your analysis, which is why I asked the question. My purpose was to make sure you were aware of the flaw. I don't know how to say what I'm thinking lol, so it might come out stupid. I don't want you to think I'm talking bad about your analysis. I just want you to be aware of the subtle flaw so you don't get mixed up like I did.
For example:
After the moves 1.e4 c5 2.Nf3
2...d6
The percentages are given as
[<White wins>% <draw>% <Black wins>%]
[37.2% 29.2% 33.7%]
The flaw is that the above percentages are not the actual percentages.
It's a fake percentage. It is an illusion.
I mean, for example, 2.Nf3 leads to over a dozen playable lines by White, and the numbers above lump all of them together. I believe some can do better than others against certain setups.
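To make the point concrete, here is a small sketch of how a combined percentage can hide big differences between the individual lines. The three lines and all their game counts are made up purely for illustration; they are not database figures.

```python
# Hypothetical (made-up) results for three White continuations.
# Each entry: (games, White wins, draws, Black wins). Not real data.
lines = {
    "line A": (1000, 420, 300, 280),
    "line B": (300, 90, 90, 120),
    "line C": (100, 30, 20, 50),
}

def percentages(games, white, draws, black):
    """Convert raw counts to [White% draw% Black%], rounded to 0.1."""
    return tuple(round(100 * x / games, 1) for x in (white, draws, black))

# Per-line percentages.
per_line = {name: percentages(*counts) for name, counts in lines.items()}

# The combined figure simply pools every game, so the most-played
# line dominates it.
totals = [sum(column) for column in zip(*lines.values())]
aggregate = percentages(*totals)

for name, pct in per_line.items():
    print(name, pct)
print("combined", aggregate)
```

With these invented numbers, the combined figure sits close to line A just because line A has the most games, even though Black does much better in lines B and C.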
You might have a valid point there. Remember that to simplify the analysis I assumed that the most popular moves were the best, which is a good approximation but doesn't always hold true, especially with smaller sample sizes. These charts are time-consuming to do by hand, but applying my approach to your question suggests I could do a few more charts for the most popular defenses against the Open Sicilian (1. e4 c5 2. Nf3).
Per 365chess stats, below are all the known responses to 2. Nf3 in the Sicilian. Names in quotes are temporary names I assigned, definitely not official...
The 3-element vectors show: [<White wins>% <draw>% <Black wins>%]
2...d6 {"Ftacnik Defense."}{202868/505289 = 40.1% popularity on 365chess.}
[37.2% 29.2% 33.7%]
2...Nc6 {"Sveshnikov Defense."}{159902/505289 = 31.6% popularity on 365chess.}
[36.8% 30.3% 33%]
2...e6 {"Cramling Defense."}{118876/505289 = 23.5% popularity on 365chess.}
[34.8% 31.2% 34%]
2...g6 {Hungarian Variation.}{10161/505289 = 2.01% popularity on 365chess.}
[35.5% 30.6% 33.9%]
2...a6 {O'Kelly Variation.}{7803/505289 = 1.54% popularity on 365chess.}
[37.2% 25.7% 37.1%]
2...Nf6 {Nimzovich-Rubinstein Variation.}{4620/505289 = 0.914% popularity on 365chess.}
[39.1% 30.8% 30.1%]
2...b6 {Katalymov Variation.}{714/505289 = 0.141% popularity on 365chess.}
[39.4% 25.2% 35.4%]
2...Qc7 {Quinteros Variation.}{119/505289 = 0.0236% popularity on 365chess.}
[44.5% 28.6% 26.9%]
2...d5 {"Fernandez Variation."}{73/505289 = 0.0144% popularity on 365chess.}
[53.4% 21.9% 24.7%]
2...Qa5 {Stiletto Variation.}{65/505289 = 0.0129% popularity on 365chess.}
[47.7% 32.3% 20%]
2...Qb6 {"Rodchenkov Defense."}{30/505289 = 0.00594% popularity on 365chess.}
[60% 16.7% 23.3%]
2...h6 {"Drazic Defense."}{24/505289 = 0.00475% popularity on 365chess.}
[41.7% 33.3% 25%]
2...f5 {Brussels Gambit.}{14/505289 = 0.00277% popularity on 365chess.}
[71.4% 7.2% 21.4%]
2...e5 {"Holden Defense."}{12/505289 = 0.00237% popularity on 365chess.}
[58.3% 8.4% 33.3%]
2...f6 {"Poirier Defense."}{5/505289 = 0.000990% popularity on 365chess.}
[40.0% 0% 60.0%]
2...a5 {"Groz Defense."}{2/505289 = 0.000396% popularity on 365chess.}
[50.0% 0% 50.0%]
2...Na6 {"Fargere Defense."}{1/505289 = 0.000198% popularity on 365chess.}
[0% 0% 100%]
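The popularity figures are just each move's game count divided by the total, times 100 to turn the fraction into a percentage. A short sketch that redoes the arithmetic from the counts listed above (the function and variable names are only for illustration):

```python
# Game counts after 1.e4 c5 2.Nf3, copied from the list above.
TOTAL_GAMES = 505289
games = {
    "2...d6": 202868, "2...Nc6": 159902, "2...e6": 118876,
    "2...g6": 10161, "2...a6": 7803, "2...Nf6": 4620,
    "2...b6": 714, "2...Qc7": 119, "2...d5": 73,
    "2...Qa5": 65, "2...Qb6": 30, "2...h6": 24,
    "2...f5": 14, "2...e5": 12, "2...f6": 5,
    "2...a5": 2, "2...Na6": 1,
}

def popularity_percent(count, total=TOTAL_GAMES):
    """Share of all games as a percentage (the fraction times 100)."""
    return 100.0 * count / total

# Sanity check: the individual counts add up to the quoted total.
assert sum(games.values()) == TOTAL_GAMES

for move, count in games.items():
    # .3g keeps three significant figures, matching the list above.
    print(f"{move:8s} {count:7d} games  {popularity_percent(count):.3g}%")
```

The built-in check confirms that the 17 listed responses account for all 505289 games.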
REFERENCES
http://www.365chess.com/opening.php?m=4&n=4&ms=e4.c5.Nf3&ns=3.3.4 (6-24-15)
https://gameknot.com/chess-opening/sicilian-defence-b27?nd=523 (6-24-15)
http://allchessopenings.blogspot.com/ (6-24-15)
It looks to me like only the top three responses are worthy of analysis because the popularity drops off rapidly after those, down to a tenth of the previous popularity percentage. Three charts won't take *too* long to make, so if that sounds like it would significantly help identify any good anti-Sicilians, I'll do that. Please let me know if you can think of other analytical approaches that would be convincing. By the way, yesterday I figured out how to fit a line to such data so that the smaller sample sizes are progressively discounted in proportion to their relative weight as the number of plies increases, so I'm eager to try that out.
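One standard way to discount small samples in a line fit is weighted least squares, with each data point weighted by its number of games. The sketch below shows that general technique only; it is not necessarily the exact method referred to above, and the ply numbers, percentages, and game counts in it are hypothetical, chosen just to show the effect.

```python
def weighted_line_fit(xs, ys, weights):
    """Return (slope, intercept) minimising sum(w * (y - slope*x - intercept)**2)."""
    total_w = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / total_w
    y_bar = sum(w * y for w, y in zip(weights, ys)) / total_w
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical data (not from any database): ply number, White's win
# percentage at that ply, and the number of games still in the sample.
plies = [1, 2, 3, 4, 5]
white_pct = [37.0, 37.5, 38.0, 38.5, 60.0]  # last point: tiny-sample outlier
n_games = [500000, 200000, 50000, 5000, 10]

# Weighting by game count lets the large samples dominate the fit.
weighted = weighted_line_fit(plies, white_pct, n_games)
# Equal weights reduce this to an ordinary least-squares fit.
unweighted = weighted_line_fit(plies, white_pct, [1] * len(plies))

print("weighted slope, intercept:  ", weighted)
print("unweighted slope, intercept:", unweighted)
```

In this made-up example the 10-game outlier drags the unweighted line sharply upward, while the weighted fit stays close to the trend set by the large samples.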