Tactics Trainer Problem Difficulty

The_Evil_Ducklings

I don't know what has changed. The problems are still great and I love the trainer, it's just my rating suddenly dropped 200-300 points in a few days. Maybe it's old age, but I suspect something changed in the algorithm. Maybe ratings were inflated and it was done as a correction. 

Josechu

benkku52. I've noticed something like that too. My TT rating is +/- 1300 and recently I've been getting mainly problems quite a bit below that, with a few quite a bit above. Of course you don't know what the rating is when you get the problem, so you see a hanging piece and you don't know whether to grab it quick (because it's a low-rated problem) or whether the hanging piece is just a distraction and there's a subtle mate in 3 hidden in there somewhere. So it's hard to score decent points and at the same time, boy do you pay for it if you get one of the "easy" ones wrong! But I'm quite philosophical about it. In a game nobody tells you how deep you should be looking. Sometimes an easy opportunity pops up unexpectedly (at my level anyway); other times you search and search and you're lucky if you find anything.

The one enhancement that I think would make TT even better is to have a different scoring system for quickfire chess and for long format chess. The current system works well for bullet and blitz, I think, but it's a bit annoying for those of us who like to take our time. (So I focus more on my pass rate than my rating). My long format scoring system would have:

1) No minus points if you get it right, no matter how long you take (within reason, I guess).

2) No plus points if you get it wrong, even if you get 4 out of 5 moves or whatever. If you don't get it completely right then you haven't seen it. At best you get less of a minus.

I think that would be a better way for correspondence players and people who play standard chess with a lot of time on the clock.

Josechu

I'm not saying that time doesn't come into it at all in the long format version.

If you get it 100% right very quickly you score maximum points. If you get it 100% right very slowly you score zero points.

If you get it wrong you score minus points. Less of a minus if you get some moves correct. Maximum minus if you take ages and still get the first move wrong.

There's still a premium for doing it fast but that premium never outweighs the primary object of getting it right.
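The long-format rules above could be sketched roughly as follows. This is only an illustration of the proposal, not how the trainer works: the function name, the 15-point maximum and the cutoff of 4 times the average time are all assumptions made for the example.

```python
def slow_mode_points(moves_correct, moves_total, my_time, avg_time,
                     max_points=15.0, slow_factor=4.0):
    """Sketch of the proposed long-format scoring (all constants hypothetical)."""
    if moves_total <= 0 or avg_time <= 0:
        raise ValueError("moves_total and avg_time must be positive")

    # Speed runs from 1.0 (instant) down to 0.0 (slow_factor * avg_time or slower).
    speed = max(0.0, 1.0 - my_time / (slow_factor * avg_time))

    if moves_correct == moves_total:
        # A fully correct solution never loses points, however slow it is.
        return max_points * speed

    # A wrong answer always costs points. Partial progress and speed shrink
    # the penalty but can never turn it into a gain.
    fraction = moves_correct / moves_total
    return -max_points * (1.0 - fraction) * (1.0 - 0.5 * speed)
```

With these placeholder numbers, an instant correct solution earns the full 15 points, a correct solution at 4 times the average time earns 0, and a very slow miss on the first move loses the full 15.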

Martin0

@Josechu, it sounds to me like you do not know how the ratings are calculated in tactics trainer. If you score 50% against a problem with the same rating you get +-0, and 50% is the score you get for solving in about 1.57 times the avg. time (not sure exactly, but a bit above 1.5).

http://www.chess.com/tactics/help#rating
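The "50% against an equally rated problem gives +-0" behaviour is what any Elo-style update produces. The sketch below uses the standard Elo expected-score formula purely as an illustration; the trainer's real formula is the one described at the help link above and may differ, and the K-factor of 32 is an assumption.

```python
def expected_score(player_rating, problem_rating):
    """Standard Elo expectation of the player's score against the problem."""
    return 1.0 / (1.0 + 10 ** ((problem_rating - player_rating) / 400.0))


def rating_change(player_rating, problem_rating, score, k=32.0):
    """Rating change for a score between 0.0 and 1.0 (k is a hypothetical K-factor)."""
    return k * (score - expected_score(player_rating, problem_rating))
```

Here `rating_change(1500, 1500, 0.5)` is exactly zero, and any score below 50% against an equally rated problem costs points, which is why a slow but fully correct solution can still show a minus.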

Martin0

I think I both agree and slightly disagree with benkku's reasoning. At first I was against solving puzzles against the clock, but now I think it can be a good thing. I don't buy the 3 minutes/move argument, though. Time is spread very unevenly across the moves of a game, and you generally use more of it in critical positions (with good time management you will have enough time there, though that is easier said than done). Good tactics trainer problems only include critical positions (meaning one line is clearly best, rather than several winning or drawing lines), so compared with OTB games you probably don't need to move as fast, and doing so could cause bad habits.

Another thing is that tactics trainer is too nice when you fail a problem halfway through. I think you should always score 0% when you didn't get all the moves right. (You found the queen sac but not the following mate? Well done, you get some rating points.)

If I were deciding things, I would be stricter and only give 100% when successful and 0% when unsuccessful. You get a given time for a problem and score 0% if you don't make it within that time (but you may still solve it afterwards if you want). There would be 4 modes: "fast", "normal", "slow" and "timeless". I'm not sure about specifics, but the idea is that the timer would be based on the avg. time for each problem and differ between modes, and there would be no timer in "timeless" mode.
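A minimal sketch of that strict, mode-based idea might look like this. The post leaves the specifics open, so the multipliers and both function names are made up for the example.

```python
# Hypothetical multipliers of a problem's average solve time; the post does
# not give numbers, so these are placeholders. None means "no timer".
MODE_MULTIPLIERS = {"fast": 1.0, "normal": 2.0, "slow": 4.0, "timeless": None}


def time_limit(avg_time_seconds, mode):
    """Return the time allowed in seconds, or None when the mode has no timer."""
    multiplier = MODE_MULTIPLIERS[mode]
    return None if multiplier is None else multiplier * avg_time_seconds


def strict_score(all_moves_correct, my_time_seconds, avg_time_seconds, mode):
    """All-or-nothing score: 1.0 only for a complete solution within the limit."""
    limit = time_limit(avg_time_seconds, mode)
    in_time = limit is None or my_time_seconds <= limit
    return 1.0 if all_moves_correct and in_time else 0.0
```

Under these assumed multipliers, a correct solution in 90 seconds on a problem with a 60-second average would score 0% in "fast" mode and 100% in "normal" mode.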

 

These things are slightly off the issue here though; it's more important that people don't get a lot of lower-rated problems, against which it is very hard to maintain a rating. If there has been deflation/inflation in tactics trainer problems I don't see it as a big issue (as long as it's not a huge difference) and I don't see it as something that needs fixing.

Josechu

Martin. I know how it works currently, I've looked into it quite a lot. What I'm suggesting is an alternative system that is not so sensitive to time and where you do not get a minus score for solving in > 1.57 times the average time (but only if you opt for the "slow" algorithm. If you prefer the current "fast" option then you can opt for that.)

In my "slow" system if you correctly solve a problem with a rating = to your rating, and you are the slowest candidate ever to solve it correctly, then you score zero. If you are very slow but not quite the slowest then maybe you score 1. If you are the fastest to solve it correctly you score maximum points. See what I mean?

There are some wrinkles of course but the basic principles are described in my post #24 above.
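The "slowest scores zero, fastest scores maximum" idea can be read as a ranking against everyone who has already solved the problem correctly. The sketch below is one possible reading for a problem rated equal to the solver; the function name and the 15-point maximum are assumptions.

```python
def rank_based_points(my_time, previous_solve_times, max_points=15):
    """Points for a correct solution, ranked against earlier correct solvers.

    Hypothetical sketch: the share of earlier solvers who were slower than
    you is scaled to max_points.
    """
    if not previous_solve_times:
        return max_points  # nobody to compare against yet

    slower = sum(1 for t in previous_solve_times if t > my_time)
    return round(max_points * slower / len(previous_solve_times))
```

With 15 earlier solvers, beating only the slowest of them gives 1 point, being slower than all of them gives 0, and being faster than all of them gives the full 15. A correct solution can never go negative under this scheme.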

Josechu

Martin - my post crossed with your post #28. Looks like we basically agree.

Martin0

@Josechu, Sorry I missed post #24 for some reason and thought you tried to explain how it currently works in post #26. My bad. I guess we basically agree indeed.

cshuenss

Mine is higher than my rating, though. Probably the randomness isn't working that well, OR there aren't enough high-rated problems for you!

Recent Problems

Date                   ID#      Rating  My Rating  Moves  Avg. Time  My Time  Outcome
Aug 10, 2013 6:00 PM   0083642  1526    1454       1/2    1:06       0:29     Failed (46% | -1)
Aug 10, 2013 10:35 AM  0000825  1516    1455       0/3    0:35       0:06     Failed (0% | -14)
Aug 10, 2013 10:35 AM  0155605  1518    1469       2/4    1:15       0:10     Failed (50% | +3)
Aug 10, 2013 10:34 AM  0166544  1495    1466       1/1    0:21       0:11     Passed (90% | +14)
Aug 10, 2013 9:38 AM   0071817  1430    1452       4/4    1:36       0:20     Passed (100% | +15)
Aug 10, 2013 9:38 AM   0144667  1428    1437       2/2    0:40       0:12     Passed (97% | +14)
Aug 10, 2013 9:37 AM   0091504  1421    1423       0/3    0:32       0:04     Failed (0% | -17)
Aug 10, 2013 9:31 AM   0026943  1387    1440       2/2    0:26       0:15     Passed (88% | +9)
Aug 10, 2013 9:31 AM   0026157  1388    1431       2/2    0:32       0:19     Passed (88% | +10)
Aug 10, 2013 9:30 AM   0104161  1387    1421       0/1    0:20       0:21     Failed (0% | -17)
Aug 10, 2013 9:29 AM   0058030  1370    1438       0/2    1:01       0:19     Failed (0% | -17)
Aug 10, 2013 9:29 AM   0044576  1479    1455       2/2    0:33       0:14     Passed (93% | +15)
Aug 10, 2013 9:28 AM   0190084  1525    1440       1/3    1:12       0:27     Failed (31% | -2)
Aug 10, 2013 9:27 AM   0032278  1500    1442       0/2    3:37       0:07     Failed (0% | -14)
Aug 10, 2013 9:26 AM   0184052  1476    1456       1/1    0:18       0:09     Passed (90% | +13)
Aug 10, 2013 9:26 AM   0028581  1390    1443       2/2    0:37       0:14     Passed (94% | +10)
Aug 10, 2013 9:25 AM   0108552  1398    1433       1/2    0:45       0:07     Failed (50% | +1)
Aug 10, 2013 9:25 AM   0020893  1414    1432       0/5    4:02       0:19     Failed (0% | -17)
Aug 10, 2013 9:24 AM   0021916  1469    1449       0/3    0:32       0:15     Failed (0% | -15)
Aug 10, 2013 9:23 AM   0033696  1572    1464       0/2    0:32       0:15     Failed (0% | -11)
Aug 10, 2013 9:23 AM   0038331  1458    1475       1/2    1:12       0:18     Failed (49% | -1)
Aug 10, 2013 9:22 AM   0039817  1518    1476       2/2    1:13       0:16     Passed (99% | +18)
Aug 10, 2013 8:25 AM   0039305  1486    1458       1/3    1:41       0:23     Failed (33% | -4)
Aug 9, 2013 6:33 PM    0021251  1364    1462       5/5    1:13       0:25     Passed (95% | +10)
Aug 9, 2013 6:32 PM    0035287  1461    1452       2/2    0:59       0:14     Passed (99% | +15)

Martin0

I think it's important that the problems you get are within a 100-point range of your rating. That seems to be the case for cshuenss, but we have seen others with a bigger rating difference. There could indeed be something wrong with the random algorithm.

Josechu

I collect my TT data because I like to do my own analysis. The graph below shows the average rating of the problems I've attempted, by month, against my average rating. For ages the two lines are indistinguishable. But from roughly 2013 they have started to diverge. I assumed it was because I've done so many problems now, within quite a limited range, that there are no problems left in my range, so they give me some higher and some lower, and they don't quite get it right. I may try resetting my stats just to see what happens. But like I said above, it doesn't bother me that much. It amuses me that I seem to do better on the tough problems than the easy ones!
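For anyone who wants to repeat this kind of comparison, a rough sketch follows. It assumes the history has been pasted as text lines in the same layout as the "Recent Problems" list earlier in the thread; the function name and regular expression are made up for the example, and the three sample rows are copied from cshuenss's list only to show the input format.

```python
import re
from collections import defaultdict

# Matches the start of a row: date, time, problem ID, problem rating, own rating.
ROW = re.compile(
    r"^(?P<month>[A-Z][a-z]{2}) \d{1,2}, (?P<year>\d{4}) \d{1,2}:\d{2} [AP]M\s+"
    r"\d+\s+(?P<problem>\d+)\s+(?P<mine>\d+)\s"
)


def monthly_averages(lines):
    """Map (year, month) to (average problem rating, average own rating)."""
    sums = defaultdict(lambda: [0, 0, 0])  # problem total, own total, row count
    for line in lines:
        match = ROW.match(line.strip())
        if match is None:
            continue  # skip headers and anything that is not a data row
        bucket = sums[(match["year"], match["month"])]
        bucket[0] += int(match["problem"])
        bucket[1] += int(match["mine"])
        bucket[2] += 1
    return {key: (p / n, m / n) for key, (p, m, n) in sums.items()}


sample = [
    "Aug 10, 2013 6:00 PM 0083642 1526 1454 1/2 1:06 0:29 Failed (46% | -1)",
    "Aug 10, 2013 10:35 AM 0000825 1516 1455 0/3 0:35 0:06 Failed (0% | -14)",
    "Aug 9, 2013 6:33 PM 0021251 1364 1462 5/5 1:13 0:25 Passed (95% | +10)",
]
averages = monthly_averages(sample)
```

Plotting the two averages per month would give the two lines described above; a persistent gap between them suggests the problems served are drifting away from the solver's rating.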

erik

we are going to tweak this algorithm again to make it better!

nameno1had
Martin0 wrote:

Did some problems on my android. No idea why I got such low-rated problems or why I was allowed to do this many tactics on the same day. Hard to maintain a rating with problems rated 1000 points below.

Date                   ID#      Rating  My Rating  Moves  Avg. Time  My Time  Outcome
Mar 22, 2013 1:41 AM   0051998  1166    2037       2/2    1:01       0:21     Passed (95% | +1)
Mar 22, 2013 1:40 AM   0033508  1244    2036       3/3    1:02       0:59     Passed (81% | +1)
Mar 22, 2013 1:39 AM   0029466  1106    2035       4/4    1:49       1:01     Passed (89% | +1)
Mar 22, 2013 1:38 AM   0048779  1056    2034       4/4    0:52       0:40     Passed (85% | +1)
Mar 22, 2013 1:36 AM   0000793  1187    2033       0/3    1:47       1:12     Failed (0% | -37)
Mar 22, 2013 1:35 AM   0052316  1124    2070       3/3    1:16       0:15     Passed (100% | +1)
Mar 22, 2013 1:35 AM   0033571  1123    2069       3/3    1:57       1:21     Passed (86% | +1)
Mar 22, 2013 1:33 AM   0027811  965     2068       2/2    1:12       0:59     Passed (84% | +1)
Mar 22, 2013 1:32 AM   0000746  1057    2067       1/1    0:40       0:10     Passed (98% | +1)
Mar 22, 2013 1:32 AM   0000905  980     2066       3/3    1:09       0:45     Passed (87% | +1)
Mar 22, 2013 1:31 AM   0000766  824     2065       2/2    1:14       0:23     Passed (96% | +1)

I am not sure why you failed the one problem that is nearly 1000 points below your rating. It could have been a touch screen issue. Phones suck for that. I won't even try the TT on my phone because of the failures it causes.

I often feel that the system by which puzzle difficulty ratings are established is severely flawed.

If you failed it simply because you had a different and most likely completely winning idea that was .001 of a point below the computer's assessment, especially if it was a 7-move puzzle, you shouldn't get dinged for it in the first place, in my opinion. I get frustrated when I see 2-move puzzles rated 2000+ that I can solve in 3 tries, but 5- to 7-move puzzles in my rating range that take 5 to 10 tries. That is pretty good evidence that we get dinged for choosing winning lines when we shouldn't, and that the system for predetermining difficulty is flawed.

I propose a system that rewards you according to your true ability, as opposed to how often you can match a specific engine's top choice for a move. My goal isn't to match Houdini move for move. The last time I checked, that gets you banned. I should instead be rewarded for choosing winning lines. That is the goal of any chess player, at any level.

andrewmay

Erik,

Thanks for looking at this to improve the algorithm.  My main complaint is that for the past 2-3 weeks, I am only getting problems that are 100-250 points below my current rating (around 1700), which makes it very difficult to improve my rating.  My rating had consistently been between 2200-2400, then three days in a row I only got problems rated between 1600-1700, which caused a big drop.  All I want is problems that are consistent with my rating, which will help me gain rating points and get better problems.  I hope to see a change soon.

Martin0

You don't need to speculate too much why I failed a puzzle. It was a good puzzle and I simply don't manage to solve all problems rated that low. I'm not that consistent at solving lower rated puzzles. I just couldn't see a knight fork for some reason.

Martin0
erik wrote:

we are going to tweak this algorithm again to make it better!

I'm assuming you're referring to the search algorithm used when starting a new puzzle. In that case you could also keep in mind that some people get a lot of lag once they have done a lot of puzzles (which they solved by using reset history, if I correctly remember an old forum thread that I cannot find).

Thanks for looking into this.

nameno1had

@Martin0, if that is the case, it lends even more credence to what I am saying. Master-level players shouldn't make mistakes that sub-1200 players make... I realize even GMs make silly mistakes from time to time. If a puzzle is really worthy of that rating, in my opinion, you don't make that mistake.

Having said all of this, considering I am not a computer programmer, I am not sure how to incorporate a pre-attempted rating system for puzzles, or how to integrate a system that rewards a player for choosing winning lines. It could be done on the basis of points awarded for how close your line is to the best one, all the way down to a drawing line, for which you would be awarded nothing. When you play losing moves, that is when you should get dinged. Also, the system should be set up so that you can choose to have it reward or penalize according to the type of chess you play and the difficulty of the problem. Three out of the 10 people who try a puzzle might have gotten lucky and picked the right move quickly, driving the average solve time down too low, while 7 people might have failed it. My skill shouldn't be required to keep pace with the luck of others.
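The "points for how close you are to the best line" idea could be sketched along these lines. It assumes an engine evaluation in pawns is available for both the best line and the line played, which is not something the thread says the trainer exposes; the function name, the 15-point maximum and the 0.5-pawn drawing margin are all hypothetical.

```python
def line_based_points(played_eval, best_eval, max_points=15.0, draw_margin=0.5):
    """Reward winning lines in proportion to the best line's evaluation.

    Evaluations are in pawns from the solver's point of view. Anything within
    draw_margin of zero counts as a draw; anything below it counts as losing.
    """
    if best_eval <= draw_margin:
        raise ValueError("the puzzle's best line must be winning")

    if played_eval < -draw_margin:
        return -max_points  # a losing move is the only thing that costs points
    if played_eval <= draw_margin:
        return 0.0  # a drawing line earns nothing
    # A winning line earns a share of the maximum, capped at the best line.
    return max_points * min(played_eval, best_eval) / best_eval
```

Under these assumptions, a line that keeps half of the best line's advantage earns half the points, while only an outright losing move is penalized.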

The_Evil_Ducklings

My tactics trainer rating was 2500-2600 for almost 4 years, then fell suddenly to the 2300 range. While disappointing, it better represents my FIDE rating (2390). It might be that once you fall out of a range, it's very hard to get back up to that level unless you work very hard and don't cut corners. Cutting corners includes doing too many problems at once, doing problems sleep deprived, or playing the move that "looks right" without calculation.

sisu

Let's make it happen!

Martin0

Well, having a tactics rating of 2000+ doesn't make me a master-level player.

There are some bad problems with several winning lines, but those are just bad problems that should be removed or changed (since alternative winning lines currently aren't an option). It shouldn't matter how close you are to solving a puzzle; a failed puzzle is a failed puzzle. I don't care whether you turn a win into a draw or a win into a loss; both should result in a failed puzzle.