Arbitrary or Inaccurate Ratings of Some Tactics Puzzles

CheddarStyle

I have been using chess.com's tactics trainer for a few years now, slowly getting better, and recently I have been able to stay consistently above a 2000 rating. Along the way I have played thousands of puzzles; not as many as a lot of people, but still a pretty good amount. In that time, especially lately, I've noticed that some rather easy puzzles seem to be rated much higher than they should be, giving me a lot of points even though the solution is really just a simple mate in two, while other puzzles are much harder than their rating suggests.

I can't seem to view tactics I did very far in the past, only the last thirty or so, so I can't find many examples. But I did just run into an absolutely ridiculous puzzle rated only 1676, with a target time of only 23 seconds.


It didn't take me long to see the initial idea: the queen is pinned, but there is 1. Rd8 Rxd8 2. Qxg7 Ke8 3. Bg6#, so I realized 1. ...Rxd8 was impossible and he'd have to move the king. 1. ...Kf7 didn't make sense to me, because it'd just let me check him again, so I looked at what would happen if he advanced towards my rook. I saw that after 2. Rxc8 he'd have to do something with the queen. This line of thinking took me maybe 20-30 seconds at most.

This is where things appeared to get complicated. Sure, if 2. ...Qxc3+ then 3. Rxc3 and I'd enjoy being up material, but why would black do that when there's a juicy check practically RIGHT THERE forking my king and the rook on g4? Of course, I saw I had checks of my own after that, and eventually calculated the entire winning line, 1. Rd8 Ke7 2. Rxc8 Qe2+ 3. Kb1 Qxg4 4. Qc7+ Kf6 5. Rf8+ Kg5 6. Qc1+ Kh4 7. Rf4, pinning the queen and winning the game. Everything after 3. ...Qxg4 is very forced, so it wasn't the worst thing to calculate, I just had to find the checks. Still, finding a seven move forced win left me feeling accomplished, and as I started putting the moves in, I was eager to receive the points from what must have been a highly rated tactic, solved relatively quickly.

But instead of 2. ...Qe2+, black played 2. ...Qxc3+??, rendering most of my calculation pointless. When I completed the problem, despite calculating seven moves in 53 seconds, it only gave me +7 to my rating, which prompted me to look up the rating and target time of the puzzle, and I saw it is only rated 1676 with a target time of 23 seconds! This honestly feels worse than missing a tactic due to a misclick; at least then I'm losing rating due to my own carelessness. If someone only calculated that tactic to 2. Rxc8 without thinking about 2. ...Qe2+, then they didn't really solve it. 2. ...Qe2+ is a very obvious response to 2. Rxc8 and needs to be answered accurately if white wants a chance of winning this game.


This isn't the only example I've found, it's just the most extreme I've seen in quite a while, and the only one still in my tactics history. To contrast, here is a tactic I also did today that is rated 1844 with a target time of 20 seconds, so according to this arbitrary ratings scale, it should certainly be harder than this previous forced win in seven moves.

Wow, what a toughie, huh? This one could not be more obvious, for me at least. A knight just took my knight, and black's queen is left attacked by the rook, but if I take it, the knight on c3 takes my queen and I'm down material in the end. If only there was a way to get my queen out of danger while maintaining the double attack I currently have? I quickly saw no queen move would help, and not long afterwards saw that if I recapture the knight with the pawn, I'd be threatening another knight, his queen would still be hanging, and I'd be up a piece in the end. A pretty simple two move tactic; there are probably harder ones in the intermediate chess.com lessons.

Even if the first tactic were only three moves, as the puzzle would have you believe, with 2. ...Qxc3+ really being the best move for black there, it'd still have been harder to find and calculate than the second tactic. It really makes me question how these puzzle ratings are derived, and whether my increase in rating is even very meaningful, or if I've just been lucky with getting highly rated easy tactics.

Here's the link to the first tactic, which needs to be reported so it can be fixed

And here's the link to the second, which probably shouldn't be reported but only sneered at

Anyways, am I alone in these findings? Is there some actual concrete method from which these puzzles' ratings are derived, or is it just some number a random GM pulled out of nowhere? Thanks to all who read this giant block of text. This problem has peeved me for a while, and I wanted to see if I was the only one who felt like these puzzle ratings are more or less arbitrary.

Lord_Hammer

How is the first one hard? 

CheddarStyle

Are you really saying a tactic that requires you to calculate seven moves ahead deserves to be rated in the 1600s while a simple two move tactic deserves to be rated in the 1800s?
Good job on being an IM and everything, and I understand that if I was able to figure it out in under a minute, someone actually good at chess like you could probably do it very quickly, but that is very much unrelated to the post.

Unless you just came here to brag about how good you are at chess, in which case good job, everyone has seen the IM in your name so you can rest easy.

Lord_Hammer

1. I did not come here to brag. 

2. The first one is a 3 move tactic, not a 7 move tactic. 

CheddarStyle

Did you actually read the post? Or at least some of it? It's a seven move tactic, whoever made it just screwed it up.

Lord_Hammer
PonkPone wrote:

Did you actually read the post? Or at least some of it? 

I did. 

 

PonkPone wrote:

It's a seven move tactic, whoever made it just screwed it up.

Precisely why it is 1600. 

CheddarStyle

But I thought it was just a simple three move tactic? It'd be better to not respond than to move the goalposts here. And, like I mentioned in the post, I definitely believe that if 2. ...Qe2+ wasn't possible and 2. ...Qxc3+ really was the best move there, it'd still be a harder tactic than the second example. There is simply more to calculate in the first, and the first move in the sequence isn't nearly as natural a move as in the second.

Martin_Stahl
HyperCube97 wrote:

The ratings of some tactics are definitely inaccurate by any reasonable metric.

 

Tactics change ratings just like players by being solved correctly or failed.
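
As a rough sketch of what that means, here is a plain Elo-style update in which each attempt is treated as a "game" between the solver and the puzzle. This is only an illustration under stated assumptions: the thread does not say which formula or constants Chess.com actually uses, and the names `expected_score`, `update_ratings` and the value `k=32` are made up for the example.

```python
def expected_score(player_rating: float, puzzle_rating: float) -> float:
    """Elo-style probability that the player solves the puzzle."""
    return 1.0 / (1.0 + 10 ** ((puzzle_rating - player_rating) / 400.0))


def update_ratings(player_rating: float, puzzle_rating: float,
                   solved: bool, k: float = 32.0) -> tuple[float, float]:
    """Return (new_player_rating, new_puzzle_rating) after one attempt.

    Assumption: the puzzle is rated like an opponent, so whatever the
    player gains the puzzle loses, and vice versa. k is a hypothetical
    step size, not a known Chess.com constant.
    """
    score = 1.0 if solved else 0.0
    delta = k * (score - expected_score(player_rating, puzzle_rating))
    return player_rating + delta, puzzle_rating - delta


# Equal ratings: a solve is worth half of k to the player,
# and the puzzle drops by the same amount.
print(update_ratings(1500, 1500, solved=True))   # (1516.0, 1484.0)

# A strong solver gains little from an easy-rated puzzle,
# but a fail costs the solver a lot and pushes the puzzle rating up.
print(update_ratings(2000, 1676, solved=True))
print(update_ratings(2000, 1676, solved=False))
```

Under a scheme like this, a puzzle's rating is only as good as the attempts it has collected, so a puzzle that strong solvers keep failing on a careless move can drift upward even if the idea behind it is simple.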

CheddarStyle
HyperCube97 wrote:

The randomness inherent to this way of computing the puzzle ratings (along with the features of the used algorithms, Chess.com definitely has some responsibility here) apparently sometimes lead to very unnatural ratings on some puzzles ...

This is a super old topic, but just the other day I found a simple backrank mate in two (where you have a queen-rook battery and sac the queen for a rook with mate on the next move, simplest thing ever) that was rated like 2300+ lol.
Most puzzles have a decently accurate rating, but others are just wayyyy off, even at ratings well above 2000.

AlekhineHound
CheddarStyle wrote:
HyperCube97 wrote:

The randomness inherent to this way of computing the puzzle ratings (along with the features of the used algorithms, Chess.com definitely has some responsibility here) apparently sometimes lead to very unnatural ratings on some puzzles ...

This is a super old topic, but just the other day I found a simple backrank mate in two (where you have a queen-rook battery and sac the queen for a rook with mate on the next move, simplest thing ever) that was rated like 2300+ lol.
Most puzzles have a decently accurate rating, but others are just wayyyy off, even at ratings well above 2000.

 

Yeah, you probably know how the ratings are determined by now, but I believe it's just based on player data: the ratings of the players who get the puzzle right or wrong. Meaning, sometimes a player will just get lucky and get it right despite missing the subtleties you mention. Or sometimes they'll get it wrong because they overlook one seemingly obvious move in favor of another seemingly obvious move (especially when people are trying to answer quickly).