Difficulty of chess puzzles question - Chess Forums

TheMachinist87

May 18, 2022

#1

Hi all. I've been asking myself: how is the difficulty level of chess puzzles calculated? Is the number shown in, e.g., Puzzle Rush related to Elo strength? How is it determined: by centipawn loss, or by the number of moves in the puzzle?

llama51

May 18, 2022

#2

When someone correctly solves the puzzle, the puzzle's rating goes down.
When someone fails the puzzle, the puzzle's rating goes up.

It's not related to centipawns or a player's other ratings. In fact puzzle ratings are much higher than other ratings. I don't know why chess.com chose to make them higher. It would be simple to make them similar to game ratings.
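
A minimal sketch of that up/down behaviour, assuming a plain Elo-style update. The thread doesn't give chess.com's actual formula, so the function names, the K-factor of 32, and the 400-point scale are all assumptions for illustration only:

```python
def expected_score(solver_rating: float, puzzle_rating: float) -> float:
    """Elo-style probability that the solver gets the puzzle right."""
    return 1.0 / (1.0 + 10 ** ((puzzle_rating - solver_rating) / 400.0))


def update_ratings(solver_rating: float, puzzle_rating: float,
                   solved: bool, k: float = 32.0) -> tuple[float, float]:
    """Return (new_solver_rating, new_puzzle_rating) after one attempt.

    The puzzle is treated like an opponent: whatever the solver gains,
    the puzzle loses, and vice versa. K = 32 is an assumed value.
    """
    score = 1.0 if solved else 0.0
    delta = k * (score - expected_score(solver_rating, puzzle_rating))
    return solver_rating + delta, puzzle_rating - delta


print(update_ratings(1500, 1500, solved=True))   # (1516.0, 1484.0)
print(update_ratings(1500, 1500, solved=False))  # (1484.0, 1516.0)
```

A solve pushes the puzzle's rating down and a fail pushes it up; nothing about centipawns or move count enters the calculation.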

Martin_Stahl

May 19, 2022

#3

llama51 wrote:

... In fact puzzle ratings are much higher than other ratings. I don't know why chess.com chose to make them higher. It would be simple to make them similar to game ratings.

How do you do that? You're already in an artificial situation and know there is something there. So it's solvable given enough time. If they went back to only giving a maximum of 1 point for getting it right if you take too long, that would help some, but even before that, ratings were higher.

Maybe they could tweak the K-value or something, but even if the max gain was something like 5 points, there likely would be a lot of players with higher tactics ratings compared to any other rating pool.
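
To illustrate the K-value point under a plain Elo assumption (not chess.com's confirmed formula): K changes how fast a rating moves, not where it settles. The sketch below uses hypothetical numbers, a solver who gets 90% of 1500-rated puzzles right, and applies the average update repeatedly for two different K values:

```python
def expected_score(solver: float, puzzle: float) -> float:
    """Elo-style probability that the solver gets the puzzle right."""
    return 1.0 / (1.0 + 10 ** ((puzzle - solver) / 400.0))


def settle(k: float, true_solve_rate: float = 0.9, puzzle: float = 1500.0,
           start: float = 1200.0, attempts: int = 20000) -> float:
    """Apply the *average* Elo update repeatedly and return the final rating.

    true_solve_rate, puzzle, and start are made-up illustrative values.
    """
    rating = start
    for _ in range(attempts):
        rating += k * (true_solve_rate - expected_score(rating, puzzle))
    return rating


print(round(settle(k=5.0), 1))   # 1881.7
print(round(settle(k=32.0), 1))  # 1881.7
```

Both runs end at the same rating, which fits the guess that a 5-point cap on gains alone would not bring tactics ratings down.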

llama51

May 19, 2022

#4

Martin_Stahl wrote:

llama51 wrote:

... In fact puzzle ratings are much higher than other ratings. I don't know why chess.com chose to make them higher. It would be simple to make them similar to game ratings.

How do you do that? You're already in an artificial situation and know there is something there. So it's solvable given enough time. If they went back to only giving a maximum of 1 point for getting it right if you take too long, that would help some, but even before that, ratings were higher.

Maybe they could tweak the K-value or something, but even if the max gain was something like 5 points, there likely would be a lot of players with higher tactics ratings compared to any other rating pool.

Yeah, maybe it's harder than I think, but for example, first of all, if the person doing the puzzle doesn't have an active live rating, then they can't affect the puzzle's rating.

And to start, have someone on staff that actually plays chess, even just one person, solve some puzzles and get a feel for how hard they are. Then apply some transform that sets a new mean and cap. Do it conservatively so you don't get things too far wrong... but for example humans can't solve 3000 rated puzzles unless they're super GMs... so the cap will be below 3000.
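
A rough sketch of what "a transform that sets a new mean and cap" could look like: a linear shift and squeeze followed by a hard cap. The function name, the sample ratings, and the chosen mean, spread, and cap are all hypothetical, not anything chess.com uses:

```python
from statistics import mean


def rescale(ratings: list[float], target_mean: float, cap: float,
            spread: float = 1.0) -> list[float]:
    """Shift ratings to a new mean, scale their spread, then apply a cap.

    Note: the cap is applied after the shift, so the final mean ends up
    slightly below target_mean whenever any rating gets clipped.
    """
    old_mean = mean(ratings)
    return [min(cap, target_mean + spread * (r - old_mean)) for r in ratings]


# Made-up puzzle ratings, for illustration only.
puzzle_ratings = [1800.0, 2400.0, 3000.0, 3600.0]
print(rescale(puzzle_ratings, target_mean=1700.0, cap=2000.0, spread=0.5))
# [1250.0, 1550.0, 1850.0, 2000.0]
```

The ordering of puzzles by difficulty is preserved; only the scale changes, which is why a conservative choice of mean and cap matters.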

1g1yy

May 19, 2022

#5

I have no idea how they do it, but I often solve puzzles whose rating makes absolutely no sense. For instance, the other day I had one that stumped me for a moment because it was so easy that I wasted time looking for some ridiculous trick in it. Well, there wasn't any, and I got it right.

It was played about 1450 times and had a success rate around 95%. The puzzle rating is currently 2200 but at the time I did it I want to say it was 2350 or so.

Sorry to say but this one pretty much plays itself.

https://www.chess.com/puzzles/problem/1729532/practice
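
For a sense of scale, here is what a 95% success rate would mean under a plain Elo model (an assumption; the site's displayed success rate may also be counted differently). The helper name is hypothetical:

```python
import math


def implied_rating_gap(success_rate: float) -> float:
    """Rating gap (solver minus puzzle) at which Elo predicts this success rate."""
    return 400.0 * math.log10(success_rate / (1.0 - success_rate))


gap = implied_rating_gap(0.95)
print(round(gap))         # 512
print(round(2200 + gap))  # 2712
```

So under that model, a 2200-rated puzzle with a stable 95% success rate would need its solvers to average roughly 2700, about 500 points above the puzzle.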

Martin_Stahl

May 19, 2022

#6

llama51 wrote:

Yeah, maybe it's harder than I think, but for example, first of all, if the person doing the puzzle doesn't have an active live rating, then they can't affect the puzzle's rating.

And to start, have someone on staff that actually plays chess, even just one person, solve some puzzles and get a feel for how hard they are. Then apply some transform that sets a new mean and cap. Do it conservatively so you don't get things too far wrong... but for example humans can't solve 3000 rated puzzles unless they're super GMs... so the cap will be below 3000.

There might be a way to cap tactics ratings at some point, but if a tactic is rated 3000, it could be either because it's really deep or because a lot of people are going for something obvious that isn't the best line. Someone taking their time may be able to figure out the line (especially when there can only be one line).

Just like any other part of the game, a player at a particular rating may not be familiar with some patterns and very familiar with others. I do think there is a higher tendency for the selection algorithm to weight selecting tactics below the current member's puzzle rating, which can put some upward pressure on ratings.

I know there are some high rated puzzles I get right with some thought, along with lower ones I get wrong. On a good day, I've pushed almost to 2900. On other days, I tank 300 points and have trouble getting or maintaining over 2600. So, if my OTB rating is any indication, most of the puzzles I get would have to be considered between 1500 and 1800 rating (and not 2200+, where most of the batch from my last session was, with only 1 puzzle over my rating at the time).

llib2

May 19, 2022

#7

llama51 wrote:

When someone correctly solves the puzzle, the puzzle's rating goes down.
When someone fails the puzzle, the puzzle's rating goes up.

It's not related to centipawns or a player's other ratings. In fact puzzle ratings are much higher than other ratings. I don't know why chess.com chose to make them higher. It would be simple to make them similar to game ratings.

Thanks for explaining how the puzzles are rated. The solution is so simple! I bet in 100 years I could have figured out how to do it. 200?

llama51

May 19, 2022

#8

Martin_Stahl wrote:

llama51 wrote:

Yeah, maybe it's harder than I think, but for example, first of all, if the person doing the puzzle doesn't have an active live rating, then they can't affect the puzzle's rating.

And to start, have someone on staff that actually plays chess, even just one person, solve some puzzles and get a feel for how hard they are. Then apply some transform that sets a new mean and cap. Do it conservatively so you don't get things too far wrong... but for example humans can't solve 3000 rated puzzles unless they're super GMs... so the cap will be below 3000.

There might be a way to cap tactics ratings at some point, but if a tactic is rated 3000, it could be either because it's really deep or because a lot of people are going for something obvious that isn't the best line. Someone taking their time may be able to figure out the line (especially when there can only be one line).

Just like any other part of the game, a player at a particular rating may not be familiar with some patterns and very familiar with others. I do think there is a higher tendency for the selection algorithm to weight selecting tactics below the current member's puzzle rating, which can put some upward pressure on ratings.

I know there are some high rated puzzles I get right with some thought, along with lower ones I get wrong. On a good day, I've pushed almost to 2900. On other days, I tank 300 points and have trouble getting or maintaining over 2600. So, if my OTB rating is any indication, most of the puzzles I get would have to be considered between 1500 and 1800 rating (and not 2200+, where most of the batch from my last session was, with only 1 puzzle over my rating at the time).

True, different people are familiar with different patterns, but if someone were to attempt to bring the puzzle rating more in line with live ratings it would take extreme incompetence to end up with what we have right now.

As for puzzles below your rating, I'm not sure that pressures it up or down. For example if the puzzles I'm given are slightly below my rating, to where I'm expected to score 60%, and then I get 6 out of 10 right, then over a long period of time both my rating and the puzzle's ratings don't go up or down.
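
The arithmetic behind that, sketched with a plain Elo update (K = 32 and the function name are assumptions for illustration):

```python
import math


def net_change(results: list[int], expected: float, k: float = 32.0) -> float:
    """Total rating change over a series of attempts, all at the same expected score."""
    return sum(k * (r - expected) for r in results)


# Rating gap at which Elo expects a 60% score: about 70 points.
print(round(400.0 * math.log10(0.6 / 0.4), 1))  # 70.4

six_of_ten = [1] * 6 + [0] * 4
print(abs(net_change(six_of_ten, expected=0.6)) < 1e-9)    # True: no net movement
print(net_change([1] * 7 + [0] * 3, expected=0.6) > 0)     # True: beating 60% gains points
```

Scoring exactly the expected 60% nets zero, so easier puzzles by themselves don't inflate anything; only scoring above expectation does.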

llama51

May 19, 2022

#9

It's an interesting thought experiment though -- whether new players in the pool push ratings higher or lower.

What I discovered is, as long as the initial rating period gets them close to their "true" rating, then there is no effect at all i.e. everyone's rating stays the same... even when the group of new players is very large, and regardless of whether they're much higher or lower than the average player. Mostly this is because between two established players the number of rating points gained is equal to the number lost. Therefore the system maintains its point-to-skill ratio.

It would be the same for frequently facing off against players (or puzzles) worse than you.
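
A small simulation of the "points gained equal points lost" idea, assuming plain Elo with the same K for both sides and newcomers who enter at their true rating. The pool sizes, ratings, K, and seed are made up for illustration:

```python
import random


def expected_score(a: float, b: float) -> float:
    """Elo-style probability that a beats b."""
    return 1.0 / (1.0 + 10 ** ((b - a) / 400.0))


def play_games(ratings: list[float], skills: list[float], games: int,
               k: float = 16.0, seed: int = 0) -> list[float]:
    """Play random pairings; results follow true skills, updates follow ratings."""
    rng = random.Random(seed)
    ratings = list(ratings)
    for _ in range(games):
        i, j = rng.sample(range(len(ratings)), 2)
        won = rng.random() < expected_score(skills[i], skills[j])
        delta = k * ((1.0 if won else 0.0) - expected_score(ratings[i], ratings[j]))
        ratings[i] += delta  # whatever i gains...
        ratings[j] -= delta  # ...j loses, so the pool total never changes
    return ratings


# 10 established players plus 15 much weaker newcomers, all correctly rated.
skills = [1500.0] * 10 + [1100.0] * 15
after = play_games(skills, skills, games=5000)

print(abs(sum(after) - sum(skills)) < 1e-6)  # True: total points conserved
print(round(sum(after[:10]) / 10))           # established average, for inspection
```

The total is conserved exactly (up to floating-point error); the established players' average only wobbles with the random results instead of drifting steadily, as long as the newcomers really do start near their true rating.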