Cat bot ratings are ALL questionable?

Avatar of pds314

Alright, so I am rated 663 at rapid. I am by no means a great player. I do not have good knowledge of openings, make outrageous blunders that cost me the game, don't always see fairly basic tactics coming, etc. Most of the usual reasons people are 700-rated.

What surprises me is that the first 3 cat bots were not hard to beat on the first try, even though they are nominally 800, 1000, and 1200. And not because I played well, either. I'm fairly certain I straight up blundered a queen in some of the games. Finally, I played against the 1400-rated Catspurrov and lost because I got greedy and blundered mate in one trying to pull off a mate against them. I rematched them, played poorly, was down 13 points of material in the endgame, and noticed that my king was completely immobilized and all I had left was pawns. So I pushed pawns until my last two pawns could not move, and Catspurrov just made a random pawn move and blundered a draw. I will point out that 1 loss and 1 draw is not what you'd expect with 700 or 800 points of rating difference, especially with me not really trying that hard. That much rating difference implies it should win around 98% of the time and draw 2% or something. And certainly drawing while up 13 points of material is... interesting...
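As a rough check on that 98% figure, here is a minimal sketch using the standard Elo expected-score formula. It assumes the bot's listed 1400 and the poster's 663 rapid rating sit on the same scale, which is exactly what the thread is questioning, and it treats a draw as half a point. The function and variable names are just for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score for player A (win = 1, draw = 0.5, loss = 0)
    under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


bot_listed = 1400  # Catspurrov's listed rating, from the post
player = 663       # the poster's rapid rating, from the post

# What the listed ratings predict for the bot, per game.
bot_expected = expected_score(bot_listed, player)

# What actually happened over the two games: one bot win, one draw.
bot_observed = (1.0 + 0.5) / 2

print(f"expected bot score per game: {bot_expected:.3f}")
print(f"observed bot score per game: {bot_observed:.3f}")
```

On these numbers the formula predicts about 0.986 points per game for the bot against the 0.75 it actually scored, though two games is far too few to conclude much on its own.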

I felt like I was playing rather poorly against them. Definitely not as well as I would try to play against humans. A lot more blunders that SHOULD have cost me the game against an actual 600 or 700-rated opponent. But the thing is, the cat bots, while they are aware they can use tactics, don't really seem to know that I can use tactics too. Catspurrov seems about as aware of my tactics as a 700-rated player might be. But the rest seem very oblivious to even basic forks and pins when they're not the ones doing them. What do you think the "real" rating of the cat bots is if there's no time control but you don't think too long? My opinion is maybe 350, 450, 500, 850?
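One hedged way to put a number on a "real" rating is to invert the standard Elo expected-score formula: given an opponent's rating and the fraction of points scored against them, solve for the rating that would make that score the expected one. The sketch below applies this to Catspurrov's one win and one draw against a 663-rated player. It assumes 663 is an accurate measure of the poster's strength in those games, and a two-game sample is extremely noisy. The helper name `implied_rating` is made up for this example.

```python
import math


def implied_rating(opponent_rating: float, score_fraction: float) -> float:
    """Rating at which `score_fraction` would be the Elo expected score
    against an opponent rated `opponent_rating`."""
    if not 0.0 < score_fraction < 1.0:
        # A 0% or 100% score has no finite Elo equivalent.
        raise ValueError("score_fraction must be strictly between 0 and 1")
    return opponent_rating + 400 * math.log10(score_fraction / (1 - score_fraction))


# Catspurrov scored 1.5 out of 2 against the 663-rated poster.
catspurrov_estimate = implied_rating(663, 1.5 / 2)
print(round(catspurrov_estimate))
```

The formula has no answer for the first three bots, since they scored zero points in a single game each; more games with mixed results would be needed to estimate them this way.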

Except Mittens. Mittens is also questionable, but in the other direction. She's 3300 or something. The only way I'm beating or drawing her without playing bullet and winning on time is with a lot of help from Stockfish.

In any case, what do you think the real ratings of the first 4 cats are? Am I the only one who thinks they are wildly overrated?

Avatar of 1cbb

beating mittens is so easy

Avatar of 1cbb

mittens is genuinely 1 elo

Avatar of pds314
1cbb wrote:

beating mittens is so easy

Beating her maybe. *checkmating* her is very hard.

Avatar of 1cbb

to beat mittens, just play 1+0 but to checkmate mittens, you just request the help of your 3600 elo friend

Avatar of llama36
pds314 wrote:

In any case, what do you think the real ratings of the first 4 cats are? Am I the only one who thinks they are wildly overrated?

Yeah, all the bots (not just cats) are massively overrated.

It's an open question why chess.com chooses to do that... I mean, obviously overrating them a little is a way to encourage people to use their site, which makes sense, but they're overrated by something like 800 points (just a guess) which is weird.

Avatar of pds314
llama36 wrote:
pds314 wrote:

In any case, what do you think the real ratings of the first 4 cats are? Am I the only one who thinks they are wildly overrated?

Yeah, all the bots (not just cats) are massively overrated.

It's an open question why chess.com chooses to do that... I mean, obviously overrating them a little is a way to encourage people to use their site, which makes sense, but they're overrated by something like 800 points (just a guess) which is weird.

It would be interesting if the lower rated bots (low enough that they wouldn't constitute a significant source of pollution for very very high rated players with their smaller number of opponents) played in random matchmaking. Only occasionally, but just enough that their ratings remained accurate.
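A toy simulation of that idea, as a sketch only: a bot listed at 1400 but with an assumed true strength of 850 plays rated games against opponents near its current rating, and its rating is adjusted after each game with a plain Elo update. Everything here is an assumption for illustration, including the K-factor of 32, the absence of draws, the opponents being accurately rated, and the use of plain Elo, which may not match the rating system chess.com actually uses.

```python
import random


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def simulate_bot(listed: float, true_strength: float,
                 games: int = 500, k: float = 32, seed: int = 0) -> float:
    """Return the bot's rating after `games` rated games (hypothetical model)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rating = listed
    for _ in range(games):
        # Matchmaking: opponent rated within 100 points of the bot's current rating.
        opponent = rating + rng.uniform(-100, 100)
        # The result depends on the bot's true strength, not its listed rating.
        p_win = expected_score(true_strength, opponent)
        score = 1.0 if rng.random() < p_win else 0.0
        # Plain Elo update against what the current rating predicted.
        rating += k * (score - expected_score(rating, opponent))
    return rating


final = simulate_bot(listed=1400, true_strength=850)
print(f"rating after 500 games: {final:.0f}")
```

In this model the rating falls quickly at first, because the bot keeps losing games its listed rating says it should split evenly, and then hovers around the assumed true strength.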

Avatar of 1cbb

https://www.chess.com/forum/view/game-showcase/mittens-vs-lichess-engine

I did 5 games where I forced the Lichess Analyser Practice Computer to play openings where it has a slight disadvantage but it still 4-1'ed Mittens.

Avatar of pds314

It would be great if there were a system you could plug games into to guess the Elo of each side. Perhaps with some time control information, so it knows whether the players have a lot of think time or not and can adjust its Elo guess accordingly. I'm not sure what that system would be, since a move that hangs mate in a few moves at Stockfish's level might actually be very strong at, say, 300 Elo. And likewise, Stockfish would have difficulty analysing a game played by a stronger Stockfish.