How to Evaluate Your Chess Progress (with Real Metrics).

By Oleh_Sych

If you want to improve at chess, it’s important to measure your current level somehow and track how it changes over time. Let’s forget for a moment whether we play stronger or weaker than other people. If you’re not a chess champion, many people will play stronger than you — so put that out of your head for now. What matters is whether you are better now than you were some time ago.

Let's talk about metrics and how to track them.

Chess is largely based on ratings: FIDE ratings (three categories — classical, rapid, blitz), national federation ratings, online ratings on platforms like Chess.com, Lichess, and ChessArena, puzzle ratings (Chess.com and Lichess), ratings inside different chess programs, and so on.

My coach always tells me: “Forget about ratings. Just play well, and the rating will follow.” I agree — rating shouldn’t be your goal at any cost. Still, rating is a great tool for understanding where you stand in chess.

Elo ratings

In chess we use the Elo system. It uses mathematics to predict the outcome between two players and to adjust their ratings after each game. Essentially, Elo is a way to compare playing strength and to predict game results. If ratings are equal, the expected result is roughly 50/50. If one player is about 600 Elo points higher, the formula gives the stronger player an expected score of about 97% (counting a win as 1 point and a draw as half a point). That is the theory. In practice, individual results can differ, because statistics is the science of large numbers and the prediction only shows up over many games. For example, if two players with a 600-point difference played 100 games, the higher-rated player would be expected to score about 97 points to the lower-rated player’s 3.
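The numbers above come from the standard Elo expected-score formula, E = 1 / (1 + 10^((R_opponent - R_player) / 400)). Here is a minimal sketch of it. The function names and the K-factor of 20 are my own illustrative assumptions, not the exact settings of FIDE or any online platform.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score for player A against player B (win = 1, draw = 0.5, loss = 0)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def updated_rating(rating: float, expected: float, actual: float, k: float = 20.0) -> float:
    """Rating after one game. k=20 is an assumed K-factor, used only for illustration."""
    return rating + k * (actual - expected)


if __name__ == "__main__":
    # Equal ratings: each side is expected to score half a point per game.
    print(round(expected_score(1500, 1500), 2))
    # 600 points higher: roughly 0.97 expected points per game.
    print(round(expected_score(2100, 1500), 2))
    # A 1500 player who beats another 1500 player gains k * (1 - 0.5) points.
    print(updated_rating(1500, expected_score(1500, 1500), 1.0))
```

The update rule also shows why beating a much weaker opponent barely moves your rating: the actual result is almost the same as the expected one.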

Elo is used beyond chess too — in sports like table tennis, some basketball ratings, and other games.

A player’s first rating can be assigned by default (many online platforms give a starter rating) or calculated from games against already-rated opponents (as in most over-the-board rating systems).

One question that often comes up is: can you compare a Chess.com rating with an official FIDE rating or a US Chess Federation rating? Theoretically yes, but there’s no exact conversion — many factors influence final ratings in each system: cheating, the pressure of over-the-board play (players often perform differently than online), time controls, pool of opponents, and other nuances.

I found a useful comparison table on ChessDojo that maps ratings across different systems. I like it, although in my personal case the estimated numbers didn’t match my real results: based on my Chess.com rating, the table suggested my FIDE rating should be about 200 points higher. Maybe I simply haven’t played enough rated tournaments yet.

Which metrics can we use?

I track four metrics to measure my chess progress:

1) FIDE rating

My primary metric is my FIDE Elo rating. FIDE is the International Chess Federation, and to get or change this rating you must play in FIDE-rated tournaments. FIDE has three main categories:

  • Classical — long games (typically at least 60 minutes per player). This is, in my view, the most important category — titles like Candidate Master, Master, and Grandmaster are based on classical play.
  • Rapid — games longer than 10 minutes but shorter than 60 minutes per player.
  • Blitz — games between 3 and 10 minutes per player.

 For me, Classical rating is the most important. Rapid is fun and useful for practice; Blitz I treat mainly as entertainment.

The minimum rating you can have in the FIDE system is 1400. If, for some reason, after a series of games your rating drops below 1400, you actually get a rating of zero — in other words, you become unrated.

As of now, the highest rating belongs to Magnus Carlsen (around 2839).

The Candidate Master title is typically associated with a rating of about 2200 in the open category, and the formal title requirements call for maintaining that level for 30 rated games in order to earn the title.

2) Chess.com Rapid rating

My second metric is my Chess.com Rapid rating. I chose Rapid because it’s the longest practical online time control that still allows serious play. Blitz and Bullet often turn into races against the clock, while Rapid gives enough time to think and make quality decisions.

Online you can play many games daily, so ratings are based on a much larger sample size. You can analyze opening statistics, rating graphs, and trends over time.

The downside is cheaters — but platforms like Chess.com actively fight cheating, review reports, and sometimes restore rating points when cheaters are caught. I’ve personally had rating points returned a few times after opponents were found using engine assistance.

3) Puzzle rating

The next metric is puzzle rating. Solving puzzles daily is a key part of training and helps improve tactical vision. Puzzle ratings go up when you solve correctly and down when you miss solutions. As your puzzle rating rises, the puzzles become harder, and your correct-solve percentage often falls — so the rating can plateau or even drop temporarily.

I track a moving average of my puzzle rating (I use a 7-day average). The moving average smooths spikes and troughs and shows the overall trend. 
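A minimal sketch of that 7-day moving average, assuming one puzzle-rating value per day. The function name is my own, and for the first few days (before a full week of data exists) it simply averages whatever values are available so far.

```python
from collections import deque
from typing import Iterable, List


def moving_average(values: Iterable[float], window: int = 7) -> List[float]:
    """Trailing moving average; early entries average over the values seen so far."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)  # holds at most `window` latest values
    running_sum = 0.0
    averages = []
    for value in values:
        if len(recent) == window:
            # The oldest value is about to fall out of the window.
            running_sum -= recent[0]
        recent.append(value)
        running_sum += value
        averages.append(running_sum / len(recent))
    return averages


if __name__ == "__main__":
    # Hypothetical daily puzzle ratings, made up for illustration.
    daily = [1800, 1835, 1790, 1810, 1860, 1845, 1820, 1875, 1840]
    for day, avg in enumerate(moving_average(daily), start=1):
        print(day, round(avg, 1))
```

A single bad day moves the 7-day average by only a seventh of the drop, which is why the smoothed line is easier to read than the raw rating.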

4) Playing vs. chess programs (computer rating)

Human games depend on opponent behaviour — you can’t control who you face, how often you play, or whether the opponent disconnects or cheats. Over-the-board I might play only 10–15 games per quarter (40–60 per year), which isn’t enough for smooth statistics. 
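To get a feeling for why 10–15 games is too few, here is a rough sketch of how much pure luck can move a result. It assumes independent games against equally rated opponents and ignores draws (draws would make the margin somewhat smaller). The function names are my own, and this is a back-of-the-envelope estimate, not how any federation computes ratings.

```python
import math


def score_margin(games: int, expected: float = 0.5) -> float:
    """Approximate one-standard-error margin of the average score over `games` games.

    Assumes independent games decided only by wins and losses.
    """
    return math.sqrt(expected * (1.0 - expected) / games)


def rating_gap(score: float) -> float:
    """Rating difference implied by an average score, by inverting the Elo formula."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_margin(games: int) -> float:
    """Rating-point equivalent of the score margin for an evenly matched player."""
    return rating_gap(0.5 + score_margin(games))


if __name__ == "__main__":
    for n in (12, 50, 100):
        print(n, "games:", round(elo_margin(n)), "rating points of noise")
```

Under these assumptions the noise is roughly 100 rating points for 12 games and about 35 points for 100 games, so a single quarter of over-the-board results says little about the real trend.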

Playing online also has its own distortions. For example, you can meet a lot of cheaters who are never detected by the platform, and your rating will go down. Or you can get a lot of easy wins because your opponent dropped the connection, or stopped caring about the game after one bad move and simply disconnected. That means you can’t trust such measurements 100%.

Computers can help by giving a consistent benchmark.

For a calibrated “PC rating” I use HIARCS (because I’m on a Mac, and in my view it is the best chess program for the Mac platform). It is available for laptop/desktop and for iPad as well. HIARCS Chess Explorer adapts to your level, simulates opponents of similar strength, and computes an Elo-like rating from your results.

For example, the program once estimated my rating at 1327 Elo and then simulated opponents around that rating; based on results, the program updates your rating.

Whether this rating correlates exactly with FIDE or online ratings is less important. The developers say it approximates a FIDE rating, but even if they are wrong, that doesn’t matter here. What matters is the trend: is it going up or down? To get a reliable baseline, play at least 10 games with the program so it can calibrate your rating.

Summary & practical advice

In real life I work in IT and I love numbers, statistics, and graphs. To track my chess progress I place four graphs side by side — Chess.com, FIDE, puzzle rating, and HIARCS Elo. Together they reveal patterns.

My metrics are not set in stone. This is a methodology, not dogma. Key recommendations:

  • Record your measurements regularly — even when you’re not training heavily.
  • Don’t measure too often; keep consistent intervals. I log my values once a week, every Sunday.
  • Use multiple criteria that cover different aspects of the game.
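The recommendations above can be sketched as a small weekly log. Everything here is a hypothetical layout of my own: the field names, the CSV format, and the sample numbers are made up for illustration, so adapt them to the metrics you actually track.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date
from typing import List


@dataclass
class WeeklyEntry:
    """One Sunday snapshot of the four metrics (field names are hypothetical)."""
    day: date
    fide_classical: int
    chesscom_rapid: int
    puzzle_avg_7d: float
    hiarcs_elo: int


def to_csv(entries: List[WeeklyEntry]) -> str:
    """Serialize the log to CSV text, one row per week."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(WeeklyEntry)])
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


def change_over(entries: List[WeeklyEntry], metric: str, weeks: int = 4) -> float:
    """Difference between the latest value and the value `weeks` entries earlier."""
    if len(entries) <= weeks:
        raise ValueError("not enough entries for the requested span")
    return getattr(entries[-1], metric) - getattr(entries[-1 - weeks], metric)


if __name__ == "__main__":
    # Made-up sample values, one entry per Sunday.
    log = [
        WeeklyEntry(date(2025, 1, 5), 1520, 1610, 1805.0, 1327),
        WeeklyEntry(date(2025, 1, 12), 1520, 1625, 1818.5, 1340),
        WeeklyEntry(date(2025, 1, 19), 1534, 1602, 1830.0, 1352),
    ]
    print(to_csv(log))
    print("Rapid change over 2 weeks:", change_over(log, "chesscom_rapid", weeks=2))
```

Keeping the log as plain CSV means any spreadsheet or plotting tool can draw the four graphs from it later.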

A few notes on what each metric shows:

  • Chess.com reflects frequent online play in comfortable conditions and is useful for experimenting with openings.
  • FIDE is like an exam: over-the-board, classical time control, and usually shows your maximal practical strength.
  • Puzzle rating measures tactical problem-solving ability — essential and trainable.
  • Computer rating gives a clear, repeatable benchmark and is useful for steady calibration.

Find your own set of evaluation criteria and track them. Seeing progress visually is hugely motivating and proves your time is well spent — that you’re moving closer to your goal.

Stay with me, subscribe to the channel, and let’s go on this chess journey together. Move by move toward the goal.