Is there any metric to rate the overall complexity of a position/game?

Avatar of DrFleonSunoco

Here's what our all-knowing AI friend had to say (quite interesting):

Yes, there are emerging metrics and research-backed approaches that aim to quantify the complexity of a chess position from a human perspective, and your intuition is spot-on. Here are some of the most promising ideas:

🧠 1. Fragility Score

A recent study introduced the Fragility Score (F), which measures the tension within a position by analyzing the interaction graph of pieces. It reflects how sensitive a position is to small changes—i.e., how easily a single inaccurate move can lead to a drastic shift in evaluation.

  • High fragility = sharp, tactical, and volatile positions.
  • Low fragility = stable, strategic, or drawish positions.
  • Fragility tends to peak around move 15, often aligning with critical turning points.

📊 2. Decisiveness Metric

Another approach uses engine evaluations to define a decisiveness metric, which quantifies how much a move influences the outcome of the game:

  • If many moves are playable but only one avoids a steep drop in evaluation, the position is highly decisive and thus complex.
  • Players tend to perform better in positions with clear decisiveness, suggesting that ambiguity increases difficulty.
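One way to make the "only one move avoids a steep drop" idea concrete is to look at the gap between the best and second-best engine evaluations. The sketch below is a simplification and not the metric from the research alluded to above (no formula is given in this thread); the function name `decisiveness_gap` and the move-to-centipawn dictionary are assumptions made for illustration.

```python
def decisiveness_gap(evals_cp):
    """Return the centipawn gap between the best and second-best move.

    evals_cp maps each legal move to an engine evaluation in centipawns,
    from the point of view of the side to move (hypothetical input format).
    """
    if len(evals_cp) < 2:
        return 0  # a forced move leaves no decision to make
    ranked = sorted(evals_cp.values(), reverse=True)
    return ranked[0] - ranked[1]


# Made-up evaluations, for illustration only.
quiet = {"Nf3": 30, "d4": 25, "h4": -80}
sharp = {"Qxf7+": 450, "Nc3": -120, "d3": -150}
print(decisiveness_gap(quiet))  # small gap: several moves are fine
print(decisiveness_gap(sharp))  # large gap: only one move holds
```

A large gap means a single move stands out from the rest; a small gap means the choice between the top moves matters little to the engine.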

πŸ” 3. Move Ambiguity and Refutation Depth

From a practical standpoint, complexity can be estimated by:

  • Number of reasonable candidate moves (e.g., those evaluated within 0.5 pawns of the engine's best move).
  • Depth of refutation: If a suboptimal move is only punished after 5+ precise replies, it's harder for humans to spot.
  • Branching factor: More plausible continuations = more cognitive load.
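The first two estimates in this list can be sketched in a few lines, assuming engine evaluations have already been collected. Everything here is hypothetical scaffolding: the function names, the centipawn inputs, the 50-centipawn window (0.5 pawns) and the 100-centipawn drop threshold are illustrative choices, and "refutation depth" is approximated as the shallowest search depth at which the engine sees the move failing.

```python
def count_candidates(evals_cp, window_cp=50):
    """Count moves evaluated within window_cp of the best move.

    evals_cp maps each legal move to a centipawn evaluation from the
    side to move's point of view (hypothetical input format).
    """
    best = max(evals_cp.values())
    return sum(1 for score in evals_cp.values() if best - score <= window_cp)


def refutation_depth(evals_by_depth_cp, baseline_cp, drop_cp=100):
    """Return the shallowest search depth at which a move's evaluation
    falls at least drop_cp below baseline_cp, or None if it never does.

    evals_by_depth_cp maps search depth (in plies) to the evaluation of
    the position after the move under test.
    """
    for depth in sorted(evals_by_depth_cp):
        if baseline_cp - evals_by_depth_cp[depth] >= drop_cp:
            return depth
    return None


# Made-up numbers, for illustration only.
moves = {"e4": 35, "d4": 30, "Nf3": 20, "g4": -60}
trap = {2: 30, 6: 20, 10: -140}  # the move looks fine until depth 10
print(count_candidates(moves))
print(refutation_depth(trap, baseline_cp=35))
```

The candidate count also serves as a rough stand-in for the branching factor of plausible continuations, since it ignores moves the engine already considers clearly worse.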

Avatar of 15Symphonies
This makes perfect sense to me.

Avatar of mikewier

The term “tension” usually describes positions in which multiple pawn or piece exchanges are possible. At every move, both players must judge whether it is better to break the tension, by exchanging or moving the pawns or pieces, or to maintain it.

Tension is one aspect of complexity that is easily measured. Although high or low tension is not necessarily related to the value of the position, high tension often leads lower-rated players to make mistakes: the more decisions and judgments they must make, the more likely they are to err.
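As a rough illustration of how tension might be measured, one could count the captures available to each side. The sketch below assumes the legal moves of both sides are already available as SAN strings (where a capture contains an "x"); the function names and the input format are hypothetical, and counting captures is only one possible proxy for tension.

```python
def count_captures(moves_san):
    """Count the moves in a list of SAN strings that are captures."""
    return sum(1 for move in moves_san if "x" in move)


def tension(white_moves_san, black_moves_san):
    """Total number of captures available to both sides."""
    return count_captures(white_moves_san) + count_captures(black_moves_san)


# Position after 1. e4 d5; move lists truncated for brevity.
white_moves = ["exd5", "e5", "Nc3", "d4"]
black_moves = ["dxe4", "d4", "Nf6", "e6"]  # as if Black were to move
print(tension(white_moves, black_moves))
```

Tracking this number move by move would show where tension builds up and where one side chooses to release it.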

Avatar of mikewier

Try this on for size: program Stockfish to find positions that have the greatest change in evaluation from move n to move n+1 (the horizon effect). We could start from a set of known positions, say those from the last 10 world championship matches, then find the positions with the greatest horizon-effect changes, up through a horizon of 20 moves.

We could then have a group of grandmasters review the positions to see what they have in common.

Perhaps this could provide an objective measure of complexity, as used by the OP.
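A minimal sketch of the ranking step, assuming the evaluations have already been collected into a list indexed by n (search depth or move number; the proposal leaves this open). The function name `largest_swings` and the sample numbers are made up for illustration, and no engine is called here.

```python
def largest_swings(evals_cp, top=3):
    """Return the `top` largest evaluation changes as (n, swing) pairs.

    evals_cp[0] is the evaluation at n = 1, evals_cp[1] at n = 2, and so
    on, in centipawns. A swing at n is |eval(n + 1) - eval(n)|.
    """
    swings = [
        (n, abs(after - before))
        for n, (before, after) in enumerate(zip(evals_cp, evals_cp[1:]), start=1)
    ]
    # Largest swing first; ties are broken by the smaller n.
    swings.sort(key=lambda item: (-item[1], item[0]))
    return swings[:top]


# Made-up evaluations, for illustration only.
evals = [20, 25, 30, -150, -140, 60]
print(largest_swings(evals, top=2))
```

Running this over many games and keeping the positions with the biggest swings would give the set of positions for the proposed grandmaster review.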

Avatar of Cythaera
Dr. Fleon, who authorized your use of Vonnegut's self-portrait for your PFP?

(kidding, I love it)

What was this goofy thread about?
Avatar of 15Symphonies
mikewier wrote:

The term “tension” usually describes positions in which multiple pawn or piece exchanges are possible. At every move, both players must judge whether it is better to break the tension, by exchanging or moving the pawns or pieces, or to maintain it.

Tension is one aspect of complexity that is easily measured. Although high or low tension is not necessarily related to the value of the position, high tension often leads lower-rated players to make mistakes: the more decisions and judgments they must make, the more likely they are to err.

Thank you for chiming in. I infer from what you write that tension could also be used to take advantage of a time imbalance, and that this is perhaps one of its most important uses (in addition to any genuine fog-of-war implications).

Avatar of DrFleonSunoco
Cythaera wrote:
Dr. Fleon, who authorized your use of Vonnegut's self-portrait for your PFP?
(kidding, I love it)
What was this goofy thread about?

Oh, good to finally encounter a person who spotted this! I have this tattooed on my hand too.
P.S. DrFleonSunoco is an obscure Vonnegut character too.