
Glossary of Computer Chess
(Note: Similar lists apparently exist in numerous places on the internet, but they all seem to come from open-source projects. The Lc0 project has its own version, for example, where many of the definitions are the same as here. But with continued additions and editing, I feel comfortable posting this here.)
Hello everyone!
One of the barriers to entry in any field is the jargon: words that only have meaning within a specialty, or that have a different (often technical) meaning than they do in normal conversation. Chess engines are loaded with such jargon, to the point that it can be intimidating to enter conversations: what is an LR drop? How are AB, MCTS, and NN related? For your reading and reference pleasure, here is a glossary of words that are useful to know. They range from concepts you probably know to concepts that I don't fully understand. Thanks to Cscuile (Mr. Antifish) and contributors.
The focus here is on neural-network engines, since they are the most mysterious. Soon AB engines will be included (I'll keep updating!). Let me know if something is missing or confusing, and I'll edit!
| Term | Definition |
| --- | --- |
| 6b/10b/20b/40b (See Residual Block) | Shorthand for 6/10/20/40 residual blocks. |
| 20x256 | Shorthand for the size of the NN: 20 residual blocks and 256 filters. |
| Alpha Zero (A0) | You really should just read the papers. |
| Alpha-Beta Search (AB) | Minimax search enhanced to prune branches that cannot affect the final choice of move. (A minimal sketch appears below the glossary.) |
| Attack-set | The set of pseudolegal moves available to a piece in a given game-state, plus the friendly pieces defended by that piece. Compare to "Move-set." |
| Backend | The neural-network computational backend to use, downloadable from the Lc0 website. As a rough guide: CUDA for modern NVIDIA GPUs, OpenCL for all other GPUs, and BLAS if no GPU. |
| Batch Size | How many positions the GPU can train on simultaneously. |
| Bestmove | The move with the most visits. |
| Binary | In relation to NN engines, the 'program' (or executable) part of the engine. Does not include the net used in the evaluation function; that is stored in a separate file. |
| Bitboard | A 64-bit value, one bit per square, describing the locations of one type of piece (P,N,B,K,R,Q, p,n,b,k,r,q) in one position. Called a 'board' because it is conceptualized as an 8x8 grid. (See the bitboard sketch below the glossary.) |
| Blessed Loss | (the losing side's view of a Cursed Win) A position in which checkmate could be forced against you, but which is a tablebase draw because of the 50-move rule. |
| Centipawn | Output of traditional AB engine evaluation. An indication of which side is 'winning'. A value of 100 centipawns (+1.00) means that white is winning by approximately one pawn. Negative evaluations indicate black is winning. |
| Certainty propagation | Min-maxing nodes whose values are certain (proven wins, losses, tablebase results, etc.) instead of averaging them. It gives much better strength when the search is 3 to 10 ply away from transitioning into the tablebase. |
| Convolutional Neural Network (CNN) | The largest part of Lc0's NN uses this architecture, which convolves (slides) several 3x3xN filters across the entire board. |
| CPU | Central Processing Unit; used for 'normal' work in most computers. The type of processor used by AB engines and Komodo-MC; also used to a limited extent by NN engines. Compare to GPU and TPU. |
| Cpuct | A constant controlling the "UCT search" algorithm. Higher values promote more exploration and wider search; lower values promote more confidence and deeper search. |
| Cursed Win | (the winning side's view of a Blessed Loss) A position from which checkmate can be forced, but which is a tablebase draw because of the 50-move rule. |
| Depth (See Seldepth) | Number of half-moves (ply) from the root node that the engine is evaluating. |
| DTM | "Distance to Mate"; one type of information that can be stored in a tablebase: the number of moves until checkmate with best play. |
| DTZ | "Distance to Zeroing"; one type of information that can be stored in a tablebase. Shows the number of moves the winning side must make before an irreversible move (a capture or pawn move, resetting the 50-move counter) is played. |
| Elo | A method for calculating the relative skill levels of a pool of players in zero-sum games. Relative within a pool, I say! Engine Elos are not comparable to human Elos, and should not be mistaken as such. (See the expected-score example below the glossary.) |
| Evaluation Function | The part of an engine that examines a position and returns a value, in centipawns (handwritten evaluations) or Win% (neural networks). |
| Fawn | "Fawn Pawn"; "Thorn Pawn." A pawn installed near the opponent's king, most commonly on h6 or h3. Commonly used by Lc0. |
| Fully Connected (FC) Layer (See Filters) | Unlike filters, which only look at a 3x3 area of the board, the FC layer looks at the entire board at once. This is used in the policy and value heads. |
| Filters | A 3x3xN pattern; the N means it is looking at several input features. A "20x256" network has 256 3x3x256 filters in each layer. |
| First Play Urgency (FPU) Reduction | Normally, when a move has no visits, its evaluation is assumed to be equal to the parent's evaluation (that is, the position is no better or worse for either side after the previous move). With non-zero FPU reduction, the evaluation of an unvisited move is decreased by some value, discouraging visits to unvisited moves and saving those visits for more promising moves. |
| Fishtest | The Stockfish project's framework for testing new patches. Testing requires a huge amount of volunteer CPU time; this is managed in fishtest. |
| Game Tree | The set of all possible moves and countermoves in a game of chess, arranged in a graph from the starting position. Contains an extremely large number of positions. |
| GPU | Graphics Processing Unit; a specialized processor traditionally used in image processing. Its circuitry is efficient in processing well-organized data in parallel. Used for the evaluation function in neural-network engines (excluding AlphaZero). See also CPU and TPU. |
| Learning Rate (LR) | How fast the neural-net weights are adjusted during training. Too low and you waste GPU time training; too high and you can undo what you have already learnt. |
| Leela Ratio | Ratio to estimate the 'fair' speed, in nps, of CPU engines in relation to GPU engines. Often valued as 875, in reference to DeepMind's AlphaZero paper. |
| Max Prefetch | When the engine cannot gather a large enough batch for immediate use, it prefetches up to this number of positions that are likely to be useful soon and puts them into the cache. |
| Mean Squared Error (MSE) of the value (V) NN output | One of the terms the NN training process tries to minimize. The NN improves in a feedback loop by trying to predict who wins each self-play game. The mean squared error of the predictions is the "loss". See also Policy Loss and Regularization Term. (See the loss sketch below the glossary.) |
| Minibatch Size | How many positions the engine tries to batch together for parallel NN computation. |
| Monte Carlo Tree Search (MCTS) | Algorithm for deciding which moves and resulting positions should be evaluated, and how to weight their evaluations in ultimately choosing a move. Used by AlphaZero, Lc0, and Komodo MCTS. Compare to AB pruning. (See the PUCT selection sketch below the glossary.) |
| Move Overhead | Amount of time, in milliseconds, that the engine subtracts from its total available time to compensate for slow connection, interprocess communication, etc. The engine will make a move with at least this much time remaining to avoid flagging. |
| Move-set | The set of pseudolegal moves available to a piece in a given game state. Compare to "Attack-set." |
| MultiPV (See UCI) | Number of game play lines (principal variations) to show in UCI info output. |
| Neural Network (NN) | Used by AlphaZero and Lc0. The NN is a set of numeric "weights" that can be used to evaluate a position. Contrasted with traditional AB evaluations. |
| NNCache | Stores NN evaluations, which can be reused if the same position is reached by transposition. |
| Nodes | A potential game position in the tree of future gameplay. The root node of the tree is the current position. |
| Noise | Randomness added to the root node's prior probabilities according to a Dirichlet distribution. This allows the engine to discover new ideas during training by exploring moves that it thinks are bad. Off by default in regular play. (See the noise sketch below the glossary.) |
| Nodes per second (NPS) | Includes NNCache hits and terminal-node hits, but excludes nodes carried over from tree reuse. In other words, total playouts generated per second. |
| Overfitting | If the network trains on the same positions too much, or with too low a learning rate, it may memorize those positions and not generalize well to other, similar positions. |
| P (See Policy Head) | The neural network's raw policy output: the prior probability that a given move is the best one. |
| Parallelism | Number of games to play in parallel. |
| Playout | In MCTS, a playout starts at the current position and follows this algorithm: 1) Pick a move to explore according to the policy and the confidence of the choices. 2) Descend to the resulting child node. 3) If this child node has already been explored at least once, repeat steps 1 and 2 from there. 4) Otherwise, evaluate this node via the neural network, creating value and policy estimates for this position, and use the new value estimate to update every node on the path back to the root. (See the PUCT selection sketch below the glossary.) |
| Plane | One set of 12 bitboards that describes the piece locations in one position. |
| Policy Head | The part of the NN that, for each legal (and illegal!) move, outputs the probability that the move is the best move. |
| Policy Loss | One of the terms the NN training process tries to minimize. The NN improves in a feedback loop by trying to predict what the self-play games say the policy should be. The amount the NN differs from the self-play game outputs is the "loss". See also MSE Loss and Regularization Term. |
| Policy Masking | Removing illegal moves from consideration without requiring the neural network to assess them. |
| Policy Sharpening | Exponent applied to the visit distribution to emphasize moves with more visits. |
| Principal Variation (PV) (See MultiPV) | The moves that would be played if, at each level, we chose the node with the most visits (i.e. most playout traversals). Assumes 'optimal' play from the opponent. |
| Pruning | Reducing the number of positions to be evaluated by eliminating moves that are unlikely to be strong, much as a human identifies candidate moves. This allows engines to look more deeply into promising lines. |
| Pseudolegal move | A move that might be legal, but has not yet been tested for leaving the friendly king in check. |
| Puct | Scales the relative contributions of evaluation and policy when choosing which moves should be further analyzed. (See the PUCT selection sketch below the glossary.) |
| Q (See Value) | Expected value output of the NN, ranging from -1 (100% black win) to 1 (100% white win). Contrasted with Z during selfplay. |
| Regularization Term | The L2-norm regularization term. The feedback training of the NN tries to minimize this (along with Policy Loss and MSE loss), aiming to improve the generalizability of the NN and prevent overfitting. |
| Residual Block | A popular NN architecture that combines two CNN layers with a skip connection. A "20x256" network has 20 residual blocks. This is slightly different from the DeepMind AlphaZero paper, wherein a "20x256" network has 19 residual blocks. (See the residual-block sketch below the glossary.) |
| Retrograde Analysis | Process for calculating tablebases, in which checkmating positions are examined first and other positions are labelled by working backwards. |
| Sampling Ratio | How many times each position from self-play games is used for training. Too high and your net may overfit and not be generalizable; too low and your progress is unnecessarily slow. |
| Self Elo | Leela's ego. A way to track the progress of a training run and make sure subsequent nets aren't getting worse with more training. No relation to any other Elo scale. |
| Selfplay | The training phase of AlphaZero's or Lc0's development, in which copies of the engine play each other and learn from their games. Used to obtain neural-net weights. |
| Smart Pruning Factor | Avoids spending time on moves that cannot become the best move in the remaining search time. When no other move can overtake the current best move, the search stops. |
| Policy softmax temperature | Higher values make the prior probabilities of move candidates closer to each other, widening the search. (See the softmax-temperature sketch below the glossary.) |
| Piece-Square Table (PSQT) | In a handwritten evaluation function, contains value adjustments for each piece on each square. Must be continuously optimized. (See the PSQT sketch below the glossary.) |
| Syzygy | The de facto standard tablebase format, configured through UCI options; includes DTZ information. Does not include trivial positions. |
| Tablebase | A database of positions in which the outcome is perfectly known, and which can guarantee perfect play. Currently tablebases cover all positions up to 7 pieces, counting both kings. Can require huge amounts of disk space to store. |
| Temperature Decay (Tempdecay) | Reducing Temperature after a certain number of moves has been played, usually to 0, so that afterwards the engine plays only the moves it sees as best. Not enabled by default. |
| Temperature | A parameter that influences move selection. If equal to 0, the move assessed as "best" during tree search is always made. Larger values increase the probability of choosing other moves. Typically nonzero in self-play training games (to diversify play) and 0 in regular match play. |
| Terminal node | A node that has a game result (checkmate or draw). |
| Test10 (or 20, 30, 40...) | An individual training run by the Lc0 team, with its own parameters. Each generates a large number of successively-numbered nets. Test10, 30, 50, etc. are experimental, while Test20, 40, etc. are more 'serious'. Test20 was a spectacular failure. |
| Threads | Number of (CPU) worker threads to use. |
| TP | Trade Penalty; a programmed avoidance of otherwise-equal trades. Helps an engine avoid simplification to draws. |
| TPU | Tensor Processing Unit; Google's proprietary type of processor designed specifically for machine learning. As of this writing they cannot be purchased (although one can rent time on one). Used by AlphaZero in training and in its games against Stockfish. |
| Trade Avoidance | Otherwise known as contempt; modifies evaluations slightly to discourage trades unless they improve the position. Included to avoid draws against weaker opponents. |
| Train/Test Sets | NN training typically splits the data into two sets: a training set, used to improve the NN weights, and a test set, used to check whether the NN can generalize to positions it has not encountered. |
| Universal Chess Interface (UCI) | Communication protocol that allows chess engines to interact with user interfaces, such as Arena or Cutechess. Nearly all modern engines are UCI-compliant. (See the sample UCI exchange below the glossary.) |
| Upper Confidence Bound (U or UCB) | The part of the PUCT formula that encourages exploring moves that have not been searched much yet. See also the other half, Q. |
| Upper Confidence Bound 1 applied to trees (UCT) | An enhancement to MCTS used by AlphaZero. |
| Value (V) | Expected value output of the NN, ranging from -1 (100% black win) to 1 (100% white win). |
| Visits (n) | Total number of playouts that have traversed a node. This is equal to the total size of the tree below the node, unless playouts have hit a terminal node, in which case: visits = nodes + playouts_that_hit_terminal_node. |
| Z | Final result of a selfplay game (+1, 0, or -1). |
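
The sketches below are the ones referenced from the glossary entries above. They are illustrative only: plain Python written for this post, not code from any actual engine.

First, alpha-beta pruning, as a minimal negamax search over a toy game tree (nested lists, where a number is a leaf evaluation from the perspective of the player to move at that leaf):

```python
def alphabeta(node, alpha, beta):
    if isinstance(node, (int, float)):
        return node                    # leaf: static evaluation
    for child in node:
        # Negamax: negate the child's score and flip the window, because
        # what is good for one side is bad for the other.
        score = -alphabeta(child, -beta, -alpha)
        if score >= beta:
            return beta                # cutoff: the opponent won't allow this line
        alpha = max(alpha, score)      # best score proven so far
    return alpha

tree = [[3, 17], [2, 12], [15]]        # three moves, each with possible replies
print(alphabeta(tree, float("-inf"), float("inf")))  # -> 15
```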
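A bitboard, using one common square-numbering convention (a1 = bit 0, h8 = bit 63); the helper below is hypothetical, just to show how bits map to squares:

```python
WHITE_PAWNS_START = 0x000000000000FF00   # one bit set for each pawn on rank 2

def squares(bb):
    """Yield the square names encoded in a bitboard."""
    while bb:
        sq = (bb & -bb).bit_length() - 1   # index of the lowest set bit
        yield "abcdefgh"[sq % 8] + str(sq // 8 + 1)
        bb &= bb - 1                       # clear that bit and continue

print(list(squares(WHITE_PAWNS_START)))   # ['a2', 'b2', ..., 'h2']
```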
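The standard Elo expected-score formula, which is why ratings are only meaningful relative to a pool:

```python
def expected_score(r_a, r_b):
    """Expected score of player A against player B."""
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))

print(round(expected_score(2900, 2800), 3))   # a 100-point edge -> ~0.64
```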
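The three training terms (Policy Loss, the value MSE, and the Regularization Term) combined into one objective. A toy numpy version; the weighting constant c is illustrative, not a real training setting:

```python
import numpy as np

def total_loss(policy_target, policy_out, z, v, weights, c=1e-4):
    # Cross-entropy between the self-play visit distribution and the NN policy.
    policy_loss = -np.sum(policy_target * np.log(policy_out + 1e-12))
    # Squared error between the game result Z and the value-head output V.
    value_loss = (z - v) ** 2
    # L2 penalty over all weight tensors, to discourage overfitting.
    reg_term = c * sum(np.sum(w ** 2) for w in weights)
    return policy_loss + value_loss + reg_term

print(total_loss(np.array([1.0, 0.0]), np.array([0.7, 0.3]),
                 z=1.0, v=0.6, weights=[np.ones((2, 2))]))
```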
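The PUCT selection rule referenced under MCTS, Playout, Cpuct, and Puct: score = Q + cpuct * P * sqrt(N_parent) / (1 + N_child). The node fields below are illustrative names, not Lc0's internals:

```python
import math

def select_child(children, n_parent, cpuct=1.5):
    def puct(c):
        # Q: average evaluation of this move so far (0 if unvisited; real
        # engines apply an FPU reduction here instead).
        q = c["value_sum"] / c["visits"] if c["visits"] else 0.0
        # U: exploration bonus, large for high-prior, little-visited moves.
        u = cpuct * c["prior"] * math.sqrt(n_parent) / (1 + c["visits"])
        return q + u
    return max(children, key=puct)

children = [
    {"move": "e2e4", "prior": 0.40, "visits": 10, "value_sum": 4.0},
    {"move": "d2d4", "prior": 0.35, "visits": 2,  "value_sum": 1.1},
    {"move": "g1f3", "prior": 0.25, "visits": 0,  "value_sum": 0.0},
]
print(select_child(children, n_parent=12)["move"])   # -> g1f3
```

Note that the unvisited move wins the selection despite having the lowest prior: the U term dominates until a move has accumulated some visits, which is exactly the exploration behaviour Cpuct controls.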
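Root noise, as in the Noise entry: the priors are mixed with a Dirichlet sample. The eps and alpha values here are the AlphaZero paper's published settings for chess, not necessarily Lc0's defaults:

```python
import numpy as np

def add_root_noise(priors, eps=0.25, alpha=0.3, seed=None):
    rng = np.random.default_rng(seed)
    noise = rng.dirichlet([alpha] * len(priors))   # random distribution over moves
    return (1 - eps) * np.asarray(priors) + eps * noise

print(add_root_noise([0.5, 0.3, 0.2]))   # still sums to 1, but perturbed
```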
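A residual block, sketched with PyTorch purely for readability (neither AlphaZero nor Lc0 is implemented in PyTorch): two 3x3 convolutions with batch normalization, plus a skip connection around the pair:

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, filters=256):                 # 256 filters, as in "20x256"
        super().__init__()
        self.conv1 = nn.Conv2d(filters, filters, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(filters)
        self.conv2 = nn.Conv2d(filters, filters, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(filters)
        self.relu = nn.ReLU()

    def forward(self, x):
        y = self.relu(self.bn1(self.conv1(x)))
        y = self.bn2(self.conv2(y))
        return self.relu(x + y)                      # the skip connection
```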
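Softmax with a temperature parameter, as in the Policy softmax temperature and Temperature entries: T = 1 leaves the distribution alone, larger T flattens it, and T -> 0 approaches always picking the top move:

```python
import numpy as np

def softmax_t(logits, t=1.0):
    z = np.asarray(logits) / t
    e = np.exp(z - z.max())        # subtract the max for numerical stability
    return e / e.sum()

print(softmax_t([2.0, 1.0, 0.5], t=1.0))   # peaked on the first move
print(softmax_t([2.0, 1.0, 0.5], t=4.0))   # flattened: wider search
```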
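A piece-square table from a handwritten evaluation. The knight values here are made up, but they show the typical shape: centralized knights get a bonus, rim knights a penalty:

```python
# Indexed by square, a1 = 0 ... h8 = 63. Values in centipawns.
KNIGHT_PSQT = [
    -50, -40, -30, -30, -30, -30, -40, -50,
    -40, -20,   0,   5,   5,   0, -20, -40,
    -30,   5,  10,  15,  15,  10,   5, -30,
    -30,   0,  15,  20,  20,  15,   0, -30,
    -30,   5,  15,  20,  20,  15,   5, -30,
    -30,   0,  10,  15,  15,  10,   0, -30,
    -40, -20,   0,   0,   0,   0, -20, -40,
    -50, -40, -30, -30, -30, -30, -40, -50,
]

def knight_value(square):
    return 300 + KNIGHT_PSQT[square]   # base value plus positional adjustment

print(knight_value(28))   # e4: 300 + 20 = 320
```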
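Finally, a schematic UCI exchange between a GUI and an engine. The engine name and the info line are invented for illustration; the commands themselves are standard UCI:

```
GUI -> engine:  uci
engine -> GUI:  id name ExampleEngine
                uciok
GUI -> engine:  position startpos moves e2e4
GUI -> engine:  go movetime 1000
engine -> GUI:  info depth 12 score cp -20 pv e7e5 g1f3 b8c6
engine -> GUI:  bestmove e7e5
```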