On the number of chess positions

Sort:
tygxc

I would like to point out an applications of your research

You found a way to express all possible chess positions in 19 byte and thus all legal, sensible positions in 17 byte. The standard FEN notation for the initial position rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1 
counts 56 ASCII characters i.e. bytes. Hence there must exist a notation 3 times more efficient than FEN in terms of both storage and parsing needed. Most convenient would be if that notation represents the initial position as 0.

johntromp
tygxc wrote:

Hence there must exist a notation 3 times more efficient than FEN in terms of both storage and parsing needed.

A ranking integer is about the worst possible notation to parse, taking over 20 minutes of computation:-(

tygxc

#42
That depends on the way you rank. There should be a ranking integer easy to parse. If that is found, then there should exist a bijection between your ranking integer and the easier to parse ranking integer.

s_sajwani123
johntromp wrote:

I wrote a Haskell program to compute an upper bound on the number of possible chess positions, appended below. Its output is

fixwr=0 fixbr=0 ep=0 4317116501858047620299900728599356147494556640
fixwr=0 fixbr=0 ep=1 31999595200733582973106880061728861929069928
fixwr=0 fixbr=1 ep=0 6922142764395483618137561107749568789790148
fixwr=0 fixbr=1 ep=1 54444384271688044810990508417111948991768
fixwr=0 fixbr=2 ep=0 136530769984693307040227830128737122354029
fixwr=0 fixbr=2 ep=1 1035365889143551932685537903363800096422
fixwr=1 fixbr=0 ep=0 6922142764395483618137561107749568789790148
fixwr=1 fixbr=0 ep=1 54308353407259673025385006313129004055128
fixwr=1 fixbr=1 ep=0 11745419798256512510493235052589222172668
fixwr=1 fixbr=1 ep=1 98172517157950055940864091510815802248
fixwr=1 fixbr=2 ep=0 235958281122206691085936171885340863152
fixwr=1 fixbr=2 ep=1 1903336650826558863672909067902430108
fixwr=2 fixbr=0 ep=0 136530769984693307040227830128737122354029
fixwr=2 fixbr=0 ep=1 1028637537652354045548219736317757984742
fixwr=2 fixbr=1 ep=0 235958281122206691085936171885340863152
fixwr=2 fixbr=1 ep=1 1895642836539897140574420164647093916
fixwr=2 fixbr=2 ep=0 4729971278292293446735355275667009679
fixwr=2 fixbr=2 ep=1 36635290891989131864827262732080222
total positions: 8726713169886222032347729969256422370854716254

The 3*3*2 = 18 different counts, all assuming white to move, distinguish the number of fixed white rooks (i.e. ways that white can castle), the number of fixed black rooks, and whether a white pawn can capture a black pawn en-passant.

The final number (call it N ~ 8.7E45) is twice the sum of these 18 numbers, to account for positions with black to move. This is a 2x improvement over the 1.77894E46 bound obtained by Shirish Chinchalkar in 1996.

 

More importantly, I was able to write a much more complex Haskell program that can efficiently map back and forth between positions and integers 0..N-1. This allows me to sample positions at random to determine the fraction that is legal. From a sample of size 100, I found 8% to be legal, which suggests a count of (very roughly) 0.08*N ~ 7E44 legal chess positions. To have some confidence in a correct first digit, a much larger sample is needed though, between 1000 and 10000 random positions. I've just completed generating sets of both sizes,

and am currently writing code to help analyze them.

I'll be happy to share the list of 1000 random FENs if people are interested.

Feedback is more than welcome.

-John

 

Haskell source code:

{-# LANGUAGE TupleSections #-}
module CountChess where

import Control.Monad
import Data.Array
import qualified Data.Map as M

-- tuple of #pieces #pawns #promotions #factorial_product
data Army = Army !Int !Int !Int !Integer deriving (Eq, Ord, Show)

-- given a number of fixed pawns (normally 0, but 1 with fixed en-passant)
-- and a number of fixed rooks (normally 0, but 1 or 2 with castling)
-- return list of possibly armies
_armies :: Int -> Int -> [Army]
_armies fixr fixp = do
let ur = 2 - fixr -- number of unfixed rooks
k <- [ur `div `2] -- number of kings
let np0 = 8 - fixp -- number of promotable pawns
q <- [0..1+np0] -- number of queens
let np1 = np0 - max (q-1) 0 -- adjust for queen promotions
r <- [0..ur+np1] -- number of rooks
let np2 = np1 - max (r-ur) 0 -- adjust for rook promotions
b <- [0..2+np2] -- number of bishops
let np3 = np2 - max (b-2) 0 -- adjust for bishop promotions
n <- [0..2+np3] -- number of knights
let np4 = np3 - max (n-2) 0 -- adjust for knight promotions
let proms = np0 - np4 -- number of promotions
p <- [0..np4] -- number of pawns
return $ Army (k+q+r+b+n) p proms (fac k * fac q * fac r * fac b * fac n)

-- pair unique elements in a list with their multiplicity
count_unique :: Ord a => [a] -> [(a ,Integer)]
count_unique = M.toList . M.fromListWith (+) . map (,1)

-- precompute unique armies with multiplicity into array
-- indexd by 3x2 parameter combinations for efficiency
armies :: Int -> Int -> [(Army, Integer)]
armies fixr fixp = armies_!(fixr,fixp)
armies_ = array ((0,0),(2,1)) [((fr,fp), count_unique (_armies fr fp)) | fr<-[0..2], fp<-[0..1]]

-- precompute first 63 factorials into array for efficiency
fac :: Int -> Integer
fac n = fac_!n where
fac_ = listArray (0,64) (scanl (*) 1 [1..64])

-- precompute first 65x29 falling powers into array for efficiency
fp :: Int -> Int -> Integer
fp n k = fp_!(n,k)
fp_ = array ((0,0),(64,28)) $ do
n <- [0..64]
((n,0), 1) : [((n,k+1), fp n k * fromIntegral (n-k)) | k<-[0..27]]
-- let n' = fromIntegral n in zip (zip (repeat n) [0..]) (scanl (*) 1 [n',n'-1..n'-27])

-- precompute first 49x9x9 trinomial coefficients into array for efficiency
choose2 :: Int -> Int -> Int -> Integer
choose2 0 0 0 = 1
choose2 n k1 k2 = if k1<0||k2<0||k1+k2>n then 0 else choose2_!(n,k1,k2)
choose2_ = array ((1,0,0),(48,8,8)) [((n,k1,k2), let c = choose2 (n-1) in c (k1-1) k2 + c k1 (k2-1) + c k1 k2) | n<-[1..48], k1<-[0..min n 8], k2<-[0..min (n-k1) 8]]

-- given the number of white and black rooks fixed by castling
-- and the number of pawns to fix for each color (1 for en passant)
-- return the number of possible diagrams
count :: Int -> Int -> Int -> Integer
count fixwr fixbr fixp = sum $ do
let np = 8 - fixp -- number of unfixed pawns
let fixwk = if fixwr /= 0 then 1 else 0
(Army wpcs wp wproms wprod, wmul) <- armies fixwr fixp
let wpx = np-wp-wproms -- white pawns captured
let fixbk = if fixbr /= 0 then 1 else 0
(Army bpcs bp bproms bprod, bmul) <- armies fixbr fixp
let bpx = np-bp-bproms -- black pawns captured
-- number of captures
let caps = 32-2*fixp-fixwk-fixbk-fixwr-fixbr-wp-bp-wpcs-bpcs
-- a pawn can pass its original opposite if either captures or latter is captured
-- guard $ wproms <= caps + bpx
-- guard $ bproms <= caps + wpx
-- the slack in these inequalities limits unopposed pawn
-- as they could promote without increasing captures
let maxuwp = bpx + caps - wproms -- unopposed white pawns
let maxubp = wpx + caps - bproms -- unopposed black pawns
guard $ maxuwp >= 0
guard $ maxubp >= 0
-- white (resp. black) must have fixp+wp-maxuwp (resp. fixp+bp-maxbwp) of its pawns opposed
-- min #files with opposing pawns (multiple opposing per file considered overcounted)
let minopp = max 0 (fixp + wp-maxuwp)
let space = 64-4*fixp-fixwk-fixbk-fixwr-fixbr-wp-bp -- space for pieces
-- choose wp+bp among pawn space and then all pawns/pieces among space-wp-bp
return $ wmul * bmul * (pawns fixp wp bp minopp * fp space (wpcs+bpcs) `div` (wprod * bprod))

-- ways to distribute wp white pawns and bp black pawns over space ps with opposing pawns on opp files
pawns :: Int -> Int -> Int -> Int -> Integer
pawns 0 wp bp opp = sum [fromIntegral (mopps opp s 8) * choose2 (48-2*opp-s) (wp-opp) (bp-opp) | s <- [0..4*opp]]
pawns 1 wp bp opp = pawnsep 0 0 0
+ sum [pawnsep 1 1 (ds1+ds2) | opp>1, ds1 <- [0..2], ds2 <- [0..1]]
+ sum [pawnsep 1 0 ds1 | opp>0, ds1 <- [0..2]]
+ sum [pawnsep 0 1 ds2 | opp>0, ds2 <- [0..1]] where
-- put dw white pawns in file of black pawn just moved
-- and db black pawns opposite white's pawn that can capture it en-passant
-- together spanning ds sandwiched space
pawnsep dw db ds = let opp' = opp-dw-db in sum [fromIntegral (mopps opp' s 6) * choose2 (44-opp-opp'-ds-s) (wp+db-opp) (bp+dw-opp) | s <- [0..4*opp']]

-- opps p s os counts ways for p opposing pawns to sandwich s others in n files
opps :: Int -> Int -> Int -> Int
opps 0 0 _ = 1 -- done
opps 0 _ _ = 0 -- short of sandwiched space
opps _ _ 0 = 0 -- no space left for pawns
opps p s n = mopps p s (n-1) + sum [(5-i) * mopps (p-1) (s-i) (n-1) | i <- [0..min s 4]]

-- precomputed version
mopps :: Int -> Int -> Int -> Int
mopps p s n = mopps_!(p,s,n) where
mopps_ = array ((0,0,0),(8,32,8)) [((p,s,n), opps p s n) | p<-[0..8], s<-[0..32], n<-[0..8]] where

cases :: [(Int, Int, Int)]
cases = [(fwr,fbr,ep) | fwr <- [0..2], fbr <- [0..2], ep <- [0..1]]

multFR :: Int -> Integer
multFR 0 = 1
multFR 1 = 2
multFR 2 = 1

multEP :: Int -> Integer
multEP 0 = 1
-- each of the squares a5-h5 can have a black pawn en-passant
-- capturable by 2 white pawns, except a5/h5, which could only
-- be captured by 1 white pawn, giving 8*2-2 = 14 multiplier
multEP 1 = 14

main = let
-- given fixed white and fixed black rooks,
-- en passant flag
-- per-file pre-occupied central squares
-- a multiplier
-- show and return number of possible positions
-- this assumes white-to-move
showcount :: (Int, Int, Int) -> IO Integer
showcount (fwr,fbr,ep) = do
let mul = multFR fwr * multFR fbr * multEP ep
let cnt = count fwr fbr ep * mul
putStrLn $ "fixwr=" ++ show fwr ++ " fixbr=" ++ show fbr ++ " ep=" ++ show ep ++ " " ++ show cnt
return cnt
in do
whiteToMove <- sum <$> mapM showcount cases
putStr "total positions: "
-- adjust for either side-to-move
print $ 2 * whiteToMove

I pasted the exact code on an online compiler, but it shows an error

 

$ghc -O2 --make *.hs -o main -threaded -rtsopts
[1 of 1] Compiling CountChess ( main.hs, main.o )

main.hs:33:14: error:
parse error on input ‘=’
Perhaps you need a 'let' in a 'do' block?
e.g. 'let x = 5' instead of 'x = 5'

johntromp

The formatting of the code in the post was ruined by eliminating the white space at the start of each line. Please try the correctly formatted version at https://github.com/tromp/ChessPositionRanking/blob/main/src/CountChess.hs instead. And consider removing the long quote from your post.

johntromp
tygxc wrote:

#42
That depends on the way you rank. There should be a ranking integer easy to parse. If that is found, then there should exist a bijection between your ranking integer and the easier to parse ranking integer.

Yes, there are much simpler rankings that are fast to parse. But they will have a vastly larger size than my N=8.7e45... Getting the size this small requires rather complex rankings that need to iterate over all pairs of white and black armies, which causes the slowdown.

johntromp

Having completed the manual analysis of the 1000 random positions, I found 52 to be legal.
Three of these 52 positions occur at multiple ranks though. Two occur at 2 ranks, and one at 3 ranks.
So they account for a total of 56 occurrences in the set of all ranked positions, giving an average 56/52 = 1.077 number of occurrences
 (slightly higher than the 1.033 avg over all positions in the sortedRnd1mFENs set).

We can thus estimate the number of legal positions as (52/1000) * 8726713169886222032347729969256422370854716254 / 1.077 = 4.2e44 +- 1.1e44, where the latter part reflects the 95% confidence bounds.
So we still have less than one digit of accuracy. But we can say the number is roughly 4e44...

A full analysis of the 10x larger sample at https://raw.githubusercontent.com/tromp/ChessPositionRanking/main/sortedRnd10kFENs will give a significantly more accurate estimate, with confidence bounds +- 0.35e44

tygxc

#47
So #21 was right:
"I guess the real fraction of legal positions will be less than 8% of your present N."
So the number of possible and legal positions is 4e44.
The number of legal and sensible positions with no more than 2 queens, rooks, bishops, knights per side is less by about a factor of 1 million.

johntromp

All the software and a summary of this research project  is now available at

https://github.com/tromp/ChessPositionRanking

including some bug bounties for finding mistakes.

 

Happy bounty hunting!

DiogenesDue

Nice work.  So adding the confidence bounds back in, we get 10^44.72 possible positions?

(10^44.623249290398 + 10^44.041392685158) = 10^44.724275869601 = (4.2^44 + 1.1^44) = 5.3^44

You knocked off 2 orders of magnitude, that's impressive.

johntromp

Except that 10^44.72 is not a proven upperbound. There could be more legal chess positions, and I simply had the 5% bad luck of getting an unrepresentative sample...

If you really must derive an upper bound from my sample analysis, then a better choice would be 9.48% * N / 1.056 ~ 7.8x10^44 ~ 10^44.89

since 9.48% is the fraction of possibly legal positions in my huge 10million sample and 1.056 is the avg multiplicity for those 9.48%...

DiogenesDue
johntromp wrote:

Except that 10^44.72 is not a proven upperbound. There could be more legal chess positions, and I simply had the 5% bad luck of getting an unrepresentative sample...

If you really must derive an upper bound from my sample analysis, then a better choice would be 9.5% * N / 1.1 ~ 7.5x10^44 ~ 10^44.88

since 9.5% is the fraction of possibly legal positions in my huge 10million sample and 1.1 is a pretty solid upper bound on avg multiplicity (will calculate it for those 9.5%)...

Understood, thanks.

tygxc

#51
And then what is the lower bound?
There could be less legal chess positions and you could simply had the 5% good luck of getting an unrepresentative sample.
Besides: proving illegal = by some logic.
Proving legal = constructing a game from the initial position.
So it would be logical to have a bias in too many positions considered legal without constructive proof. If a position does not meet some criteria for illegality, then that does not prove the position legal.

johntromp

I cannot think of a natural lower bound.  I might say it's 1e40 simply because it's a round number and many standard deviations away from the sample average, but that's still rather arbitrary.

I provided the essence of a proof game for all 52 legal positions; namely a sequence of possible captures and promotions, accounting exactly for the current material and pawn configuration on the board. In no case do the pawns prevent the other pieces from being moved around.

johntromp

I rewrote the README at https://github.com/tromp/ChessPositionRanking

while adding lots of images. A milestone has been set for obtaining a more accurate estimate by analyzing the 10k sample, which still has 895 unresolved issues.

Please consider volunteering to resolve at least one issue!

or upvote https://news.ycombinator.com/item?id=28187478

tygxc

#54
"I provided the essence of a proof game for all 52 legal positions; namely a sequence of possible captures and promotions, accounting exactly for the current material and pawn configuration on the board. In no case do the pawns prevent the other pieces from being moved around."
This methodology may be flawed. It probably has a bias towards too many positions considered legal.
A position is legal if and only if a proof game from the initial position proves it.
Try your legality checker on this position:

What does it give: legal or illegal?
Now the same position with the black pawn at h3 instead of h4: legal or illegal?

johntromp
tygxc wrote:

#54
Try your legality checker on this position:

What does it give: legal or illegal?
Now the same position with the black pawn at h3 instead of h4: legal or illegal?

 

My legality checking program doesn't notice any illegality, so I'd have to analyse it manually.

 

Well, that certainly is an intricate retrograde analysis.

White must have made captures a2bx3, d2xc3, and e2xd3xc4xb5xa6.

Black must have made captures a7xb6, e7xd6xc5xb4xa3, d[76]xc[56]xb[45]xa[34] and a2xb1B.

Since those 9 captures necessarily include the white f,g,h pawns, these must have promoted in order to get to the queenside.

I can see how the last moves could be Kb5-a5, Na4-c5, h6-h5, Nc5-d7, h5-h4, Nd7-b8, but I see no useful white move prior to that...

 

Anyway, this is all much more intricate than the positions arising in my samples. The most moves I had to retract were about 3 ply in positions like https://github.com/tromp/ChessPositionRanking/issues/22

and

https://github.com/tromp/ChessPositionRanking/issues/156

 

johntromp

Cool; my project made it to the front page of Hacker News:

https://news.ycombinator.com/item?id=28423029

with some interesting discussion ensuing.

tygxc

#29
"have you ever seen an attempt to estimate the smallest number of moves in which any legal position can be reached?" The largest smallest proof game we found so far is 94 moves.
#57
We found proof that that position is illegal, but legal with the black pawn on h3 instead of h4.

johntromp
tygxc wrote:

We found proof that that position is illegal, but legal with the black pawn on h3 instead of h4.

 

Then, with the black pawn on h3, what were the last 8 half moves?