Since I started playing 4 player chess (4pc), I have always been curious about the impact of points in FFA. Each type of piece (the pawn/1-point queen, the knight, the bishop, the rook, the queen and the king) is worth a different number of points. Finding the most accurate value for each piece is important for a balanced game.
How have these numbers been reached? For a chess player, it is clear most of them come from 2 player chess (2pc). The pawn is the unit piece, so the piece values are measured in pawns. At the same time, the pawn is one of the most complicated pieces. The knight, rook and queen all have values corresponding to 2pc. However, it is not necessarily true that the values of the pieces are the same in 2pc and in 4pc. For example, the bishop is worth 5 points instead of 3, which is generally how much the bishop is worth in 2pc. Why? The qualitative explanation is that the board is bigger, so the bishop should be worth more than the knight. Translating the king value to 4pc is tricky because its value is infinite in 2pc.
A more fundamental question: which factors decide the value of a piece? How powerful it is, you could say, which translated to chess terms is how many squares it generally controls. Controlling squares is all good, but it doesn’t win you any games on its own. You need to get points. So, the strength of a piece can be measured by its ability to grab points. With this in mind, we can go into the game itself to find out how much each piece should be worth. Let us for a moment not think about 4pc as a battle between 4 players, but rather a battle between 6 types of pieces.
I would like to find an exact way to derive the correct piece values, based on mathematics instead of intuition. Perhaps the current piece values are based on some algorithm, but if so, I have never heard of it. So I have tried to do it myself, using statistical methods. I have picked 30 high-level FFA rapid/blitz games with a variety of players as the basis for my results.
How can the value of a piece be described statistically? For example, why is the rook worth 5 points? Let us define the variable Expected Point Gain for each type of piece. I have decided to count the pawn and the 1-point queen as the same piece. Sticking to the same example, the rook being worth 5 points means that a rook is on average expected to capture 5 points. But that is not all. In FFA, a piece can turn grey instead of being captured, meaning that it does not give away any points. So the definition also needs the condition that the piece is eventually captured. In total, that becomes: the value of a piece is the expected number of points that the piece will capture, given that it is itself captured.
Using this as a basis, I have made the following indicator: EPG (Expected Point Gain) = Expected Points Won – Expected Points Given Away. That is, for every one of those 30 games, I have gone through every single capture to see how many points each type of piece has captured and how many it has given away by being captured. Then I have taken the average over all the games and divided by the number of pieces of that type in the starting position (32 pawns, 8 knights etc.) to get the EPG. In general, if the EPG is close to 0, it implies the piece is worth its value and balanced. If the EPG is very big, the piece takes more than it gives away, so the piece could be worth more. If the EPG is negative, the piece is probably worth too much.
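The counting procedure can be sketched in a few lines of Python. This is only an illustration of the method as described, not the tool actually used for the 30 games: the capture-list format, the function names and the toy game are assumptions made up for the example, and the starting counts beyond "32 pawns, 8 knights" are taken from the standard 4pc setup.

```python
from collections import defaultdict

# Point values as used in the text; the pawn and the 1-point queen share one entry.
VALUES = {"pawn": 1, "knight": 3, "bishop": 5, "rook": 5, "queen": 9, "king": 20}
# Number of pieces of each type in the starting position (all four players combined).
START_COUNTS = {"pawn": 32, "knight": 8, "bishop": 8, "rook": 8, "queen": 4, "king": 4}


def game_tally(captures):
    """Return {piece: (won, given_away, gain)} for one game.

    `captures` is a hypothetical list of (capturing_piece, captured_piece) pairs;
    a checkmate is recorded as a capture of the king by the mating piece.
    """
    won = defaultdict(int)
    given = defaultdict(int)
    for capturer, victim in captures:
        won[capturer] += VALUES[victim]
        given[victim] += VALUES[victim]
    return {p: (won[p], given[p], won[p] - given[p]) for p in VALUES}


def expected_point_gain(games):
    """Average each piece type's gain over all games, then divide by its starting count."""
    totals = dict.fromkeys(VALUES, 0)
    for captures in games:
        for piece, (_won, _given, gain) in game_tally(captures).items():
            totals[piece] += gain
    return {p: totals[p] / len(games) / START_COUNTS[p] for p in VALUES}


# Made-up toy game, only to show the input shape (not one of the 30 games).
toy_games = [[("queen", "king"), ("pawn", "knight"), ("knight", "pawn")]]
print(expected_point_gain(toy_games))
```

Dividing by the starting count is what turns a per-type total into a per-piece figure, which is why a small number for the pawn can still represent a lot of points across the whole board.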
That is the general theory. Here is an example to make things clearer: let us say the EPG of the rook is 2. That means that on average, the rook takes 2 more points than it gives away, which implies that the rook should be worth more than it is currently. It does not necessarily mean that the rook needs to be worth exactly 2 points more, that is, 7. Why? There are two reasons. The first is that this is a complex system consisting of 6 different pieces with one EPG each, and they depend on each other. Changing the value of one piece will affect the EPG of the other pieces, so any set of values is a compromise. The second reason is that the value of pieces affects how we play with them. Changing the value of the rook from 5 to 7 would for instance make a rook for bishop trade less tempting. The numbers found are indicators, not exact corrections.
To clarify, here is one of the 30 games:
So, we can see that there are 3 numbers for each type of piece. “Won” is how many points that type of piece has captured, and “Given Away” is the number of times that type of piece has been captured multiplied by its value. “Gain” is simply the difference between the two. Feel free to go to the game and see if I have counted correctly. 😉
A couple of things to note: the king has a very negative score, and that is completely normal. How often do you see a king capturing 20 points worth of material? Anyhow, that is beside the point: the king is supposed to be a bonus piece, and you get a bonus reward for “capturing” it. Other things that catch the eye are that the queen scores very well and the rook scores very badly, but it is only one example, so we cannot draw any conclusions from it. You might wonder why the total score is -40 and not 0. The reason is that very often, at the end of a game, there are one or several kings left that have not been checkmated, so those kings give away 20 points each without being captured by any piece. Another element that could lead to strange total scores is double checks, but over the course of the 30 games double checks have been quite rare.
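The bookkeeping behind a non-zero total can be sketched like this. The function name and the list of capture values are invented for the illustration; the only rule taken from the text is that a king left un-checkmated is booked as 20 points given away with no piece booked as having won them.

```python
KING_VALUE = 20  # king value in FFA, as stated in the text


def total_gain(captured_values, uncredited_kings):
    """Sum of "Gain" over all piece types in one game.

    Every ordinary capture adds the same amount to "Won" and to "Given Away",
    so it cancels out. Only kings that give away points without being captured
    by a piece move the total away from 0.
    """
    won = sum(captured_values)
    given_away = sum(captured_values) + KING_VALUE * uncredited_kings
    return won - given_away


# Hypothetical capture values; whatever they are, two uncredited kings give -40.
print(total_gain([1, 3, 5, 9, 20], uncredited_kings=2))  # -40
```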
If you have read everything so far, you are probably curious to see what my experiment has yielded. The knowledgeable reader might have an idea or two about the results already. Here are the EPGs I have found:
A bunch of different values have been calculated; these are the ones I consider the most important. I decided to add the number of checkmates for reasons you will see later. Let’s go through each piece separately.
Knight
With an EPG of 0.1, it seems like the knight is doing fine at 3 points.
Bishop
Despite its ability to develop fast and point straight at an enemy flank player, with a score of -0.9 it is clear that the bishop is not worth 5 points. Lowering its value by a point or two should be considered.
Rook
The rook scores close to 0 and seems balanced currently.
King
The king is by far the piece with the worst score, which is understandable as the king is not designed to be worth its value. Whether or not that is a good idea is something I will not discuss here. What this analysis can show is how the bonus points that you get from checkmates/capturing kings are distributed among the other pieces.
Queen
4.1 is much farther away from 0 than anything seen so far. It implies that the queen should be worth several points more than it currently is, which is strange considering that 9 points is balanced for the queen in 2pc. Having gone through the games and looked at what happens, I can see that the high score of the queen is caused by the value of the king. In the games, the queen delivers 54 % of all checkmates, and 30 % are delivered by pawns/1-point queens (read: queens). That leaves only 16 % of all checkmates for non-queen types of pieces and contributes to making the queen look like an unbalanced piece.
Pawn
Scoring 0.7 for the pawn might look innocent, but it is not! Remember, this number applies to every single one of the 32 pawns on the board. Every pawn is expected to capture 0.7 points more than it gives away. The main reason is how powerful 1-point queens are and how often pawns get promoted in FFA. But I believe there is reason to think that pawns by themselves are more powerful in 4pc than in 2pc. In 2pc there is less space for pawns to move, and you easily get blocked pawn structures where the pawns are immobile. In 4pc pawns are very mobile and thus much easier to promote. Blocked pawn structures are usually only seen on the flanks. The impact is devastating: as the pawn is defined as the unit value, instead of increasing its value we would have to divide the value of all other pieces by 1.7. Then knights would be 2 points, bishops and rooks 3 and queens 5. And the king? Who knows.
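The rescaling arithmetic can be checked with a short sketch. It assumes the factor 1.7 is the pawn's current value of 1 plus its EPG of 0.7, and that the results are rounded to the nearest whole point; the variable names are made up for the example.

```python
# Current FFA values of the non-pawn pieces, as given in the text (king left out).
current_values = {"knight": 3, "bishop": 5, "rook": 5, "queen": 9}

# Assumption: a pawn "really" worth 1 + 0.7 points, kept as the unit,
# means every other value is divided by 1.7.
scale = 1 + 0.7

rescaled = {piece: value / scale for piece, value in current_values.items()}
rounded = {piece: round(value) for piece, value in rescaled.items()}

for piece in current_values:
    print(f"{piece}: {current_values[piece]} -> {rescaled[piece]:.2f} -> {rounded[piece]}")
```

The unrounded figures are roughly 1.76, 2.94, 2.94 and 5.29, which is where the 2, 3, 3 and 5 come from.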
 
I would also like to share the results of an identical analysis of a format I have made myself, the chaturaji hyper fiesta. The only difference is that I have used 15 games instead of 30. The game looks like this:
The rules are Capture the King, 3 points for kings, promotion into (5-point) rook on the 8th rank and ¼|0 hyper bullet time control.
The Expected Point Gains of the pieces are the following:
We can see that these numbers are all looking close to 0, at least compared to normal FFA.
The pawn looks balanced.
The bishop underperforms the most, which I think is understandable. It is worth 5 points, the same as the rook. But with a normal 8x8 chess board, one would think that the piece values should be the same as in 2pc. That’s why I would like to ask our developers to add the possibility of adjusting piece values for different variants.
Knights do well, maybe because they often get to trade with 5 point bishops.
Rooks do ok, underperforming a bit. Could be because it's hyperbullet.
Kings do much better than in normal FFA, helped by their reduced value and the ability to sacrifice themselves for another piece. They still underperform a little bit, which I believe is due to the fact that the last man standing gets the points for all remaining kings.
 
To conclude, I think there are several reasons to believe that the current point system in FFA is inaccurate. Queens, whether they are worth 9 points or 1 point, are the supreme rulers of FFA. Pawns are powerful. Kings and bishops are point donors. Rooks and knights are the only pieces that seem to be worth their price. For variants with different board sizes, I think we should have the option of adjusting the value of each piece.
Questions and comments are much appreciated. Is it easy to understand how the experiment has been conducted, and why it has been conducted in this particular way? Do you think this method is useful for evaluating the strength of the pieces? Is there something you would have done differently? Is there any step in the process that is unclear? Do you agree or disagree with my interpretation of the results?