"Any sufficiently advanced technology is indistinguishable from magic."
-Arthur C. Clarke
Formal Computer Chess Paper
Search Algorithms and Game Playing
A general discussion on the minimax and alpha-beta search procedures and their application in computer programs that play chess
Graham Owen CSMP2 April 1997
In the field of search algorithms there is a type of search known as adversary search that is applicable to game playing. Adversary search can be used in games that have the following characteristics:
Two-person games in which the players move alternately.
Games of perfect information.
A game in which the players move alternately requires thinking not only about which move to make, but also about how your opponent will respond, how you will answer that response, and so on. A game of perfect information is one in which all relevant information is available to the players: noughts-and-crosses is an example of a game of perfect information, as are draughts, go, and chess; poker is not, as (presumably) each player does not know what cards the other players are holding.
Chess is an example of an adversary search problem that has solutions (a win for white, a win for black, or a draw), but the computational cost of finding a solution is prohibitive. This essay shows how the minimax search procedure, and its refinement in the form of the α-β procedure, is used in chess-playing computers. It also covers the heuristics used to guide the search, and discusses some of the problems with the minimax procedure and some of the techniques that have been used to try to overcome them.
The Combinatorial Explosion
In chess the average branching factor (the number of legal moves from any position) is approximately 35. To search a tree of average branching factor b to depth d requires that b^d positions be generated and evaluated. A chess-playing program wishing to play at an acceptable level needs to search to a depth of four plies, which means examining around 1.5 million positions each move. A search to a depth of six plies requires the generation and evaluation of around 2 billion nodes.
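These counts are easy to check. A quick Python sketch, assuming the average branching factor of 35 quoted above:

```python
# Node counts for a uniform tree of branching factor b searched to depth d:
# b^d positions must be generated and evaluated.
b = 35
print(b ** 4)   # 1500625: roughly 1.5 million positions at four plies
print(b ** 6)   # 1838265625: roughly 1.8 billion positions at six plies
```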
This combinatorial explosion means that any program wishing to play chess at anything above novice level has to apply techniques to tame the growth in the size of the search space. Just applying more computing power to the problem does not help: the number of legal positions in chess is estimated to be 10^40, and the number of different legal games that can be played is estimated to be 10^120.
Most chess-playing programs rely on a combination of two techniques to improve play: use of the α-β search algorithm, and a good heuristic to classify positions.
As it is impossible in a game of chess to perform an exhaustive search on a position, a heuristic needs to be applied to guide the search towards the desired goal. The heuristic applied in chess-playing programs is known as the static evaluation function. The purpose of this function is to provide an estimated measure of the winning chances of either side based on features of the position, returning a score for that position. The evaluation function often takes the form of a weighted linear function:
w1f1 + w2f2 + … + wnfn
where f is a feature of a particular position, and w is the weight given to that feature. By taking f as the number of pieces in a position, and w as a value associated with each piece, we can build up a simple heuristic based on the material value of the pieces on the board. As an example, the GNU Chess program assigns the following values to pieces:
white pawn 100 black pawn -100
white knight 350 black knight -350
white bishop 330 black bishop -330
white rook 520 black rook -520
white queen 980 black queen -980
white king 10000 black king -10000
At the start of the game, an evaluation of the position will produce a score of zero. As pieces are lost, the score will become positive for an advantage to white, and negative for an advantage to black. The king effectively has an infinite value because capturing the king wins the game.
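As a sketch (in Python rather than the Eiffel used later in this essay), a material-only evaluation built from these values might look as follows. The board representation, two lists of piece names, is an illustrative assumption, not GNU Chess's own data structure:

```python
# GNU Chess piece values from the table above, used in a material-only
# static evaluation function.
PIECE_VALUES = {
    "pawn": 100, "knight": 350, "bishop": 330,
    "rook": 520, "queen": 980, "king": 10000,
}

def evaluate(white_pieces, black_pieces):
    # Positive scores favour white, negative scores favour black.
    white = sum(PIECE_VALUES[p] for p in white_pieces)
    black = sum(PIECE_VALUES[p] for p in black_pieces)
    return white - black

army = ["king", "queen"] + ["rook", "bishop", "knight"] * 2 + ["pawn"] * 8
print(evaluate(army, army))        # 0: the starting material is level
print(evaluate(army, army[:-1]))   # 100: black is a pawn down
```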
The static evaluation function can be made as sophisticated as we like by adding positional factors such as good pawn structure, or greater piece mobility, or any other factors that seem relevant. However, the trade-off between the time taken to do a search and the time taken to apply the heuristic must always be considered.
The Minimax Algorithm
The diagram below shows the first three (simplified) plies of the search space for a position in a game of chess. The evaluation function has been applied to the nodes at the edge of the tree (the leaf nodes), to give the values shown:
The scores given to the leaf nodes show that some lines of play result in white winning a pawn (those scoring around 100), one gains a knight, and the remaining moves result in slight positional advantages for white or for black. If the computer is playing white, what move is the correct one to select?
If the computer plays into the right-hand side of the tree hoping to gain a knight, its opponent will surely play so as to prevent this, choosing the left-hand branch at ply 1. If we assume perfect play from the computer's opponent, we can see that the general rule is that each player will attempt to maximize the gain from each position while allowing his opponent only the minimum gain at the next move.
The minimax algorithm assigns one player as the maximizer, and the other player as the minimizer. At each ply, the maximum (or minimum) score returned from all of the nodes is backed-up to the node above. This process is followed recursively until the root node is assigned the backed-up score of the target node.
The diagram above shows how the maximum score from each of the sub-trees at ply 3 has been backed up to ply 2, the minimum score from that ply backed up to ply 1, and finally, the maximum score of 102 assigned to the root node. This shows that the next move to make will gain white an advantage of a pawn.
A Minimax Implementation
The following code is an implementation of the minimax search algorithm written as an Eiffel function. Instead of writing separate functions for the maximizer and the minimizer, the code takes advantage of the fact that a positive score for one player is equivalent to the same negative score for the opposing player. The function can then call itself recursively as it searches down the tree, returning the backed-up score from the lower plies. The minimizing and maximizing at alternate plies is achieved by negating the result returned by the previous recursion. This method of implementing the minimax algorithm is often called negamax.
The class POSITION is a representation of a board position. The function evaluate takes a POSITION as a parameter and returns a score for the static evaluation of that position. The class SUCCESSORS is created with a reference to a POSITION as a parameter to its creation procedure. It then generates all the legal successor positions which can then be accessed by the procedure next. A BOOLEAN value, empty, is set to true when all the successor positions have been returned.
The recursion is stopped when the value of depth falls to zero and the function applies the static evaluation function to the leaf node.
minimax (p : POSITION; depth : INTEGER) : INTEGER is
      -- Backed-up negamax score of position `p' searched to `depth' plies.
   local
      best, score : INTEGER
      succ : SUCCESSORS
      pos : POSITION
   do
      if depth = 0 then
         Result := evaluate (p)
      else
         best := -infinity
         from
            !!succ.make (p)
         until
            succ.empty
         loop
            pos := succ.next
            score := -minimax (pos, depth - 1)
            if score > best then
               best := score
            end -- if
         end -- loop
         Result := best
      end -- if
   end -- minimax
The code as it stands only provides the backed-up value of the target node. Additional code would need to be added to identify the move selected for play by the algorithm.
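The same negamax idea can be rendered as a Python sketch. Here the game tree is represented directly as nested lists whose leaves carry scores for the side to move at that leaf, an illustrative assumption rather than a real board representation:

```python
# Negamax minimax over a nested-list game tree (leaves are scores).
INFINITY = float("inf")

def minimax(node, depth):
    # A leaf, or the depth limit: apply the "static evaluation" (here a
    # leaf simply carries its own score).
    if depth == 0 or not isinstance(node, list):
        return node
    best = -INFINITY
    for child in node:
        # Negamax: a score of +s for the opponent is -s for us.
        score = -minimax(child, depth - 1)
        if score > best:
            best = score
    return best

# The minimizer holds each branch to its smaller leaf (3 and 2), and the
# maximizer then picks the better of the two.
print(minimax([[3, 5], [2, 9]], 2))   # 3
```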
The Alpha-Beta Algorithm
In the search tree above, the first three nodes have been evaluated, and a maximizer value of 102 has been backed up to the ply above. The fourth node evaluates to a value of 187. Do we have to evaluate the fifth node?
We can see that the backed-up value of this branch is going to be ≥ 187, but the minimizer already has a choice to play the left-hand branch with a value of 102. As we know that the minimizer will never play into this line, we do not need to spend any more time on evaluating the remaining nodes on this branch. The process of discarding branches that we do not need to evaluate is known as pruning, and the result of a prune is a cutoff.
The alpha-beta search algorithm is based on maintaining two values, α and β. Initially set to −∞ and +∞ respectively, α represents the value of the best path found so far for the maximizer, and β the value of the best path found so far for the minimizer. An α-β search updates the values of α and β as it goes along, with maximizing nodes only able to increase the value of α, and minimizing nodes only able to decrease the value of β. This creates a window which gets narrower as the values of α and β approach each other. Any branches of the tree that have values falling outside this window are identified as being worse than the current best choice and are immediately pruned.
Deep Alpha-Beta Pruning
In the part of a search space shown, the left-hand side of the tree has been evaluated and a value of 102 has been backed up to node B. The right-hand side of the tree is now being searched and node F has been evaluated to a value of -28. What can we say about node G? Since black is the minimizer, we know that the backed-up value at node E is ≤ -28. As this is less than the value of 102 at node B, we also know that white will not play into this line, and node G can be cut off.
A situation such as this, where the pruned node is more than one ply below the reason for the pruning, is typically referred to as deep α-β pruning and occurs only in searches to depths of four plies or more.
An Alpha-Beta Implementation
The following Eiffel function implements the alpha-beta search algorithm as a modification to the minimax function shown earlier:
alphabeta (p : POSITION; depth, a, beta : INTEGER) : INTEGER is
      -- Backed-up negamax score of `p' searched to `depth' plies
      -- within the window [a, beta].
   local
      alpha, best, score : INTEGER
      succ : SUCCESSORS
      pos : POSITION
   do
      alpha := a
      if depth = 0 then
         Result := evaluate (p)
      else
         best := -infinity
         from
            !!succ.make (p)
         until
            succ.empty or best >= beta
         loop
            pos := succ.next
            if best > alpha then
               alpha := best
            end -- if
            score := -alphabeta (pos, depth - 1, -beta, -alpha)
            if score > best then
               best := score
            end -- if
         end -- loop
         Result := best
      end -- if
   end -- alphabeta
As before, the minimaxing is provided by returning the negated backed-up score from the plies below, and in addition, the values and signs of α and β are swapped. When the value of best equals or exceeds beta the loop exits without examining any more nodes at that depth, producing a cutoff. Because of the negamax technique used, all the cutoffs are beta cutoffs.
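A Python sketch of the same routine, using the nested-list tree representation rather than a real board, mirrors the Eiffel loop's exit condition (stop when the successors are exhausted or best >= beta):

```python
# Negamax alpha-beta over a nested-list game tree (leaves are scores).
INFINITY = float("inf")

def alphabeta(node, depth, alpha, beta):
    if depth == 0 or not isinstance(node, list):
        return node
    best = -INFINITY
    for child in node:
        if best >= beta:
            break                 # beta cutoff: prune the remaining children
        if best > alpha:
            alpha = best
        score = -alphabeta(child, depth - 1, -beta, -alpha)
        if score > best:
            best = score
    return best

# Same backed-up value as plain minimax on the same tree, but the final
# leaf (9) is never evaluated: once 2 has been seen, the right-hand branch
# is worth at most 2 to the maximizer, worse than the 3 already secured.
print(alphabeta([[3, 5], [2, 9]], 2, -INFINITY, INFINITY))   # 3
```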
The Advantages of Alpha-Beta Search
An α-β search always returns the same result as an equivalent minimax search, but a reduction in the number of nodes that need to be evaluated can be made by producing cutoffs. The exact number of cutoffs depends upon the order in which nodes are examined.
We have already seen that a chess-playing program using minimax needs to generate and evaluate b^d nodes on each move. If the successors of every node are perversely ordered worst-move-first, then the last nodes examined will always be those on the main line (the course the game would take if both players played optimally), and no cutoffs will occur. In this situation static values have to be computed for all b^d leaf nodes, that is, the same as for a simple minimax search.
In the case where nodes are ordered best-move-first, the node on the main line is always evaluated first, and the branching factor for the maximizer is reduced to 1. The number of static evaluations, s, needed to discover the best move in an optimally arranged tree is:
s = 2b^(d/2) - 1 for d even
s = b^((d+1)/2) + b^((d-1)/2) - 1 for d odd
s ≈ b^(d/2) + b^(d/2) = 2b^(d/2)
Of course this best case is a theoretical ideal; if there were a way of ordering the moves at any particular ply with the best move first, there would be no need to do the actual search! However, we can see that the real value for the number of static evaluations that need to be done lies somewhere between approximately 2b^(d/2) and b^d.
In practice, using a fairly simple ordering function, the number of static evaluations required gets close to the best-case result. In this case, the effective branching factor, b', is given by
b'^d = 2b^(d/2)
b' = 2^(1/d)·√b ≈ √b
In other words, a branching factor of 35 can be reduced to an effective branching factor of around 6, allowing us to double the depth searched in a fixed amount of time. This is a significant improvement over a simple minimax search.
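The scale of the saving is easy to check numerically for the figures used in this essay (b = 35, d = 6, d even):

```python
import math

# Leaf evaluations needed at branching factor b = 35, depth d = 6.
b, d = 35, 6
worst = b ** d                  # worst-case ordering: plain minimax cost
best = 2 * b ** (d // 2) - 1    # optimally ordered alpha-beta, d even
print(worst)                    # 1838265625
print(best)                     # 85749
# Effective branching factor under near-optimal ordering:
print(round(math.sqrt(b), 1))   # 5.9
```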
The heuristic used in most chess playing programs assigns a single value to a position based on the evaluation function. There are two assumptions made when using this kind of heuristic. First, for any two positions, it is possible to decide which has the greater utility (by assigning a number to each of them), and second, that this relationship is transitive. (That is, if position A has a higher score than position B, and position B has a higher score than position C, then position A is better than position C). Some chess players doubt whether a single number can be used to classify a chess position as better or worse than another.
A problem with the minimax search procedure and its alpha-beta refinement is the assumption that an opponent will always make the best possible move. In a winning position this assumption appears to be valid; if an opponent makes a worse move than expected then it only hastens his demise. However, there are situations, when defending a losing position, in which this assumption can be harmful.
Imagine that a computer program, after doing an extensive search, has found a forcing line for its opponent that results in its defeat. The computer's next move will be made on the assumption that its opponent (be it human or machine) has also discovered this forcing line. Imagine also that there is a similar, but slightly worse, move that the program can make which is, in effect, a bluff - a move that will draw attention away from the best reply. A human player may try to bluff in a losing position in the hope that his opponent has not seen the best reply, but a computer using a minimax search cannot.
In addition to these problems there is also the philosophical question of how a computer using a brute-force look-ahead program appears to be able to play chess at the very highest level. Certainly human chess players, even Grand Masters, cannot look as far ahead into the game as a computer. It appears that humans play chess mainly with a combination of experience, pattern recognition, and intuition; quite unlike a computer.
The Horizon Effect
All programs that search to a limited depth to determine the next move suffer to some degree from the horizon effect. To see what this is, consider a program that searches to a depth of six plies on a chess position and discovers that it is going to lose its queen on ply five. The program will naturally discard this line of play in favour of a better one in which its queen is not lost. However, it could be that the chosen sequence of moves merely postpones the loss by introducing delaying tactics. The queen will still be lost, but the loss has been delayed so that it does not appear in the tree; the inevitable bad event has been pushed "over the horizon".
Although techniques exist to mitigate its effect, there is no general solution to this problem.
The observation that α-β pruning is most effective when the nodes are examined in best-move-first order has led to an idea known as iterative deepening. Rather than searching to a fixed depth, the game tree is first searched to a depth of one ply and a static evaluation function is applied to the result of each of the possible moves. These results are then used to order the moves so that the likeliest best candidate is examined first on a re-search, this time to a depth of two plies. The process can then be applied to a depth of three plies, four plies, and so on. Although this might seem to be a costly procedure, the time spent in evaluating the nodes at shallower levels is more than repaid because the move ordering at each ply ensures near-optimal α-β pruning.
Another advantage of iterative deepening is that at any moment the best move found so far is always known. This is important in playing games such as chess when moves are always being made under a time constraint.
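The idea can be sketched in Python over a nested-list game tree (an illustrative toy, not a chess board). The alpha-beta routine is restated so the sketch is self-contained, and the static estimate for an internal node, the negated average of its children's estimates, is a deliberately crude assumption:

```python
INFINITY = float("inf")

def evaluate(node):
    # Leaves carry their score; an internal node gets a crude static
    # estimate: the negated average of its children's estimates.
    if not isinstance(node, list):
        return node
    return -sum(evaluate(c) for c in node) / len(node)

def alphabeta(node, depth, alpha, beta):
    # Negamax alpha-beta, applying the static estimate at the depth limit.
    if depth == 0 or not isinstance(node, list):
        return evaluate(node)
    best = -INFINITY
    for child in node:
        if best >= beta:
            break
        alpha = max(alpha, best)
        score = -alphabeta(child, depth - 1, -beta, -alpha)
        best = max(best, score)
    return best

def iterative_deepening(root, max_depth):
    # Search the root moves to depth 1, order them by the scores found,
    # then re-search one ply deeper with that ordering, and so on.
    order = list(range(len(root)))
    for depth in range(1, max_depth + 1):
        scores = {i: -alphabeta(root[i], depth - 1, -INFINITY, INFINITY)
                  for i in order}
        order.sort(key=lambda i: scores[i], reverse=True)
    return order[0]          # the best move found so far

print(iterative_deepening([[3, 5], [2, 9]], 1))   # 1: the shallow estimate misleads
print(iterative_deepening([[3, 5], [2, 9]], 2))   # 0: the deeper search corrects it
```

The one-ply pass alone picks the wrong move here; the two-ply re-search, started from that ordering, finds the correct one, and at any moment `order[0]` holds the best move discovered so far.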
The evaluation function used to assign a score to a position is only meaningful if the position being searched is relatively 'quiet', or quiescent. An example of a tactical, non-quiescent position would be one in which an exchange of pieces is in progress. A position may be given a high score when the opponent's piece is captured, but this is a false value because the apparent advantage is lost on the next move. Quiescence search says that lines of play that involve an exchange of pieces should be followed until a quiet position is reached, ensuring that the evaluation function is only applied to positions that are quiescent. We can detect tactical portions of the search space because they are characterized by rapidly changing values of the heuristic function.
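A Python sketch of the idea, using a toy position representation (a pair of static score and pending capture moves) that is purely an assumption for illustration; a real program would use its board representation and move generator:

```python
INFINITY = float("inf")

# Toy position: (static_score, captures), where static_score is from the
# side to move's point of view and captures lists the positions reachable
# by capture moves. A quiet position has no captures pending.
def quiescence(pos, alpha, beta):
    static_score, captures = pos
    # "Stand pat": the side to move may decline every capture, so the
    # static score is a lower bound on the value of the position.
    best = static_score
    if best < beta:
        for child in captures:
            alpha = max(alpha, best)
            score = -quiescence(child, -beta, -alpha)
            best = max(best, score)
            if best >= beta:
                break            # cutoff, as in the main search
    return best

# After the program "wins" a queen, the opponent (now to move, so scores
# are negated) stands at -980 statically but has a recapture leading to a
# quiet position worth -20 to the program. Searched quiescently, the line
# is worth -20 to the program, not +980.
after_capture = (-980, [(-20, [])])
print(-quiescence(after_capture, -INFINITY, INFINITY))   # -20
```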
Singular extension is the technique of following lines of play beyond the normal search depth in cases where there is one option that is vastly superior to the rest; in chess terms, a forced move. As this move is forced, the effective branching factor is small and a search to a depth of a few more plies becomes computationally practical.
Singular extensions were first used in the chess computer CHIPTEST (Anantharaman et al., 1990). In one example, CHIPTEST was able to find a forced checkmate in 18 moves (35 plies) in 65 seconds on a nominal 8-ply search.
Quiescence search and singular extension are both examples of secondary search.
Chess playing computers using techniques described in this essay are now playing at Grand Master level. Some observers see the time approaching when a computer will be able to beat any human player at chess; others say that human skill and ingenuity will always find a way to beat the machine.
In May this year, the world chess champion Garry Kasparov will play a match against Deeper Blue, IBM's latest chess supercomputer. Kasparov, regarded as the best player in history, won his last match against Deep Blue, but lost a game in the process. Can Deeper Blue beat Kasparov this time round and prove itself better than any human player? Follow the match at the website http://www.chess.ibm.com/.
Anantharaman, Thomas, Murray S. Campbell and Feng-hsiung Hsu, Singular Extensions: Adding Selectivity to Brute-Force Searching, Artificial Intelligence, 43: 99-109, 1990.
Atkinson, George W., Chess and Machine Intuition, Ablex, Norwood, NJ, 1993.
Dean, Thomas, James Allen and Yiannis Aloimonos, Artificial Intelligence: Theory and Practice, Addison-Wesley, Menlo Park, CA, 1995.
Ginsberg, Matthew L., Essentials of Artificial Intelligence, Morgan Kaufmann, San Francisco, CA, 1993.
Luger, George F. and William A. Stubblefield, Artificial Intelligence: Structures and Strategies for Complex Problem Solving, Second Edition, Benjamin/Cummings, Redwood City, CA, 1993.
Rich, Elaine and Kevin Knight, Artificial Intelligence, Second Edition, McGraw-Hill, New York, 1991.
Russell, Stuart J. and Peter Norvig, Artificial Intelligence: A Modern Approach, Prentice-Hall, Englewood Cliffs, NJ, 1995.
Winston, Patrick Henry, Artificial Intelligence, Third Edition, Addison-Wesley, Reading, MA, 1992.
Many thanks to Graham Owen for allowing the author to use this fine paper within Barnet Chess Club's section on computer chess. Any feedback is welcome by Graham.