MiniMax With Alpha-Beta Pruning
Project Goal: To write an AI program in python that finds the winning solution in checkers given a strong enough AI opponent
Personal Goal: To apply my knowledge of the minimax search algorithm with pruning to win the game of checkers
Checkers
Objective: A two-player game in which the goal is to get rid of all your opponent's pieces by jumping over them. See the rules used here.
The minimax algorithm is built on the idea that as a player you want to minimize your opponent's move score/quality and maximize yours. Minimax on its own is too costly as it requires us to look at all the possible moves at every stage in the future. Instead, we can implement alpha-beta pruning to realize when a move should not be considered further as it will not lead to a winning solution consequently saving computational time and space. This is done by keeping track of the highest value choice for a move via alpha for Max, and the lowest value choice for a move via beta for Min; alpha and beta are effectively the lower and upper bounds respectively. Now, whenever we find that a move would cause alpha to be greater than or equal to beta we will prune (stop considering that move). To do so we will let 'v' be the value for the current state. For Min, when the alpha value is greater than or equal to v we will prune the children (possible moves) of this state (lines 484-485). For Max, when the v-value is greater than or equal to beta we will prune the children of this state as well (lines 469-470). If no pruning occurs we simply update the optimal move. We begin by first calling the max function, given that we make the first move, with alpha set to negative infinity and beta set to positive infinity. Note that we have to give an indicator of what a 'good' and 'bad' move is through a score/value system that will be used to update the alpha-beta values when necessary. This is because searching all the way through to the end of the game to find whether a certain path is preferable or not is too expensive. The possible endings of the game, the terminal states, have a set value indicating a good or bad solution path referred to as the utility value. To avoid having to go as far as the terminal states to get a value, an evaluation function estimates the utility values from any given state. The evaluation function will be called whenever a pre-determined depth limit is reached (I found that a depth limit of 7 proved to balance a valid solution and computational speed). The depth limit dictates how far into the future possible moves are considered such that the program doesn't run until it arrives at the terminal states (lines 461-462 & 476-477). The evaluation function finds a value score for each opponent and then returns a utility of the difference between these scores. The way that the score is found is by looking at each piece on the board and giving it an individual score from an arbitrary point system that penalizes pieces in less dominant parts of the board and rewards having king pieces. At each move, we write the board to the output file and if the game is over (decided by the game_over_func ) then we finish the search.
Alpha-Beta Pruning Function
Max Value Function
Min Value Function
Input & Output:
Input: A text file with a representation of the current board state.
Output: A text file with all consecutive moves to the solution from the starting state.
The states are represented as follows:
Black Pieces -> b
Red Pieces -> r
King Pieces -> B or R
Empty Spots -> .
Github: Input File
Github: Output File
Evaluation Function
Since this is a school-related project I am unable to share all the code publicly but contact me for the entire code if you want it.