Articles
BY OUR STUDENT CONTRIBUTORS
Malhar Rajpal What do games like chess, Go and tic-tac-toe have in common? They are all based on choices. On a daily basis, humans train their decision making skills numerous times, making selections that can be as simple as figuring out when to go to bed or as complex as evaluating the quickest route from one place to another. Despite our evident practice in decision-making, artificial intelligence still crushes humans in games such as chess. Since computers are programmed by humans, you may wonder how it is possible that they can make much better choices in complex games that are nigh-impossible for any human to master. The answer is minimax. Minimax is a decision making rule that is widely used to program AI to make better actions/choices. Its foundation is based on creating an overall score of the game so the AI can see who’s ‘winning’ at any given position. Although most games don’t have an explicit score while the game is in progress (only a winner and loser at the end), coming up with a score system is not difficult. Score systems can be quite simple: for example in chess, we can give each piece a certain score (e.g. rook/knight = 5 points and queen = 9 points) and then sum up the scores of the pieces on the board for both players at a given point in time, and then subtract the score for player 2 from the score for player 1 (player 1 - player 2). If the final value is positive, then player 1 is winning, but if it is negative, then player 2 is winning. If the score is 0 then the position is equal. The higher the score, the larger the margin by which player 1 is winning. Likewise, the lower the score, the larger the margin by which player 2 is winning. Now that we understand how scoring works, we can move onto the algorithm. Imagine programming an engine to play a simple two player game, where each player has only three decisions. As seen before, player 1 wants the evaluated score to be positive, and as high as possible. Player 1 is therefore called the ‘maximising player’ or ‘max player’. On the contrary, player 2 wants a negative score that is as low as possible, so they are the ‘minimising player’ or ‘min player’. Now let’s look at the Minimax tree diagram shown below this paragraph. Every triangle in the diagram represents a certain state or position and is formally called a ‘node’. Each green, upwards triangle indicates a maximising state/position - that is, given some potential scores in the branches below it, the ‘max’ player will always choose the highest one. Each red, downward triangle indicates a minimizing state/position - given some potential scores in the branches below it, the ‘min’ player will always choose the lowest one. These chosen scores will then be assigned to their respective triangles in the diagram. Finally, each branch, or line, connecting two triangles represents a potential move/action. Since the top triangle (formally called the ‘root node’) represents a maximising state, the goal in the diagram is to find the best move for the max player. To do that, we have to find the highest possible score for the root node (assuming optimal play) and play the move that gives us this score. Each player has three decisions at every turn, so we can see that from the top, maximising triangle, we have three minimising triangles that branch off it. Likewise, for each minimising triangle, we have three more maximising triangles branching off it. Now, shown by the diagram below, our minimax program would ‘look’ into every possible position that is two moves ahead and calculate a score for each of the 9 possible positions. Since we are not playing a real game, I have assigned some arbitrary scores for each of the 9 positions: For each of the red minimising triangles or states, we have three scores assigned below them. For the leftmost minimising state, the min player has three choices: one choice gives the min player a score of 4, another gives a score of 8 and the final choice gives a score of 5. Since the minimising player wants to choose the minimum score and we must always assume that both players play optimally (the max player will always choose the highest choice and the min player will always choose the lowest choice in a minimax tree), the red triangles are therefore filled up as shown in the diagram below the paragraph. For the first minimising state, out of the possible positions evaluated as 4, 8, and 5, the minimising player, when playing optimally, will always choose the move that evaluates as the lowest score, hence choosing the move that results in score 4. This principle is the same for the other two triangles shown for the minimising player. Finally, from the three minimising positions with the scores 4, 3, and 2, the maximising player will always choose the highest possible choice. The max player, which represents our program, therefore chooses the position which gives a score of 4, and plays the move which is assigned to it, labelled below: The example shown above is an example of depth-limited minimax. The depth for the minimax shown above is 2 since we evaluated all the possible positions 2 moves ahead, followed by an evaluation of all the possible positions 1 move ahead which was then used to figure out the move that the program plays. This is a repetitive process where the score for the positions less than x moves away are used to evaluate the score for the position x-1 positions away, until we reach the current position and therefore evaluate the best move. Because of this property, the minimax function is a recursive function. It is also an example of a depth first search algorithm.
Depth-limited minimax is used when the game has too many possibilities for the computer to evaluate every single position from start to end. This kind of function is used in games like chess which have over 13^64 possible positions. Depth-limited minimax, although faster, doesn’t always play the optimal move in certain positions. The higher the depth, the higher the chance of the computer playing an optimal move. Minimax without depth, however, will always play the best move. Minimax without depth is used in games like tic-tac-toe where there aren’t that many possible positions. Minimax without depth works similarly, except that the AI is programmed to evaluate every single position from start until the end and find a forced win that happens in the least number of moves possible. Minimax is widely used to create game engines. However, you may realise that for complex games like chess, depth limits above 3 can take a huge amount of time with just minimax. To counter that, programmers use a process called alpha-beta pruning which drastically reduces the amount of time needed. Alpha-beta pruning, used along with minimax, can be used to create game engines that are much smarter than any human is.
1 Comment
2/8/2022 10:38:33 am
Takipçi paketi mi arıyorsunuz? Instagram Türk takipçi satın al seçeneği ile sizlerde istediğiniz paketlere ulaşım sağlayabilirsiniz. Tamamen garantili bu hizmetler adresimizde sizleri bekliyor.
Reply
Leave a Reply. |
Our AuthorsWe are high school and college students from around the world who are passionate about maths, and want to share that passion with others. Categories
All
|