Gobang AI Algorithm, Part 3: Alpha-Beta Pruning

Source: Internet
Author: User

Pruning is a must.

In the previous article we covered minimax search, but a pure minimax search has no practical value on its own.

A simple calculation shows why. Suppose each move has on average 50 candidates and we search four layers deep: the number of nodes searched is 50^4 = 6,250,000. On my Core i7 computer, no more than about 50,000 nodes can be evaluated per second, so 6.25 million nodes take more than 100 seconds. Making the player wait 100 seconds while the computer thinks is clearly unacceptable; in practice the thinking time should be kept under 5 seconds.
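The estimate above can be reproduced with a few lines of JavaScript. This is only a back-of-the-envelope sketch; the function names fullSearchNodes and secondsNeeded are made up for illustration, and the inputs are the figures quoted in the text.

```javascript
// Number of nodes a full (unpruned) search visits at the deepest layer:
// branching^depth. Hypothetical helper, not part of the article's code.
function fullSearchNodes(branching, depth) {
  return Math.pow(branching, depth);
}

// Time needed at a given evaluation speed (nodes per second).
function secondsNeeded(nodes, nodesPerSecond) {
  return nodes / nodesPerSecond;
}

const nodes = fullSearchNodes(50, 4);         // 50 candidates, 4 layers
const seconds = secondsNeeded(nodes, 50000);  // about 50,000 nodes per second

console.log(nodes);   // 6250000
console.log(seconds); // 125
```

At 125 seconds per move, the search is far above the 5-second target, which is why pruning is needed.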

By the way, the search depth should be an even number of layers. Odd layers are the AI's moves and even layers are the player's; if the AI considers its own next move without considering how the player will defend against it, the evaluation is obviously flawed.
The AI also needs to think at least 4 layers deep. With fewer than 4 layers it only sees the immediate gain, and its play will be very weak. With 6 layers it can basically beat ordinary players with ease (ordinary players being those who have not specifically studied Gobang, whose playing strength is roughly 4 layers).

Alpha Beta Pruning principle

The basic premise of alpha-beta pruning is that neither side will make a choice that is bad for itself. Given this premise, if a node is clearly unfavorable to the side that would have to choose it, the node can be cut off directly.

As mentioned earlier, the AI selects the node with the maximum value in the Max layer, and the player selects the node with the minimum value in the Min layer. The following two cases are the unfavorable choices for each side:

In the Max layer, suppose the maximum value found so far in the current layer is X. If, while evaluating the next node, one of its children (in the Min layer below) turns out to be smaller than X, that node can be cut off right away.

To explain: in the Max layer we keep the maximum value X found so far. If a child of the next node produces a value Y that is smaller than X, then, because the player always picks the minimum, that node's final score cannot exceed Y. Since Y is already below X, the AI would never choose this node, so the rest of it does not need to be computed.

Put simply, the AI has found that this move is more favorable to the player, so of course it will not play it.

In the Min layer, suppose the minimum value found so far in the current layer is Y. If, while evaluating the next node, one of its children (in the Max layer below) turns out to be larger than Y, that node can be cut off right away.

The reasoning is the same: if the player finds that a move is actually more favorable to the AI, the player will not play it.

I was too lazy to draw my own illustration, so the one below is taken directly from Wikipedia:

Look at the second layer, the Min layer. By the time we reach its third node, the earlier nodes are already known to be 3 and 6, so the minimum so far is 3. While evaluating the third node, its first child turns out to be 5. The third node's children are compared by Max, which picks the largest value, so the value of this node cannot be less than 5. Its remaining children therefore do not need to be evaluated: this node cannot come in below 5, while the same layer already contains a node with a value of 3.

In fact, the third-level node with a score of 7 does not need to be calculated.

That is pruning at a MAX node. Pruning at a MIN node works the same way, so I will not repeat it. The alpha and beta in alpha-beta pruning refer to the cutoffs at Max and Min nodes respectively.
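To make the pruning rule concrete, here is a minimal sketch on a tiny hand-made tree. The tree is a hypothetical example, not the Wikipedia diagram, and the function names minimax and alphaBeta are illustrative. The sketch uses the common textbook naming, in which alpha is the score the maximizing side is already guaranteed and beta is the score the minimizing side is already guaranteed; this may not match the parameter roles in the article's own functions.

```javascript
// A tree is either a number (a leaf score) or an array of child trees.
// Hypothetical tree: the root picks the minimum of three nodes, and each
// of those nodes picks the maximum of its leaves.
const tree = [[3, 1], [2, 6], [5, 7, 4]];

// Plain minimax: visits every leaf.
function minimax(node, maximizing, stats) {
  if (typeof node === 'number') {
    stats.leaves++;
    return node;
  }
  const values = node.map(child => minimax(child, !maximizing, stats));
  return maximizing ? Math.max(...values) : Math.min(...values);
}

// Minimax with alpha-beta pruning: stops scanning a node's children as
// soon as the node can no longer influence the result.
function alphaBeta(node, maximizing, alpha, beta, stats) {
  if (typeof node === 'number') {
    stats.leaves++;
    return node;
  }
  let best = maximizing ? -Infinity : Infinity;
  for (const child of node) {
    const v = alphaBeta(child, !maximizing, alpha, beta, stats);
    if (maximizing) {
      best = Math.max(best, v);
      alpha = Math.max(alpha, best);
    } else {
      best = Math.min(best, v);
      beta = Math.min(beta, best);
    }
    if (alpha >= beta) break; // the other side would never allow this line
  }
  return best;
}

const full = { leaves: 0 };
const pruned = { leaves: 0 };
console.log(minimax(tree, false, full), full.leaves);       // 3 7
console.log(alphaBeta(tree, false, -Infinity, Infinity, pruned),
            pruned.leaves);                                 // 3 5
```

Both searches return 3, but the pruned search evaluates 5 leaves instead of 7: once the third node's first leaf scores 5, which is already above the 3 found earlier, its remaining leaves 7 and 4 are skipped.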

Code implementation

Although the principle took a lot of words to explain, the code is actually very simple to implement.

Add an alpha and a beta parameter to both the max and min functions. In the max function, if the value of a child node is found to be greater than alpha, the remaining nodes are no longer computed; this is Alpha pruning. In the min function, if the value of a child node is found to be less than beta, the remaining nodes are no longer evaluated; this is Beta pruning.

The code is implemented as follows:

var min = function(board, deep, alpha, beta) {
  var v = evaluate(board);
  total++;
  // Stop at the depth limit or when someone has already won.
  if (deep <= 0 || win(board)) {
    return v;
  }

  var best = MAX;
  var points = gen(board, deep);

  for (var i = 0; i < points.length; i++) {
    var p = points[i];
    // Try the player's move; p is assumed to be a [row, column] pair.
    board[p[0]][p[1]] = R.hum;
    v = max(board, deep - 1, best < alpha ? best : alpha, beta);
    board[p[0]][p[1]] = R.empty; // undo the move
    if (v < best) {
      best = v;
    }
    if (v < beta) { // AB pruning
      ABcut++;
      break;
    }
  }
  return best;
};

var max = function(board, deep, alpha, beta) {
  var v = evaluate(board);
  total++;
  if (deep <= 0 || win(board)) {
    return v;
  }

  var best = MIN;
  var points = gen(board, deep);

  for (var i = 0; i < points.length; i++) {
    var p = points[i];
    // Try the AI's move.
    board[p[0]][p[1]] = R.com;
    v = min(board, deep - 1, alpha, best > beta ? best : beta);
    board[p[0]][p[1]] = R.empty; // undo the move
    if (v > best) {
      best = v;
    }
    if (v > alpha) { // AB pruning
      ABcut++;
      break;
    }
  }
  return best;
};

According to Wikipedia, in the best case alpha-beta pruning reduces the node count to about its square root (the 1/2 power), that is, to around 50^2 = 2500 nodes. My actual tests were not that ideal. Still, the node count dropped to less than one-tenth of the previous figure: on average about 500,000 nodes per move, taking about 10 seconds. Compared with the roughly 6 million nodes before, this is a big improvement.
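As a quick check of these figures, the sketch below compares the ideal square-root node count with the measured one. The variable names are illustrative, and the 500,000 figure is simply the rough average quoted above.

```javascript
// Full search: 50 candidates per move, 4 layers deep.
const fullNodes = Math.pow(50, 4);          // 6250000

// Ideal alpha-beta: about the square root of the full node count.
const idealNodes = Math.sqrt(fullNodes);    // 2500

// Rough average node count reported after adding pruning.
const measuredNodes = 500000;
const fraction = measuredNodes / fullNodes; // 0.08

console.log(idealNodes, fraction);
```

The measured search visits about 8% of the full tree, which is indeed under one-tenth, but still about 200 times the ideal 2500 nodes.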

However, even with alpha-beta pruning, the search can only reach four layers, which is the level of an ordinary player who does not play Gobang regularly. Each additional layer increases the time required, or the number of nodes evaluated, exponentially. So with the current code, searching to the sixth layer is very difficult.

Our time complexity is the exponential function M^N, where the base M is the number of child nodes of each node and N is the number of layers searched. Pruning cuts off many useless branches, which is roughly equivalent to reducing N. The next step is to reduce M: if M can be halved, the average time for a four-layer search drops to 0.5^4 = 0.0625 times the original, that is, from 10 seconds to less than 1 second.
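The effect of shrinking M can be sketched as follows. The helper name speedupFactor is hypothetical, and the calculation assumes that search time is proportional to M^N, as stated above.

```javascript
// If the branching factor M is scaled by `ratio`, a search of `layers`
// layers costs ratio^layers times as much, since (ratio * M)^N equals
// ratio^N * M^N. Hypothetical helper for illustration.
function speedupFactor(ratio, layers) {
  return Math.pow(ratio, layers);
}

const factor = speedupFactor(0.5, 4); // halve M, search 4 layers
const newSeconds = 10 * factor;       // starting from about 10 seconds

console.log(factor);     // 0.0625
console.log(newSeconds); // 0.625
```

Under the same assumption, halving M in a six-layer search would cut the cost to 0.5^6, or about 1.6% of the original, which is why reducing M pays off more the deeper the search goes.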

So where does M come from? M is the number of candidate empty positions returned by the gen function. The gen function has a great deal of room for optimization, and the optimized gen function is in fact a heuristic search function.
