Step-by-Step Gomoku (Wuziqi) AI [3]: The Cornerstone, Fail-Soft Alpha-Beta Pruning

Note: The evaluation method was updated today. The main change is a set of variables that record the number of black or white stones on each vector, so that only the matching templates are tried; this changes the pos class. In addition, a bug in the vector evaluation code that mixed up black and white was found. The interface code was not using addpipe, which has been corrected, although the pcgo function was overlooked at first. The source code has been uploaded again. The search can now basically iterate to depth 4 or 5 without misjudging positions. Because the time went into fixing the evaluation, the transposition table (the "replacement table") code still has problems, mainly the choice between a fast array and a fast hash table, so it is not finished; I expect to complete it tomorrow or the day after. After that I will optimize the evaluation function and plan to score vectors with a look-up table instead of template matching. That said, when measured with a profiler, the evaluation function does not take up much of the processing time; the most prominent cost is the pcgo function, that is, the alpha-beta pruning function, and the test results show the same.

First of all, this code borrows heavily from open-source chess programs, mainly the open-source code published on the Xiangqi Wizard ("chess little wizard") website. The framework is basically the same, so there is nothing new here. Thanks to those authors for their selfless contribution! Some of the implementation details below may differ, even considerably, but the basic framework is the same.

I have just finished the transposition table (the "replacement table"), that is, the sixth program, but I deleted it because it was still messy and unclear; I will rewrite it later. Testing with the transposition table showed that our evaluation function really does have a big problem: we need a much faster one. In general, iterative deepening to depth 7 is satisfactory and depth 6 is barely acceptable, but my final tests still only reach depth 4, or even 3 or 2. The program does use search extensions, which make the transposition table richer; it does use static evaluation, and it is no faster without it; and it does use null-move pruning, whose gain in playing strength matters much more than its time cost. So the slowness is entirely due to our evaluation function. Even so, I have decided to write the whole series with the current evaluation function; this annoying problem may be solved after part 6. So let's continue and discuss how to make the program actually play a move. This will be a very exciting moment!

1. How does the program choose a move?

The program decides which move to play based on scores. When we play, we always consider what will happen if we move here and what will happen if we move there, and the program does the same. But it is not that easy to implement.

2. How does alpha-beta pruning work?

It rests on one idea: if positions are judged by score, then whoever is to move will try to make their own score higher. This is how the program guesses the opponent's moves. If the evaluation function is our score minus the opponent's score, then the more positive the result, the closer we are to winning, and the more negative, the closer we are to losing. So we always look for the move with the higher score, and a move whose score falls outside a certain bound (for example, a right-hand bound of 5) is cut off. In the program, "cut off" means that we stop scanning and exit the loop.
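As a small illustration of the "our score minus the opponent's score" convention, here is a sketch in Python (used instead of the article's VB.NET only so that it runs on its own). The function name and the sample numbers are hypothetical; the point is that the same position scores +x for one side and -x for the other, which is why the recursive search can simply negate the value returned by the opponent's search.

```python
def relative_score(own: int, opponent: int) -> int:
    """Score of a position from the point of view of the side to move."""
    return own - opponent


# Hypothetical raw scores for one position.
black_raw, white_raw = 120, 45

from_black = relative_score(black_raw, white_raw)  # black to move
from_white = relative_score(white_raw, black_raw)  # white to move

# One side's gain is exactly the other side's loss.
print(from_black, from_white)
```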

3. How to implement this recursive function?

There is more than one way to design the recursion, but it has to serve our goal, which is to find the highest-scoring move. There are several key points:

A. Positions must be scored and the scores compared.

B. The highest score must be recorded.

C. The search must alternate between the two sides' moves.

Obviously, the recursive call sits inside the loop over candidate points, and the highest-scoring point can be recorded in a global variable. Next, let's discuss where the evaluation function should be called:

It can only go before the recursive call or after it. What we want is not infinite recursion, which would be a job that never finishes (the player would wait until the flowers wilt and then angrily close the program, or even shut down the computer), so the recursion needs a depth limit. A good practice is to evaluate the position when the recursion reaches the depth we require.

In fact, alpha-beta pruning hands the job of evaluating after every ply to "iterative deepening". In other words, the idea of evaluating at every step is split out into a separate mechanism, and that mechanism is iterative deepening. The advantage is obvious. Recall how alpha-beta pruning loops: it traverses the tree depth-first, so until the traversal is finished we cannot sort the moves of each layer to find the best move in that layer. Iterative deepening solves this important problem: it lets alpha-beta search 1 layer first, then layers 1 and 2, then layers 1, 2, and 3, and so on.

You may think this wastes time, but the time spent on the first N-1 layers is a drop in the bucket compared with layer N, and on top of that we have the transposition table. We can therefore use the previous search to order the moves better, which makes cutoffs easier. And when the time limit is reached, we can return the best result found so far without finishing the current layer, because that result is no worse than the best result of the previous layer: it may be equally good or better, but not worse. Of course, iterative deepening and the transposition table are topics for later. Our conclusion is that alpha-beta pruning looks like this:

'Inputs: alpha value, beta value, and remaining depth.
Function SearchFull(vlAlpha As Integer, vlBeta As Integer, nDepth As Integer) As Integer
    'At the depth limit, return the evaluation.
    If nDepth <= 0 Then Return Evaluate()
    'Generate all reasonable moves.
    GenerateMoves()
    'Try each move in turn.
    For i = 1 To (number of moves)
        pos.AddPiece(mvs(i))
        'Recursion. Once this call returns, the stack unwinds and the scoring starts.
        vl = -SearchFull(-vlBeta, -vlAlpha, nDepth - 1)
        pos.DelPiece(mvs(i))
        'Record a larger alpha and the corresponding move, and perform the beta cutoff.
    Next
    Return (best score)
End Function

This is the entire alpha-beta recursive function framework.
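To make the framework and the iterative deepening idea from the discussion above concrete, here is a minimal runnable sketch in Python (not the article's VB.NET). It does not use a gomoku board; as an assumption for illustration it uses a tiny take-away game (take 1 to 3 stones from a pile, whoever takes the last stone wins), and the names `MATE`, `moves`, `search`, and `iterative_deepening` are hypothetical. The static evaluation is a stand-in that always returns 0, and no time limit is modeled.

```python
MATE = 1000  # hypothetical score for a won position


def moves(pile):
    """Legal moves in the toy game: take 1, 2 or 3 stones."""
    return [take for take in (1, 2, 3) if take <= pile]


def search(pile, alpha, beta, depth):
    """Negamax alpha-beta; the score is from the side to move's view."""
    if pile == 0:
        return -MATE  # the opponent took the last stone, so we have lost
    if depth == 0:
        return 0      # stand-in for a real static evaluation
    best = -MATE
    for take in moves(pile):
        score = -search(pile - take, -beta, -alpha, depth - 1)
        if score > best:
            best = score
        if score >= beta:
            break     # beta cutoff
        if score > alpha:
            alpha = score
    return best


def iterative_deepening(pile, max_depth):
    """Search depth 1, then 2, ... and keep the best move of each pass."""
    results = []
    best_move = None
    for depth in range(1, max_depth + 1):
        ordered = moves(pile)
        if best_move in ordered:
            # Try the previous pass's best move first.
            ordered.remove(best_move)
            ordered.insert(0, best_move)
        alpha, beta = -MATE, MATE
        best_score = -MATE - 1
        for take in ordered:
            score = -search(pile - take, -beta, -alpha, depth - 1)
            if score > best_score:
                best_score, best_move = score, take
            alpha = max(alpha, score)
            if alpha >= beta:
                break  # a forced win was found; nothing can beat it
        results.append((depth, best_move, best_score))
    return results


for depth, move, score in iterative_deepening(5, 3):
    print(depth, move, score)
```

For a pile of 5, the first two passes see nothing decisive and report move 1 with score 0; the depth-3 pass finds that taking 1 stone wins and reports score 1000. A real engine would also stop the loop when its time budget runs out and return the result of the last completed pass.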

 

With that, our complete alpha-beta function should look very familiar:

 

'Fail-soft ("beyond the boundary") Alpha-Beta search. The return value is a score; the best move is recorded in a global variable (pos.mvResult).
Public Function SearchFull(vlAlpha As Integer, vlBeta As Integer, nDepth As Integer) As Integer
    'Loop variable; highest index of the move array (number of moves - 1)
    Dim i, nGenMoves As Integer
    'Score, best score, best move
    Dim vl, vlBest, mvBest As Integer
    'All generated moves (fixed length; the number actually in use is given by nGenMoves)
    Dim mvs(MAX_GEN_MOVES) As Byte
    'A complete Alpha-Beta search consists of the following phases:

    '1. At the horizon, return the static evaluation of the position.
    If nDepth = 0 Then
        Return pos.Evaluate()
    End If

    '2. Initialize the best value and the best move.
    vlBest = -MATE_VALUE 'lets us detect that not a single move was searched (mate)
    mvBest = 0 'lets us detect whether a Beta or PV move was found, so it can be saved to the history table

    '3. Generate all moves and sort them by the history table.
    nGenMoves = pos.GenerateMoves(mvs)
    'GenerateMoves returns the highest index, so the number of moves to sort is nGenMoves + 1.
    Array.Sort(mvs, 0, nGenMoves + 1, mCompare)

    '4. Play these moves one by one and recurse.
    For i = 0 To nGenMoves
        pos.AddPiece(mvs(i))
        vl = -SearchFull(-vlBeta, -vlAlpha, nDepth - 1)
        pos.DelPiece(mvs(i))

        '5. Compare against the Alpha-Beta bounds and cut off.
        If (vl > vlBest) Then 'found a new best value (not yet known whether it is an Alpha, PV or Beta move)
            vlBest = vl '"vlBest" is the best value to be returned; it may lie outside the Alpha-Beta bounds
            If (vl >= vlBeta) Then 'found a Beta move
                mvBest = mvs(i) 'a Beta move is saved to the history table
                Exit For 'Beta cutoff
            End If
            If (vl > vlAlpha) Then 'found a PV move
                mvBest = mvs(i) 'a PV move is saved to the history table
                vlAlpha = vl 'narrow the Alpha-Beta window
            End If
        End If
    Next

    '6. All moves have been searched. Save the best move (unless it is an Alpha move) to the history table and return the best value.
    If vlBest = -MATE_VALUE Then
        'If this is a mate, score it by the number of moves from the root
        Return pos.nDistance - MATE_VALUE
    End If
    If mvBest <> 0 Then
        'If it is not an Alpha move, save the best move to the history table.
        pos.nHistoryTable(mvBest) += nDepth ^ 2
        If pos.nDistance = 0 Then
            'At the root there is always a best move (a full-window search cannot fall outside the bounds). Save this move.
            pos.mvResult = mvBest
        End If
    End If
    'Return the best score.
    Return vlBest
End Function
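The phrase "may lie outside the Alpha-Beta bounds" is what separates this fail-soft search from the fail-hard variant. The following Python sketch (not the article's VB.NET) shows the difference on a hypothetical hand-built tree: inner lists are positions, integers are leaf scores from the point of view of the side to move at that leaf. The names `TREE`, `fail_soft`, and `fail_hard` are assumptions made for this illustration.

```python
INF = 10**9

# Hypothetical two-ply tree: the root has two replies, each with two leaves.
TREE = [[3, 5], [-2, 9]]


def fail_soft(node, alpha, beta):
    """Returns the best score seen, even when it lies outside [alpha, beta]."""
    if isinstance(node, int):
        return node
    best = -INF
    for child in node:
        score = -fail_soft(child, -beta, -alpha)
        if score > best:
            best = score
            if score >= beta:
                break          # beta cutoff
            if score > alpha:
                alpha = score  # narrow the window
    return best


def fail_hard(node, alpha, beta):
    """Never returns a value outside [alpha, beta]."""
    if isinstance(node, int):
        return node
    for child in node:
        score = -fail_hard(child, -beta, -alpha)
        if score >= beta:
            return beta
        if score > alpha:
            alpha = score
    return alpha


print(fail_soft(TREE, -INF, INF))  # full window: the exact value
print(fail_soft(TREE, 10, 20))     # window too high: result falls below alpha
print(fail_hard(TREE, 10, 20))     # fail-hard just reports alpha
```

With the deliberately wrong window (10, 20), the fail-hard search can only answer "at most 10", while the fail-soft search returns 3, a tighter upper bound that a later re-search or a transposition table entry can put to use.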

 

 

One very important point needs explaining here: the history table. We all know that it is used for move ordering, and that ordering moves by the history table produces cutoffs sooner. The effect is easy to see. If you comment out this statement:

pos.nHistoryTable(mvBest) += nDepth ^ 2

then even depth 3 in the sample code becomes terribly slow. So let's explain how the history table does its job:

Look at the structure of the history table:

Public nHistoryTable(224) As Integer

That is, the table is indexed by board coordinate: mvBest is the coordinate of the best move, and the element stored there is the score of that coordinate. A coordinate that has often been the best move has a relatively high score. How does this score affect the ordering? Let's look at the sorter:

 

'Sorter: orders the candidate moves by their history-table scores.
Class mvsCompare
    Implements System.Collections.IComparer
    'Basis for the ordering: a reference to the history table.
    Public Shared MS() As Integer
    Public Function Compare(x As Object, y As Object) As Integer Implements System.Collections.IComparer.Compare
        'Descending order: moves with a higher history score come first.
        Return MS(y) - MS(x)
    End Function
End Class

 

The MS array is a reference to the history table; I pass it in directly. The conclusion is then obvious:

The higher a move's score in the history table, the earlier that move appears once the generated moves are sorted. In other words, in later searches a move to that square is searched first. How strongly it is preferred depends on depth (specifically, on the accumulated squares of nDepth): the larger the depth at which a move was found to be best, the more its priority grows.
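The ordering rule can be sketched in a few lines of Python (not the article's VB.NET). The helper names `record_good_move` and `order_moves` and the sample squares are hypothetical; the table size of 225 and the depth-squared bonus follow the article.

```python
def record_good_move(history, move, depth):
    """Reward a Beta or PV move; deeper searches earn a larger bonus."""
    history[move] += depth * depth


def order_moves(moves, history):
    """Sort candidate squares so that higher history scores come first."""
    return sorted(moves, key=lambda sq: history[sq], reverse=True)


history = [0] * 225                # one counter per square of a 15x15 board
record_good_move(history, 112, 3)  # +9
record_good_move(history, 98, 2)   # +4
record_good_move(history, 98, 1)   # +1, so square 98 now holds 5

print(order_moves([97, 98, 112, 113], history))
```

Squares 112 and 98 move to the front, and squares that never caused a cutoff keep their original relative order because Python's sort is stable.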

 

The pos class involved here is nothing more than a set of helper functions:

 

Public Class mPosition
    'The 72 vectors (lines) on which five in a row can be formed
    Public Vectors As New mVectors
    'Side to move: 0 = white, 1 = black
    Public sdPlayer As Integer
    'Board squares: 0 = white, 1 = black, 2 = empty
    Public ucpcSquares As mBitBoard
    'Number of moves from the root node
    Public nDistance As Integer
    'Player subject to forbidden-move restrictions
    Public RtPlayer As Integer = 2
    'The move chosen by the computer
    Public mvResult As Integer
    'History table
    Public nHistoryTable(224) As Integer

    'Initialize the board class
    Sub New()
        ucpcSquares = New mBitBoard()
    End Sub

    'Clear the history table
    Public Sub ClearnHistoryTable()
        mvResult = 0
        Array.Clear(nHistoryTable, 0, 225)
    End Sub

    'Switch the side to move
    Sub ChangeSide()
        sdPlayer = 1 - sdPlayer
    End Sub

    'Place a stone on the board
    Sub AddPiece(sq As Integer)
        'Update the board
        ucpcSquares.Set(sq, sdPlayer)
        'Update the dirty flag and the stone count of every vector through this square.
        For Each v As mVector In Vectors.hs(sq)
            v.pipecount(ucpcSquares.Get(sq)) += 1
            v.update = True
        Next
        'Switch the side to move
        ChangeSide()
        'Update the move counter
        nDistance += 1
    End Sub

    'Take a stone off the board
    Sub DelPiece(sq As Integer)
        For Each v As mVector In Vectors.hs(sq)
            v.pipecount(ucpcSquares.Get(sq)) -= 1
            v.update = True
        Next
        ucpcSquares.Set(sq, 2)
        ChangeSide()
        nDistance -= 1
    End Sub

    'Position evaluation function
    Function Evaluate() As Integer
        Return Vectors.Evaluate(ucpcSquares, sdPlayer, RtPlayer)
    End Function

    'Initialize the board
    Sub Startup()
        sdPlayer = 1
        nDistance = 0
        For i As Integer = 0 To 224 'BitArray's SetAll is not used here
            ucpcSquares.Set(i, 2)
        Next
    End Sub

    'Fill mvs with all reasonable moves and return the highest index used (number of moves - 1)
    Function GenerateMoves(mvs() As Byte) As Integer
        Dim GenBoard = ucpcSquares.GetGeneratePoints
        Dim i As Integer = 0, nGenMoves As Integer = 0
        For i = 0 To 224
            If GenBoard(i) Then
                mvs(nGenMoves) = i
                nGenMoves += 1
            End If
        Next
        Return nGenMoves - 1
    End Function

End Class

 

 

Now we can explain why white is 0 and black is 1. Of course, black as 0 and white as 1 would work just as well. The point is to simplify the side-switching function and to make array indexing easier. In practice the benefit is considerable.
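Why the 0/1 encoding helps can be seen in a stripped-down Python sketch of the make and unmake steps (not the article's VB.NET). The `Board` class and its `counts` list are hypothetical simplifications: the side to move flips with `1 - side`, and the colour stored in a square can index a two-element array directly, much as the article's pipecount is indexed by the stone colour.

```python
EMPTY = 2  # same encoding as the article: 0 = white, 1 = black, 2 = empty


class Board:
    def __init__(self, size=15):
        self.squares = [EMPTY] * (size * size)
        self.side = 1          # black moves first
        self.distance = 0      # moves played from the root
        self.counts = [0, 0]   # stones per colour, indexed by 0 or 1

    def add_piece(self, sq):
        self.squares[sq] = self.side
        self.counts[self.side] += 1
        self.side = 1 - self.side   # 0 becomes 1, 1 becomes 0
        self.distance += 1

    def del_piece(self, sq):
        colour = self.squares[sq]   # read the colour before clearing
        self.counts[colour] -= 1
        self.squares[sq] = EMPTY
        self.side = 1 - self.side
        self.distance -= 1


board = Board()
board.add_piece(112)
print(board.side, board.counts, board.distance)
board.del_piece(112)
print(board.side, board.counts, board.distance)
```

After the matching `del_piece` call the board is back in its starting state, which is exactly what the recursive search relies on between `AddPiece` and `DelPiece`.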

When reprinting, please credit the source:

http://www.cnblogs.com/zcsor/

 

Updated source code:

/Files/zcsor/qingyue lianzhu 0.3.7z

 

Coming up in the next part:

The original plan had four items: board pruning, iterative deepening, null-move pruning, and Chongqi extension. That has been reduced to three, because board pruning is already in the source code for this article: nested loops find all empty squares within three cells of any stone and use them as the basis for move generation.
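The board pruning described above can be sketched as follows in Python (not the article's VB.NET, and not the code in the download). The function name `generate_points` is hypothetical, and "within three cells" is assumed to mean a square neighbourhood of radius 3 around each stone; the actual GetGeneratePoints may define it differently.

```python
def generate_points(stones, size=15, radius=3):
    """Return the empty squares lying within `radius` cells of any stone.

    `stones` is an iterable of (row, col) pairs; the result is a sorted
    list of square indexes row * size + col.
    """
    occupied = set(stones)
    candidates = set()
    for row, col in occupied:
        for d_row in range(-radius, radius + 1):
            for d_col in range(-radius, radius + 1):
                r, c = row + d_row, col + d_col
                on_board = 0 <= r < size and 0 <= c < size
                if on_board and (r, c) not in occupied:
                    candidates.add((r, c))
    return sorted(r * size + c for r, c in candidates)


centre_only = generate_points([(7, 7)])
print(len(centre_only))
```

A single stone in the centre leaves 48 candidate squares instead of 224, which is where the saving comes from. An empty board yields no candidates at all, so the first move has to be handled separately, for example by playing the centre.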

 
