Today is part two: the player and the AI.
The player class mainly implements undo.
The AI is mainly search: the minimax algorithm and alpha-beta pruning.
1. Recording each move so that it can be undone
typedef struct ReversiStep
{
    ReversiBitBoard m_lastMap;

    ReversiStep& operator= (const ReversiStep& temp)
    {
        m_lastMap = temp.m_lastMap;
        return *this;
    }
} ReversiStep;
This directly records the board state before the move (whose turn it is to move is already recorded in ReversiBitBoard). Undo restores the board to the state before that player's move, and the same player then moves again.
When a move is made, a ReversiStep is filled in and appended to a linked list. To undo, the tail node is popped from the list and the board is restored to the state stored in that node. That is all undo needs.
A full game takes at most 60 moves, so an object pool is used here: the memory for 60 steps is allocated once, which avoids repeated memory allocation while moves are made and undone.
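The record-and-restore idea can be sketched in a few lines. This is only an illustration, not the code used in this article: it swaps the linked list and object pool for a fixed array of 60 slots, and the names `Board` and `UndoHistory` are hypothetical stand-ins.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the board type: two 64-bit masks plus the side to move.
struct Board {
    std::uint64_t black = 0;
    std::uint64_t white = 0;
    bool blackToMove = true;
};

// Fixed-capacity undo history. All 60 slots are allocated once, so
// recording and undoing moves never allocates memory.
class UndoHistory {
public:
    static constexpr std::size_t kMaxSteps = 60;

    // Save the board as it was before the move about to be played.
    void record(const Board& before) {
        assert(m_count < kMaxSteps);
        m_steps[m_count++] = before;
    }

    // Restore the most recently saved board.
    // Returns false if there is nothing to undo.
    bool undo(Board& board) {
        if (m_count == 0) return false;
        board = m_steps[--m_count];
        return true;
    }

    std::size_t size() const { return m_count; }

private:
    std::array<Board, kMaxSteps> m_steps{};
    std::size_t m_count = 0;
};
```

A caller would invoke `record(board)` just before applying a move and `undo(board)` to take it back.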
Undo is simple and easy to implement. The key question is how to build the AI.
2. AI
As far as I know, Othello AI falls into two main kinds:
The first is template-based (not C++ templates, but templates of game positions).
Put simply, a large number of game records are analysed to find out which formations at which positions are an advantage or a disadvantage, and each formation is given a score.
When the AI plays, it looks at each candidate square, checks which template in the library the move would form and what its score is, and from that decides whether the move is good or bad and whether it should play there.
(Of course, I do not have a large collection of analysed games to study, so this method was not used in the end.)
The second is real-time analysis.
Before every move, the AI searches the current position to see where it can play, whether each of those squares is an advantage or a disadvantage, and how many points it earns;
then, after it plays at each of those squares, where the opponent can play and how many points the opponent earns;
and so on, searching several moves ahead.
It then weighs everything up to find the best square for the current move: the one that gains the greatest advantage for itself and leaves the opponent the smallest.
This raises the question of an evaluation function: what counts as an advantage, and what counts as a disadvantage?
Valuation table:
Each of the 64 squares on the board is given a value based on experience. This is a first estimate of which squares to seize, and which squares to induce (or force) the opponent to take by leaving them no other choice.
For example:
Take a square like 1A. Once you occupy it, the opponent cannot flip it no matter what (such a disc is called a stable disc below). If you can take this square, it is certainly yours and will never be captured, so try to occupy squares like 1A, that is, the 4 corners.
Squares like 1B, 2A and 2B are different: if you occupy one, the opponent can easily take the corner, so try to avoid them (there are 12 such squares) unless there is nowhere else to play.
Based on this experience, a valuation table is set up for the board in which different squares have different values. The 4 corners (dark green) have the highest score, here 0x01000000. Squares like 1B, 2A and 2B (red) have the lowest score, here 0x00000001. The table is as follows:
const int g_weight[REVERSI_MAX_ROW][REVERSI_MAX_COLUMN] = {
    {0x1 << 24, 0x1, 0x1 << …, 0x1 << 16, 0x1 << 16, 0x1 << …, 0x1, 0x1 << 24},
    {0x1, 0x1, 0x1 << …, 0x1 << 4, 0x1 << 4, 0x1 << …, 0x1, 0x1},
    {0x1 << …, 0x1 << …, 0x1 << 8, 0x1 << …, 0x1 << 8, 0x1 << …, 0x1 << …, 0x1 << 20},
    {0x1 << 8, 0x1 << 4, 0x1 << 0, 0, 0x1 << 8, 0x1 << 4, 0x1 << 16},
    {0x1 << …, 0x1 << 4, 0x1 << 8, 0, 0, 0x1 << 8, 0x1 << 4, 0x1 << 16},
    {0x1 << …, 0x1 << …, 0x1 << …, 0x1 << 8, 0x1 << 8, 0x1 << …, 0x1 << …, 0x1 << 20},
    {0x1, 0x1, 0x1 << …, 0x1 << 4, 0x1 << 4, 0x1 << …, 0x1, 0x1},
    {0x1 << 24, 0x1, 0x1 << …, 0x1 << 16, 0x1 << 16, 0x1 << …, 0x1, 0x1 << 24}
};
Mobility:
This is the number of squares where you can currently play.
The valuation table above says to seize the favourable squares and force the opponent onto unfavourable ones. This leads to the concept of mobility.
(As the saying goes: walk your own road and leave others no road to walk. That is the idea here.)
To occupy favourable squares yourself, you want as many playable squares as possible, so that you can pick the most favourable one.
(There are only 60 squares to play on the whole board. The more squares you can play and the fewer the opponent can, the more of the opponent's road you have taken.)
The opponent should have as few playable squares as possible, so that they are forced to play where they do not want to but have to.
(This is leaving others no road to walk.)
Of course there are special cases. For example, your side has 3 playable squares, but all of them are mediocre, while the opponent has only one playable square, but it is a corner.
So mobility has to be combined with the valuation table. A simple way is to sum the valuation-table values of all the squares each side can play, then make the opponent's sum as small as possible and your own as large as possible.
For example, your side has 3 moves worth 5, 10 and 15 points, while the opponent has only 1 move, but it is worth 100 points. Such a line should certainly not be preferred.
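The combination of mobility and the valuation table can be written as a small calculation. The sketch below is only an illustration: the function names are hypothetical, and it assumes the caller has already collected the table values of each side's playable squares.

```cpp
#include <numeric>
#include <vector>

// Sum of the valuation-table values of every square one side can play.
int weightedMobility(const std::vector<int>& moveValues) {
    return std::accumulate(moveValues.begin(), moveValues.end(), 0);
}

// Positive when our playable squares are worth more than the opponent's,
// negative when the opponent's are worth more.
int mobilityScore(const std::vector<int>& ours, const std::vector<int>& theirs) {
    return weightedMobility(ours) - weightedMobility(theirs);
}
```

With the numbers from the example, our moves sum to 5 + 10 + 15 = 30 and the opponent's single move is worth 100, so the score is 30 - 100 = -70, even though we have three times as many moves.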
Stable discs:
(The term is also translated as "definite discs".)
The rule of Othello is that opposing discs sandwiched between two of your own discs are flipped. A stable disc is one that the opponent can never flip, no matter how play continues.
Clearly, the 4 corners are stable discs.
Another example:
All the white discs inside the box are stable discs.
When one side has 33 stable discs, it is certain to win, since that is more than half of the 64 squares.
There are other concepts as well, such as walls and balance, which are not introduced here. If you are interested, you can search Baidu for "guide to Othello".
The AI implemented in this article searches using the valuation table.
However, the cost of the search grows exponentially with the number of moves searched. On my machine the largest depth with no noticeable delay is about 9 moves; at 10 moves it stalls for about 1 to 2 seconds.
So it is impossible to search every line, and some pruning is needed.
For each step of the search:
First, check whether a corner can be taken. If it can, the current branch is not searched any further (even if the maximum depth has not been reached), and the search moves on to the other possible moves.
(This is in effect pruning based on prior experience.)
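The corner check can be sketched on its own. This is a minimal illustration under stated assumptions: `Square`, `kCorners` and `firstPlayableCorner` are hypothetical names, squares are 0-based (row, column) pairs, and `canPlay` stands for whatever legality test the board provides.

```cpp
// A square on an 8x8 board, 0-based.
struct Square {
    int row;
    int col;
};

// The four corners.
const Square kCorners[4] = {{0, 0}, {0, 7}, {7, 0}, {7, 7}};

// Returns the index in kCorners of the first corner the side to move can
// play, or -1 if no corner is playable. A search would score that corner
// and stop descending this branch instead of searching deeper.
template <typename CanPlay>
int firstPlayableCorner(CanPlay canPlay) {
    for (int i = 0; i < 4; ++i) {
        if (canPlay(kCorners[i].row, kCorners[i].col)) {
            return i;
        }
    }
    return -1;
}
```

The benefit of this shortcut is that the whole subtree below a corner move is skipped, at the cost of trusting the rule of thumb that a corner is always worth taking.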
Second is the minimax search algorithm:
It assumes that the AI's opponent is as smart as possible and always picks the optimal move, that is, the choice most unfavourable to the AI.
So:
When the candidate results belong to the AI's move, choose the position with the highest final score.
When the candidate results belong to the player's move, choose the position with the lowest final score.
For example:
Suppose a circle represents an AI node and a square represents a player node.
Between A2 and A3, the AI obviously chooses A2, worth 10 points. Between A4 and A5, the AI obviously chooses A4, worth 20 points.
But between B1 and B2: if the player plays B2, the AI can get 20 points; if the player plays B1, the AI can only get 10 points. So the player obviously plays B1.
So for the move A1, the AI can only get 10 points in the end. This is the minimax algorithm.
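The A1/B1/B2 example can be reproduced with a short minimax sketch. It is an illustration rather than the search used in this article: `Node`, `leaf`, `inner` and `exampleTree` are hypothetical helpers, and the values of A3 (5) and A5 (15) are assumed, since the example only says they are lower than A2 and A4.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// A game-tree node. Leaves carry a score from the AI's point of view.
struct Node {
    int score = 0;               // used only by leaves
    std::vector<Node> children;  // empty for leaves
};

Node leaf(int score) {
    Node n;
    n.score = score;
    return n;
}

Node inner(std::vector<Node> children) {
    Node n;
    n.children = std::move(children);
    return n;
}

// aiToMove == true:  the AI picks the child with the highest value.
// aiToMove == false: the player picks the child with the lowest value.
int minimax(const Node& node, bool aiToMove) {
    if (node.children.empty()) return node.score;
    int best = minimax(node.children[0], !aiToMove);
    for (std::size_t i = 1; i < node.children.size(); ++i) {
        int value = minimax(node.children[i], !aiToMove);
        best = aiToMove ? std::max(best, value) : std::min(best, value);
    }
    return best;
}

// Position after the AI's move A1: the player chooses between B1 and B2,
// then the AI chooses between A2/A3 or A4/A5.
Node exampleTree() {
    return inner({
        inner({leaf(10), leaf(5)}),   // B1: A2 = 10, A3 = 5 (assumed)
        inner({leaf(20), leaf(15)}),  // B2: A4 = 20, A5 = 15 (assumed)
    });
}
```

Calling `minimax(exampleTree(), false)` evaluates B1 to 10 and B2 to 20, and the player's minimum gives 10 for A1.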
Then there is alpha-beta pruning:
The maximum of A2 and A3 has already been chosen as 10, so B1 scores 10 points.
Between B1 and B2 the minimum is chosen. Since B1 scores 10 points, the final result between B1 and B2 is <= 10.
A4 scores 20 points, and between A4 and A5 the maximum is chosen, so the result between A4 and A5 is >= 20, which means the final result of B2 is >= 20.
So B1 is certain to be chosen. The A5 node has not been searched yet, but it can no longer affect the final choice, so it can be skipped.
This is pruning.
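The same tree shows the saving when pruning is added. The sketch below is a textbook alpha-beta search, not the exact cut-off test used in the listing at the end of this article. `Node`, `leaf`, `inner` and `exampleTree` are hypothetical helpers, the values of A3 (5) and A5 (15) are assumed, and `leavesVisited` is only there to make the pruning visible.

```cpp
#include <algorithm>
#include <climits>
#include <utility>
#include <vector>

struct Node {
    int score = 0;               // used only by leaves
    std::vector<Node> children;  // empty for leaves
};

Node leaf(int score) { Node n; n.score = score; return n; }
Node inner(std::vector<Node> children) { Node n; n.children = std::move(children); return n; }

// alpha: the best value the AI is already guaranteed.
// beta:  the lowest value the player is already able to hold the AI to.
int alphaBeta(const Node& node, bool aiToMove, int alpha, int beta, int& leavesVisited) {
    if (node.children.empty()) {
        ++leavesVisited;
        return node.score;
    }
    int best = aiToMove ? INT_MIN : INT_MAX;
    for (const Node& child : node.children) {
        int value = alphaBeta(child, !aiToMove, alpha, beta, leavesVisited);
        if (aiToMove) {
            best = std::max(best, value);
            alpha = std::max(alpha, best);
        } else {
            best = std::min(best, value);
            beta = std::min(beta, best);
        }
        if (alpha >= beta) break;  // remaining children cannot change the result
    }
    return best;
}

// Position after A1: the player picks B1 or B2, then the AI picks a leaf.
Node exampleTree() {
    return inner({
        inner({leaf(10), leaf(5)}),   // B1: A2 = 10, A3 = 5 (assumed)
        inner({leaf(20), leaf(15)}),  // B2: A4 = 20, A5 = 15 (assumed)
    });
}
```

Starting from `alphaBeta(exampleTree(), false, INT_MIN, INT_MAX, visited)`, B1 settles at 10, which sets beta to 10. Under B2, A4 already gives 20 >= 10, so the loop breaks and A5 is never evaluated: three leaves are visited instead of four.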
Then the score calculation:
The score of every move here is relative to the AI.
When the AI plays a square, it gets a positive score. When the opponent plays a square, the score is negative for the AI (the player's advantage is the AI's disadvantage).
For a node at the maximum search depth, the score is just the score of the square itself (since the search goes no further).
For an intermediate node, it should be the score of the square itself plus the score of the result the opponent chooses next. The score cannot simply be propagated back from the last move alone.
As an example:
Consider two cases, left and right.
Suppose a circle represents an AI node and a square represents a player node.
The score on each node is the valuation-table score of the square played at that node. Player nodes take negative scores.
If only the deepest node's score is used to compute the top node's score, then by the minimax algorithm above the AI's final score is 10 points on the left and 5 points on the right, so the AI chooses the left case with 10 points.
But along the way, the player gets a rather good score of 50 points.
The AI should not let the player gain such an advantage.
So the final score should combine the squares played by the opponent and their scores:
The AI's final score is then -30 points on the left and 15 points on the right, so the final choice is the right, not the left.
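The difference between the two ways of scoring can be checked with a small calculation. The sketch below scores a single line of play only, and every number in it is hypothetical: the left line is assumed to be AI 10, player 50, AI 10, and the right line AI 20, player 10, AI 5, with player moves stored as negative values.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Scores along one line of play, signed from the AI's point of view:
// AI moves are positive, player moves are negative.

// Scoring by the deepest move only.
int leafOnlyScore(const std::vector<int>& line) {
    assert(!line.empty());
    return line.back();
}

// Scoring by every move on the way down.
int cumulativeScore(const std::vector<int>& line) {
    return std::accumulate(line.begin(), line.end(), 0);
}

// Hypothetical lines of play (assumed values, see the text above).
std::vector<int> leftLine() { return {10, -50, 10}; }
std::vector<int> rightLine() { return {20, -10, 5}; }
```

With these assumed values, the deepest move alone prefers the left line (10 against 5), while the cumulative score prefers the right line (10 - 50 + 10 = -30 against 20 - 10 + 5 = 15).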
Well, that is the basic AI. Although it is only a very simple AI, it beats me fairly comfortably.
The specific code is given below.
ReversiPlayer.h
#ifndef _REVERSIPLAYER_H_
#define _REVERSIPLAYER_H_

#include "TBDLinkList.h"
#include "TObjectPool.h"
#include "ReversiCommon.h"
#include "ReversiBitBoard.h"

typedef struct ReversiStep
{
    ReversiBitBoard m_lastMap;

    ReversiStep& operator= (const ReversiStep& temp)
    {
        m_lastMap = temp.m_lastMap;
        return *this;
    }
} ReversiStep;

class ReversiPlayer
{
public:
    void Init(EnumReversiPiecesType type);
    void Play(ReversiBitBoard& reversi, char row_y, char column_x);
    void Cancel(ReversiBitBoard& reversi);
    EnumReversiPiecesType GetPlayerType();

private:
    void AddReversiStep(ReversiBitBoard& reversi);

    EnumReversiPiecesType m_playerType;
    TBDLinkList<ReversiStep> m_reversiStepList;
    TObjectPool<TBDLinker<ReversiStep>> m_reversiStepPool;
};

#endif
ReversiPlayer.cpp
#include "ReversiPlayer.h"

void ReversiPlayer::Init(EnumReversiPiecesType type)
{
    m_playerType = type;
    m_reversiStepList.Init(enum_DisableLock);
    m_reversiStepPool.Init(REVERSI_MAX_ROW * REVERSI_MAX_COLUMN, 0,
                           enum_DisableLock_ObjPool, enum_DisableAssign_ObjPool);
}

void ReversiPlayer::Play(ReversiBitBoard& reversi, char row_y, char column_x)
{
    // Save the board before the move so that it can be undone later.
    AddReversiStep(reversi);
    reversi.SetPieces(m_playerType, row_y, column_x);
    reversi.DoReversi(m_playerType, row_y, column_x);
    reversi.SwapPlayer();
}

EnumReversiPiecesType ReversiPlayer::GetPlayerType()
{
    return m_playerType;
}

void ReversiPlayer::AddReversiStep(ReversiBitBoard& reversi)
{
    TBDLinker<ReversiStep> *pLinker = m_reversiStepPool.Malloc();
    if (NULL != pLinker)
    {
        pLinker->m_value.m_lastMap = reversi;
        pLinker->m_pLinkList = NULL;
        m_reversiStepList.PushTail(pLinker);
    }
}

void ReversiPlayer::Cancel(ReversiBitBoard& reversi)
{
    // Pop the last recorded step and restore the board it holds.
    TBDLinker<ReversiStep> *pLinker = m_reversiStepList.PopTail();
    if (NULL != pLinker)
    {
        reversi = pLinker->m_value.m_lastMap;
        m_reversiStepPool.Free(pLinker);
    }
}
ReversiAI.h
#ifndef _REVERSIAI_H_
#define _REVERSIAI_H_

#include "TObjectPool.h"
#include "tool.h"
#include "ReversiCommon.h"
#include "ReversiPoint.h"
#include "ReversiBitBoard.h"

const int MAX_DEPTH = 9;
const int MAX_WEIGHT = MY_MAX_INT;
const int MIN_WEIGHT = MY_MIN_INT;

typedef struct ReversiAIRecord
{
    EnumReversiPiecesType m_type;   // side making the current move
    ReversiPoint m_point;           // square played by that side
    ReversiBitBoard m_resultBoard;  // board state after that move
    int m_weight;                   // weight of that move:
                                    // negative for the player, positive for the AI

    void SetRecord(EnumReversiPiecesType type, char row_y, char column_x, ReversiBitBoard& lastBoard);
} ReversiAIRecord;

class ReversiAI
{
public:
    void Init(EnumReversiPiecesType type);
    void Play(ReversiBitBoard& reversi);
    EnumReversiPiecesType GetPlayerType();

private:
    int Find(ReversiBitBoard& lastReversi, int lastWeight, int lastDepth, EnumReversiPiecesType lastType);

    EnumReversiPiecesType m_AIType;
    TObjectPool<ReversiAIRecord> m_reversiAIRecordPool;
};

#endif
ReversiAI.cpp
#include "ReversiAI.h"

void ReversiAIRecord::SetRecord(EnumReversiPiecesType type, char row_y, char column_x, ReversiBitBoard& lastBoard)
{
    m_type = type;
    m_point.m_row_y = row_y;
    m_point.m_column_x = column_x;
    m_resultBoard = lastBoard;
    m_resultBoard.SetPieces(m_type, m_point.m_row_y, m_point.m_column_x);
    m_resultBoard.DoReversi(m_type, m_point.m_row_y, m_point.m_column_x);
    m_weight = 0;
}

void ReversiAI::Init(EnumReversiPiecesType type)
{
    m_AIType = type;
    m_reversiAIRecordPool.Init(…, enum_DisableLock_ObjPool, enum_DisableAssign_ObjPool);
}

void ReversiAI::Play(ReversiBitBoard& reversi)
{
    int currWeight = MIN_WEIGHT;
    ReversiPoint currPoint = {-1, -1};
    int i = 0;

    // Corner check: take a corner straight away if one is playable.
    for (; i < 4; i++)
    {
        if (reversi.CanPlay(m_AIType, g_weightOrder[i][0], g_weightOrder[i][1]))
        {
            currPoint.m_row_y = g_weightOrder[i][0];
            currPoint.m_column_x = g_weightOrder[i][1];
            break;
        }
    }

    // Otherwise search every other playable square and keep the best one.
    if (!currPoint.IsValid())
    {
        for (; i < REVERSI_MAX_ROW * REVERSI_MAX_COLUMN - 4; i++)
        {
            if (reversi.CanPlay(m_AIType, g_weightOrder[i][0], g_weightOrder[i][1]))
            {
                ReversiAIRecord *currRecord = m_reversiAIRecordPool.Malloc();
                if (NULL != currRecord)
                {
                    currRecord->SetRecord(m_AIType, g_weightOrder[i][0], g_weightOrder[i][1], reversi);
                    int weight1 = g_weight[g_weightOrder[i][0]][g_weightOrder[i][1]];
                    int weight2 = Find(currRecord->m_resultBoard, currWeight, 1, m_AIType);
                    currRecord->m_weight = weight1 + weight2;
                    if (currRecord->m_weight > currWeight)
                    {
                        currWeight = currRecord->m_weight;
                        currPoint = currRecord->m_point;
                    }
                    else if (currRecord->m_weight == currWeight)
                    {
                        if (!currPoint.IsValid())
                        {
                            currWeight = currRecord->m_weight;
                            currPoint = currRecord->m_point;
                        }
                    }
                    m_reversiAIRecordPool.Free(currRecord);
                }
            }
        }
    }

    if (currPoint.IsValid())
    {
        reversi.SetPieces(m_AIType, currPoint.m_row_y, currPoint.m_column_x);
        reversi.DoReversi(m_AIType, currPoint.m_row_y, currPoint.m_column_x);
    }
    reversi.SwapPlayer();
}

int ReversiAI::Find(ReversiBitBoard& lastReversi, int lastWeight, int lastDepth, EnumReversiPiecesType lastType)
{
    EnumReversiPiecesType currType = SwapType(lastType);
    int currDepth = lastDepth + 1;
    int currWeight = 0;
    if (currType == m_AIType)
    {
        currWeight = MIN_WEIGHT;  // the AI maximises
    }
    else
    {
        currWeight = MAX_WEIGHT;  // the player minimises
    }

    int i = 0;

    // Corner check: if a corner is playable, score it and stop searching this branch.
    for (; i < 4; i++)
    {
        if (lastReversi.CanPlay(currType, g_weightOrder[i][0], g_weightOrder[i][1]))
        {
            if (currType == m_AIType)
            {
                return g_weight[g_weightOrder[i][0]][g_weightOrder[i][1]];
            }
            else
            {
                return -g_weight[g_weightOrder[i][0]][g_weightOrder[i][1]];
            }
        }
    }

    for (; i < REVERSI_MAX_ROW * REVERSI_MAX_COLUMN - 4; i++)
    {
        if (lastReversi.CanPlay(currType, g_weightOrder[i][0], g_weightOrder[i][1]))
        {
            ReversiAIRecord *currRecord = m_reversiAIRecordPool.Malloc();
            if (NULL != currRecord)
            {
                currRecord->SetRecord(currType, g_weightOrder[i][0], g_weightOrder[i][1], lastReversi);
                int weight1 = 0;
                int weight2 = 0;

                // Score of this square, relative to the AI.
                if (currType == m_AIType)
                {
                    weight1 = g_weight[g_weightOrder[i][0]][g_weightOrder[i][1]];
                }
                else
                {
                    weight1 = -g_weight[g_weightOrder[i][0]][g_weightOrder[i][1]];
                }

                // Score of the best reply, unless the maximum depth is reached.
                if (currDepth == MAX_DEPTH)
                {
                    weight2 = 0;
                }
                else
                {
                    weight2 = Find(currRecord->m_resultBoard, currWeight, currDepth, currType);
                }
                currRecord->m_weight = weight1 + weight2;

                if (currType == m_AIType)
                {
                    if (currRecord->m_weight > currWeight)
                    {
                        currWeight = currRecord->m_weight;
                        if (currRecord->m_weight > lastWeight)
                        {
                            // Pruning: the player above will not choose this branch.
                            m_reversiAIRecordPool.Free(currRecord);
                            break;
                        }
                    }
                }
                else
                {
                    if (currRecord->m_weight < currWeight)
                    {
                        currWeight = currRecord->m_weight;
                        if (currRecord->m_weight < lastWeight)
                        {
                            // Pruning: the AI above will not choose this branch.
                            m_reversiAIRecordPool.Free(currRecord);
                            break;
                        }
                    }
                }
                m_reversiAIRecordPool.Free(currRecord);
            }
        }
    }
    return currWeight;
}

EnumReversiPiecesType ReversiAI::GetPlayerType()
{
    return m_AIType;
}