Implementation of and Thoughts on a Go-Playing Program (1) -- Position Evaluation

Source: Internet
Author: User

I am writing this article two months after graduating. It covers my graduation project: the design and implementation of a program that plays the board game Go. When I chose the topic I had no idea how to implement it. Fortunately, the all-knowing Weibo pointed me to the core algorithm of modern computer Go: the UCT algorithm. Monte Carlo position evaluation is the foundation of UCT, so this first article in the series introduces the Monte Carlo position evaluation algorithm.

A major difficulty in computer Go is that it is hard to design a simple and effective position evaluation algorithm. Traditional Go programs mainly rely on expert knowledge, such as influence functions, to evaluate a position. Expert knowledge of Go is hard to formalize (think of terms like thickness, thinness, liberties ...), so the evaluation is often inaccurate. How can a position be evaluated accurately? Exhaustive search! If both Black and White play the "correct" move at every subsequent step and Black wins, then the current position must favor Black.

Quantum computers are not here yet, so exhaustive search is out and we need something feasible. Suppose Black and White are not perfect players but are equally strong, and they play on from this position. If Black wins in the end, who had the advantage in the current position? You might say Black is the more likely one. Going further, suppose the same position is played out 1000 times and Black wins 800 of them. Would you not be fairly convinced that Black has the advantage?

So what if neither side knows how to play at all and both simply move at random? They play this position out 1000 times, and Black wins 800 of them ...

That is the Monte Carlo position evaluation algorithm, so easy ~
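The idea can be sketched in a few lines of Python. A real Go board is too long for a short example, so this sketch assumes tic-tac-toe as a toy stand-in: 'X' plays the role of Black, 'O' the role of White, and names such as `random_playout` and `monte_carlo_evaluate` are made up for illustration. It is not the program from the thesis, only the same scheme on a smaller game.

```python
import random

# Toy stand-in for a Go board: tic-tac-toe as a 9-character string.
# 'X' plays the role of Black, 'O' the role of White, '.' is an empty point.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that side has completed a line, else None."""
    for a, b, c in LINES:
        if board[a] != '.' and board[a] == board[b] == board[c]:
            return board[a]
    return None


def random_playout(board, to_move, rng):
    """Both sides move uniformly at random until the game ends.

    Returns the winner, or None if the board fills up with no winner.
    """
    board = list(board)
    while winner(board) is None:
        empty = [i for i, cell in enumerate(board) if cell == '.']
        if not empty:
            return None
        board[rng.choice(empty)] = to_move
        to_move = 'O' if to_move == 'X' else 'X'
    return winner(board)


def monte_carlo_evaluate(board, to_move, playouts=1000, seed=0):
    """Estimate the position for 'X' as the fraction of random playouts 'X' wins."""
    rng = random.Random(seed)
    wins = sum(random_playout(board, to_move, rng) == 'X'
               for _ in range(playouts))
    return wins / playouts


# 'X' already has two in a row and is to move, so the estimate should be high.
print(monte_carlo_evaluate('XX.OO....', 'X'))
```

The evaluator never looks at the position itself; all it needs is a move generator and a way to score a finished game, which is what makes the approach attractive for Go.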


**************


In my thesis I tried to examine the soundness of the Monte Carlo position evaluation algorithm from a mathematical perspective (I later learned that someone had already proved it ...). The relevant section follows.

3.3.4 Mathematical definition and precise quantification of move value in Go

The theoretical basis of the Monte Carlo method is the law of large numbers from probability and statistics. In this section I try to examine, from a mathematical perspective, whether using the Monte Carlo method for Go position evaluation is sound.
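As a rough illustration of what the law of large numbers provides here, assume (this assumption is mine, not something stated in the thesis) that the n playouts from a position are independent and that Black wins each one with the same unknown probability p. The observed win rate then carries a standard error that shrinks like 1/sqrt(n). For the earlier example of 800 Black wins in 1000 playouts:

$$\hat{p} = \frac{k}{n} = \frac{800}{1000} = 0.8, \qquad \mathrm{SE}(\hat{p}) \approx \sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}} = \sqrt{\frac{0.8 \times 0.2}{1000}} \approx 0.0126$$

Under that assumption the estimate is good to roughly 0.025 either way (two standard errors), so a win rate of 0.8 is very unlikely to be noise around 0.5.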

The general idea is this: the value of every move in Go can be quantified precisely by mathematical means. This section proposes a mathematical definition of "move value" (together with mathematical definitions of several related Go concepts), and that definition doubles as a method for quantifying move value precisely.

First, Go is a zero-sum game with perfect information, so in any given position there must be an objective "best move" for the side to move (the move that gives that side the largest result at the end of the game). In theory the best move can be found by exhaustive search. The pseudocode below gives a recursive implementation of the exhaustive method:


 

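Table 3-7 Pseudocode 2

// Exhaustive search. Returns the best move for the side to move together
// with the final result it secures. game_over() and final_result() are
// placeholder names for the end-of-game check and the scoring routine.
(Move, Value) bestmove(GoBoard board)
{
    if (board.game_over())
        return (NO_MOVE, board.final_result());   // result for the side to move
    best_value = -INFINITY;
    for (move in board.next_move_set) {
        next_board = board.move(move);
        // next_board is scored from the opponent's side, so negate it
        value = -bestmove(next_board).value;
        if (value > best_value) {
            best_value = value;
            best_move = move;
        }
    }
    return (best_move, best_value);
}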
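To make the exhaustive method concrete, here is a minimal runnable sketch in Python. Exhaustive search is hopeless on a real Go board, so the sketch again assumes tic-tac-toe as a toy stand-in, with results +1 (win), 0 (draw) and -1 (loss) in place of a Go score. The function name `best_move` and the board encoding are illustrative choices, not taken from the thesis.

```python
from functools import lru_cache

# Toy stand-in for Go: tic-tac-toe as a 9-character string, '.' = empty.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that side has completed a line, else None."""
    for a, b, c in LINES:
        if board[a] != '.' and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def best_move(board, to_move):
    """Exhaustive search: return (move, value) for the side to move.

    value is +1, 0 or -1 for a win, draw or loss of `to_move` when both
    sides play the best move from here on. move is None if the game is over.
    """
    w = winner(board)
    if w is not None:
        return None, (1 if w == to_move else -1)
    moves = [i for i, cell in enumerate(board) if cell == '.']
    if not moves:
        return None, 0  # board full, no winner: draw
    opponent = 'O' if to_move == 'X' else 'X'
    best, best_value = None, -2  # -2 is below every possible result
    for move in moves:
        next_board = board[:move] + to_move + board[move + 1:]
        # The recursive call scores next_board for the opponent, so negate it.
        value = -best_move(next_board, opponent)[1]
        if value > best_value:
            best, best_value = move, value
    return best, best_value


# 'X' to move with two in a row: the search finds the winning point.
print(best_move('XX.OO....', 'X'))
```

The cache only keeps the toy example fast; it does not change the result. The recursion itself is the same shape as the pseudocode above it, with an explicit end-of-game case and a sign flip between the two players.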

 

Let S be the set of Go positions, and consider the two players of a game. Let the current position be A, an element of S, with some set of points available to play. Then for the side to move, P, each available move m has a value V(m), computed as follows:

1) From position A, let B be the position after P plays move m. Both sides then play the best move at every step, and P's result at the end of the game is c.

2) From position A, suppose instead that P passes (that is, the turn goes to the opponent). Both sides then play the best move at every step, and P's result at the end of the game is d.

3) V(m) = c - d.
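The three steps can be tried out on a small game. The sketch below assumes a made-up stand-in for Go: a row of coins from which the side to move takes either the leftmost ('L') or rightmost ('R') coin, with a player's final result being their own total minus the opponent's. The game and the names `best_result` and `move_value` are hypothetical and only serve to make c, d and V(m) computable.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def best_result(coins):
    """Final result (own total minus opponent's total) for the side to move
    when both sides play the best move; `coins` is a tuple of coin values."""
    if not coins:
        return 0
    take_left = coins[0] - best_result(coins[1:])
    take_right = coins[-1] - best_result(coins[:-1])
    return max(take_left, take_right)


def move_value(coins, move):
    """V(m) = c - d for the side to move P in position `coins`."""
    if move == 'L':
        gained, rest = coins[0], coins[1:]
    else:
        gained, rest = coins[-1], coins[:-1]
    # Step 1: P plays m, then both sides play the best moves; P's result is c.
    c = gained - best_result(rest)
    # Step 2: P passes, so the opponent moves first in the same position;
    # P's result d is the negative of the opponent's best result.
    d = -best_result(coins)
    # Step 3: the value of the move is the difference.
    return c - d


position = (5, 1)
for m in ('L', 'R'):
    print(m, move_value(position, m))
```

In this toy position taking the 5 is worth 8, while taking the 1 is worth 0: it leaves P no better off than passing would have.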

The "best move" can therefore be defined as the move m with the maximum value, written m'. It is easy to see that move value is itself defined recursively. It is also natural to use the value of m' to characterize how "intense" a position is.

Building on the definitions above, we can introduce the concept of "move quality", Q. When Q = 0, the move is what is commonly called a "wasted move" (a useless move). When the game is over (i.e., m' = 0), the move quality Q is undefined.

Furthermore, we can give a mathematical definition of "playing strength", a term commonly used in Go to describe a person's level. Suppose the moves a person plays over some period of time (or in one particular game) are m_1, m_2, ..., m_n. That person's playing strength W is then computed as:

W = (Q(m_1) + Q(m_2) + ... + Q(m_n)) / n

That is, playing strength W is the average move quality. Note that playing strength is an inherent attribute of the player. The formula above is only a way of measuring it, comparable to measurement formulas in electricity.
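The averaging step is simple enough to write down directly. The sketch below takes the move-quality values as given inputs, because the formula for Q itself is not spelled out above. Representing an undefined Q as `None` and leaving such moves out of the average are assumptions made here for illustration, and `playing_strength` is a made-up name.

```python
def playing_strength(qualities):
    """Average move quality W over one player's moves.

    `qualities` holds one Q value per move; None marks a move whose
    quality is undefined (assumed here to be skipped, not counted as 0).
    Returns None if no move has a defined quality.
    """
    defined = [q for q in qualities if q is not None]
    if not defined:
        return None
    return sum(defined) / len(defined)


# Three rated moves and one move with undefined quality.
print(playing_strength([1.0, 0.0, None, 0.5]))
```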

Given these mathematical definitions, I believe that proving the Monte Carlo method sound for Go position evaluation comes down to proving the following two conjectures:

1) For a given position A, let a be the final result when both sides play the best move at every step. For the same position A, let b be the mathematical expectation of the final result when two "players" of equal strength play it out. Then a is approximately equal to b.

2) Let R be a player who only generates random moves. R's playing strength can be measured through a large number of games, and it is a constant whose value is unknown.

This article does not give mathematical proofs of the two conjectures above, but I personally believe that both are correct.


********


In the end I could not prove it. That last sentence was a bit shameless ...
