Implementation of and reflections on a Go program (1) -- Position evaluation

Last Update: 2018-12-07 Source: Internet

Author: User


I wrote this article two months after graduating. Its subject is my graduation project: the design and implementation of a Go-playing program. When I picked this topic, I had no idea at first how to implement it. Fortunately, the almighty Weibo pointed me to the core algorithm of modern computer Go: the UCT algorithm. Monte Carlo position evaluation is the foundation of UCT, so as the first article in this series, this one introduces the Monte Carlo position evaluation algorithm.

A major difficulty in computer Go is that it is hard to design a simple and effective position evaluation algorithm. Traditional Go programs mainly rely on expert knowledge, such as influence functions, to evaluate a position. But the expert knowledge of Go is hard to abstract (think of terms such as thickness, thinness, and liberties...), so the evaluation is often inaccurate. How, then, can a position be evaluated accurately? Exhaustive search! If both Black and White play the "correct" move at every subsequent turn and Black wins, then Black must be ahead in the current position.

Quantum computers are not available yet, so exhaustive search is out of reach and we need a feasible method. Suppose two players of equal strength play this position out, and Black wins in the end. Who was ahead in the original position? You would probably say Black is the more likely answer. Now suppose the same position is played out 1000 times and Black wins 800 of them. Would you not be fairly convinced that Black is ahead?

So what if both players can only play random moves? They play this position out 1000 times, and Black wins 800 of them...

This is the Monte Carlo position evaluation algorithm. So easy~
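The idea above can be sketched in a few lines of Python. A full Go engine does not fit in a short example, so this sketch uses tic-tac-toe as a stand-in, with 'B' and 'W' for Black and White. The function names (winner, random_playout, monte_carlo_eval) and the board encoding are assumptions made for illustration, not the author's program.

```python
import random

# Winning lines on a 3x3 board whose cells are indexed 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'B' or 'W' if that side has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != '.' and board[a] == board[b] == board[c]:
            return board[a]
    return None


def random_playout(board, to_move, rng):
    """Finish the game with uniformly random moves; return the winner, or None for a draw."""
    board = list(board)
    while winner(board) is None and '.' in board:
        empty = [i for i, cell in enumerate(board) if cell == '.']
        board[rng.choice(empty)] = to_move
        to_move = 'W' if to_move == 'B' else 'B'
    return winner(board)


def monte_carlo_eval(board, to_move, playouts=1000, seed=0):
    """Estimate Black's winning rate from this position by random playouts."""
    rng = random.Random(seed)
    wins = sum(random_playout(board, to_move, rng) == 'B' for _ in range(playouts))
    return wins / playouts


# Black has two in a row and is to move; '.' marks an empty cell.
position = "BB.W..W.."
print(monte_carlo_eval(position, 'B'))
```

The returned fraction plays the role of the "800 out of 1000" in the text: the closer it is to 1, the more the random playouts favor Black.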

**************

In the paper, I attempted to explore the soundness of the Monte Carlo position evaluation algorithm from a mathematical perspective (I later learned that someone had already proved it...).

3.3.4 Mathematical definition and precise quantification of move value in Go

The theoretical basis of the Monte Carlo method is the law of large numbers in probability and statistics. In this section, I try to explore, from a mathematical perspective, why it is reasonable to use the Monte Carlo method for Go position evaluation.
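As a small illustration of the law of large numbers in this setting, the sketch below simulates game results as independent coin flips with a fixed winning probability and shows the observed winning rate for growing numbers of playouts. The function name estimate_win_rate and the value 0.8 are assumptions chosen to match the "800 out of 1000" example; no real Go positions are involved.

```python
import random


def estimate_win_rate(true_rate, playouts, seed=0):
    """Average of `playouts` simulated results (1 = Black wins, 0 = otherwise)."""
    rng = random.Random(seed)
    wins = sum(rng.random() < true_rate for _ in range(playouts))
    return wins / playouts


# The estimate should settle near 0.8 as the number of playouts grows.
for n in (10, 100, 1000, 10000):
    print(n, estimate_win_rate(0.8, n))
```

For independent playouts the typical error shrinks roughly in proportion to one over the square root of the number of playouts, which is why a few thousand random games can already give a usable estimate.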

The general idea is this: the value of every move in Go can be quantified precisely by mathematical means. This paper proposes a mathematical definition of "move value" (together with mathematical definitions of a series of related Go concepts), which is at the same time a method for quantifying move value precisely.

First, Go is a zero-sum game with perfect information, so for any given position there must exist an objectively "best move" for the side to play (the move that wins that side the most in the final result). In theory, the best move can be found by exhaustive search. The following pseudocode describes a recursive implementation of exhaustive search:

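Table 3-7 Pseudocode 2

Move bestmove(GoBoard board)
{
    best_move = PASS;
    best_value = -infinity;
    for (move in board.next_move_set) {
        next_board = board.move(move);
        // Recursive step: continue next_board with bestmove() for both
        // sides until the game ends, then take the final result for the
        // side to move in board (final_result is shorthand for this).
        value = final_result(next_board);
        if (value > best_value) {
            best_value = value;
            best_move = move;    // record the move with the maximum value
        }
    }
    return best_move;
}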
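Exhaustive search is infeasible for Go, but the same recursion can be run on a tiny game. The sketch below uses a simple take-away game as a stand-in (take 1 to 3 stones per turn; whoever takes the last stone wins). The game and the names best_value and best_move are assumptions made for illustration only.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def best_value(stones):
    """Result for the side to move under best play: +1 = win, -1 = loss."""
    if stones == 0:
        # The previous player took the last stone, so the side to move has lost.
        return -1
    # Zero-sum game: the opponent's best result is the negative of ours.
    return max(-best_value(stones - take) for take in (1, 2, 3) if take <= stones)


def best_move(stones):
    """Return how many stones to take to maximize the mover's final result."""
    moves = [take for take in (1, 2, 3) if take <= stones]
    return max(moves, key=lambda take: -best_value(stones - take))


print(best_move(5), best_value(5))
print(best_value(4))
```

Separating the value of a position from the move that achieves it keeps the recursion well typed: the search recurses on values and only the outermost call picks a move.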

Let S be the set of Go positions and let P be one of the two players. For the current position A in S, consider the set of points where a stone may be played. If it is P's turn and P plays a move m from that set, there is a value function V(m) for the move, computed as follows:

1) From position A, let B be the position after P plays m. Both sides then follow the "best move" from B, and P's gain in the final result is c.

2) From position A, P passes instead (that is, the turn goes to the opponent). Both sides then follow the "best move", and P's gain in the final result is d.

3) V(m) = c - d.
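The three steps above can be tried on a toy scoring game. In this sketch a position is a tuple of unclaimed point values; on each turn a player either claims one point or passes, and two passes in a row end the game. "Gain" is taken to be the mover's final score minus the opponent's. The game, this reading of "gain", and the names best_result and move_value are all assumptions made for illustration.

```python
def best_result(points, passed=False):
    """Best final margin (mover minus opponent) from here on under best play.

    `points` is a tuple of unclaimed point values; `passed` records whether
    the previous player passed. Two passes in a row end the game.
    """
    if not points:
        return 0
    # Option: pass (ends the game if the previous player also passed).
    options = [0 if passed else -best_result(points, True)]
    # Option: claim one of the remaining points.
    for i, value in enumerate(points):
        rest = points[:i] + points[i + 1:]
        options.append(value - best_result(rest))
    return max(options)


def move_value(points, i):
    """V(m) = c - d for the move m that claims points[i]."""
    rest = points[:i] + points[i + 1:]
    c = points[i] - best_result(rest)   # step 1: play m, then best play by both sides
    d = -best_result(points, True)      # step 2: pass, then best play by both sides
    return c - d                        # step 3


position = (3, 1)
print([move_value(position, i) for i in range(len(position))])
```

In the position (3, 1), claiming the 3 is worth more than claiming the 1, because the latter leaves the larger point to the opponent and ends up no better than passing.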

Therefore the "best move" can be defined as the move m with the maximum value V(m), denoted m'. It follows readily that the value of a move is also a recursive definition. It is natural to think that the value of the best move can also be used to characterize the "intensity" of a position.

Based on the definitions above, we can introduce the concept of "move quality", Q. A move with Q = 0 is what is commonly called a "wasted move" (a useless move). When the game is over (i.e., when the best move m' has value 0), Q is undefined.

Furthermore, we can give a mathematical definition of "playing strength", a term commonly used in Go to describe a person's level. Suppose that over a period of time (or in a particular game) a player produces the moves m_1, m_2, ..., m_n. The player's playing strength W is then computed as:

W = (Q(m_1) + Q(m_2) + ... + Q(m_n)) / n

That is, playing strength W is the average move quality. It should be pointed out that playing strength is an inherent attribute of the player; the formula above is only a way of measuring it, much like a formula for an electrical quantity.

Based on the mathematical definitions above, I believe that to prove the soundness of the Monte Carlo method for Go position evaluation, the following two conjectures need to be proved:

1) For a given position A, let a be the final result when both sides follow the "best move". For the same position A, let b be the mathematical expectation of the final result when the game is played out by two "players" of equal strength. Then a is approximately equal to b.

2) Let R be a player that only generates random moves. R's playing strength can be measured through a large number of games, and that strength is an unknown constant.

This article does not give mathematical proofs of these two conjectures, but I personally believe that both are correct.

********

In the end I couldn't prove them. That last sentence was a bit of a cop-out...

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
