Understanding the Monte Carlo Method in One Article: A Popular-Science Look at Google's Go Robot

Source: Internet
Author: User

Introduction to the Monte Carlo method

This article introduces the Monte Carlo method through five examples.

I. Overview

The Monte Carlo method is a computational method. Its principle is to understand a system through a large number of random samples and, from them, obtain the value to be computed. It is powerful and flexible, yet simple to understand and easy to implement. For many problems it is the simplest method of calculation, and sometimes the only feasible one.

It originated in the American "Manhattan Project" of the 1940s. Its name comes from the city of Monte Carlo, known for its casinos, as a symbol of probability.

II. Calculating π

The first example is how to calculate π by the Monte Carlo method. Inscribe a circle in a square. If the circle has radius r, its area is πr^2 and the square's area is (2r)^2 = 4r^2, so the ratio of their areas is π/4.

Now, within this square, randomly generate 10,000 points (i.e., 10,000 coordinate pairs (x, y)) and calculate their distance from the center point to determine if they fall within the circle.
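
The procedure can be sketched in a few lines of Python. This is a minimal illustration, not the script used for the article's figures; the function name `estimate_pi`, the fixed seed, and the choice of the square [-1, 1] × [-1, 1] are assumptions made for the example. Because the circle covers π/4 of the square, the fraction of points inside it is multiplied by 4.

```python
import random

def estimate_pi(num_points, seed=0):
    """Estimate pi by sampling points uniformly in the square [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    inside = 0
    for _ in range(num_points):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        # The point lies inside the inscribed circle if its distance
        # from the center (0, 0) is at most the radius 1.
        if x * x + y * y <= 1.0:
            inside += 1
    # The fraction of points inside approximates pi/4, so scale by 4.
    return 4.0 * inside / num_points

print(estimate_pi(10_000))
```

The estimate changes with the seed and tightens slowly as the number of points grows.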

If the points are uniformly distributed, the points inside the circle should make up π/4 of all points, so multiplying this proportion by 4 gives the value of π. An R script that simulates 30,000 random points gives an estimate of π that differs from the true value by 0.07%.

III. Calculating integrals

Generalizing the method above, you can calculate the value of an arbitrary integral.

For example, calculating the integral of the function y = x^2 over the interval [0, 1] means finding the area of the red region under the curve.
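
A minimal sketch of this computation, assuming points are sampled uniformly in the unit square and counted when they fall below the curve. The names `integrate_hit_or_miss` and `square` are illustrative, and the approach as written only works for functions whose values stay between 0 and 1 on [0, 1].

```python
import random

def integrate_hit_or_miss(f, num_points, seed=0):
    """Estimate the integral of f over [0, 1], assuming 0 <= f(x) <= 1 there."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_points):
        x = rng.random()  # uniform in [0, 1)
        y = rng.random()
        # Count the point if it lies under the curve y = f(x).
        if y < f(x):
            hits += 1
    # The unit square has area 1, so the hit fraction is the area estimate.
    return hits / num_points

def square(x):
    return x * x

print(integrate_hit_or_miss(square, 100_000))
```

The exact value of this integral is 1/3, which gives a convenient check on the estimate.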

This function has a value of 1 at the point (1, 1), so the entire red region lies inside a square of area 1. Generate a large number of random points inside the square and count how many fall in the red region (the test is y < x^2). This proportion is the required integral value. Simulating 1 million random points in MATLAB gives 0.3328, close to the exact value of 1/3.

IV. Traffic jams

The Monte Carlo method can be used not only for calculation but also to simulate stochastic motion inside a system. The following example simulates traffic jams on a single-lane road. According to the Nagel-Schreckenberg model, each vehicle's motion obeys the following rules.

  • The current speed is v.
  • If there is no car ahead, the speed increases to v + 1 in the next second, until the maximum speed limit is reached.
  • If there is a car ahead at distance d, and d < v, the speed drops to d - 1 in the next second.
  • In addition, the driver slows down randomly with probability p, reducing the next second's speed to v - 1.

On a straight road, randomly place 100 points representing 100 vehicles, and set the probability p to 0.3.
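
The rules above can be sketched as a small simulation. Only the 100 vehicles and p = 0.3 come from the text; the circular road, its length of 1000 cells, the speed limit of 5, the zero initial speeds, and the names `nasch_step` and `simulate` are assumptions for illustration. The braking step uses the standard form of the rule, v = min(v, d - 1), which also brakes when d equals v so that cars never collide.

```python
import random

def nasch_step(positions, speeds, road_length, v_max, p, rng):
    """Advance every car by one second on a circular single-lane road.

    positions must be sorted in increasing order and contain distinct cells.
    """
    n = len(positions)
    new_speeds = []
    for i in range(n):
        ahead = positions[(i + 1) % n]
        # Empty cells between this car and the one ahead, i.e. d - 1.
        gap = (ahead - positions[i] - 1) % road_length
        v = min(speeds[i] + 1, v_max)   # accelerate up to the speed limit
        v = min(v, gap)                 # brake so as not to hit the car ahead
        if v > 0 and rng.random() < p:  # random slowdown with probability p
            v -= 1
        new_speeds.append(v)
    moved = [(x + v) % road_length for x, v in zip(positions, new_speeds)]
    # Re-sort so that index i + 1 is still the car directly ahead of index i.
    order = sorted(range(n), key=lambda i: moved[i])
    return [moved[i] for i in order], [new_speeds[i] for i in order]

def simulate(num_cars=100, road_length=1000, v_max=5, p=0.3, steps=200, seed=0):
    """Return the list of car positions at each second, starting from rest."""
    rng = random.Random(seed)
    positions = sorted(rng.sample(range(road_length), num_cars))
    speeds = [0] * num_cars
    history = [positions]
    for _ in range(steps):
        positions, speeds = nasch_step(positions, speeds, road_length, v_max, p, rng)
        history.append(positions)
    return history

history = simulate(steps=50)
print(history[-1][:10])
```

Drawing each entry of `history` as one row of occupied and empty cells gives a space-time picture of the road, with time running downward.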

In the figure, the horizontal axis represents distance (left to right) and the vertical axis represents time (top to bottom), so each row shows the state of the road one second after the row above it. As you can see, the model spontaneously generates traffic jams (the dense black regions of the figure). This shows that a single lane can produce traffic jams for no apparent reason.

V. Product thickness

A product is made of eight stacked parts. In other words, the total thickness of the eight parts equals the thickness of the product.

The thickness of the product must be kept within 27 mm, but each part has some probability of exceeding its thickness tolerance. What is the probability that the product thickness exceeds 27 mm?

Take 100,000 random samples, each consisting of 8 values, one for the thickness of each of the 8 parts. The calculation finds that the product's pass rate is 99.9979%; that is, there is a probability of 21 in a million that the thickness exceeds 27 mm.

VI. Securities market

The securities market sometimes trades actively and sometimes is quiet. Here is your forecast for the market.

  • If trading is quiet, you will sell 50,000 shares at an average price of 11 yuan.
  • If trading is active, you will sell 100,000 shares at an average price of 8 yuan.
  • If trading is moderate, you will sell 75,000 shares at an average price of 10 yuan.

Your cost is known to be between 5.5 and 7.5 yuan per share, with an average of 6.5 yuan. What is your net profit on the next trade?

Take 1,000 random samples, each with two values: the cost of the security (uniformly distributed between 5.5 and 7.5 yuan) and the current market state (quiet, active, or moderate, each with probability one-third).
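
The sampling scheme can be sketched as follows. This is a minimal illustration that uses only the inputs stated in the text and takes profit to be shares × (price - cost); that formula, the fixed seed, and the names `MARKET_STATES` and `simulate_profit` are assumptions. Because it counts only the per-share cost, it should not be expected to reproduce the figure the article reports, which may rest on inputs not listed in the text.

```python
import random

# (shares sold, average price in yuan) for each market state, as stated in the text.
MARKET_STATES = {
    "quiet": (50_000, 11.0),
    "active": (100_000, 8.0),
    "moderate": (75_000, 10.0),
}

def simulate_profit(num_samples=1000, seed=0):
    """Average of shares * (price - cost) over random samples (assumed profit formula)."""
    rng = random.Random(seed)
    states = list(MARKET_STATES)
    total = 0.0
    for _ in range(num_samples):
        cost = rng.uniform(5.5, 7.5)     # cost per share, uniformly distributed
        state = rng.choice(states)       # each state has probability one-third
        shares, price = MARKET_STATES[state]
        total += shares * (price - cost)
    return total / num_samples

print(round(simulate_profit()))
```

Under this assumed formula the exact expected value is (50,000 × 4.5 + 100,000 × 1.5 + 75,000 × 3.5) / 3 = 212,500, so the simulated average should land near that number.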

The simulation gives an average net profit of $92,427.

How computers play board games: the Monte Carlo tree search method

A Go (Weiqi) board consists of 19 horizontal lines and 19 vertical lines, giving 19×19 = 361 intersections. There are also smaller 13×13 and 9×9 boards. Go stones come in black and white; each side takes one color, and the players take turns placing one stone on an intersection. At the end of the game, the side that occupies (encloses) more "territory" (that is, more intersections) wins.

An empty intersection is called a "point", and enclosed territory is also called "empty space". During a game, players often "count", that is, calculate the size of the "empty space" each side currently encloses, in order to judge who stands better.

When the gap in strength between the two sides is large, a handicap game is often played: the weaker side first places 1 to 9 stones at fixed positions on the board (called handicap stones: a two-stone handicap, a three-stone handicap, and so on), and then the two sides take turns placing stones.

"Exponential growth" and "EXPTIME-complete problems"

Exponential growth can be regarded as the first great "stumbling block" for large-scale computing. In a famous legend, Sessa, the inventor of chess, asked his king for a reward: one grain of rice for the first square of the chessboard, two grains for the second square, four grains for the third, and so on, doubling with each subsequent square. The king, who did not understand exponential growth, readily agreed and even chided Sessa for asking too little, but later discovered that emptying the entire treasury of rice was still not enough to fill the board. The story ends with the king, angry and ashamed, secretly sending someone to kill Sessa.
Anyone today who has studied geometric series knows that the king's total debt for the 64 squares is 2^64 - 1 = 18446744073709551615 > 10^19 grains of rice, which is estimated to exceed the total amount of rice produced in all of human history!

Let us return to the complexity of variations in a game position. Even an astronomical number like 10^19 is "merely" the size of the possibility space 64 moves after the current position when only 2 candidate moves are considered at each step. For complex games such as chess and Go, the complexity of all variations from the initial position (also called the exhaustive-search complexity) is far harder to imagine. Claude Shannon, the founder of information theory, was the first to estimate, in 1950, that the exhaustive-search complexity of chess is around 10^120 variations; this number later became known as the "Shannon number". The exhaustive-search complexity of Go far exceeds that of chess, reaching an astonishing 10^360. For comparison, the total number of atoms in the observable universe is estimated at "only" 10^75.

Some will ask: to analyze the current position, must we enumerate every possible future line of play? Could there be an efficient algorithm that still evaluates the current position accurately while avoiding a traversal of an exponentially growing space? The answer is that, for chess and Go, we can prove mathematically that not only the exhaustive-search complexity but also the computational complexity of evaluating a position must grow exponentially with the number of moves considered! For any given position, we define its "optimal value" as the final result of the game when both sides play perfectly from that position. If the optimal value of a position is "Black wins", then as long as Black makes no mistakes, White will lose no matter how hard it tries.
Theoretical computer scientists proved in 1981 and 1983 that chess and Go, respectively, are EXPTIME-complete, which means that the running time of "any" method that correctly computes the optimal value of a position "necessarily" grows exponentially with the size of the board (or the average number of moves in a game). In fact, most popular two-player zero-sum board games have exponential computational complexity. For some games, such as checkers and Gomoku, the problem is small enough that the optimal value of the initial position has already been computed. But for complex games such as chess and Go, computing the optimal value of the initial position appears to be far beyond the computing power of current hardware.

A zero-sum game is a concept from game theory: the two sides of the game are in competition rather than cooperation, a "one lives, the other dies" situation. For example, when two people play a game, if one wins the other must lose; there is no "win-win". A win scores 1 point and a loss scores -1 point, so the two scores always sum to 0, hence the name zero-sum game.

The Monte Carlo tree search method

Estimating the value of π with the Monte Carlo method. Figure/Wikipedia

The more random points (n) you select, the closer the estimate is to the true value of π. Figure/Wikipedia

This choice is largely related to existing Go programs' wholesale abandonment of static position evaluation. As mentioned earlier, because the problem of designing a general-purpose static evaluation function for Go positions went unsolved for a long time, from around 2002 people began to consider a completely different way to evaluate positions quickly: Monte Carlo sampling.

As a general computational method, the idea of Monte Carlo sampling is, "conceptually", to cleverly construct a random process for the definite but unknown value we want to find, such that some numerical characteristic of that process converges in probability to the required value, and then, "in practice", to estimate this value statistically by sampling the random process.

For example, one Monte Carlo method for calculating π is to randomly select points in the square region 0≤x≤1, 0≤y≤1 of a two-dimensional coordinate system and to determine whether each point (x1, y1) falls within the "unit circle of radius 1 centered at the origin" (that is, whether x1^2 + y1^2 is less than 1). By the law of large numbers, the proportion of these random points that fall in the unit circle converges to π/4 with high probability, and the central limit theorem indicates that the error shrinks roughly in proportion to 1/√n. So the more random points we pick, the more likely we are to get an estimate close to the true value.

The same "Monte Carlo" idea can also be used to evaluate Go positions. As mentioned earlier, every Go position has an "optimal value": the final result of the game when both sides play perfectly from that position.
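
The idea of evaluating a position by sampling can be illustrated on a much smaller game. The sketch below is not Monte Carlo tree search itself, only the plain playout sampling that it builds on: a position's value is estimated as the win rate over many randomly played games. The toy game (players alternately take 1 to 3 stones and whoever takes the last stone wins) and the names `random_playout` and `estimate_value` are assumptions chosen for brevity.

```python
import random

def random_playout(stones, rng):
    """Play uniformly random moves to the end of the game.

    Returns True if the player who moves first from this position wins.
    Toy game (not Go): take 1 to 3 stones per turn; taking the last stone wins.
    """
    first_player_to_move = True
    while True:
        take = rng.randint(1, min(3, stones))
        stones -= take
        if stones == 0:
            # Whoever just moved took the last stone and wins.
            return first_player_to_move
        first_player_to_move = not first_player_to_move

def estimate_value(stones, num_playouts=10_000, seed=0):
    """Estimate the first player's win rate from random playouts (stones >= 1)."""
    rng = random.Random(seed)
    wins = sum(random_playout(stones, rng) for _ in range(num_playouts))
    return wins / num_playouts

print(estimate_value(10))
```

Uniformly random playouts estimate the outcome under random play rather than perfect play, so this number is not the optimal value; tree search methods refine the idea by steering the sampling toward promising moves.
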
For Go, it has been proved that the time needed to compute the optimal value grows at least exponentially with the number of moves from the position to the end of the game (a game averages 200 moves, and each move multiplies the number of possible positions by 200 on average). Since the optimal value cannot feasibly be computed exactly, can we follow the Monte Carlo idea, sample the whole possibility space, and then approximate the optimal value by statistical estimation? Thinking on this question finally produced a breakthrough in 2006, when a dynamic evaluation method called Monte Carlo tree search was proposed.

It should be pointed out that existing Monte Carlo tree search methods can guarantee that the result converges to the optimal value of the position given enough samples, but the number of samples required for "sufficient convergence" still grows exponentially with the size of the whole possibility space. In practice, however, Go programs based on Monte Carlo tree search do perform far better than traditional methods when playing time is limited. Encouraged by this observation, in recent years people have added more Go-specific expert knowledge to the selection strategy, raising the playing level of Go programs based on Monte Carlo tree search.

1. Some Go software triggers a "life-and-death" or "ko" mode under certain conditions, but these optimizations are more like special tactics for special cases than a basic mode of thinking.

2. Here "thinking depth" is defined as the number of moves to which each variation is expanded. Suppose a machine can originally examine b^d variations per second, corresponding to a thinking depth of d. Increasing the hardware capability b-fold lets the machine examine b × b^d = b^(d+1) variations per second, corresponding to a depth of d + 1. With αβ pruning, the machine needs to examine only (b^(2d))^(1/2) = b^d variations to achieve the same effect as examining b^(2d) variations, so the corresponding thinking depth is 2d.

3. As already emphasized, the strategy used in chess programs is only equivalent "in effect" to exhausting all variations at a certain stage; it does not actually traverse the whole possibility space during operation.

4. In addition, Monte Carlo tree search has been widely used as a general dynamic evaluation method in "general game playing" competitions (competitions in which programs must be designed for games whose specific rules are not known in advance).
