Automatic Tetris stacking

Last Update:2018-12-04 Source: Internet

Author: User

When I can't sleep at night, I play Tetris on my MP3 player. It is fun to play, and I want to see whether I can write a program that lets the machine play the game, then compare it with human players to see who scores higher. One scoring rule is key: clearing a single row earns 100 points, two rows at once 300 points, three rows 700 points, and four rows 1500 points. Clearing one row at a time is the easiest and scores the least. Clearing four rows at once is the hardest and carries the highest risk, but it also scores the most. The ultimate goal is to push the score as high as possible without ending the game. Obviously, this is an optimization problem.

The shape of each block is random; assume the shapes follow some probability distribution. The mathematical expectation of the total size of the blocks delivered per unit of time is then easy to obtain. Call it V. Let x be the number of single-row clears, y the number of two-row clears, z the number of three-row clears, and k the number of four-row clears. The objective function is then:
Max (100 * x + 300 * y + 700 * z + 1500 * k)

Constraints:
1. The number of rows remaining on the screen does not exceed the screen height (if it does, the game ends)
2. (x + 2 * y + 3 * z + 4 * k) * screen width <= V
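
The objective and the second constraint can be written down directly. The following is a minimal sketch, not a solver: the function names and the 10-cell board width are assumptions made for illustration, and only the point values come from the rule above.

```python
# Points per clear, taken from the scoring rule stated in the article.
POINTS = {1: 100, 2: 300, 3: 700, 4: 1500}


def total_score(x, y, z, k):
    """Objective: x single, y double, z triple and k four-row clears."""
    return POINTS[1] * x + POINTS[2] * y + POINTS[3] * z + POINTS[4] * k


def cells_cleared(x, y, z, k, width):
    """Left-hand side of constraint 2: cells removed by all clears."""
    return (x + 2 * y + 3 * z + 4 * k) * width


def satisfies_supply(x, y, z, k, width, supplied_cells):
    """Constraint 2: clears cannot remove more cells than were supplied (V)."""
    return cells_cleared(x, y, z, k, width) <= supplied_cells


# The same four rows, cleared in two different ways.
print(total_score(4, 0, 0, 0))  # four single-row clears
print(total_score(0, 0, 0, 1))  # one four-row clear

# Assumed board width of 10 cells and 40 supplied cells.
print(satisfies_supply(0, 0, 0, 1, width=10, supplied_cells=40))
```

Both plays remove the same 40 cells, so constraint 2 treats them alike; the difference lies only in the objective, where a four-row clear is worth 375 points per row against 100 for a single-row clear.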

My feeling is that whether the random block shapes follow a regular pattern will affect the final result. I will not consider this issue for now.

Obviously, this cannot be solved by exhaustive enumeration or a similar method, because only two blocks are visible at any time: the current block and the next block. The full sequence of blocks cannot be known in advance, and the sequence itself is random. Only heuristic strategies can be considered.

Define a cost function and use it to decide the final position of each block. Once the position is decided, work out the actions (moves and rotations) that bring the block to that position. Several factors need to be considered:
1. Score
This is the most important factor.
2. Gaps between blocks that cannot be filled (they can be filled only after the blocks above them are removed)
This is also important. When there are many gaps or many rows on the screen, this factor must be weighted more heavily than the score.
3. Number of blocks on the current screen
The number of rows determines which strategy to choose. When there are few rows, aim for a high score. When there are many rows, reduce the number of blocks on the screen so that the stack does not exceed the screen height and end the game.
4. Next block
When placing the current block, the position should also leave a good choice for the next block.
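
Factors 1 to 3 can be combined into one number per candidate board. The sketch below is only one possible reading of them: the board representation (a list of rows of 0/1 values, top row first), the thresholds, and every weight are illustrative assumptions and have not been tuned.

```python
def count_holes(board):
    """Count empty cells that have at least one filled cell above them."""
    holes = 0
    for col in range(len(board[0])):
        seen_block = False
        for row in board:  # rows are listed from top to bottom
            if row[col]:
                seen_block = True
            elif seen_block:
                holes += 1
    return holes


def stack_height(board):
    """Number of rows from the highest filled cell down to the floor."""
    for i, row in enumerate(board):
        if any(row):
            return len(board) - i
    return 0


def filled_cells(board):
    """Number of occupied cells on the screen (factor 3)."""
    return sum(sum(row) for row in board)


def cost(board, points_gained, danger_ratio=0.5, hole_limit=4):
    """Lower is better. All weights below are assumptions for illustration."""
    holes = count_holes(board)
    crowded = (stack_height(board) >= danger_ratio * len(board)
               or holes >= hole_limit)
    if crowded:
        # Many rows or many gaps: survival outweighs the score.
        w_score, w_holes, w_cells = 0.01, 20.0, 1.0
    else:
        # Plenty of room: the score dominates.
        w_score, w_holes, w_cells = 0.05, 5.0, 0.1
    return (w_holes * holes + w_cells * filled_cells(board)
            - w_score * points_gained)


board = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],  # the empty cell in column 1 is covered, so it is a hole
]
print(count_holes(board), stack_height(board), filled_cells(board))
print(cost(board, points_gained=0))
```

Switching between two weight sets is the simplest way to express "adjust the importance when the screen fills up"; a smooth function of the stack height would serve the same purpose.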

Obviously, given this setting, the optimal solution cannot be obtained unless the order of the blocks is known before the game starts.
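
Since only the current and the next block are known, the search can look at most two placements ahead. A minimal sketch of that two-step lookahead follows. The helper names are hypothetical, and the board model in the demo is a toy (column heights and one-cell-wide bars), not real Tetris; any placement generator and cost function with the same shape could be plugged in.

```python
def best_placement(board, current, upcoming, placements, cost):
    """Pick the placement of `current` whose best follow-up with `upcoming`
    has the lowest cost.

    `placements(board, piece)` yields (move, new_board) pairs and
    `cost(board)` returns a number; both are supplied by the caller.
    """
    best_move, best_cost = None, float("inf")
    for move, after_current in placements(board, current):
        follow_ups = [cost(b) for _, b in placements(after_current, upcoming)]
        # If the next piece has no legal placement, this move loses the game.
        value = min(follow_ups) if follow_ups else float("inf")
        if value < best_cost:
            best_move, best_cost = move, value
    return best_move, best_cost


def toy_placements(heights, bar):
    """Toy model: a piece is a 1-wide bar of length `bar` dropped on a column."""
    for col in range(len(heights)):
        new_heights = list(heights)
        new_heights[col] += bar
        yield col, tuple(new_heights)


def toy_cost(heights):
    """Toy cost: a flatter surface is better."""
    return max(heights) - min(heights)


# Column 0 is two cells lower than the others; two 1-cell bars are coming.
print(best_placement((0, 2, 2), 1, 1, toy_placements, toy_cost))
```

In the demo, dropping the first bar on column 0 is chosen because only then can the second bar level the surface completely.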

The above is the conventional approach. I am more interested in how a machine could become better at stacking blocks through continuous learning. That is, when a block arrives, how can the choice be made more reasonable?

When I attended a machine learning lecture in the past, I recall a similar problem, except that it was about playing chess. The learning process there was to let the machine play against itself, adjust various parameters over the course of those games, and keep learning and evolving to play better chess.

How could this automatic block stacking learn and evolve? One crude method to consider: randomly generate many block sequences. For each one, find the optimal solution by exhaustive search, and also let the machine produce its own solution with its search strategy. Then compare the optimal solution with the machine's choices and adjust the parameters through iterative feedback.
This may well be infeasible, because the optimal solution is hard to compute. However, an approximately optimal solution can be obtained in some way.
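
One way to read "compare and adjust through iterative feedback" is a perceptron-style update: whenever the machine's choice differs from the reference choice, shift the weights toward the features of the reference placement and away from the machine's. This is a technique swapped in for illustration, not something the article specifies. The two features (rows cleared, holes created), the learning rate, and the tiny data set are all made up.

```python
def score(weights, features):
    """Preference for a placement; higher is better."""
    return sum(w * f for w, f in zip(weights, features))


def choose(weights, candidates):
    """Index of the candidate placement the machine currently prefers."""
    return max(range(len(candidates)),
               key=lambda i: score(weights, candidates[i]))


def train(examples, n_features, epochs=20, rate=0.1):
    """Adjust weights until the machine agrees with the reference choices.

    `examples` is a list of (candidates, reference_index) pairs, where
    `candidates` holds one feature vector per possible placement and
    `reference_index` marks the placement taken by the reference
    (optimal or approximately optimal) solution.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        mistakes = 0
        for candidates, ref in examples:
            chosen = choose(weights, candidates)
            if chosen != ref:
                mistakes += 1
                for j in range(n_features):
                    weights[j] += rate * (candidates[ref][j]
                                          - candidates[chosen][j])
        if mistakes == 0:
            break  # the machine reproduces every reference choice
    return weights


# Made-up feature vectors: (rows cleared, holes created) per placement.
examples = [
    ([(0, 2), (1, 0)], 1),
    ([(0, 0), (0, 3), (2, 1)], 2),
]
weights = train(examples, n_features=2)
print(weights)
print([choose(weights, c) == ref for c, ref in examples])
```

On this data the learned weight for rows cleared ends up positive and the weight for holes negative, which matches the intuition behind the cost function. With real games, the reference choices would come from whatever approximately optimal solutions are available.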

This is just something to think over, as with the search system before, so that the brain doesn't rust.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
