http://pan.baidu.com/s/1hr3kxog
http://download.csdn.net/detail/nehemiah666/9472669
The original paper was published in Nature. I translated it into Chinese and recorded a narrated video explaining how AlphaGo works, as a summary of AlphaGo's working principles.
Here is the abstract:
For artificial intelligence, Go has long been considered the most challenging of the classic games, because of its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach: value networks to evaluate board positions and policy networks to select moves. To train these deep neural networks, we use a novel combination of supervised learning (learning from games of human experts) and reinforcement learning (learning from games of self-play). Without any lookahead search, these neural networks play at the level of the strongest state-of-the-art programs based on Monte Carlo Tree Search (MCTS), which simulate thousands of random games of self-play. We also propose a new search algorithm that combines Monte Carlo simulation with the value and policy networks. Using this search algorithm, AlphaGo won 99.8% of its games against other Go programs and defeated the European Go champion 5-0. This is the first time a computer program has defeated a professional Go player in a full-sized game of Go, a feat previously thought to be at least ten years away.
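To make the combination of a policy network, a value network, and Monte Carlo Tree Search a little more concrete, here is a minimal, illustrative Python sketch of a PUCT-style search loop. The toy game, the uniform `policy_network`, the random `value_network`, and the constant `c_puct` are placeholders I have assumed purely for demonstration; they are not AlphaGo's actual networks, game rules, or search parameters.

```python
import math
import random

# Hypothetical stand-ins for the trained networks described in the abstract.
# In AlphaGo these are deep convolutional networks; here they are toy functions
# so the search skeleton can run on its own.
def legal_moves(state):
    """Toy game: a state is a tuple of chosen integers; moves are unused integers."""
    return [m for m in range(5) if m not in state]

def policy_network(state):
    """Return a prior probability for each legal move (uniform toy prior)."""
    moves = legal_moves(state)
    return {m: 1.0 / len(moves) for m in moves}

def value_network(state):
    """Return an estimated win probability for the position (random toy value)."""
    return random.random()

class Node:
    def __init__(self, state, prior):
        self.state = state
        self.prior = prior        # P(s, a): prior from the policy network
        self.visits = 0           # N(s, a): visit count
        self.value_sum = 0.0      # accumulated value estimates
        self.children = {}        # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.0):
    """PUCT-style selection: exploit high Q, explore high-prior, low-visit moves."""
    total = sum(child.visits for child in node.children.values())
    def score(child):
        u = c_puct * child.prior * math.sqrt(total + 1) / (1 + child.visits)
        return child.q() + u
    return max(node.children.items(), key=lambda kv: score(kv[1]))

def simulate(root, n_simulations=200):
    """Run MCTS simulations; leaf positions are evaluated by the value network."""
    for _ in range(n_simulations):
        node, path = root, [root]
        # Selection: descend until we reach an unexpanded node.
        while node.children:
            _, node = select_child(node)
            path.append(node)
        # Expansion: add children with priors from the policy network (if not terminal).
        if legal_moves(node.state):
            for move, p in policy_network(node.state).items():
                node.children[move] = Node(node.state + (move,), p)
        # Evaluation: the value network replaces a full random rollout here.
        v = value_network(node.state)
        # Backup: update visit counts and value sums along the search path.
        for n in path:
            n.visits += 1
            n.value_sum += v

root = Node(state=(), prior=1.0)
simulate(root)
best_move = max(root.children.items(), key=lambda kv: kv[1].visits)[0]
print("most visited move:", best_move)
```

The key design point this sketch tries to show is that the policy network narrows the search toward promising moves via the prior term, while the value network lets the search evaluate a leaf position directly instead of playing out a long random rollout, which is what makes the combined search tractable on Go's huge state space.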