AlphaGo Zero artificial intelligence brings us inspiration
In 2016 Lee Sedol, one of the best Go players in the world, lost to AlphaGo four games to one in Seoul. Whether in the history of Go or in the history of artificial intelligence (AI), this was a major event. Go occupies a place in the cultures of China, Korea and Japan comparable to that of chess in Western culture.
After defeating Lee Sedol, AlphaGo beat dozens of well-known human players in a series of anonymous online games, then resurfaced in May to take on Ke Jie, China's top player, in Wuzhen, China. Mr Ke fared even worse than Mr Lee, losing to the computer 3-0.
For AI researchers, Go is revered too. Chess fell to computers in 1997, when Garry Kasparov lost to an IBM machine named Deep Blue. But until Lee Sedol's defeat, the complexity of Go had kept it beyond the reach of machines. AlphaGo's victory was therefore striking, and it demonstrated the power of an approach to artificial intelligence called "machine learning", which aims to get computers to teach themselves complicated tasks.
The difficulty comes from the sheer number of possibilities. There are 361 points on the 19x19 board on which Black, moving first, can place a stone; White then has 360 replies, and so on. In all, there are roughly 10^170 legal board positions, a number so large that it defies physical analogy (the observable universe contains only about 10^80 atoms).
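These numbers can be sanity-checked in a few lines of Python. This is a back-of-the-envelope sketch: 3^361 (each point empty, black or white) is a simple upper bound on board states, not the exact count of legal positions quoted above.

```python
import math

# Upper bound on board states: each of the 361 points of a 19x19 board
# is empty, black or white. Not every such state is a legal position,
# so this slightly overestimates the ~10^170 figure.
raw_states = 3 ** 361
print(f"log10(3^361) ~= {math.log10(raw_states):.0f}")  # prints 172

# Roughly 10^80 atoms in the observable universe: the board-state
# count dwarfs even the square of that number.
atoms = 10 ** 80
print(raw_states > atoms ** 2)  # prints True
```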
Human experts, meanwhile, understand the game at a higher level. The rules of Go are simple, but they give rise to enormous variety. Players talk about concepts such as "eyes" and "ladders", and about "threats" and "life and death", much as chess players do. But although human players understand these concepts, it is far harder to express them explicitly in a computer program. Instead, the original AlphaGo studied thousands of examples of human games, a technique called "supervised learning". Because human games reflect human understanding of these concepts, a computer with access to enough games can absorb those concepts too. Once AlphaGo had mastered strategy and tactics with the help of its human teachers, it went on to play millions of unsupervised training games against itself, each of which improved its skill a little.
Supervised learning is useful well beyond Go. It is the basic idea behind much of the recent progress in artificial intelligence, helping computers learn to recognise faces in photos, identify human speech reliably and filter spam from e-mail. But, as DeepMind's boss Demis Hassabis notes, supervised learning has limits. It depends on the availability of training data: examples fed to the computer to show the machine what it should do, which must be curated by human experts. The training data for facial recognition, for instance, consist of thousands of pictures, some containing faces and some not, each labelled by a human. That makes such data expensive to produce, assuming they are available at all. And, as the researchers point out in their paper, there is a subtler problem: relying on human guidance may cap a computer's abilities at human levels.
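The dependence on human-labelled examples can be illustrated with a toy sketch. The data, the two-number features and the nearest-centroid rule below are all invented for illustration; the point is simply that every training example carries a label a human had to supply.

```python
# Each example: (features, human-supplied label). The features might be,
# say, (count of suspicious words, count of exclamation marks) in an email.
training = [
    ((8, 5), "spam"), ((7, 6), "spam"), ((9, 4), "spam"),
    ((1, 0), "ham"),  ((0, 1), "ham"),  ((2, 0), "ham"),
]

def centroid(label):
    # Average feature vector of all training examples with this label.
    pts = [f for f, lab in training if lab == label]
    return tuple(sum(c) / len(pts) for c in zip(*pts))

centroids = {label: centroid(label) for label in ("spam", "ham")}

def classify(features):
    # Predict the label whose centroid is nearest (squared distance).
    return min(centroids, key=lambda lab: sum(
        (a - b) ** 2 for a, b in zip(features, centroids[lab])))

print(classify((6, 5)))  # prints "spam"
print(classify((1, 1)))  # prints "ham"
```

Without the human labels ("spam", "ham") there is nothing here for the program to learn from, which is exactly the bottleneck described above.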
"AlphaGo Zero" was designed to avoid all of these problems and completely skip the train wheel phase. The project takes advantage of the rules of the game and the "reward function", that is, when it wins the game, it rewards a bit, loses and deducts a point. And then experiments continue, repeatedly through games, against other versions of themselves, and constrained by the reward mechanism, that is, as much as possible to win the reward, so as to maximize the reward.
The program began by placing stones at random, with no idea what it was doing. But it improved rapidly. After a day of play it had risen to the level of a strong expert. After two days it had surpassed the version that beat Lee Sedol in 2016.
DeepMind's researchers were also able to watch this process of self-reinvention as it unfolded.
All told, the original AlphaGo learned the game by studying thousands of contests between expert human players and then sharpening its play over millions of further games, enough to make it stronger than any human being. But DeepMind's researchers believed the technique could be improved. In a paper just published in Nature, they describe its latest version, AlphaGo Zero. It plays better, learns faster and needs less computing hardware to excel. Most important, unlike the original, AlphaGo Zero taught itself the game without any help from human experts.