23216 - From the Agent to AlphaGo

N. Lygeros

To be intelligent is quite easy. To understand intelligence is difficult, and to create an artificial intelligence is tougher still. After the death of Marvin Minsky, the victory of AlphaGo against Fan Hui, the three-time European Go champion, is the continuation of a long path. Certainly AlphaGo operated in a closed world but, and this is the key point, not with brute force. There is a big difference between this match and the 1997 match between Kasparov and Deep Blue in chess, and the difference is not only due to the fact that Go is even harder than chess. The difference comes from the methodology. The Agent was the first computer program capable of learning a wide variety of tasks: it learnt to play 49 different retro computer games and, in this way, found its own strategies for winning. With Deep Blue, programmers and grandmasters managed to teach the machine aspects of their knowledge. In the case of the Agent, the lesson starts from the bottom level: the starting point is randomness, and only afterwards come deep learning and reinforcement learning (see the first sketch below). Because of this, the strategies developed by the Agent are different from the previous ones. So it is better to see AlphaGo as the evolution of the Agent in the field of Go rather than as an independent product. The problem with Go, in comparison with chess, is evaluating whether an evolving position is winning. AlphaGo uses two neural networks to make decisions and select moves: a policy network and a value network (see the second sketch below). For this reason, we are only at the beginning, and not at the end as with Deep Blue.
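The first sketch illustrates the trajectory from random play to learned strategy. It is a minimal, assumption-laden stand-in, not DeepMind's actual deep Q-network: tabular Q-learning on a toy chain environment, where the environment, the parameters, and the epsilon-greedy exploration are all illustrative choices.

```python
import random
from collections import defaultdict

# Minimal sketch of learning from randomness via reinforcement learning.
# This is tabular Q-learning on a toy chain environment, NOT DeepMind's DQN;
# the environment and all parameters here are illustrative assumptions.

N_STATES = 5          # states 0..4; reaching state 4 ends the episode
ACTIONS = [-1, +1]    # step left or right along the chain
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

q = defaultdict(float)  # Q-values, keyed by (state, action), default 0.0

def step(state, action):
    """Toy dynamics: move along the chain, reward 1.0 at the right end."""
    nxt = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

for _ in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: purely random exploration at first,
        # increasingly greedy exploitation as Q-values take shape.
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        nxt, reward, done = step(state, action)
        # Q-learning update: bootstrap from the best next action.
        best_next = max(q[(nxt, a)] for a in ACTIONS)
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = nxt

# After training, the greedy policy should step right in every state.
print({s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES)})
```

The point of the sketch is that no strategy is given in advance: the agent begins with random moves and the update rule alone shapes its behaviour, which is the methodological break with Deep Blue described above.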
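The second sketch shows, in a grossly simplified one-ply form, how a policy network and a value network can share a single decision. Real AlphaGo embeds both networks inside Monte Carlo tree search; the `policy_net` and `value_net` functions below are hypothetical stubs, and the mixing weight is an illustrative assumption.

```python
# Simplified sketch of a two-network decision, in the spirit of
# AlphaGo's policy/value split. A one-ply lookahead stands in for
# the full tree search; the networks are hypothetical stubs.

from typing import Callable, Dict, List

def select_move(position: str,
                legal_moves: List[str],
                policy_net: Callable[[str], Dict[str, float]],
                value_net: Callable[[str], float],
                mixing: float = 0.5) -> str:
    """Mix the policy prior for each move with the value of the
    position it leads to, then pick the best-scoring move."""
    priors = policy_net(position)  # move -> prior probability

    def score(move: str) -> float:
        child = position + move  # toy stand-in for "play the move"
        return mixing * priors.get(move, 0.0) + (1 - mixing) * value_net(child)

    return max(legal_moves, key=score)

# Stub networks for demonstration only.
policy_net = lambda pos: {"a": 0.6, "b": 0.3, "c": 0.1}
value_net = lambda pos: 0.9 if pos.endswith("b") else 0.4

print(select_move("", ["a", "b", "c"], policy_net, value_net))  # -> "b"
```

In the real system the prior guides which branches the search explores, while the value network helps evaluate positions deep in the tree, so neither network decides alone.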