Whereas the original AlphaGo learned by ingesting data from hundreds of thousands of games played by human experts, AlphaGo Zero, also developed by the Alphabet subsidiary DeepMind, started with nothing but a blank board and the rules of the game.
Both AlphaGo and AlphaGo Zero use a machine-learning approach known as reinforcement learning as well as deep neural networks.
Reinforcement learning is inspired by the way animals seem to learn through experimentation and feedback, and DeepMind has used the technique to achieve superhuman performance in simpler Atari games.
Reinforcement learning also shows promise for automating the programming of machines in many other contexts, including those where it would be impractical to program them by hand.
In many real-world situations there may not be a large number of examples to learn from, meaning machines will have to learn for themselves.
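To make the idea of learning through experimentation and feedback concrete, here is a toy sketch in Python. It is not AlphaGo Zero's method, which pairs deep neural networks with self-play. It is only a minimal "epsilon-greedy" learner choosing among three slot-machine-style actions. The function name `train_bandit`, the payout probabilities, and every parameter are hypothetical choices made for illustration.

```python
import random


def train_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Toy epsilon-greedy learner (illustrative assumption, not DeepMind's code).

    The learner is never told which action is best. It tries actions,
    observes a 0/1 reward, and keeps a running average reward per action.
    """
    rng = random.Random(seed)          # seeded so repeated runs behave the same
    n = len(reward_probs)
    counts = [0] * n                   # how many times each action was tried
    values = [0.0] * n                 # running average reward per action

    for _ in range(steps):
        if rng.random() < epsilon:
            # Experiment: pick a random action some of the time.
            action = rng.randrange(n)
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(n), key=lambda a: values[a])

        # Feedback: reward of 1 with the action's hidden payout probability.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Incremental update of the running average for the chosen action.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return values, counts


# Hypothetical payout probabilities, hidden from the learner.
values, counts = train_bandit([0.2, 0.5, 0.8])
best = max(range(len(values)), key=lambda a: values[a])
print("action judged best:", best)
print("estimated payouts:", [round(v, 2) for v in values])
```

After enough trials the learner's estimates should favor the action with the highest payout, even though it started with no examples to learn from. Systems like AlphaGo Zero apply the same try, observe, and adjust loop at a vastly larger scale.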
Martin Mueller, a professor at the University of Alberta in Canada who has done important work on Go-playing software, is impressed by the design of AlphaGo Zero and says it advances reinforcement learning.
“It’s a nice illustration of the recent progress in deep learning and reinforcement learning, but I wouldn’t read too much into it as a sign of what computers can learn without human knowledge,” says Pedro Domingos, a professor at the University of Washington.