To begin, AlphaGo took 150,000 games played by good human players and used an artificial neural network to find patterns in those games.
AlphaGo’s designers then improved the neural network by repeatedly playing it against earlier versions of itself, adjusting the network so it gradually improved its chance of winning.
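That self-play loop can be sketched in miniature. Everything below is a stand-in of my own devising, not DeepMind's setup: a softmax policy over three moves in a toy game where the higher move wins, trained against a periodically frozen earlier copy of itself with a REINFORCE-style update.

```python
import math
import random

random.seed(0)

MOVES = 3  # toy game (an assumption): each side picks 0..2; the higher move wins

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample(probs):
    r = random.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1

def self_play_update(logits, frozen_logits, lr=0.1):
    """Play one game against a frozen earlier copy, then nudge the
    current policy toward its move if it won, away if it lost."""
    probs = softmax(logits)
    my_move = sample(probs)
    opp_move = sample(softmax(frozen_logits))
    reward = 1.0 if my_move > opp_move else (-1.0 if my_move < opp_move else 0.0)
    # REINFORCE-style gradient of log p(my_move) for a softmax policy
    for i in range(MOVES):
        grad = (1.0 if i == my_move else 0.0) - probs[i]
        logits[i] += lr * reward * grad

logits = [0.0] * MOVES
for generation in range(20):      # periodically refresh the frozen opponent
    frozen = list(logits)
    for _ in range(200):
        self_play_update(logits, frozen)

print(softmax(logits))  # the winning move (2) should now carry most of the probability
```

Freezing the opponent each generation mirrors the idea of playing against earlier versions of the network rather than a moving target.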
The remaining hurdle was valuation: judging how promising a board position is without searching every continuation. The developers’ core idea was to have AlphaGo play the policy network against itself, using the outcomes of those games to estimate how likely a given board position was to be a winning one.
AlphaGo combined this approach to valuation with a search through many possible lines of play, biasing that search toward moves the policy network thought were likely.
In an earlier paper, the organization behind AlphaGo – Google DeepMind – built a neural network that learned to play 49 classic Atari 2600 video games, in many cases reaching a level that human experts couldn’t match.
Many of the projects employing these networks have been visual in nature, involving tasks such as recognizing artistic style or developing good video-game strategy.
With the images below, the network classified the image on the left correctly, but when researchers added the tiny perturbation shown in the center image, the network misclassified the resulting image on the right, even though it appears indistinguishable from the original.
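A tiny linear stand-in shows the mechanics of such an attack. The perturbation rule is in the spirit of the fast gradient sign method of Goodfellow et al.; the weights, input, and budget are made-up numbers, and a real image has many thousands of pixels, so the per-pixel change can be far smaller than it is here.

```python
EPS = 0.05  # per-pixel perturbation budget (a hypothetical value)

def classify(weights, x):
    """Toy linear 'network': class 1 if the weighted sum is positive."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else 0

def sign(v):
    return (v > 0) - (v < 0)

def perturb(weights, x, eps=EPS):
    """Nudge every pixel by eps in the direction that most decreases
    the class-1 score: the sign of the score's gradient w.r.t. the pixel."""
    return [xi - eps * sign(w) for w, xi in zip(weights, x)]

weights = [0.5, -0.3, 0.8, 0.1, -0.6, 0.4, 0.2, -0.7]
x = [0.05, -0.025, 0.025, 0.075, -0.025, 0.025, 0.05, -0.025]

x_adv = perturb(weights, x)
print(classify(weights, x), classify(weights, x_adv))  # the label flips
```

The point the figure makes survives even in this toy: because every pixel is pushed a little in a coordinated direction, the classifier's score crosses the decision boundary while no single pixel changes by more than the small budget.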