Indeed, I wish someone would explain how the value/policy networks work.

taneq · on March 10, 2016

As I understand it, the value network takes the place of the heuristic for scoring a given board layout, and the policy network takes the place of the heuristic for ordering moves from most to least promising.

When searching the game tree, only the N most promising moves at each ply are examined (as ranked by the policy network), and the leaves of the tree are scored by the value network.
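
For what it's worth, here is a rough Python sketch of the search as described above. It is an illustration of the idea, not a claim about the actual implementation. Everything in it is made up for the example: the game is a toy (a pile of stones, each player takes 1 to 3, whoever takes the last stone wins), `policy_fn` and `value_fn` are hand-written stand-ins for the two networks, and the search is a plain depth-limited negamax.

```python
from typing import List, Tuple

State = int  # toy game: stones left in the pile; the player to move takes 1-3


def legal_moves(state: State) -> List[int]:
    return [m for m in (1, 2, 3) if m <= state]


def apply_move(state: State, move: int) -> State:
    return state - move


def policy_fn(state: State) -> List[Tuple[int, float]]:
    """Hypothetical stand-in for the policy network: (move, prior) pairs."""
    moves = legal_moves(state)
    # Toy prior: favour moves that leave the opponent a multiple of 4.
    scores = [2.0 if (state - m) % 4 == 0 else 1.0 for m in moves]
    total = sum(scores)
    return [(m, s / total) for m, s in zip(moves, scores)]


def value_fn(state: State) -> float:
    """Hypothetical stand-in for the value network.

    Returns a score in [-1, 1] from the point of view of the player to move.
    """
    if state == 0:
        return -1.0  # the previous player took the last stone and won
    return -0.5 if state % 4 == 0 else 0.5


def ranked_moves(state: State, top_n: int) -> List[int]:
    """The top_n moves, ordered from most to least promising by the policy."""
    ranked = sorted(policy_fn(state), key=lambda mp: mp[1], reverse=True)
    return [move for move, _prior in ranked[:top_n]]


def search(state: State, depth: int, top_n: int) -> float:
    """Negamax value of `state` for the player to move."""
    if depth == 0 or state == 0:
        return value_fn(state)  # leaves are scored by the value function
    best = -float("inf")
    for move in ranked_moves(state, top_n):  # only the top-N moves are expanded
        best = max(best, -search(apply_move(state, move), depth - 1, top_n))
    return best


def best_move(state: State, depth: int, top_n: int) -> int:
    """Pick the move with the best searched value (requires depth >= 1)."""
    return max(
        ranked_moves(state, top_n),
        key=lambda m: -search(apply_move(state, m), depth - 1, top_n),
    )
```

Calling `best_move(5, depth=2, top_n=2)` looks at only the two highest-ranked moves from a pile of 5 and searches two plies deep. The trade-off is that a small `top_n` keeps the tree narrow, but a good move ranked below the cutoff is never examined at all.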