Write Way Pro: reinforcement

Wednesday, November 22, 2017

Beyond Games: The Future of AI and AlphaGo Zero

Intel's Bob Rogers explains the possibilities that emerge as AI progresses beyond standard machine learning. DeepMind's self-taught Go champion is just the beginning.

The next step for AI is a system that generates its own examples and uses them to build the models that extract rules. That’s what AlphaGo Zero did when it generated a million examples of Go games to improve its own play. It did so through reinforcement learning, which relies on “feedback: positive reinforcement for what’s right and penalties for what’s gone wrong,” Rogers said.
While that ability opens up great possibilities for systems to learn to answer the questions we want answered, the thing to remember is that the systems “are very much unitaskers,” Rogers said. AlphaGo Zero may be an unparalleled Go player, but playing Go is the only thing “the program is designed to do.” Through transfer learning, AI systems can shift to apply the same kind of deep learning to another domain. Still, they would not do so on their own; someone would have to set them up for that.
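
The feedback loop Rogers describes can be illustrated with a toy sketch. This is not how AlphaGo Zero is implemented (that system pairs deep neural networks with tree search); it is a minimal table-based example on a tiny stone-taking game, assuming a player may take 1 to 3 stones and whoever takes the last stone wins. The names train_self_play and best_move, and every parameter value, are hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def train_self_play(pile_size=10, episodes=30000, epsilon=0.2, lr=0.05, seed=0):
    """Learn move values for a toy stone-taking game purely from self-play.

    Illustrative assumptions: players alternate taking 1-3 stones, and the
    player who takes the last stone wins.
    """
    rng = random.Random(seed)
    # (stones_left, stones_taken) -> estimated value for the player moving
    q = defaultdict(float)
    for _ in range(episodes):
        stones, history = pile_size, []
        while stones > 0:
            moves = [m for m in (1, 2, 3) if m <= stones]
            if rng.random() < epsilon:
                move = rng.choice(moves)  # explore: try a random move
            else:
                move = max(moves, key=lambda m: q[(stones, m)])  # exploit
            history.append((stones, move))
            stones -= move
        # Feedback: the last mover won, so that player's moves earn +1.
        # Moves alternate between players, so the sign flips at each step back.
        reward = 1.0
        for state_action in reversed(history):
            q[state_action] += lr * (reward - q[state_action])
            reward = -reward
    return q


def best_move(q, stones):
    """Return the move with the highest learned value for this pile size."""
    moves = [m for m in (1, 2, 3) if m <= stones]
    return max(moves, key=lambda m: q[(stones, m)])


if __name__ == "__main__":
    q_table = train_self_play()
    for pile in (5, 6, 7):
        print(pile, "stones left -> take", best_move(q_table, pile))
```

The program is never shown an example game. It plays against itself, rewards the moves that led to a win and penalizes the ones that led to a loss. The table it ends up with is also a unitasker in Rogers' sense: it is useful for this one game and nothing else.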

What AlphaGo Zero Means for the Future of AI