AlphaGoZero and Multi-Task Learning

As you probably know (unless you’ve been hiding under a rock), AlphaGoZero beat AlphaGo 100–0, trained entirely through self-play, with no human games in the training data.
When you look under the hood, the fascinating thing is that almost 50% of the gain came from simply updating the architecture from a plain “convolutional” one to a “residual” one, i.e., adding skip connections so each block only has to learn an adjustment to its input rather than a whole new transformation. (°)
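To make “residual” concrete, here’s a minimal sketch of a single residual block, assuming PyTorch; the real network stacks many of these blocks (with 256 filters each), so the sizes below are illustrative rather than AlphaGoZero’s exact configuration.

  import torch.nn as nn
  import torch.nn.functional as F

  class ResidualBlock(nn.Module):
      # Two conv layers plus a skip connection -- the skip is the whole trick.
      def __init__(self, channels):
          super().__init__()
          self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
          self.bn1 = nn.BatchNorm2d(channels)
          self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
          self.bn2 = nn.BatchNorm2d(channels)

      def forward(self, x):
          out = F.relu(self.bn1(self.conv1(x)))
          out = self.bn2(self.conv2(out))
          return F.relu(out + x)  # add the input back in before the final ReLU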
The other 50%, interestingly, came from moving to Multi-Task Learning (MTL), where you train one model on two related tasks at once. For AlphaGoZero, those two tasks were predicting the next move and predicting the game’s eventual winner, both coming out of a single shared network (there’s a code sketch of this two-headed setup after the list below). In human terms, think of this as the Karate Kid approach, where “wax on wax off” served to also teach karate (and, don’t forget, wax the car!).
In particular, MTL is useful when you want to
  • Focus Attention: An auxiliary task provides additional evidence for which features are relevant and which are not
  • Eavesdrop: Some things are easier to learn through a different task, à la the Karate Kid
  • Prevent Overfitting: A representation that has to work for two tasks at once is harder to overfit; the second task keeps you honest
  • Avoid Representation Bias: The model is nudged towards representations that other tasks can reuse, so it generalizes beyond the task at hand
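To make the MTL part concrete, here’s a minimal sketch, again assuming PyTorch and reusing the ResidualBlock above, of a two-headed network in the spirit of AlphaGoZero’s: one shared trunk, a policy head that predicts the next move, and a value head that predicts the winner. The sizes here are illustrative, not the paper’s exact configuration.

  class DualHeadNet(nn.Module):
      # One shared trunk, two task-specific heads: that's the MTL part.
      def __init__(self, planes=17, channels=64, board=19, blocks=4):
          super().__init__()
          self.stem = nn.Sequential(
              nn.Conv2d(planes, channels, 3, padding=1, bias=False),
              nn.BatchNorm2d(channels),
              nn.ReLU())
          self.trunk = nn.Sequential(
              *[ResidualBlock(channels) for _ in range(blocks)])
          # Task 1 (policy): a score for every board point, plus one for "pass".
          self.policy = nn.Sequential(
              nn.Conv2d(channels, 2, 1),
              nn.Flatten(),
              nn.Linear(2 * board * board, board * board + 1))
          # Task 2 (value): a single number in [-1, 1] estimating who wins.
          self.value = nn.Sequential(
              nn.Conv2d(channels, 1, 1),
              nn.Flatten(),
              nn.Linear(board * board, 64),
              nn.ReLU(),
              nn.Linear(64, 1),
              nn.Tanh())

      def forward(self, x):
          h = self.trunk(self.stem(x))
          return self.policy(h), self.value(h)

Because both heads share one trunk, a single combined loss (cross-entropy on the move prediction plus mean-squared error on the game outcome) trains everything at once, and every gradient step on one task shapes the representation the other task relies on.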
It’s a fascinating approach, and a particularly relevant one these days, as we strive towards AGI. For a seriously deep dive into MTL, check out Sebastian Ruder’s excellent writeup at http://ruder.io/multi-task/index.html
(°) More on AlphaGoZero by Seth Weidman at https://goo.gl/uEMzsS
