“What Am I Thinking Of?” — Deep Learning Edition

It’s a bit of a holy grail really — if we could, somehow, figure out what all that brain activity actually corresponds to, well, that would change everything, wouldn’t it? Control systems powered by thoughts would move from science fiction to reality, as would the abuses of privacy. That said, the brain has been remarkably opaque to our probing thus far, so I guess the future is still the future, right?
Maybe not so far in the future though. The first crack came from fMRI, where, to simplify things, the experimenters basically stuck people in MRI systems, showed them a picture (of, say, a leopard), and recorded what happened in the brain.
The good news is that a bunch of stuff would light up in the brain, with the equivalent bad news being that a bunch of stuff would light up in the brain, and nobody knew what it meant.
Some researchers went about this a bit more scientifically, and instead of a leopard, would show the subjects simple stuff. Kamitani & Tong, for example, showed their subjects straight lines at different angles and recorded the fMRI results¹. With some statistical analysis, they were able to get a bit further, and say stuff like “yeah, the subject saw an edge at a 45° angle”.
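For the curious, here’s roughly what that sort of analysis boils down to in code: treat each scan as a vector of voxel activations and train a plain linear classifier to predict the orientation. This is a minimal sketch on synthetic stand-in data (random voxels with a weak orientation-dependent signal mixed in), not the actual Kamitani & Tong pipeline:

```python
# A minimal sketch (synthetic data, not real fMRI) of orientation decoding:
# each trial is a vector of voxel activations, and a plain linear classifier
# learns to predict which of four grating orientations was on screen.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_trials, n_voxels = 400, 200
orientations = rng.integers(0, 4, size=n_trials)   # 0°, 45°, 90°, 135°
signal = rng.normal(size=(4, n_voxels))            # hypothetical per-orientation voxel bias
X = rng.normal(size=(n_trials, n_voxels)) + 0.5 * signal[orientations]

X_tr, X_te, y_tr, y_te = train_test_split(X, orientations, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"decoding accuracy: {clf.score(X_te, y_te):.2f}")   # chance would be 0.25
```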
This line of work promptly got extended to building out entire databases (“this pattern in the brain means patient X saw Vermeer’s Girl with a Pearl Earring”)². Good stuff, but still a long way from generalization, largely because
a) You could only recognize things that were already in the database, and
b) Each database was patient specific.
This, of course, is exactly where Deep Learning (DL) comes into play. Way back in 2017 (that’s almost a century ago in DL years), Horikawa & Kamitani³ used Deep Neural Networks (DNNs) to map what the fMRI saw in the subject’s brain against the image that the subject was looking at. Even more importantly, they found that the complexity levels of the visual features (think face vs oval) mapped directly onto a hierarchy of brain regions in the visual cortex — implying that the brain progressively recruits these regions to decode the (complex!) visual features of whatever the eye is looking at.
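The mapping step itself is surprisingly mundane: fit a regularized linear regression from voxel patterns to a DNN layer’s activations, one decoder per layer. Here’s a minimal sketch along those lines, with every number and matrix below being a synthetic stand-in:

```python
# A minimal sketch of the feature-decoding step: a regularized linear map
# from voxel activity to the activations of one DNN layer. All data here is
# a synthetic stand-in; in the paper, X is measured fMRI and Y is (e.g.) a
# pretrained network's layer activations for the image the subject viewed.
# One decoder per layer lets low layers pair off with early visual areas.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
n_imgs, n_voxels, n_feats = 300, 500, 1000
W_true = rng.normal(size=(n_voxels, n_feats))      # pretend brain-to-feature mapping
X = rng.normal(size=(n_imgs, n_voxels))            # voxel pattern per viewed image
Y = X @ W_true + 0.1 * rng.normal(size=(n_imgs, n_feats))   # "DNN features"

decoder = Ridge(alpha=10.0).fit(X[:250], Y[:250])
pred = decoder.predict(X[250:])
# decoders like this get scored by correlating decoded vs true features
corr = np.mean([np.corrcoef(pred[i], Y[250 + i])[0, 1] for i in range(50)])
print(f"mean feature correlation on held-out images: {corr:.2f}")
```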
Which brings us to now, when Horikawa & Kamitani⁴ (again!) have taken this whole thing one step further. They found a way to reconstruct the images that the visual cortex is “seeing”!
The trick here is work done by Aravindh Mahendran and Andrea Vedaldi⁵, who — in 2015 — had inverted DL-based image recognition, i.e., by looking at the way an image is “represented” in the DNN, they could figure out what the source image was.
The analogy at this point should be clear, no? The fMRI shows us the way in which the data is “represented” in the brain’s Neural Networks. And Mahendran/Vedaldi’s technique can be used to reconstruct the original image from this encoding!
/via https://www.biorxiv.org/content/biorxiv/early/2017/12/30/240317.full.pdf
OK, it’s a bit trickier than that. After all, the brain’s Neural Network is actually not the same as a DNN, right? The way Horikawa & Kamitani got around that was by training decoders that translate the fMRI activity into the feature space of a DNN that was shown the same images. To quote,
The reconstruction algorithm starts from a random image and iteratively optimize the pixel values so that the DNN features of the input image become similar to those decoded from brain activity across multiple DNN layers.
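Here’s a minimal PyTorch sketch of that loop. The feature extractor below is a tiny untrained stand-in (the paper uses a pretrained VGG-type network), and the target features come from a pretend “seen” image where the paper plugs in features decoded from fMRI; the optimization itself is the same idea:

```python
# Start from random pixels and run gradient descent so the image's features
# match a fixed target, as the quote describes. The convnet here is a toy
# stand-in feature extractor, not the network used in the paper.
import torch
import torch.nn as nn

torch.manual_seed(0)
features = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
)
for p in features.parameters():        # the network stays fixed; only pixels move
    p.requires_grad_(False)

target_img = torch.rand(1, 3, 64, 64)
target_feats = features(target_img)    # stand-in for brain-decoded features

img = torch.rand(1, 3, 64, 64, requires_grad=True)   # start from random pixels
opt = torch.optim.Adam([img], lr=0.05)
for step in range(200):
    opt.zero_grad()
    loss = ((features(img) - target_feats) ** 2).mean()
    loss.backward()
    opt.step()
    img.data.clamp_(0, 1)              # keep pixels in a displayable range
print(f"final feature-matching loss: {loss.item():.4f}")
```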
There’s more, much more; go read the paper for the details. The results, shown for three subjects, are fascinating — you can clearly see some level of complexity to the features, colors, and textures. It’s not shown here, but the results for reconstructing alphabetical letters were even better — probably because they only involve edge detection and basic shapes.
(Incidentally, the reason some of the images look “photo-negative” is that the luminance information gets lost in the fMRI-to-DNN translation.)
/via https://www.biorxiv.org/content/biorxiv/early/2017/12/30/240317.full.pdf
Interesting times indeed — at this rate we may have to start thinking about privacy considerations in this area before too long!
1. “Decoding the visual and subjective contents of the human brain” — by Yukiyasu Kamitani and Frank Tong
2. “Identifying natural images from human brain activity” — by Kay et al.
3. “Generic decoding of seen and imagined objects using hierarchical visual features” — by Tomoyasu Horikawa and Yukiyasu Kamitani
4. “Deep image reconstruction from human brain activity” — by Guohua Shen, Tomoyasu Horikawa, Kei Majima, and Yukiyasu Kamitani
5. “Understanding Deep Image Representations by Inverting Them” — by Aravindh Mahendran and Andrea Vedaldi
