Face Synthesis Outside The Uncanny Valley

/via https://bit.ly/2KjRUCw
Remember The Polar Express? For those who don’t (lucky you!), it was the ChristmasKidFlick™ of 2004, the one supposed to impart the Xmas spirit, good cheer, Santa Claus, and all that. Instead, it ended up becoming the living (!) embodiment of the Uncanny Valley, filled with dead-eyed CGI zombie kids, and was — there is no other word for it — creepy.
The thing is, humans tend to be very good at identifying and classifying behaviors in others, especially based on facial expressions.
That little twitch in your beloved’s lip means that they found your behavior adorable, the almost invisible wrinkle by the left eye — you goofed up real bad.
And when the behavior displayed doesn’t match what we expect it to look like (“huh? why did the eye twitch like that?”), it’s like there is a blaring siren saying “THIS IS ODD WHY IS IT THIS WAY I DON’T GET IT”. In short, humanoid objects which appear almost, but not exactly, like real human beings elicit uncanny, or strangely familiar, feelings of eeriness and revulsion, along with a dip in how relatable we find them.
It’s not just The Polar Express though; any number of movies have happily trod into the uncanny valley. Can we ever forget the visceral horror unleashed upon us when they decided to digitally remove Henry Cavill’s mustache in (the rightfully panned) Justice League?
The good news here is that Machine Learning is coming to the rescue. And no, I don’t mean the — deservedly hyped — realization that some basic ML could do a far superior job when it came to removing Superman’s mustache.
The context here is ground-breaking work from Microsoft’s AI folks showing how you can take the headshot of a person (the subject), and then morph it to have the same posture, emotion, or lighting as a completely different person’s headshot (the sample).
/via https://www.microsoft.com/en-us/research/blog/believing-is-seeing-insightful-research-illuminates-the-newly-possible-in-the-realm-of-natural-and-synthetic-images
As you can see in the sample above, the subjects (the “identities”, shown as -a- above) take on the posture, facial expressions, emotions, and lighting of the samples (the “attributes”, shown as -b- above). Most importantly, there is no Uncanny Valley effect at all — the generated faces look completely natural, and are similar to each other as well as to the subject.
Their work relies on using Generative Adversarial Networks (GANs) to separate out the “identity” part of the face (Think “This is Alice. It doesn’t matter if it is a headshot of her angry, from an angle, or whatever, it’s Alice”) from the “attributes” (think “angry, from an angle, against a blue background, dimly lit, etc.”). Once you do that, it becomes child’s play to recombine them so that you can use one person’s identity with another person’s attributes as shown in the examples above (or, for that matter, mix and match identities and attributes).
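To make the “separate, then recombine” idea concrete, here is a toy sketch. Everything below is my own illustration, not the paper’s architecture: the real identity encoder, attribute encoder, and generator are deep networks trained adversarially, whereas here they are stubbed out as fixed random linear maps so the recombination step itself is visible.

```python
import numpy as np

rng = np.random.default_rng(0)
D_IMG, D_ID, D_ATTR = 64, 8, 8  # toy dimensions (illustrative only)

# Stand-ins for the trained networks: random linear projections.
W_id = rng.normal(size=(D_ID, D_IMG))            # "identity encoder"
W_attr = rng.normal(size=(D_ATTR, D_IMG))        # "attribute encoder"
W_dec = rng.normal(size=(D_IMG, D_ID + D_ATTR))  # "generator"

def encode_identity(img):
    # Maps an image to a latent vector capturing *who* it is.
    return W_id @ img

def encode_attributes(img):
    # Maps an image to a latent capturing pose, lighting, expression, etc.
    return W_attr @ img

def decode(z_id, z_attr):
    # The generator consumes the concatenated identity + attribute latents.
    return W_dec @ np.concatenate([z_id, z_attr])

subject = rng.normal(size=D_IMG)  # "headshot" of person A (random stand-in)
sample = rng.normal(size=D_IMG)   # "headshot" of person B

# Keep A's identity, borrow B's posture/lighting/expression:
swapped = decode(encode_identity(subject), encode_attributes(sample))
```

The mix-and-match mentioned above is just a matter of which identity latent you pair with which attribute latent before calling the generator.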
The details on how they do this are fascinating, and involve asymmetric loss functions that implement cross-entropy loss when training the classification and discriminative networks. And yeah, if that means nothing, don’t worry. If it does sound interesting, go read the whole paper.
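For the curious, cross-entropy loss itself is simple to state: the classification network outputs a score (logit) per known identity, and the loss is the negative log of the probability it assigns to the true identity. A small numeric sketch (the three-identity setup is made up for illustration):

```python
import numpy as np

def cross_entropy(logits, true_idx):
    """Negative log-probability of the true class under a softmax."""
    z = logits - logits.max()          # shift for numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    return -np.log(probs[true_idx])

logits = np.array([2.0, 0.5, -1.0])    # scores for 3 hypothetical identities

loss_right = cross_entropy(logits, 0)  # confident and correct: small loss
loss_wrong = cross_entropy(logits, 2)  # confident and wrong: large loss
```

Being confidently wrong is penalized much more heavily than being confidently right is rewarded, which is what pushes the identity classifier (and, adversarially, the generator) in the right direction during training.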
To bring this back to the real world, we now have the building blocks for modifying faces without entering the Uncanny Valley! We can
• Have a “stunt double” perform the scenes, and swap in the actor’s face, or even
• Have the actor perform the scene, and swap in the actor’s face from when they were younger, didn’t have a mustache, or whatever.
Technology, it is a-changing, thanks to Deep Learning!
