Transferring Font “style” via Deep Learning

There’s been considerable work on transferring the “style” of one picture to another image (like “make cabins look like they’re in the depths of winter”, as seen below).
Doing this for pictures, it turns out, is a lot (lot!) simpler than doing it for fonts. Designing fonts is hard and, frankly, a bit of a PITA. As a result, when you come up with a specific font for, say, a movie poster or a logo, you usually design only the glyphs for the specific use at hand. So, if your image has the text “PLUGH” in it, you do the letters “P”, “L”, “U”, “G”, and “H”, put ’em together, tweak spacing, color, etc., and you’re done. You just don’t worry about the rest of the letters of the alphabet, casing, numbers, symbols, whatever.
So great, somebody comes along and wants to create an image with the text “XYZZY” on it, in the same font. (Remember, this somebody could be you!) The old-school way would be labor-intensive: you just start from scratch.
In a new paper, Azadi et al. show how this can be done using Deep Learning (•). In particular, they created a conditional Generative Adversarial Network (cGAN) that does this for you. It is a stacked architecture with two parts:
  1. GlyphNet, which learns from the sample fonts to create the coarse glyph shapes for the text you want to generate, and
  2. OrnaNet, which fine-tunes the color, styling, and ornamentation of the glyphs created by GlyphNet.
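To make the two-stage data flow concrete, here is a toy sketch in plain Python. It is not the authors' code and contains no neural networks: `glyph_net` and `orna_net` are hypothetical stand-ins (a pixel-wise average and a flat tint) for the trained generators, and the one-channel-per-letter input layout, the 4x4 glyph size, and every function name are assumptions made for illustration. The only thing it tries to show is the stacking: the shape stage fills in the missing letters first, and the ornamentation stage then styles all of them.

```python
import string

ALPHABET = string.ascii_uppercase  # one glyph channel per letter
SIZE = 4  # toy glyph resolution (assumption; real glyph images are much larger)


def blank_glyph():
    """An empty SIZE x SIZE grayscale glyph."""
    return [[0.0] * SIZE for _ in range(SIZE)]


def build_input_stack(observed):
    """Keep the observed glyphs in their channels; leave every other channel blank."""
    return [observed.get(ch, blank_glyph()) for ch in ALPHABET]


def glyph_net(stack, observed_letters):
    """Stand-in for stage 1 (shapes): fill each missing channel with the
    pixel-wise mean of the observed glyphs. A trained generator would
    predict actual letter shapes here."""
    if not observed_letters:
        raise ValueError("need at least one observed glyph")
    seen = [stack[ALPHABET.index(ch)] for ch in observed_letters]
    mean = [
        [sum(g[r][c] for g in seen) / len(seen) for c in range(SIZE)]
        for r in range(SIZE)
    ]
    return [
        stack[i] if ch in observed_letters else mean
        for i, ch in enumerate(ALPHABET)
    ]


def orna_net(shapes, rgb):
    """Stand-in for stage 2 (ornamentation): tint every grayscale pixel with
    one RGB color. A trained generator would add texture and decoration."""
    return [
        [[tuple(v * k for k in rgb) for v in row] for row in glyph]
        for glyph in shapes
    ]


def transfer(observed, rgb):
    """Run the stacked pipeline: shapes first, then ornamentation."""
    stack = build_input_stack(observed)
    shapes = glyph_net(stack, set(observed))
    return orna_net(shapes, rgb)


# Two "designed" toy glyphs go in, a full styled alphabet comes out.
observed = {
    "P": [[1.0] * SIZE for _ in range(SIZE)],
    "L": [[0.0] * SIZE for _ in range(SIZE)],
}
styled = transfer(observed, (1.0, 0.5, 0.0))
print(len(styled))  # 26
```

In the real system both stages are learned, so the interesting work happens exactly where this sketch cheats.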
The results are shown below. Given one word (TRANSFER), the cGAN generates the entire sentence (MULTI CONTENT GAN FOR FEW SHOT FONT STYLE TRANSFER).
Not bad, eh?
Mind you, this isn’t perfect (yet) — it’s not quite capable of the (very!) high font resolution that one tends to need for printing/publishing, large fonts, etc.
That said, it is a remarkably generalizable solution (who knew that GANs could do this!?!) for few-shot-learning-like situations, i.e., where complex content needs to be generated from just a few samples. As the authors put it, examples include “modifying a particular human face (style) to have a specific expression (content), consistent stylization of shapes such as emoticons, or transferring materials to consistent sets of objects such as clothing or furniture.”

(•) “Multi-Content GAN for Few-Shot Font Style Transfer” — by Azadi et al. The code is available here.
