Erlang, Binaries, and Garbage Collection (Sigh)

The nice thing about garbage-collection is that it makes memory management absolutely transparent to you.

Of course, the above is true right up to the point when it stops being transparent, at which point it does a phase change from nice to clusterf**k of biblical proportions.  There are entire posts, books, and suicide notes that have been written about this, and I have no great desire to add to them, save with the minor comment that paranoia about garbage-collection is still justified, albeit only in the edge-cases for most people.

That said, I'll give you that Erlang GC is in many ways much, much better positioned in this regard than, oh, Java, but it still rears up and bites you in the butt when you least expect it. (More information about Erlang's GC vis-a-vis Java here, and Jesper's truly excellent description of Erlang's GC here).  A story from the past should serve to illustrate this.

Once upon a time, in a magical land far far away - well, Herndon, Virginia, but whatever - we'd built out a huge CloudBuzzWordSystem™ which was mysteriously running out of memory.  The relevant part of the system dealt with processing phone calls, and the key details were - kinda sorta - as follows:
  1. All the information about a call - things like the caller info (Dick Taylor, +17739191234), callee's name (Lizzie Burton, +13122819876),  accounting information (Code: 5162, Billing: 37335, Account: 171), etc. came in over a socket, and was captured by a newly spun up per-call process as a single binary about 400K in size.
  2. This binary got passed on to various other processes in the system that dealt with the parts relevant to them.  For example, the account processes would check the caller for permissions (yeah, Dick can call Lizzie), a chargeback process would use the account information to create billing records (Account: 171, 3 cents), a UI process would take the names and phone numbers and display them on the user's GUI, blah, blah blah.
  3. Once the call ended, the per-call process in (1) above was destroyed, thus, theoretically, freeing up the memory occupied by the binary.
  4. But, mystery of mysteries, the memory didn't actually get freed, but kept increasing at around 400K/call :-(
Readers of the Erlang persuasion are probably already pointing at the screen muttering variations on the theme of "You Fool! Of course the memory increases! By passing the binary around you're probably just creating matches and/or sub-binaries, and till they are garbage-collected, the memory will not Go Away".
And, of course, they are Correct, because I didn't actually point out that in (2) above, the individual processes actually copy out the parts of the binary that they care about, because we actually weren't that dumb y'know?
(Stupid - yes.  Dumb, no.  And yeah, we didn't do it the first time around, but did get around to it, so please read on).

At this point, some of you might be wondering what on earth I am wittering about vis-a-vis matches, sub-binaries, and copying.  A brief digression should help -

In Erlang, there are two types of binaries - heap-binaries, and refc-binaries.  They basically work as follows
  • Heap-binaries are binaries that are up to 64 bytes in size, and are stored in each process's heap. They work just like any other piece of data for a process - copied when passed as a message, discarded when garbage-collected, etc., etc.
  • Refc-binaries (reference-counted) are binaries > 64 bytes in size.  They are stored in a totally separate area in the VM (think of this as the "large binary area").  A reference to this binary - called a ProcBin for some obscure reason - is stored in each process's heap.  When you pass this type of binary over in a message, a new ProcBin gets created on the target process's heap. (OK, this isn't strictly true, but it's good enough for this post.  The point here is that we just pass along a reference to the original (large) binary, not the (large) binary itself.)
To complicate matters just a wee bit more, Erlang does one additional bit of optimization here.  When you use either erlang:split_binary/2 or you match out a binary pattern, the resulting variables aren't necessarily new binaries! Instead, the VM creates a new type of binary called a sub-binary, which is, basically, a reference into the original binary (which could be either a heap-binary or a refc-binary from above). 
Got that?
Effectively there are 2x2 = 4 different types of binaries, three of which actually consist of references to the actual binary.
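
If you want to see the sub-binary thing for yourself, here is a minimal sketch. The module name bin_kinds and the map keys are made up for this post; the only real calls are binary:copy/2, erlang:split_binary/2, byte_size/1 and binary:referenced_byte_size/1, the last of which reports the size of the underlying binary that a (sub-)binary points at.

```erlang
-module(bin_kinds).
-export([demo/0]).

%% Hypothetical demo module; the names are illustrative only.
demo() ->
    %% 400 KB is far past the 64-byte heap-binary limit, so this one
    %% lives off-heap as a refc-binary.
    Big = binary:copy(<<0>>, 400 * 1024),

    %% Matching out the first 10 bytes gives a sub-binary that points
    %% into Big, not a fresh 10-byte binary.
    <<Head:10/binary, _/binary>> = Big,

    %% erlang:split_binary/2 hands back sub-binaries in the same way.
    {Left, _Right} = erlang:split_binary(Big, 10),

    #{head_size => byte_size(Head),
      head_references => binary:referenced_byte_size(Head),
      left_references => binary:referenced_byte_size(Left)}.
```

Calling bin_kinds:demo() should show a 10-byte Head whose referenced size is that of the whole 400K binary, which is exactly the "small thing keeping a big thing alive" situation.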

At this point, you should be saying Uh-Oh, because the thing about references, the tricky thing about references, the painfully tricky thing about references is that you can only garbage collect the original item when all the references are gone.
So yeah, when you pass (large) binaries around, they hang around till there are no more references to them, from anywhere, at which point they get garbage-collected. 
So, if you'd called split_binary/2, or matched a section out, or created a match context, till those binaries got garbage-collected, your original binary would hang around using up memory.

And that is what the "You Fool! Of course, the memory increases!..." bit from way earlier was about.
And that was why, whenever we matched the binaries out, we explicitly used binary:copy/1 to copy them into new heap-binaries, which, we were positive, would take care of the problem.
(Why?  Because we no longer cared about the original references - the ProcBins - since the copy created a brand-spanking-new binary)
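
For the curious, this is roughly what that copying looks like. It is a sketch, not our actual code: the module name bin_copy and the "first 32 bytes are the caller's name" layout are assumptions invented for illustration. binary:copy/1 and binary:referenced_byte_size/1 are the real library calls.

```erlang
-module(bin_copy).
-export([extract_name/1, demo/0]).

%% Hypothetical layout: the first 32 bytes of the payload hold a name.
extract_name(<<Name:32/binary, _/binary>>) ->
    %% Name is a sub-binary pointing into the whole payload.
    %% binary:copy/1 returns a fresh binary that no longer does.
    binary:copy(Name).

demo() ->
    Payload = binary:copy(<<"x">>, 400 * 1024),
    <<Sub:32/binary, _/binary>> = Payload,
    Copy = extract_name(Payload),
    %% First element: what the plain match keeps alive.
    %% Second element: what the copy keeps alive.
    {binary:referenced_byte_size(Sub),
     binary:referenced_byte_size(Copy)}.
```

The first number returned by bin_copy:demo() is the size of the entire payload, the second is just the 32 bytes of the copy. Note that this only deals with the reference held by the sub-binary itself; it says nothing about who else is still holding on to the payload.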

Except (sigh), the problem didn't go away. 
At all.
Oh, we mitigated most of the issues through some very aggressive (and goofy) garbage collection which I will not get into because it was highly embarrassing, but we pretty much learned to live with the problem.

Fast-forward about three years, when, thanks to an incidental post by Robert Virding on the topic, we figured out what the problem was.  It turns out that refc-binaries keep track of every process that has ever touched them!
I know, it's pretty obvious in retrospect, but the point here is that a refc-binary is not clobbered till every process that has ever touched it has been garbage-collected.
Not just processes that do something to the binary. Any process that touched it.
And therein lies the rub. Some of our processes barely did anything at all. For example, we had a few processes that acted as extremely simple "routers" - simply passing the binary along to an appropriate destination based on request type - and they didn't actually manipulate the binary in any form or fashion.  Think of them as the equivalent of me saying "Hey Bob, if you get a chance, drop this package off at Mary's place" and Bob never even looks inside the package.
The thing is, this "router" process was also on the list of processes that had "touched" the binary!!!
Given that this "router" process barely did anything, it also pretty much never got garbage-collected.
Which meant that the original binary hung around for a long time without getting garbage-collected.
And, of course, there were a bunch of similar processes that barely did anything, and, like an unwelcome guest, would simply refuse to co-operate and let the binary depart.
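
Here is a toy version of such a "router", as a sketch only. The module name router_leak and the {call, Binary} message shape are invented for this post. erlang:process_info(Pid, binary) is a real call, but the contents of the list it returns are documented as subject to change without notice, so the sketch only counts the entries.

```erlang
-module(router_leak).
-export([demo/0]).

%% Hypothetical router: forwards every message untouched, keeps no state.
router(Dest) ->
    receive
        Msg ->
            Dest ! Msg,
            router(Dest)
    end.

demo() ->
    Self = self(),
    Router = spawn(fun() -> router(Self) end),
    Router ! {call, binary:copy(<<0>>, 400 * 1024)},
    %% Wait for the forwarded message so we know the router handled it.
    receive {call, _Bin} -> ok end,
    %% Refc-binaries the router's heap still points at, before any GC.
    {binary, Before} = erlang:process_info(Router, binary),
    %% Force a collection of the (otherwise idle) router.
    true = erlang:garbage_collect(Router),
    {binary, After} = erlang:process_info(Router, binary),
    exit(Router, kill),
    {length(Before), length(After)}.
```

The router never looks inside the binary, yet the first count is typically non-zero: the forwarded message left a ProcBin behind on its heap, and since the router does so little work it has no reason to collect. Only after the forced collection does the count drop to zero.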

At this point, you're probably saying to yourself - "Yeah, yeah, this is all well and good, but what can I actually do about this?"
Tragically, there are no good / universal solutions to this problem - just a lot of "It depends...".  For example
  • If your code / system / algorithm / whatever allows for it, you could copy out your binaries before passing them around.  That way, the binaries at the destination are small independent copies (heap binaries, as long as each is 64 bytes or less), and your initial (large) binary gets garbage-collected when the originating process dies.  But, this might not actually be an option for you (it certainly wasn't for us), not to mention all the extra binary data that you are creating.
  • If you know exactly which processes are the offenders, you could explicitly call erlang:garbage_collect on/in them. But, the ensuing timing issues might prevent you from being able to do this.
  • Instead of the garbage_collect, you might be able to use spawn_opt (either to spawn the processes directly, or with your gen_*) to set {fullsweep_after, 0}, thus ensuring that the unused binaries get thrown away ASAP. (See the Note at the bottom for more about this).  Alternatively you can set fullsweep_after to a higher value based upon when you want your GC performed.
Do note, however, that these are all potentially problematic - there are definitely pluses and minuses associated with each approach, and you really, really need to know what you're doing.
The good news, however, is that you'll pretty much never run into this - and when you do, you hopefully have a deep enough understanding of your own system that you can actually do something about it based on the above.
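
To make the garbage_collect and fullsweep_after options a little more concrete, here is a minimal sketch. The module name router_gc and the functions start_router/1 and force_gc/1 are hypothetical. spawn_opt/2 with {fullsweep_after, N} and erlang:garbage_collect/1 are the real calls.

```erlang
-module(router_gc).
-export([start_router/1, force_gc/1]).

%% Spawn a do-nothing router whose every collection is a full sweep.
start_router(Dest) ->
    spawn_opt(fun() -> loop(Dest) end, [{fullsweep_after, 0}]).

loop(Dest) ->
    receive
        Msg ->
            Dest ! Msg,
            loop(Dest)
    end.

%% Explicitly collect a list of known offenders.
%% erlang:garbage_collect/1 returns false for a dead pid, which we ignore.
force_gc(Pids) ->
    [erlang:garbage_collect(Pid) || Pid <- Pids],
    ok.
```

For a gen_server the same option can be passed along at start-up, e.g. gen_server:start_link(Module, Args, [{spawn_opt, [{fullsweep_after, 0}]}]). One caveat worth keeping in mind: fullsweep_after sets how many generational collections may happen before a full sweep is forced, so it changes what a collection does when it runs, not whether a mostly-idle process decides to collect in the first place.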

And this, successfully, brings this whole post full circle - sometimes paranoia about garbage-collection is still justified...

Note: Robert Virding points out that "Processes use a generational garbage collection scheme where the most recently allocated data is collected most often. This is based on the heuristic that most data is short lived and can quickly be reclaimed which makes the [GC] more efficient.  This is generally very valid for Erlang.  It does mean however that for old data it can take a long time before it is reclaimed in a full sweep. [...] This can make matters worse in the case of references to large binaries as it means [that] they may stay around a longer time before they are reclaimed even if there is no live reference to them."


MononcQc said…
Hibernating the processes that don't do much might have been a decent way to solve the problem (untested). Hibernating forces a compaction of the stack+heap space, which goes through garbage collection, I believe.

If it truly does little, then you should be fine using hibernation after handling a message to keep everything clean.

This is still very hackish, but likely nicer than calling manual GCs, pending some rewrite that would let things be fine.
dieswaytoofast said…
Hibernate does, indeed, garbage-collect the process that was hibernated, which in this case would have helped with our problem. The downside with "hibernate" is that it adds a (potentially) problematic delay to the next request to the process, since the process will need to be "woken up".
Mind you, erlang:garbage_collect also adds an equivalent (potentially) problematic delay while the process gets garbage-collected.
The trick - if you use hibernate/GC - is to understand your own application flow, and know when you can use one vs. the other...
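
For reference, the hibernation idea discussed above might look something like this for a hand-rolled router. It is a sketch under assumptions: the module name hib_router is invented, and the loop is the same toy forwarder as before. erlang:hibernate/3 is the real BIF; it discards the call stack, garbage-collects the process, and resumes in the given exported function when the next message arrives.

```erlang
-module(hib_router).
-export([start/1, loop/1]).

start(Dest) ->
    spawn(?MODULE, loop, [Dest]).

%% loop/1 must be exported, because erlang:hibernate/3 re-enters it
%% through a fully qualified Module:Function(Args) call.
loop(Dest) ->
    receive
        Msg ->
            Dest ! Msg,
            %% Go to sleep between messages; this never returns, and the
            %% process wakes up in loop/1 on the next message.
            erlang:hibernate(?MODULE, loop, [Dest])
    end.
```

In a gen_server the equivalent is to return {noreply, State, hibernate} from the callback. Either way the wake-up cost mentioned above is paid on every message, so this only makes sense for processes that really are idle most of the time.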
