CPU Utilization in Erlang R15B02

Rickard Green makes an interesting point on the erlang-questions mailing list regarding CPU utilization in Erlang R15B02:
Regarding increased CPU utilization in R15. When schedulers run out of work, they busy wait for a while before going to sleep. Waking up a busy waiting thread is much faster than waking up a sleeping thread. Due to the rewrites of memory allocation in R15, schedulers are more frequently woken, which cause more busy wait, which in turn cause an increase in CPU utilization when schedulers frequently run out of work (you will at least see some decrease of CPU utilization due to this in R16). When not running out of work there will be no busy wait at all. That is, the increase in CPU utilization does not translate into loss of performance. The busy waiting is there since it shortens the average time to wake up a scheduler, and by this reduces average communication latency between processes. Depending on application the reduced latency might also translate into improved throughput. If the increase in CPU utilization is unwanted, one can as of R15B02 shorten the busy wait threshold (+sbwt command line argument). Note that by shortening the busy wait threshold, there will be an increased average latency.
For those of you who didn't feel like reading through all of that, it basically boils down to the following:
  • The Erlang VM has schedulers that, well, schedule processes
    • Basically, they look at all the processes waiting to "do stuff", figure out which one needs to run next, and run that process
  • As long as there are processes waiting to be run, the schedulers are busy
    • Note that busy is not the same as inefficient.  They are, if anything, ludicrously efficient
  • If there are no processes waiting to be run, the schedulers "busy wait". 
    •  Basically, they act like you do when you've drunk a lot of coffee, but have nothing to do, i.e., act all jittery and fidgety.  This way, if a process shows up, they can spring rapidly (and caffeine-atedly) into action
    • As a side effect, the CPU looks like the system is busy.  In reality, however, it's busy "wait"-ing, i.e., not really doing anything
  • After a while, if no processes show up, the schedulers go to sleep
    •  And, consequently, the CPU utilization goes down
 Got that?
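If it helps, the spin-then-sleep idea can be sketched in a few lines of Python. To be clear, this is only an illustration of the general pattern, not how the Erlang VM actually implements its schedulers: the names `wait_for_work` and `demo` are made up for this post, a `threading.Event` stands in for "a process showed up", and `busy_wait_s` plays the role of the busy wait threshold.

```python
import threading
import time


def wait_for_work(work_ready: threading.Event, busy_wait_s: float) -> str:
    """Wait until work_ready is set and report which phase noticed it.

    Hypothetical helper: spins (burning CPU) for up to busy_wait_s seconds,
    then falls back to a blocking wait that lets the OS put the thread to sleep.
    """
    deadline = time.monotonic() + busy_wait_s
    while time.monotonic() < deadline:
        # Busy wait: the CPU looks busy, but all we do is poll the flag.
        if work_ready.is_set():
            return "spinning"
    # Threshold exceeded: block until woken. No CPU is burned while blocked,
    # but waking a sleeping thread generally takes longer than noticing a flag.
    work_ready.wait()
    return "sleeping"


def demo(busy_wait_s: float, work_arrives_after_s: float) -> str:
    """Make work arrive after a delay and report how the waiter caught it."""
    work_ready = threading.Event()
    timer = threading.Timer(work_arrives_after_s, work_ready.set)
    timer.start()
    try:
        return wait_for_work(work_ready, busy_wait_s)
    finally:
        timer.join()


if __name__ == "__main__":
    # Generous threshold: the work is caught while still spinning.
    print(demo(busy_wait_s=2.0, work_arrives_after_s=0.05))
    # No busy wait at all: the waiter is already asleep when work arrives.
    print(demo(busy_wait_s=0.0, work_arrives_after_s=0.05))
```

A longer `busy_wait_s` means more time spent polling (and more apparent CPU load) in exchange for a better chance of catching new work without a sleep/wake cycle, which is the same tradeoff described in the quote above.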
The bottom line is that in R15B02, higher CPU utilization doesn't necessarily mean that the Erlang VM is thrashing - it could very well mean that the VM is actually not doing anything.
The key, of course, is to know which of the two it actually is doing...
 Mind you, this isn't exactly free.
As @duomark points out, this is "responsiveness as a service", with the higher CPU utilization translating to a higher electricity bill.
The tradeoff is, however, under your own control.  From the erl man page:


+sbwt none|very_short|short|medium|long|very_long



Set scheduler busy wait threshold. Default is medium. The threshold determines how long schedulers should busy wait when running out of work before going to sleep.
So yeah, you can lower the wait time before the scheduler goes to sleep, thus reducing your electricity bill, at the cost of increasing the response time when a new process comes along.



Comments

Licenser said…
Aloa,
I see one disadvantage of this - in a multi-application system, one VM running busy wait might impact another application doing actual busy work. Or can a system distinguish between busy wait and busy do?
dieswaytoofast said…
Licenser: Sadly, as far as most VMs go, there really is no way of knowing the difference. A simple thought-experiment should suffice - I might actually *want* to run a bajillion NoOps (load testing, efficiency checking, whatever).
So yeah, YMMV for sure...
Anonymous said…
I think the correct url for Rickard's post is http://erlang.org/pipermail/erlang-questions/2012-September/069306.html .
dieswaytoofast said…
Good catch - thanks!
