Stateful Services, Scaled Simply (sorry, couldn’t resist the alliteration)

#Wallaroo pipelines (°) can distribute (fan out) their data processing to multiple back-end nodes, in a map-reduce kind of way. A cool aspect of this is that the processing can be stateful.
The way it works is surprisingly simple. They:
  • Break “global state” into non-intersecting partitions, based on keys
  • Map keys to partitions with a “partition function”
  • Distribute partitions to nodes (redundantly, for reliability)
  • Send input data to partition/node based on the input key
  • Rebalance partitions when a node is added or removed
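To make the steps above concrete, here is a minimal Python sketch of the general idea. It is not Wallaroo's API or its actual implementation: the names (`partition_for`, `assign_partitions`, `Cluster`), the fixed partition count, the hash-modulo partition function, and the round-robin placement are all assumptions made for illustration. Replication is left out, and "rebalancing" here only reassigns partition ownership, whereas a real system would also ship the partition's state to its new node.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 8  # assumed, fixed number of partitions for this sketch


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Hypothetical partition function: stable hash of the key, modulo the partition count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def assign_partitions(nodes, num_partitions: int = NUM_PARTITIONS) -> dict:
    """Spread partitions over nodes round-robin (no replication in this sketch)."""
    if not nodes:
        raise ValueError("at least one node is required")
    ordered = sorted(nodes)
    return {p: ordered[p % len(ordered)] for p in range(num_partitions)}


class Cluster:
    """Toy cluster that keeps a running total per key, grouped by partition."""

    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.owner = assign_partitions(self.nodes)
        # state[partition][key] -> running total; partitions never share keys
        self.state = defaultdict(dict)

    def route(self, key: str):
        """Return (partition, node) responsible for this key."""
        partition = partition_for(key)
        return partition, self.owner[partition]

    def process(self, key: str, amount: int):
        """Apply a stateful update on the partition that owns the key."""
        partition, node = self.route(key)
        totals = self.state[partition]
        totals[key] = totals.get(key, 0) + amount
        return node, totals[key]

    def add_node(self, node: str) -> None:
        """Rebalance: recompute partition ownership with the new node included."""
        self.nodes.add(node)
        self.owner = assign_partitions(self.nodes)

    def remove_node(self, node: str) -> None:
        """Rebalance: recompute partition ownership without the removed node."""
        self.nodes.discard(node)
        self.owner = assign_partitions(self.nodes)


cluster = Cluster(["node-a", "node-b"])
cluster.process("user-1", 5)
cluster.process("user-1", 3)   # same key, same partition, same state
cluster.add_node("node-c")     # ownership is reshuffled, state stays with the partition
```

The point of the sketch is that state is attached to the partition, not to the node, so adding or removing a node only changes who owns which partition. The naive round-robin reassignment used here can move many partitions at once; a production system would likely use a scheme that moves fewer.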
It’s a clever trick; check out the details at https://goo.gl/vAUEeS
(°) Wallaroo applications are composed of one or more pipelines. A pipeline starts from a source, a point where data is ingested into the application. It is then composed of zero or more computations or state computations. Finally, it can optionally terminate in a sink, a point where data is emitted to an external system.
