Stateful Services: Scaled Simply (sorry, couldn’t resist the alliteration)
#Wallaroo pipelines (°) can distribute (fan out) their data processing to multiple back-end nodes, in a map-reduce kind of way. A (the?) cool aspect of this is that the processing can be stateful.
The way it works is surprisingly simple. They:
- Break “global state” into non-intersecting partitions, based on keys
- Map keys to partitions with a “partition function”
- Distribute partitions to nodes (redundantly, for reliability)
- Send input data to partition/node based on the input key
- Rebalance partitions when a node is added or removed
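The steps above can be sketched in a few lines of Python. This is a toy illustration only, not Wallaroo's API: the names `partition_for`, `assign_partitions`, and `Cluster`, the fixed count of 8 partitions, the SHA-256 hash, and the round-robin placement with 2 replicas are all assumptions made up for the example.

```python
import hashlib

NUM_PARTITIONS = 8  # assumed: a fixed number of partitions


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Hypothetical partition function: stable hash of the key, mod partition count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def assign_partitions(nodes, num_partitions=NUM_PARTITIONS, replicas=2):
    """Place each partition on up to `replicas` distinct nodes, round-robin."""
    ordered = sorted(nodes)
    copies = min(replicas, len(ordered))
    return {
        p: [ordered[(p + r) % len(ordered)] for r in range(copies)]
        for p in range(num_partitions)
    }


class Cluster:
    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.assignment = assign_partitions(self.nodes)
        # State is split by partition; each key lives in exactly one partition.
        self.state = {p: {} for p in range(NUM_PARTITIONS)}

    def route(self, key):
        """Return (partition, primary node) for an input key."""
        p = partition_for(key)
        return p, self.assignment[p][0]

    def update(self, key, amount):
        """Stateful computation: keep a running total per key."""
        p, node = self.route(key)
        self.state[p][key] = self.state[p].get(key, 0) + amount
        return node

    def add_node(self, node):
        self.nodes.add(node)
        self.assignment = assign_partitions(self.nodes)  # rebalance

    def remove_node(self, node):
        self.nodes.discard(node)
        self.assignment = assign_partitions(self.nodes)  # rebalance


cluster = Cluster(["node-a", "node-b", "node-c"])
cluster.update("user-42", 10)
cluster.update("user-42", 5)
cluster.add_node("node-d")  # partitions move, per-key state is unchanged
p, node = cluster.route("user-42")
print(cluster.state[p]["user-42"])  # prints 15
```

Because a key always hashes to the same partition, rebalancing only changes which node hosts a partition, never which partition owns a key. The round-robin reassignment here can move many partitions at once; a real system would likely try to move as few as possible.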
It’s a clever trick; check out the details at https://goo.gl/vAUEeS
(°) Wallaroo applications are composed of one or more pipelines. A pipeline starts from a source, a point where data is ingested into the application. It is then composed of zero or more computations or state computations. Finally, it can optionally terminate in a sink, a point where data is emitted to an external system.