Showing posts with the label distributed computing

Debugging Distributed Services with Squash

“Debugging distributed services is Teh Sux0r” is a very common complaint from people new to the world of distributed computing. Mind you, it’s true, but it also kind of misses the point. “Old School” monolithic single-node apps may support things like setting breakpoints, stepping through code, following (and changing) variables on the fly, and so forth, but once you start thinking in distributed terms (Consistency, Latency, Resilience, etc.), the very act of “debugging” can interfere with the thing being debugged. This, however, does not mean that there is no room for debugging. The more you focus on debugging the components, instead of the interactions between the components, the more value there is in “old school” debugging, and that’s where Squash comes in (https://github.com/solo-io/squash). From the docs: “Squash brings the power of modern popular debuggers to developers of microservices apps that run on container orchestra...

Stateful Services — Scaled Simply (sorry, couldn’t resist the alliteration )

#Wallaroo pipelines (°) can distribute/fan-out their data-processing to multiple back-end nodes, in a map-reduce kind of way. A (the?) cool aspect to this is that the processing can be stateful. The way it works is surprisingly simple. They:

1. Break “global state” into non-intersecting partitions, based on keys
2. Map keys to partitions with a “partition function”
3. Distribute partitions to nodes (redundantly, for reliability)
4. Send input data to the partition/node based on the input key
5. Rebalance partitions when a node is added or removed

It’s a clever trick; check out the details at https://goo.gl/vAUEeS

(°) Wallaroo applications are composed of one or more pipelines. A pipeline starts from a source, a point where data is ingested into the application. It is then composed of zero or more computations or state computations. Finally, it can optionally terminate in a sink, a point where data is emitted to an external system.
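To make the partitioning steps above a bit more concrete, here is a toy sketch in Python. To be clear, this is not Wallaroo's API or its actual algorithm: the names (`partition_for`, `assign_partitions`, `Cluster`), the fixed partition count, the hash-based partition function, and the round-robin placement are all assumptions made up for illustration, and the redundancy part is left out entirely.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 8  # assumption: a small, fixed number of partitions


def partition_for(key: str) -> int:
    """Hypothetical partition function: stable hash of the key -> partition id."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def assign_partitions(nodes):
    """Naive placement: spread partitions round-robin over the sorted nodes."""
    ordered = sorted(nodes)
    return {p: ordered[p % len(ordered)] for p in range(NUM_PARTITIONS)}


class Cluster:
    """Toy model: state lives in partitions, partitions live on nodes."""

    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.owner = assign_partitions(self.nodes)  # partition id -> node
        self.state = defaultdict(dict)              # partition id -> {key: total}

    def send(self, key, amount):
        """Route an input to its partition, update state, return the owning node."""
        p = partition_for(key)
        bucket = self.state[p]
        bucket[key] = bucket.get(key, 0) + amount
        return self.owner[p]

    def add_node(self, node):
        self.nodes.add(node)
        self.owner = assign_partitions(self.nodes)  # rebalance

    def remove_node(self, node):
        self.nodes.discard(node)
        self.owner = assign_partitions(self.nodes)  # rebalance


cluster = Cluster(["node-a", "node-b"])
cluster.send("user-1", 5)
cluster.send("user-1", 7)   # same key, same partition, same state
cluster.add_node("node-c")  # partitions may move; the state moves with them
```

The point of the sketch is that a key never changes partition, so its state is always in one well-defined place; only the partition-to-node mapping changes on rebalance. A real system would presumably use a placement scheme that moves far fewer partitions than this round-robin version does.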

Two Hard Things in Computer Science

The original
There are only two hard things in computer science - cache invalidation, and naming things

The variant
There are only two hard things in computer science - cache invalidation, and naming things, and off-by-one errors

The distributed systems variant
There are only two hard things in distributed systems -
     2. Exactly-once delivery
     1. Guaranteed order of messages
     2. Exactly-once delivery

Humor thereof /via CommitStrip

Fallacies of Distributed Computing (Oldie but Goodie)

Arnon Rotem-Gal-Oz goes through the fallacies of distributed computing with panache. What? You've never heard of them? Well, they are:

1. The network is reliable.
2. Latency is zero.
3. Bandwidth is infinite.
4. The network is secure.
5. Topology doesn't change.
6. There is one administrator.
7. Transport cost is zero.
8. The network is homogeneous.

His summary is particularly poignant:

With almost 15 years since the fallacies were drafted and more than 40 years since we started building distributed systems – the characteristics and underlying problems of distributed systems remain pretty much the same. What is more alarming is that architects, designers and developers are still tempted to wave some of these problems off thinking technology solves everything. Remember that (successful) applications evolve and grow so even if things look Ok for a while if you don't pay attention to the issues covered by the fallacies they will rear their ugly head and bite you...