Debugging Distributed Services with Squash

“Debugging distributed services is Teh Sux0r” is a very common complaint from people new to the world of distributed computing.
Mind you, it’s true, but it also kind of misses the point. “Old School” monolithic single-node apps may support things like setting breakpoints, stepping through code, following (and changing) variables on the fly, and so forth, but once you start thinking in distributed terms (Consistency, Latency, Resilience, etc.), the very act of “debugging” can interfere with the thing being debugged: a service paused at a breakpoint stops answering its peers, and their timeouts and retries change the behavior you were trying to observe.
This, however, does not mean that there is no room for debugging. The more you focus on debugging the components, rather than the interactions between them, the more value there is in “old school” debugging, and that’s where Squash comes in (https://github.com/solo-io/squash).
From the docs:
“Squash brings the power of modern popular debuggers to developers of microservices apps that run on container orchestration platforms. Squash bridges between the orchestration platform (without changing it) and IDE. Users are free to choose which containers, pods, services or images they are interested in debugging, and are allowed to set breakpoints in their codes, follow values of their variables on the fly, step through the code while jumping between microservices, and change these values during run time.”
It’s quite cleverly done, and involves three components: a Server that tracks the breakpoints across the components and handles orchestration, a Client that runs as a daemon on each node alongside the components, and an IDE as the UI.
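To make that division of labour a bit more concrete, here is a toy model of the flow in Python. It is only a sketch: the names (`Server`, `NodeClient`, `request_attach`) and the container/PID data are invented for illustration and are not Squash’s actual API, and “attaching” a debugger is reduced to recording a string.

```python
from dataclasses import dataclass, field


@dataclass
class NodeClient:
    """Hypothetical stand-in for the daemon that runs on each node."""
    node: str
    # container name -> PID as seen from the node (made-up example data)
    containers: dict[str, int]
    attached: list[str] = field(default_factory=list)

    def attach(self, container: str, debugger: str) -> str:
        # A real client would start the debugger against this PID;
        # here we only record that it happened.
        pid = self.containers[container]
        session = f"{debugger} attached to pid {pid} on {self.node}"
        self.attached.append(session)
        return session


@dataclass
class Server:
    """Hypothetical stand-in for the central server that routes requests."""
    clients: list[NodeClient]

    def request_attach(self, container: str, debugger: str) -> str:
        # Find the node whose client can see the container, then delegate.
        for client in self.clients:
            if container in client.containers:
                return client.attach(container, debugger)
        raise LookupError(f"no node is running {container!r}")


# The IDE's role: talk to the server only, never to the nodes directly.
server = Server([
    NodeClient("node-a", {"cart": 4121}),
    NodeClient("node-b", {"checkout": 977}),
])
result = server.request_attach("checkout", "dlv")
print(result)
```

The point of the shape is that the IDE only ever needs to know about one endpoint, while the per-node daemons do the work that has to happen next to the process.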
Since the Client runs on each node and shares the host’s PID namespace, it can see every process (container!) on that node, which makes them available to be debugged. The Server then orchestrates the debugging across all these components, and it, in turn, is managed via the IDE.
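What “seeing every process” means in practice on Linux: a process in the host’s PID namespace finds one numbered directory per running process under `/proc`. The sketch below lists them. It is an illustration of the visibility idea, not Squash’s code; the function names are my own, and the `proc_root` parameter exists only so the functions can be pointed at a different directory.

```python
import os


def visible_pids(proc_root="/proc"):
    """Return the PIDs that have an entry under proc_root, sorted."""
    pids = []
    for name in os.listdir(proc_root):
        # Each running process shows up as a directory named after its PID.
        if name.isdigit() and os.path.isdir(os.path.join(proc_root, name)):
            pids.append(int(name))
    return sorted(pids)


def command_name(pid, proc_root="/proc"):
    """Read the short command name from <proc_root>/<pid>/comm.

    Returns None if the entry cannot be read, for example because the
    process exited between listing and reading.
    """
    try:
        with open(os.path.join(proc_root, str(pid), "comm")) as f:
            return f.read().strip()
    except OSError:
        return None


if __name__ == "__main__" and os.path.isdir("/proc"):
    # Linux only: print a few of the processes this process can see.
    for pid in visible_pids()[:5]:
        print(pid, command_name(pid))
```

Run inside an ordinary container, this lists only the container’s own processes; run from something that shares the host’s PID namespace, it lists everything on the node, which is exactly the vantage point a per-node debugging daemon needs.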
For now, the list of what’s supported is somewhat restricted: Kubernetes/Istio (Servers), gdb/dlv/Java/Node.js (Clients), and VS Code (UI). The usual suspects seem to be on the roadmap, though, including Mesos/Docker Swarm/Cloud Foundry, Eclipse/IntelliJ, and Python.
I’m quite looking forward to seeing where this all ends up!
