All The Post-Mortems

/via http://www.bugbash.net/comic/113.html

There has been a ton of stuff written about Post-Mortems (•), and I’m not going to bother recapping the field. That said, the things to keep in mind, and internalize, are:

1. This is about fault-tolerance. Your mantra during the process should be “How can we prevent this from happening again?”
2. S**t will happen. Failures aren’t “if”, they are “when”. So, keep asking yourself, “When it happens again, how do we recover?”
3. If you end up with human error as the root cause, you didn’t do it correctly. Mind you, “it” could be the post-mortem or the system, because if you rely on humans being infallible, well, you’ve got a surprise in your future (and an unpleasant one at that).
4. It is not about blame. Seriously. I know we all say that, but this really is not about blame. Go back and read -1- through -3- above, with emphasis this time.

The best way to get a handle on post-mor...