How to Write Good Ops Documentation

In my travels I’ve worn the hat of a DevOps guy a few times, and have written documentation, manuals, and guides countless times wearing an entire rainbow of hats. I’ve also had the good fortune to read documentation left by previous engineers at the larger organizations I’ve worked with. Both of these experiences have given me some useful insights into techniques for documentation, good and bad, and what can result from both.

Monkey Proofing

A long time ago, in a startup far, far away, I was building the CyberJocks LAN game center brand into what we hoped would become a national franchise. We only ended up with two locations, in New York and Texas, but that’s a story for another time (and requires beer). Building any brand requires consistency. Consistency in public image, trade dress, operations, and product, to start. Without this consistency, the customer will end up confused, and the brand identity is diluted and far less effective. We figured that comprehensive documentation on The Right Way To Do Thingsâ„¢ would yield consistency.

It did not.

We wrote mighty tomes of operational procedures, guidelines, and standards. We had every last step spelled out, to ensure that the most green member of our team would be able to pick up the three-ring binder from under the front counter, turn to page 314, and follow a step-by-step guide leading her to consistency nirvana. We called it monkey-proofing, the idea to write things so explicitly and clearly, that even a monkey could follow the directions. And we were very good at it. Our directions divined every detail, clarified every concern, and indicated every idiom required to run the business. Whether it was making a pepperoni pizza panini (16 slices of pepperoni), or creating a new user account (legal guardian signature, please), we had it all laid out.

But nobody read the damn thing!

Hundreds of pages of policy and procedure. Step by mind-numbing step. All laid out. Nobody had time to read that much documentation, and nobody was inclined to read it. Of course, any national franchise is going to have The Book just like this, but it is a poor solution.

Later in life, we learned that the best way to set things up is to carefully design a whole system to encourage the right outcomes. Make the easiest thing to do the correct thing, and leverage the natural human tendency of laziness instead of fight it. Engage the team, infuse them with the core beliefs, the driving principles, the why we do it, not the how or the what. Armed with the why, let people think for themselves, empower them to decide the course of action. Amazing things happen this way.

Technical Docs

So what does a franchise manual have to do with technical documentation? Quite a bit, it turned out. As I made my way through the docs of some larger companies, and worked with their operations and development teams, I started to discover a few patterns in the writing, and patterns in the behaviours of those consuming it.

The writing itself fell into several categories:

  • Non-existent
  • Hastily sketched and incomplete
  • Stale, obsolete, or inapplicable
  • Cookbook-style

These all fell short. The first three are obvious in their faults, and serve to illustrate that curation of a document repository is a complex and sizable task, and important to the organization. It’s best to have someone who does this and nothing else, looking after the docs.

The last one ended up being very common. It would tend to specify a problem-solution type of pattern, the same thing you seen in all those cookbook books for tech stuff. Run into a problem with Foo? Follow these steps! The steps would then specify specific commands, options, files, servers, etc., and if followed claimed that the problem with Foo would go away.

But there’s an insidious problem with cookbooks! I’m sure you could guess that they are at high risk of going stale, and quickly. They do precisely that. Beyond the information rotting into irrelevance, the deeper trouble is in the way the consuming engineer responds to a cookbook.

It discourages independent thought and understanding. By providing exact steps, the author believes they are helping out the next guy as much as possible. Simple! Paste this code in your terminal and you’re done! But by doing this, they are short-circuiting the engineer’s mind. They lose the chance to learn, to deepen their understanding of the system.

Understanding The Whole Trumps Specified Recipes

Understanding lets you cover the things a cookbook doesn’t. When situations change, when the docs go stale, when the system throws a fit, and the unexpected becomes the reality, the engineer needs to be able to work it out. They need the resources to think through things. They need a map that can point them in the right direction, clue them in to non-obvious aspects, and offer wisdom hard-earned from previous challenges that were solved.

So how to write better docs? Focus on communicating the fundamental concepts. The goal for better docs is to imbue understanding of the system as a whole. What are the high-level components? How does data flow through? Where are things located, and what bits interact with each other? If something odd happened in the past, document the situation and the solution, and include hard data. Armed with this foundational material, an engineer has a fighting chance when the system goes haywire. It’s ok for some procedures to be outlined, but they should be annotated to account for expected, and where possible, unexpected change. Where cookbooks may leave you wanting, true understanding is a recipe for success.

 

Aaron K.

Aaron Kondziela is a technologist and serial entrepreneur living in New York City.

 

Leave a Reply

Your email address will not be published. Required fields are marked *