Designing for a Cold Day in Hell

Good systems work even when things go wrong.

No plan survives contact with reality, and nowhere is that truer than in designing systems.

We have all suffered through the frustration of calling an automated customer “service” line only to find that our problem does not fit neatly into any of the predetermined options built into the system.

And no, of course it doesn’t include an option to talk to a human being. That would be getting dangerously close to providing service.

Have you ever gone to a store to buy something they had listed as in stock only to find there were none on the shelf? So you ask an employee thinking maybe it’s in the back. That’s when you hear the dreaded news that the inventory is probably off and you’re out of luck.

It can feel like many of the systems we all interact with only work as intended on a cold day in hell.

None of this happens because the system architects hate you. The truth is it’s impossible to design a system that accounts for every situation and variation. The key is to design systems that can recognize anomalies and adapt.

Take the “out of stock” example.

There is a system in place at all stores for keeping track of inventory levels. If you haven’t worked in inventory, you might reasonably think that it’s as simple as keeping track of additions and subtractions from your starting point of zero.

Unfortunately, the real-world system has a fly in the ointment: untracked changes in inventory, like theft or accidentally scanning the wrong thing. That becomes a problem when there is no feedback loop. If the system says an item is in stock but it has been stolen, and the store has no process to zero out the count, the count is guaranteed to stay wrong, since no one can buy the missing item and bring the number down.

If you’ve been selling the same thing for a while, mistakes will happen. With no way to fix them, the inaccuracy remains indefinitely.

Now, if you have worked in inventory, you know there are a lot of solutions people have designed to provide feedback to the system. Maybe it’s empowering people at the store to zero things out when they see an empty shelf. Maybe it’s periodically counting the whole store and resetting everything when you do. Maybe it’s analysis that highlights potential gaps because an item never sells below a certain threshold, or because a sale was recorded when we were supposedly out of stock.
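For readers who think in code, here is a rough sketch of what those feedback hooks might look like. Everything in it is hypothetical (the `InventoryLedger` class, its method names, the "widget" item); it is not any real store's system, just one way to picture the empty-shelf reset, the full count, and the suspicious-sale flag.

```python
from dataclasses import dataclass, field


@dataclass
class InventoryLedger:
    """Hypothetical ledger: system counts plus feedback hooks to correct them."""
    counts: dict = field(default_factory=dict)
    flags: list = field(default_factory=list)

    def receive(self, sku, qty):
        # Tracked addition to stock.
        self.counts[sku] = self.counts.get(sku, 0) + qty

    def sell(self, sku, qty=1):
        # Tracked subtraction from stock.
        on_hand = self.counts.get(sku, 0)
        if on_hand < qty:
            # Selling more than the system thinks we have means the count is off.
            self.flags.append(sku)
        self.counts[sku] = max(on_hand - qty, 0)

    def report_empty_shelf(self, sku):
        # Feedback from the floor: an employee sees an empty shelf.
        self.counts[sku] = 0

    def cycle_count(self, physical_counts):
        # Feedback from a physical count: reset to what was actually found.
        self.counts.update(physical_counts)


ledger = InventoryLedger()
ledger.receive("widget", 5)
ledger.sell("widget", 2)             # system now says 3
# Suppose the remaining 3 are stolen. Nothing records it, so the system still says 3.
ledger.report_empty_shelf("widget")  # the feedback loop corrects the count
print(ledger.counts["widget"])
```

Without `report_empty_shelf` or `cycle_count`, nothing in this sketch could ever bring the phantom count back down, which is the whole problem.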

Whatever the technique, working systems can adapt when information indicates there’s a gap.

Back to my example of the broken customer service support line. If your issue fits into the cases the architects designed for, getting a resolution can be easy enough. When the design team assumes they’ve allowed for every possible scenario, but you don’t fit the mold, you can be stuck in a loop trying to navigate the automated menu.

Next time I end up in one of these loops, I hope that the company is checking the data and incorporating the feedback to fix their processes. No one can think of everything, and so incorporating feedback is the key to improving system performance.

This post is brought to you by an experience with a client and basic probability. The way the system was originally designed, each step in the chain introduced a risk of failure. With a series of steps, each with a low chance of success, the odds multiply, and only on a cold day in hell would the whole chain actually work. Every other time, there are exceptions that need to be resolved. In the coming months, hopefully we will get to a system that naturally works.
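To make the probability concrete, here is a small sketch. The numbers are made up for illustration (they are not the client's), and it assumes each step succeeds or fails independently of the others, which real processes may not.

```python
import math


def chain_success(step_probs):
    """Chance that every step succeeds, assuming the steps are independent."""
    return math.prod(step_probs)


# Illustrative numbers only: ten steps that each work 90% of the time.
ten_decent_steps = chain_success([0.9] * 10)
print(f"{ten_decent_steps:.2f}")  # roughly 0.35

# Five steps that each work only half the time.
five_coin_flips = chain_success([0.5] * 5)
print(f"{five_coin_flips:.3f}")
```

Even steps that look reliable on their own add up to a chain that fails more often than it works, so the exceptions become the normal case.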

 
