Supporting a Product’s Cluster Headaches

Cluster headaches are problem reports that make everyone else in the organization ask “are we seeing that too?” and pile on. Alice suspects a memory leak and files a report. Bob tags his customers to it as well. They start discussing general performance questions in a chat or a meeting with a broad audience. Charlene through Zachariah pile on with more detail, some of which is relevant. A few days later, the root cause of Alice’s problem turns out to be an environment-specific misconfiguration, but by now you’ve got an executive asking when you’re going to fix the memory leaks.

As a product manager, you may find these incidents annoying. They distract and disturb the engineers and stir up trouble with the field. However, people are people and they’re going to pattern match. It’s not in your best interest to pour cold water on field people trying to help. Instead, look for ways to use these incidents to drive improvement.

  • The best disinfectant is sunlight. Open conversation about the troubleshooting process keeps everyone aware (Slack is great for this, ideally with a daily summary posted to the ticket, but some teams use long-running incident meetings instead). As it becomes clear that Alice’s ticket is not what everyone else thought, their willingness to pile on decreases. There is a limit to effective openness though: people can misinterpret comments and egos can get bruised. The time lag of email-based ticket comments is particularly bad for this. Someone may need to referee and keep the conversation productive.
  • Recognize that there is a problem. As a development team, perhaps you can look at this situation as a symptom of something to resolve. The X to this Y may be that there are legitimate concerns about resource utilization, and that troubleshooting those concerns is difficult. Can your team do something to improve that experience? Adding metrics and alerting on known bad states is almost always useful.
  • Where there’s smoke, there’s often fire. If cluster headaches keep popping up around the same component, that’s a signal of fear, uncertainty, and doubt. Increase enablement for that component, and listen to what the field says. If they don’t trust it, they won’t sell it, so you will not be successful until they understand and trust it.
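To make the “metrics and alerting on known bad states” suggestion a little more concrete, here is a minimal sketch of what such a check might look like for the memory example. Everything in it is hypothetical: the `Alert` type, the `check_memory` function, the threshold, and the window size are illustrative assumptions, not part of any particular product or monitoring tool. A real team would more likely express the same idea as rules in whatever monitoring system it already runs.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Alert:
    """Hypothetical alert record: a short name plus a human-readable detail."""
    name: str
    detail: str


def check_memory(
    samples_mb: Sequence[float],
    limit_mb: float,
    growth_window: int = 5,
) -> List[Alert]:
    """Flag two assumed 'known bad states' in a series of memory readings.

    samples_mb is ordered oldest to newest. The limit and the window size
    are illustrative values that each team would need to pick for itself.
    """
    alerts: List[Alert] = []
    if not samples_mb:
        return alerts

    # Bad state 1: the most recent reading is above the agreed limit.
    latest = samples_mb[-1]
    if latest > limit_mb:
        alerts.append(
            Alert("memory_over_limit",
                  f"{latest:.0f} MB exceeds limit of {limit_mb:.0f} MB")
        )

    # Bad state 2: memory is strictly increasing across the whole recent
    # window, which is worth a look even if the limit has not been hit yet.
    recent = samples_mb[-growth_window:]
    if len(recent) == growth_window and all(
        later > earlier for earlier, later in zip(recent, recent[1:])
    ):
        alerts.append(
            Alert("memory_growing",
                  f"strictly increasing across the last {growth_window} samples")
        )

    return alerts
```

The value for the field is less in the code than in the shared definition: once “memory leak” means a named alert with a threshold, Bob can check whether his customer actually tripped it before tagging them to Alice’s ticket.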