Managing risks in an Operations & Production Support environment

Managing risks in a production environment that is making money for customers is essential. However, because production support and operations management work is so unpredictable, the fear of the unknown increases drastically.

More often than not, for an operations analyst or a production support analyst, every day is a new day and every problem is a new problem. Hence the traditional risk management model, which suggests Identify -> Analyze -> Plan -> Track -> Control, does not fit well. The traditional model assumes that significant time is available to analyze and assess risks after you identify them. However, in production support or operations management, time is the one thing that is not available, and you are expected to react quickly.

I recently took part in a workshop to discuss risk management and how it could be done in an environment as volatile, unpredictable and unknown as production (or Live). In one of my previous experiences with awarding winners in an organization, it was observed that companies most often tend to reward the people who do better crisis management over the people who do better risk management. That often means risks tend to be acted upon only when they are realized and have become a bigger problem.

So, at the end of the discussion, it was more or less agreed that risk management in a production environment is all about behavioral change and mindset. Interesting? Read ahead!

If you consider the possible responses to a risk once you identify it, they can be broadly classified as follows:

  • Terminate – terminate the risk at the source and do not accept it
  • Transfer – transfer the risk to the concerned stakeholders and ensure it is mitigated
  • Treat – accept the risk immediately and start controlling it
  • Tolerate – accept the risk and do nothing!

If you revisit all the scenarios you have experienced in production support or operations, more often than not they demand urgent attention. A priority 1 ticket is waiting, or some incident is threatening to take the shape of a bigger problem. Now, in such situations, can you terminate the risk? Can you tolerate the risk, or can you transfer the risk and keep quiet? I would think not! In all such cases, you would have taken quick action either to resolve the risk yourself or to ensure that it is resolved at the earliest.

Now, coming back to my earlier statement about mindset and relating it to Treat, you would agree that to treat a risk in a production environment, which requires collaboration across multiple teams, you need to develop an ownership and risk-taking mindset. Someone needs to take ownership and drive the problem through to the solution, or mitigate the risk in full.

A few tips on mitigating production risks are as follows:

  • Keep customers informed of the bad news even more than the good news. Even if you do not believe it, customers are prepared to listen to worse news than you can possibly give them.
  • Expose your vulnerability without going into a victim mindset!


  1. Thank you for sharing your experience, Swapnil; I appreciate that. Can you share any of the real-time risks you saw in your experience, or refer to any such article?
