Planning and scheduling risk actions is the third step in the risk management process. The planning activities carried out by IT operations translate the prioritized risk list into action plans. Planning involves developing detailed strategies and actions for each of the top risks, prioritizing risk actions, and creating an integrated risk management plan. Scheduling integrates the tasks required to implement the risk action plans into day-to-day operations by assigning them to individuals or roles and actively tracking their status.
When developing plans for reducing risk exposure:
- Focus on high-exposure risks.
- Address the condition to reduce the probability.
- Look for root causes as opposed to symptoms.
- Address the consequences to minimize the impact.
- Determine the root cause, then look for similar situations in other areas that may arise from the same cause.
- Be aware of dependencies and interactions among risks.
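As a rough illustration of focusing on high-exposure risks, the sketch below ranks a small risk list. It assumes that exposure is calculated as probability multiplied by impact, which this section does not define. The `Risk` class, the `top_risks` function, and all of the figures are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: float  # estimated likelihood, 0.0 to 1.0 (illustrative values)
    impact: float       # estimated loss if the risk occurs, in one consistent unit

    @property
    def exposure(self) -> float:
        # Assumption: exposure = probability x impact (not defined in this section).
        return self.probability * self.impact


def top_risks(risks: list[Risk], count: int) -> list[Risk]:
    """Return the `count` risks with the highest exposure, highest first."""
    return sorted(risks, key=lambda r: r.exposure, reverse=True)[:count]


# Made-up figures for illustration only.
risks = [
    Risk("Basement server room floods", probability=0.05, impact=200_000),
    Risk("Server disk fills up", probability=0.40, impact=30_000),
    Risk("New server arrives late", probability=0.25, impact=80_000),
]

for risk in top_risks(risks, count=2):
    print(f"{risk.name}: exposure {risk.exposure:,.0f}")
```

A ranking like this only identifies where to plan first. It does not capture dependencies and interactions among risks, which still need to be reviewed by hand.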
When formulating risk action plans, IT operations should consider six approaches: research, acceptance, avoidance, transfer, mitigation, and contingency planning.
Much of the risk that is present in IT operations is related to the uncertainties surrounding incomplete information. Risks that are related to lack of knowledge may often be resolved or managed most effectively by learning more before proceeding.
For some risks, it is simply not feasible to intervene with effective preventative or corrective measures, so IT elects to accept the risk in order to realize the opportunity. Acceptance is not a "do-nothing" strategy: the plan should include a documented rationale for accepting the risk without developing mitigation or contingency plans.
It is prudent to continue monitoring such risks throughout the IT life cycle in case the probability, the impact, or the ability to perform preventative or contingency measures changes. For example, a data center may need to temporarily house servers in a basement room that is at risk of flooding. There may be no alternative location available given the heat and power requirements, and mitigation or risk transfer would be too expensive and cause too much disruption. In such a case, and given that flooding has never occurred before, it may be justifiable to accept the risk and monitor the situation.
Risk avoidance prevents IT from taking actions that increase exposure too much to justify the benefit. An example is upgrading a rarely used application on all 50,000 desktops of an enterprise. In most cases, the benefit does not justify the exposure, so IT avoids the risk by not upgrading the application.
Whereas the avoidance strategy eliminates a risk, the transference strategy often leaves the risk intact but shifts responsibility for it elsewhere. Examples where risk is transferred include:
- Using external consultants with greater expertise.
- Purchasing a solution instead of building it.
- Outsourcing services.
Risk transfer does not mean risk elimination. In general, a risk transfer strategy generates risks that still require proactive management, but it reduces the overall risk to an acceptable level. For example, a company with an e-commerce site might outsource credit verification to another company. The risks still exist, but they become the outsource partner's responsibility. If the outsource partner is better able to perform credit verification, then transferring the risks can also reduce them.
While the goal of risk avoidance is to evade activities or situations that carry unacceptable risk, risk mitigation planning involves acting ahead of time either to prevent a risk from occurring altogether or to reduce the impact or consequences if it does occur. For example, using redundant network connections to the Internet reduces the probability of losing access by eliminating the single point of failure.
It is vitally important to assign an owner to every mitigation plan, and it is helpful to define the plan's milestones in order to track its progress and its success metrics.
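One way to make ownership and milestones concrete is to record each mitigation plan as a small structured object. The sketch below is only an illustration: the `MitigationPlan` and `Milestone` classes, their field names, the dates, and the owner title are all hypothetical and not part of any tool described here.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Milestone:
    description: str
    due: date
    done: bool = False


@dataclass
class MitigationPlan:
    risk: str
    owner: str           # every plan needs a named owner
    success_metric: str  # how the plan's success will be judged
    milestones: list[Milestone] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject plans that have no owner assigned.
        if not self.owner.strip():
            raise ValueError("a mitigation plan must have an owner")

    def overdue(self, today: date) -> list[Milestone]:
        """Milestones that are not done and whose due date has passed."""
        return [m for m in self.milestones if not m.done and m.due < today]


# Hypothetical plan for the redundant-connection example.
plan = MitigationPlan(
    risk="Loss of Internet access",
    owner="Network operations lead",
    success_metric="No single point of failure on the Internet link",
    milestones=[
        Milestone("Order second connection", date(2024, 3, 1), done=True),
        Milestone("Configure failover routing", date(2024, 4, 1)),
    ],
)
print([m.description for m in plan.overdue(today=date(2024, 4, 15))])
```

Refusing to create a plan without an owner is a design choice made for this sketch. It simply enforces the ownership rule at the point where the plan is recorded.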
Not every risk has a reasonable and cost-effective mitigation strategy. In cases where a mitigation strategy is not available, it is essential to consider effective contingency planning instead.
Risk contingency planning involves creating one or more fallback plans that can be activated in case efforts to prevent the adverse event fail. Contingency plans are necessary for all risks, including those that have mitigation plans. They address what to do if the risk occurs and focus on the consequence and how to minimize its impact. Often IT can establish triggers for the contingency plan based on the type of risk or the type of impact that will be encountered.
Triggers are indicators that tell IT a condition is about to occur, or has occurred, and therefore it is time to put the contingency plan into effect. Ideally, the trigger becomes true before the consequences occur. It may help to think of triggers as warning lights that light up while there is still time to avoid danger. For example, if the condition is that the server runs out of hard disk space, the trigger might be that the server's disk has reached 80 percent of its capacity and is showing an upward trend.
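The disk-space trigger can be sketched as a simple check over recent usage readings. This is a minimal illustration, not a monitoring implementation: the function name `disk_trigger` is hypothetical, and "upward trend" is assumed here to mean that the latest reading is higher than the oldest reading in the window.

```python
def disk_trigger(usage_history: list[float], threshold: float = 0.80) -> bool:
    """Return True when the contingency plan should be activated.

    usage_history holds fractions of disk capacity in use (0.0 to 1.0),
    oldest reading first.

    Assumption: an "upward trend" means the latest reading is higher
    than the oldest reading in the window.
    """
    if len(usage_history) < 2:
        return False  # not enough readings to judge a trend
    latest = usage_history[-1]
    trending_up = latest > usage_history[0]
    return latest >= threshold and trending_up


print(disk_trigger([0.70, 0.75, 0.82]))  # above threshold and rising
print(disk_trigger([0.90, 0.85, 0.81]))  # above threshold but falling
```

Because the check fires at 80 percent rather than at 100 percent, the trigger becomes true while there is still time to act, which is the point of the warning-light analogy.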
In some cases, the triggers may be date-driven. For example, if the condition is that a newly ordered server might not arrive in time to support the launch of a mission-critical application, a trigger might be set for the latest date on which the server could safely arrive. If the server does not arrive in time and the trigger becomes true, one contingency plan might be to make use of an existing server from a less-critical service.
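A date-driven trigger reduces to a comparison against the latest safe date. The sketch below assumes the check runs once a day and that someone records whether the server has arrived. The function name `delivery_trigger` and the dates are hypothetical.

```python
from datetime import date


def delivery_trigger(today: date, latest_safe_arrival: date, has_arrived: bool) -> bool:
    """Return True once the latest safe arrival date has passed
    and the server still has not arrived."""
    return not has_arrived and today > latest_safe_arrival


deadline = date(2024, 6, 1)  # illustrative latest safe arrival date

print(delivery_trigger(date(2024, 5, 30), deadline, has_arrived=False))
print(delivery_trigger(date(2024, 6, 2), deadline, has_arrived=False))
```

When the function returns `True`, the contingency plan applies, for example reassigning an existing server from a less-critical service.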
Prioritizing the actions within each plan is a best practice that will be beneficial during the risk action planning step.
A mitigation plan might have several actions, and the sequence might affect the mitigation's success at reducing, avoiding, or transferring the risk, so it is important to prioritize the steps in this plan.
A contingency plan essentially describes how to shift away from normal operations when a condition occurs. Especially if the consequences disrupt many services, it may be valuable to bring some services back online first. Agree beforehand on the order in which to restore service, and decide how long each part can be offline.
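An agreed restoration order can be written down as data and sorted ahead of time. The sketch below assumes that services with the shortest agreed offline time are restored first. That rule, the `Service` class, the `restore_order` function, and the service names and hours are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    max_offline_hours: float  # agreed maximum time the service may be offline


def restore_order(services: list[Service]) -> list[Service]:
    """Order services for restoration.

    Assumption: the service that can tolerate the least downtime
    is brought back online first.
    """
    return sorted(services, key=lambda s: s.max_offline_hours)


# Illustrative services and limits only.
services = [
    Service("Intranet portal", max_offline_hours=24),
    Service("E-mail", max_offline_hours=4),
    Service("Order processing", max_offline_hours=1),
]

for position, service in enumerate(restore_order(services), start=1):
    print(f"{position}. {service.name} (offline limit: {service.max_offline_hours} h)")
```

In practice the order may also depend on technical dependencies between services, so a sorted list like this is a starting point for the agreement rather than the final plan.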