The Microsoft Operations Framework (MOF) Risk Management Discipline applies proven risk-management techniques to the challenges that operations staff members face every day. There are many models, frameworks, and processes for managing risks, all of which discuss planning for an uncertain future. However, the MOF Risk Management Discipline offers greater value than many others through its key principles, its consistent terminology, its structured and repeatable six-step process, and its recognition that risk management needs to be an integral part of the overall operations framework.

Key Principles

An essential aspect of successful IT operations involves managing the risks inherent in running the IT infrastructure. Within the MOF Risk Management Discipline, risk management is the process of identifying, analyzing, and addressing risks proactively. The goal of risk management is to clear the way for the positive impacts (opportunities) of an operations activity while minimizing the negative impacts (losses) associated with that activity. Effective processes for understanding and managing risks help ensure that sound trade-offs are made between risk and opportunity.

In order to implement the MOF Risk Management Discipline, you must have a solid understanding of the following key principles:

Risk Is Inherent in Operations

The only environment that has no risk is one whose future has no uncertainty: one where there is no question of whether or when a particular hard disk will fail, no question of whether, when, or by how much a Web site's usage will spike, and no question of whether or when illness will leave the service desk short-staffed. Such an environment does not exist.

By always keeping in mind that risk is inherent, operations professionals seek ways to continuously make the right trade-off decisions between risk and opportunity, and they avoid becoming so focused on minimizing risk that they exclude all else. IT staff need to stay agile and expect change.

Proactive Risk Management Is Most Effective

Proactive risk management is not achieved by simply reacting to problems. Operations staff should work to identify potential risks in advance and to develop strategies and plans to manage them. Plans should be developed to correct problems if they occur. Anticipating potential problems and having well-formed plans in place shortens the response time in a crisis and can limit or even reverse the damage caused by the occurrence of a problem.

Treat Risk Identification as Positive

Operations staff should always regard risk identification in a positive way; doing so will ensure that people contribute as much information as possible about the risks they face. A negative perception of risk causes people to feel reluctant to communicate risks they perceive. The environment should be such that individuals identifying risks can do so without fear of retribution for honest expression of tentative or controversial views. Managers should support and encourage development of a no-blame environment to foster open communications and promote successful risk management discussions.

Assess Risks Continuously

Many IT professionals misperceive risk management as a necessary but boring task to be carried out only at the beginning of a project or before the introduction of a new service. In fact, continuing changes in operations environments require process owners to regularly look for new operational risks, reassess the status of known risks, and reevaluate or update the plans to prevent or respond to problems associated with these risks.

The MOF Risk Management Discipline advocates the use of a structured process that identifies and analyzes risks. This process provides decision makers with information not only on the presence of risks, but also on the importance, or ranking, of those risks.
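
To illustrate what a ranking of risks might look like in practice, the following is a minimal sketch in Python. It assumes a common convention that the text above does not prescribe: each risk is scored by its exposure, calculated as estimated probability multiplied by estimated impact. The `Risk` class, the `rank_risks` function, the scales, and the sample values are all hypothetical and are not part of MOF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """Hypothetical risk record; field names and scales are assumptions."""
    name: str
    probability: float  # estimated likelihood, from 0.0 to 1.0
    impact: float       # estimated loss if the risk occurs, in arbitrary units

    @property
    def exposure(self) -> float:
        # Assumed scoring convention: exposure = probability x impact.
        return self.probability * self.impact


def rank_risks(risks):
    """Return the risks ordered from highest to lowest exposure."""
    return sorted(risks, key=lambda r: r.exposure, reverse=True)


# Illustrative values only.
risks = [
    Risk("Web site usage spike", probability=0.5, impact=10),
    Risk("Service desk short-staffed", probability=0.125, impact=16),
    Risk("Hard disk failure", probability=0.25, impact=40),
]

ranked = rank_risks(risks)
for risk in ranked:
    print(f"{risk.name}: exposure {risk.exposure}")
```

With these sample values, the hard disk failure ranks first even though it is not the most probable risk, because its estimated impact is much larger. Other scoring schemes are equally possible; the point is only that a ranking requires a consistent, comparable measure.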

Integrate Risk Management into Every Role and Function

At a high level, this principle means that every IT role shares the responsibility for managing risk and that every IT process is designed with risk management in mind. At a more concrete level, it means that every process owner:

  • Identifies potential sources of risk.
  • Assesses the probability of the risk occurring.
  • Plans to minimize the probability.
  • Understands the potential impact.
  • Plans to minimize the impact.
  • Identifies indicators that show the risk is imminent.
  • Plans how to react if the risk occurs.
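
As an illustration, the seven tasks in the list above could be tracked as fields of a single record for each risk, which makes it easy to see what a process owner has not yet addressed. The following Python sketch is one possible representation, not a structure defined by MOF; the `RiskPlan` class, its field names, and the `missing_items` helper are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class RiskPlan:
    """Hypothetical record with one field per process-owner task."""
    source: Optional[str] = None           # potential source of the risk
    probability: Optional[float] = None    # assessed probability of occurrence
    prevention_plan: Optional[str] = None  # plan to minimize the probability
    impact: Optional[str] = None           # understood potential impact
    mitigation_plan: Optional[str] = None  # plan to minimize the impact
    indicators: Optional[str] = None       # signs that the risk is imminent
    response_plan: Optional[str] = None    # how to react if the risk occurs


def missing_items(plan: RiskPlan) -> list:
    """List the fields that have not been filled in yet, in declaration order."""
    return [f.name for f in fields(plan) if getattr(plan, f.name) is None]


# Illustrative, partially completed plan for an e-mail service risk.
plan = RiskPlan(
    source="Mail server disk failure",
    probability=0.1,
    impact="E-mail outage",
)
print(missing_items(plan))
```

Here the record shows that the risk has been identified and assessed, but that the prevention plan, mitigation plan, indicators, and response plan are still outstanding.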

For example, one of the key roles within the MOF Team Model Service Role Cluster might be a service manager. The service manager with overall responsibility for the e-mail service performs all of these tasks to manage the risks that are most important for that service. Other people in that manager's extended staff may perform a subset of those tasks. Everyone will help identify new risks, but perhaps only one or two people will be responsible for estimating probability or making plans to minimize the consequences of a risk.

Shared Responsibility and Clear Accountability

Everyone in IT operations is responsible for actively participating in the risk management process. Process owners are assigned action items that specifically address risks within their service area, and each holds personal responsibility for completing and reporting on these tasks in the same way that they do for other action items related to day-to-day operations. Activities include risk identification within areas of personal expertise or responsibility and extend to include risk analysis, risk planning, and the running of risk control tasks.

Within the MOF Team Model, the Service Role Cluster holds final accountability for organizing risk management activities and ensuring that they are incorporated into the standard processes to meet service level agreements (SLAs).

Use Risk-Based Scheduling

Maintaining an environment often means making changes in a sequence. Where possible, process owners should make the riskiest changes first. The greatest risks tend to be those with the highest level of unknowns. Risk-based scheduling involves making sound trade-off decisions. It is important because addressing the riskiest changes early minimizes wasted effort and allows more reaction time for risk mitigation.
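
The ordering described above can be sketched in a few lines of Python. This example assumes that the level of unknowns for each change can be expressed as a simple count, which is a simplification for illustration; the `Change` type, the `schedule_riskiest_first` function, and the sample changes are hypothetical.

```python
from typing import NamedTuple


class Change(NamedTuple):
    """Hypothetical planned change; 'unknowns' is an assumed risk measure."""
    name: str
    unknowns: int  # number of open questions about the change


def schedule_riskiest_first(changes):
    """Order changes so that those with the most unknowns come first.

    sorted() is stable, so changes with equal counts keep their input order.
    """
    return sorted(changes, key=lambda c: c.unknowns, reverse=True)


# Illustrative values only.
changes = [
    Change("Apply routine patch", unknowns=1),
    Change("Migrate mail store", unknowns=5),
    Change("Replace load balancer", unknowns=3),
]

schedule = schedule_riskiest_first(changes)
for position, change in enumerate(schedule, start=1):
    print(position, change.name)
```

In practice, dependencies between changes and business constraints also affect the sequence, which is why the principle applies only where possible.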

Learn from All Experiences

MOF assumes that focusing on continuous improvement through learning will lead to greater success. Knowledge captured from one experience will decrease the uncertainty surrounding decision making when it is applied by others in later situations. MOF emphasizes the importance of organizational-level or enterprise-level learning from operational outcomes by incorporating a learning step into the risk management process.

Keep It Simple

Success requires a risk management process that people understand and use. This is a balancing act. If the process has too little structure, people may use it, but the outputs will not be useful; if it is too prescriptive, people probably won't use it at all.