Availability Management

Description/Summary

Information Availability Management deals with implementing and monitoring a predefined Availability level for the IT environment. It balances the level of insurance against a disaster with the available resources, requirements and costs. The starting point is an Availability analysis together with the business requirements; based on Service definitions, SLAs and costs, these define the customer's requirements for the Availability level. The internal, minimal Availability requirement is called the IT basic recovery level. Additional Availability requirements of the customer have to be defined on an individual basis. A further challenge is to balance the internally required Availability level with the service levels provided by the external and internal providers of IT services.

The basic constraint of Availability Management is to keep the required SLAs balanced with reasonable costs and effort.

Objectives

Ensure that agreed facilities, services and resources can be resumed within the agreed time scale and at the agreed level of availability.

Information Availability Management contributes to an integrated Service Management approach by executing the following activities:

  • Identifying and defining the internal and external Availability requirements
  • Performing a Business Impact Analysis and a Risk Analysis
  • Planning of Availability strategy and procedures
  • Managing the implementation of Availability actions and the organization
  • Evaluating the Availability procedures (preventive and recovery case) and Availability measures
  • Availability Reporting
  • Availability audit and improvement

Roles & Functions

Availability Management specific roles

Static Process Roles

Availability Process Owner
Availability Process Manager

The IT Availability Management Process is controlled by the Availability Manager. The Availability Manager can delegate tasks to specialized staff. If external staff is needed, approval by the person responsible for the budget is required.

Availability Management Process Staff

Staff in Availability Management perform mandatory tasks for the Availability Manager.

Dynamic Process Roles

Availability Auditor

The Availability Auditor verifies availability policies, processes and tools. The auditor role should alternate between an external and an internal resource to provide an independent and reliable audit.

Service Specific Roles

Roles depending on the affected service are found in the Service Description. The Service Description including the service specific roles is delivered from the Service Portfolio Management.

Service Expert/Service Specialist
Service Owner

Customer Specific Roles

Roles depending on the affected customer(s) are found in the Service Level Agreement. The Service Level Agreement for the customer specific roles is maintained by the Service Level Agreement Management.

Customer(s)
Customers of the affected Service with a valid SLA

Information artifacts

Availability Plan

A set of targets, measures, reports and actions to establish, maintain and audit a certain level of Availability for a set of IT services over a certain time frame (for example, one year).

Availability Design Record

  • Availability Design ID
  • Availability Design Requester
  • Availability Design Description
  • Availability Design Agent
  • Availability Design Owner
  • Service: The Service which has to be (re)designed
  • Service Availability
  • Service Reliability
  • Service Resilience
  • Service Maintainability
  • Service is a VBF?
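
The record fields listed above can be sketched as a simple data structure. This is only an illustration: the source does not define field types or units, so the Python names, the percent and hour units, and the sample values below are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AvailabilityDesignRecord:
    """Illustrative model of an Availability Design Record.

    Field names mirror the list above; types and units are assumptions.
    """
    design_id: str
    requester: str
    description: str
    agent: str
    owner: str
    service: str                                   # the Service to be (re)designed
    availability_pct: Optional[float] = None       # assumed unit: percent
    reliability_hours: Optional[float] = None      # assumed unit: hours without issues
    resilience: str = ""                           # assumed: free-text description
    maintainability_hours: Optional[float] = None  # assumed unit: hours to restore
    is_vbf: bool = False                           # "Service is a VBF?"


# Hypothetical sample values, for illustration only.
record = AvailabilityDesignRecord(
    design_id="AD-0001",
    requester="Service Design",
    description="Redesign of the mail service for higher availability",
    agent="Service Expert",
    owner="IT Availability Management Staff",
    service="Mail",
    availability_pct=99.5,
    is_vbf=True,
)
```

An Availability Transition Record could be modeled the same way, with the Configuration Items held as a list.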

Availability Transition Record

  • Availability Transition ID
  • Availability Transition Requester
  • Availability Transition Description
  • Availability Transition Agent
  • Availability Transition Owner
  • Service: Services affected by the Change
  • Configuration Items: CI affected by the Change

Key Concepts

Terms

Availability
Ability of an IT service to fulfill a defined function within defined functional parameters at a defined service level
Reliability
Parameter that defines how long a service operates without issues
Resilience
Ability of a CI/Service to keep operating even in case of partial failure; this ability improves both Availability and Reliability
Maintainability
Degree to which an IT service component can be brought from status "failed" back to status "up and running" in accordance with the defined SLA
Serviceability
Agreements with further service delivery partners, such as Facility Management or other external IT service providers
Vital Business Functions (VBF)
VBFs are critical, business-related elements within the scope of business processes that are supported by IT.
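
The terms above are qualitative. A minimal sketch of how Availability, Reliability and Maintainability are commonly quantified in ITIL-style reporting follows. The source does not prescribe these formulas; the function names, the use of hours as unit, and the sample figures are assumptions for illustration.

```python
def availability_percent(agreed_service_hours: float, downtime_hours: float) -> float:
    """Availability as the share of agreed service time the service was usable."""
    if agreed_service_hours <= 0:
        raise ValueError("agreed_service_hours must be positive")
    return (agreed_service_hours - downtime_hours) / agreed_service_hours * 100.0


def mean_time_between_failures(agreed_service_hours: float,
                               downtime_hours: float,
                               failures: int) -> float:
    """Reliability indicator: average uptime per failure."""
    if failures <= 0:
        raise ValueError("failures must be positive")
    return (agreed_service_hours - downtime_hours) / failures


def mean_time_to_restore(downtime_hours: float, failures: int) -> float:
    """Maintainability indicator: average downtime per failure."""
    if failures <= 0:
        raise ValueError("failures must be positive")
    return downtime_hours / failures


# Hypothetical month: 720 agreed service hours, 3 failures, 6 hours of downtime.
availability = availability_percent(720, 6)
mtbf = mean_time_between_failures(720, 6, 3)
mttr = mean_time_to_restore(6, 3)
print(f"{availability:.2f}% available, MTBF {mtbf} h, MTTR {mttr} h")
```

Whether downtime outside the agreed service time counts against the SLA is a contractual question and is deliberately left out of this sketch.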

Availability Controls

Availability Data Set for Design

Responsible persons
Defined from the group of staff responsible for Availability
RFC
Interface to Change Management
Status
Comment
Service out of the service catalogue
Interface to Service Level Management. If the service is known, all further parameters, such as the person responsible for the service and the expert group, are known
Availability Requirements
Description of the level of availability for the defined service

Life Cycle of Data Set for Design

Availability Data Set for Implementation

GUID
Responsible persons
Defined from the group of staff responsible for Availability
RFC
Interface to Change Management
Status
Comment

Life Cycle of Data Set for Implementation

Availability Data Set for Validation

Life Cycle of Data Set for Validation

Availability Data Set for Maintenance

Life Cycle of Data Set for Maintenance

Process

Critical Success Factors

Critical Success Factors (CSF) define a limited number of factors influencing the success of a process. For Availability Management, the following factors can be defined as CSFs:

  • Clear business requirements for availability
  • Support from the organization and by the customer
  • Effective Configuration Management Process
  • Close collaboration with Change Management Process
  • Awareness, trainings and tests
  • Acknowledgment of IT-strategy
  • Clear definition of key terms like "time = time to react"
  • Monitoring from customer’s point of view
  • Collaboration with other Delivery Processes
  • Knowledge of technology market

The Availability Manager has to take the CSFs into account and to define and implement measures that ensure the success of the process.

Availability Planning

High Level Process Flow Chart

This chart illustrates the Availability Management Planning process and its activities.

Availability_management_images_planning

Performance Indicators (KPI)

  • Number of changes to Availability Plan
  • Number of Availability Problems

Process Trigger

Event Trigger
Time Trigger

The process is triggered periodically.

Process Specific Rules

  • Each Change to Availability Plan needs to be documented
  • Each Change to Availability Plan needs to be communicated

Process Activities

Availability Plan Definition

Availability Plan Definition deals with the definition of

  • Basic availability plan

Activity Specific Rules

  • The Availability Manager is set to the person who is responsible for the availability process
  • The Availability Management Team is set to the persons who are involved in availability assurance activities
  • The Availability Plan is defined:
    • add new rules
    • modify existing rules
    • delete expired rules

Availability Plan Monitoring

Availability Plan Monitoring is responsible for monitoring and updating the availability planning.

Activity Specific Rules

  • Communicate availability plan
  • Assure that staff understands and follows the rules by auditing and teaching staff
  • Trigger updates of the Availability Plan if necessary

Availability Design Assistance

The Design sub-process in Availability Management is responsible for the initial planning, or the planning of optimizations, of the Availability process. If a change proposal on the Availability process is classified, the change needs to be planned as well. This is performed by an expert from the responsible expert group in coordination with a member of the IT Availability Staff. This sub-process is triggered by Service Design. Actions aimed at improving Availability need to be documented in detail in an RFC document. After the planning of the Availability measures is finished, the status needs to be changed to "plan".

The process owner is the Availability Manager. The process agent is an expert assigned by "Availability Staff".

High Level Process Flow Chart

This chart illustrates the Availability Management Design process and its activities.

Availability_management_images_design

Performance Indicators (KPI)

  • Duration of reaction in case of major availability issues

Process Trigger

Event Trigger

The process is initiated by Service Design.

Time Trigger

Process Specific Rules

  • Each Availability Design Assistance request must be recorded
  • Availability Design Assistance Agent has to document the request and the result
  • Availability Design Assistance Owner has to control the agent
  • Availability Design Assistance requester has to be informed on design status

Process Activities

Design of Availability Part in Service Design Package

Within this activity the Availability section of the service design package is designed.

Activity Specific Rules

  • Set the Availability Design Owner to a member of the IT Availability Management Staff
  • Set the Availability Design Agent to a member of the Service Expert or Specialist Group
  • Design Availability according to the Availability Plan
  • Coordinate the Availability Design package with the activity "Availability Design" and other relevant Service Designers
  • Document in the Service Design package and fill out the Availability Design Record
  • Go to control activity "designed"

Approval of Availability Design Package

With this activity the Availability Manager decides on the service design package. The decision is based on the cost expectations and the Availability Definition. In general, three results of this activity are possible:

  • The Availability Design Package is finally rejected
  • The Availability Design Package is temporarily refused and returned to the Availability expert for improvement or optimization of the Feasibility Study
  • The Availability Design Package is accepted

Activity Specific Rules

  • Set the Availability Design Agent to Availability Manager
  • Approve the Availability Design Package and Documentation in the Availability Design Record
  • On approval, go to activity "approved - successful"
  • Otherwise:
    • Go to control activity "new" for a re-design of the Availability Design Package, OR
    • Go to control activity "approved - unsuccessful" for final abortion.
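
The three possible results of the approval can be sketched as a small mapping from decision to the next control activity. The control activity names are taken from the rules above; the `Decision` enum and the function name are hypothetical and only serve the illustration.

```python
from enum import Enum


class Decision(Enum):
    """Possible outcomes of the approval activity (names are assumptions)."""
    ACCEPTED = "accepted"
    TEMPORARILY_REFUSED = "temporarily refused"
    FINALLY_REJECTED = "finally rejected"


# Control activity names as they appear in the activity specific rules.
NEXT_CONTROL_ACTIVITY = {
    Decision.ACCEPTED: "approved - successful",
    Decision.TEMPORARILY_REFUSED: "new",                  # back to re-design
    Decision.FINALLY_REJECTED: "approved - unsuccessful",  # final abortion
}


def next_control_activity(decision: Decision) -> str:
    """Return the control activity that follows the Availability Manager's decision."""
    return NEXT_CONTROL_ACTIVITY[decision]


print(next_control_activity(Decision.TEMPORARILY_REFUSED))
```

Keeping the mapping in one table makes it easy to check that every decision leads to exactly one follow-up activity.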

Availability Transition Assistance

This activity, in cooperation with Change Management, supports the implementation and testing of Availability improvements by designing and testing them, implementing them, and then testing the implementation again. These actions are headed by Change Management.

If an Availability Design Package is authorized and approved by Change Management, all actions, functional descriptions and implementation procedures described in the improvement proposal need to be detailed, tested and approved in cooperation with Change Management. Afterwards, the implementation should be assisted in order to provide help in case of an emergency or implementation issues.

The final Post Implementation Review (PIR), conducted together with Change Management, also includes testing.

High Level Process Flow Chart

This chart illustrates the Availability Management Transition process and its activities.

Availability_management_images_transition

Performance Indicators (KPI)

  • Availability
  • Reliability
  • Maintainability
  • Serviceability
  • Total number of RFCs to compensate for a poor Availability Plan

per priority, per service, per CI, per user, per customer, per location, per employee, …
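
The breakdown of the RFC count by dimension can be sketched as a simple grouping. The RFC records, their field names and their values below are hypothetical; in practice the data would come from the Change Management tooling.

```python
from collections import Counter

# Hypothetical RFCs raised to compensate for a poor Availability Plan.
rfcs = [
    {"id": "RFC-1", "service": "Mail", "priority": "high", "location": "Berlin"},
    {"id": "RFC-2", "service": "Mail", "priority": "low", "location": "Vienna"},
    {"id": "RFC-3", "service": "ERP", "priority": "high", "location": "Berlin"},
]


def count_rfcs_by(records, dimension):
    """Count RFCs per value of one dimension (service, priority, location, ...)."""
    return Counter(record[dimension] for record in records)


total_rfcs = len(rfcs)
per_service = count_rfcs_by(rfcs, "service")
per_priority = count_rfcs_by(rfcs, "priority")

print(total_rfcs, dict(per_service), dict(per_priority))
```

The same helper works for any further dimension (CI, user, customer, employee) as long as the records carry that field.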

Process Trigger

Event Trigger

Process to be started by Change Management

Time Trigger

Process Specific Rules

  • Each Availability Transition request must be recorded
  • Availability Transition Assistance Agent has to document the request and the result
  • Availability Transition Assistance Owner has to control the agent
  • Availability Transition Assistance Agent has to coordinate his work with other transition agents
  • Availability Transition Assistance requester has to be informed on transition status

Process Activities

Creation Risk Analysis and Feasibility Study

If an availability transition assistance is requested by Change Management, the following two artifacts need to be defined:

  • Feasibility Study
  • Risk Analysis

Following aspects need to be addressed within a Feasibility Study:

  • Feasibility of proposal
  • Risk of implementation
  • Risk of neglecting proposal
  • Costs

A Feasibility Study is based on high level planning and should not address detailed planning because of the possibility that the proposed change will not be accepted. Detailed planning of the proposal is part of the Transition sub-process.

A Feasibility Study is provided by the expert team of the addressed service. Additional requirements provided by other assistance processes, such as Financial, Security or Capacity Management, may also need to be taken into account.

After the responsible expert finishes their contribution to the planning of the availability actions, the status needs to be set to planned design package.

Activity Specific Rules

  • Set the Availability Transition Owner to a member of the IT Availability Management Staff
  • Set the Availability Transition Agent to a member of the Service Expert or Specialist group
  • Create an Availability Feasibility Study and Risk Analysis according to the Change Requirements and the Availability Plan
  • Coordinate the creation of the Availability Feasibility Study and Risk Analysis package with others
  • Document Availability Feasibility Study and Risk Analysis in the Availability Transition Record
  • Go to activity "created"

Build - Test - Implement - Assistance

If an Availability change is approved for implementation, Change Management assigns Availability Management, or more precisely the sub-process Build - Test - Implement - Assistance, the following tasks:

  • Detailed definition of implementation instructions for the Availability improvement,
  • Detailed definition of test procedures and documents for the Availability improvement,
  • Support of implementation,
  • Testing of implementation and
  • Approval of the implementation

For all activities above detailed documents are necessary.

The documentation of the improvement proposal needs to describe the order and timeline of actions. The testing documentation has to address the test design and assure the effectiveness of the test.

Implementation activities are carried out and headed by Change Management; Availability Management only assists and supports with regard to the Availability aspects and functions. Change Management can assign Availability Management as agent for some implementation steps.

The test is split into two main test areas:

  • 1. Test of the implementation - "Does the provided document describe the right implementation actions in the right order?"

If this test is positive, then

  • 2. Test of the functionality of the live system - "Does the system perform the functions defined in the Design sub-process?" This test is performed based on the defined testing documentation. It addresses the test design and the test effect.

If both test areas (see above) are positive, the data set is set to status enabled. Change Management needs to be informed of the positive test result, and the data set status needs to be changed to waiting. In case of negative test results, the testing can be aborted (the test status needs to be set to canceled) or transferred back to the activity consolidation.
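
The two-stage test and the resulting status changes can be sketched as follows. The status names enabled, waiting, canceled and consolidation come from the text; the function name, its parameters and the idea of returning the statuses as a list are assumptions made for this illustration.

```python
from typing import Callable, List


def run_transition_tests(test_implementation: Callable[[], bool],
                         test_live_functionality: Callable[[], bool],
                         retry_on_failure: bool) -> List[str]:
    """Return the statuses the data set passes through after testing.

    The live-system test only runs if the implementation test was positive.
    """
    # `and` short-circuits, so the second test is skipped when the first fails.
    if test_implementation() and test_live_functionality():
        # Both areas positive: enabled, then waiting once
        # Change Management has been informed of the result.
        return ["enabled", "waiting"]
    # Negative result: either go back to consolidation or abort the test.
    return ["consolidation"] if retry_on_failure else ["canceled"]


# Hypothetical usage with stubbed test callables.
print(run_transition_tests(lambda: True, lambda: True, retry_on_failure=False))
print(run_transition_tests(lambda: True, lambda: False, retry_on_failure=True))
```

Who decides between aborting and returning to consolidation is not stated in the text, so the sketch leaves that choice to the caller via `retry_on_failure`.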

This activity is performed by expert staff assigned by the Availability Staff. The activity owner is the Availability Manager.

Activity Specific Rules

  • Support creation of implementation plan including fallback plan
  • Support creation of test plan
  • Support test of implementation including fallback plan
  • Support implementation
  • Document test and implementation results
  • Coordinate work with Change Management
  • Go to activity "assisted"

Evaluation and Closure Assistance

In coordination with the Post Implementation Review of the Change, Availability Management helps to test the implementation from the availability point of view. If a test fails, Change Management has to decide whether the fallback plan has to be executed or whether the implementation can be accepted despite the issues found in testing.

Activity Specific Rules

  • Support post implementation review and test
  • Consult Change Management on Fallback Execution
  • Support fallback implementation if necessary
  • Document activities
  • Go to activity "assisted - closed"