
Simulation Exercise

Choices and Categories of Tests & Exercises

Abstract

In testing and exercising BC plans, the terminology for the various types of tests and methodologies often poses a challenge for BCM professionals who are about to start their testing and exercising programmes. This paper summarizes the main types of tests. It is not intended to be a comprehensive list, but to provide a good foundation in the types of tests that a BCM professional is likely to embark upon.

1. Introduction

Most BCM professionals find it challenging to identify the types of tests and exercises to be conducted for their organization. The list is usually long, and there are many variations within the discipline.

1.1 Categorization

There are several ways of categorizing the types of tests. One approach is based on the actions to be taken. Examples include desk check, simulation, procedure verification, communications and IT environment walkthrough. Another approach is to list all the possible types of tests and then select those that are useful for testing the required outcome, based on the readiness level needed by the organization. These include component, integrated, simulation and live tests.

The approach in this paper is to describe the techniques or methodology, as the content and objective of the plan can be developed separately. Additional terminology relating to testing can be found at www.BCMPedia.org.

2. Component Tests

The following is a sample of the types of tests that could be conducted as part of a component test for a typical business continuity plan.

2.1 Confirm Availability / Version of Plan

This test is designed to check that key staff in both business and support recovery teams can gain access to a hard-copy of their continuity plan at any time. As part of your maintenance program, you should include procedures to “visit” your plan at pre-defined intervals, to update personnel details and to ensure that recovery measures remain relevant.

2.2 Retrieve Vital Hard Copy Records from Offsite Locations

As a good practice, the hard-copy records of documents critical to business operations should be kept in an offsite location. This Component Test confirms that such records are indeed available offsite, are sufficiently up-to-date to be of use in a crisis and can be promptly retrieved within the expected time frame.  These documents may include copies of contracts, agreements, insurance policies, floor plans, title deeds as well as any special reference manuals required to conduct business operations in a crisis situation.

2.3 Contact Staff, Suppliers & Others

One of the most straightforward but important tests is the telephone notification procedure. This is typically carried out on three main groups of people:

  • Staff
  • Suppliers or vendors, who provide you goods and services
  • Other contacts, including customers or others to whom you provide goods and services

Whilst the principles of these tests are similar, you should consider differences in the relationships between your organization and the groups of people and tailor the approach of testing for each group accordingly.  The benefits of carrying out these tests are:

  • Establish that the contact telephone numbers in your plan are correct and up-to-date.
  • Confirm that the resources you require in a crisis, both human and otherwise e.g. equipment and supplies, can be obtained when and where needed.
  • Ensure that the targeted degree of recovery matches the expectations of your internal or external customers.

It is highly likely that you will need to modify your plans following each test. These tests play a very important role in the maintenance program and their value should not be under-estimated.

2.4 Check Lead Times for Critical Equipment

This is to establish the lead-times for the delivery of critical equipment. This differs from testing suppliers of services as it relates to availability of specific items rather than the ability to contact personnel. This is a simple test, which applies to both business and support units.

2.5 Confirm Alternate Site Readiness

This test is used to confirm the readiness of the personnel at the alternate site to receive people from a business unit or building who are displaced due to an incident.  The procedure will vary depending on location and on whether the recovery will be at a commercially operated alternate site or at another organization’s building. In any case, a Service Level Agreement (SLA) should be in place confirming the agreed relocation arrangements. This document will state the expected time frame for the relocation, which all relevant parties (officials from the alternate site as well as the Central Support Business Units of the organization carrying out the recovery) must acknowledge, confirming that they find the time frame acceptable, reasonable and attainable.  Given that alternate site recovery contracts are usually held centrally and that only certain staff can invoke such plans, it is assumed, for the purpose of this test, that recovery will be at a site controlled by the organization.

2.6 Test Staff Members’ Knowledge of Business Unit Plan

The person conducting the test visits the business unit BCM coordinator and staff members of a selected business unit and tests how much they know about the procedures without having access to the plan. This will confirm the business unit staff members’ knowledge of the plan and their potential ability to ensure the recovery of the business unit if, for whatever reason, a copy of the plan is not initially available.

2.7 Spot Check of Vital Records

This test involves the business unit BCM coordinator and staff members of a selected business unit visiting the offsite location where the vital records are kept. While at the offsite location, the team is required to review the inventory of vital records using a checklist.

2.8 Recall Offsite Storage

This relates mainly to support business units and should not be confused with the retrieval of vital hard-copy records, which is covered separately.
The list of support business units at a medium to large operation would normally include the following:
  • Premises/ Facilities
  • Information Technology
  • Telecommunication/ Networks
  • Security
  • Public Relations
  • Human Resources
  • Administration/ Correspondence
  • Legal/Compliance
  • Financial Control
  • Transport

In order to meet the everyday needs during a disaster, these business units are likely to have spare items such as furniture, equipment, cables, server tapes, back-up disks, stored offsite. In some cases they will be stored in another organization’s building premises and in others, an external storage contractor may be used.

The purpose of this test is to confirm that the business units can access and/or arrange delivery of the required items within the expected time frame stated in the plan.

2.9 Check that Important Lists are Still Current

This ensures that important lists are up-to-date. Each business continuity plan contains a number of lists, e.g. list of key items or contacts required in a crisis. The information stated in the lists can be used to contain the impact and/or limit the damage to the business.  The following are key lists in a typical business continuity plan:

2.9.1 Personnel Contact List

In addition to a Telephone Call Tree chart, business unit coordinators should have an updated Personnel Contact List.

2.9.2 Initial Action by Business Units

Important business units should each have a brief list stating the tasks which key team members need to undertake in the opening stages of a disaster scenario. These members should have this list with them at all times.

2.9.3 Inventory of Resources

This lists all key resources. Regular checks should be done to confirm they accurately reflect the needs of each business unit.

2.9.4 PC Software Versions

The lists of IT hardware and software (showing the version) should be kept up-to-date. “Systems” for unique software should be regularly tested and not just stored in an IT business unit.

2.9.5 “Grab” List

This is a list of small items, identified as being useful, which staff will try to take with them as they evacuate.

2.9.6 Priority Salvage List

This identifies items a business unit BCM coordinator might ask someone to hand-carry from the office, if that person was allowed back into a building for, say, 30 minutes.

2.9.7 Essential Forms / Stationery

If a business unit has any special stationery or printed forms without which the business cannot operate, a small supply of these should be stored offsite and the location recorded in the plan.

The tests for confirming the contents of these key lists are simple and quick to conduct.
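As an illustration only, a currency check on the key lists can be sketched as a comparison of each list's last review date against a review interval. The 90-day interval and the review dates below are hypothetical values chosen for the example; the plan itself should define the actual intervals.

```python
from datetime import date, timedelta

# Hypothetical review interval; the plan should define the real one.
REVIEW_INTERVAL = timedelta(days=90)

# Hypothetical last-review dates for some of the key lists in section 2.9.
LAST_REVIEWED = {
    "Personnel Contact List": date(2008, 1, 15),
    "Initial Action by Business Units": date(2007, 6, 1),
    "Inventory of Resources": date(2007, 12, 20),
    "PC Software Versions": date(2007, 11, 1),
}


def stale_lists(last_reviewed, today, interval=REVIEW_INTERVAL):
    """Return the names of lists whose last review is older than the interval."""
    return sorted(
        name
        for name, reviewed in last_reviewed.items()
        if today - reviewed > interval
    )


if __name__ == "__main__":
    for name in stale_lists(LAST_REVIEWED, today=date(2008, 3, 1)):
        print(f"Overdue for review: {name}")
```

A check of this kind only shows that a list was reviewed recently. It does not replace verifying the contents of the list itself.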

3. Notification Call Tree Test

Even though this is a Component Test, the critical importance of this test cannot be ignored. In a Telephone Notification Call Tree Test for recovery teams, the recovery team members will notify designated staff members as documented in the plan. This personnel communication network forms one of the most efficient and effective means of communicating any news or instructions to all relevant staff, and should include the entire organization.
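A call tree can be pictured as a simple mapping from each caller to the people they must notify. The sketch below is illustrative only: the role names and the staff roster are hypothetical, and the breadth-first order is one assumed way of walking the tree. It lists the calls in the order they would be made and reports any staff member the documented tree would never reach.

```python
from collections import deque

# Hypothetical call tree: each caller maps to the people they must notify.
CALL_TREE = {
    "recovery_lead": ["ops_manager", "it_manager"],
    "ops_manager": ["ops_staff_1", "ops_staff_2"],
    "it_manager": ["it_staff_1"],
}

# Hypothetical roster of everyone who should be reachable through the tree.
ALL_STAFF = {
    "recovery_lead", "ops_manager", "it_manager",
    "ops_staff_1", "ops_staff_2", "it_staff_1", "hr_staff_1",
}


def walk_call_tree(tree, root):
    """Return (caller, callee) pairs in breadth-first calling order."""
    calls = []
    reached = {root}
    queue = deque([root])
    while queue:
        caller = queue.popleft()
        for callee in tree.get(caller, []):
            if callee in reached:
                continue  # already notified; avoids duplicate calls and loops
            reached.add(callee)
            calls.append((caller, callee))
            queue.append(callee)
    return calls


def unreached_staff(tree, root, all_staff):
    """Return staff who would never receive a call if the tree is followed."""
    reached = {root} | {callee for _, callee in walk_call_tree(tree, root)}
    return sorted(set(all_staff) - reached)


if __name__ == "__main__":
    for caller, callee in walk_call_tree(CALL_TREE, "recovery_lead"):
        print(f"{caller} calls {callee}")
    print("Not covered:", unreached_staff(CALL_TREE, "recovery_lead", ALL_STAFF))
```

In this made-up roster, one staff member is missing from the tree, which is the kind of gap a notification test is meant to expose before a real incident.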

4. Walk-through Test

In a Walk-through, recovery team members meet to verbally walk through the steps of each component of the business continuity process as documented in the business continuity plan.

5. Integrated Test

An Integrated Test involves integrating any number of the components in the order that they would occur during actual recovery operations. An integrated test builds on the test successes and the increased employee awareness generated during component testing. The organization BCM coordinator and business unit BCM coordinators should realize that the increased complexity, coordination of multiple teams, involvement of other interested personnel and budget considerations will limit the frequency of integrated testing.

6. Incident Simulation Test

This involves the development and use of pre-written test scenarios or test scripts for disaster events. The scenarios tell the team members how to react to such disasters and give organizations a baseline from which to start their recovery plans.

7. Partial Simulation Test

This is similar to the Full Simulation Test (below), except that only several business units are involved. However, for these business units, the testing will be to the fullest detail and scope.

8. Full Simulation Test

The Full Simulation test is the ultimate BC plan test, which activates the total BC plan. It is also called a Full Interruption test or Mock Disaster test. The purpose is to simultaneously test as many components as possible in the organization’s recovery structure. The test is likely to be costly and could disrupt normal operations, and therefore should be approached with caution. Adequate time must be scheduled for the testing.

To successfully test recovery capability, the tests must evaluate the recovery procedures and documentation, not the inherent knowledge of the staff.
Each test must have a set of primary and secondary objectives to define the direction of the test and to measure its success. For example, the primary objective may be to evaluate success or failure, and the secondary objective may be to test whether extra time is available.

9. Live Test

Finally, this is the ultimate of all tests. It is perhaps the most challenging test that any BCM professional would undertake, as this is where anything that can go wrong will go wrong. To make matters worse, the errors of this test will be seen live across the organization, and especially by senior management.

10. Conclusion

Deciding on the types of test to be conducted can initially be an uphill task for many BCM professionals. There is a pressing expectation from management to test the BC plan to confirm its state of readiness. Hence, identifying and implementing the correct series of tests becomes a key necessity for any organization that has a BC plan.

11. References

[1] BCMpedia (2008). Definition of Business Continuity and Disaster Recovery Terminologies, http://www.bcmpedia.org
[2] Goh, Moh Heng (2008). Managing Your Business Continuity Planning Project, 2nd Edition, 166 pages.
[3] Goh, Moh Heng (2008). Conducting Your Impact Analysis for Business Continuity Planning, 130 pages.
[4] Goh, Moh Heng (2008). Analyzing & Reviewing the Risk for Business Continuity Planning, 162 pages.
[5] Goh, Moh Heng (2005). Developing Recovery Strategy for Your Business Continuity Plan, 104 pages.
[6] Goh, Moh Heng (2004). Implementing Your Business Continuity Plan, 104 pages.
[7] Goh, Moh Heng (2006). Testing & Exercising Your Business Continuity Plan, 2nd Edition, 160 pages.
[8] Goh, Moh Heng (2007). Managing & Sustaining Your Business Continuity Management Programme, 190 pages.
[9] Goh, Moh Heng (2006). Developing Your Pandemic Influenza Business Continuity Plan, 128 pages.

About the Author

Dr Goh Moh Heng is the President of BCM Institute (www.bcm-institute.org) and the Managing Director of GMH Pte Ltd (www.gmhasia.com), an Asia-Pacific BCM consultancy firm. During the last 20 years, Dr Goh has conducted several hundred tests and exercises for clients throughout the world. These range from simple notification tests and walkthrough tests to large simulation and live tests. Some tests worth mentioning include enterprise-wide crisis management simulations, full simulation tests and unannounced live tests for many international organizations. He holds a PhD and has also been awarded the highest level of certification from the three major business continuity management institutes.  He is the author of nine business continuity management books.  Dr Goh was instrumental in creating the first Wikipedia for BC, www.BCMpedia.org. He can be contacted at moh_heng@bcm-institute.org or moh_heng@gmhasia.com.

Pandemic Flu Exercise

Pandemic Flu Business Continuity Planning for Organizations

“Many organizations read about the possible pandemic flu, but cannot completely digest the issues and preparations needed to sustain its mission critical operations and services.”

Abstract

This paper discusses the pertinent aspects of pandemic flu business continuity (BC) planning. In the last two years, there has been an increase in organizations preparing themselves for a possible influenza (flu) pandemic outbreak. The key challenge in the preparatory process is the synchronization of the business continuity plan and procedures with the World Health Organization’s and the local health ministry’s pandemic alert phases. Several probable outbreak situations, and several more possible variations in responses to them, make the planning process one of the most complicated challenges facing business continuity professionals. The key outcome is the understanding of the scope of implementation of contingency, BC or crisis management plans and the application of the BC execution stages to implement the necessary actions to prepare an organization for the impending pandemic flu outbreak.

1. Introduction

Even though we experienced three pandemic flu outbreaks in the 20th century, no one knows precisely how a pandemic might unfold. However, recent developments and discoveries about the virus provide some clues as to what we can expect. The World Health Organization (WHO) has warned that the risk of the avian flu becoming a human influenza pandemic is high. Most governments throughout the world have taken, and will continue to take, the necessary precautionary measures and to update their pandemic flu BC and/or preparedness plans.

2. Framework for Pandemic Flu Planning

Planning for the unthinkable pandemic flu may appear to be a huge and complex set of tasks. The scenarios range from a small outbreak in a single country to a global disaster that undermines the basic functions of life. Organizations without any existing BC or contingency plans will be overwhelmed by the planning complexity. Many of those without the necessary resources and BC planning capabilities have unwisely adopted the “wait-and-hope” approach. For organizations located in regions previously affected by the Severe Acute Respiratory Syndrome (SARS) outbreak, a good and logical starting point is a review of their SARS contingency or BC plan.

The concepts and approach contained in this paper do not follow the conventional BC planning methodology. They have been specially designed as a fast-track planning approach to help organizations prepare for the impending pandemic flu threat. The consideration is based on the need to develop an immediate, simple and effective plan to manage this threat, especially for organizations that do not possess existing contingency or BC plans.

Health experts believe that the pandemic flu virus is continuously evolving. Hence, it is imperative for organizations to develop and implement a BC plan that is flexible and adaptable to the evolving threat, and that can be easily and regularly updated as more information on the virus becomes available through the joint efforts of communities and governments.

2 Definitions

There is a constant debate on what to name the plans that we develop for this crisis. Some organizations call it a pandemic flu BC plan while others call it a pandemic flu contingency plan. For clarification, some of the definitions and terminologies of the components of these plans are discussed in the following subsections.

2.1 Contingency Planning

Contingency planning is the process of developing advance arrangements and procedures that enable organizations to respond to events happening by chance or to unforeseen circumstances.

2.2 Business Continuity Planning

Business continuity planning is the process of developing advance arrangements and procedures that enable an organization to respond to an event in such a manner that critical business functions continue without interruption or essential change. In this paper, contingency planning is a subset of BC planning.

2.3 Pandemic Flu Contingency Plan

A pandemic flu contingency plan is used by an organization and its business units to respond to disruptions to operations resulting from exposure of employee(s) to a pandemic flu incident.

2.4 Key Objectives of BCP for Pandemic Flu

  • Reduce the transmission rate or morbidity among employees and customers
  • Continue and/or recover mission critical operations and services

3. Non-conventional Business Continuity Planning

Pandemic flu BC planning differs from traditional BC planning or the Year 2000 or SARS BC planning because organizations:

  • Cannot afford to wait the next few months as the pandemic spreads rapidly, and the impact is significant and immediate
  • Cannot expect to follow a traditional business continuity event timeline
  • Need to react as quickly as possible
  • Need to execute BC plans immediately
  • Should expect some fatalities and high absenteeism within the workforce
  • Need to consider where the employees are residing, and possibly, relocate them back to their home country
  • Must expect closure of borders by the government; thus, critical operations for organizations highly dependent on cross-border workers will potentially be disrupted
  • Must understand that the magnitude of the damage cannot be clearly defined as it extends beyond the organizations and country’s boundaries
  • Should consider legal issues and risks as this is a predicted event
  • Should expect outage/absenteeism for a protracted period of time
  • Should consider non-compliance with outsourcing agreements

4. Key Disaster Scenario

One of the business continuity (BC) best practices is to define the key disaster scenario. This scenario provides a common perspective to the executive management, BC project manager, BC team, IT Disaster Recovery Planning team and even the Crisis Management team.

The key disaster scenario should be based on the worst-case situation – occurring at the most vulnerable time; resulting in damages and losses of the most severe magnitude, like total loss of information, physical infrastructure and equipment.

Traditional BC planning focuses on denial of access to facilities. The pandemic flu BC plan, however, focuses on both denial of access to facilities and loss of key people. Hence, the assumptions underlying pandemic flu BC planning are very different. In addition to this basic difference, there are many other assumptions that a BC planner must quickly look into with regard to pandemic flu BC planning.

5 Pandemic Flu BC Planning Assumptions

5.1 Length of Disruptions and Absenteeism

Medical experts have projected that at least 25% of people will contract the virus during a full-scale pandemic. There are two possible levels of disruptions: short and medium term, and long term. In Figure 3, these assumptions are depicted as business disruption scenarios.

5.1.1 Short and Medium Term Disruption

  • The percentage will be higher than 25% as staff may be staying away from work to care for family members due to quarantine or closure of school.
  • An estimate of 25% absenteeism should be taken as a “low estimate” for medium term disruption. In larger cities, this percentage may increase to 50% or more for short periods.
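The effect of these absenteeism assumptions on a single business unit can be worked through with simple arithmetic. In the sketch below, the headcount of 40 and the minimum of 24 staff needed for critical functions are hypothetical figures. Only the 25% and 50% rates come from the assumptions above.

```python
def available_staff(headcount, absenteeism_pct):
    """Staff expected at work, rounded down, for a given absenteeism percentage."""
    if not 0 <= absenteeism_pct <= 100:
        raise ValueError("absenteeism_pct must be between 0 and 100")
    # Integer arithmetic avoids floating-point rounding surprises.
    return headcount * (100 - absenteeism_pct) // 100


def staffing_shortfall(headcount, absenteeism_pct, minimum_required):
    """Number of extra people needed to keep critical functions running (0 if none)."""
    return max(0, minimum_required - available_staff(headcount, absenteeism_pct))


if __name__ == "__main__":
    HEADCOUNT = 40          # hypothetical business unit size
    MINIMUM_REQUIRED = 24   # hypothetical minimum staff for critical functions
    for pct in (25, 50):
        present = available_staff(HEADCOUNT, pct)
        short = staffing_shortfall(HEADCOUNT, pct, MINIMUM_REQUIRED)
        print(f"{pct}% absenteeism: {present} present, shortfall {short}")
```

With these example figures, the unit can still cover its critical functions at 25% absenteeism but falls short at 50%, which is the kind of gap that cross-training or alternate staffing arrangements would need to address.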

5.1.2 Long Term Disruption

  • In the event of a full pandemic, it is predicted that business will not return to normal for a period of 6 to 18 months. The best case scenario is if the pandemic is relatively benign and handled effectively by national governments. The worst case scenario is the possibility of major financial centers being moderately impacted. A working assumption of a severe disruption lasting 12 months would be supportable.
  • There will be a huge reduction in international services such as tourism and offshore financial services.

5.2 Multiple Sites Disruptions

  • Should there be a pandemic flu outbreak, the situation would be unpredictable, as more than one business location could be impacted.

5.3 Maintain Separation of Personnel

  • Authorities will discourage, or even prohibit, gatherings or concentrations of large numbers of people so as to limit human-to-human transmission of the disease.
  • Decentralization of key personnel (to reduce human-to-human contact) becomes mandatory, which implies autonomous decision making.

5.4 Continuous IT Operations

Provided that the continued operation of key infrastructure (data centers, networks and systems) is accorded highest priority, the major problem is one of managing the people resources.

5.5 Disruption to Supply Chain

During an outbreak, organizations in one part of the world may be only mildly affected, but their operations may still be impacted if their suppliers are in other countries that are seriously affected by the outbreak. One major concern for organizations is that the current supply chain and outsourcing arrangements may not operate at contracted service levels. Organizations that rely on highly automated, ‘just in time’ value chains, or that outsource core activities to third parties, will be seriously at risk.

5.6 Local Denial of Access

In developing the pandemic flu BC plan, organizations should consider the following office closure scenario:

  • Staff affected by pandemic flu resulting in closure of office.
  • Staff members being quarantined for five days or more (subject to local health authorities’ guidelines).
  • Office closed for one to three days for cleaning.
  • Duration required by staff to recover from influenza (the minimum recovery duration will be at least two weeks).

5.7 Ineffectiveness of Temperature Checking

It is important to understand that infection cannot be detected by temperature checking as a person could carry the virus for more than a day before any sign of a fever appears.

5.8 Variation of Health Support and Preparedness

In reviewing the country’s pandemic flu health support, the level of preparedness forms an important consideration when developing your BC plan.

6 BC Execution Stages and Pandemic Timeline

The planning assumptions are a pre-requisite for the implementation of the pandemic flu BC plan. This is followed by the understanding of the typical BC execution process and the WHO’s pandemic stages.

6.1 BC Execution Stages

Figure 1: BC Execution Stages
The execution of a typical BC plan (Figure 1) includes the following stages:

  • Reduce
  • Respond
  • Recover/ Resume
  • Restore/ Return

6.2 WHO’s Pandemic Stage with BC Execution Stages

Those who are familiar with the WHO’s pandemic stages will require little explanation of the timeline. The key in pandemic flu BC planning is to match the various BC execution phases with the WHO’s pandemic flu timeline.

Figure 2: Pandemic Stages and BC Execution Stages

6.3 Pandemic Timeline and BC Execution Stages

Finally, the objective is to show the correlation between each of the WHO’s pandemic stages and the BC execution phases. The mapping allows BC professionals to relate their professional BC knowledge and implementation to the possible business disruption scenarios, as shown in Figure 3.
Figure 3: Pandemic Timeline and BC Execution Stages

7. Types of Plans and Extent of Planning

There is a need to be aware of the types of plans that an organization will be implementing in preparation for the pandemic flu outbreak. The important difference is the scope and extent of implementation. They are the contingency plan, the BC plan and the crisis management (CM) plan.

7.1 Pandemic Flu Contingency Plan

A typical Pandemic Flu Contingency Plan consists of only the following components:

  • Reduce; which is to focus on the preventive measures
  • Respond; which is to focus on managing and containing the pandemic flu incident
  • Recover and Resume; which is to conduct limited planning for the outbreak except for some high level documentation process to handle the critical business functions

A pandemic flu contingency plan must handle:

  • Preventive measures to minimize contamination (pandemic flu prevention)
  • Immediate responses to a disaster (pandemic flu emergency response)

7.2 Pandemic Flu BC Plan

A Pandemic Flu BC Plan will include the pandemic flu contingency plan and in addition, it must handle:

  • Subsequent business recovery and resumption activities
  • The return of business to normalcy

It is essential to note that in some situations, the “business resumption” and “return to normal” processes can be conducted in parallel with the pandemic flu contingency plan.
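The difference in scope between the two plans can be summarized as a small lookup table. The sketch below is one reading of sections 7.1 and 7.2 rather than a definitive mapping. The "full" and "limited" labels are this sketch's own shorthand, and treating "the return of business to normalcy" as the Restore/Return stage is an assumption.

```python
# The four BC execution stages listed in section 6.1.
BC_EXECUTION_STAGES = ["Reduce", "Respond", "Recover/Resume", "Restore/Return"]

# Assumed coverage per plan, read from sections 7.1 and 7.2.
# "full" / "limited" are illustrative labels, not terms from the source.
PLAN_SCOPE = {
    "Pandemic Flu Contingency Plan": {
        "Reduce": "full",
        "Respond": "full",
        "Recover/Resume": "limited",
    },
    "Pandemic Flu BC Plan": {
        "Reduce": "full",
        "Respond": "full",
        "Recover/Resume": "full",
        "Restore/Return": "full",
    },
}


def stages_not_covered(plan_name):
    """Return the BC execution stages that a plan does not address at all."""
    covered = PLAN_SCOPE[plan_name]
    return [stage for stage in BC_EXECUTION_STAGES if stage not in covered]


if __name__ == "__main__":
    for plan in PLAN_SCOPE:
        missing = stages_not_covered(plan) or ["none"]
        print(f"{plan}: stages not covered: {', '.join(missing)}")
```

Laid out this way, an organization that has only a contingency plan can see at a glance which stages still need planning before the plan can be considered a full BC plan.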

7.3 Crisis Management Plan

A Crisis Management (CM) plan is a plan used for the overall coordination of an organization’s response to a crisis in an effective, timely manner, with the goal of avoiding or minimizing damage to the organization’s profitability, reputation or ability to operate.
The definitions of a crisis and of the crisis management team are provided below:

  • Crisis is a critical event such as pandemic flu, which, if not handled in an appropriate manner, may dramatically impact an organization’s profitability, reputation, or ability to operate.
  • A Crisis Management team will consist of key executives as well as key role players (e.g. media representative, legal counsel, facilities manager, disaster recovery coordinator) and the appropriate business owners of critical organization functions.

7.4 BC Execution Stages versus BC Planning, CP and CM

Figure 4: BCP Stages Mapped Against Planning Processes
The relationship among the various planning processes, namely BC planning, contingency planning (CP) and crisis management (CM), is shown in Figure 4.
It is essential for BC planners to fully understand the WHO’s pandemic framework and its corresponding stages and phases. The activation by the WHO may result in an escalation by the local government. It is suspected that the local governments and health authorities will escalate their pandemic flu alert status ahead of the WHO.

8. Conclusion

In summary, this paper has discussed the pertinent aspects of pandemic flu business continuity (BC) planning. The key challenge for businesses in the preparatory process is the synchronization of the business continuity plan and procedures with the World Health Organization’s and the local health ministry’s pandemic alert phases. Several probable outbreak situations, and several more possible variations in responses to them, make the planning process one of the most complicated challenges facing business continuity professionals. The key outcome is the understanding of the scope of implementation of contingency, BC or crisis management plans and the application of the BC execution stages to implement the necessary actions to prepare an organization for the impending pandemic flu outbreak.

9. About the Author

Dr Goh Moh Heng is the President of BCM Institute and is regarded as one of the leading practitioners in the area of business continuity. He holds a PhD and has also been awarded the highest level of certification from the three major business continuity management institutes. He is the author of nine business continuity management books. Dr Goh was instrumental in creating the first Wikipedia for BC, www.BCMpedia.org. He can be contacted at moh_heng@bcm-institute.org.

10. References

[1] BCMpedia (2008). Definition of Business Continuity and Disaster Recovery Terminologies, http://www.bcmpedia.org
[2] Goh, Moh Heng (2008). Managing Your Business Continuity Planning Project, 2nd Edition, 166 pages.
[3] Goh, Moh Heng (2006). Developing Your Pandemic Influenza Business Continuity Plan, 128 pages.