Simulation Exercise

Choices and Categories of Tests & Exercises

Abstract

In testing and exercising BC plans, the terminology for the various types of tests and methodologies often poses a challenge for BCM professionals who are about to start their testing and exercising programmes. This paper is a summary of tests. It is not intended to be a comprehensive list; rather, it aims to provide a good foundation in the types of tests that a BCM professional is likely to embark upon.

1. Introduction

Most BCM professionals find it challenging to identify the types of tests and exercises to be conducted for their organization. The list is usually long and there are many variations within the discipline.

1.1 Categorization

There are several ways of categorizing the types of tests. One approach is to base the categories on the actions to be taken, for example: desk check, simulation, procedure verification, communications and IT environment walkthrough. Another approach is to list all the possible types of tests and then select those that are useful for testing the required outcome, based on the readiness level needed by the organization. This includes component, integrated, simulation and live tests.

The approach in this paper is to describe the techniques or methodology, as the content and objective of the plan can be developed separately. Additional terminology relating to testing can be found at www.BCMPedia.org.

2. Component Tests

The following are samples of the types of tests that could be conducted as part of a component test for a typical business continuity plan.

2.1 Confirm Availability / Version of Plan

This test is designed to check that key staff in both business and support recovery teams can gain access to a hard-copy of their continuity plan at any time. As part of your maintenance program, you should include procedures to “visit” your plan at pre-defined intervals, to update personnel details and to ensure that recovery measures remain relevant.

2.2 Retrieve Vital Hard Copy Records from Offsite Locations

As a good practice, the hard-copy records of documents critical to business operations should be kept in an offsite location. This Component Test confirms that such records are indeed available offsite, are sufficiently up-to-date to be of use in a crisis and can be promptly retrieved within the expected time frame.  These documents may include copies of contracts, agreements, insurance policies, floor plans, title deeds as well as any special reference manuals required to conduct business operations in a crisis situation.

2.3 Contact Staff, Suppliers & Others

One of the most straightforward but important tests is the telephone notification procedure. This is typically carried out on three main groups of people:

  • Staff
  • Suppliers or vendors, who provide you goods and services
  • Other contacts, including customers or others to whom you provide goods and services

Whilst the principles of these tests are similar, you should consider differences in the relationships between your organization and the groups of people and tailor the approach of testing for each group accordingly.  The benefits of carrying out these tests are:

  • Establish that the contact telephone numbers in your plan are correct and up-to-date.
  • Confirm that the resources you require in a crisis, both human and otherwise (e.g. equipment and supplies), can be obtained when and where needed.
  • Ensure that the targeted degree of recovery matches the expectations of your internal or external customers.

It is highly likely that you will need to modify your plans following each test. These tests play a very important role in the maintenance program and their value should not be under-estimated.

2.4 Check Lead Times for Critical Equipment

This is to establish the lead-times for the delivery of critical equipment. This differs from testing suppliers of services as it relates to availability of specific items rather than the ability to contact personnel. This is a simple test, which applies to both business and support units.
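As an illustration only, the result of a lead-time check could be recorded and compared against the time frame the plan expects. The sketch below assumes a hypothetical record layout (EquipmentItem, quoted_lead_time_days, required_within_days) and made-up sample figures; none of these names or values come from an actual plan.

```python
from dataclasses import dataclass


@dataclass
class EquipmentItem:
    # Hypothetical record layout, for illustration only.
    name: str
    quoted_lead_time_days: float  # lead time the supplier confirmed during the test
    required_within_days: float   # time frame the plan expects


def find_lead_time_gaps(items):
    """Return the items whose confirmed lead time exceeds what the plan allows."""
    return [i for i in items if i.quoted_lead_time_days > i.required_within_days]


# Made-up sample data.
items = [
    EquipmentItem("Replacement server", 10, 5),
    EquipmentItem("Desktop PCs", 2, 3),
    EquipmentItem("Cheque printer", 7, 7),
]

gaps = find_lead_time_gaps(items)
for item in gaps:
    print(f"{item.name}: quoted {item.quoted_lead_time_days} days, "
          f"plan expects {item.required_within_days} days")
```

Any item reported here would be a candidate for a plan update, for example by holding a spare offsite or agreeing a shorter delivery commitment with the supplier.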

2.5 Confirm Alternate Site Readiness

This test is used to confirm the readiness of the personnel at the alternate site to receive people from a business unit or building who are displaced due to an incident. The procedure will vary depending on location and on whether the recovery will be at a commercially operated alternate site or at another organization’s building. In any case, a Service Level Agreement (SLA) should be in place confirming the agreed relocation arrangements. This document will state the expected time frame for the relocation, which all relevant parties (officials from the alternate site as well as the Central Support Business Units of the organization carrying out the recovery) must acknowledge, confirming that they find the time frame acceptable, reasonable and attainable. Given that alternate site recovery contracts are usually held centrally and that only certain staff can invoke such plans, it will be assumed, for the purpose of this test, that recovery will be at a site controlled by the organization.

2.6 Test Staff Members’ Knowledge of Business Unit Plan

The person conducting the test visits the business unit BCM coordinator and staff members of a selected business unit and tests how much they know about the procedures without having access to the plan. This confirms the staff members’ knowledge of the plan and their potential ability to recover the business unit if, for whatever reason, a copy of the plan is not initially available.

2.7 Spot Check of Vital Records

This test involves the business unit BCM coordinator and staff members of a selected business unit visiting the offsite location where the vital records are kept. While at the offsite location, the team is required to review the inventory of vital records using a checklist.

2.8 Recall Offsite Storage

This relates mainly to support business units and should not be confused with the retrieval of vital hard-copy records, which is covered separately.

The list of support business units at a medium to large operation would normally include the following:
  • Premises/ Facilities
  • Information Technology
  • Telecommunication/ Networks
  • Security
  • Public Relations
  • Human Resources
  • Administration/ Correspondence
  • Legal/Compliance
  • Financial Control
  • Transport

In order to meet the everyday needs during a disaster, these business units are likely to have spare items such as furniture, equipment, cables, server tapes, back-up disks, stored offsite. In some cases they will be stored in another organization’s building premises and in others, an external storage contractor may be used.

The purpose of this test is to confirm that the business units can access and/or arrange delivery of the required items within the expected time frame stated in the plan.

2.9 Check that Important Lists are Still Current

This ensures that important lists are up-to-date. Each business continuity plan contains a number of lists, e.g. list of key items or contacts required in a crisis. The information stated in the lists can be used to contain the impact and/or limit the damage to the business. The following are key lists in a typical business continuity plan:

2.9.1 Personnel Contact List

In addition to a Telephone Call Tree chart, business unit coordinators should have an updated Personnel Contact List.

2.9.2 Initial Action by Business Units

Important business units should each have a brief list stating the tasks which key team members need to undertake in the opening stages of a disaster scenario. These members should have this list with them at all times.

2.9.3 Inventory of Resources

This lists all key resources. Regular checks should be done to confirm they accurately reflect the needs of each business unit.

2.9.4 PC Software Versions

The lists of IT hardware and software (showing the version) should be kept up-to-date. “Systems” for unique software should be regularly tested and not just stored in an IT business unit.

2.9.5 “Grab” List

This is a list of small items, identified as being useful, which staff will try to take with them as they evacuate.

2.9.6 Priority Salvage List

This identifies items a business unit BCM coordinator might ask someone to hand-carry from the office, if that person was allowed back into a building for, say, 30 minutes.

2.9.7 Essential Forms / Stationery

If a business unit has any special stationery or printed forms without which the business cannot operate, a small supply of these should be stored offsite and the location recorded in the plan.

The tests for confirming the contents of these key lists are simple and quick to conduct.
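One simple way to keep track of whether the key lists are still current is to record when each list was last reviewed and flag those that have gone too long without a check. The following is a minimal sketch under stated assumptions: the 90-day review interval, the review dates and the function name stale_lists are all hypothetical and are not prescribed by this paper.

```python
from datetime import date, timedelta


def stale_lists(last_reviewed, today, max_age_days=90):
    """Return names of lists not reviewed within max_age_days, oldest first.

    max_age_days=90 is an assumed review interval, not a recommendation.
    """
    cutoff = today - timedelta(days=max_age_days)
    overdue = [(reviewed, name)
               for name, reviewed in last_reviewed.items()
               if reviewed < cutoff]
    return [name for reviewed, name in sorted(overdue)]


# Made-up review dates for some of the lists named in section 2.9.
last_reviewed = {
    "Personnel Contact List": date(2024, 5, 20),
    "Inventory of Resources": date(2023, 11, 2),
    "Grab List": date(2024, 1, 15),
    "Priority Salvage List": date(2024, 4, 1),
}

overdue = stale_lists(last_reviewed, today=date(2024, 6, 1))
for name in overdue:
    print(f"Review overdue: {name}")
```

The date check only shows that a review took place; the content of each list still has to be confirmed by the business unit concerned.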

3. Notification Call Tree Test

Even though this is a Component Test, the critical importance of this test cannot be ignored. In a Telephone Notification Call Tree Test for recovery teams, the recovery team members will notify designated staff members as documented in the plan. This personnel communication network forms one of the most efficient and effective means of communicating any news or instructions to all relevant staff, and should include the entire organization.
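To illustrate how the results of a call tree test might be analysed, the sketch below walks a call tree from the top and reports who would not have received the message. It assumes that a person who cannot be reached does not pass the message on, so everyone below them in the tree is also counted as not notified. The names, the tree structure and the function run_call_tree are hypothetical.

```python
from collections import deque


def run_call_tree(tree, root, reachable):
    """Walk the call tree breadth-first, starting from root.

    tree maps each caller to the people they must notify.
    reachable is the set of people who answered during the test.
    Assumption: someone who does not answer cannot pass the message on,
    so their whole branch is reported as not reached.
    """
    notified = []
    queue = deque([root])
    while queue:
        caller = queue.popleft()
        for person in tree.get(caller, []):
            if person in reachable:
                notified.append(person)
                queue.append(person)
    everyone = {p for callees in tree.values() for p in callees}
    not_reached = sorted(everyone - set(notified))
    return notified, not_reached


# Made-up call tree and test outcome.
tree = {
    "Coordinator": ["Team Lead A", "Team Lead B"],
    "Team Lead A": ["Staff 1", "Staff 2"],
    "Team Lead B": ["Staff 3"],
}
reachable = {"Team Lead A", "Staff 1", "Staff 2", "Staff 3"}

notified, not_reached = run_call_tree(tree, "Coordinator", reachable)
print("Notified:", notified)
print("Not reached:", not_reached)
```

In this example Staff 3 would have answered the phone but is never called, because Team Lead B could not be reached. Gaps of this kind suggest where the plan needs an alternate caller.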

4. Walk-through Test

In a Walk-through, recovery team members meet to verbally walk through the steps of each component of the business continuity process as documented in the business continuity plan.

5. Integrated Test

An Integrated Test involves integrating any number of the components in the order that they would occur during actual recovery operations. An Integrated Test builds on the test successes and increased employee awareness generated during component testing. The organization BCM coordinator and business unit BCM coordinators should realize that the increased complexity, coordination of multiple teams, involvement of other interested personnel and budget considerations will limit the frequency of integrated testing.

6. Incident Simulation Test

This involves the development and use of pre-written test scenarios or test scripts for disaster events. The scenarios tell the team members how to react to such disasters and give organizations a baseline from which to start their recovery plans.

7. Partial Simulation Test

Similar to the Full Simulation Test (below), except that only some business units are involved. For these business units, however, the testing will be to the fullest detail and scope.

8. Full Simulation Test

The Full Simulation Test is the ultimate BC plan test, which activates the total BC plan. It is also called a Full Interruption Test or Mock Disaster Test. The purpose is to simultaneously test as many components as possible in the organization's recovery structure. The test is likely to be costly and could disrupt normal operations, and therefore should be approached with caution. Adequate time must be scheduled for the testing.

To successfully test recovery capability, the tests must evaluate the recovery procedures and documentation, not the inherent knowledge of the staff.

Each test must have a set of primary and secondary objectives to define the direction of the test and to measure its success. For example, the primary objective may be to evaluate success or failure, and the secondary objective to test whether extra time is available.

9. Live Test

Finally, this is the ultimate of all tests. It is perhaps the most challenging test that any BCM professional would undertake, as this is where anything that can go wrong will go wrong. To make matters worse, the errors in this test will be seen live across the organization, and especially by senior management.

10. Conclusion

Deciding on the types of test to be conducted can initially be an uphill task for many BCM professionals. There is a pressing expectation from management to test the BC plan until it reaches a state of readiness. Hence, identifying and implementing the correct series of tests becomes a key necessity for any organization that has a BC plan.

11. References

[1] BCMpedia (2008). Definition of Business Continuity and Disaster Recovery Terminologies, http://www.bcmpedia.org
[2] Goh, Moh Heng (2008). Managing Your Business Continuity Planning Project, 2nd Edition, 166 pages.
[3] Goh, Moh Heng (2008). Conducting Your Impact Analysis for Business Continuity Planning, 130 pages.
[4] Goh, Moh Heng (2008). Analyzing & Reviewing the Risk for Business Continuity Planning, 162 pages.
[5] Goh, Moh Heng (2005). Developing Recovery Strategy for Your Business Continuity Plan, 104 pages.
[6] Goh, Moh Heng (2004). Implementing Your Business Continuity Plan, 104 pages.
[7] Goh, Moh Heng (2006). Testing & Exercising Your Business Continuity Plan, 2nd Edition, 160 pages.
[8] Goh, Moh Heng (2007). Managing & Sustaining Your Business Continuity Management Programme, 190 pages.
[9] Goh, Moh Heng (2006). Developing Your Pandemic Influenza Business Continuity Plan, 128 pages.

About the Author

Dr Goh Moh Heng is the President of BCM Institute (www.bcm-institute.org) and the Managing Director of GMH Pte Ltd (www.gmhasia.com), an Asia-Pacific BCM consultancy firm. During the last 20 years, Dr Goh has conducted several hundred tests and exercises for clients throughout the world. These range from the many simple notification tests and walkthrough tests to large simulation and live tests. Some tests worth mentioning include enterprise-wide crisis management simulations, full simulation tests and unannounced live tests for many international organizations. He holds a PhD and has also been awarded the highest level of certification from the three major business continuity management institutes. He is the author of nine business continuity management books. Dr Goh was instrumental in creating the first Wikipedia for BC, www.BCMpedia.org. He can be contacted at moh_heng@bcm-institute.org or moh_heng@gmhasia.com.

Citibank IWE Exercise

Planning basics for a crisis management simulation

How prepared are your people and teams for the pressure of a crisis?

Have we checked the effectiveness of our crisis management plan? Are our staff familiar with it? Have the procedures changed recently? Is it still current or is it sitting on a shelf? I strongly believe that these are some of the questions that many in management have to deal with when it comes to crisis management.

All the above questions can be answered when an organisation is ready to go the extra mile in preparing simulation options to validate its plans, rehearse its procedures and prepare its people, from the strategic senior management level to the tactical level and operational front-line staff.

A simulation should be designed not as a test that can be failed but as a process that enables an organisation to apply its plans and minimise impact during any crisis situation, and it should allow staff to develop and gain confidence in their roles.

A simulation should be developed and delivered to meet a specific crisis management objective or as part of a developmental programme designed to increase the crisis readiness of the organisation on an ongoing basis. The scale of the simulation should be based on your organisation’s needs, the complexity of your organisation and the maturity of your crisis management team.

Crisis Management Simulation Options

These are some of the methods that can be used to validate your plans, rehearse your procedures and prepare your people. Your simulations could range from a simple walk-through of crisis response plans and table top simulations to multi-agency, resource-intensive simulations of crisis scenarios played out in real time.

  • Crisis Management Plan Walk-through
  • Scenario Based Workshops
  • Table Top Simulations
  • Full Simulations
  • Live Simulations

Methodology

The methodology you choose for your crisis management simulation is crucial, as it will guide you from the initial meeting, scope and objectives through to the delivery of a post-simulation report and recommendations. It will also lead you in preparing the simulation architecture, designing the scenario and plans, working on the supporting documentation, preparing the “players”, observers and supporting staff, delivering the event on the day and gathering feedback.

About the Author

Murugan is currently the Assistant Vice President for the GMH Continuity Architects office based in Malaysia. Murugan has vast experience in the development and deployment of Business Continuity Management projects, programmes and workshops for banks and financial institutions (local and international), information technology, and world-class event management. He has also managed and been actively involved in IT outsourcing engagements, largely for financial institutions and other industries, namely Data Centre Services, Media Management, Service Desk and Contact Centre Services. He was also responsible for driving performance across the organization and guiding collaborative teams to implement strategic initiatives to protect the company’s business operations. Appointment within BCM Institute: Murugan M. is an Instructor with BCM Institute.