(Part 3 of 3) Business Continuity Planning (BCP) and Disaster Recovery (DR) Planning for Commercial Banks of Nepal: Disaster Recovery, Training, Testing and Update, and Planning for the Pandemic

Disaster Recovery

Disaster recovery is a subset of business continuity efforts and deals mainly with the technological aspects of the BCP. In accordance with the NRB guidelines, during Disaster Recovery Planning (DRP) the bank should choose suitable data recovery strategies for different business processes to meet the RPOs and RTOs specified in the BIAs of those processes.

The bank must put a management-approved DRP in place to prepare for the recovery of critical business functions and the continuation of the technology infrastructure that supports them. Such a plan should clearly define the resources, action plan, tasks, procedures and data required to manage the technology recovery effort of the bank.

After you have completed the BIA, it is a best practice to document a formal, management-approved business continuity strategy in respect of people, premises, technology, information, and relationships. This strategy guides the course of action to be used in the development and implementation of the bank’s BCP.

During this process, the BCP Coordinator and the BCP Executive Team (with assistance from technical experts or advisors) should assign proper roles and responsibilities for various other BCP Functional Teams, such as Executive Management Team, Damage Assessment/Salvage Team, IT/Communications Team, Logistics/Transportation Team, Facilities/Security Team, PR/Communication Team, etc. During the disaster recovery process, BCP Functional Teams or Disaster Recovery (DR) Teams have distinct roles to play including but not limited to the following: 

Table-2: Roles and Responsibilities of BCP Functional Teams

Depending on the scope and goals of the BCP, banks could form other functional teams, such as a Finance/Accounting Team or a Human Resources Team, to support their disaster recovery needs. These BCP Functional Teams, aka DR Teams, will be responsible for both the continuity and the recovery aspects of the BCP. They are assigned specific duties to perform in both pre- and post-disaster contexts.

Each team’s critical business information, including call list, task list, customer list, immediate action plans, response procedures, critical equipment, software, supplies, vendors, vital records, etc., must be documented electronically and stored in the Cloud as well as kept in hard copy.

Training, Testing and Update

Every bank should ensure that BCM is embedded in its organizational culture, so that all relevant personnel and staff are aware of their BCP roles and responsibilities. At the headquarters level, each BCP Functional Team (with the help of the BCP Coordinator, BCP Executive Team and technical advisors) will be responsible for developing training and exercise materials for its team based on the information contained in its BCP, including both ERP and DRP.

It is important that awareness and training activities are followed by frequent drills (including tabletop exercises and departmental or full scale tests) for each BCP Functional Team or DR Team.

The NRB guidelines require that the BCP should be periodically tested (at least annually) to ensure its effectiveness. The testing should include all aspects and constituents of the bank, i.e. people, processes and resources including technology infrastructure. BCP testing should be both planned and unplanned and should be audited by the bank’s internal audit function.

The guidelines further require that the testing and its outcome should be documented and that amendments to the BCP be made as suggested by the outcome of the test. In addition to regular testing, it is recommended that team members and managers receive annual refresher training on emergency alert, emergency response, notification and related procedures.

The alternate site test procedure sits at the heart of the disaster recovery test. It deals with two major aspects: first, exercising the system recovery procedures and establishing the communication links; and second, testing the recovery of the participating application software.

During the full scale test, the application owners and respective DR Teams are responsible for successfully running their applications at the alternate site. The full scale test provides an opportunity to identify where the exercise was successful, where problems were encountered, and where improvements are necessary.

The NRB guidelines suggest that the bank should periodically check transaction and data integrity between the Datacenter and the Disaster Recovery site. It is recommended to make this check a part of the End of Day (EOD) or Beginning of Day (BOD) process.
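One simple way to picture such an integrity check is to compare a digest of the day's transaction records at both sites. The sketch below is only an illustration: the function names and the idea of feeding records as plain strings are assumptions made here, not something the NRB guidelines prescribe, and a real core banking system would use its own reconciliation tooling.

```python
import hashlib
from typing import Iterable


def digest(records: Iterable[str]) -> str:
    """Return a SHA-256 digest over the records, in order.

    Assumption: each record is a single-line string, so a newline can
    safely be used as a separator between records.
    """
    h = hashlib.sha256()
    for record in records:
        h.update(record.encode("utf-8"))
        h.update(b"\n")  # separator so ["ab", "c"] differs from ["a", "bc"]
    return h.hexdigest()


def sites_in_sync(dc_records: Iterable[str], dr_records: Iterable[str]) -> bool:
    """True if the Datacenter and DR site hold identical records in the same order."""
    return digest(dc_records) == digest(dr_records)


# Hypothetical end-of-day transaction extracts from each site.
dc = ["TXN001,deposit,5000", "TXN002,withdrawal,1200"]
dr_ok = ["TXN001,deposit,5000", "TXN002,withdrawal,1200"]
dr_stale = ["TXN001,deposit,5000"]  # DR site is missing the last transaction

print(sites_in_sync(dc, dr_ok))
print(sites_in_sync(dc, dr_stale))
```

Comparing digests rather than full data sets keeps the EOD or BOD check cheap, though a mismatch only signals that the sites differ, not which record is at fault.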

The BCP Coordinator, in coordination with the DR Teams, should be responsible for the regular update of the DRP, especially following the full scale test. Afterwards, all participants should be notified of the changes and encouraged to keep hard copies of the updated plan. Since the recovery solutions are primarily based on BIAs, the BCP Coordinator must also update the bank’s BIAs at least annually.

The overarching objectives of a BCP testing and exercise program are to create a learning environment for all the participants and to document changes. Testing and exercising the DRP would verify that the recovery procedures work as intended and that the supporting documentation is current, accurate and relevant. Eventually, the program would help determine the state of readiness of the bank’s BCP.

Planning for the Pandemic

In the age of the COVID-19 pandemic, it is highly pertinent for commercial banks and financial institutions to recognize that there are a few notable differences between the conventional Business Continuity Planning (BCP) process and planning for the challenges posed by the pandemic.

Unlike natural, man-made and technological disasters, the impact of a pandemic is very difficult to determine because of the scale and duration of the crisis. These differences call for banks and financial institutions to review their existing BCPs and prepare to take appropriate actions in response to the COVID-19 crisis, which has the potential to cause major business disruptions, both internal and external, at multiple levels.

In a recently published report (Anticipate, prepare and respond to crisis, 2021) on the World Day for Safety and Health at Work, the International Labour Organization (ILO) particularly emphasizes that investing in a sound and resilient Occupational Safety and Health (OSH) system can build capacity to face future emergencies while supporting the survival and business continuity of enterprises.

During the COVID-19 pandemic, it is vital that workplaces adopt adequate policies and develop action plans for the prevention and mitigation of the contagion. These should include emergency response preparedness, as part of their BCP, and be in line with the results of proper risk assessments.      

COVID-19 presents an unusual risk scenario where a conventional BCP measure such as relocating staff to an alternate site may not necessarily mitigate the risk. Pandemic events may last longer than a typical BCP risk scenario, so an effective communication strategy is critically important as the pandemic continues to evolve over time.

In the meantime, banks and financial institutions need to ensure the continuity of their critical services, such as providing continued deposit and lending services, cash management, keeping ATMs and online banking functional, managing financial markets, and maintaining the payment and settlement system, etc.

Other key concerns may include health protection of staff, mitigating panic, strengthening morale, providing current and essential information to staff, and resumption of normal business activities once virus containment measures have been eased.

Banks and financial institutions should, therefore, establish a framework for COVID-19 operational risk management. Under this framework, the bank should put together a COVID-19 Committee to conduct a thorough risk assessment and devise a pandemic response plan. Such a plan, eventually a part of the OSH system, would support the bank’s business continuity in its true sense.

(Part 2 of 3) Business Continuity Planning (BCP) and Disaster Recovery (DR) Planning for Commercial Banks of Nepal: Business Impact Analysis (BIA), RPO and RTO, Backup Sites, and Datacenter (DC)

Business Impact Analysis (BIA)

BIA is the key element of the BCP planning process, since it provides the foundation upon which the BCP is developed.

The bank’s critical business functions are time-sensitive and must be restored first in the event of a disaster to avoid unacceptable financial and operational losses. BIA helps identify these time-sensitive critical business functions within various departments of the bank. The purpose is to identify the impacts of disruptions that may result in denied access to critical banking services, buildings and facilities.

The NRB guidelines specifically require that there should be detailed procedures for prioritizing critical business functions, handling incidents, and managing and controlling identified risks.

BIA helps analyze the operational, financial and non-financial impacts on various bank activities (within each of the identified critical business functions), when these business functions are not available or the access to normal workspace is denied.

Furthermore, BIA also helps identify resource requirements, such as competent staff, office equipment, office technology, computer applications, vital records, office stationery, and third-party services etc. to support the technology and business recovery process of the bank.

As per the NRB guidelines, the bank should accurately determine and prioritize such mission-critical business activities along with their recovery strategy, alternate site locations, testing, training, etc.

It would be meaningful if the BIAs were conducted before the risk assessment in order to identify urgent business functions upon which risk assessment could be focused.

BIA is often completed in two major steps targeting first functional recovery (activity recovery) and next computer application recovery on a priority basis. The idea is to determine the bank’s functional recovery priorities, identify interdependent activities and establish appropriate recovery objectives so that Recovery Time Objectives (RTOs) and Recovery Point Objectives (RPOs) can be set for those mission critical business functions as well as activities within them.

RPO and RTO

Recovery Point Objective (RPO) is the point in time to which backup data (for example, from backup tapes or replication) must be restored and synchronized by IT to resume business processing. In practice it determines how frequently data must be backed up (e.g. software backup, user data backup, application backup, etc.); in other words, it is the measure of data loss (in hours or days) acceptable to your bank.

For example, if you have an RPO of 24hrs, the data restored from the backup will be up to 24hrs old, and the business function will need to recover the missing data manually.

In the best-case scenario, the RPO is zero, which basically means that all affected computer systems utilize mirroring (real-time data/transaction copying) technology to simultaneously copy all incoming data/transactions to another identical system in a remote location.

Determining the RPO may also depend on how frequently the data being backed up is modified. Data that does not change often, such as account information, personal records, employee records, etc., can have longer RPOs. On the other hand, shorter RPOs are advised for frequently updated data, such as credit card data, financial transactions, etc.
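The relationship between backup frequency and RPO described above can be sketched in a few lines. This is a minimal illustration only: the function name is made up for this example, and it rests on the simplifying assumption that the worst-case data loss equals the interval between two consecutive backups.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Return True if the worst-case data loss stays within the RPO.

    Assumption: a failure just before the next backup loses everything
    written since the previous one, so worst-case loss == backup interval.
    """
    return backup_interval <= rpo


# Hypothetical backup schedules.
daily_backup = timedelta(hours=24)
hourly_backup = timedelta(hours=1)

# Slowly changing data (e.g. employee records) with a 24-hour RPO.
print(meets_rpo(daily_backup, rpo=timedelta(hours=24)))

# Frequently updated data (e.g. financial transactions) with a 4-hour RPO.
print(meets_rpo(daily_backup, rpo=timedelta(hours=4)))
print(meets_rpo(hourly_backup, rpo=timedelta(hours=4)))
```

Under this assumption, a daily backup satisfies a 24-hour RPO but not a 4-hour one, which is why frequently updated data calls for more frequent backups or replication.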

Recovery Time Objective (RTO) is the period of time within which IT systems, applications, or business functions of the bank must be recovered or put back in operation after an outage. That means a 24hrs RTO would indicate that the particular business function could operate using temporary manual workarounds for the first 24hrs following a disaster declaration. During this period the business function can continue to function in an emergency mode without access to the IT systems or applications.

Determining RTO may also include a “time of year” or “seasonal” component, such as busy festival times, end of fiscal year, quarterly reporting period, etc.; when a disruptive event can prove to be disastrous.

For example, in the middle of the month or quarter your finance team may go days without accessing the finance application, but at the end of the month or quarter, even a few hours without this application can be extremely disruptive.
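The seasonal component of the RTO can be illustrated with a small sketch. All figures here are hypothetical: the 72-hour and 4-hour RTOs and the three-day month-end window are assumptions chosen for illustration, and a bank would take its actual values from its own BIAs.

```python
from datetime import date, timedelta

# Hypothetical RTOs for a finance application: tighter around month end.
NORMAL_RTO = timedelta(hours=72)
PERIOD_END_RTO = timedelta(hours=4)


def is_period_end(day: date, window_days: int = 3) -> bool:
    """True if `day` falls within the last `window_days` days of its month."""
    first_of_next_month = (day.replace(day=1) + timedelta(days=32)).replace(day=1)
    return (first_of_next_month - day).days <= window_days


def effective_rto(day: date) -> timedelta:
    """Pick the RTO that applies on the given day."""
    return PERIOD_END_RTO if is_period_end(day) else NORMAL_RTO


def rto_breached(outage: timedelta, day: date) -> bool:
    """True if an outage of this length exceeds the RTO in force on `day`."""
    return outage > effective_rto(day)


six_hours = timedelta(hours=6)
print(rto_breached(six_hours, date(2021, 7, 15)))  # mid-month outage
print(rto_breached(six_hours, date(2021, 7, 30)))  # month-end outage
```

The same six-hour outage is tolerable mid-month but breaches the objective at month end, which is the point of treating "time of year" as part of the RTO.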

The NRB guidelines require that the bank’s BCP should specify RPO and RTO of different business processes. The guidelines, however, allow the bank to choose from the Hot, Warm or Cold backup sites to meet the RPO and RTO requirements as specified in the bank’s BIAs. 

Backup Sites

For disaster recovery backup purposes, the NRB guidelines call for the bank to maintain its own standby site and system or to outsource them to a disaster recovery provider. Depending on RPO and RTO requirements, the bank may opt for a high availability system that keeps both system and data replicated at a remote site, or for live replication of data to an offsite location. The bank may also choose to have full system backups, off-site incremental backups, or backups made to electronic media and sent offsite periodically.

As per the requirements and criticality of business functions, it is recommended to go for a combination of above strategies utilizing Hot, Warm and Cold backup sites. 
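To make the idea of combining site types concrete, the sketch below assigns each business function a site type based on its RTO. The thresholds (4 hours and 3 days) and the sample functions are hypothetical values chosen purely for illustration; they are not taken from the NRB guidelines, and a real mapping would follow the bank's own BIAs.

```python
from datetime import timedelta

# Hypothetical thresholds, for illustration only.
HOT_SITE_MAX_RTO = timedelta(hours=4)
WARM_SITE_MAX_RTO = timedelta(days=3)


def suggest_site(rto: timedelta) -> str:
    """Suggest a backup site type for a business function with the given RTO."""
    if rto <= HOT_SITE_MAX_RTO:
        return "hot"
    if rto <= WARM_SITE_MAX_RTO:
        return "warm"
    return "cold"


# Hypothetical business functions and their RTOs.
functions = {
    "online banking": timedelta(hours=2),
    "loan processing": timedelta(days=1),
    "archival reporting": timedelta(days=7),
}

plan = {name: suggest_site(rto) for name, rto in functions.items()}
for name, site in plan.items():
    print(f"{name}: {site} site")
```

The result is a mixed plan in which only the most time-critical functions carry the cost of a Hot site.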

Table-1: Comparison of Hot, Warm and Cold Backup Sites 

Datacenter (DC)

A DC is a physical location which hosts computer systems and network equipment to facilitate and support day to day banking operations. It could be located on the bank premises, co-located outside or hosted in the Cloud.

Whatever arrangements have been made for the standby site (or disaster recovery site, whether Cold, Warm or Hot), the NRB guidelines dictate that the bank should also adopt disaster mitigation strategies such as locally mirroring data and systems, arranging a UPS and generator for long-term power failures, using surge protectors to minimize the effect of power fluctuations, and providing adequate physical and environmental controls in the DC.

Moreover, delivery channels such as ATMs, internet banking and mobile banking tend to significantly increase the risk of financial loss and electronic fraud along with other banking risks, such as credit risk, reputation risk, compliance risk, market risk, strategic risk, etc. Therefore, the DC, disaster recovery solution, enterprise network and security, and branch or delivery channels should be designed and configured for high availability with no single point of failure, as prescribed by the NRB guidelines.

The guidelines further require that the location of the building containing the DC and critical equipment rooms must be chosen so as to minimize the risk of natural and man-made disasters, flood, fire, explosion, riots, environmental hazards, etc. Physical access to the DC and critical equipment rooms must be restricted to authorized individuals only.

Cont’d….

(Part 1 of 3) Business Continuity Planning (BCP) and Disaster Recovery (DR) Planning for Commercial Banks of Nepal: BCP Policy, Hazard Identification, Risk Assessment, Vulnerability Reduction and Emergency Response

In today’s world, the banking and financial sectors play a vital role in the economic growth, stability and sustainability of a country. They are expected to provide continuous and reliable 24/7 services to an array of customers and stakeholders. However, one cannot deny the fact that banks and financial institutions are susceptible to internal as well as external threats such as fire, explosion, earthquake, pandemic, blockade, fuel shortage, severe storm, landslide, flood, fraud, cyber-attack, power outage, system failure, etc.

These hazards are capable of causing various kinds of risks (including financial, operational, legal, reputational, etc.) and may lead to severe business disruption, sometimes even crippling the financial system as a whole.

As the country suffers the financial blow of the COVID-19 pandemic, it is obvious that more than ever before we need to integrate sound and effective Business Continuity Management (BCM) practices within our banks and financial institutions. One tangible way to verify whether an institution has embraced BCM is to see that it has a ready and workable Business Continuity Plan (BCP) addressing all critical aspects of banking and financial activities pertaining to people, process, infrastructure, facility and technology.

Nepal Rastra Bank (NRB) released its “Nepal Rastra Bank Information Technology Guidelines” in August 2012. The core objectives of these guidelines are to promote sound and robust technology risk management and to strengthen system security, reliability, availability and business continuity in commercial banks of Nepal.

Under the title “Business Continuity and Disaster Recovery Planning” the guidelines set some specific requirements for the banks with relation to BCP policy, roles & responsibilities, risk analysis, vulnerability reduction, disaster response, business impact analysis, recovery strategy, datacenter, backup sites, resumption of business processes, training, testing and updates.

Disaster preparedness and disaster mitigation are the key planning aspects of any business continuity and disaster recovery effort. Disaster preparedness involves the activities performed prior to a disaster to support and enhance disaster mitigation measures. Disaster mitigation, on the other hand, includes the action plans and activities to eliminate or reduce the effects of a disaster after it has occurred.

Eventually, a BCP bundles together all the documents required for an effective execution of hazard identification, risk control, disaster response and business recovery to re-establish critical business functions after a disruptive event, such as a massive earthquake or the failure of a firewall security system.

Business continuity and disaster recovery planning requires serious commitment and dedicated effort from the executive leaders of commercial banks. Such efforts are more likely to succeed when they have the support of those in senior leadership positions, all the more so because, at the end of the day, the board of directors and senior management are responsible for the bank’s business continuity.

BCP Policy

The NRB guidelines require banks to develop a board-approved BCP policy and appoint a senior bank officer as the head of the BCP process. The BCP policy should incorporate detailed procedures for prioritizing critical business functions, controlling identified risks, allocating resources and manpower, handling emergency incidents and reviewing the policy periodically.

A BCP can be simple or complex depending on the size, scope, goals and objectives of the bank. The BCP policy should establish achievable goals and set clearly defined objectives and milestones to achieve these goals. The goals and objectives should encompass all aspects of the plan including hazard prevention, risk mitigation, disaster preparedness, emergency response and recovery of the business processes. Short-term objectives are essential to the development of the plan while long-term objectives may require more significant planning, investment and expertise.

A BCP is likely to achieve the greatest success when a senior officer within the bank is fully dedicated to completing the assessment and organizing efforts to follow up on the required tasks. Keeping this in mind, the bank should appoint a BCP Coordinator who is responsible for putting together and maintaining a comprehensive BCP based on its business impact analysis, risk assessment and recovery objectives.

Next, a BCP Executive Team should be formed at the headquarters level, comprising senior officers from various departments, especially those working in the critical business areas of the bank. Similar BCP teams should also be set up in the branch offices to mirror the working and functioning of the BCP Executive Team.

Hazards and Risk Assessment

Hazards are events that can give rise to business disruption or an emergency situation. The NRB guidelines require that a BCP should consider all possible hazards including natural, man-made, security threats, human errors, regulatory requirements, dependencies created by outsourcing activities and operations in multiple countries, etc.

To identify hazards, you should gather information about natural or man-made emergencies that may arise in your local area, as well as emergencies that may be created by the interruption of the bank’s own operations.

There are a variety of sources to collect hazard information, such as employees working in different departments of your bank, local media, disaster reports, government organizations, academic institutions, nonprofit agencies, etc. Also, find out about any emergencies that have occurred in the past and gather information about other potential hazards related to: fire, explosion, hazardous materials, flood, landslide, blockade, telecommunication or computer system failure, power outage, construction failure, human error, fraud, etc. 

You should also assess how likely such an event is, how you are exposed, what your vulnerabilities are, what assets are at risk and how severe the hazards’ impact would be.

An institution-wide risk assessment looks at the probability and impact of a variety of specific threats that could cause a business disruption. The entire process will also allow you to prioritize risks and move accordingly in your bank’s emergency response, business continuity and recovery planning processes.

It makes more sense to focus your risk assessment on the critical business functions identified during business impact analysis. Remember, this is not a one-off process: you should regularly improve and extend your risk assessment matrix to keep it current and relevant. The BCP Coordinator should be made responsible for updating it periodically.
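A risk assessment matrix of the kind described above is often scored by multiplying likelihood by impact. The sketch below uses that common convention as an assumption; the 1 to 5 scales and the sample ratings are hypothetical, and the bank's own matrix may weigh factors differently.

```python
# Hypothetical ratings on a 1-5 scale: (likelihood, impact).
hazards = {
    "earthquake": (2, 5),
    "power outage": (4, 3),
    "cyber-attack": (3, 5),
    "landslide": (2, 2),
}


def risk_score(likelihood: int, impact: int) -> int:
    """Risk priority score, assuming the common likelihood x impact convention."""
    return likelihood * impact


# Rank hazards from highest to lowest risk priority score.
ranked = sorted(hazards, key=lambda name: risk_score(*hazards[name]), reverse=True)

for name in ranked:
    likelihood, impact = hazards[name]
    print(f"{name}: score {risk_score(likelihood, impact)}")
```

Re-running the ranking whenever a rating changes is one way to keep the matrix current, in line with the periodic update assigned to the BCP Coordinator.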

Reducing Vulnerability

After gaining better understanding of the hazards that may impact your banking activities, you now also have a sense of where and why you are vulnerable to such impacts. The next step is to use this information to do what you can to reduce your vulnerabilities as far as possible. This involves identifying and implementing pre-emptive measures to reduce vulnerability, as well as assessing your ability to respond to emergencies. 

You need to consider how your bank and employees would respond to emergency events. This includes assessing your existing resources by asking questions such as: How quickly can you react? Do you have the skills and inter-organizational relationships to respond swiftly and effectively? Can you identify alternative operational procedures?

For instance, if a landslide obstructs an important road that is part of your distribution network, would you be able to use an alternative distribution route or method? If intermittent fuel shortage is a risk, would it make sense to keep a stockpile for such occurrences? If there is a server fire in the facility, would you have a quick backup available for your digital data?

One of the most effective ways to assure your bank’s recovery from an emergency is to involve your employees directly in preparing and planning for disasters. When assessing your human resources, consider what you have in place already and what you need to do to help prepare and train your employees.

You may also consider purchasing insurance products especially for the hazards with a high risk priority score in your risk assessment matrix, if such products are available in the market.

You should plan ahead in case the bank needs assistance from others in an emergency. It is recommended that you contact external organizations that may be able to help just in time, during or after a disaster.

In some cases, formal agreements such as an MOU may be helpful to define the business terms, relationship and communication with these service providers during an emergency. This may also include, for example, making agreements with other banks to continue serving your clients while your bank is transitioning to backup operations.

Emergency Response

Emergency response refers to the bank’s initial activities designed to mitigate a disaster’s immediate and short-term impacts. An Emergency Response Plan (ERP) should include specific guidelines and procedures for declaring an emergency, activating internal response, notifying staff, maintaining line of communication, deploying the BCP and recovery teams, etc. It should also clearly illustrate how and when to move to an alternate site, how to access data stored off site, who is responsible for what, etc. 

An ERP basically identifies the structures and preparations you need to make to respond effectively to emergencies. It describes the steps your bank would take to protect itself and its employees before, during and after an emergency.

In the case of major disasters such as an earthquake, you cannot expect immediate assistance from the community or from professional responders. Therefore, the bank should be prepared with some internal responders (with prepositioned equipment) who can conduct basic search and rescue until the professional responders arrive at your premises.

If your employees are briefed about potential emergency situations and how to respond, their response will be more effective and they are less likely to be confused or scared. This calls for an effective crisis-communication plan so that all your employees clearly know their roles during a disaster, as well as the roles and responsibilities of key personnel at your facility. 

Although there may seem to be an overlap between the ERP and the BCP, bear in mind that the ERP focuses primarily on pre-emptive measures for disaster preparedness and response activities immediately after a disaster, while the BCP is primarily a business recovery and a rather long-term impact mitigation plan.

The bank should have a board-approved written ERP that is periodically reviewed, exercised and updated to ensure that the plan really works during real emergencies.

Moreover, the NRB guidelines require that the bank should develop an appropriate ERP, including communication strategies and outsourced services, to ensure business continuity, control reputational risk and limit liability for service disruption. The ERP should, inter alia, cover a mechanism to identify an incident as soon as it occurs, recovery of e-banking systems and services, a communication strategy to address external parties and the media, a procedure to alert the related regulatory body, etc.

Cont’d….