Disaster Recovery Plan (DRP)

A disaster recovery plan (DRP) is a documented process or set of procedures to recover and protect a business IT infrastructure in the event of a disaster.[1]


A Disaster Recovery Plan (DRP) is a business plan that describes how work can be resumed quickly and effectively after a disaster. Disaster recovery planning is one part of business continuity planning and applies to the aspects of an organization that rely on IT infrastructure to function. The overall idea is to develop a plan that will allow the IT department to recover enough data and system functionality for a business or organization to operate, even if only at a minimal level. The creation of a DRP begins with a DRP proposal to secure upper-level management support. Then a business impact analysis (BIA) is needed to determine which business functions are the most critical and what is required to get the IT components of those functions operational again after a disaster, either on-site or off-site.[2]


Figure: Disaster Recovery Plan (source: Delcorp Data)


Scope and Objectives of a Disaster Recovery Plan[3]
The Disaster Recovery Plan provides a state of readiness allowing prompt personnel response after a disaster has occurred. This, in turn, provides for a more effective and efficient recovery effort. The Disaster Recovery Plan should be developed to accomplish the following objectives:
1. Limit the magnitude of any loss by minimizing the duration of a critical application service interruption.
2. Assess damage, repair the damage, and activate the repaired computer center.
3. Recover data and information imperative to the operation of critical applications.
4. Manage the recovery operation in an organized and effective manner.
5. Prepare technology personnel to respond effectively in disaster recovery situations.
Every business has the responsibility to respond to any short- or long-term disruption of services. By developing, documenting, implementing and testing a Disaster Recovery Plan, businesses will be able to restore the availability of critical applications in a timely and organized manner after a disaster. To accomplish these objectives, the technology area will depend on support from senior management, end users and staff departments. Disaster Recovery Plan activities are initiated by a situation or disaster alert procedure. After an incident is discovered, technology management is informed of a potential disaster at the computer center. The Recovery Management Team then assesses the situation and determines whether there is a need to declare a disaster and activate the Disaster Recovery Plan. When the Plan is activated, assigned recovery personnel are alerted and directed to carry out their recovery procedures.
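The alert and activation sequence described above can be pictured as a small state machine. The Python sketch below is only an illustration: the stage names, the ALLOWED transition table and the helper functions are hypothetical and not taken from any cited plan, and the "no disaster" outcome is an assumption about what happens when the assessment does not lead to a declaration.

```python
from enum import Enum, auto


class Stage(Enum):
    """Hypothetical stages of the alert and activation sequence."""
    INCIDENT_DISCOVERED = auto()
    MANAGEMENT_INFORMED = auto()
    ASSESSMENT = auto()
    DISASTER_DECLARED = auto()
    PLAN_ACTIVATED = auto()
    NO_DISASTER = auto()  # assumption: assessment may end without a declaration


# Which stage may follow which. Terminal stages map to an empty set.
ALLOWED = {
    Stage.INCIDENT_DISCOVERED: {Stage.MANAGEMENT_INFORMED},
    Stage.MANAGEMENT_INFORMED: {Stage.ASSESSMENT},
    Stage.ASSESSMENT: {Stage.DISASTER_DECLARED, Stage.NO_DISASTER},
    Stage.DISASTER_DECLARED: {Stage.PLAN_ACTIVATED},
    Stage.PLAN_ACTIVATED: set(),
    Stage.NO_DISASTER: set(),
}


def can_advance(current: Stage, nxt: Stage) -> bool:
    """True if the sequence permits moving from current to nxt."""
    return nxt in ALLOWED[current]


def advance(current: Stage, nxt: Stage) -> Stage:
    """Move to the next stage, refusing any step the sequence does not allow."""
    if not can_advance(current, nxt):
        raise ValueError(f"cannot go from {current.name} to {nxt.name}")
    return nxt


# Walk the path that ends with the plan being activated.
stage = Stage.INCIDENT_DISCOVERED
for step in (Stage.MANAGEMENT_INFORMED, Stage.ASSESSMENT,
             Stage.DISASTER_DECLARED, Stage.PLAN_ACTIVATED):
    stage = advance(stage, step)

skip_allowed = can_advance(Stage.INCIDENT_DISCOVERED, Stage.PLAN_ACTIVATED)
```

Writing the transitions down as data makes it explicit that the plan cannot be activated without an assessment and a declaration first.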


Types of Disaster Recovery Plan (DRP)[4]
There is no one right type of disaster recovery plan, nor is there a one-size-fits-all disaster recovery plan. However, there are three basic strategies that feature in all disaster recovery plans:
(1) preventive measures,
(2) detective measures, and
(3) corrective measures.
Preventive measures seek to identify and reduce risks so that a disaster is mitigated or does not occur at all. They may include keeping data backed up and off site, using surge protectors, installing generators and conducting routine inspections. Detective measures aim to discover unwanted events within the IT infrastructure and to uncover new potential threats. They include installing fire alarms, using up-to-date antivirus software, holding employee training sessions, and installing server and network monitoring software. Corrective measures focus on fixing or restoring systems after a disaster or other unwanted event. They may include keeping critical documents in the Disaster Recovery Plan or, after a "lessons learned" brainstorming session, securing proper insurance policies.
A disaster recovery plan must answer at least three basic questions:
(1) what its objective and purpose are,
(2) which people or teams will be responsible if a disruption occurs, and
(3) what those people will do (the procedures to be followed) when disaster strikes.


Stages of a Disaster Recovery Plan[5]
The goal of a DRP is to resume normal computing capabilities in as little time as possible. A typical DRP has several stages, including the following:

  • Understanding an organization's activities and how all of its resources are interconnected.
  • Assessing an organization's vulnerability in all areas, including operating procedures, physical space and equipment, data integrity and contingency planning.
  • Understanding how all levels of the organization would be affected in the event of a disaster.
  • Developing a short-term recovery plan.
  • Developing a long-term recovery plan, including how to return to normal business operations and prioritizing the order of functions that are resumed.
  • Testing and consistently maintaining and updating the plan as the business changes.
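One way to make the interconnections and the resumption order concrete is to record which systems depend on which and derive an order from that record. The following is a minimal Python sketch under stated assumptions: the dependency map and system names are invented for illustration, and the ordering uses the standard library's graphlib.TopologicalSorter (Python 3.9 or later) rather than any method prescribed by the cited sources.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each system lists what must be running before it.
depends_on = {
    "network": set(),
    "database": {"network"},
    "order processing": {"database", "network"},
    "reporting": {"database"},
}

# static_order() yields every system after all of its prerequisites.
recovery_order = list(TopologicalSorter(depends_on).static_order())

for position, system in enumerate(recovery_order, start=1):
    print(position, system)
```

Systems that do not depend on each other may appear in either order, so a real plan would still need a business priority to break such ties.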


Phases in Developing an Effective Disaster Recovery Plan (DRP)[6]

  • Business Impact Analysis (BIA): Performing a careful and complete Business Impact Analysis (BIA) is critical to developing an effective Disaster Recovery Plan. During this phase, system requirements, functions, and interdependencies are analyzed; the results are then used to identify system contingencies and to set priorities. The Business Impact Analysis drives the Disaster Recovery Plan by identifying the applications and systems that will significantly impact the business in the event of a disaster. During this phase, it is vital that input be obtained from departments across the enterprise, from Human Resources and Customer Service to Information Technology and Accounting. In addition to defining critical time frames, this information gathering makes the BIA an effective way to educate the enterprise on the need for a Disaster Recovery Plan and to identify any alternative manual procedures that could minimize the impact of an interruption in system availability.
  • Defining RPO and RTO: Critical to the BIA is determining the Recovery Point Objective (RPO) and the Recovery Time Objective (RTO). The RPO is the point in time to which data must be recovered (in effect, the maximum amount of recent data the business can afford to lose); the RTO is the overall length of time an IT component can be in recovery before it negatively impacts critical business processes. The analysis is important because different applications and IT components will have different RPOs and RTOs. For example, an application that supports a mission-critical function, such as customer order processing, may have a short RPO/RTO, while an application that runs an internal, non-customer-facing function of low importance may have a much longer RPO/RTO.
  • Architecting Your Recovery Strategies: Developing a solid Disaster Recovery strategy requires a comprehensive approach. Key items that need to be considered include network requirements, infrastructure needs, data recovery, data and record management, security and compliance. After the critical applications and data recovery objectives are identified, the company needs to architect the specific strategies and solutions that ensure the recovery objectives for applications, network and data are met in the appropriate timeframes. Meeting these recovery objectives may involve deploying new architecture, tools, and infrastructure internally or with the assistance of an external service provider. Options to be considered include electronic vaulting, tape retention, or a dual data center approach.
  • Testing and Training: Performing thorough analysis and developing sound recovery strategies are critical to a solid Disaster Recovery Plan; however, testing the plan and training staff on executing the plan is vital to successful DR planning. The only way to validate that your plan will work is to test the plan on a regular basis and put a function in place to ensure it is updated to reflect changes in the environment. There are various levels of testing, with varying degrees of involvement, from a structured walk-through with key technical resources verbally assessing the plan, to simulation testing where a disaster is simulated so the plan can be implemented, to full interruption testing, in which the disaster recovery plan is activated in total. The organization needs to establish the testing required to effectively assess the validity of the plan. In tandem with testing the plan is the need to train assigned personnel both on their roles in the disaster recovery scenario and on the broader content of the plan itself. Like testing, the organization needs to regularly revisit the training plan to address the organizational changes, new hires, and attrition that are inevitable.
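The RPO and RTO targets described above can be checked mechanically once they are written down. The following is a minimal Python sketch, not a prescribed method: the RecoveryObjective record, the function names and all sample figures are hypothetical. It assumes that the worst-case data loss equals the interval between backups and that a restore time has been measured in a test.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjective:
    """Hypothetical record of the targets a BIA assigns to one system."""
    system: str
    rpo: timedelta  # maximum tolerable data loss, expressed as time
    rto: timedelta  # maximum tolerable time spent in recovery


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    # Assumption: a failure just before the next backup loses one full interval.
    return backup_interval


def meets_objectives(obj: RecoveryObjective,
                     backup_interval: timedelta,
                     measured_restore_time: timedelta) -> tuple:
    """Return (rpo_ok, rto_ok) for one system."""
    rpo_ok = worst_case_data_loss(backup_interval) <= obj.rpo
    rto_ok = measured_restore_time <= obj.rto
    return rpo_ok, rto_ok


# Invented example targets: a short RPO/RTO for a critical system, long for a minor one.
orders = RecoveryObjective("customer order processing",
                           rpo=timedelta(minutes=15), rto=timedelta(hours=1))
wiki = RecoveryObjective("internal wiki",
                         rpo=timedelta(hours=24), rto=timedelta(hours=72))

orders_result = meets_objectives(orders, timedelta(hours=1), timedelta(minutes=45))
wiki_result = meets_objectives(wiki, timedelta(hours=24), timedelta(hours=8))

print(orders_result)  # (False, True): hourly backups cannot meet a 15-minute RPO
print(wiki_result)    # (True, True)
```

In this invented example the order system restores fast enough but is backed up too rarely, which is the kind of gap that would drive the choice of recovery strategy.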


Key Elements of a Disaster Recovery Plan (DRP)[7]
Ensuring that your assets, data and hardware are protected is only part of a disaster recovery plan – the rest is determining a process for how quickly you can be back up and running. Rather than scrambling to put the pieces back together after a major storm, it’s time to put a plan in place. Here are the seven key elements of a business disaster recovery plan.

  • Communication Plan and Role Assignments: When it comes to a disaster, communication is of the essence. A plan is essential because it puts all employees on the same page and clearly outlines all communication. Documents should contain up-to-date contact information for every employee, and employees should understand exactly what their role is in the days following the disaster. Tasks like setting up workstations, assessing damage and redirecting phones will need to be assigned to specific people if you don’t have some sort of technical resource to help you sort through everything.
  • Plan For Your Equipment: It’s important you have a plan for how to protect your equipment when a major storm is approaching. You’ll need to get all equipment off the floor, moved into a room with no windows and wrapped securely in plastic to ensure that no water can get to the equipment. It’s obviously best to completely seal equipment to keep it safe from flooding, but sometimes in cases of extreme flooding this isn’t an option.
  • Data Continuity System: As you create your disaster recovery plan, you’ll want to explore exactly what your business requires in order to run. You need to understand exactly what your organization needs operationally, financially, with regard to supplies, and with communications. Whether you’re a large consumer business that needs to fulfill shipments and communicate with its customers about those shipments or a small business-to-business organization with a handful of employees, you should document your needs so that you can plan for backup and business continuity with a full understanding of the needs and logistics surrounding those plans.
  • Backup check: Make sure that your backup is running and include running an additional full local backup on all servers and data in your disaster preparation plan. Run them as far in advance as possible and make sure that they’re backed up to a location that will not be impacted by the disaster. It is also prudent to place that backup on an external hard drive that you can take with you offsite, just as an additional measure should anything happen.
  • Detailed Asset Inventory: In your disaster preparation plan, you should have a detailed inventory of workstations, their components, servers, printers, scanners, phones, tablets and other technologies that you and your employees use on a daily basis. This will give you a quick reference for insurance claims after a major disaster by providing your adjuster with a simple list (with photos) of any inventory you have.
  • Pictures Of the Office and Equipment (before and after prep): In addition to the photos that you should have of individual inventory items, you’ll want to take photos of the office and your equipment to prove that those items were actively in use by your employees and that you took the necessary care to move your equipment out of harm’s way to prepare for the storm.
  • Vendor Communication and Service Restoration Plan: After a storm passes, you’ll want to be up and running again as quickly as possible. Make sure that you include vendor communication as part of your plan. Check with your local power provider to assess the likelihood of power surges or outages while damage is repaired in the area. You’ll also want to include checking with your phone and internet providers on restoration and access.
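The backup check in the list above can be partly automated. The following is a minimal Python sketch under stated assumptions: it treats the backup location as a plain directory of files and uses the modification time of the newest file as a stand-in for "the backup is running". The function names, the file name and the 24-hour threshold are hypothetical, and a recent file does not prove that the backup can actually be restored.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import Optional


def newest_backup_age_seconds(backup_dir: Path,
                              now: Optional[float] = None) -> Optional[float]:
    """Age in seconds of the most recently modified file, or None if no files exist."""
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in backup_dir.iterdir() if p.is_file()]
    if not mtimes:
        return None
    return now - max(mtimes)


def backup_is_fresh(backup_dir: Path, max_age_hours: float,
                    now: Optional[float] = None) -> bool:
    """True if at least one file exists and the newest one is recent enough."""
    age = newest_backup_age_seconds(backup_dir, now)
    return age is not None and age <= max_age_hours * 3600


# Demonstration in a throwaway directory (stands in for a real backup location).
with tempfile.TemporaryDirectory() as tmp:
    backup_dir = Path(tmp)
    empty_result = backup_is_fresh(backup_dir, max_age_hours=24)

    archive = backup_dir / "full-backup.tar"
    archive.write_bytes(b"demo")
    fresh_result = backup_is_fresh(backup_dir, max_age_hours=24)

    # Pretend the file was written 48 hours ago.
    two_days_ago = time.time() - 48 * 3600
    os.utime(archive, (two_days_ago, two_days_ago))
    stale_result = backup_is_fresh(backup_dir, max_age_hours=24)
```

An empty directory is reported as not fresh, so a backup job that silently stopped writing files is flagged rather than ignored.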


Testing Criteria and Procedures for Disaster Recovery Plans[8]
Best practices dictate that DR plans be thoroughly tested and evaluated on a regular basis (at least annually). Thorough DR plans include documentation with the procedures for testing the plan. The tests will provide the organization with the assurance that all necessary steps are included in the plan. Other reasons for testing include:

  • Determining the feasibility and compatibility of backup facilities and procedures.
  • Identifying areas in the plan that need modification.
  • Providing training to the team managers and team members.
  • Demonstrating the ability of the organization to recover.
  • Providing motivation for maintaining and updating the disaster recovery plan.


Reviewing a Disaster Recovery Plan[9]
The National Institute of Standards and Technology (NIST) is a good resource for standards and guidelines to help businesses build a solid IT recovery plan. In its Guide to Test, Training, and Exercise Programs for IT Plans and Capabilities, recommendations are outlined to help organizations systematically review their plans to identify gaps that need to be addressed. There are three common ways to review a plan:

  • Testing the system. Tests often focus on recovery and backup operations. An example would be removing power from a system to evaluate how quickly the organization can recover.
  • Conducting a tabletop exercise. These are discussion-based exercises where a facilitator presents a scenario and asks the exercise participants questions related to the scenario, which initiates a discussion among the participants of roles, responsibilities, coordination, and decision-making. A tabletop exercise is discussion-based only and does not involve deploying equipment or other resources.
  • Conducting a functional exercise. This allows staff to execute their roles and responsibilities as they would in an actual emergency situation, but in a simulated manner. The goal is to exercise the roles and responsibilities of specific team members, procedures and assets involved in one or more aspects of the recovery plan.


Disaster Recovery Plan Mistakes to Avoid (See Figure Below)[10]
1. Not Updating Your DRBC Plan: Many businesses will create a recovery plan and put it on the shelf to gather dust. This is a crucial mistake that should be avoided at all costs. Businesses experience turnover and the names and responsibilities on the plan will change. If the employee assigned to a task leaves your company and a new employee is not assigned to that task, when disaster strikes nobody will do the task. Best practice would be to update your plan a minimum of once a year.
2. Having Only One Person Creating The DR Plan: Once you begin creating a DR plan you will realize how complex and time intensive it can be. There are many levels and facets to a great DR plan. You need the input of various employees from different departments in your organization to ensure you have considered every contingency. Create a Disaster Recovery (DR) plan team that can work together and separately on creating the entire plan.
3. Thinking Your Insurance Policy and DR Plan Are The Same Thing: Having good insurance is a vital part of your DR Plan but it is only a part. You will need to know how to keep your company running while you are waiting for the insurance company to write you a check. Customers understand that disasters happen, but they expect companies to have a contingency plan in place to lessen the disruption of service.
4. Not Running Drills: Employees need to be trained on what to expect if a disaster happens. If they have tasks to complete, they need to have that knowledge before an actual disaster occurs. Employees and their alternates should practice their tasks to make sure they fully understand how to execute them. Finding out in a drill that someone is unable to complete their task is much better than finding out during an actual event.
5. Depending On A Phone Tree: Communication during a disaster is critical. Phone trees are notorious for failing a few levels in. You need to make sure you have a reliable communication plan in place. Put contact information for your employees, vendors, and important clients in your plan for easy access. Consider hosting your plan on a platform that offers a communication feature like the one in Stay in Business.
6. Over Assigning DR Duties To Employees: If the same employee is given the majority of the tasks, your company’s recovery will be slowed. Assign each person only as many tasks as they can reasonably complete. Assigning teams to a task can help alleviate this problem. While one person needs to be responsible for making sure the task gets done, multiple people can help accomplish it.
7. Not Tracking When Tasks Get Done: During the confusion that can occur because of a disaster, keeping track of what has been done and what still needs to be started can be a hard job. Employee accountability for starting and completing their tasks can be simplified by a tracking system or checklist. The more you can automate your reporting needs, the easier the recovery will be.
8. Not Learning From Disaster: When your company has been restored to normal operations and the disaster declared over, your DR planning is not done. Now is the time to go over what happened and how you can improve your plan for the next event. No plan is perfect, but a well-thought-out plan that is allowed to change when new information is presented has a better chance of protecting your employees and assets. In the end, isn’t that why you made a plan in the first place?
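Several of the mistakes above (tasks left without an owner after turnover, one person holding too many tasks, and not tracking completion) can be caught with a simple checklist structure. The following Python sketch is an illustration only: the RecoveryTask record, the helper functions, the names and the limit of two tasks per person are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RecoveryTask:
    name: str
    owner: Optional[str]  # None models a task left unassigned after turnover
    done: bool = False


def unassigned(tasks: List[RecoveryTask]) -> List[str]:
    """Names of tasks that nobody is responsible for."""
    return [t.name for t in tasks if t.owner is None]


def overloaded_owners(tasks: List[RecoveryTask], max_tasks: int) -> List[str]:
    """Owners holding more than max_tasks tasks, in alphabetical order."""
    counts = Counter(t.owner for t in tasks if t.owner is not None)
    return sorted(owner for owner, n in counts.items() if n > max_tasks)


def progress(tasks: List[RecoveryTask]) -> Tuple[int, int]:
    """(completed, total) so status can be reported during recovery."""
    return sum(1 for t in tasks if t.done), len(tasks)


# Invented example checklist.
tasks = [
    RecoveryTask("redirect phones", "Ana", done=True),
    RecoveryTask("assess damage", "Ana"),
    RecoveryTask("restore file server", "Ana"),
    RecoveryTask("set up workstations", "Ben"),
    RecoveryTask("contact vendors", None),
]

print(unassigned(tasks))            # ['contact vendors']
print(overloaded_owners(tasks, 2))  # ['Ana']
print(progress(tasks))              # (1, 5)
```

Running checks like these as part of the yearly plan review would surface stale assignments before a disaster does.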


Figure: Disaster Recovery Plan Mistakes (source: SIB)


Benefits of a Disaster Recovery Plan[11]
There are many benefits to having a disaster recovery plan, but the biggest benefit is the level of disaster preparedness one can only get by taking the time to develop a plan. A backup and disaster recovery plan is designed to keep business going after a disaster. It’s really a matter of saving your business from the cost of downtime. Just know that the benefits of having a disaster recovery plan are more than just readiness. Here are a few reasons to make a disaster recovery plan that you may not have considered.

  • Asset and Inventory Management: The first part of a good backup and recovery plan is thorough documentation, which involves understanding equipment inventory. This is useful for identifying which pieces of equipment you have, which are extra but may come in handy, and which are completely superfluous. Any good IT administrator knows which equipment he or she has and where to find it. That way if there is a problem, whether small or large, spare equipment is quickly accessible. Good asset management also helps prevent employee theft, which can certainly happen at any organization.
  • Network Management: How can you successfully manage a network if you don’t know everything about it? Detailed documentation as part of a good backup and recovery plan helps you clearly understand the way a network is functioning, which allows you to remedy issues quickly. So if there’s a simple problem like a busted router or something awful like a server failure, you can handle it. RMM tools are great for this because they can help you document networked equipment automatically. Still, there’s a physical aspect that you shouldn’t ignore. Taking photos of equipment setups, particularly in server rooms or closets, can be useful as well.
  • Task Redundancy: Part of your disaster plan involves making sure at least two people can do any one task. This keeps you covered in an emergency, but it doesn’t have to be a full-on disaster for task redundancy to be useful. Have you ever had somebody leave on vacation, call in sick, or leave the company abruptly and on poor terms? This can cause huge problems if that person is the only one who can do a critical task. Not only that, but what about less critical tasks? As an example, suppose you need a person to perform a network diagnosis before you can fix something, but only that one person has the capability. If that person is too busy, it can create a bottleneck and you’re sitting around waiting. You could save time if only you could quickly do it yourself.
  • Cost Savings: We mentioned that good documentation can result in better management, but it can also help you identify areas where you could be saving money, particularly if it’s time for a hardware upgrade. Why run three separate servers when you can run three virtual servers on one physical piece of equipment? Your eagle-eye view can help you see where the cost savings might be and where you might be able to go virtual or to the cloud.
  • Ability to Test: How can you test a plan you don’t have? If you have a disaster recovery plan you can run through what would happen in various scenarios, which allows you to see your recovery in action. If you’re an IT provider, this also helps you establish trust with clients who can actually watch your test and see that you can deliver on any promises you’ve made.
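The task redundancy rule in the list above (at least two people able to do any one task) is easy to verify from a skills list. A minimal Python sketch, assuming a hypothetical mapping from each task to the set of people able to perform it; the names and tasks are invented:

```python
from typing import Dict, List, Set


def single_points_of_failure(capable: Dict[str, Set[str]],
                             minimum: int = 2) -> List[str]:
    """Tasks that fewer than `minimum` people can perform, in alphabetical order."""
    return sorted(task for task, people in capable.items() if len(people) < minimum)


# Invented skills list: task -> people who can carry it out.
capable = {
    "network diagnosis": {"Ana"},
    "restore backups": {"Ana", "Ben"},
    "redirect phones": {"Ben", "Chloe", "Dev"},
}

at_risk = single_points_of_failure(capable)
print(at_risk)  # ['network diagnosis']
```

Here only one person can run a network diagnosis, which is the bottleneck scenario the list describes.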


See Also

Disaster Recovery Planning
Business Continuity
Business Continuity Plan (BCP)
Business Continuity Planning (BCP)
Risk Management
Enterprise Risk Management (ERM)
Crisis Management


References

  1. Definition of Disaster Recovery Plan ScienceDaily
  2. What is Disaster Recovery Plan Techopedia
  3. Scope and Objectives of a Disaster Recovery Plan SANS Institute
  4. Types of Disaster Recovery Plan (DRP) [1]
  5. Stages of a Disaster Recovery Plan Webopedia
  6. The Four Phases in Developing an Effective Disaster Recovery Plan (DRP) secure24
  7. Key Elements of a Disaster Recovery Plan (DRP) Kyle Cebull
  8. Developing Testing Criteria and Procedures for Disaster Recovery Plans Wikipedia
  9. Different ways to Review a Disaster Recovery Plan BerganKDV
  10. 8 Disaster Recovery Plan Mistakes Companies Make SIB/Amala
  11. Benefits of a Disaster Recovery Plan ACD Communications


Further Reading

  • How Effective Is Your Disaster Recovery Plan? Forbes
  • Information Technology Disaster Recovery Plan SOU
  • Guidelines for Generating a Disaster Recovery Plan University of Arizona
  • 10 Essential Steps to Developing an IT Disaster Recovery Plan That Works CloudEndure
  • 8 ingredients of an effective disaster recovery plan cio.com
  • 7 things your IT disaster recovery plan should cover CSO Online
  • The Importance Of A Disaster Recovery Plan Iron Mountain