Disaster Recovery: Identifying Risks and Critical Facility Operations
Most institutional and commercial facilities will face an emergency sooner or later. Some events are more severe than others, but all of them have the potential to disrupt, cripple or shut down a facility. Disasters come in many forms, from floods and hurricanes to power outages, equipment breakdowns and workplace violence.
The type, size and severity of the event will dictate the probability that a facility will be out of operation, as well as the situation's duration and the speed with which services are restored.
For maintenance and engineering managers overseeing the process of restoring services to facilities, the priorities in the aftermath of a disaster depend on a facility's criticality to the continued operation of the organization.
The first step in risk management related to disaster recovery is assessing and understanding risks. Managers need to determine the critical functions of each facility and the utilities required to keep them operating. They also need to identify single-point failures and eliminate as many of them as possible. They need to isolate critical operations from those that are non-critical. Isolated systems can act as redundant systems if sized to handle the additional loads and arranged so they can switch from one power source to another.
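The assessment described above can be sketched as a simple inventory check. The Python example below is only an illustration under stated assumptions: the facility functions, utilities and sources are hypothetical placeholders, and a "single-point failure" is taken to mean any utility that a critical function needs but that has fewer than two independent sources.

```python
# Hypothetical inventory: every name here is illustrative, not taken from a real facility.
critical_functions = {
    "server room": ["electricity", "cooling"],
    "laboratory": ["electricity", "water"],
}

# Independent sources currently available for each utility (assumed values).
utility_sources = {
    "electricity": ["utility feed A", "generator"],
    "cooling": ["rooftop unit 1"],
    "water": ["city main"],
}


def single_point_failures(functions, sources):
    """Map each utility with fewer than two sources to the critical functions it serves."""
    result = {}
    for function, utilities in functions.items():
        for utility in utilities:
            if len(sources.get(utility, [])) < 2:
                result.setdefault(utility, []).append(function)
    return result


spf = single_point_failures(critical_functions, utility_sources)
for utility, functions in sorted(spf.items()):
    print(f"{utility}: single source, affects {', '.join(functions)}")
```

In this made-up inventory, cooling and water each depend on one source, so they would be the first candidates for a redundant or isolated backup. A real assessment would also need to confirm that each backup is sized for the load it might have to carry.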
Virtually every facility relies heavily on HVAC and related utility systems. Depending on a facility's type and location, as well as the time of year, it might not need heating or cooling, and some facilities can continue a day or two without plumbing.
But few facilities can continue more than a few hours without electricity, and fewer still can get by without Internet access. Communication systems become even more important during an emergency, but with two-way radios and cell phones, these systems are less likely to fail than they were years ago.
Managers generally have little or no control over the utility-distribution systems outside a facility. Other parties are responsible for scheduling and paying for the maintenance and repair of streets, sewers, and power and water lines. Utility loops around a facility create redundant systems, which provide a backup not only after a disaster but also during routine maintenance.
A disaster or an emergency is not likely to disrupt water and gas lines, which generally are located underground and, therefore, inherently protected, but even the smallest event can disrupt electricity service and communications. These latter systems tend to be the first services restored afterward.
Electricity is usually the first thing managers think about when discussing isolation and redundancy. Most facilities have multiple services, generators and batteries to supply electricity because it is a high priority and easily disrupted. But if water is vital to operations, managers should give it the same consideration they give to the electrical system.
Generally, it is difficult and costly to have redundant rooftop units. Using several smaller units reduces the impact on a facility affected by the failure of one such unit, but installing several units can increase both first costs and operating costs.
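A rough calculation shows why several smaller units soften the impact of a single failure. The sketch below assumes, purely for illustration, that all units are the same size and share the load equally; it does not model first costs or operating costs, and the function name is hypothetical.

```python
def remaining_capacity_fraction(unit_count):
    """Fraction of total capacity left after one of `unit_count` equal-sized units fails."""
    if unit_count < 1:
        raise ValueError("unit_count must be at least 1")
    return (unit_count - 1) / unit_count


# Compare one large unit against two and four smaller units of equal total capacity.
for n in (1, 2, 4):
    print(f"{n} unit(s): {remaining_capacity_fraction(n):.0%} of capacity remains after one failure")
```

Under these assumptions, a facility served by one large unit loses all of its capacity when that unit fails, while a facility served by four smaller units keeps three-quarters of it.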
The use of redundant cooling units or secondary electrical feeders for those units is just as important as ensuring the space has electricity. Having redundant cooling units for a server room is now essential, given the spread of these mission-critical areas in most types of facilities. But managers too often forget about backup cooling for the people using that server room, as well as for the critical systems.