Develop Comprehensive Work Rules, Procedures To Minimize Human Error In Data Centers

By David Boston  

4. Rigorously applied work rules. Comprehensive work rules and procedures can go a long way toward reducing human error in data centers. This is perhaps the simplest and most effective step toward minimizing human error, yet it is one of the least implemented. The intent is to require every individual who will set foot in the facility to read, discuss, and sign a document that spells out common rules for working in the building. Explaining the most common risks to the operation and establishing expected behaviors before an individual enters the building for the first time can prevent a significant number of incidents.

One factor seems to prevent this practice from being more universally implemented: the executive in charge of the critical facility (to whom all the departments involved ultimately report) fails to require every person to read, discuss, and sign the document. Executive backing is essential because it takes 15 to 30 minutes to review the document with every new person a department hires or contracts. The practice applies to all employees, managers, contractors, and vendors. (A senior executive who lacks familiarity with the facility's unique processes is just as capable of causing interruptions as an electrician.)

A successful work rules document will incorporate everything from the mundane (no food or beverages in critical areas) to life safety (appropriate arc flash gear will be worn during certain electrical work activities).

5. A comprehensive, site-specific procedures program. Those who maintain the critical facility's infrastructure systems require written procedures to consistently carry out riskier activities, such as system transfers, during which system redundancy is reduced as equipment is brought off-line for maintenance or repair. Just as important are procedures for resolving emergency scenarios. A critical facility may require 150 to 200 documents to cover both of these categories, due to the number of infrastructure systems involved. This number seems high when compared to a non-critical facility's needs. Compared with another critical endeavor, however, it is roughly one-fifth the number of procedures required to operate a nuclear submarine.

In all cases, procedures need to be site-specific, as each facility's configuration is unique. One individual on the facilities staff must be assigned the role of procedures owner and be provided dedicated time each month to make continual progress with the program. Typically, the procedures owner is provided a contracted resource to get the program started.

Establishing a master list and a standard format are important initial steps. A single owner will ensure all procedures follow a similar structure, which minimizes confusion for the reader. Testing each draft document for clarity is essential. This should be done with the least knowledgeable individual on the team, who will vary for each system. An intuitive filing system is needed, both electronic and hard copy, so the documents may be accessed quickly in an emergency. Less critical categories of procedures, such as non-invasive PM tasks and building inspections, do not require the same level of formality or rigor.

Examples of Critical Facility Work Rules Documents

Here are some of the many examples of work rules documents available on the Internet


Posted on 8/16/2013
