A data center facility manager has a job with basically no margin for error. It’s not enough to respond quickly to problems. The real challenge is preventing problems from occurring in the first place. These tips can help data center facility managers prevent problems that can compromise uptime or reduce efficiency.
A data center facility manager is well aware of the needs of the data center infrastructure. If demand on a UPS system hits a predetermined threshold of capacity, the facility manager knows that it’s time to start thinking about adding capacity. Since almost every organization’s appetite for data processing is rising, it makes sense to keep the data center power and cooling infrastructure well ahead of that growth.
But that doesn’t mean it’s time for an RFP. Instead, the first step is a reality check. Before a major investment in a data center facility is made, be sure you understand the direction that IT plans to go to meet the organization’s need for compute.
"What we're seeing today is a definite change in the decision-making process before people make an investment into the data center,” says Jeff Gilmer, senior partner, Excipio Consulting. “The facilities people really need to understand the direction of the technology group and the business group before they go ahead and make a business decision to invest in upgrading the infrastructure. For a long-term investment, it's important that they all get involved."
Gilmer cites an organization where the facility team was ready to sign purchase orders for millions of dollars of equipment to increase capacity when, in fact, one third of the compute gear in the data center was due to be retired, in part because of a planned move to the cloud. As a result, demand for power was going to drop by 50 percent.
A skilled staff is crucial to avoiding problems that can jeopardize data center uptime. But a variety of forces are conspiring to reduce the onsite knowledge base. Difficulty getting qualified staff is one factor. Tighter budgets add to that challenge. And the increasing complexity of equipment and designs can complicate matters, requiring more knowledge on the part of on-site staff.
One solution is to rely more on outsourced vendors. Another work-around is to use technology to do more remotely. Both of these are legitimate options. But both can reduce the amount of expertise available on-site in the event of something unexpected.
“Operations are a huge part of critical facilities,” says Michael Fluegeman, principal, director engineering at PlanNet. “The problem is you don't have the knowledge on site you used to have. It's a change I've seen in the industry over the years. You just don't have people anymore to operate equipment.”
In the past, data center staff would at least learn how to do the basic operations and transfers, Fluegeman says. If service providers came in for preventive maintenance, the staff would take the opportunity to learn from them. “The problem with vendors that know everything about your facility is they're not there all the time, and they may not be readily available when you need them,” Fluegeman says.
One easy-to-ignore aspect of onsite knowledge is documentation. “Many construction drawings are not maintained by owners,” Fluegeman says. What’s more, they don’t keep written procedures on how to do basic things. “Service providers may have procedures, but they keep them close to the vest because that's their job security.”
It’s essential for data center facility managers to realize where the risks are and how their staff can address problems themselves. “One of the things you can do is request documentation and keep it,” Fluegeman says. You may have to put the request for documentation in the contract to ensure that you get it, he adds.
From cooling equipment to UPS batteries to switchgear, a data center is full of equipment installed to prevent downtime. Needless to say, no facility manager wants a data center to go down because of a failure of some component of the facility infrastructure. But those systems all have different life expectancies, points out Mark Evanko, principal engineer, BRUNS-PAK. Facility managers should “put a scorecard together to know what needs to be replaced and when.”
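One way to picture the scorecard Evanko describes is as a short list of assets sorted by projected replacement year. The Python sketch below is only an illustration: the asset names, install years, expected-life figures, and the three-year planning horizon are hypothetical placeholders, not manufacturer data or recommendations from the experts quoted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    installed: int       # year placed in service (hypothetical)
    expected_life: int   # planning figure in years (hypothetical)

    @property
    def replace_by(self) -> int:
        # Projected replacement year under the planning assumption.
        return self.installed + self.expected_life


def scorecard(assets, current_year, horizon=3):
    """Return (name, replace_by, years_left, status) rows, soonest first.

    `horizon` is an assumed planning window: anything due within that
    many years is flagged so budgeting can start early.
    """
    rows = []
    for asset in sorted(assets, key=lambda a: a.replace_by):
        years_left = asset.replace_by - current_year
        if years_left <= 0:
            status = "due"
        elif years_left <= horizon:
            status = "plan now"
        else:
            status = "ok"
        rows.append((asset.name, asset.replace_by, years_left, status))
    return rows


# Placeholder inventory for illustration only.
ASSETS = [
    Asset("Main switchgear", installed=1998, expected_life=30),
    Asset("UPS battery string A", installed=2014, expected_life=5),
    Asset("CRAC unit 3", installed=2006, expected_life=15),
]

if __name__ == "__main__":
    for name, year, left, status in scorecard(ASSETS, current_year=2018):
        print(f"{name:22} replace by {year}  ({left:>3} yr)  {status}")
```

In practice, the expected-life figures would come from the facility's own maintenance records and vendor guidance, and the list would be reviewed on a regular schedule.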
There’s a potential silver lining in aging equipment. “There's a lot more efficient equipment today,” Evanko notes. That’s why it pays to be aware of improvements in technology as existing equipment becomes long in the tooth.
But to take advantage of those improvements, facility managers need to keep an open mind. “People say, 'I've been using this same type of UPS system for the last 20 years and I want to keep using this kind of system,' or 'I'm familiar with this type of cooling so I want to cool with this system,'” Gilmer says. While hands-on experience is valuable, it’s not the whole story. “There's newer and better things out there, and we need to be open-minded when we go to look at system design.” The question to ask: What are the best solutions for our organization in the long term?
Does a data center really need its own operating staff? The answer is yes, say experts. They emphasize that it’s a mistake to expect the staff responsible for office space to pull double duty by taking on a mission-critical data center. “Separately staff your critical facilities to ensure they receive the focused attention they require,” says David Boston, director of facility operations solutions for TiePoint-bkm Engineering. “Critical facilities require uninterrupted preventive maintenance and specific experience in providing continuous power and cooling to an ever-changing mix of computer equipment. Office facilities have an entirely different set of needs. Office facilities teams must support a high volume of requests from occupants, with little predictability in volume. Weather also causes a need to react more often. A staff expected to support both will almost always fail to support both entities successfully.”
That’s an opinion shared by other experts. “The people you have on site are doing everything from changing light bulbs to dealing with people complaining about the temperature in the office — they're spread too thin,” Fluegeman says.
The best strategy, Boston says, is to assign a separate group with its own supervisor or lead person to support critical facilities. That dedicated focus will help ensure success.
Dave Lubach has covered a wide range of facility technology topics. He was formerly the associate editor for Facility Maintenance Decisions magazine.
David Boston, Michael Fluegeman, and Jeff Gilmer are speaking at the 2018 Critical Facilities Summit.