Over the years, facility executives have learned the language of Ns and 9s as a way of describing data center design redundancy (the Ns) and availability (the 9s). An N+1 data center, for example, has one redundant UPS system, one redundant generator and so on. A data center designed to six 9s of availability is expected to have 99.9999 percent uptime — in other words, downtime of less than 32 seconds a year.
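The arithmetic behind the 9s is easy to check. The short Python sketch below converts a count of 9s into expected downtime per year. The function name and the 365-day year are assumptions made for illustration, not part of any standard.

```python
# Illustrative sketch: converts "number of 9s" into expected annual downtime.
# Assumes a 365-day year (leap years ignored); names are hypothetical.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds


def annual_downtime_seconds(nines: int) -> float:
    """Return expected downtime in seconds per year for the given number of 9s."""
    unavailability = 10.0 ** -nines  # e.g. six 9s -> 0.000001
    return SECONDS_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in range(3, 7):
        print(f"{n} nines: {annual_downtime_seconds(n):,.1f} seconds per year")
```

For six 9s the result is about 31.5 seconds, which matches the figure of less than 32 seconds a year.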
But the Ns and 9s are only part of the story when it comes to keeping data centers running. Experts say many data center failures are caused by human error, not technical glitches. And the cost of simple mistakes can actually be greater in data centers designed to be most reliable. Those data centers are the ones where the most is at stake, where the financial impact of a data center crash may reach six or seven figures. Those are also the most expensive data centers, with price tags of tens, and in a few cases hundreds, of millions of dollars. The expected return on that investment is straightforward: eliminating the chance that a power outage or facility infrastructure problem will bring down the data center.
But that expense will go for naught if a contractor mistakes the emergency power off button for a door opener and cuts off all electricity to the data center while trying to go home for the day.
No wonder preventing human error in its various manifestations is a crucial responsibility for facility executives who manage data centers. As data center designs have become more reliable, attention has been increasingly focused on the human element.
To explore issues related to staffing of data centers, Critical Facilities gathered a group of facility executives responsible for mission critical data centers in a variety of settings, including finance, research and manufacturing. The discussion ranged from new demands facing data center facility executives to specific staffing challenges to outsourcing issues.
The day before the roundtable, which was held in New York City, an article in the New York Times described staffing problems facing data centers. The challenge was familiar to everyone at the table. “Good people aren’t staying,” said Alexander Kogan, associate vice president, plant operations and housing at The Rockefeller University. “They jump ship quite often.”
One reason for turnover is the pressure to keep data centers up and running no matter what. “They’re on the line all the time,” said Gary Fescine, director of operations for Blackrock with global responsibility for data centers. “One thing goes wrong, they’re afraid for their jobs.” As a result, they may keep looking for what Fescine called “the perfect place,” where infrastructure design, policies and staff appear to an outsider to minimize risks. “The devil you don’t know is better than the devil you do know,” he said.
No one at the table expected things to get any easier. In fact, just the opposite seems likely.
“Two things are happening that are going to force a real shift in the training and expertise level in this world,” said David Schirmacher, vice president, corporate services and real estate, Goldman Sachs. “One is the fact that the densities within the data centers are getting higher and higher. If you’re in the newer environments that are very dense, even small mistakes can be catastrophic.”
The second factor stems from the rapid growth of data processing requirements in data centers. This growth, coupled with smaller and more power-dense IT hardware, has led to sharp increases in overall energy consumption and in the electricity used per square foot.
“Over the past few years, not only has the watts per square foot gone up, but the overall amount of energy consumed by data processing has risen dramatically in relation to our office spaces,” said Schirmacher. If that pace continues, data centers could eventually consume far more energy than all of the space used to house people, although data centers themselves represent only a small fraction of the space the firm occupies.
As energy use rises, those responsible for data centers — in both facilities and IT — will come under increasing pressure to control energy use. Part of the pressure could come from new regulations, said Schirmacher. But part of it will come from senior managers concerned about the environmental impact of energy use. “They will see articles about how data centers are responsible for an increasing amount of carbon emissions,” he said. “They’re going to come back to you and say, ‘What does that mean for us?’” Facility executives will need metrics to answer that question — the same metrics that can be used to guide efforts to control energy use.
As facility executives prepare for emerging challenges, it’s important to look at where — literally — the data center facility staff is right now. The location of a data center determines the pool of potential employees.
Densely populated areas like the Tri-State region around New York City typically have the largest number of experienced data center facility engineers and managers. But a big city location doesn’t necessarily make it easy to find and retain qualified staff, said Michael Bosco, vice president, technology, Morgan Stanley. “Getting good people is difficult. Number one, people jump around. Number two, it’s a very small world that we’re talking about. You know who the players are and who’s doing the hiring.” At the time of the roundtable, Bosco was head of engineering for the North American region of Barclays Capital.
Data centers in other parts of the country have challenges of their own. One of them is that vendor support for mission critical items like UPS systems may be lacking in the area. If that’s the case, a facility executive may need to retain in-house expertise in that system.
“When you’re building a data center,” said Schirmacher, “make sure the components you’re putting in are locally serviceable.”
Global operations magnify the challenge of ensuring that qualified people — whether in-house, outsourced or provided by vendors — are available to keep the data center running. Pfizer has hundreds of data center facilities scattered around the world, some of them operating in remote locations with minimal staff.
“The problem is not just the people,” said Roy Clark, senior manager, global engineering, Pfizer. “It’s the time.” If a data center in a hard-to-reach location runs into a problem, the sheer distance and travel logistics involved may delay the response from an outside organization contracted to provide support for the data center facility or from other people needed to resolve the situation. “Responses can take from a few minutes up to fifteen hours or longer,” said Clark. “A long response time can have a significant impact on business operations.”
Location also plays a role in determining whether a data center is staffed by union or non-union employees. Some facility executives dread having to work with unions. But Schirmacher is perfectly comfortable working in a union environment. He wants unions to see Goldman Sachs as an exemplary employer and advises facility executives to be sensitive to union requirements. “Everyone needs to be a winner in a critical operation,” he said.
In Schirmacher’s eyes, unions should be seen as allies. “You need to really partner with the unions and get, not just the people who are doing the work, but the union itself to understand the criticality of the operation. They need to understand that we’re looking for highly qualified people, and we will return to the membership highly qualified people.” Unions are receptive to that approach, but it’s up to the facility executive to take the lead.
Outsourcing has long been used to address the need for qualified data center facility staff. “We’ve been outsourced for 20 years,” said Phil Meyers, executive director, Morgan Stanley. “We outsource all our facilities and have only senior people manage that relationship.”
With the increasing difficulty of finding qualified facility staff, even organizations that hadn’t outsourced in the past are doing so now. CA outsourced its data center facility operations about a year ago in large part to address the loss of subject matter expertise.
“We would spend a lot of time in training, and those people would be cherry-picked to go elsewhere,” said Robert Paul, vice president, worldwide facilities for CA. “Now we’ve retained some subject matter expertise, even if it isn’t on site. We can contact their corporate headquarters, and they’ve got teams that can parachute in and help us with startups and things like that. You don’t have to worry that Joe left yesterday.”
The benefits of outsourcing have gone beyond staffing. “We’re now doing engineering audits at all the locations to try to get to an N+1 environment,” said Paul.
To hire the outsourcing firm, CA formed a committee that included representatives from facilities, IT and top management. Each committee member scored every candidate independently on criteria CA had identified as keys to success. The committee also asked for detailed information on the person who was going to be the account manager for CA. The contract prevents the service provider from moving key personnel without CA’s permission.
That focus on people is an important part of the process of selecting an outsourced provider. Schirmacher expects the proposed onsite account manager and his or her team to lead the sales presentation to his firm. “They can bring the other people if they’d like, but we want the people who will be running our show doing the pitch,” he said. “You get a good view of how much research they did about your account and their ability to communicate.”
Facility executives who don’t turn to outsourcing providers should be diligent about finding the right person. The first step is a very specific job description, in some cases down to familiarity with a particular piece of equipment, said Rich Julason, director of facilities for the TV Guide Magazine Group. Another important step is working with the human resources department to determine the appropriate pay level for the position.
Technical skills are important in a new employee — whether in-house or outsourced — but they’re not sufficient. “I’ll take judgment and communication skills over technical ability every day,” said Meyers. “And the ownership mentality is key. That’s one of the biggest things we find with our outsourcing people. The ones that are good have that ownership mentality.”
Training is one way to foster that mentality. “It enhances their self-esteem, makes them more valuable. And I think they will be more loyal,” said Julason.
Facility executives would do well to think broadly about training. For example, if a piece of critical equipment is up for annual maintenance, it may be worthwhile to bring in extra staff — people who wouldn’t be needed simply to accomplish the work — and let the project serve as an opportunity for them to learn about the equipment.
Bosco took a similar tack at Barclays. When the firm brought a new piece of critical equipment online, it trained both electrical and mechanical groups on it. “It’s not that you’re going to want a mechanical technician touching the UPS system,” said Bosco. But if mechanical technicians can identify alarms that come up on the UPS screen, they will have a better idea of who should be called and can convey more information to that person.
Commissioning offers another learning opportunity. Kogan said that his entire facility staff, which is largely in-house, gets involved in commissioning. “We want them to understand every component,” he said. What’s more, Kogan’s spec for commissioning involves a lot of hands-on work for the facility staff. “Turn off the power. Does the generator kick in on time? The UPS batteries? That gets the guys excited because they’re actually working with the equipment and seeing what could happen.”
Steps like that add up. “You hope you’re getting the best people, and the best people want to be with the best people,” said Schirmacher. “If they feel that they’re in an industry-leading environment, they’re not anxious to leave. They want to stay there and grow. They like what they do, and they feel connected to the organization.”