For years, putting water in a data center was an idea that amounted to heresy for data center designers. But as data center cooling demands skyrocket, the industry is taking a hard look at that very option.
The focus of attention is a product category known as In-Row Cooling (IRC). Many IRC units use water. The objective of IRC is to capture and cool the heat from servers before it has time to mix with ambient air, thereby reducing energy use. The efficiency of this approach has attracted the interest of IT groups — a growing number of which are being charged for the full data center energy bill, thanks to cheaper submetering — as well as facility executives who are interested in sustainability and LEED certification.
Typically, the need for IRC arises where computer loads are greater than 10 kW per server cabinet. This seems to be where many existing data centers begin to approach their maximum cooling capacities. Because of its design, IRC can start with a single cabinet and grow from there.
As data center managers are updating IT technology to the latest generation of processors, more and more IRC is creeping into data centers of all types. If a data center is old and at its design capacity, the use of IRC is generally an inexpensive way to take the first steps into high density computing — blade servers and the like. In the case of research centers, supercomputing facilities, and other high performance computing sites, the entire data center will go to IRC.
Water, Water Everywhere
The move to water-cooled applications raises many challenges for facility executives. For example, experience shows that a building’s chilled water system is anything but clean. Few data center operators understand the biology and chemistry of open or closed loop cooling systems. Even when the operating staff does a great job of keeping the systems balanced, the systems still are subject to human errors that can wreak permanent havoc on pipes.
Installing dedicated piping to in-row coolers is difficult enough the first time, but it will be nearly intolerable to have to replace that piping under the floor if, in less than five years, it begins to leak due to microbial or chemical attacks. That does happen, and sometimes attempts to correct the problem make it worse.
Consider these horror stories:
- A 52-story single-occupant building with a tenant condenser water system feeding its data center and trading systems replaced its entire piping system (live) due to microbial attack.
- A four-story data center replaced all of its chilled and condenser water systems (live) when the initial building operators failed to address cross contamination of the chilled water and the condenser water systems while on free cooling.
- In yet another high-rise building, a two pipe (non-critical) system was used for heating in the winter and cooling in the summer. Each spring and fall the system would experience water flow blockages, so a chemical cleaning agent was added to the pipes to remove scale build-up.
Before the cleaning agent could be diluted or removed, the heating system was turned on. Thanksgiving night, the 4-inch lines let loose. Chemically treated 180-degree water flooded down 26 stories of the tower. Because no one on site knew how to shut the system down, it ran for two hours before being stopped.
Water quality isn’t the only issue to consider. Back in the days of water-cooled mainframes, chilled water was delivered to a flat plate heat exchanger provided by the CPU manufacturer. The other side of the heat exchanger was filled with distilled water and managed by technicians from the CPU manufacturer. Given this design, the areas of responsibility were as clear as the water flowing through the computers.
In today’s designs, some of the better suppliers promote this physical isolation through the use of a “cooling distribution unit” (CDU) with the flat plate heat exchanger inside. Not all CDUs are alike and some are merely pumps with a manifold to serve multiple cooling units. It is therefore wise to be cautious. Isolation minimizes risk.
Currently, vendor-furnished standard CDUs are limited in the number of water-cooled IRC units they can support. Typically these are supplied to support 12 to 24 IRCs with a supply and return line for each. That’s 24 to 48 pipes that need to be run from a single point out to the IRCs. If there are just a few high-density cabinets to cool, that may be acceptable, but, as the entire data center becomes high-density, the volume of piping can become a challenge. Even 1-inch diameter piping measures two inches after it is insulated.
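The pipe count above is simple arithmetic, and a short sketch makes it easy to rerun for other CDU sizes. The function name and the side-by-side width figure are illustrative assumptions: the width assumes every insulated pipe lies in a single layer, touching its neighbors, which a real underfloor layout may not do.

```python
def cdu_pipe_runs(irc_units: int, insulated_diameter_in: float = 2.0) -> dict:
    """Estimate pipe runs leaving one CDU.

    Assumes one supply and one return line per IRC unit, as described in
    the article. The default 2.0 in. is the article's figure for insulated
    1-inch pipe. The width is a rough, hypothetical single-layer estimate.
    """
    if irc_units < 0:
        raise ValueError("irc_units must be non-negative")
    pipes = 2 * irc_units  # supply + return for each unit
    return {
        "pipes": pipes,
        "side_by_side_width_in": pipes * insulated_diameter_in,
    }


# The article's range: CDUs that support 12 to 24 IRC units.
for units in (12, 24):
    print(units, cdu_pipe_runs(units))
```

Under those assumptions, 24 units means 48 insulated pipes, which would span 8 feet if laid flat next to one another. That is why shared headers become attractive as density grows.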
The solution will be evolutionary. Existing data centers will go the CDU route until they reach critical mass. New data centers and ones undergoing major renovations will have the opportunity to run supply and return headers sized for multiple rows of high-density cabinets with individual, valved take-offs for each IRC unit. This reduces clutter under the floor, allowing reasonable airflow to other equipment that remains air-cooled. Again, the smart money will have this distribution isolated from the main chilled water supply. It could even be connected to a local air-cooled chiller should the main chilled water plant fail.
Evaluating IRC Units
Given the multitude of water-cooled IRC variations, how do facility executives decide what’s best for a specific application? There are many choices and opportunities for addressing specific needs.
One consideration is cooling coil location. Putting the coils on top saves floor space. And the performance of top-of-the-rack designs is seldom affected by daily operations such as server equipment installs and de-installs. But many older data centers and some new ones have been shoehorned into buildings with minimal floor-to-ceiling heights, and many data centers run data cabling in cable trays directly over the racks. Both these situations could make it difficult to put coils on top.
If the coil is on top, does it sit on top of the cabinet or is it hung from the structure above? The method of installation will affect data cabling paths, cable tray layout, sprinklers, lighting and smoke detectors. Be sure that these can all be coordinated within the given overhead space.
Having the coil on the bottom also saves floor space. Additionally it keeps all piping under the raised floor and it allows for overhead cable trays to be installed without obstruction. But it will either increase the height of the cabinet or reduce the number of “U” spaces in the cabinet. A “U” is a unit of physical measure to describe the height of a server, network switch or other similar device. One “U” or “unit” is 44.45 mm (1.75 inches) high. Most racks are sized between 42 and 50 “U”s (6 to 7 feet high) of capacity. To go taller is impractical because doing so usually requires special platforms to lift and install equipment at the top of the rack. To use smaller racks diminishes the opportunities to maximize the data center capacity.
With a coil on the bottom, a standard 42U cabinet will be raised 12 to 14 inches. Will that be too tall to fit through data center and elevator doors? How will technicians install equipment in the top U spaces? One option is a cabinet with fewer U spaces, but that will mean more footprint for the same capacity.
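The height question can be checked with a quick calculation. The sketch below uses the article's figures (1.75 inches per U, a 12- to 14-inch rise for a bottom coil). The 80-inch opening is an assumed value for illustration only, and the function names are hypothetical. The calculation also ignores the cabinet frame, top panel and casters, so a real cabinet will be somewhat taller than the number shown.

```python
U_INCHES = 1.75  # height of one rack unit (44.45 mm), per the article


def mounting_height_in(u_count: int) -> float:
    """Height of the usable mounting space only, in inches."""
    return u_count * U_INCHES


def raised_height_in(u_count: int, coil_rise_in: float) -> float:
    """Mounting height plus the rise added by a bottom-mounted coil."""
    return mounting_height_in(u_count) + coil_rise_in


def fits_through(u_count: int, coil_rise_in: float, opening_in: float) -> bool:
    """True if the raised height clears an opening of the given height."""
    return raised_height_in(u_count, coil_rise_in) <= opening_in


ASSUMED_OPENING_IN = 80.0  # assumption: measure the actual doors and elevators

for rise in (12.0, 14.0):
    height = raised_height_in(42, rise)
    ok = fits_through(42, rise, ASSUMED_OPENING_IN)
    print(f"42U + {rise} in. coil = {height} in., fits: {ok}")
```

Even before the frame is counted, 42U of mounting space is 73.5 inches, so a 12-inch rise brings the total to 85.5 inches. That would not clear the assumed 80-inch opening.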
Another solution is 1-foot-wide IRC units that are installed between each high-density cabinet. This approach offers the most redundancy and is the simplest to maintain. It typically has multiple fans and can have multiple coils to improve reliability. Piping and power are from under the floor. This design also lends itself to low-load performance enhancements in the future. What’s more, this design usually has the lowest installed price.
On the flip side, it uses more floor space than the other approaches, with a footprint equal to half a server rack. It therefore allows a data center to go to high-density servers but limits the total number of computer racks that can be installed. Proponents of this design concede that this solution takes up space on the data center floor. They admit that data centers have gone to high-density computing for reduced footprint as well as for speed, but they contend that the mechanical cooling systems now need to reclaim some of the space saved.
Rear-door solutions are a good option where existing racks need more cooling capacity. But the design’s performance is more affected by daily operations than that of the other designs, because the door is opened whenever servers are installed or removed. Facility executives should determine what happens to the cooling (and the servers) when the rear door is opened.
No matter which configuration is selected, facility executives should give careful consideration to a range of specific factors:
Connections. These probably pose the greatest risk no matter which configuration is selected. Look at the connections carefully. Are they substantial enough to withstand physical abuse, such as data cables being pulled around them or workers stepping on them when the floor is open? The connections can be anything from clear rubber tubing held on with hose clamps to threaded brass connections.
Think about how connections are made in the site as well as how much control can be exercised over underfloor work. Are workers aware of the dangers of putting stresses on pipes? Many are not. What if the fitting cracks or the pipe joint leaks? Can workers find the proper valve to turn off the leak? Will they even try? Does the data center use seal-tight electrical conduits that will protect power connections from water? Can water flow under the cables and conduits to the nearest drain or do the cables and conduits act like dams holding back the water and forcing it into other areas?
Valve quality. This is a crucial issue regardless of whether the valves are located in the unit, under the floor or in the CDU. Will the valve seize up over time and become inoperable? Will it always hold tight? To date, ball valves seem to be the most durable. Although valves are easy to take for granted, the ramifications of valve selection will be significant.
Servicing. Because everything mechanical will eventually fail, one must look at IRC units with respect to servicing and replacement. How easy will servicing be? Think of it like servicing a car. Is everything packed so tight that it literally has to be dismantled to replace the cooling coil? What about the controls? Can they be replaced without shutting the unit down? And are the fans (the component that most commonly fails) hard wired or equipped with plug connections?
Condensate Drainage. A water-cooled IRC unit is essentially a mini computer-room air conditioning (CRAC) unit. As such, it will condense water on its coils that will need to be drained away. Look at the condensate pans. Are they well drained, or are they flat, allowing deposits to build up? If condensate pumps are needed, what is the power source?
Some vendors are promoting systems that do sensible cooling only. This is good for maintaining humidity levels in the data center. If the face temperature of the cooling coil remains above the dew point temperature in the room, there will not be any condensation. The challenge is starting up a data center, getting it stabilized and then having the ability to track the data center’s dew point with all the controls automatically adjusting to maintain a sensible cooling state only.
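The dew point condition can be illustrated with a small calculation. This is a minimal sketch, not any vendor's control logic: it estimates dew point from room temperature and relative humidity using the Magnus approximation with one commonly used coefficient set, then checks whether a coil surface temperature stays above it. The function names and the 1-degree safety margin are assumptions made for illustration.

```python
import math

# Magnus approximation coefficients (one commonly used set, temperatures in deg C).
MAGNUS_B = 17.62
MAGNUS_C = 243.12


def dew_point_c(dry_bulb_c: float, relative_humidity_pct: float) -> float:
    """Approximate dew point in deg C from air temperature and relative humidity."""
    if not 0 < relative_humidity_pct <= 100:
        raise ValueError("relative humidity must be in (0, 100]")
    gamma = (math.log(relative_humidity_pct / 100.0)
             + MAGNUS_B * dry_bulb_c / (MAGNUS_C + dry_bulb_c))
    return MAGNUS_C * gamma / (MAGNUS_B - gamma)


def is_sensible_only(coil_surface_c: float, dry_bulb_c: float,
                     relative_humidity_pct: float, margin_c: float = 1.0) -> bool:
    """True if the coil surface stays above the dew point plus a margin.

    The margin is a hypothetical safety allowance, not a published value.
    """
    return coil_surface_c >= dew_point_c(dry_bulb_c, relative_humidity_pct) + margin_c


# Example room condition: 22 deg C at 50 percent relative humidity.
room_dew_point = dew_point_c(22.0, 50.0)
print(f"dew point: {room_dew_point:.1f} C")
print("14 C coil sensible only:", is_sensible_only(14.0, 22.0, 50.0))
print("10 C coil sensible only:", is_sensible_only(10.0, 22.0, 50.0))
```

At that room condition the dew point is roughly 11 degrees C, so a 14-degree coil surface stays dry while a 10-degree surface would condense. A real control system would have to repeat this check continuously as room conditions drift.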
Power. Data centers do not have enough circuits to wire the computers and now many more circuits are being added for the IRC units. What’s more, designs must be consistent and power the mechanical systems to mimic the power distribution of computers. What is the benefit of having 15 minutes of battery back-up if the servers go out on thermal overload in less than a minute? That being the case, IRC units need to be dual power corded as well. That criterion doubles the IRC circuit quantities along with the associated distribution boards and feeders back to the service entrance.
Before any of the specifics of IRC unit selection really matter, of course, facility executives have to be comfortable with water in the data center. Many are still reluctant to take that step. There are many reasons:
- There’s a generation gap. Relatively few professionals who have experience with water-cooled processors are still around.
- The current generation of operators has been trained so well to keep water out of the data center that the idea of water-cooled processors is beyond comprehension.
- There is a great perceived risk of making water connections in and around live electronics.
- There is currently a lack of standard offerings from the hardware manufacturers.
The bottom line is that water changes everything professionals have been doing in data centers for the last 30 years. And that will create a lot of sleepless nights for many data center facility executives.
Before You Dive In
Traditionally, data centers have been cooled by computer-room air conditioning (CRAC) units via underfloor air distribution. Whether a data center can continue using that approach depends on many factors. The major factors include floor height, underfloor clutter, hot and cold aisle configurations, loss of air through tile cuts and many more, too numerous to list here.
Generally speaking, the traditional CRAC concept can cool a reasonably designed and maintained data center averaging 4 kW to 6 kW per cabinet. Between 6 kW and 18 kW per cabinet, supplementary fan assist generally is needed to increase the airflow through the cabinets.
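These load bands can be written down as a rough screening rule. The sketch below is only an illustration of the article's rules of thumb: it combines the 6 kW and 18 kW figures above with the earlier observation that the need for IRC typically arises above 10 kW per cabinet. The function name, the option labels and the handling of the exact boundary values are assumptions, and treating loads above 18 kW as IRC-only is an inference rather than something the article states outright.

```python
from typing import List


def cooling_options(kw_per_cabinet: float) -> List[str]:
    """Return candidate cooling approaches for an average cabinet load.

    Thresholds follow the article's rules of thumb; boundary handling
    and labels are illustrative assumptions.
    """
    if kw_per_cabinet < 0:
        raise ValueError("load must be non-negative")
    options: List[str] = []
    if kw_per_cabinet <= 6:
        options.append("CRAC with underfloor air")
    elif kw_per_cabinet <= 18:
        options.append("CRAC plus fan assist")
    if kw_per_cabinet > 10:
        # The article places the typical need for IRC above 10 kW per cabinet.
        options.append("in-row cooling")
    return options


for load in (5, 8, 12, 25):
    print(load, "kW:", cooling_options(load))
```

Between 10 kW and 18 kW the rule returns both fan assist and in-row cooling, reflecting that the two ranges overlap in the text. Site conditions such as floor height and underfloor clutter would decide between them.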
The fan-assist technology comes in many varieties and has evolved over time.
• First there were the rack-mounted, 1-U type of fans that increase circulation to the front of the servers, particularly to those at the top of the cabinet.
• Next came the fixed muffin fans (mounted top, bottom and rear) used to draw the air through the cabinet. Many of these systems included a thermostat to cycle individual fans on and off as needed.
• Later came larger rear-door and top-mounted fans of various capacities integrated into the cabinet design to maximize the air flow evenly through the entire cabinet and in some cases even to direct the air discharge.
All these added fans add load to the data center and particularly to the UPS. To better address this and to maximize efficiencies, the latest fan-assist design utilizes variable-speed fans that adjust airflow rates to match the needs of a particular cabinet.
Until recently, manufacturers did not include anything more than muffin fans with servers. In the past year, this has started to change. Server manufacturers are now starting to push new solutions out of research labs and into production. At least one server manufacturer is now utilizing multiple variable turbine-type fans in their blade servers. These are compact, high air volume, redundant and part of the manufactured product. More of these server-based cooling solutions can be expected in the coming months.
— Dennis Cronin
Dennis Cronin is principal for Gilbane Building Company’s Mission Critical Center of Excellence. His 30 years of data center experience includes work in corporate IT and corporate real estate and as an analyst of data center failures. He is a founder and original sponsor of 7x24 Exchange.