Data Center Airflow Management Considerations with Liquid Cooling
Data Center Airflow Management Considerations with Liquid Cooling has got to be the shortest blog of all time. After all, if we’re liquid cooling, what airflow management considerations might we be facing? Liquid cooling does eliminate much of the fuss with airflow, but there may still be some airflow elements requiring attention. Actually, my experience with this subject goes back nearly four decades. Our product development group office was upstairs in an ancient factory, across the wall from the room where the old System 3081 lived. Sharing that room with the computer were the desks of the computer guys (don’t recall the term IT floating around freely yet) and the MRP guys. The head of the computer department was a smoker, so he would open a window for smoke breaks and our airflow management (desk fan) would evacuate most of the cigarette smoke from the liquid-cooled mainframe computer room. Such was my introduction to the convergence of liquid cooling and airflow management. There are various takes on liquid cooling, some of which involve a little more thinking about airflow management than others.
Data centers utilizing liquid immersion cooling, on the surface, would seem to be the last place where airflow management would be a concern, but sometimes there may be some recalcitrant air molecules requiring herding. If the data center, or at least the particular computer room, is nothing but servers and they are all immersed, there may not be much airflow to manage, particularly if the immersion medium is a single-phase coolant. However, if a two-phase coolant medium is being used, that is to say heat is ultimately removed through evaporation, there is a variety of safety regulations that at least nudge up against airflow management. For example, OSHA set the exposure limit for mineral oil mist at 5 mg per cubic meter averaged over an eight-hour working shift, which might at the least suggest the value of exhaust hoods and some managed refresh volumes. In addition, spinning media is not compatible with liquid immersion cooling, so if the site includes non-solid state storage, there will be at least some requirement for air cooling in conjunction with the immersion cooling. Under normal circumstances, some mass storage racks are problematic for the effectiveness of the overall airflow management scheme for a data center. In fact, in order to keep some storage from completely undermining the effectiveness of a data center airflow management plan, those racks need to be integrated into a custom hot aisle containment configuration or have all the equipment relocated into a chimney-compatible cabinet. If that storage is the only active equipment in an otherwise immersion-cooled space, then it is suddenly much less problematic. If there is nothing for the mass storage to mess up, then the airflow management game is merely to ensure it is getting enough cool air and that the waste air is being diverted away from supply inlets.
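To make the eight-hour averaging in that OSHA limit a little more concrete, here is a minimal sketch of a time-weighted average calculation. The function name and the sample readings are my own invention for illustration and do not come from any real site survey; the only number taken from the discussion above is the 5 mg per cubic meter limit.

```python
# Hypothetical sketch: 8-hour time-weighted average (TWA) exposure.
# Sample concentrations below are invented for illustration only.

EXPOSURE_LIMIT_MG_M3 = 5.0  # limit cited in the text, averaged over 8 hours
SHIFT_HOURS = 8.0


def eight_hour_twa(samples):
    """Return the 8-hour TWA in mg/m3.

    samples: list of (concentration_mg_m3, duration_hours) tuples.
    Any part of the shift not covered by a sample counts as zero exposure.
    """
    total_hours = sum(hours for _, hours in samples)
    if total_hours > SHIFT_HOURS:
        raise ValueError("samples cover more than one 8-hour shift")
    return sum(conc * hours for conc, hours in samples) / SHIFT_HOURS


# Three invented sampling periods, including one short spike above the limit.
samples = [(2.0, 3.0), (9.0, 1.0), (1.0, 4.0)]
twa = eight_hour_twa(samples)
print(f"8-hour TWA: {twa:.2f} mg/m3, within limit: {twa <= EXPOSURE_LIMIT_MG_M3}")
```

The point of the exercise is that a short excursion above the limit can still average out under it, which is exactly why exhaust hoods and managed air refresh rates become part of the conversation.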
Some liquid cooling may only be liquid cooling in the eye of the beseller and not necessarily in the eye of the beholder. For example, fifteen years ago, vendors of in-row cooling were calling it liquid cooling while astute observers were calling it close-coupled cooling. By the same token, sellers against in-row cooling would sometimes call it liquid cooling and therefore tout all the dangers of having water in the data center; on the other hand, those same competitive sellers would say it was not really liquid cooling and the small in-row fans were not efficient air movers. It’s a testament to in-row marketing that this quasi-liquid cooling solution established such a solid foothold in the marketplace when it got thunked equally for what it wanted to be as well as what it tried not to be. Regardless, airflow management is important to this incarnation of liquid cooling and the vendors realized this with containment systems marketed as critical accessories to the in-row coolers.
Likewise, rear door heat exchangers were introduced as liquid cooling despite still relying on fans to pull cold air into servers and push hot air out of servers. Nevertheless, rear door heat exchangers get water or refrigerant very close to the heat load in each rack and push the envelope a little further along that close-coupled/liquid cooling continuum. Rear door heat exchangers can be either passive or active and as a result have somewhat different airflow management considerations. In passive rear door heat exchangers, the rear door itself, at some higher volume of airflow, will add measurable head pressure load on the server fans. This pressure head makes the sealing of all potential air passages between the front and rear of the cabinet that much more critical – no longer as a potential bypass path, but rather as a likely recirculation path, depending on the airflow volume and the impedance of the door. An associated element of this airflow management consideration is to either monitor pressure inside the cabinet or closely track rack-face temperatures for possible evidence of pressure-induced recirculation. On the other side of the tipping point of these metrics is a choice of redistributing load or adding active rear doors with fans to help overcome the door impedance. With active rear door heat exchangers, the absolute seal between the front and rear of the rack is less critical and the actual fan energy penalty of the fans on the rear door heat exchanger may be at least partially offset by lowered server fan energy. WARNING: Good for the data center energy bill, but not so good for PUE! [1]
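For the passive rear door case, tracking rack-face temperatures for signs of recirculation could be as simple as comparing each inlet sensor against the supply air temperature. The sketch below is only an illustration: the function name, the sensor positions, the readings and the 3 °C threshold are all assumptions of mine, and a real site would pick its own threshold from its own baseline.

```python
# Hypothetical sketch: flag rack-face sensors that suggest recirculation.
# Threshold and readings are illustrative assumptions, not recommendations.


def flag_recirculation(supply_temp_c, inlet_temps_c, max_rise_c=3.0):
    """Return sensor positions whose inlet temperature exceeds the
    supply temperature by more than max_rise_c degrees Celsius."""
    return [
        position
        for position, temp_c in inlet_temps_c.items()
        if temp_c - supply_temp_c > max_rise_c
    ]


# Invented readings for one cabinet behind a passive rear door.
readings = {"bottom": 22.5, "middle": 23.0, "top": 27.5}
suspect = flag_recirculation(supply_temp_c=22.0, inlet_temps_c=readings)
print("Possible recirculation at:", suspect)
```

A warm sensor at the top of the rack face is the classic symptom: exhaust air pushed back around or over the equipment by the added door impedance.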
There are a couple of different partial direct contact liquid cooling architectures, each with its own airflow management considerations. In this category of liquid cooling, we will have some kind of cold plate attached to microprocessors and frequently to other significant heat generators such as memory. In one variation, liquid (refrigerant or water) is pumped to and from the cold plates and through a liquid-to-liquid heat exchanger. That heat exchanger removes the heat to a chiller or some other form of heat rejection, and it can accept cooler water from a chiller or economizer or even water from a warm water source, since these liquid temperatures can be much higher than what is required to support air cooling. In another variation, the cold plates are connected to some kind of radiator with fins much larger than a normal heat sink, located at the edge of the motherboard or somewhere else within the server shell. The increased surface area of these fins allows for much higher heat rejection at lower airflow volumes than we can get with heat sinks sized to fit on the component and in the shell. With the cold plates connected to a liquid-to-liquid heat exchanger, our liquid cooling will be removing anywhere from 60% up to 86% of the total heat load in the server, meaning that air will still be working on that remaining 40% to 14%. Everything we know about airflow management (plugging all the holes in the raised floor, between the front and the rear of the cabinet and within the rows of cabinets) is still required to minimize wasted capacity on that remaining air-cooled load. With the cold plates connected to larger heat rejection fins, we will be pulling more air through the cabinet than with the first direct contact option, but the larger fins should allow us to operate at a higher delta T, resulting in less airflow required than with standard component-attached heat sinks.
Regardless, all best practice airflow management disciplines need to be employed to protect the efficiencies gained from this variation of liquid cooling.
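To put rough numbers on that remaining air-cooled load, the sketch below applies the standard sensible heat relationship (airflow = heat load / (air density × specific heat × delta T)) to the 60% and 86% capture figures above. The 10 kW rack, the 11 K delta T and the sea-level air properties are assumptions chosen for illustration, and the function names are mine.

```python
# Hypothetical sketch: airflow still needed after liquid cooling takes its share.
# Assumes standard sea-level air; rack load and delta T are invented examples.

AIR_DENSITY_KG_M3 = 1.2            # approximate, sea level
AIR_SPECIFIC_HEAT_J_KG_K = 1005.0  # approximate, dry air
M3_PER_S_TO_CFM = 2118.88


def residual_air_load_w(total_load_w, liquid_capture_fraction):
    """Heat (W) left for air after the cold plates remove their fraction."""
    return total_load_w * (1.0 - liquid_capture_fraction)


def required_airflow_cfm(air_load_w, delta_t_k):
    """Airflow (CFM) needed to carry air_load_w at a given air temperature rise."""
    m3_per_s = air_load_w / (AIR_DENSITY_KG_M3 * AIR_SPECIFIC_HEAT_J_KG_K * delta_t_k)
    return m3_per_s * M3_PER_S_TO_CFM


rack_load_w = 10_000.0  # assumed rack load
delta_t_k = 11.0        # assumed air temperature rise (about 20 degrees F)

for capture in (0.0, 0.60, 0.86):
    air_w = residual_air_load_w(rack_load_w, capture)
    cfm = required_airflow_cfm(air_w, delta_t_k)
    print(f"liquid capture {capture:.0%}: {air_w:,.0f} W to air, about {cfm:,.0f} CFM")
```

Because airflow scales inversely with delta T, the same function also shows why the larger-fin variation needs less air than standard heat sinks: raise delta_t_k and the required CFM drops proportionally. Either way, every CFM lost to bypass or recirculation is a bigger share of a smaller air budget.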
There is also a hybrid combination of rear door heat exchangers and direct contact liquid cooling. I have not seen this deployed anywhere as often as the benefits might suggest, probably due to the complexity of balancing flows and pressure between the large rear door radiator and the small cold plate pumps and piping. Theoretically, since both systems can work effectively with warmer water, this hybrid could eliminate any semblance of a mechanical plant. The rear door heat exchanger can operate off warm water from the building (AC return loop, for example) and then the direct contact liquid cooling cold plate can operate off the rear door heat exchanger return loop. Since the rear door is seeing only 14-40% of the total heat load and associated airflow, the door impedance is not an issue for the server fans and, therefore, the server fans are the only mechanical component of the entire system (plus some small pumps operating at fractions of a watt). In this hybrid architecture, some care should be directed to sealing between the front and the rear of the cabinet, but otherwise airflow management is not a particularly critical consideration. The degree to which IT equipment not equipped with direct contact liquid cooling is introduced into this hybrid space would obviously affect the degree to which further attention should be paid to airflow management.
Finally, there is a liquid cooling solution wherein every heat generating component on the server motherboard is in one way or another connected to a cold plate which is then connected to a chiller or a liquid-to-liquid heat exchanger or an economizer or any combination thereof. The airflow management considerations for this variation of liquid cooling are rather straightforward: If everything in the data center is so equipped, the only airflow management consideration is for the comfort cooling of the random human wandering through the space. If, on the other hand, there are some cabinets of IT equipment not so equipped, such as the mass storage racks previously discussed, then we need to apply everything we know about airflow management best practices to keep supply air and return air from cohabitating.
Does the advent of liquid cooling in the data center industry signify that we airflow management dinosaurs can lumber off into the sunset with the knowledge that our work is done? That may be a little premature. For the time being, while we are still in early growth with shake-out still a ways down the road for liquid cooling, there will remain plenty of opportunities for disciplined airflow management to optimize the benefits of liquid cooling in the data center.
[1] I have discussed this in many previous pieces. Usually this trade-off is part of a discussion on utilizing ASHRAE allowable temperature limits and reaping mechanical plant benefits and increased economizer access benefits while server energy goes up a little due to higher fan speeds. Regardless, server fan energy can constitute anywhere from 1% up to 10% of the total data center energy draw, depending on PUE and inlet temperatures. Since server fans are included in the divisor of the PUE equation, we end up in these situations where we can raise the PUE and lower the energy usage or lower the PUE and raise the energy usage. This paradox, by the way, is part of the value proposition for liquid cooling vendors.
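A quick worked example may help with that paradox. PUE is total facility power divided by IT power, and server fans sit inside the IT number. In the sketch below every kilowatt figure is invented for illustration, and I am assuming that active rear door fans are counted as facility overhead rather than IT load.

```python
# Hypothetical sketch: total energy goes down while PUE goes up.
# All kW values are invented; rear door fans are assumed to count as overhead.


def pue(it_kw, overhead_kw):
    """PUE = total facility power / IT power."""
    return (it_kw + overhead_kw) / it_kw


# Baseline: passive doors, server fans working against door impedance.
base_it_kw = 950.0 + 50.0   # compute and storage + server fans
base_overhead_kw = 500.0    # cooling plant, power losses, lighting

# Active rear doors: server fans slow down, door fans are added to overhead.
active_it_kw = 950.0 + 20.0
active_overhead_kw = 500.0 + 15.0

base_total_kw = base_it_kw + base_overhead_kw
active_total_kw = active_it_kw + active_overhead_kw

print(f"baseline: {base_total_kw:.0f} kW total, PUE {pue(base_it_kw, base_overhead_kw):.3f}")
print(f"active:   {active_total_kw:.0f} kW total, PUE {pue(active_it_kw, active_overhead_kw):.3f}")
```

With these made-up numbers the site draws less total power with the active doors, yet the PUE gets worse, because the savings came out of the divisor.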
Data Center Consultant