Should Enterprise Owners Consider Liquid Cooling?
Should enterprise data center owners consider liquid cooling? The answer is, “Yes.” Should enterprise owners adopt liquid cooling for their data centers? The answer is, “Maybe.” Moving from “consideration” to “adoption” involves sorting through important financial, operational and technical questions. First, we need to clearly understand what we actually mean by liquid cooling. Then we’ll want to understand the effort and complexity required to get from where we are today to an end state of utilizing liquid cooling. Finally, we will want to understand whether we have a realistic probability of being satisfied with that end state.

At this point, I believe there is a general consensus in the industry on a few points: liquid is a better heat removal vehicle than air; liquid cooling will produce a significantly lower PUE than a sub-optimized air-cooled data center (and can sometimes provide an incremental benefit over a fully optimized air-cooled data center); and liquid cooling can effectively cool higher densities than air cooling.[1] There is already an abundance of literature addressing these advantages, so I will limit my discussion to those issues which may make liquid cooling either attractive or unattractive to the enterprise, depending on the constellation of variables for a particular situation.
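To put a rough number on the "better heat removal vehicle" point, here is a minimal sketch that compares the volumetric flow of air and of water needed to carry away the same heat at the same temperature rise, using Q = ρ · V̇ · c_p · ΔT. The fluid properties are approximate room-temperature values, and the 20 kW rack and 10 K rise are hypothetical figures chosen for illustration, not numbers from any particular facility.

```python
# Approximate fluid properties near room temperature (assumed round values).
AIR = {"density_kg_m3": 1.2, "cp_j_kg_k": 1005.0}
WATER = {"density_kg_m3": 998.0, "cp_j_kg_k": 4186.0}


def volumetric_flow_m3_s(heat_w, delta_t_k, fluid):
    """Flow needed to carry heat_w with a fluid temperature rise of delta_t_k.

    Rearranged from Q = rho * V_dot * cp * dT.
    """
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    return heat_w / (fluid["density_kg_m3"] * fluid["cp_j_kg_k"] * delta_t_k)


rack_heat_w = 20_000.0  # hypothetical 20 kW rack
delta_t_k = 10.0        # hypothetical 10 K rise, same for both fluids

air_flow = volumetric_flow_m3_s(rack_heat_w, delta_t_k, AIR)
water_flow = volumetric_flow_m3_s(rack_heat_w, delta_t_k, WATER)

print(f"air:   {air_flow:.2f} m^3/s")
print(f"water: {water_flow * 1000:.2f} L/s")
print(f"air needs roughly {air_flow / water_flow:.0f}x the volume flow of water")
```

With these assumed properties, water moves the same heat with a few thousand times less volume than air. That gap is the physical basis for the density advantage. It says nothing yet about cost or complexity, which is what the rest of this discussion is about.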
‘Liquid Cooling’

First, let me limit my definition of liquid cooling for the purposes of the rest of this discussion. I am talking about water or some type of dielectric. You say that goes without saying? For some, perhaps it does, but consider the equation for convection heat transfer: one of the elements of that equation is the temperature of the fluid, and in our data center universe that fluid is almost always air. Therefore, we will keep in mind that we are talking about liquid as one of the states of matter, and specifically that the liquid will typically be water or a dielectric.

Furthermore, I am limiting my definition to circumstances where conductive heat transfer is taking place directly at the heat source and is integral to the liquid heat removal mechanism. That definition thereby excludes various close-coupled cooling solutions like rear door heat exchangers, whereby heat is removed from heat sources by convection heat transfer (air) before that heat is transferred to liquid. Nevertheless, close-coupled cooling can play an important role in some liquid cooling environments.
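For reference, the two heat transfer relations behind this definition are usually written as follows. The symbols are the conventional textbook ones, not notation taken from this article: q is the heat transfer rate, h the convective heat transfer coefficient, k the thermal conductivity of the conducting material, A the contact or surface area, L the conduction path length, T_s the surface temperature, T_f the fluid temperature, and T_hot, T_cold the temperatures at the two ends of the conduction path.

```latex
% Convection (Newton's law of cooling): heat leaves a surface into a moving fluid.
q_{\mathrm{conv}} = h \, A \, (T_s - T_f)

% Conduction (Fourier's law, one-dimensional steady state): heat moves through a solid.
q_{\mathrm{cond}} = \frac{k \, A \, (T_{\mathrm{hot}} - T_{\mathrm{cold}})}{L}
```

The "fluid" in the convection relation can be air or a liquid. The definition used here applies only where the heat source is coupled to the liquid loop by conduction, without an intermediate air step.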
Variations of Liquid Cooling

Even with the above limitations, there are still some variations in liquid cooling, and they will bear on the other assorted variables that are part of the decision process. In general, we can look at three primary categories of liquid cooling:

1. Immersion (typically a dielectric). Since various economic, health and environmental concerns have surfaced regarding two phase immersion (basically relying on evaporation to remove the heat), I will limit this discussion to single phase dielectric immersion, which relies on a heat exchanger between the dielectric and some heat rejection mechanism (condenser, tower, etc.), so that on the other side of the heat exchanger, the mechanical plant may look similar to that with which we are more familiar, perhaps minus a chiller.

2. Partial direct contact. In partial direct contact, cold plates replace the heat sinks attached to major heat sources (microprocessor, memory). Liquid is pumped through the cold plates to remove heat to a heat exchanger, on the other side of which heat is rejected, similar to a design with CRAHs, chiller plant and tower, perhaps eliminating the chiller via warm water cooling. Some air cooling is still utilized for the remaining heat within each server. This is where those close-coupled cooling solutions, such as rear door heat exchangers, could have a role. If an adequate volume of warm water is available, such as the return loop on a building cooling system, it is possible to cool the load not addressed by the partial direct contact liquid cooling solution without CRAHs or other data center air moving hardware.

3. Full direct contact. In full direct contact liquid cooling, a highly conductive plate is connected to every heat source within the server via some form of little conductors. Liquid then removes the heat from the plate either to a heat exchanger or perhaps directly to a heat rejection mechanism.
What to Consider Before Making the Move

The effort and complexity required to move from today’s condition to a future condition of a liquid cooled data center will vary depending on what today’s condition is and on any IT hardware limitations. If today’s condition is a piece of empty pasture land, then the effort and complexity will be less than converting a traditional space. For example, a greenfield project avoids the hassle of a staged decommissioning of DX coolers or managing parallel non-compatible mechanical plants.

The greater complexity, however, frequently arises from IT hardware considerations. Many of the liquid cooling solutions providers have aligned in one way or another with IT equipment providers to make available IT equipment that either has the liquid cooling elements integrated directly into the hardware or is at least customized to accommodate a particular liquid cooling solution. In this case, mitigating complexity ties the enterprise to single source cooling and computing platforms. Single sourced IT can be avoided by adding the complexity of integrating the cooling elements on site into the various IT platforms (direct contact) or making the requisite modifications to the IT equipment (immersion). The cooling vendors will typically be of assistance in this effort, but the exercise will be ongoing whenever new equipment is brought into the data center. At this point, single-sourcing a liquid cooling vendor is pretty much unavoidable due to the absence of any kind of relevant standard covering interoperability.

Further complications arise from the mix of IT equipment that remains incompatible with liquid cooling, resulting in the need to manage and maintain two separate and independent cooling systems. If air cooling is being provided by DX cooling units, these will be distinctly separate systems.
If air cooling is being provided by chilled water cooling units, the liquid cooling heat rejection loop can be integrated into the existing infrastructure without much difficulty. Regardless, there will be some exceptions to the liquid cooling initiative that will need to be addressed. For example, spinning media is not compatible with immersion cooling and neither is communications equipment with fiber optic connectivity. Workarounds could be upgrades to solid state storage and downgrades to copper connectivity, but there will still be some equipment that the enterprise owner will not want to abandon and that will not be compatible with the liquid cooling deployment of choice. Furthermore, partial direct contact liquid cooling, by definition, will require some amount of air cooling for the internal components without cold plate connections.

The full direct contact liquid cooling approach could conceivably represent a path to full liquid cooling, with some intensified degree of difficulty and complexity. I suppose it would be possible to have a custom heat transfer plate made up to fit each piece of equipment on a data floor bill of material, along with a roadmap for all the little heat conductors. However, a somewhat lower effort project would result from specifying single source servers with integrated plates and conductors, and then as much ancillary equipment as possible for which the cooling vendor had already designed and developed custom plates, with the hope that there would be very little remaining effort and complexity for further IT equipment customization. If the available IT equipment options fulfilled the enterprise IT mission, this could be a reasonable path; if not, then any liquid cooling implementation would remain a hybrid design.
From Consideration to Adoption

Finally, the goal of the project itself will bear on crossing that bridge from consideration to adoption of liquid cooling. If the goal is to allow for further expansion of data center capacity when there is a relatively insurmountable restriction on mechanical plant capacity, liquid cooling can represent a path to adding that capacity. Equipment committed to immersion cooling will add zero additional CFM requirement to existing cooling equipment, though associated heat exchangers could add some load on chillers and condensers. The direct contact approaches, by virtue of being effective with warm water supply, can operate off the return loop of an existing chilled water system, thereby adding no volume demand while increasing the ΔT on the chiller. Depending on baseline operating conditions, incremental increases in that ΔT will translate to higher chiller efficiency for some steps in IT load increases before an effective chiller load capacity ceiling is reached. Increased IT load handled by full direct contact liquid cooling will obviously not add any CFM demand to existing air-movers. Likewise, if partial direct contact liquid cooling is accompanied by a close-coupled cooling solution such as rear door heat exchangers, there will be no additional CFM demand on existing air-movers.

If the project is to either upgrade an existing data center or build a new data center to support certain high speed or high density applications, one of the liquid cooling approaches will likely be beneficial. High frequency trading and blockchain applications were early adopters and have benefited from liquid cooling. Emerging artificial intelligence and accelerators for latency-sensitive applications represent viable candidates for liquid cooling.
With the growing availability of solid state drives and the advent of sealed helium high density, high speed read/write heads, super high density storage has also become a reasonable candidate for liquid cooling.

If, on the other hand, the goal of the project is to achieve a lower PUE, liquid cooling may not always represent the best return on investment. If we are starting from that open pasture, we can be accommodating about specifying IT equipment that will minimize or eliminate the need for air movement energy, and there are some advantages to be obtained from minimizing the white space footprint, then liquid cooling may be a reasonable vehicle to take us to that low PUE destination. However, if we are looking at upgrading or retrofitting an existing space to drive down the PUE, it will behoove us to take a close look at the investment required to overlay liquid cooling on the existing facility versus optimizing the existing facility with improved airflow management and access to free cooling. Depending on the circumstances, the projected PUEs could be relatively close, but the payback could be quicker for the optimization project. The point is: some analysis and planning would be in order.

If the objective of the project is to install a data center in some space not intended for a data center, such as an inhabited high rise office building, liquid cooling may solve some of the associated issues. An alternative to adding a chiller plant on the roof of a multi-tenant building (good luck!), or to taking up floor space with DX cooling and adding some heat rejection capability on that same multi-tenant building rooftop, could be implementing a liquid cooled computer room operating off the office chilled water return loop. Depending on the size of the building, the incremental load could be totally transparent to the mechanical plant. In addition, we would be adding no (or very little) noise pollution from fans.
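Whether the incremental load really is "transparent" to a building plant can be roughly checked with ΔT = Q / (ṁ · c_p). The sketch below is a back-of-the-envelope estimate only. It assumes all of the computer room heat ends up in the full return loop flow, uses approximate water properties, and takes a hypothetical 60 L/s building loop and hypothetical IT loads. A real design would need the plant's actual flow rates, operating temperatures, and remaining chiller capacity.

```python
WATER_DENSITY_KG_M3 = 998.0  # approximate
WATER_CP_J_KG_K = 4186.0     # approximate


def return_loop_rise_k(it_load_w, loop_flow_l_s):
    """Temperature rise added to a water loop that absorbs it_load_w.

    Assumes the whole load is rejected into the full loop flow.
    """
    if loop_flow_l_s <= 0:
        raise ValueError("loop_flow_l_s must be positive")
    mass_flow_kg_s = loop_flow_l_s / 1000.0 * WATER_DENSITY_KG_M3
    return it_load_w / (mass_flow_kg_s * WATER_CP_J_KG_K)


loop_flow_l_s = 60.0  # hypothetical building return loop flow

for it_load_kw in (50, 100, 250):  # hypothetical computer room loads
    rise_k = return_loop_rise_k(it_load_kw * 1000.0, loop_flow_l_s)
    print(f"{it_load_kw:>4} kW -> +{rise_k:.2f} K on the return loop")
```

Under these assumptions, a load of around 100 kW raises the return temperature by well under 1 K. A rise that small is consistent with the load being barely noticeable on a large plant, but the same load on a much smaller loop would be a different story.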
Conclusion

Liquid cooling has been successfully adopted in scientific, academic, research, super high density and associated HPC applications. Adoption by the enterprise has been much slower, though that does not mean it cannot be a good fit. Specifically, for enterprise applications that more closely resemble some of the early adopter industry segments in terms of need for speed or density, liquid cooling may be a viable option. Nevertheless, there may be some restrictions on IT equipment, and there will more than likely be some added complexity in overlaying a liquid cooling solution on the many variations of installed mechanical plants. Regardless, even without a standard for platform interoperability, the liquid cooling market offering has reached a level of maturity where it should at least be on the radar for consideration for enterprise data centers.

[1] Granted, different people are going to drive that stake into the ground in different places. For example, there is an Intel data center in Northern California where 40+ kW racks are very effectively being air-cooled with a PUE of less than 1.1.
Data Center Consultant