Best Practices When Creating a Data Center Monitoring Scheme
Today, we’re going to skip an intro full of statistics and trends around IT updates. In fact, we’re going to very specifically focus on your data center monitoring practices.
I’ve had the chance to work with a wide variety of data center environments. And yes, they are all different. Some power HPC workloads, while others run cloud workloads. Then, there are those that act as storage silos and repositories, and those that are private colocation partners. Each might have different requirements, gear, and layouts.
Still, the critical nature of the modern data center requires optimal monitoring mechanisms. Here are some collected best practices:
Key monitoring parameters for environmental monitoring:
- Temperature. This is a given within any environment. Under no circumstances should a server, or a server rack, be allowed to operate outside of its rated temperature range. Make sure to regularly check ASHRAE guidance for recommended operating temperatures. They do change. To get even more detailed, administrators look at rack exhaust metrics, internal temperatures, and even server temperatures. The more visibility there is into the temperature control mechanisms in place, the quicker a response engineer can address issues before they become serious problems.
- Humidity and water control. Just like temperature, humidity must be monitored for the critical systems within a data center. There are multiple ways to examine the humidity within an environment. Generally speaking, that means measuring levels inside the rack and outside of the rack. Levels should be kept steady in all circumstances, and environments requiring fast reaction times should consider deploying multiple sensors in various strategic locations.
- Aisle environmental controls. This means temperature, humidity, airflow, and hot/cold aisle monitoring. Depending on the size of the environment, hot/cold aisles may be present. Watching the temperature ranges in these data center aisles can help administrators spot problems very quickly and improve efficiency.
- Static electricity. Ambient static electricity sensors show whether something, or someone, has entered the facility carrying a large static charge. Static electricity can be very harmful to a data center environment, so managing these sensors is important.
- Data center access. From a security perspective, many organizations are deploying data center environment and rack entry sensors. These sensors will alert the proper personnel if a rack has been improperly entered. More advanced environments will actually activate a camera system pointing to the exact rack where the cage has been opened.
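To make the environmental parameters above a little more concrete, here is a minimal sketch of a range check on sensor readings. Everything in it is an assumption for illustration: the metric names, the `Limit` type, the `check_reading` helper, and especially the numeric limits, which are placeholders and not ASHRAE figures. Pull real limits from current ASHRAE guidance and your equipment specifications.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Limit:
    """Allowed range for one monitored metric."""
    low: float
    high: float


# Placeholder limits for illustration only (assumed values, not a standard).
LIMITS: Dict[str, Limit] = {
    "rack_inlet_temp_c": Limit(18.0, 27.0),
    "relative_humidity_pct": Limit(20.0, 80.0),
}


def check_reading(metric: str, value: float,
                  limits: Dict[str, Limit] = LIMITS) -> Optional[str]:
    """Return an alert message if the value is out of range, else None."""
    limit = limits[metric]
    if value < limit.low:
        return f"{metric}={value} below {limit.low}"
    if value > limit.high:
        return f"{metric}={value} above {limit.high}"
    return None


if __name__ == "__main__":
    print(check_reading("rack_inlet_temp_c", 22.0))
    print(check_reading("rack_inlet_temp_c", 30.0))
```

Keeping the limits in one table, separate from the check itself, makes it easier to update them when the guidance changes.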
Working with sensors and sensor technologies:
- Durability. Large data centers have come to rely on sensors within their environments to provide some of the most important data regarding their data center health. With that in mind, remember this rule: no single sensor is ever guaranteed to work forever. Sensor failures do, and will, occur at any level. This is why it’s so important to have a redundant sensor environment. Intelligent data center monitoring tools will actually observe all of the sensors in the environment. Automated recovery procedures can be configured to look at multiple sensors at the same time just in case one has failed. This will help eliminate false positives when a sensor fails. With so many sensor points within a large data center, administrators must have the proper alerting mechanism in place. If a sensor fails, the right person must be notified immediately. The same applies if the device starts to post incorrect information or triggers false alarms. Proactive testing and maintenance of a data center sensor environment will help lessen the chance that a failure will occur. Still, a good management system will help alleviate headaches when it comes to having a faulty sensor. Remember, today’s enterprise monitoring systems are built to last. Administrators rely on this data to make very important decisions revolving around data center environmental information. Having a redundant sensor architecture will limit the impact of a failed sensor in any one portion of a data center.
- Sensor placement. When deploying data center sensors, it’s very important to take the environment size into consideration. Since each environment is unique, there aren’t too many tools which can “auto” place sensors for you. This is where a good partner can really help out. HVAC professionals and data center monitoring/environmental design experts can help an organization plan out the best strategy for sensor deployment. Still, from a high-level perspective, there are four major areas where an administrator should consider deploying sensors. These include:
- Rack-level monitoring. At a minimum, sensors should be located at the top of the rack to monitor exiting hot air and at the bottom of the rack to monitor floor cooling metrics. For more information and redundancy, administrators may deploy additional sensors within a rack.
- Ambient room monitoring. This is where room humidity and temperature sensors are important. For large environments, it’s recommended to place a sensor in a hot zone, or in the area farthest from the cooling unit.
- Computer room air conditioning/handler monitoring. These sensors will help identify immediate failures with the cooling unit. They should be placed somewhere near the AC device.
- Wetness monitoring. Depending on the data center environment, it is recommended to place leak sensors around the outside walls of the server room as well as beneath the raised floor. To detect wetness coming from cooling units, place water sensors around each unit to monitor for possible water leaks. Take extra precautions if you have liquid-cooled systems.
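The redundancy idea from the durability point above can be sketched in a few lines: read several sensors covering the same zone, take the median, and flag any sensor that is silent or disagrees with its peers. This is one possible approach and not how any particular monitoring product works. The `consensus` function, the sensor IDs, and the `max_deviation` tolerance are all hypothetical names and values chosen for the example.

```python
from statistics import median
from typing import Dict, List, Optional, Tuple


def consensus(readings: Dict[str, Optional[float]],
              max_deviation: float = 3.0) -> Tuple[Optional[float], List[str]]:
    """Combine redundant readings for one zone.

    Returns (agreed value, suspect sensor ids). A sensor is suspect if it
    reported nothing (None) or differs from the median of the reporting
    sensors by more than max_deviation (an assumed tolerance).
    """
    # Sensors that reported nothing are suspect right away.
    suspects = [sid for sid, v in readings.items() if v is None]
    live = {sid: v for sid, v in readings.items() if v is not None}
    if not live:
        return None, suspects

    # Flag sensors that stray too far from the median of their peers.
    mid = median(live.values())
    for sid, v in live.items():
        if abs(v - mid) > max_deviation:
            suspects.append(sid)

    # Report the median of the sensors that are still trusted.
    trusted = [v for sid, v in live.items() if sid not in suspects]
    return (median(trusted) if trusted else None), suspects


if __name__ == "__main__":
    zone = {"top": 24.0, "mid": 24.5, "bottom": 60.0, "spare": None}
    value, suspects = consensus(zone)
    print(value, suspects)
```

A median needs at least three reporting sensors before it can outvote a single bad one. With only two that disagree, this sketch flags both and returns no value, which is a reasonable cue to send someone to look.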
Data center requirements will always be unique to the use-case the data center serves. Creating good monitoring practices requires an understanding of the business, the requirements of the data center, and future demands. Depending on your specific use-case, you might require additional monitoring in sensitive areas. Similarly, sensitive security areas may require more physical monitoring. Remember to design around your requirements, supporting both your data center and the business.
CTO, MTM Technologies