Data center availability is vital, and during a cooling system failure the inlet temperature of the IT equipment rises rapidly until it reaches a threshold beyond which the equipment throttles or shuts down from overheating. It is therefore especially important to understand failures and their effects. This study presented an experimental investigation and analysis of a facility-level cooling system failure scenario in which a chilled water interruption was introduced to the data center. Quantitative instrumentation, including wireless temperature and pressure sensors, was used to measure the discrete air inlet temperature and the pressure differential through the cold aisle enclosure, respectively. In addition, Intelligent Platform Management Interface (IPMI) data and cooling system data during failure and recovery were reported. Furthermore, the IT equipment performance and response in opened and contained environments were simulated and compared. Finally, an experiment-based analysis of the Ride Through Time (RTT) of servers during a chilled water interruption of the cooling infrastructure was presented. The results showed that, for all three classes of servers tested during the cooling failure, cold aisle containment (CAC) helped keep the servers cooler for longer. The containment provided a barrier between the hot and cold air streams and caused a slight negative pressure to build up, which allowed the servers to pull cold air from the underfloor plenum. The results also showed that the effect of CAC on IT equipment performance and response can vary: it depends on the servers' airflow and generation, and hence on the types of servers deployed in the cold aisle enclosure. Moreover, compared with the discrete sensors, the IPMI inlet temperature sensors underestimated the RTT by 42% and 12% for the CAC and opened cases, respectively.
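
The RTT comparison between sensor types can be illustrated with a minimal sketch. It assumes that RTT is the time elapsed from the chilled water interruption until the measured inlet temperature first reaches a threshold, and that the underestimate is expressed relative to the discrete-sensor RTT. The function names, the threshold, and the sample readings are hypothetical and are not taken from the experiments.

```python
from typing import Optional, Sequence


def ride_through_time(times_s: Sequence[float],
                      inlet_temps_c: Sequence[float],
                      threshold_c: float) -> Optional[float]:
    """Time from the first sample (assumed to be the moment of chilled water
    interruption) until the inlet temperature first reaches threshold_c.
    Returns None if the threshold is never reached."""
    if len(times_s) != len(inlet_temps_c):
        raise ValueError("times_s and inlet_temps_c must have equal length")
    if not times_s:
        return None
    t0 = times_s[0]
    prev_t, prev_temp = times_s[0], inlet_temps_c[0]
    if prev_temp >= threshold_c:
        return 0.0
    for t, temp in zip(times_s[1:], inlet_temps_c[1:]):
        if temp >= threshold_c:
            # Linearly interpolate between the two samples that bracket
            # the threshold crossing.
            frac = (threshold_c - prev_temp) / (temp - prev_temp)
            return prev_t + frac * (t - prev_t) - t0
        prev_t, prev_temp = t, temp
    return None


def underestimate_percent(rtt_reference_s: float, rtt_other_s: float) -> float:
    """Percentage by which rtt_other_s falls short of rtt_reference_s."""
    return 100.0 * (rtt_reference_s - rtt_other_s) / rtt_reference_s


if __name__ == "__main__":
    # Hypothetical readings, one sample per minute after the interruption.
    times = [0, 60, 120, 180, 240]
    discrete_temps = [20.0, 24.0, 28.0, 32.0, 36.0]  # assumed discrete sensor
    ipmi_temps = [22.0, 27.0, 32.0, 37.0, 42.0]      # assumed IPMI sensor
    threshold = 32.0                                 # assumed threshold, deg C

    rtt_discrete = ride_through_time(times, discrete_temps, threshold)
    rtt_ipmi = ride_through_time(times, ipmi_temps, threshold)
    print(rtt_discrete, rtt_ipmi,
          underestimate_percent(rtt_discrete, rtt_ipmi))
```

Under this assumed definition, a sensor that reads warmer than the discrete inlet sensor reaches the threshold earlier and therefore reports a shorter RTT, which is the direction of the 42% and 12% discrepancies reported above.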

