In recent years, the number, size, and power density of data centers have grown significantly. A significant part of data center power consumption is attributed to the cooling infrastructure, which consists of computer room air conditioning units (CRACs), chillers, and cooling towers. For energy-efficient operation and management of cooling resources, data centers are beginning to be extensively instrumented with temperature sensors. While this allows cooling actuators, such as the CRAC set point temperature, to be dynamically controlled and data centers to be operated at higher temperatures to save energy, it also increases the chances of thermal anomalies. Furthermore, because large data centers can contain thousands to tens of thousands of such sensors, it is virtually impossible to manually inspect and analyze the large volumes of dynamic data they generate, which necessitates autonomous mechanisms for thermal anomaly detection. In addition to threshold-based detection methods, other anomaly detection mechanisms are necessary. In this paper, we describe the commonly occurring thermal anomalies in a data center. We also describe, with examples from a production data center, techniques to autonomously detect these anomalies. In particular, we show the usefulness of a principal component analysis (PCA) based methodology applied to a large temperature sensor network. Specifically, we examine thermal anomalies such as those related to misconfiguration of equipment, blocked vent tiles, faulty sensors, and CRAC-related anomalies. Several of these anomalies normally go undetected because no temperature thresholds are violated. We present examples of the thermal anomalies and their detection from a real data center.
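The abstract does not spell out the PCA-based methodology, so the following is only a minimal sketch of one common way PCA is used for anomaly detection on correlated sensor readings, not the authors' implementation. It assumes a samples-by-sensors matrix of temperatures, fits principal components on a training window, and flags samples whose residual (the squared distance from the retained subspace) exceeds an empirical percentile. The function names, the synthetic data, the number of components, and the 99.9th-percentile cutoff are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_pca(train, n_components):
    """Fit PCA on a (samples x sensors) matrix of temperature readings."""
    mean = train.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:n_components]

def residual_score(readings, mean, components):
    """Squared distance of each sample from the retained PCA subspace."""
    centered = readings - mean
    reconstructed = centered @ components.T @ components
    return np.sum((centered - reconstructed) ** 2, axis=1)

# Synthetic stand-in data (hypothetical): 20 sensors driven by 2 shared factors,
# offset to about 25 degrees, plus small independent sensor noise.
rng = np.random.default_rng(0)
mixing = rng.normal(size=(2, 20))

def simulate(n):
    factors = rng.normal(size=(n, 2))
    return 25.0 + factors @ mixing + 0.1 * rng.normal(size=(n, 20))

# Learn the normal correlation structure and an empirical residual cutoff.
train = simulate(500)
mean, components = fit_pca(train, n_components=2)
threshold = np.percentile(residual_score(train, mean, components), 99.9)

# Inject one fault: sensor 3 reads 5 degrees high at sample 10.
test = simulate(50)
test[10, 3] += 5.0
scores = residual_score(test, mean, components)
anomalous = np.flatnonzero(scores > threshold)
print("flagged samples:", anomalous)
```

The point of the sketch is that the injected fault is detected from a break in the correlation among sensors rather than from any absolute temperature limit, which is the kind of anomaly the abstract says threshold-based methods miss.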
ASME 2009 InterPACK Conference collocated with the ASME 2009 Summer Heat Transfer Conference and the ASME 2009 3rd International Conference on Energy Sustainability
July 19–23, 2009
San Francisco, California, USA
Conference Sponsors:
- Electronic and Photonic Packaging Division
ISBN:
978-0-7918-4360-4
PROCEEDINGS PAPER
Autonomous Detection of Thermal Anomalies in Data Centers
Manish Marwah
Hewlett-Packard Company Labs, Palo Alto, CA
Ratnesh K. Sharma
Hewlett-Packard Company Labs, Palo Alto, CA
Wilfredo Lugo
Hewlett-Packard Company, Aguadilla, Puerto Rico
Paper No:
InterPACK2009-89140, pp. 777-783; 7 pages
Published Online:
December 24, 2010
Citation
Marwah, M, Sharma, RK, & Lugo, W. "Autonomous Detection of Thermal Anomalies in Data Centers." Proceedings of the ASME 2009 InterPACK Conference collocated with the ASME 2009 Summer Heat Transfer Conference and the ASME 2009 3rd International Conference on Energy Sustainability. ASME 2009 InterPACK Conference, Volume 2. San Francisco, California, USA. July 19–23, 2009. pp. 777-783. ASME. https://doi.org/10.1115/InterPACK2009-89140