EAR
EAR provides four main added values:
Power and environmental system monitoring and job accounting
Transparent runtime application performance and power monitoring
Dynamic application and cluster energy optimization through simple energy policies
Smart cluster energy and power capping to ensure your cluster does not consume more than what you decide
EAR 4.3 main features
Runtime Energy Optimization
Support for multiple jobs sharing a node
Application performance and energy accounting
Energy and Power Capping
System monitoring
Hints for application analysis and optimization
Energy savings estimates reported to the DB
Application phases reported to the DB
Support for relational and non-relational DBs
Detection and correction of erroneous power readings
Support for Intel/AMD CPUs and NVIDIA GPUs
Transparent job submission through a SLURM plugin and PBSpro hooks
EAR 4.3 main values
Monitor the system and the applications
Energy accounting and power/performance monitoring
New graphical tools for system and job performance, power, energy, and CO2 analysis
Ensure nodes are performing as expected through periodic checks
Reduce the cluster power consumption by about 10% while minimizing the performance penalty
Ensure the cluster does not consume more than what you decide through energy and power capping
Robust, reliable, and operational since:
August 2019 at LRZ, on SuperMUC NG, a 6480-node Intel cluster
May 2022 at SURF, on the Snellius hybrid cluster with Intel/NVIDIA GPU and AMD partitions