EAR
EAR provides four main benefits:
Power and environmental system monitoring and job accounting
Transparent runtime application performance and power monitoring
Dynamic application and cluster energy optimization through simple energy policies
Smart cluster energy and power capping to ensure your cluster does not consume more than the limits you set
EAR 4.3 main features
Runtime Energy Optimization
Support for multiple jobs sharing a node
Application performance and energy accounting
Energy and Power Capping
System monitoring
Hints for application analysis and optimization
Energy savings estimates reported to the DB
Application phases reported to the DB
Support for relational and non-relational databases
Detection and correction of erroneous power readings
Support for Intel/AMD CPUs and NVIDIA/Intel GPUs
Transparent job submission through a SLURM plugin and PBSpro hooks
EAR 4.3 main values
Monitor the system and the applications
Energy accounting and power/performance monitoring
New graphical tools for system and job performance, power, energy, and CO2 analyses
Reduce cluster power consumption while minimizing the performance penalty
Ensure, through energy and power capping, that the cluster does not consume more than the limits you set
Robust, reliable, and operational since:
August 2019 at LRZ on the SuperMUC-NG Intel cluster, and March 2024 on the SuperMUC NG2 Intel accelerated cluster
May 2022 at SURF on the Snellius hybrid cluster with Intel/NVIDIA GPUs + AMD partitions, and November 2023 on the Snellius2 AMD Genoa cluster
March 2024 at BSC on the MN5 general-purpose and accelerated partitions