Pu Pang, Quan Chen, Deze Zeng, Chao Li, Jingwen Leng, Wenli Zheng, Minyi Guo
In Proceedings of the IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020
Large-scale datacenters often host latency-sensitive services that have stringent Quality-of-Service (QoS) requirements and experience diurnal load patterns. Co-locating best-effort applications that have no QoS requirements with latency-sensitive services is widely used to improve resource utilization, provided that shared resources are carefully managed. However, existing co-location techniques tend to cause power overload on power-constrained computers because they ignore power consumption. To this end, we propose Sturgeon, a runtime system that proactively manages resources between co-located applications in a power-constrained environment, ensuring the QoS of latency-sensitive services while maximizing resource utilization. Our investigation shows that, at a given load, multiple feasible resource configurations meet both the QoS requirement and the power budget, and one of them yields the maximum throughput for best-effort applications. To find such a configuration, we establish models that accurately predict the performance and power consumption of the co-located applications. Sturgeon monitors QoS periodically to eliminate potential QoS violations caused by unpredictable interference. The experimental results show that Sturgeon improves the throughput of best-effort applications by 24.96% compared to the state-of-the-art technique, while keeping the 95th-percentile latency within the QoS target.
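The configuration search described in the abstract can be illustrated with a minimal sketch: enumerate candidate resource configurations, keep those whose predicted latency and predicted power satisfy the QoS target and the power budget, and pick the one with the highest predicted best-effort throughput. Everything below is an assumption made for illustration, not Sturgeon's actual models or search procedure: the `Config` fields, the 8-core machine, the frequency levels, and the three toy `predict_*` formulas are all hypothetical stand-ins for the learned models the abstract mentions.

```python
from dataclasses import dataclass
from itertools import product
from typing import Iterable, Optional

TOTAL_CORES = 8                  # hypothetical machine size
FREQS_GHZ = (1.2, 1.6, 2.0)      # hypothetical frequency levels


@dataclass(frozen=True)
class Config:
    ls_cores: int    # cores given to the latency-sensitive (LS) service
    ls_freq: float   # frequency of the LS cores, in GHz
    be_freq: float   # frequency of the best-effort (BE) cores, in GHz


# Toy stand-ins for the prediction models (assumptions, not from the paper).
def predict_latency_ms(c: Config, load: float) -> float:
    return 10.0 * load / (c.ls_cores * c.ls_freq)


def predict_power_w(c: Config) -> float:
    be_cores = TOTAL_CORES - c.ls_cores
    return 20.0 + 3.0 * c.ls_cores * c.ls_freq ** 2 + 3.0 * be_cores * c.be_freq ** 2


def predict_be_throughput(c: Config) -> float:
    return (TOTAL_CORES - c.ls_cores) * c.be_freq


def all_configs() -> Iterable[Config]:
    # Every split of cores between LS and BE, at every frequency pair.
    for k, f_ls, f_be in product(range(1, TOTAL_CORES), FREQS_GHZ, FREQS_GHZ):
        yield Config(k, f_ls, f_be)


def is_feasible(c: Config, load: float, qos_ms: float, budget_w: float) -> bool:
    return predict_latency_ms(c, load) <= qos_ms and predict_power_w(c) <= budget_w


def best_config(load: float, qos_ms: float, budget_w: float) -> Optional[Config]:
    # Keep only configurations that meet both constraints, then maximize
    # predicted BE throughput; return None if nothing is feasible.
    feasible = [c for c in all_configs() if is_feasible(c, load, qos_ms, budget_w)]
    return max(feasible, key=predict_be_throughput, default=None)


if __name__ == "__main__":
    print(best_config(load=4.0, qos_ms=10.0, budget_w=100.0))
```

Exhaustive enumeration is used here only because the toy search space is tiny. A real system would also need the periodic QoS monitoring the abstract describes, since predictions cannot capture unpredictable interference.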