Aug
2017
Modeling and developing conflict-aware scheduling on large-scale data centers
Abstract:
Large-scale data centers are a growing trend in modern computing systems. Because a large-scale data center has to manage a large number of machines and jobs, deploying multiple independent schedulers (termed distributed schedulers in the literature) that make scheduling decisions simultaneously has been shown to be an effective way to speed up the processing of large quantities of submitted jobs and data. The key drawback of distributed schedulers is that, because they schedule different jobs independently, their scheduling decisions may conflict with each other when different decisions refer to the same subset of the resources in the data center. Conflicting scheduling decisions cause additional scheduling attempts and consequently increase the scheduling cost. The more resources each scheduler demands, the higher the scheduling cost that may be incurred and the longer the job response times that users may experience. It is therefore useful to investigate the balance points in terms of resource demands for each of the independent schedulers, so that the distributed schedulers can all achieve decent job performance without experiencing undesired resource competition. To address this issue, we model distributed scheduling and resource conflict using game theory and conduct a quantitative analysis of scheduling cost and job performance. Further, based on the analysis, we develop conflict-aware scheduling strategies to reduce the scheduling cost and improve job performance. We have conducted simulation experiments with a workload trace as well as real experiments on Amazon Web Services (AWS). The experimental results verify the effectiveness of the proposed modeling approach and scheduling strategies.
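
The relationship between per-scheduler resource demand and conflict frequency can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the game-theoretic model or the scheduling strategies of the paper: the function `conflict_rate`, its parameters, and the uniform-random placement policy are all hypothetical and chosen only for illustration. Each scheduler independently picks `demand` machines in every round, and a decision counts as a conflict if it touches a machine already claimed by another scheduler in the same round.

```python
import random


def conflict_rate(num_machines, num_schedulers, demand, rounds=2000, seed=0):
    """Estimate the fraction of scheduling decisions that conflict.

    Hypothetical toy model (an assumption, not the paper's model): in each
    round every scheduler independently picks `demand` distinct machines
    uniformly at random. Decisions are committed in order, and a decision
    conflicts if it touches a machine already claimed in the same round.
    """
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    conflicts = 0
    for _ in range(rounds):
        claimed = set()
        for _ in range(num_schedulers):
            choice = rng.sample(range(num_machines), demand)
            if claimed.intersection(choice):
                conflicts += 1  # this decision would need another attempt
            else:
                claimed.update(choice)
    return conflicts / (rounds * num_schedulers)


if __name__ == "__main__":
    # Print the estimated conflict rate for increasing resource demands.
    for demand in (1, 5, 10, 20):
        print(demand, round(conflict_rate(100, 4, demand), 3))
```

In this toy setting the estimated conflict rate grows as `demand` grows, which mirrors the qualitative claim that larger resource demands lead to more rescheduling attempts. It does not model retries, job durations, or strategic behavior among schedulers.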