Tag Archives: Optimal Scheduling

Stochastic Modeling and Optimization of Stragglers

Abstract: The MapReduce framework is widely used to parallelize batch jobs, since it exploits a high degree of multi-tasking to process them. However, it has been observed that as the number of servers increases, the map phase can take much longer than expected. This paper shows analytically that the stochastic behavior of the servers has a negative effect on the completion time of a MapReduce job, and that continually increasing the number of servers without accurate scheduling can degrade the overall performance. We analytically model the map phase in terms of hardware, system, and application parameters to capture the effects of stragglers on performance. Mean sojourn time (MST), the time needed to sync the completed tasks at a reducer, is introduced as a performance metric and formulated mathematically. We then stochastically investigate optimal task scheduling, which leads to an equilibrium property in a datacenter with different types of servers. Our experimental results show the performance of different types of schedulers targeting MapReduce applications. We also show that, among mixed deterministic and stochastic schedulers, there is an optimal scheduler that always achieves the lowest MST.
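The straggler effect described above can be illustrated with a quick Monte Carlo sketch (mine, not the paper's analytical model): the map phase finishes only when its slowest task does, so the expected phase length grows with the number of parallel tasks even though each task's mean stays fixed. Task times are assumed i.i.d. exponential purely for illustration.

import random

def map_phase_time(n_tasks, mean_service=1.0, trials=10000):
    """Average completion time of a map phase with n_tasks parallel tasks.
    The phase ends only when the slowest task (the straggler) finishes."""
    total = 0.0
    for _ in range(trials):
        # i.i.d. exponential task times -- an illustrative assumption only
        total += max(random.expovariate(1.0 / mean_service)
                     for _ in range(n_tasks))
    return total / trials

for n in (1, 10, 100, 1000):
    print(n, round(map_phase_time(n), 2))

For exponential task times the expected maximum is the harmonic number H_n (roughly ln n), so the phase slows down logarithmically as tasks are added while every individual task still averages 1.0.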

Authors: Farshid Farhat and Diman Zad Tootaghaj from Penn State, and Yuxiong He from MSR (Microsoft Research). The work was done during my visit to MSR in Summer 2015.

Stochastic modeling and optimization of stragglers in MapReduce framework

@phdthesis{farhat2015stochastic,
  title={Stochastic modeling and optimization of stragglers in {MapReduce} framework},
  author={Farhat, Farshid},
  year={2015},
  school={The Pennsylvania State University}
}


Stochastic modeling and optimization of stragglers

@article{farhat2016stochastic,
  title={Stochastic modeling and optimization of stragglers},
  author={Farhat, Farshid and Tootaghaj, Diman and He, Yuxiong and Sivasubramaniam, Anand and Kandemir, Mahmut and Das, Chita},
  journal={IEEE Transactions on Cloud Computing},
  year={2016},
  publisher={IEEE}
}

Optimal Scheduling in Parallel Programming Frameworks

FORK-JOIN QUEUE MODELING AND OPTIMAL SCHEDULING IN PARALLEL PROGRAMMING FRAMEWORKS

ABSTRACT

The MapReduce framework is widely used to parallelize batch jobs, since it exploits a high degree of multi-tasking to process them. However, it has been observed that as the number of servers increases, the map phase can take much longer than expected. This thesis shows analytically that the stochastic behavior of the servers has a negative effect on the completion time of a MapReduce job, and that continually increasing the number of servers without accurate scheduling can degrade the overall performance. We analytically model the map phase in terms of hardware, system, and application parameters to capture the effects of stragglers on performance. Mean sojourn time (MST), the time needed to sync the completed tasks at a reducer, is introduced as a performance metric and formulated mathematically. We then stochastically investigate optimal task scheduling, which leads to an equilibrium property in a datacenter with different types of servers. Our experimental results show the performance of different types of schedulers targeting MapReduce applications. We also show that, among mixed deterministic and stochastic schedulers, there is an optimal scheduler that always achieves the lowest MST.
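The optimal-scheduler claim can also be made concrete with a toy experiment (a minimal sketch under made-up server counts and rates, not the thesis's fork-join queue analysis): with a fixed pool of fast and slow servers, a stochastic scheduler that routes each task to a fast server with probability p trades server speed against congestion, and sweeping p exposes an interior optimum for the mean job completion time.

import random

N_FAST, N_SLOW = 4, 12            # hypothetical server pool
FAST_RATE, SLOW_RATE = 2.0, 1.0   # hypothetical service rates

def mean_job_time(n_tasks, p_fast, trials=3000):
    """Average completion time when each task independently goes to a random
    fast server with probability p_fast, else to a random slow one; each
    server runs its tasks back to back, and the job (the reducer's sync
    point) waits for the most loaded server."""
    total = 0.0
    for _ in range(trials):
        loads = [0.0] * (N_FAST + N_SLOW)
        for _ in range(n_tasks):
            if random.random() < p_fast:
                i = random.randrange(N_FAST)
                loads[i] += random.expovariate(FAST_RATE)
            else:
                i = N_FAST + random.randrange(N_SLOW)
                loads[i] += random.expovariate(SLOW_RATE)
        total += max(loads)       # the straggler sets the sync time
    return total / trials

for p in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
    print(p, round(mean_job_time(60, p), 2))

Routing everything to the fast servers overloads them, while routing nothing to them wastes capacity; in this toy setup the sweep bottoms out near p = 0.4, the fast pool's share of total service capacity (4 x 2.0 out of 20.0), which mirrors the equilibrium property the abstract refers to.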


KEYWORDS

Stochastic processes, computational model, delayed tailed distribution, optimal scheduling, cloud computing, synchronization, queuing theory, MapReduce, stochastic modeling, performance evaluation, fork-join queue.

Towards Stochastically Optimizing Data Computing Flows

Abstract:
With the rapid growth in the amount of unstructured data produced by memory-intensive applications, large-scale data analytics has recently attracted increasing interest. Processing, managing, and analyzing this huge amount of data poses several challenges in the cloud and data-center computing domains. In particular, conventional frameworks for distributed data analytics rest on the assumption that the different data-processing nodes are homogeneous and non-stochastic. This paper examines the fundamental factors limiting the scaling of big-data computation. It is shown that as the number of series and parallel computing servers increases, the tail (mean and variance) of the job execution time increases. We first propose a model to predict the response time of highly distributed processing tasks, and then propose a new practical computational algorithm to optimize the response time.
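The scaling claim (the tail of the job execution time grows with both the series and the parallel dimension) lends itself to a small numerical check. Below is a minimal sketch, mine rather than the paper's model, of a data-computing flow with a given number of sequential stages, each a fork-join barrier over a given number of parallel tasks with i.i.d. exponential service times as an assumed workload.

import random
import statistics

def flow_time(stages, width, trials=5000):
    """Mean and variance of a data-computing flow: `stages` sequential
    barriers, each waiting on `width` parallel tasks with i.i.d.
    exponential (mean 1.0) service times -- an illustrative assumption."""
    samples = []
    for _ in range(trials):
        t = 0.0
        for _ in range(stages):
            # each stage ends when its slowest parallel task finishes
            t += max(random.expovariate(1.0) for _ in range(width))
        samples.append(t)
    return statistics.mean(samples), statistics.variance(samples)

for stages, width in ((1, 10), (4, 10), (4, 100), (16, 100)):
    m, v = flow_time(stages, width)
    print(stages, width, round(m, 1), round(v, 2))

Both statistics grow as either dimension is scaled up: each stage's mean grows roughly like the log of the width, and the stage variances add across the series dimension, matching the abstract's observation about series and parallel scaling.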