Abstract or Introduction
The present research proposes a novel approach to estimate incoming jobs runtime based on similarities of reocurring jobs. To achieve this goal, we utilize the latest achievements in neural network techniques to embed the job dependencies. Subsequently, we perform multiple clustering techniques to form meaningful groups of reoccurring jobs. Finally, based on the similarities within the groups of samples, we predict runtimes. A recently published trace dataset allows us to develop and evaluate our contribution with more than 200,000 complex and real-world jobs.
The cloud data centers should daily handle numerous jobs with complex parallelization. In order to schedule such a heavy and complicated workload and reach efficient resource utilization, runtime prediction is critical. Moreover, accurate runtime prediction may assist cloud users in choosing their required resources more intelligently. Despite the importance of runtime prediction, achieving an accurate prediction is not straightforward because the execution time of jobs in complicated environments of clouds is affected by many factors, e.g., cluster status, users’ requirements, etc.
- Quote paper
- Alireza Alamgiralem (Author), 2021, Trace-Based Runtime Prediction of Reoccurring Data-Parallel Processing Jobs, Munich, GRIN Verlag, https://www.grin.com/document/1180656
Publish now - it's free
Comments