The present research proposes a novel approach to estimating the runtime of incoming jobs based on similarities among reoccurring jobs. To achieve this goal, we use recent advances in neural network techniques to embed the job dependencies. Subsequently, we apply multiple clustering techniques to form meaningful groups of reoccurring jobs. Finally, based on the similarities within these groups, we predict runtimes. A recently published trace dataset allows us to develop and evaluate our contribution with more than 200,000 complex, real-world jobs.

Cloud data centers must handle numerous jobs with complex parallelization every day. To schedule such a heavy and complicated workload and achieve efficient resource utilization, runtime prediction is critical. Moreover, accurate runtime prediction can help cloud users choose their required resources more intelligently. Despite its importance, accurate runtime prediction is not straightforward, because the execution time of jobs in complex cloud environments is affected by many factors, such as cluster status and users’ requirements.

Excerpt

Table of Contents

1 Introduction
- 1.1 Motivation
- 1.2 Research Objectives
- 1.3 Related Work
- 1.4 Contributions
2 Background
- 2.1 Cloud Computing
- 2.2 Job Scheduling
- 2.3 Data-Parallel Processing
- 2.4 Runtime Prediction
3 Methodology
- 3.1 Trace Data Collection
- 3.2 Job Dependency Embedding
- 3.3 Job Clustering
- 3.4 Runtime Prediction
4 Experiments and Evaluation
5 Conclusion

Objectives and Key Themes

This thesis investigates the use of trace-based runtime prediction for recurring data-parallel processing jobs in cloud data centers. The goal is to improve resource utilization and scheduling by accurately predicting the execution time of upcoming jobs based on the characteristics of similar past jobs.

- Trace data analysis for identifying recurring patterns
- Utilization of deep learning techniques for job dependency embedding
- Clustering algorithms for grouping similar jobs based on their dependencies
- Runtime prediction models based on similarities within job clusters
- Evaluation of the proposed approach using a real-world trace dataset
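
The cluster-then-predict idea in the themes above can be illustrated with a minimal sketch. It assumes each past job has already been turned into a fixed-length embedding vector by some model; the function names, the toy data, and the choice of k-means with a per-cluster median are illustrative assumptions, not the thesis's actual method or results.

```python
from statistics import median


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(point, centroids[c]))


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids (assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        labels = [nearest_centroid(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    # Final assignment so labels match the returned centroids.
    labels = [nearest_centroid(p, centroids) for p in points]
    return centroids, labels


def predict_runtime(embedding, centroids, labels, runtimes):
    """Predict a runtime as the median runtime of the nearest cluster."""
    cluster = nearest_centroid(embedding, centroids)
    in_cluster = [r for r, lab in zip(runtimes, labels) if lab == cluster]
    # Fall back to the global median if the nearest cluster has no members.
    return median(in_cluster) if in_cluster else median(runtimes)


# Hypothetical toy data: 2-D job embeddings and observed runtimes in seconds.
embeddings = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2)]
runtimes = [10.0, 100.0, 12.0, 110.0, 11.0]

centroids, labels = kmeans(embeddings, k=2)
short_job = predict_runtime((0.05, 0.05), centroids, labels, runtimes)
long_job = predict_runtime((5.1, 5.0), centroids, labels, runtimes)
```

The median is used here only because it is robust to a few unusually long runs within a group; any other per-cluster estimator could be substituted.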

Chapter Summaries

Chapter 1: Introduction: This chapter provides an overview of the motivation behind this research, the research objectives, a review of related work, and the key contributions of the thesis.
Chapter 2: Background: This chapter establishes the theoretical foundation by introducing concepts related to cloud computing, job scheduling, data-parallel processing, and runtime prediction.
Chapter 3: Methodology: This chapter delves into the proposed approach, outlining the process of trace data collection, job dependency embedding using deep learning techniques, job clustering based on similarities, and finally, runtime prediction based on the identified clusters.
Chapter 4: Experiments and Evaluation: This chapter presents the results of experiments conducted on a real-world trace dataset. It evaluates the effectiveness of the proposed approach in accurately predicting the runtime of incoming jobs.

Keywords

The main keywords and focus topics of this thesis are cloud computing, data-parallel processing, job scheduling, runtime prediction, trace analysis, deep learning, job dependency embedding, clustering, and resource utilization.


Details

Title: Trace-Based Runtime Prediction of Reoccurring Data-Parallel Processing Jobs
College: Technical University of Berlin
Grade: 1.7
Author: Alireza Alamgiralem (Author)
Publication Year: 2021
Pages: 103
Catalog Number: V1180656
ISBN (PDF): 9783346602008
ISBN (Book): 9783346602015
Language: English
Tags: neural network PyTorch Alibaba trace dataset Python clustering prediction runtime cloud computing graph neural network big data
Product Safety: GRIN Publishing GmbH

Quote paper: Alireza Alamgiralem (Author), 2021, Trace-Based Runtime Prediction of Reoccurring Data-Parallel Processing Jobs, Munich, GRIN Verlag, https://www.grin.com/document/1180656
