This bachelor thesis illustrates the idea behind Markov Decision Processes (MDPs) and presents a few basic methods of Reinforcement Learning (RL), namely Monte Carlo Learning and Q-Learning, which are used to solve decision problems modelled by MDPs. In the last section, we apply these methods to an application and discuss the results.
Imagine that we put a hamster inside a maze and expect it to walk through the maze until it reaches a point we have chosen as the goal. It may get there by chance, but most of the time it will not, because the hamster does not know that this particular point, the goal, is important.
But what happens when we reward the hamster once it reaches the goal, for example with a piece of cheese? The hamster will start to remember the route that leads to the cheese, and it may learn to take the quickest and easiest way to the goal. What we have done is reinforce the hamster's good behavior by rewarding it.
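The hamster scenario can be made concrete with a minimal sketch of tabular Q-Learning. Everything in it is an assumption made for illustration and is not taken from the thesis: the maze is reduced to a straight corridor of five cells, the cheese is a reward of 1 in the last cell, and the names `train_q_learning` and `greedy_policy` as well as the parameter values are hypothetical.

```python
import random


def train_q_learning(n_cells=5, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.2, seed=0):
    """Tabular Q-Learning on a hypothetical corridor maze.

    The hamster starts in cell 0; the cheese (reward 1) is in the last cell.
    Actions: 0 = step left, 1 = step right. Stepping left in cell 0 stays put.
    """
    rng = random.Random(seed)
    goal = n_cells - 1
    # q[state][action] estimates the discounted future reward.
    q = [[0.0, 0.0] for _ in range(n_cells)]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy choice; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Q-Learning update: move the estimate toward
            # reward + gamma * (best value of the next state).
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action per non-goal cell according to the learned values."""
    return ["right" if right > left else "left" for left, right in q[:-1]]


if __name__ == "__main__":
    q_table = train_q_learning()
    print(greedy_policy(q_table))
```

In this sketch the reward is the only signal that marks the goal as important. Before the first reward, every action value is zero and the hamster wanders at random; afterwards, the update rule passes the value of the cheese backwards along the corridor, one cell at a time.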
Table of Contents
- 1 Introduction
- 1.1 Outline
- 2 Basics of MDPs and RL
- 2.1 Markov Decision Processes
- 2.1.1 Markov Process
- 2.2 Value Function
- 2.3 Policy Iteration
- 2.4 Reinforcement Learning
- 2.4.1 Monte Carlo Learning
- 2.4.2 Temporal Difference Learning
- 3 Cleaning Robot Application
- 3.1 Introduction
- 3.2 Solving via Value Iteration
- 3.3 Solving via Monte Carlo Learning
- 3.4 Solving via Q-Learning
- 3.5 Comparison of Results
- 4 Discussion
Objectives and Key Themes
This bachelor thesis aims to provide an understanding of Markov Decision Processes (MDPs) and present fundamental methods of Reinforcement Learning (RL), specifically Monte Carlo Learning and Q-Learning. The focus is on illustrating how these methods can be applied to solve decision problems modeled by MDPs. The work utilizes a cleaning robot application to demonstrate the practical implementation of these techniques.
- Markov Decision Processes (MDPs) and their role in decision-making
- Reinforcement Learning (RL) as a solution approach for MDP-based problems
- Exploring specific RL methods, including Monte Carlo Learning and Q-Learning
- Application of RL methods to a practical example: a cleaning robot
- Comparison and analysis of the results obtained using different RL methods
Chapter Summaries
- Chapter 1: Introduction provides a brief outline of the thesis's scope and structure.
- Chapter 2: Basics of MDPs and RL introduces the concept of Markov Decision Processes, including Markov Processes and the value function. It then explores policy iteration and delves into the fundamental principles of Reinforcement Learning, specifically focusing on Monte Carlo Learning and Temporal Difference Learning.
- Chapter 3: Cleaning Robot Application presents a practical application of the learned concepts. It introduces the cleaning robot problem and demonstrates how to solve it using Value Iteration, Monte Carlo Learning, and Q-Learning. This chapter concludes with a comparison of the results obtained using different methods.
Keywords
The keywords of this thesis are Markov Decision Processes, Reinforcement Learning, Monte Carlo Learning, Q-Learning, Value Iteration, Cleaning Robot, Decision Problems, Optimal Policy, and Application.
- Cite this work
- Omar Baiazid (Author), 2021, Methods of Machine Learning and their Application. The Basics of Markov Decision Processes and Reinforcement Learning, Munich, GRIN Verlag, https://www.grin.com/document/1141604