Reinforcement learning is a learning problem in which an agent must behave optimally in its environment. Deep learning methods, on the other hand, are a subclass of representation learning, which focuses on extracting the features necessary for a task (e.g. classification or detection). As such, they serve as powerful function approximators. The combination of these two paradigms results in deep reinforcement learning.
This thesis gives an overview of recent advancements in the field. The results are divided into two broad research directions: value-based and policy-based approaches. The thesis presents several algorithms from these directions and examines how they perform. Finally, multiple open research questions are addressed and new research directions are proposed.
Inhaltsverzeichnis (Table of Contents)
- Abstract
- Zusammenfassung
- List of Abbreviations
- List of Figures
- 1 Introduction
- 2 Reinforcement Learning
- 2.1 Markov Decision Process (MDP)
- 2.2 Value-Based Methods
- 2.2.1 Dynamic Programming (DP)
- 2.2.2 Monte Carlo (MC)
- 2.2.3 Temporal Difference (TD)
- 2.3 Policy-Based Methods
- 2.3.1 Policy Iteration
- 2.3.2 Policy Gradient
- 3 Deep Learning
- 3.1 Neural Networks
- 3.1.1 Convolutional Neural Network (CNN)
- 3.1.2 Recurrent Neural Network (RNN)
- 3.2 Deep Reinforcement Learning (DRL)
- 3.2.1 Deep Q-Network (DQN)
- 3.2.2 Deep Deterministic Policy Gradient (D-DPG)
- 3.2.3 Asynchronous Advantage Actor-Critic (A3C)
- 3.2.4 Trust Region Policy Optimization (TRPO)
- 3.2.5 Distributional Bellman Equation
- 4 Applications of DRL
- 4.1 Game Playing
- 4.1.1 Self-Play Reinforcement Learning
- 4.1.2 Monte Carlo Tree Search (MCTS)
- 4.1.3 Multiplayer Online Battle Arena (MOBA)
- 4.2 Robotics
- 4.3 Finance
- 5 Conclusion
Zielsetzung und Themenschwerpunkte (Objectives and Key Themes)
This thesis provides a comprehensive overview of recent advancements in deep reinforcement learning (DRL). It explores the integration of deep learning methods with reinforcement learning, highlighting the key algorithms and their performance in various domains. The research delves into both value-based and policy-based approaches, examining their strengths and limitations.
- Integration of deep learning and reinforcement learning
- Value-based and policy-based DRL algorithms
- Applications of DRL in game playing, robotics, and finance
- Open research questions and future directions in DRL
Zusammenfassung der Kapitel (Chapter Summaries)
Chapter 1 introduces the concept of DRL, highlighting its significance and potential applications. Chapter 2 provides a foundational understanding of reinforcement learning, covering key concepts like Markov Decision Processes (MDPs), value-based methods (Dynamic Programming, Monte Carlo, and Temporal Difference learning), and policy-based methods (policy iteration and policy gradient). Chapter 3 delves into deep learning, focusing on neural networks, including convolutional and recurrent neural networks, and their application in DRL. Key DRL algorithms, such as Deep Q-Networks (DQN), Deep Deterministic Policy Gradient (D-DPG), Asynchronous Advantage Actor-Critic (A3C), Trust Region Policy Optimization (TRPO), and the Distributional Bellman Equation, are discussed. Chapter 4 explores practical applications of DRL in game playing, robotics, and finance. Finally, Chapter 5 concludes by summarizing the research findings, highlighting open research questions, and proposing future directions for DRL.
Schlüsselwörter (Keywords)
Deep reinforcement learning, deep learning, reinforcement learning, neural networks, value-based methods, policy-based methods, game playing, robotics, finance, open research questions, future directions.
- Quote paper
- Artur Sahakjan (Author), 2018, A Review of Recent Advancements in Deep Reinforcement Learning, Munich, GRIN Verlag, https://www.grin.com/document/432230