
A Review of Recent Advancements in Deep Reinforcement Learning


Bachelor Thesis, 2018, 78 Pages, Grade: 1.0

Author: Artur Sahakjan

Computer Science - Commercial Information Technology

Reinforcement learning is a learning problem in which an agent has to behave optimally in its environment. Deep learning methods, on the other hand, are a subclass of representation learning, which focuses on extracting the features necessary for a task (e.g. classification or detection). As such, they serve as powerful function approximators. The combination of these two paradigms results in deep reinforcement learning.

This thesis gives an overview of recent advancements in the field. The results are divided into two broad research directions: value-based and policy-based approaches. The review presents several algorithms from each direction and examines how they perform. Finally, multiple open research questions are addressed and new research directions are proposed.
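As a minimal illustration of the value-based direction mentioned above, the classic one-step tabular Q-learning update can be sketched as follows. The states, actions, reward, learning rate and discount factor here are hypothetical stand-ins, not values from the thesis:

```python
# Toy one-step tabular Q-learning update, illustrating the value-based
# direction. All states, actions and parameter values are hypothetical.
from collections import defaultdict

Q = defaultdict(float)          # Q[(state, action)] -> estimated return
alpha, gamma = 0.1, 0.99        # learning rate and discount factor

def q_update(s, a, r, s_next, actions):
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

q_update("s0", "left", 1.0, "s1", ["left", "right"])
```

Deep Q-learning (covered in section 4.1.1) replaces the table `Q` with a neural network approximator, which is what makes the method scale to large state spaces such as raw Atari frames.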

Excerpt


Table of Contents

1 Introduction

2 Research Method

2.1 Related Work

2.2 Research Conduction

3 Background

3.1 Reinforcement Learning

3.1.1 Markov Decision Process

3.1.2 Value Functions

3.1.3 Tabular Solution Methods

3.2 Deep Learning

4 Results

4.1 Value-Based Deep Reinforcement Learning

4.1.1 Deep Q-Learning and Deep Q-Networks

4.1.2 Double Q-Learning and Double Q-Network

4.1.3 Prioritized Replay

4.1.4 Dueling Network

4.1.5 Distributional Reinforcement Learning

4.1.6 Rainbow

4.2 Policy-Based Deep Reinforcement Learning

4.2.1 Asynchronous Advantage Actor-Critic

4.2.2 Trust Region Policy Optimization

4.2.3 Deep Deterministic Policy Gradients

4.2.4 Policy Iteration Using Monte Carlo Tree Search

4.2.5 Evolutionary Algorithms

4.3 Performance of the Algorithms

4.3.1 Atari 2600

4.3.2 MuJoCo

4.3.3 Various Measures

5 Discussion

5.1 Exploration vs. Exploitation

5.2 Need for Rewards

5.3 Knowledge Reusability

5.4 Inefficiency

5.5 Multi-Agent Reinforcement Learning

5.6 Model-Based Reinforcement Learning

5.7 Proposed Research Directions

6 Conclusion

Objectives and Research Themes

This thesis aims to provide a comprehensive review of recent advancements in Deep Reinforcement Learning (DRL) by categorizing the field into distinct research directions and evaluating the performance of key algorithms.

  • Overview of value-based Deep Reinforcement Learning approaches.
  • Analysis of policy-based DRL methods and their architectural innovations.
  • Evaluation of algorithmic performance across standard benchmarks like Atari 2600 and MuJoCo.
  • Discussion of fundamental challenges in RL, including exploration versus exploitation and reward engineering.
  • Identification of future research directions, such as knowledge reusability and multi-agent systems.

Excerpt from the Book

Deep Learning

This section covers the main concepts of DL. For the purposes of this thesis, DL methods can be seen as a form of non-linear function approximator. DL methods are a subclass of representation learning, which in turn focuses on extracting the necessary features for the task (e.g. classification or detection) (LeCun et al., 2015, p. 436). This section focuses on supervised learning, the setting most commonly used. Here, labeled training data is fed to a non-linear function approximator, such as a neural network. In this context, "labeled" means that along with the data (e.g. the pixel values of an image) a target is also given (e.g. the image shows a dog). The network then learns a function that maps the inputs (e.g. the pixels) to the output (e.g. the label). The goal is, given enough training data, for the network to generalize to new, unseen data (LeCun et al., 2015, pp. 436-438).

The most widely used model architectures are feedforward (neural) networks (FNN), which are also called multilayer perceptrons (MLP) (Goodfellow et al., 2016, p. 167).
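The input-to-output mapping described above can be sketched as a tiny MLP in plain Python. The layer sizes, random weights, and input vector are hypothetical; a trained network would have learned its weights from labeled data:

```python
import math
import random

# Minimal sketch of a two-layer feedforward network (MLP) with hypothetical
# random weights, illustrating the input -> output mapping (e.g. pixel
# values -> class probabilities) described in the excerpt.
random.seed(0)

def layer(n_in, n_out):
    # one weight row per output unit
    return [[random.gauss(0, 1) for _ in range(n_in)] for _ in range(n_out)]

W1, W2 = layer(4, 8), layer(8, 2)   # 4 inputs -> 8 hidden units -> 2 classes

def forward(x):
    # hidden layer with ReLU non-linearity (the learned "representation")
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]
    logits = [sum(w * hi for w, hi in zip(row, h)) for row in W2]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]    # softmax class probabilities

probs = forward([0.1, -0.3, 0.7, 0.2])
```

Training adjusts `W1` and `W2` (typically via backpropagation and gradient descent) so that the output probabilities match the given labels.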

Summary of Chapters

1 Introduction: Provides the definition of Reinforcement Learning and describes the motivation for combining it with Deep Learning.

2 Research Method: Details the literature review process and the sources used to identify relevant research in the DRL field.

3 Background: Establishes the theoretical foundations of Reinforcement Learning (including Markov Decision Processes) and Deep Learning concepts.

4 Results: Presents an analysis of value-based and policy-based DRL algorithms and summarizes their performance on common benchmarks.

5 Discussion: Examines open research challenges such as the exploration-exploitation dilemma and multi-agent settings.

6 Conclusion: Summarizes the thesis findings and reflects on the trajectory of Artificial General Intelligence.

Keywords

Deep Reinforcement Learning, DRL, Artificial Intelligence, Neural Networks, Q-Learning, Policy Gradient, Atari 2600, MuJoCo, Exploration vs. Exploitation, Multi-Agent Reinforcement Learning, Markov Decision Process, Function Approximation, Experience Replay, Reward Shaping, Deep Learning.

Frequently Asked Questions

What is the core focus of this bachelor thesis?

The work provides a detailed review of recent advancements in Deep Reinforcement Learning (DRL), summarizing the most important algorithms and their performance in various environments.

What are the primary research areas discussed?

The thesis categorizes research into value-based approaches (like DQN) and policy-based approaches (like A3C, TRPO, and DDPG), while also addressing model-based learning and evolutionary strategies.

What is the main objective of the research?

The objective is to offer a structured overview of the current DRL landscape and to synthesize how different techniques contribute to reaching an agent's goals.

Which scientific methods are primarily used for performance evaluation?

The thesis relies on benchmark testing using the Atari 2600 game suite and the MuJoCo physics simulation environment to compare different algorithmic implementations.

What topics are covered in the main section of the thesis?

The main part covers the theoretical background of RL and DL, the transition to Deep Q-Networks, improvements like Dueling Networks and Distributional RL, and a variety of policy-based methods.

Which keywords best characterize this work?

Key terms include Deep Reinforcement Learning, Neural Networks, Policy Gradients, Atari 2600 benchmarks, and Multi-Agent Reinforcement Learning.

What distinguishes the Rainbow agent from previous models?

Rainbow is a state-of-the-art agent that combines multiple enhancements—such as Prioritized Replay, Multi-Step Learning, and Noisy Nets—into a single architecture, significantly improving efficiency.
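One of the components Rainbow combines, proportional prioritized experience replay, can be sketched in a few lines. The buffer contents, TD errors, and the prioritization exponent here are hypothetical; a real implementation uses a sum-tree for efficient sampling:

```python
import random

# Toy sketch of proportional prioritized experience replay: transitions
# are sampled with probability proportional to |TD error|^alpha. All
# buffer contents and priority values are hypothetical stand-ins.
buffer = ["t0", "t1", "t2", "t3"]      # stored transitions
td_errors = [0.1, 2.0, 0.5, 0.05]      # |TD error| per transition
alpha = 0.6                             # strength of prioritization

priorities = [e ** alpha for e in td_errors]
total = sum(priorities)
probs = [p / total for p in priorities]  # P(i) = p_i / sum_k p_k

random.seed(0)
minibatch = random.choices(buffer, weights=probs, k=2)
```

Transitions with large TD errors (here `"t1"`) are revisited more often, which is the source of the sample-efficiency gains the answer above mentions.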

How does the thesis evaluate the efficiency of RL agents?

The work discusses the significant time and sample requirements of current models compared to human learning speed and highlights the need for higher sample efficiency in future applications.


Details

Title
A Review of Recent Advancements in Deep Reinforcement Learning
College
University of Duisburg-Essen
Grade
1.0
Author
Artur Sahakjan (Author)
Publication Year
2018
Pages
78
Catalog Number
V432230
ISBN (eBook)
9783668765009
ISBN (Book)
9783668765016
Language
English
Tags
Machine learning, artificial intelligence
Product Safety
GRIN Publishing GmbH
Quote paper
Artur Sahakjan (Author), 2018, A Review of Recent Advancements in Deep Reinforcement Learning, Munich, GRIN Verlag, https://www.grin.com/document/432230