Description

This is the most complete Advanced Reinforcement Learning course on Udemy. In it, you will learn to implement some of the most powerful Deep Reinforcement Learning algorithms in Python using PyTorch and PyTorch Lightning. You will implement, from scratch, adaptive algorithms that solve control tasks based on experience. You will learn to combine these techniques with Neural Networks and Deep Learning methods to create adaptive Artificial Intelligence agents capable of solving decision-making tasks.

This course will introduce you to state-of-the-art Reinforcement Learning techniques. It will also prepare you for the next courses in this series, where we will explore other advanced methods that excel at other types of tasks.

The course is focused on developing practical skills. Therefore, after covering the most important concepts of each family of methods, we will implement one or more of its algorithms from scratch in Jupyter notebooks.


Leveling modules: 


- Refresher: The Markov decision process (MDP).

- Refresher: Q-Learning.

- Refresher: Brief introduction to Neural Networks.

- Refresher: Deep Q-Learning.

- Refresher: Policy gradient methods.



Advanced Reinforcement Learning:


- PyTorch Lightning.

- Hyperparameter tuning with Optuna.

- Deep Q-Learning for continuous action spaces (Normalized advantage function - NAF).

- Deep Deterministic Policy Gradient (DDPG).

- Twin Delayed DDPG (TD3).

- Soft Actor-Critic (SAC).

- Hindsight Experience Replay (HER).

What you will learn

Requirements

Course content

14 sections

Introduction

6 lectures
Introduction
04:51
Reinforcement Learning series
00:13
Google Colab
01:26
Where to begin
01:32
Complete code
00:07
Connect with me on social media
00:05

Refresher: The Markov Decision Process (MDP)

11 lectures
Module Overview
00:47
Elements common to all control tasks
05:44
The Markov decision process (MDP)
05:52
Types of Markov decision process
02:23
Trajectory vs episode
01:13
Reward vs Return
01:39
Discount factor
04:19
Policy
02:16
State values v(s) and action values q(s,a)
01:11
Bellman equations
03:17
Solving a Markov decision process
03:21
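
To make the Bellman backup concrete before this refresher, here is a minimal value-iteration sketch on a made-up two-state MDP; the transition table, rewards, and discount factor are illustrative and not taken from the course.

import numpy as np

# Toy 2-state, 2-action MDP: P[s][a] = list of (prob, next_state, reward)
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
gamma = 0.9
v = np.zeros(2)

# Repeatedly apply the Bellman optimality backup until the values stop changing.
for _ in range(1000):
    new_v = np.array([
        max(sum(p * (r + gamma * v[s2]) for p, s2, r in P[s][a]) for a in P[s])
        for s in P
    ])
    if np.max(np.abs(new_v - v)) < 1e-8:
        break
    v = new_v

print(v)  # optimal state values of the toy MDP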

Refresher: Q-Learning

5 lectures
Module overview
00:31
Temporal difference methods
03:16
Solving control tasks with temporal difference methods
03:58
Q-Learning
02:22
Advantages of temporal difference methods
00:56
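
As a quick preview of the update rule covered in this refresher, here is a minimal tabular Q-Learning sketch. It assumes the gymnasium package and its FrozenLake-v1 environment, which are illustrative choices rather than course requirements.

import numpy as np
import gymnasium as gym

# Tabular Q-Learning: Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
env = gym.make("FrozenLake-v1", is_slippery=False)
Q = np.zeros((env.observation_space.n, env.action_space.n))
alpha, gamma, epsilon = 0.1, 0.99, 0.1

for episode in range(2000):
    state, _ = env.reset()
    done = False
    while not done:
        # epsilon-greedy behaviour policy
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        # temporal-difference update toward the bootstrapped target
        td_target = reward + gamma * np.max(Q[next_state]) * (not terminated)
        Q[state, action] += alpha * (td_target - Q[state, action])
        state = next_state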

Refresher: Brief introduction to Neural Networks

7 lectures
Module overview
00:36
Function approximators
07:35
Artificial Neural Networks
03:32
Artificial Neurons
05:38
How to represent a Neural Network
06:44
Stochastic Gradient Descent
05:40
Neural Network optimization
04:01
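
For reference, a minimal PyTorch sketch of the ideas in this refresher: a small network used as a function approximator and fit with stochastic gradient descent to a synthetic regression target. The data and architecture are illustrative only.

import torch
from torch import nn

# A small multilayer perceptron fit to a toy regression target with plain SGD.
net = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.SGD(net.parameters(), lr=1e-2)

x = torch.linspace(-1, 1, 256).unsqueeze(1)
y = torch.sin(3 * x)  # synthetic target function

for step in range(1000):
    idx = torch.randint(0, len(x), (32,))        # random mini-batch
    loss = nn.functional.mse_loss(net(x[idx]), y[idx])
    optimizer.zero_grad()
    loss.backward()                              # backpropagation
    optimizer.step()                             # gradient descent step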

Refresher: Deep Q-Learning

4 lectures
Module overview
00:26
Deep Q-Learning
03:02
Experience Replay
01:58
Target Network
03:28
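
A minimal sketch of one Deep Q-Learning update, assuming a fake replay batch and a frozen target network; the shapes and network sizes are illustrative, not the course's implementation.

import torch
from torch import nn

# One update on a sampled replay batch, using a frozen target network for the bootstrap.
obs_dim, n_actions, gamma = 4, 2, 0.99
q_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

# Fake replay batch (states, actions, rewards, dones, next_states), just to show the shapes.
states = torch.randn(32, obs_dim)
actions = torch.randint(0, n_actions, (32, 1))
rewards = torch.randn(32, 1)
dones = torch.zeros(32, 1)
next_states = torch.randn(32, obs_dim)

q_values = q_net(states).gather(1, actions)                   # Q(s, a)
with torch.no_grad():
    next_q = target_net(next_states).max(dim=1, keepdim=True).values
    targets = rewards + gamma * next_q * (1 - dones)          # r + gamma * max_a' Q_target(s', a')

loss = nn.functional.smooth_l1_loss(q_values, targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()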

PyTorch Lightning

15 lectures
PyTorch Lightning
07:56
Link to the code notebook
00:05
Introduction to PyTorch Lightning
05:11
Create the Deep Q-Network
04:48
Create the policy
04:51
Create the replay buffer
05:33
Create the environment
07:02
Define the class for the Deep Q-Learning algorithm
11:56
Define the play_episode() function
04:59
Prepare the data loader and the optimizer
04:51
Define the train_step() method
09:02
Define the train_epoch_end() method
04:25
[Important] Lecture correction.
00:12
Train the Deep Q-Learning algorithm
06:11
Explore the resulting agent
03:08
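
For orientation, here is a stripped-down LightningModule skeleton showing where the training step, logging, and optimizer hooks used throughout this module live. The synthetic dataset stands in for the replay buffer, and a reasonably recent pytorch_lightning release is assumed.

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class ToyModule(pl.LightningModule):
    """Skeleton showing where the Deep Q-Learning pieces plug in."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = nn.functional.mse_loss(self.net(x), y)
        self.log("train_loss", loss)   # logged metrics are picked up by the trainer
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

# Synthetic data in place of the replay buffer used in the course.
dataset = TensorDataset(torch.randn(512, 4), torch.randn(512, 2))
trainer = pl.Trainer(max_epochs=2, logger=False, enable_checkpointing=False)
trainer.fit(ToyModule(), DataLoader(dataset, batch_size=32))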

Hyperparameter tuning with Optuna

6 lectures
Hyperparameter tuning with Optuna
08:37
Link to the code notebook
00:05
Log average return
04:40
Define the objective function
05:28
Create and launch the hyperparameter tuning job
02:55
Explore the best trial
02:40
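
A minimal Optuna sketch of the tuning pattern used in this module; the analytic objective is a stand-in that keeps the example runnable, where a real tuning job would train the agent and return its average return.

import optuna

def objective(trial):
    lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)
    gamma = trial.suggest_float("gamma", 0.9, 0.999)
    # Stand-in score: replace with the agent's average return in a real tuning job.
    return -((lr - 1e-3) ** 2) - (gamma - 0.99) ** 2

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_trial.params)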

Deep Q-Learning for continuous action spaces (Normalized Advantage Function)

19 lectures
Continuous action spaces
06:01
The advantage function
04:05
Normalized Advantage Function (NAF)
02:49
Normalized Advantage Function pseudocode
05:27
Link to the code notebook
00:05
Hyperbolic tangent
01:29
Creating the (NAF) Deep Q-Network 1
08:04
Creating the (NAF) Deep Q-Network 2
03:20
Creating the (NAF) Deep Q-Network 3
01:08
Creating the (NAF) Deep Q-Network 4
10:21
Creating the policy
05:31
Create the environment
04:46
Polyak averaging
01:19
Implementing Polyak averaging
02:14
Create the (NAF) Deep Q-Learning algorithm
08:47
Implement the training step
02:56
Implement the end-of-epoch logic
02:38
Debugging and launching the algorithm
03:19
Checking the resulting agent
02:47
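
One building block of this module, Polyak averaging, in a short sketch: after every update, each target-network parameter is nudged a small step toward the corresponding online parameter. The network sizes and tau are illustrative.

import torch
from torch import nn

def polyak_update(net, target_net, tau=0.01):
    """Move each target parameter a small step toward the online parameter."""
    with torch.no_grad():
        for param, target_param in zip(net.parameters(), target_net.parameters()):
            target_param.mul_(1 - tau).add_(tau * param)

# Example: a pair of identical networks whose target copy tracks the online copy slowly.
q_net = nn.Linear(4, 2)
target_q_net = nn.Linear(4, 2)
target_q_net.load_state_dict(q_net.state_dict())
polyak_update(q_net, target_q_net, tau=0.01)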

Refresher: Policy gradient methods

5 lectures
Policy gradient methods
04:16
Policy performance
02:16
Representing policies using neural networks
04:43
The policy gradient theorem
03:20
Entropy Regularization
05:39
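
A minimal sketch of one policy gradient step with entropy regularization on a fake batch; the returns tensor stands in for the discounted returns that would be computed from real rollouts.

import torch
from torch import nn
from torch.distributions import Categorical

# One REINFORCE-style update with an entropy bonus, on fake batch data.
obs_dim, n_actions, entropy_coef = 4, 2, 0.01
policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

states = torch.randn(32, obs_dim)      # fake batch of visited states
actions = torch.randint(0, n_actions, (32,))
returns = torch.randn(32)              # fake returns G_t (normally computed from rewards)

dist = Categorical(logits=policy(states))
log_probs = dist.log_prob(actions)

# Maximize E[log pi(a|s) * G] plus an entropy bonus, i.e. minimize the negatives.
loss = -(log_probs * returns).mean() - entropy_coef * dist.entropy().mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()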

Deep Deterministic Policy Gradient (DDPG)

13 lectures
The Brax Physics engine
03:24
Deep Deterministic Policy Gradient (DDPG)
08:51
DDPG pseudocode
03:31
Link to the code notebook
00:11
Deep Deterministic Policy Gradient (DDPG)
05:11
Create the gradient policy
09:40
Create the Deep Q-Network
05:01
Create the DDPG class
08:10
Define the play method
02:22
Setup the optimizers and dataloader
03:37
Define the training step
11:12
Launch the training process
05:35
Check the resulting agent
02:13
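
A minimal sketch of one DDPG update on a fake replay batch, with illustrative network sizes rather than the course's exact code: the critic regresses on the bootstrapped target, and the actor climbs the critic's value of its own deterministic actions.

import torch
from torch import nn

obs_dim, act_dim, gamma = 8, 2, 0.99

actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))
target_actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim), nn.Tanh())
target_critic = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))
target_actor.load_state_dict(actor.state_dict())
target_critic.load_state_dict(critic.state_dict())

actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

# Fake replay batch, just to show the shapes.
states = torch.randn(32, obs_dim)
actions = torch.rand(32, act_dim) * 2 - 1
rewards = torch.randn(32, 1)
dones = torch.zeros(32, 1)
next_states = torch.randn(32, obs_dim)

# Critic update: y = r + gamma * Q_target(s', mu_target(s'))
with torch.no_grad():
    next_actions = target_actor(next_states)
    target_q = target_critic(torch.cat([next_states, next_actions], dim=1))
    y = rewards + gamma * target_q * (1 - dones)
critic_loss = nn.functional.mse_loss(critic(torch.cat([states, actions], dim=1)), y)
critic_opt.zero_grad()
critic_loss.backward()
critic_opt.step()

# Actor update: maximize Q(s, mu(s)), i.e. minimize its negative.
actor_loss = -critic(torch.cat([states, actor(states)], dim=1)).mean()
actor_opt.zero_grad()
actor_loss.backward()
actor_opt.step()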

Twin Delayed DDPG (TD3)

8 lectures
Twin Delayed DDPG (TD3)
10:29
TD3 pseudocode
03:44
Link to code notebook
00:05
Twin Delayed DDPG (TD3)
02:54
Clipped double Q-Learning
04:23
Delayed policy updates
01:56
Target policy smoothing
04:35
Check the resulting agent
02:27
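
A short sketch of how TD3's bootstrapped target combines target policy smoothing with clipped double Q-Learning, computed on a fake batch with illustrative networks.

import torch
from torch import nn

obs_dim, act_dim, gamma = 8, 2, 0.99
policy_noise, noise_clip = 0.2, 0.5

target_actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim), nn.Tanh())
target_q1 = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))
target_q2 = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))

rewards = torch.randn(32, 1)
dones = torch.zeros(32, 1)
next_states = torch.randn(32, obs_dim)

with torch.no_grad():
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = (torch.randn(32, act_dim) * policy_noise).clamp(-noise_clip, noise_clip)
    next_actions = (target_actor(next_states) + noise).clamp(-1.0, 1.0)
    sa = torch.cat([next_states, next_actions], dim=1)
    # Clipped double Q: use the smaller of the two target critics to curb overestimation.
    next_q = torch.min(target_q1(sa), target_q2(sa))
    y = rewards + gamma * next_q * (1 - dones)
# Both online critics are then regressed toward y; the actor is updated only every few steps.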

Soft Actor-Critic (SAC)

9 lectures
Soft Actor-Critic (SAC)
06:46
SAC pseudocode
01:48
Link to code notebook
00:05
Create the robotics task
11:41
Create the Deep Q-Network
04:33
Create the gradient policy
13:21
Implement the Soft Actor-Critic algorithm - Part 1
09:04
Implement the Soft Actor-Critic algorithm - Part 2
12:13
Check the results
02:18
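
A sketch of SAC's soft target on a fake batch, assuming a squashed-Gaussian policy head that outputs a mean and a log standard deviation; shapes, coefficients, and networks are illustrative only.

import torch
from torch import nn
from torch.distributions import Normal

# The entropy bonus (-alpha * log pi) is added to the clipped double Q value of an action
# sampled from the current policy.
obs_dim, act_dim, gamma, alpha = 8, 2, 0.99, 0.2

policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, 2 * act_dim))
target_q1 = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))
target_q2 = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))

rewards = torch.randn(32, 1)
dones = torch.zeros(32, 1)
next_states = torch.randn(32, obs_dim)

with torch.no_grad():
    mean, log_std = policy(next_states).chunk(2, dim=1)
    dist = Normal(mean, log_std.clamp(-20, 2).exp())
    pre_tanh = dist.rsample()
    next_actions = torch.tanh(pre_tanh)
    # Log-probability with the tanh-squashing correction.
    log_prob = (dist.log_prob(pre_tanh) - torch.log(1 - next_actions.pow(2) + 1e-6)).sum(dim=1, keepdim=True)
    sa = torch.cat([next_states, next_actions], dim=1)
    next_q = torch.min(target_q1(sa), target_q2(sa))
    y = rewards + gamma * (1 - dones) * (next_q - alpha * log_prob)
# Both online critics regress toward y; the actor maximizes Q(s, a) - alpha * log pi(a|s).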

Hindsight Experience Replay

6 lectures
Hindsight Experience Replay (HER)
03:58
Link to code notebook
00:05
Implement Hindsight Experience Replay (HER) - Part 1
06:14
Implement Hindsight Experience Replay (HER) - Part 2
02:58
Implement Hindsight Experience Replay (HER) - Part 3
11:35
Check the results
01:10
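
A sketch of the "final" goal-relabeling strategy at the heart of HER; the episode layout and the sparse reward are illustrative assumptions, not the course's data structures.

import numpy as np

# Replay each transition again as if the goal had been the state actually reached at the
# end of the episode, turning failed episodes into informative successes.
def her_relabel(episode, compute_reward):
    """episode: list of (obs, achieved_goal, desired_goal, action, next_obs, next_achieved_goal)."""
    final_goal = episode[-1][5]                      # achieved goal at the end of the episode
    relabeled = []
    for obs, achieved, desired, action, next_obs, next_achieved in episode:
        reward = compute_reward(next_achieved, final_goal)
        relabeled.append((obs, achieved, final_goal, action, next_obs, next_achieved, reward))
    return relabeled

# Sparse reward typical of goal-conditioned robotics tasks: 0 if the goal is reached, else -1.
def sparse_reward(achieved_goal, goal, tol=0.05):
    return 0.0 if np.linalg.norm(achieved_goal - goal) < tol else -1.0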

Final steps

2 lectures
Next steps
01:50
Next steps
00:06
