Description

This is the most complete Reinforcement Learning course on Udemy. In it you will learn the basics of Reinforcement Learning, one of the three main paradigms of machine learning, alongside supervised and unsupervised learning. You will implement, from scratch, adaptive algorithms that learn to solve control tasks from experience. You will also learn to combine these algorithms with Deep Learning techniques and neural networks, giving rise to the branch known as Deep Reinforcement Learning.


This course will give you the foundation you need to understand new algorithms as they emerge. It will also prepare you for the next courses in this series, which go much deeper into different branches of Reinforcement Learning and cover some of the more advanced algorithms.


The course focuses on developing practical skills. After learning the most important concepts of each family of methods, we will implement one or more of its algorithms from scratch in Jupyter notebooks.


This course is divided into three parts and covers the following topics:


Part 1 (Tabular methods):


- Markov decision process


- Dynamic programming


- Monte Carlo methods


- Temporal difference methods (SARSA, Q-Learning)


- N-step bootstrapping


Part 2 (Continuous state spaces):


- State aggregation


- Tile Coding


Part 3 (Deep Reinforcement Learning):


- Deep SARSA


- Deep Q-Learning


- REINFORCE


- Advantage Actor-Critic (A2C)


What you will learn

Requirements

Course content

13 sections

Welcome module

6 lectures
[IMPORTANT] English captions available for sections 1-4
00:06
Welcome
07:04
Reinforcement Learning series
00:14
Course structure
02:00
Environment setup [Important]
00:45
Connect with us on social media
00:05

The Markov decision process (MDP)

13 lectures
Elements common to all control tasks
05:44
The Markov decision process (MDP)
05:52
Types of Markov decision process
02:23
Trajectory vs episode
01:13
Reward vs Return
01:39
Discount factor
04:19
Policy
02:16
State values v(s) and action values q(s,a)
01:11
Bellman equations
03:17
Solving a Markov decision process
03:21
Setup - MDP in code
00:40
MDP in code - Part 1
13:04
MDP in code - Part 2
13:08

Dynamic Programming

18 lectures
Introduction to Dynamic Programming
05:19
Value iteration
04:00
Setup - Value iteration
00:27
Coding - Value iteration 1
04:11
Coding - Value iteration 2
05:27
Coding - Value iteration 3
01:16
Coding - Value iteration 4
07:40
Coding - Value iteration 5
03:09
Policy iteration
02:19
Policy evaluation
02:11
Setup - Policy iteration
00:27
Coding - Policy iteration 1
05:09
Coding - Policy iteration 2
08:22
Policy Improvement
02:56
Coding - Policy iteration 3
06:33
Coding - Policy iteration 4
06:12
Policy iteration in practice
02:08
Generalized Policy Iteration (GPI)
02:17

Monte Carlo methods

14 lectures
Monte Carlo methods
03:09
Solving control tasks with Monte Carlo methods
06:56
On-policy Monte Carlo control
04:33
Setup - On-policy Monte Carlo control
00:28
Coding - On-policy Monte Carlo control 1
10:05
Coding - On-policy Monte Carlo control 2
10:20
Coding - On-policy Monte Carlo control 3
02:51
Setup - Constant alpha Monte Carlo
00:28
Coding - Constant alpha Monte Carlo
04:26
Off-policy Monte Carlo control
07:08
Setup - Off-policy Monte Carlo control
00:27
Coding - Off-policy Monte Carlo 1
11:32
Coding - Off-policy Monte Carlo 2
12:36
Coding - Off-policy Monte Carlo 3
03:13

Temporal difference methods

12 lectures
Temporal difference methods
03:16
Solving control tasks with temporal difference methods
03:58
Monte Carlo vs temporal difference methods
01:23
SARSA
03:53
Setup - SARSA
00:27
Coding - SARSA 1
05:18
Coding - SARSA 2
08:39
Q-Learning
02:22
Setup - Q-Learning
00:27
Coding - Q-Learning 1
05:09
Coding - Q-Learning 2
09:08
Advantages of temporal difference methods
00:56

N-step bootstrapping

7 lectures
N-step temporal difference methods
03:30
Where do n-step methods fit?
02:53
Effect of changing n
04:36
N-step SARSA
03:00
N-step SARSA in action
01:55
Setup - n-step SARSA
00:27
Coding - n-step SARSA
16:15

Continuous state spaces

12 lectures
Setup - Classic control tasks
00:27
Coding - Classic control tasks
10:53
Working with continuous state spaces
03:02
State aggregation
04:08
Setup - Continuous state spaces
00:28
Coding - State aggregation 1
20:29
Coding - State aggregation 2
03:04
Coding - State aggregation 3
03:44
Tile coding
05:14
Coding - Tile coding 1
21:28
Coding - Tile coding 2
07:35
Coding - Tile coding 3
03:03

Brief introduction to neural networks

6 lectures
Function approximators
07:35
Artificial Neural Networks
03:32
Artificial Neurons
05:38
How to represent a Neural Network
06:44
Stochastic Gradient Descent
05:40
Neural Network optimization
04:01

Deep SARSA

16 lectures
Deep SARSA
02:39
Neural Network optimization (Deep Q-Network)
02:41
Experience Replay
01:58
Target Network
03:28
Setup - Deep SARSA
00:27
Coding - Deep SARSA 1
07:40
Coding - Deep SARSA 2
13:47
Coding - Deep SARSA 3
04:09
Coding - Deep SARSA 4
01:51
Coding - Deep SARSA 4 (Addendum)
00:26
Coding - Deep SARSA 5
02:08
Coding - Deep SARSA 6
05:42
Coding - Deep SARSA 7
07:15
Coding - Deep SARSA 8
06:42
Coding - Deep SARSA 9
11:49
Coding - Deep SARSA 10
05:21

Deep Q-Learning

5 lectures
Deep Q-Learning
03:02
Setup - Deep Q-Learning
00:27
Coding - Deep Q-Learning 1
09:43
Coding - Deep Q-Learning 2
06:06
Coding - Deep Q-Learning 3
10:15

REINFORCE

14 lectures
Policy gradient methods
04:16
Representing policies using neural networks
04:43
Policy performance
02:16
The policy gradient theorem
03:20
REINFORCE
03:38
Parallel learning
03:06
Entropy regularization
05:39
REINFORCE 2
02:03
Setup - REINFORCE
00:45
Coding - REINFORCE 1
08:10
Coding - REINFORCE 2
13:12
Coding - REINFORCE 3
07:56
Coding - REINFORCE 4
11:15
Coding - REINFORCE 5
14:37

Advantage Actor-Critic (A2C)

6 lectures
A2C
10:49
Setup - A2C
00:45
Coding - A2C 1
05:20
Coding - A2C 2
04:29
Coding - A2C 3
05:49
Coding - A2C 4
11:30

Outro

3 lectures
Looking back
02:47
Next steps
01:29
Next steps
00:07


