Description

This is the most complete Advanced Reinforcement Learning course on Udemy. In it, you will learn to implement some of the most powerful Deep Reinforcement Learning algorithms in Python using PyTorch and PyTorch Lightning. You will implement, from scratch, adaptive algorithms that solve control tasks based on experience. You will learn to combine these techniques with Neural Networks and Deep Learning methods to create adaptive Artificial Intelligence agents capable of solving decision-making tasks.

This course will introduce you to state-of-the-art Reinforcement Learning techniques. It will also prepare you for the next courses in this series, where we will explore other advanced methods that excel at other types of tasks.

The course is focused on developing practical skills. Therefore, after covering the most important concepts of each family of methods, we will implement one or more of its algorithms from scratch in Jupyter notebooks.


Leveling modules: 


- Refresher: The Markov decision process (MDP).

- Refresher: Q-Learning.

- Refresher: Brief introduction to Neural Networks.

- Refresher: Deep Q-Learning.

- Refresher: Policy gradient methods.



Advanced Reinforcement Learning:


- PyTorch Lightning.

- Hyperparameter tuning with Optuna.

- Deep Q-Learning for continuous action spaces (Normalized advantage function - NAF).

- Deep Deterministic Policy Gradient (DDPG).

- Twin Delayed DDPG (TD3).

- Soft Actor-Critic (SAC).

- Hindsight Experience Replay (HER).

What you will learn

Requirements

Course content

14 sections

Introduction

6 lectures
Introduction
04:51
Reinforcement Learning series
00:13
Google Colab
01:26
Where to begin
01:32
Complete code
00:07
Connect with me on social media
00:05

Refresher: The Markov Decision Process (MDP)

11 lectures
Module Overview
00:47
Elements common to all control tasks
05:44
The Markov decision process (MDP)
05:52
Types of Markov decision process
02:23
Trajectory vs episode
01:13
Reward vs Return
01:39
Discount factor
04:19
Policy
02:16
State values v(s) and action values q(s,a)
01:11
Bellman equations
03:17
Solving a Markov decision process
03:21
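
To make the Bellman backup concrete before this refresher, here is a minimal value-iteration sketch on a made-up two-state MDP; the transition table, rewards, and discount factor are illustrative and not taken from the course.

import numpy as np

# Toy 2-state, 2-action MDP: P[s][a] = list of (prob, next_state, reward)
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
gamma = 0.9
v = np.zeros(2)

# Repeatedly apply the Bellman optimality backup until the values stop changing.
for _ in range(1000):
    new_v = np.array([
        max(sum(p * (r + gamma * v[s2]) for p, s2, r in P[s][a]) for a in P[s])
        for s in P
    ])
    if np.max(np.abs(new_v - v)) < 1e-8:
        break
    v = new_v

print(v)  # optimal state values of the toy MDP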

Refresher: Q-Learning

5 lectures
Module overview
00:31
Temporal difference methods
03:16
Solving control tasks with temporal difference methods
03:58
Q-Learning
02:22
Advantages of temporal difference methods
00:56
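
As a quick preview of the update rule covered in this refresher, here is a minimal tabular Q-Learning sketch. It assumes the gymnasium package and its FrozenLake-v1 environment, which are illustrative choices rather than course requirements.

import numpy as np
import gymnasium as gym

# Tabular Q-Learning: Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
env = gym.make("FrozenLake-v1", is_slippery=False)
Q = np.zeros((env.observation_space.n, env.action_space.n))
alpha, gamma, epsilon = 0.1, 0.99, 0.1

for episode in range(2000):
    state, _ = env.reset()
    done = False
    while not done:
        # epsilon-greedy behaviour policy
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        # temporal-difference update toward the bootstrapped target
        td_target = reward + gamma * np.max(Q[next_state]) * (not terminated)
        Q[state, action] += alpha * (td_target - Q[state, action])
        state = next_state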

Refresher: Brief introduction to Neural Networks

7 lectures
Module overview
00:36
Function approximators
07:35
Artificial Neural Networks
03:32
Artificial Neurons
05:38
How to represent a Neural Network
06:44
Stochastic Gradient Descent
05:40
Neural Network optimization
04:01
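
For reference, a minimal PyTorch sketch of the ideas in this refresher: a small network used as a function approximator and fit with stochastic gradient descent to a synthetic regression target. The data and architecture are illustrative only.

import torch
from torch import nn

# A small multilayer perceptron fit to a toy regression target with plain SGD.
net = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.SGD(net.parameters(), lr=1e-2)

x = torch.linspace(-1, 1, 256).unsqueeze(1)
y = torch.sin(3 * x)  # synthetic target function

for step in range(1000):
    idx = torch.randint(0, len(x), (32,))        # random mini-batch
    loss = nn.functional.mse_loss(net(x[idx]), y[idx])
    optimizer.zero_grad()
    loss.backward()                              # backpropagation
    optimizer.step()                             # gradient descent step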

Refresher: Deep Q-Learning

4 lectures
Module overview
00:26
Deep Q-Learning
03:02
Experience Replay
01:58
Target Network
03:28
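
A minimal sketch of one Deep Q-Learning update, assuming a fake replay batch and a frozen target network; the shapes and network sizes are illustrative, not the course's implementation.

import torch
from torch import nn

# One update on a sampled replay batch, using a frozen target network for the bootstrap.
obs_dim, n_actions, gamma = 4, 2, 0.99
q_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

# Fake replay batch (states, actions, rewards, dones, next_states), just to show the shapes.
states = torch.randn(32, obs_dim)
actions = torch.randint(0, n_actions, (32, 1))
rewards = torch.randn(32, 1)
dones = torch.zeros(32, 1)
next_states = torch.randn(32, obs_dim)

q_values = q_net(states).gather(1, actions)                   # Q(s, a)
with torch.no_grad():
    next_q = target_net(next_states).max(dim=1, keepdim=True).values
    targets = rewards + gamma * next_q * (1 - dones)          # r + gamma * max_a' Q_target(s', a')

loss = nn.functional.smooth_l1_loss(q_values, targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()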

PyTorch Lightning

15 lectures
PyTorch Lightning
07:56
Link to the code notebook
00:05
Introduction to PyTorch Lightning
05:11
Create the Deep Q-Network
04:48
Create the policy
04:51
Create the replay buffer
05:33
Create the environment
07:02
Define the class for the Deep Q-Learning algorithm
11:56
Define the play_episode() function
04:59
Prepare the data loader and the optimizer
04:51
Define the train_step() method
09:02
Define the train_epoch_end() method
04:25
[Important] Lecture correction.
00:12
Train the Deep Q-Learning algorithm
06:11
Explore the resulting agent
03:08
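
For orientation, here is a stripped-down LightningModule skeleton showing where the training step, logging, and optimizer hooks used throughout this module live. The synthetic dataset stands in for the replay buffer, and a reasonably recent pytorch_lightning release is assumed.

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class ToyModule(pl.LightningModule):
    """Skeleton showing where the Deep Q-Learning pieces plug in."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = nn.functional.mse_loss(self.net(x), y)
        self.log("train_loss", loss)   # logged metrics are picked up by the trainer
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

# Synthetic data in place of the replay buffer used in the course.
dataset = TensorDataset(torch.randn(512, 4), torch.randn(512, 2))
trainer = pl.Trainer(max_epochs=2, logger=False, enable_checkpointing=False)
trainer.fit(ToyModule(), DataLoader(dataset, batch_size=32))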

Hyperparameter tuning with Optuna

6 lectures
Hyperparameter tuning with Optuna
08:37
Link to the code notebook
00:05
Log average return
04:40
Define the objective function
05:28
Create and launch the hyperparameter tuning job
02:55
Explore the best trial
02:40
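
A minimal Optuna sketch of the tuning pattern used in this module; the analytic objective is a stand-in that keeps the example runnable, where a real tuning job would train the agent and return its average return.

import optuna

def objective(trial):
    lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)
    gamma = trial.suggest_float("gamma", 0.9, 0.999)
    # Stand-in score: replace with the agent's average return in a real tuning job.
    return -((lr - 1e-3) ** 2) - (gamma - 0.99) ** 2

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_trial.params)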

Deep Q-Learning for continuous action spaces (Normalized Advantage Function)

19 lectures
Continuous action spaces
06:01
The advantage function
04:05
Normalized Advantage Function (NAF)
02:49
Normalized Advantage Function pseudocode
05:27
Link to the code notebook
00:05
Hyperbolic tangent
01:29
Creating the (NAF) Deep Q-Network 1
08:04
Creating the (NAF) Deep Q-Network 2
03:20
Creating the (NAF) Deep Q-Network 3
01:08
Creating the (NAF) Deep Q-Network 4
10:21
Creating the policy
05:31
Create the environment
04:46
Polyak averaging
01:19
Implementing Polyak averaging
02:14
Create the (NAF) Deep Q-Learning algorithm
08:47
Implement the training step
02:56
Implement the end-of-epoch logic
02:38
Debugging and launching the algorithm
03:19
Checking the resulting agent
02:47
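
One building block of this module, Polyak averaging, in a short sketch: after every update, each target-network parameter is nudged a small step toward the corresponding online parameter. The network sizes and tau are illustrative.

import torch
from torch import nn

def polyak_update(net, target_net, tau=0.01):
    """Move each target parameter a small step toward the online parameter."""
    with torch.no_grad():
        for param, target_param in zip(net.parameters(), target_net.parameters()):
            target_param.mul_(1 - tau).add_(tau * param)

# Example: a pair of identical networks whose target copy tracks the online copy slowly.
q_net = nn.Linear(4, 2)
target_q_net = nn.Linear(4, 2)
target_q_net.load_state_dict(q_net.state_dict())
polyak_update(q_net, target_q_net, tau=0.01)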

Refresher: Policy gradient methods

5 lectures
Policy gradient methods
04:16
Policy performance
02:16
Representing policies using neural networks
04:43
The policy gradient theorem
03:20
Entropy Regularization
05:39
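
A minimal sketch of one policy gradient step with entropy regularization on a fake batch; the returns tensor stands in for the discounted returns that would be computed from real rollouts.

import torch
from torch import nn
from torch.distributions import Categorical

# One REINFORCE-style update with an entropy bonus, on fake batch data.
obs_dim, n_actions, entropy_coef = 4, 2, 0.01
policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

states = torch.randn(32, obs_dim)      # fake batch of visited states
actions = torch.randint(0, n_actions, (32,))
returns = torch.randn(32)              # fake returns G_t (normally computed from rewards)

dist = Categorical(logits=policy(states))
log_probs = dist.log_prob(actions)

# Maximize E[log pi(a|s) * G] plus an entropy bonus, i.e. minimize the negatives.
loss = -(log_probs * returns).mean() - entropy_coef * dist.entropy().mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()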

Deep Deterministic Policy Gradient (DDPG)

13 lectures
The Brax Physics engine
03:24
Deep Deterministic Policy Gradient (DDPG)
08:51
DDPG pseudocode
03:31
Link to the code notebook
00:11
Deep Deterministic Policy Gradient (DDPG)
05:11
Create the gradient policy
09:40
Create the Deep Q-Network
05:01
Create the DDPG class
08:10
Define the play method
02:22
Setup the optimizers and dataloader
03:37
Define the training step
11:12
Launch the training process
05:35
Check the resulting agent
02:13
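
A minimal sketch of one DDPG update on a fake replay batch, with illustrative network sizes rather than the course's exact code: the critic regresses on the bootstrapped target, and the actor climbs the critic's value of its own deterministic actions.

import torch
from torch import nn

obs_dim, act_dim, gamma = 8, 2, 0.99

actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))
target_actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim), nn.Tanh())
target_critic = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))
target_actor.load_state_dict(actor.state_dict())
target_critic.load_state_dict(critic.state_dict())

actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

# Fake replay batch, just to show the shapes.
states = torch.randn(32, obs_dim)
actions = torch.rand(32, act_dim) * 2 - 1
rewards = torch.randn(32, 1)
dones = torch.zeros(32, 1)
next_states = torch.randn(32, obs_dim)

# Critic update: y = r + gamma * Q_target(s', mu_target(s'))
with torch.no_grad():
    next_actions = target_actor(next_states)
    target_q = target_critic(torch.cat([next_states, next_actions], dim=1))
    y = rewards + gamma * target_q * (1 - dones)
critic_loss = nn.functional.mse_loss(critic(torch.cat([states, actions], dim=1)), y)
critic_opt.zero_grad()
critic_loss.backward()
critic_opt.step()

# Actor update: maximize Q(s, mu(s)), i.e. minimize its negative.
actor_loss = -critic(torch.cat([states, actor(states)], dim=1)).mean()
actor_opt.zero_grad()
actor_loss.backward()
actor_opt.step()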

Twin Delayed DDPG (TD3)

8 lectures
Twin Delayed DDPG (TD3)
10:29
TD3 pseudocode
03:44
Link to code notebook
00:05
Twin Delayed DDPG (TD3)
02:54
Clipped double Q-Learning
04:23
Delayed policy updates
01:56
Target policy smoothing
04:35
Check the resulting agent
02:27
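
A short sketch of how TD3's bootstrapped target combines target policy smoothing with clipped double Q-Learning, computed on a fake batch with illustrative networks.

import torch
from torch import nn

obs_dim, act_dim, gamma = 8, 2, 0.99
policy_noise, noise_clip = 0.2, 0.5

target_actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim), nn.Tanh())
target_q1 = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))
target_q2 = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))

rewards = torch.randn(32, 1)
dones = torch.zeros(32, 1)
next_states = torch.randn(32, obs_dim)

with torch.no_grad():
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = (torch.randn(32, act_dim) * policy_noise).clamp(-noise_clip, noise_clip)
    next_actions = (target_actor(next_states) + noise).clamp(-1.0, 1.0)
    sa = torch.cat([next_states, next_actions], dim=1)
    # Clipped double Q: use the smaller of the two target critics to curb overestimation.
    next_q = torch.min(target_q1(sa), target_q2(sa))
    y = rewards + gamma * next_q * (1 - dones)
# Both online critics are then regressed toward y; the actor is updated only every few steps.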

Soft Actor-Critic (SAC)

9 lectures
Soft Actor-Critic (SAC)
06:46
SAC pseudocode
01:48
Link to code notebook
00:05
Create the robotics task
11:41
Create the Deep Q-Network
04:33
Create the gradient policy
13:21
Implement the Soft Actor-Critic algorithm - Part 1
09:04
Implement the Soft Actor-Critic algorithm - Part 2
12:13
Check the results
02:18
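
A sketch of SAC's soft target on a fake batch, assuming a squashed-Gaussian policy head that outputs a mean and a log standard deviation; shapes, coefficients, and networks are illustrative only.

import torch
from torch import nn
from torch.distributions import Normal

# The entropy bonus (-alpha * log pi) is added to the clipped double Q value of an action
# sampled from the current policy.
obs_dim, act_dim, gamma, alpha = 8, 2, 0.99, 0.2

policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, 2 * act_dim))
target_q1 = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))
target_q2 = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))

rewards = torch.randn(32, 1)
dones = torch.zeros(32, 1)
next_states = torch.randn(32, obs_dim)

with torch.no_grad():
    mean, log_std = policy(next_states).chunk(2, dim=1)
    dist = Normal(mean, log_std.clamp(-20, 2).exp())
    pre_tanh = dist.rsample()
    next_actions = torch.tanh(pre_tanh)
    # Log-probability with the tanh-squashing correction.
    log_prob = (dist.log_prob(pre_tanh) - torch.log(1 - next_actions.pow(2) + 1e-6)).sum(dim=1, keepdim=True)
    sa = torch.cat([next_states, next_actions], dim=1)
    next_q = torch.min(target_q1(sa), target_q2(sa))
    y = rewards + gamma * (1 - dones) * (next_q - alpha * log_prob)
# Both online critics regress toward y; the actor maximizes Q(s, a) - alpha * log pi(a|s).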

Hindsight Experience Replay

6 lectures
Hindsight Experience Replay (HER)
03:58
Link to code notebook
00:05
Implement Hindsight Experience Replay (HER) - Part 1
06:14
Implement Hindsight Experience Replay (HER) - Part 2
02:58
Implement Hindsight Experience Replay (HER) - Part 3
11:35
Check the results
01:10
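
A sketch of the "final" goal-relabeling strategy at the heart of HER; the episode layout and the sparse reward are illustrative assumptions, not the course's data structures.

import numpy as np

# Replay each transition again as if the goal had been the state actually reached at the
# end of the episode, turning failed episodes into informative successes.
def her_relabel(episode, compute_reward):
    """episode: list of (obs, achieved_goal, desired_goal, action, next_obs, next_achieved_goal)."""
    final_goal = episode[-1][5]                      # achieved goal at the end of the episode
    relabeled = []
    for obs, achieved, desired, action, next_obs, next_achieved in episode:
        reward = compute_reward(next_achieved, final_goal)
        relabeled.append((obs, achieved, final_goal, action, next_obs, next_achieved, reward))
    return relabeled

# Sparse reward typical of goal-conditioned robotics tasks: 0 if the goal is reached, else -1.
def sparse_reward(achieved_goal, goal, tol=0.05):
    return 0.0 if np.linalg.norm(achieved_goal - goal) < tol else -1.0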

Final steps

2 lectures
Next steps
01:50
Next steps
00:06
