Description

This is the most complete Reinforcement Learning course series on Udemy. In it, you will learn to implement some of the most powerful Deep Reinforcement Learning algorithms in Python using PyTorch and PyTorch Lightning. You will implement, from scratch, adaptive algorithms that solve control tasks based on experience. You will learn to combine these techniques with Neural Networks and Deep Learning methods to create adaptive Artificial Intelligence agents capable of solving decision-making tasks.

This course will introduce you to the state of the art in Reinforcement Learning techniques. It will also prepare you for the next courses in this series, where we will explore other advanced methods that excel in other types of tasks.

The course is focused on developing practical skills. Therefore, after learning the most important concepts of each family of methods, we will implement one or more of its algorithms in Jupyter notebooks, from scratch.


Leveling modules: 


- Refresher: The Markov decision process (MDP).

- Refresher: Monte Carlo methods.

- Refresher: Temporal difference methods.

- Refresher: N-step bootstrapping.

- Refresher: Brief introduction to Neural Networks.

- Refresher: Policy gradient methods.



Advanced Reinforcement Learning:


- REINFORCE

- REINFORCE for continuous action spaces

- Advantage actor-critic (A2C)

- Trust region methods

- Proximal policy optimization (PPO)

- Generalized advantage estimation (GAE)

- Trust region policy optimization (TRPO)
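As a preview of the kind of code written throughout the course, here is a minimal pure-Python sketch of the discounted-return computation that underlies REINFORCE and the other algorithms above (function names are illustrative, not taken from the course notebooks; the course implements this with PyTorch tensors):

```python
# Minimal sketch of the discounted return G_t = sum_k gamma^k * r_{t+k},
# the quantity REINFORCE uses to weight each log-probability gradient.

def discounted_returns(rewards, gamma=0.99):
    """Compute the return G_t for every timestep of one episode."""
    returns = []
    g = 0.0
    for r in reversed(rewards):  # accumulate from the end of the episode
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns

print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))  # [1.75, 1.5, 1.0]
```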

What you will learn

Master some of the most advanced Reinforcement Learning algorithms.

Learn how to create AIs that can act in a complex environment to achieve their goals.

Create advanced Reinforcement Learning agents from scratch using Python's most popular tools (PyTorch Lightning, OpenAI Gym, Optuna).

Learn how to perform hyperparameter tuning (choosing the best experimental conditions for our AI to learn).

Fundamentally understand the learning process for each algorithm.

Debug and extend the algorithms presented.

Understand and implement new algorithms from research papers.

Requirements

  • Be comfortable programming in Python
  • Have completed our course "Reinforcement Learning beginner to master", be familiar with the basics of Reinforcement Learning, or watch the leveling sections included in this course
  • Know basic statistics (mean, variance, normal distribution)

Course content

15 sections

Introduction

6 lectures
Introduction
06:07
Reinforcement Learning series
00:14
Google Colab
01:26
Where to begin
00:57
Complete code
00:03
Connect with me on social media
00:05

Refresher: The Markov Decision Process (MDP)

10 lectures
Elements common to all control tasks
05:44
The Markov decision process (MDP)
05:52
Types of Markov decision process
02:23
Trajectory vs episode
01:13
Reward vs Return
01:39
Discount factor
04:19
Policy
02:16
State values v(s) and action values q(s,a)
01:11
Bellman equations
03:17
Solving a Markov decision process
03:21
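The MDP refresher ends with solving a Markov decision process. As a rough sketch of what "solving" means, here is value iteration on a tiny hypothetical 2-state MDP (the example MDP and function names are illustrative only, not from the course):

```python
# Solving the Bellman optimality equation
#   v(s) = max_a sum_s' p(s'|s,a) * (r + gamma * v(s'))
# by value iteration on a toy MDP.

def value_iteration(num_states, transitions, gamma=0.9, tol=1e-8):
    """transitions[s][a] = list of (prob, next_state, reward)."""
    v = [0.0] * num_states
    while True:
        delta = 0.0
        for s in range(num_states):
            best = max(
                sum(p * (r + gamma * v[s2]) for p, s2, r in outcomes)
                for outcomes in transitions[s].values()
            )
            delta = max(delta, abs(best - v[s]))
            v[s] = best
        if delta < tol:
            return v

# Two states: from state 0, "go" reaches state 1 with reward 1;
# state 1 is absorbing with reward 0.
mdp = [
    {"go": [(1.0, 1, 1.0)], "stay": [(1.0, 0, 0.0)]},
    {"stay": [(1.0, 1, 0.0)]},
]
print(value_iteration(2, mdp))  # approximately [1.0, 0.0]
```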

Refresher: Monte Carlo methods

3 lectures
Monte Carlo methods
03:09
Solving control tasks with Monte Carlo methods
06:56
On-policy Monte Carlo control
04:33
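The core idea of this refresher, estimating values by averaging sampled returns, can be sketched in a few lines of plain Python (a first-visit variant; names are illustrative, not from the course notebooks):

```python
# First-visit Monte Carlo value estimation: v(s) is the average of the
# returns observed after the first visit to s in each episode.

def mc_value_estimate(episodes, gamma=1.0):
    """episodes: list of [(state, reward), ...] trajectories."""
    totals, counts = {}, {}
    for episode in episodes:
        g = 0.0
        returns = []
        for state, reward in reversed(episode):
            g = reward + gamma * g
            returns.append((state, g))
        returns.reverse()
        seen = set()
        for state, g in returns:  # first visit to each state only
            if state not in seen:
                seen.add(state)
                totals[state] = totals.get(state, 0.0) + g
                counts[state] = counts.get(state, 0) + 1
    return {s: totals[s] / counts[s] for s in totals}

episodes = [[("A", 0.0), ("B", 1.0)], [("A", 0.0), ("B", 3.0)]]
print(mc_value_estimate(episodes))  # {'A': 2.0, 'B': 2.0}
```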

Refresher: Temporal difference methods

6 lectures
Temporal difference methods
03:16
Solving control tasks with temporal difference methods
03:58
Monte Carlo vs temporal difference methods
01:23
SARSA
03:53
Q-Learning
02:22
Advantages of temporal difference methods
00:56
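The Q-Learning lecture in this refresher centers on a single update rule, which can be sketched as follows (the dict-of-dicts Q-table layout is an illustrative choice, not the course's):

```python
# The tabular Q-learning update:
#   Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))

def q_learning_update(q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One off-policy TD update on a dict-of-dicts Q-table."""
    best_next = max(q[s_next].values()) if q.get(s_next) else 0.0
    td_target = r + gamma * best_next
    q[s][a] += alpha * (td_target - q[s][a])
    return q[s][a]

q = {"s0": {"left": 0.0, "right": 0.0}, "s1": {"left": 1.0, "right": 2.0}}
print(q_learning_update(q, "s0", "right", 0.5, "s1", alpha=0.5, gamma=1.0))  # 1.25
```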

Refresher: N-step bootstrapping

3 lectures
N-step temporal difference methods
03:30
Where do n-step methods fit?
02:53
Effect of changing n
04:36
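The n-step return this section studies, truncate after n rewards and bootstrap with the value estimate, is a short recursion (sketch only; names are illustrative):

```python
# The n-step return:
#   G_t^(n) = r_t + gamma*r_{t+1} + ... + gamma^{n-1}*r_{t+n-1} + gamma^n * v(s_{t+n})

def n_step_return(rewards, bootstrap_value, gamma=0.99):
    """rewards: the next n rewards; bootstrap_value: v(s_{t+n})."""
    g = bootstrap_value
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# n = 2 with gamma = 0.5:  1 + 0.5*1 + 0.25*4 = 2.5
print(n_step_return([1.0, 1.0], 4.0, gamma=0.5))  # 2.5
```

Changing n trades bias for variance: n = 1 recovers the one-step TD target, while a large n approaches the Monte Carlo return.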

Refresher: Brief introduction to Neural Networks

6 lectures
Function approximators
07:35
Artificial Neural Networks
03:32
Artificial Neurons
05:38
How to represent a Neural Network
06:44
Stochastic Gradient Descent
05:40
Neural Network optimization
04:01
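The optimization idea behind this refresher, repeatedly stepping against the gradient, fits in a few lines. Here is plain gradient descent on a one-dimensional quadratic (a deliberately tiny stand-in for the stochastic, high-dimensional case the course covers):

```python
# Gradient descent on f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
# SGD works the same way but uses a noisy gradient estimate from a minibatch.

def gradient_descent(grad, w0=0.0, lr=0.1, steps=100):
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)  # step against the gradient
    return w

w = gradient_descent(lambda w: 2.0 * (w - 3.0))
print(round(w, 4))  # converges toward the minimum at w = 3
```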

Refresher: REINFORCE

8 lectures
Policy gradient methods
04:16
Representing policies using neural networks
04:43
Policy performance
02:16
The policy gradient theorem
03:20
REINFORCE
03:38
Parallel learning
03:06
Entropy regularization
05:39
REINFORCE 2
02:03
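The loss this refresher builds up to, the log-probability of the chosen action weighted by the return, plus an entropy bonus, can be sketched for a single step (pure Python for clarity; the course constructs the same quantity with PyTorch so autograd can differentiate it):

```python
import math

# REINFORCE objective with entropy regularization (single-step sketch):
#   loss = -log pi(a|s) * G  -  beta * H(pi)
# The entropy term discourages the policy from collapsing prematurely.

def reinforce_loss(action_probs, chosen_action, episode_return, beta=0.01):
    log_prob = math.log(action_probs[chosen_action])
    entropy = -sum(p * math.log(p) for p in action_probs if p > 0)
    return -log_prob * episode_return - beta * entropy

print(reinforce_loss([0.5, 0.5], chosen_action=0, episode_return=2.0))
```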

PyTorch Lightning

8 lectures
PyTorch Lightning
07:56
Link to the code notebook
00:02
Create the policy
13:37
Create the environment
09:31
Create the dataset
14:02
Create the REINFORCE algorithm - Part 1
06:46
Create the REINFORCE algorithm - Part 2
10:45
Check the resulting agent
05:57

REINFORCE for continuous control tasks

8 lectures
REINFORCE for continuous action spaces
04:55
Link to the code notebook
00:02
Create the policy
09:47
Create the inverted pendulum environment
08:46
Create the dataset
08:59
Creating the algorithm - Part 1
06:24
Creating the algorithm - Part 2
06:45
Check the resulting agent
02:39

Advantage Actor Critic (A2C)

8 lectures
A2C
10:49
Link to the code notebook
00:02
Create the policy and value network
04:19
Create the environment
05:39
Create the dataset
03:24
Implement A2C - Part 1
05:47
Implement A2C - Part 2
10:40
Check the resulting agent
02:39
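The quantity that gives A2C its name, the advantage, reduces in the one-step case to the TD error, sketched here (a scalar illustration; the course implements the batched PyTorch version):

```python
# One-step advantage used by A2C:
#   A(s,a) = r + gamma * v(s') - v(s)
# The actor is trained on -log pi(a|s) * A; the critic regresses the TD target.

def a2c_advantage(reward, value_s, value_next, gamma=0.99, done=False):
    """TD-error form of the advantage; the bootstrap is dropped on terminal steps."""
    bootstrap = 0.0 if done else gamma * value_next
    return reward + bootstrap - value_s

print(a2c_advantage(1.0, value_s=0.5, value_next=2.0, gamma=0.5))  # 1.5
```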

Trust region methods

6 lectures
Line search vs trust region methods
02:23
Line search methods
06:05
Trust region methods 1
02:49
Kullback-Leibler divergence
04:08
Trust region methods 2
10:06
Trust region methods 3
02:42
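The Kullback-Leibler divergence lecture is central here: trust region methods use KL(p || q) to bound how far a policy update may move. For discrete distributions it is a one-liner:

```python
import math

# KL(p || q) = sum_i p_i * log(p_i / q_i) for discrete distributions p, q.
# Zero when the distributions are identical, and it grows as they diverge.

def kl_divergence(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

print(kl_divergence([0.5, 0.5], [0.5, 0.5]))            # 0.0
print(round(kl_divergence([0.9, 0.1], [0.5, 0.5]), 4))  # 0.3681
```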

Proximal Policy Optimization (PPO)

7 lectures
Proximal Policy Optimization
08:57
Link to the code notebook
00:02
Create the environment
08:04
Create the dataset
08:14
Create the PPO algorithm - Part 1
07:53
Create the PPO algorithm - Part 2
14:51
Check the resulting agent
02:16
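The heart of this section is PPO's clipped surrogate objective, sketched here for a single state-action pair (the course implements the batched PyTorch version):

```python
# PPO clipped surrogate objective:
#   L = min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A)
# where ratio = pi_new(a|s) / pi_old(a|s). Clipping removes the incentive
# to move the policy far from the one that collected the data.

def ppo_clipped_objective(ratio, advantage, eps=0.2):
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)

# A large ratio is clipped: no extra benefit beyond 1 + eps.
print(ppo_clipped_objective(1.5, advantage=1.0))   # 1.2
# With a negative advantage, the min keeps the more pessimistic unclipped term.
print(ppo_clipped_objective(1.5, advantage=-1.0))  # -1.5
```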

Generalized Advantage Estimation (GAE)

7 lectures
Generalized Advantage Estimation
11:02
Link to the code notebook
00:02
Create the Half Cheetah environment
05:04
Create the dataset
11:52
PPO with generalized advantage estimation - Part 1
03:23
PPO with generalized advantage estimation - Part 2
07:09
Checking the resulting agent
01:31
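GAE itself is a short backward recursion over TD errors, sketched below (list-based for clarity; the course's version operates on PyTorch tensors):

```python
# Generalized Advantage Estimation:
#   A_t = sum_k (gamma * lam)^k * delta_{t+k},
#   delta_t = r_t + gamma * v(s_{t+1}) - v(s_t)
# computed with a single backward pass. lam interpolates between the
# one-step TD error (lam = 0) and the Monte Carlo advantage (lam = 1).

def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """values: v(s_t) for each step; last_value: v(s_T) used to bootstrap."""
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * next_value - values[t]
        gae = delta + gamma * lam * gae
        advantages[t] = gae
        next_value = values[t]
    return advantages

# With gamma = lam = 1 and zero values, GAE reduces to the undiscounted returns.
print(gae_advantages([1.0, 1.0], values=[0.0, 0.0], last_value=0.0,
                     gamma=1.0, lam=1.0))  # [2.0, 1.0]
```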

Trust Region Policy Optimization (TRPO)

9 lectures
Trust region policy optimization 1
03:28
Trust region policy optimization 2
05:19
Link to the code notebook
00:02
TRPO in code - Part 1
03:08
TRPO in code - Part 2
02:08
TRPO in code - Part 3
01:47
TRPO in code - Part 4
04:26
TRPO in code - Part 5
09:34
TRPO in code - Part 6
01:34

Final steps

1 lecture
Final steps
00:06
