Description

Ever wondered how AI technologies like OpenAI ChatGPT and GPT-4 really work? In this course, you will learn the foundations of these groundbreaking applications.

When people talk about artificial intelligence, they usually don’t mean supervised and unsupervised machine learning.

These tasks are pretty trivial compared to what we think of AIs doing - playing chess and Go, driving cars, and beating video games at a superhuman level.

Reinforcement learning has recently become popular for doing all of that and more.

Much like deep learning, a lot of the theory was discovered in the 70s and 80s, but it wasn’t until recently that we’ve been able to observe firsthand the amazing results that are possible.

In 2016 we saw Google’s AlphaGo beat a world champion in Go.

We saw AIs playing video games like Doom and Super Mario.

Self-driving cars have started driving on real roads with other drivers and even carrying passengers (Uber), all without human assistance.

If that sounds amazing, brace yourself for the future because the law of accelerating returns dictates that this progress is only going to continue to increase exponentially.

Learning about supervised and unsupervised machine learning is no small feat. To date I have over TWENTY FIVE (25!) courses just on those topics alone.

And yet reinforcement learning opens up a whole new world. As you’ll learn in this course, the reinforcement learning paradigm is very different from both supervised and unsupervised learning.

It’s led to new and amazing insights both in behavioral psychology and neuroscience. As you’ll learn in this course, there are many analogous processes when it comes to teaching an agent and teaching an animal or even a human. It’s the closest thing we have so far to a true artificial general intelligence.

What’s covered in this course?

  • The multi-armed bandit problem and the explore-exploit dilemma

  • Ways to calculate means and moving averages and their relationship to stochastic gradient descent

  • Markov Decision Processes (MDPs)

  • Dynamic Programming

  • Monte Carlo

  • Temporal Difference (TD) Learning (Q-Learning and SARSA)

  • Approximation Methods (i.e. how to plug in a deep neural network or other differentiable model into your RL algorithm)

  • How to use OpenAI Gym, with zero code changes

  • Project: Apply Q-Learning to build a stock trading bot
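
As a taste of the second topic above, here is a minimal sketch of how a running sample mean relates to stochastic gradient descent. It is not taken from the course code; the function names and the reward list are made up for illustration. The point is that the incremental mean update has the same form as a gradient step on a squared error with learning rate 1/n, and that swapping 1/n for a constant step size gives an exponentially weighted moving average.

```python
def incremental_mean(samples):
    """Update a running mean one sample at a time, without storing the samples."""
    mean = 0.0
    for n, x in enumerate(samples, start=1):
        # Same form as a gradient descent step on 0.5 * (x - mean)**2
        # with learning rate 1/n: mean <- mean + (1/n) * (x - mean)
        mean += (x - mean) / n
    return mean


def exponential_moving_average(samples, alpha=0.1):
    """Constant step size alpha: recent samples are weighted more heavily."""
    mean = 0.0
    for x in samples:
        mean += alpha * (x - mean)
    return mean


# Hypothetical rewards from pulling one bandit arm five times
rewards = [1.0, 0.0, 1.0, 1.0, 0.0]
running = incremental_mean(rewards)
batch = sum(rewards) / len(rewards)
print(running, batch)  # the two agree up to floating-point rounding
```

The constant-step version never settles on a fixed value, which is usually what you want when the reward distribution drifts over time (the nonstationary bandit case).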

If you’re ready to take on a brand new challenge, and learn about AI techniques that you’ve never seen before in traditional supervised machine learning, unsupervised machine learning, or even deep learning, then this course is for you.

See you in class!


"If you can't implement it, you don't understand it"

  • Or as the great physicist Richard Feynman said: "What I cannot create, I do not understand".

  • My courses are the ONLY courses where you will learn how to implement machine learning algorithms from scratch

  • Other courses will teach you how to plug in your data into a library, but do you really need help with 3 lines of code?

  • After doing the same thing with 10 datasets, you realize you didn't learn 10 things. You learned 1 thing, and just repeated the same 3 lines of code 10 times...


Suggested Prerequisites:

  • Calculus

  • Probability

  • Object-oriented programming

  • Python coding: if/else, loops, lists, dicts, sets

  • Numpy coding: matrix and vector operations

  • Linear regression

  • Gradient descent


WHAT ORDER SHOULD I TAKE YOUR COURSES IN?

  • Check out the lecture "Machine Learning and AI Prerequisite Roadmap" (available in the FAQ of any of my courses, including the free Numpy course)


UNIQUE FEATURES

  • Every line of code explained in detail - email me any time if you disagree

  • No wasted time "typing" on the keyboard like other courses - let's be honest, nobody can really write code worth learning about in just 20 minutes from scratch

  • Not afraid of university-level math - get important details about algorithms that other courses leave out

What you'll learn

Apply gradient-based supervised machine learning methods to reinforcement learning

Understand reinforcement learning on a technical level

Understand the relationship between reinforcement learning and psychology

Implement 17 different reinforcement learning algorithms

Understand important foundations for OpenAI ChatGPT, GPT-4
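
To give a sense of what implementing one of these algorithms from scratch looks like, here is a minimal sketch of tabular Q-Learning. It is not the course's code: the five-state chain environment, the `step` and `train` helpers, and the hyperparameter values are all assumptions chosen to keep the example small.

```python
import random

N_STATES = 5          # states 0..4; state 4 is terminal (hypothetical toy environment)
ACTIONS = (-1, +1)    # move left or move right
GAMMA = 0.9           # discount factor
ALPHA = 0.1           # learning rate
EPSILON = 0.1         # exploration probability


def step(state, action):
    """Deterministic chain: reward 1 only on reaching the right end."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore with probability EPSILON, otherwise exploit
            if rng.random() < EPSILON:
                a = rng.randrange(len(ACTIONS))
            else:
                best = max(Q[state])
                a = rng.choice([i for i, q in enumerate(Q[state]) if q == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-Learning target uses the best next action, whatever is taken next
            target = reward if done else reward + GAMMA * max(Q[next_state])
            Q[state][a] += ALPHA * (target - Q[state][a])
            state = next_state
    return Q


Q = train()
# Greedy action in each non-terminal state after training
greedy_policy = [
    ACTIONS[max(range(len(ACTIONS)), key=lambda i: Q[s][i])]
    for s in range(N_STATES - 1)
]
print(greedy_policy)
```

The update line is the whole algorithm; SARSA differs only in using the value of the action actually taken next in place of the `max` in the target.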

Requirements

  • Calculus (derivatives)
  • Probability / Markov Models
  • Numpy, Matplotlib
  • Beneficial to have experience with at least a few supervised machine learning methods
  • Gradient descent
  • Good object-oriented programming skills

Course content

14 sections

Welcome

5 lectures
Introduction
03:14
Course Outline and Big Picture
07:55
Where to get the Code
04:36
How to Succeed in this Course
03:04
Warmup
15:36

Return of the Multi-Armed Bandit

26 lectures
Section Introduction: The Explore-Exploit Dilemma
10:17
Applications of the Explore-Exploit Dilemma
08:00
Epsilon-Greedy Theory
07:04
Calculating a Sample Mean (pt 1)
05:56
Epsilon-Greedy Beginner's Exercise Prompt
05:05
Designing Your Bandit Program
04:09
Epsilon-Greedy in Code
07:12
Comparing Different Epsilons
06:02
Optimistic Initial Values Theory
05:40
Optimistic Initial Values Beginner's Exercise Prompt
02:26
Optimistic Initial Values Code
04:18
UCB1 Theory
14:32
UCB1 Beginner's Exercise Prompt
02:14
UCB1 Code
03:28
Bayesian Bandits / Thompson Sampling Theory (pt 1)
12:43
Bayesian Bandits / Thompson Sampling Theory (pt 2)
17:35
Thompson Sampling Beginner's Exercise Prompt
02:50
Thompson Sampling Code
05:03
Thompson Sampling With Gaussian Reward Theory
11:24
Thompson Sampling With Gaussian Reward Code
06:18
Exercise on Gaussian Rewards
01:20
Why don't we just use a library?
05:40
Nonstationary Bandits
07:11
Bandit Summary, Real Data, and Online Learning
06:29
(Optional) Alternative Bandit Designs
10:05
Suggestion Box
03:10

High Level Overview of Reinforcement Learning

2 lectures
What is Reinforcement Learning?
08:08
From Bandits to Full Reinforcement Learning
08:42

Markov Decision Processes

14 lectures
MDP Section Introduction
06:19
Gridworld
12:35
Choosing Rewards
03:58
The Markov Property
06:12
Markov Decision Processes (MDPs)
14:42
Future Rewards
09:34
Value Functions
05:07
The Bellman Equation (pt 1)
08:46
The Bellman Equation (pt 2)
06:42
The Bellman Equation (pt 3)
06:09
Bellman Examples
22:24
Optimal Policy and Optimal Value Function (pt 1)
09:17
Optimal Policy and Optimal Value Function (pt 2)
04:36
MDP Summary
02:58

Dynamic Programming

14 lectures
Dynamic Programming Section Introduction
08:59
Iterative Policy Evaluation
15:36
Designing Your RL Program
05:00
Gridworld in Code
11:37
Iterative Policy Evaluation in Code
12:17
Windy Gridworld in Code
07:47
Iterative Policy Evaluation for Windy Gridworld in Code
07:14
Policy Improvement
11:23
Policy Iteration
07:57
Policy Iteration in Code
08:27
Policy Iteration in Windy Gridworld
08:50
Value Iteration
07:40
Value Iteration in Code
06:36
Dynamic Programming Summary
04:57

Monte Carlo

8 lectures
Monte Carlo Intro
09:21
Monte Carlo Policy Evaluation
10:52
Monte Carlo Policy Evaluation in Code
07:52
Monte Carlo Control
09:00
Monte Carlo Control in Code
08:51
Monte Carlo Control without Exploring Starts
04:41
Monte Carlo Control without Exploring Starts in Code
05:40
Monte Carlo Summary
01:53

Temporal Difference Learning

8 lectures
Temporal Difference Introduction
03:55
TD(0) Prediction
05:24
TD(0) Prediction in Code
04:54
SARSA
04:36
SARSA in Code
06:20
Q Learning
04:55
Q Learning in Code
05:02
TD Learning Section Summary
02:27

Approximation Methods

11 lectures
Approximation Methods Section Introduction
04:19
Linear Models for Reinforcement Learning
08:32
Feature Engineering
10:16
Approximation Methods for Prediction
09:55
Approximation Methods for Prediction Code
08:26
Approximation Methods for Control
04:41
Approximation Methods for Control Code
08:54
CartPole
05:34
CartPole Code
06:00
Approximation Methods Exercise
04:07
Approximation Methods Section Summary
03:05

Interlude: Common Beginner Questions

1 lecture
This Course vs. RL Book: What's the Difference?
07:10

Stock Trading Project with Reinforcement Learning

10 lectures
Beginners, halt! Stop here if you skipped ahead
14:09
Stock Trading Project Section Introduction
05:13
Data and Environment
12:22
How to Model Q for Q-Learning
09:37
Design of the Program
06:45
Code pt 1
07:59
Code pt 2
09:40
Code pt 3
04:28
Code pt 4
07:17
Stock Trading Project Discussion
03:37

Setting Up Your Environment (FAQ by Student Request)

3 lectures
Pre-Installation Check
04:12
Anaconda Environment Setup
20:20
How to install Numpy, Scipy, Matplotlib, Pandas, IPython, Theano, and TensorFlow
17:32

Extra Help With Python Coding for Beginners (FAQ by Student Request)

4 lectures
How to Code by Yourself (part 1)
15:54
How to Code by Yourself (part 2)
09:23
Proof that using Jupyter Notebook is the same as not using it
12:29
Python 2 vs Python 3
04:38

Effective Learning Strategies for Machine Learning (FAQ by Student Request)

4 lectures
How to Succeed in this Course (Long Version)
10:24
Is this for Beginners or Experts? Academic or Practical? Fast or slow-paced?
22:04
Machine Learning and AI Prerequisite Roadmap (pt 1)
11:18
Machine Learning and AI Prerequisite Roadmap (pt 2)
16:07

Appendix / FAQ Finale

2 lectures
What is the Appendix?
02:48
BONUS
05:48

