Description

This is a hands-on, project-based course designed to help you master the foundations of classification modeling in Python.


We’ll start by reviewing the data science workflow, discussing the primary goals & types of classification algorithms, and doing a deep dive into the classification modeling steps we’ll be using throughout the course.


You’ll learn to perform exploratory data analysis, leverage feature engineering techniques like scaling, dummy variables, and binning, and prepare data for modeling by splitting it into train, test, and validation datasets.
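As a rough illustration of these preparation steps, here is a minimal sketch using pandas and scikit-learn. The data, the column names (`income`, `age`, `home`, `high_risk`), the bin edges, and the 60/20/20 split are all made up for this example and are not taken from the course materials.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical customer data; every column name and value is illustrative only.
df = pd.DataFrame({
    "income": [32, 54, 71, 18, 95, 40, 63, 27, 88, 49],
    "age": [23, 35, 47, 19, 52, 31, 44, 26, 58, 38],
    "home": ["rent", "own", "own", "rent", "own", "rent", "own", "rent", "own", "rent"],
    "high_risk": [1, 0, 0, 1, 0, 1, 0, 1, 0, 0],
})

# Binning: group a numeric feature into ordered categories (assumed bin edges).
df["age_band"] = pd.cut(df["age"], bins=[0, 30, 45, 120], labels=["young", "mid", "senior"])

# Dummy variables: one-hot encode the categorical columns, dropping one level each.
X = pd.get_dummies(
    df.drop(columns=["high_risk", "age"]),
    columns=["home", "age_band"],
    drop_first=True,
)
y = df["high_risk"]

# Splitting: 60% train, then the remaining 40% split evenly into validation and test.
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.4, random_state=42)
X_valid, X_test, y_valid, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)

# Scaling: fit the scaler on the training data only, then apply it to every split.
scaler = StandardScaler().fit(X_train[["income"]])
X_train = X_train.assign(income=scaler.transform(X_train[["income"]]).ravel())
X_valid = X_valid.assign(income=scaler.transform(X_valid[["income"]]).ravel())
X_test = X_test.assign(income=scaler.transform(X_test[["income"]]).ravel())

print(X_train.head())
print(len(X_train), len(X_valid), len(X_test))
```

Fitting the scaler on the training split alone keeps information from the validation and test sets out of the model inputs.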


From there, we’ll fit K-Nearest Neighbors & Logistic Regression models, build an intuition for interpreting logistic regression coefficients, and evaluate model performance using tools like confusion matrices and metrics like accuracy, precision, and recall. We’ll also cover techniques for modeling imbalanced data, including threshold tuning, sampling methods like oversampling & SMOTE, and adjusting class weights in the model cost function.
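To give a feel for this step, the sketch below fits both model types with scikit-learn and reports a confusion matrix alongside accuracy, precision, and recall. It runs on synthetic data from `make_classification`; the sample size, `n_neighbors=5`, and `C=1.0` are arbitrary choices for illustration, not settings from the course.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data (a stand-in for real customer data).
X, y = make_classification(n_samples=500, n_features=5, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Both models benefit from scaled features, so each one sits in a pipeline.
models = {
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "logreg": make_pipeline(StandardScaler(), LogisticRegression(C=1.0)),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
    }
    print(name, results[name])

# Only logistic regression has coefficients: one per (scaled) feature.
coefs = models["logreg"].named_steps["logisticregression"].coef_[0]
print("logistic regression coefficients:", coefs)
```

Each coefficient is the change in the log-odds of the positive class for a one standard deviation increase in that feature, holding the others fixed.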


Throughout the course, you'll play the role of Data Scientist for the risk management department at Maven National Bank. Using the skills you learn along the way, you'll use Python to explore the bank's data and build classification models that accurately determine which customers have high, medium, or low credit risk based on their profiles.


Last but not least, you'll learn to build and evaluate decision tree models for classification. You’ll fit, visualize, and fine-tune these models using Python, then apply your knowledge to more advanced ensemble models like random forests and gradient boosted machines.
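The tree-based models mentioned here can be sketched in a few lines of scikit-learn. The example below uses synthetic three-class data (loosely standing in for high, medium, and low risk) and hyperparameter values picked only for illustration; the feature names `x0` to `x5` are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic three-class data; not the course dataset.
X, y = make_classification(
    n_samples=600, n_features=6, n_informative=4,
    n_classes=3, n_clusters_per_class=1, random_state=1,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1
)

# Hyperparameters below are arbitrary starting points, not tuned values.
models = {
    "decision_tree": DecisionTreeClassifier(max_depth=3, criterion="entropy", random_state=1),
    "random_forest": RandomForestClassifier(n_estimators=100, max_depth=5, random_state=1),
    "gradient_boosting": GradientBoostingClassifier(
        n_estimators=100, learning_rate=0.1, max_depth=2, random_state=1
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # accuracy on the held-out data
    print(f"{name}: {scores[name]:.3f}")

# Text view of the splits the single tree learned.
feature_names = [f"x{i}" for i in range(X.shape[1])]
print(export_text(models["decision_tree"], feature_names=feature_names))

# Impurity-based feature importances from the forest (they sum to 1).
importances = models["random_forest"].feature_importances_
print(dict(zip(feature_names, importances.round(3))))
```

Limiting `max_depth` is one simple way to keep a single tree from overfitting; the ensembles combine many such trees.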


COURSE OUTLINE:


  • Intro to Data Science

    • Introduce the fields of data science and machine learning, review essential skills, and introduce each phase of the data science workflow


  • Classification 101

    • Review the basics of classification, including key terms, the types and goals of classification modeling, and the modeling workflow


  • Pre-Modeling Data Prep & EDA

    • Recap the data prep & EDA steps required to perform modeling, including key techniques to explore the target, features, and their relationships


  • K-Nearest Neighbors

    • Learn how the k-nearest neighbors (KNN) algorithm classifies data points and practice building KNN models in Python


  • Logistic Regression

    • Introduce logistic regression, learn the math behind the model, and practice fitting logistic regression models and tuning their regularization strength


  • Classification Metrics

    • Learn how and when to use several important metrics for evaluating classification models, such as precision, recall, F1 score, and ROC-AUC


  • Imbalanced Data

    • Understand the challenges of modeling imbalanced data and learn strategies for improving model performance in these scenarios


  • Decision Trees

    • Build and evaluate decision tree models, algorithms that look for the splits in your data that best separate your classes


  • Ensemble Models

    • Get familiar with the basics of ensemble models, then dive into specific models like random forests and gradient boosted machines
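
For the Imbalanced Data topic in the outline above, two of the strategies (class weights and threshold shifting) can be sketched with scikit-learn alone. The data is synthetic with an assumed 95/5 class split, and the thresholds are arbitrary. Resampling methods such as SMOTE are not shown because they typically rely on a separate package.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic data where roughly 5% of rows belong to the positive class.
X, y = make_classification(n_samples=2000, n_features=6, weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Strategy 1: class weights. "balanced" reweights errors inversely to class frequency.
base = LogisticRegression(max_iter=1000).fit(X_train, y_train)
weighted = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X_train, y_train)
print("base recall:    ", recall_score(y_test, base.predict(X_test)))
print("weighted recall:", recall_score(y_test, weighted.predict(X_test)))

# Strategy 2: threshold shifting. Lowering the cutoff flags more rows as positive.
proba = base.predict_proba(X_test)[:, 1]

def predict_at(threshold):
    """Turn predicted probabilities into hard labels at a chosen cutoff."""
    return (proba >= threshold).astype(int)

recalls = {}
for threshold in (0.5, 0.3, 0.1):
    y_pred = predict_at(threshold)
    recalls[threshold] = recall_score(y_test, y_pred)
    precision = precision_score(y_test, y_pred, zero_division=0)
    print(f"threshold={threshold}: precision={precision:.3f}, recall={recalls[threshold]:.3f}")
```

Lowering the threshold can only keep recall the same or raise it, usually at the cost of precision, so the right cutoff depends on how costly each type of error is.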


__________


Ready to dive in? Join today and get immediate, LIFETIME access to the following:


  • 9.5 hours of high-quality video

  • 18 homework assignments

  • 9 quizzes

  • 2 projects

  • Data Science in Python: Classification ebook (250+ pages)

  • Downloadable project files & solutions

  • Expert support and Q&A forum

  • 30-day Udemy satisfaction guarantee


If you're an aspiring data scientist looking for an introduction to the world of classification modeling with Python, this is the course for you.


Happy learning!

-Chris Bruehl (Data Science Expert & Lead Python Instructor, Maven Analytics)

Course content

14 sections

Introduction

8 lectures
Course Introduction (02:00)
About This Series (00:42)
Course Structure & Outline (02:16)
READ ME: Important Notes for New Students (02:18)
DOWNLOAD: Course Resources (00:13)
Introducing the Course Project (00:50)
Setting Expectations (01:27)
Jupyter Installation & Launch (04:03)

Intro to Data Science

10 lectures
What is Data Science? (02:44)
The Data Science Skillset (01:46)
What is Machine Learning? (02:43)
Common Machine Learning Algorithms (01:59)
Data Science Workflow (01:08)
Data Prep & EDA Steps (03:42)
Modeling Steps (02:54)
Classification Modeling (00:37)
Key Takeaways (01:17)
Intro to Data Science (5 questions)

Classification 101

6 lectures
Classification 101 (05:45)
Goals of Classification (01:50)
Types of Classification (02:21)
Classification Modeling Workflow (02:52)
Key Takeaways (01:21)
Classification 101 (5 questions)

Data Prep & EDA

27 lectures
EDA For Classification (03:30)
Defining a Target (04:27)
DEMO: Defining a Target (05:44)
Exploring the Target (04:29)
Exploring the Features (02:07)
DEMO: Exploring the Features (05:08)
ASSIGNMENT: Exploring the Target & Features (02:18)
SOLUTION: Exploring the Target & Features (08:28)
Correlation (05:14)
PRO TIP: Correlation Matrix (02:28)
DEMO: Correlation Matrix (04:59)
Feature-Target Relationships (07:18)
Feature-Feature Relationships (02:29)
PRO TIP: Pair Plots (04:28)
ASSIGNMENT: Exploring Relationships (01:33)
SOLUTION: Exploring Relationships (07:53)
Feature Engineering Overview (04:43)
Numeric Feature Engineering (04:10)
Dummy Variables (04:48)
Binning Categories (03:34)
DEMO: Feature Engineering (07:01)
Data Splitting (05:28)
Preparing Data for Modeling (02:05)
ASSIGNMENT: Preparing the Data for Modeling (01:59)
SOLUTION: Prepare the Data for Modeling (07:29)
Key Takeaways (01:37)
Data Prep & EDA (5 questions)

K-Nearest Neighbors

18 lectures
K-Nearest Neighbors (05:44)
The KNN Workflow (04:57)
KNN in Python (02:16)
Model Accuracy (03:55)
Confusion Matrix (03:58)
DEMO: Confusion Matrix (04:10)
ASSIGNMENT: Fitting a Simple KNN Model (01:50)
SOLUTION: Fitting a Simple KNN Model (03:42)
Hyperparameter Tuning (03:39)
Overfitting & Validation (07:07)
DEMO: Hyperparameter Tuning (06:13)
Hard vs. Soft Classification (04:54)
DEMO: Probability vs. Event Rate (10:05)
ASSIGNMENT: Tuning a KNN Model (01:16)
SOLUTION: Tuning a KNN Model (03:33)
Pros & Cons of KNN (04:17)
Key Takeaways (01:12)
K-Nearest Neighbors (5 questions)

Logistic Regression

22 lectures
Logistic Regression (03:00)
Logistic vs. Linear Regression (02:41)
The Logistic Function (03:24)
Likelihood (04:53)
Multiple Logistic Regression (03:17)
The Logistic Regression Workflow (00:52)
Logistic Regression in Python (04:43)
Interpreting Coefficients (03:41)
ASSIGNMENT: Logistic Regression (01:35)
SOLUTION: Logistic Regression (03:24)
Feature Engineering & Selection (03:53)
Regularization (05:57)
Tuning a Regularized Model (03:51)
DEMO: Regularized Logistic Regression (03:45)
ASSIGNMENT: Regularized Logistic Regression (01:07)
SOLUTION: Regularized Logistic Regression (04:28)
Multi-class Logistic Regression (06:43)
ASSIGNMENT: Multi-class Logistic Regression (01:22)
SOLUTION: Multi-class Logistic Regression (03:52)
Pros & Cons of Logistic Regression (02:33)
Key Takeaways (01:40)
Logistic Regression (5 questions)

Classification Metrics

21 lectures
Classification Metrics (02:37)
Accuracy, Precision & Recall (06:39)
DEMO: Accuracy, Precision & Recall (05:24)
PRO TIP: F1 Score (03:39)
ASSIGNMENT: Model Metrics (00:57)
SOLUTION: Model Metrics (04:06)
Soft Classification (07:02)
DEMO: Leveraging Soft Classification (03:28)
PRO TIP: Precision-Recall & F1 Curves (03:44)
DEMO: Plotting Precision-Recall & F1 Curves (04:09)
The ROC Curve & AUC (03:15)
DEMO: The ROC Curve & AUC (03:46)
Classification Metrics Recap (02:22)
ASSIGNMENT: Threshold Shifting (01:25)
SOLUTION: Threshold Shifting (05:33)
Multi-class Metrics (05:43)
Multi-class Metrics in Python (01:36)
ASSIGNMENT: Multi-class Metrics (01:00)
SOLUTION: Multi-class Metrics (02:54)
Key Takeaways (01:32)
Classification Metrics (5 questions)

Imbalanced Data

20 lectures
Imbalanced Data (04:03)
Managing Imbalanced Data (04:03)
Threshold Shifting (02:24)
Sampling Strategies (01:49)
Oversampling (01:30)
Oversampling in Python (02:44)
DEMO: Oversampling (04:32)
SMOTE (01:10)
SMOTE in Python (02:31)
Undersampling (02:11)
Undersampling in Python (05:14)
ASSIGNMENT: Sampling Methods (02:19)
SOLUTION: Sampling Methods (05:21)
Changing Class Weights (02:51)
DEMO: Changing Class Weights (02:55)
ASSIGNMENT: Changing Class Weights (00:59)
SOLUTION: Changing Class Weights (03:25)
Imbalanced Data Recap (01:50)
Key Takeaways (01:08)
Imbalanced Data (5 questions)

Mid-Course Project

2 lectures
Project Brief (04:33)
Solution Walkthrough (11:11)

Decision Trees

15 lectures
Decision Trees (03:44)
Entropy (05:45)
Decision Tree Predictions (04:07)
Decision Trees in Python (02:59)
DEMO: Decision Trees (03:55)
Feature Importance (04:58)
ASSIGNMENT: Decision Trees (01:14)
SOLUTION: Decision Trees (05:52)
Hyperparameter Tuning for Decision Trees (04:17)
DEMO: Hyperparameter Tuning (02:33)
ASSIGNMENT: Tuned Decision Tree (00:48)
SOLUTION: Tuned Decision Tree (04:11)
Pros & Cons of Decision Trees (02:34)
Key Takeaways (01:00)
Decision Trees (5 questions)

Ensemble Models

23 lectures
Ensemble Models (03:56)
Simple Ensemble Models (02:15)
DEMO: Simple Ensemble Models (03:32)
ASSIGNMENT: Simple Ensemble Models (01:18)
SOLUTION: Simple Ensemble Models (03:14)
Random Forests (01:13)
Fitting Random Forests in Python (04:10)
Hyperparameter Tuning for Random Forests (04:32)
PRO TIP: Random Search (05:08)
Pros & Cons of Random Forests (01:37)
ASSIGNMENT: Random Forests (01:07)
SOLUTION: Random Forests (05:20)
Gradient Boosting (02:11)
Gradient Boosting in Python (02:12)
Hyperparameter Tuning for Gradient Boosting (04:44)
DEMO: Hyperparameter Tuning for Gradient Boosting (03:23)
Pros & Cons of Gradient Boosting (01:39)
ASSIGNMENT: Gradient Boosting (01:16)
SOLUTION: Gradient Boosting (04:01)
PRO TIP: SHAP Values (06:01)
DEMO: SHAP Values (05:08)
Key Takeaways (01:12)
Ensemble Models (5 questions)

Classification Summary

4 lectures
Recap: Classification Models & Workflow (02:48)
Pros & Cons of Classification Models (03:00)
DEMO: Production Pipeline & Deployment (11:14)
Looking Ahead: Unsupervised Learning (00:58)

Final Project

2 lectures
Project Brief (02:33)
Solution Walkthrough (06:42)

Next Steps

1 lecture
BONUS LESSON (01:42)

