Description

In this course I show you how to take advantage of Polars, the fast-growing open-source dataframe library that is becoming the go-to choice for data scientists working in Python. I am a Polars contributor with a focus on making Polars accessible to new users.


"A thorough introduction to Polars" - Ritchie Vink, creator of Polars


"Thank you for your great work with this course - I've optimized some code thanks to it already!" - Maiia Bocharova


The course is for data scientists who have some familiarity with a dataframe library like Pandas but who want to move to Polars because it is easier to write and faster to run. The core materials are Jupyter notebooks that examine each topic in depth. Each notebook comes with a set of exercises to help you develop your understanding of the core concepts.


For many key topics this course is the only source of documentation. I have focused on producing Jupyter notebooks to allow anyone taking the course to start using the full power of Polars. As a consequence the video content is limited. More videos that go beyond the notebooks will be added in the coming months once the core functionality has been documented in the notebooks.


The course introduces the syntax of Polars and shows you the many ways that Polars allows you to produce queries that are easy to read and write. However, the course also delves deeper to help you understand and exploit the algorithms that drive the outstanding performance of Polars.


By the end of the course you will have optimised ways to:

  • load and transform your data from CSV, Excel, Parquet, cloud storage or a database

  • run your analysis in parallel

  • work with larger-than-memory datasets

  • carry out aggregations on your data

  • combine your datasets

  • visualise your outputs with Matplotlib, Seaborn, Plotly & Altair and

  • prepare your data for machine learning pipelines

What you will learn

Taking advantage of parallel and optimised analysis with Polars

Working with larger-than-memory data

Using Polars expressions for analysis that is easy to read and write

Loading data from a wide variety of data sources

Combining data from different datasets using fast join operations

Grouping and parallel aggregations

Deriving insight from time series

Preparing data for machine learning pipelines

Visualising data with Matplotlib, Seaborn, Altair & Plotly

Requirements

  • Computer with Windows/Linux/macOS and a Python installation

Course content

9 sections

Up and running with Polars

12 lectures
Course introduction
01:38
Why use Polars instead of Pandas?
04:05
How can you make best use of the course materials?
01:03
Course materials
01:14
Polars quickstart
07:03
Lazy mode: Introducing lazy mode
00:10
Lazy mode: evaluating queries
00:11
Introduction to Data types
03:22
Series and DataFrame
04:49
Converting to and from Pandas & NumPy
08:15
Visualisation
00:07
Lazy mode
4 questions

Filtering rows

5 lectures
Filtering rows 1: Filtering rows with square brackets
03:09
Filtering rows 2: Using `filter` and the Expression API
08:24
Filtering rows 3: using `filter` in lazy mode
05:48
Filtering rows based on values from another DataFrame
00:05
Filtering rows
4 questions

Selecting columns and transforming dataframes

12 lectures
Selecting columns 1: using square brackets
03:07
Selecting columns 2: using select and expressions
05:06
Selecting columns 3: choosing multiple columns
06:49
Selecting columns 4: transforming and adding columns
04:14
Selecting columns 5: Transforming and adding multiple columns
00:09
Selecting columns 6: Adding a column based on a condition or mapping
05:49
Sorting and fast-track algorithms
04:11
Transforming a DataFrame
05:06
Iterating through a DataFrame
00:12
Selecting columns
4 questions
Adding new columns
3 questions
Adding a new column
2 questions

Data types and missing values

12 lectures
Missing values
06:09
Replacing missing values
09:01
Replacing missing values with expressions
05:28
Missing values
3 questions
Numerical dtypes and precision
00:20
Introducing categorical data
04:42
Categoricals and the string cache
04:42
Introduction to nested dtypes: List, Struct and Object
00:09
List dtype 1: Creating and transforming List columns
00:13
List dtype 2: using expressions on List columns
00:10
Text transformation
00:09
Nested dtypes
4 questions

Statistics, counts and grouping

11 lectures
Statistics
00:07
Value counts
04:20
Groupby 1: The GroupBy object
04:02
Groupby 2: Aggregations and expressions
04:11
Groupby 3: Multiple aggregations
04:13
Groupby 4: Lazy groupby
02:23
Counting values
6 questions
Working with GroupBy and groups
4 questions
Grouping and aggregations
1 question
Quantiles
00:07
Introduction to group operations with over()
00:10

Combining dataframes

6 lectures
Concatenating DataFrames
00:07
Left, inner and fast-track joins
00:10
Joins on string and categorical data
00:14
Filtering a DataFrame by another DataFrame
00:14
Using another DataFrame in an expression
00:13
Extending, stacking and concatenating
00:08

Time series analysis

10 lectures
Introduction to time series dtypes
00:09
Time zones
00:16
Time zones quiz
3 questions
Parsing datetime strings
00:07
Adjusting datetimes
00:08
Parsing and adjusting datetimes quiz
3 questions
Extracting datetime components
00:09
Filtering time series
00:06
Temporal groupby - introduction to groupby_dynamic
00:07
Controlling the `groupby_dynamic` window
00:08

Input/Output

8 lectures
Read a single CSV file
00:17
CSV files 2: multiple CSV files
02:36
Read an Excel file
00:10
Read JSON and newline delimited JSON
00:12
CSV files 3: reading larger-than-memory CSV files in batches
00:22
CSV files 4: streaming larger-than-memory datasets
00:14
Parquet files 1: single Parquet files
00:10
Reading from a database
00:08

Visualisation

3 lectures
Visualisations with Plotly
00:10
Visualisations with Matplotlib
00:16
Visualisations with Seaborn
00:07

