Description

Apache Airflow is an open-source platform to programmatically author, schedule, and monitor workflows. If you have many ETL jobs to manage, Airflow is a must-have.

In this course you will learn everything you need to start using Apache Airflow through theory and practical videos. Starting from basic notions such as what Airflow is and how it works, we will dive into advanced concepts such as how to create plugins and build truly dynamic pipelines.

What you'll learn

Create plugins to add functionalities to Apache Airflow.

Use Docker with Airflow and different executors.

Master core functionalities such as DAGs, Operators, Tasks, Workflows, etc.

Understand and apply advanced concepts of Apache Airflow such as XComs, Branching and SubDAGs.

Understand the differences between the Sequential, Local and Celery Executors, how they work, and how you can use them.

Use Apache Airflow in a Big Data ecosystem with Hive, PostgreSQL, Elasticsearch, etc.

Install and configure Apache Airflow.

Think through and implement Airflow solutions to real data processing problems.

Requirements

  • VirtualBox must be installed; a 3 GB VM will have to be downloaded
  • At least 8 gigabytes of memory
  • Some prior programming or scripting experience. Python experience will help a lot, but since Python is an easy language to learn, the course shouldn't be too difficult if you are not familiar with it.

Course content

9 sections

Course Introduction

4 lectures
  • Prerequisites (00:35)
  • Course Objectives (01:17)
  • Who I am? (00:52)
  • Development Environment (00:45)

Getting Started with Airflow

13 lectures
  • Why Airflow? (01:03)
  • What is Airflow? (01:31)
  • Core Components (02:08)
  • Core Concepts (02:10)
  • Airflow is not... (01:05)
  • Single Node Architecture (01:13)
  • Multi Node Architecture (01:52)
  • How does it work? (02:19)
  • [Practice] Installing Apache Airflow (01:46)
  • What is Docker? (01:06)
  • The docker-compose file (01:56)
  • Quiz Time! (6 questions)
  • Key Takeaways (00:25)

The important views of the Airflow UI

10 lectures
  • The DAGs View (02:52)
  • Run your first DAG (3 questions)
  • The Grid View (01:56)
  • The Graph View (01:11)
  • The Landing Times View (00:54)
  • The Calendar View (01:17)
  • The Gantt View (01:15)
  • The Code View (00:28)
  • Wrap up! (00:15)
  • Quiz! (5 questions)

Coding Your First Data Pipeline with Airflow

28 lectures
  • The Project (00:44)
  • Advices (00:45)
  • What is a DAG? (00:45)
  • DAG Skeleton (02:16)
  • Define your first DAG (1 question)
  • What is an Operator? (01:46)
  • Providers (01:08)
  • Create a Table (02:06)
  • Create a connection (01:22)
  • Implement the create table task (1 question)
  • The secret weapon! (01:49)
  • What is a Sensor? (01:34)
  • Is the API available? (01:11)
  • Implement the sensor is_api_available (1 question)
  • Extract users (01:39)
  • Implement extract users (1 question)
  • Process users (02:22)
  • Before running process_user (01:24)
  • Implement process_user (1 question)
  • What is a Hook? (01:03)
  • Store users (01:37)
  • Implement store_user (1 question)
  • Order matters! (01:17)
  • Your DAG in action! (02:26)
  • DAG Scheduling (02:58)
  • Backfilling: How does it work? (01:30)
  • Wrap up! (00:55)
  • Quiz Time! (7 questions)

The New Way of Scheduling DAGs

8 lectures
  • Why do you need that feature? (03:03)
  • What is a Dataset? (03:19)
  • Adios schedule_interval! (01:12)
  • Create the Producer DAG (04:07)
  • Create the Consumer DAG (03:29)
  • Track your Datasets with the new view! (01:48)
  • Wait for many datasets (02:04)
  • Dataset limitations (00:30)

Databases and Executors

16 lectures
  • What's an executor? (01:17)
  • The default config (02:02)
  • The Sequential Executor (00:42)
  • The Local Executor (01:00)
  • The Celery Executor (01:50)
  • The current config (02:14)
  • Add the DAG parallel_dag.py into the dags folder (00:24)
  • Monitor your tasks with Flower (02:06)
  • Remove DAG examples (00:15)
  • Running tasks on Celery Workers (01:26)
  • What is a queue? (01:43)
  • Add a new Celery Worker (01:15)
  • Create a queue to better distribute tasks (01:22)
  • Send a task to a specific queue (01:36)
  • Concurrency, the parameters you must know! (00:40)
  • Quiz Time! (5 questions)

Implementing Advanced Concepts in Airflow

14 lectures
  • Adios repetitive patterns (00:42)
  • Add the DAG group_dag.py (00:26)
  • How to use SubDAGs? (04:46)
  • [Practice] Group tasks with SubDAGs! (2 questions)
  • Adios SubDAGs, welcome TaskGroups! (02:19)
  • Group tasks with TaskGroups! (2 questions)
  • Add the DAG xcom_dag.py (00:23)
  • Sharing data between tasks with XComs (01:49)
  • [Practice] XComs in action! (03:25)
  • Choosing a specific path in your DAG (00:53)
  • [Practice] Executing a task according to a condition (03:44)
  • Trigger rules or how tasks get triggered (02:41)
  • Fixing the BranchPythonOperator with trigger rules (1 question)
  • Quiz Time! (3 questions)

Creating Airflow Plugins with Elasticsearch and PostgreSQL

9 lectures
  • Introduction (00:53)
  • What's Elasticsearch? (00:40)
  • Running Elasticsearch with Airflow (02:16)
  • How the plugin system works? (02:32)
  • Create the connection (00:39)
  • Create the ElasticHook (05:23)
  • Add ElasticHook to the Plugin system (01:48)
  • Add the DAG elastic_dag.py (00:17)
  • Your Hook in Action! (01:21)

BONUS - APPENDIX

16 lectures
  • [BLOG POST] How to use the DockerOperator with Templating and Apache Spark (00:31)
  • [BLOG POST] Apache Airflow with Kubernetes Executor (00:31)
  • [BLOG POST] How to use templates and macros in Apache Airflow (00:42)
  • [BLOG POST] How to use timezones in Apache Airflow (00:40)
  • [BLOG POST] How to use the BashOperator (00:13)
  • [BLOG POST] Variables in Apache Airflow: The Guide (00:14)
  • [BLOG POST] Best Practices in Apache Airflow (part 1) (00:23)
  • [VIDEO] Running Apache Airflow on a multi-nodes Kubernetes cluster locally (00:21)
  • [BLOG POST] The PostgresOperator: All you need to know (00:20)
  • [VIDEO] The DockerOperator: The Basics and more! (19:35)
  • [VIDEO] The New Of Scheduling your DAGs (11:51)
  • [VIDEO] Build a data pipeline with AWS, Snowflake and Airflow (24:08)
  • [VIDEO] What's new in Airflow 2.4? (06:33)
  • [VIDEO] What's new in Airflow 2.5? (00:06)
  • [VIDEO] What's new in Airflow 2.6? (00:06)
  • BONUS : COUPON FOR MY OTHER COURSES! (00:13)
