Description

If you are interested in becoming a Databricks Certified Data Engineer Associate, you have come to the right place! This study guide will help you prepare for the certification exam.


By the end of this course, you should be able to:

  • Understand how to use the Databricks Lakehouse Platform and its tools, and the benefits of doing so, including:

    • Data Lakehouse (architecture, descriptions, benefits)

    • Data Science and Engineering workspace (clusters, notebooks, data storage)

    • Delta Lake (general concepts, table management and manipulation, optimizations)

  • Build ETL pipelines using Apache Spark SQL and Python, including:

    • Relational entities (databases, tables, views)

    • ELT (creating tables, writing data to tables, cleaning data, combining and reshaping tables, SQL UDFs)

    • Python (facilitating Spark SQL with string manipulation and control flow, passing data between PySpark and Spark SQL)

  • Incrementally process data, including:

    • Structured Streaming (general concepts, triggers, watermarks)

    • Auto Loader (streaming reads)

    • Multi-hop Architecture (bronze-silver-gold, streaming applications)

    • Delta Live Tables (benefits and features)

  • Build production pipelines for data engineering applications and Databricks SQL queries and dashboards, including:

    • Jobs (scheduling, task orchestration, UI)

    • Dashboards (endpoints, scheduling, alerting, refreshing)

  • Understand and follow best security practices, including:

    • Unity Catalog (benefits and features)

    • Entity permissions (data object privileges)


With the knowledge you gain during this course, you will be ready to take the certification exam.

I am looking forward to meeting you!

What you will learn

Understand how to use Databricks Lakehouse Platform and its tools

Build ETL pipelines using Apache Spark SQL and Python

Process data incrementally in batch and streaming mode

Orchestrate production pipelines

Understand and follow best security practices in Databricks

Requirements

  • Basic SQL knowledge is required
  • Basic Python programming experience is required
  • Knowledge of cloud fundamentals is beneficial, but not necessary

Course content

7 sections

Introduction

9 lectures
Course Overview
01:36
What is Databricks
05:05
Get started with Community Edition
03:21
Free trial on Azure
03:39
Exploring Workspace
03:36
Course Materials
01:31
Creating Cluster
06:47
Notebooks Fundamentals
13:48
Databricks Repos
08:39

Databricks Lakehouse Platform

9 lectures
Delta Lake
05:25
Understanding Delta Tables (Hands On)
06:46
Advanced Delta Lake Features
04:17
Apply Advanced Delta Features (Hands On)
07:20
Relational entities
05:19
Databases and Tables on Databricks (Hands On)
07:08
Set Up Delta Tables
06:38
Views
03:40
Working with Views (Hands On)
07:15

ELT with Spark SQL and Python

5 lectures
Querying Files
06:13
Querying Files (Hands On)
12:39
Writing to Tables (Hands On)
09:00
Advanced Transformations (Hands On)
08:50
Higher Order Functions and SQL UDFs (Hands On)
07:15

Incremental Data Processing

6 lectures
Structured Streaming
07:30
Structured Streaming (Hands On)
08:35
Incremental Data Ingestion
04:41
Auto Loader (Hands On)
05:36
Multi-hop Architecture
02:16
Multi-hop Architecture (Hands On)
10:07

Production Pipelines

5 lectures
Delta Live Tables (Hands On)
13:29
Change Data Capture
05:03
Processing CDC Feed with DLT (Hands On)
06:54
Jobs (Hands On)
09:03
Databricks SQL
12:40

Data Governance

3 lectures
Data Objects Privileges
03:42
Managing Permissions (Hands On)
07:50
Unity Catalog
08:21

Certification Overview

2 lectures
Certification Overview
06:34
Bonus Lecture
00:43

