Description

Introduction: Welcome to Azure Projects for Data Engineering, your gateway to mastering data engineering with Microsoft Azure! In today's data-driven world, organizations crave skilled professionals who can harness data to drive insights and innovation. Azure offers a robust suite of tools tailored for data engineering tasks, from ingestion to analysis. Our course equips you with the skills to excel in this dynamic field using Azure services.

Motivation: Data engineering is pivotal in transforming data into actionable insights for business growth. As demand for Azure-savvy data engineers rises, mastering Azure Projects for Data Engineering opens doors to exciting career opportunities. With hands-on projects and practical exercises, you'll develop the expertise needed to tackle real-world challenges and drive impactful outcomes.

Why Choose This Course? Our course stands out for its hands-on approach and real-world projects. From app analytics to transportation insights, you'll explore diverse domains using Azure services like Data Factory and Databricks. Gain practical skills and build a compelling portfolio to showcase your expertise to potential employers. Whether you're launching a new career or upskilling, our flexible learning environment caters to your goals and aspirations.

Course Highlights:

  • Hands-on projects covering app analytics, sports data, developer surveys, and transportation insights.

  • In-depth exploration of Azure services including Data Factory, Databricks, and Azure Storage.

  • Insights from seasoned data engineering professionals.

  • Flexible, self-paced learning with interactive assignments.

  • Opportunities for collaboration and networking with peers and industry experts.


Projects

1. Title: Play Store Dataset Analysis

Project Description:

Integrate Azure services through Key Vault, which provides an encrypted security layer for secrets. Then build a copy activity pipeline in Azure Data Factory (ADF) that dynamically fetches a compressed tar.gzip file from GitHub and uploads it to a container in a storage account. Mount the container on Databricks and pre-process the data to make it accurate and reliable. Finally, build insights for a better understanding of the data, such as:

Installs Analysis:

• What is the most installed category of apps?

• Top 5 Apps in the top 5 installed category.

Rating Analysis:

• What is the most rated category of apps?

• Top 5 Apps in the top 5 rated category.

Free vs. Paid Apps Analysis:

• What is the proportion of free vs. paid apps available?

• Distribution of Paid and Free Apps in Each Category.

• Do paid apps have higher average ratings than free apps?

Price Analysis:

• What is the average price of paid apps in different categories?
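The installs questions above come down to a group-by followed by a sort. Here is a minimal sketch in plain Python, assuming hypothetical rows with `App`, `Category`, and `Installs` fields where installs are stored as strings such as "10,000+". The sample data and field names are assumptions for illustration; in Databricks the same logic would normally be written with PySpark.

```python
from collections import defaultdict

# Hypothetical sample rows; field names and values are assumptions for illustration.
rows = [
    {"App": "ChatNow", "Category": "COMMUNICATION", "Installs": "1,000,000+"},
    {"App": "MailBox", "Category": "COMMUNICATION", "Installs": "500,000+"},
    {"App": "PuzzleFun", "Category": "GAME", "Installs": "100,000+"},
    {"App": "NoteIt", "Category": "PRODUCTIVITY", "Installs": "50,000+"},
]


def parse_installs(raw: str) -> int:
    """Turn a string such as '10,000+' into the integer 10000."""
    return int(raw.replace(",", "").replace("+", ""))


def installs_by_category(records):
    """Sum installs per category and return (category, total) pairs, largest first."""
    totals = defaultdict(int)
    for record in records:
        totals[record["Category"]] += parse_installs(record["Installs"])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def top_apps(records, category, n=5):
    """Return the names of the n most installed apps within one category."""
    in_category = [r for r in records if r["Category"] == category]
    in_category.sort(key=lambda r: parse_installs(r["Installs"]), reverse=True)
    return [r["App"] for r in in_category[:n]]


ranking = installs_by_category(rows)
most_installed = ranking[0][0]
print(most_installed, top_apps(rows, most_installed))
```

Calling `top_apps` once per category in the first five entries of `ranking` would answer the "top 5 apps in the top 5 installed categories" question.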


Key Features:

a. Integration of Key Vault with other services

b. GitHub

c. Copy compressed data using Data Factory to a storage account

d. Mounting a container

e. Play Store Insight
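Mounting a container usually means reading the storage key from a Key Vault-backed secret scope and passing it to `dbutils.fs.mount`. The sketch below only builds the arguments, so it can run outside Databricks. The account, container, scope, and key names are hypothetical, and account-key authentication over `wasbs` is one common option, not necessarily the one used in the course.

```python
def build_mount_args(account: str, container: str, storage_key: str) -> dict:
    """Build keyword arguments for dbutils.fs.mount using account-key auth over wasbs."""
    host = f"{account}.blob.core.windows.net"
    return {
        "source": f"wasbs://{container}@{host}",
        "mount_point": f"/mnt/{container}",
        "extra_configs": {f"fs.azure.account.key.{host}": storage_key},
    }


# Hypothetical names. In a Databricks notebook the key would come from a secret scope:
#   storage_key = dbutils.secrets.get(scope="kv-scope", key="storage-key")
#   dbutils.fs.mount(**build_mount_args("mystorageacct", "playstore", storage_key))
args = build_mount_args("mystorageacct", "playstore", "<storage-key>")
print(args["source"])
print(args["mount_point"])
```

Keeping the key in a secret scope means the notebook never contains the credential itself, which is the point of integrating Key Vault with the other services.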


2. Title: Olympics Dataset Analysis

Project Description:

Send an API request to fetch the dataset from Kaggle. Unzip the dataset and write the source files to a specific storage account container. Pre-process the data and build insights on Olympic events, such as:

Gender Level Analysis:

• How has the participation of male and female athletes changed over time?

• Are there sports that have seen significant increases in participation by one gender?

National Level Analysis:

• Which countries have historically performed well in the Olympics based on the total number of medals won?

• Are there specific sports where certain countries excel?

Sports Level Analysis:

• Which sports have the highest and lowest participation rates?

• Are there sports that have gained or lost popularity over the years?
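The gender-level questions can be approached by counting entries per year and sex. The sketch below is plain Python on a handful of hypothetical records, with assumed field names `Year`, `Sex`, and `Sport`. Note that it counts rows, not unique athletes, so a real analysis would first decide how to handle athletes who appear in several events.

```python
from collections import Counter

# Hypothetical athlete-event records; field names are assumptions for illustration.
records = [
    {"Year": 1996, "Sex": "M", "Sport": "Athletics"},
    {"Year": 1996, "Sex": "F", "Sport": "Athletics"},
    {"Year": 1996, "Sex": "M", "Sport": "Boxing"},
    {"Year": 2016, "Sex": "F", "Sport": "Boxing"},
    {"Year": 2016, "Sex": "F", "Sport": "Athletics"},
    {"Year": 2016, "Sex": "M", "Sport": "Athletics"},
]


def participation_by_year(rows):
    """Count entries per (year, sex) pair."""
    return Counter((row["Year"], row["Sex"]) for row in rows)


def female_share(rows, year):
    """Fraction of entries in a given year that belong to female athletes."""
    counts = participation_by_year(rows)
    total = counts[(year, "M")] + counts[(year, "F")]
    return counts[(year, "F")] / total if total else 0.0


for year in sorted({row["Year"] for row in records}):
    print(year, round(female_share(records, year), 2))
```

Adding `Sport` to the counting key would extend the same idea to the per-sport participation question.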


Key Features:

a. API request for Kaggle data

b. Write data to a storage account from Databricks

c. Olympics Insights
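The "unzip and write" step can be illustrated without any cloud account. This sketch builds a small zip archive in memory as a stand-in for the file downloaded from Kaggle, then extracts it into a local temporary folder. In the project the target would be a storage account container, and the file name used here is hypothetical.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def make_sample_archive() -> bytes:
    """Create an in-memory zip that stands in for a downloaded dataset."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("athlete_events.csv", "Name,Year\nA. Runner,2016\n")
    return buffer.getvalue()


def extract_archive(data: bytes, target_dir: Path) -> list:
    """Extract every member of the zip into target_dir and return the sorted file names."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        archive.extractall(target_dir)
        return sorted(archive.namelist())


with tempfile.TemporaryDirectory() as tmp:
    extracted = extract_archive(make_sample_archive(), Path(tmp))
    first_line = (Path(tmp) / extracted[0]).read_text().splitlines()[0]
    print(extracted, first_line)
```

Working on bytes instead of a file path keeps the download and the extraction independent, so the same function works whether the archive came from an API response or from disk.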


3. Title: Stack Overflow Dataset Analysis

Project Description: Build a dynamic, robust Data Factory pipeline for the Stack Overflow Annual Developer Survey 2023 data. A first copy activity copies the zipped folder from the official Stack Overflow website and writes it to a storage account; a second copy activity unzips the folder to extract the source files. Mount the container on Databricks and pre-process the data to make it accurate and reliable. Then build insights for a better understanding of developers' backgrounds, such as:

Developers Education Analysis:

• What is the distribution of developers' education levels on Stack Overflow?

• How diverse is education among developers at various career levels?

“How to Learn Code?” Analysis:

• What are the most preferred sources of learning coding by Developers?

• What are the most preferred sources of learning by developers who are in the learning phase?

• What is the distribution of preferred ways of learning code by different age groups of developers?
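Survey questions that allow several answers are often stored as one delimited string per respondent, so counting preferences requires splitting before aggregating. The sketch below assumes a hypothetical `LearnCode` field with semicolon-separated values and an `Age` field. Both the field names and the delimiter are assumptions to check against the actual survey files.

```python
from collections import Counter

# Hypothetical responses; the semicolon-separated multi-select format is an assumption.
responses = [
    {"Age": "18-24 years old", "LearnCode": "Online courses;Books"},
    {"Age": "18-24 years old", "LearnCode": "Online courses;School"},
    {"Age": "25-34 years old", "LearnCode": "Books;On the job training"},
    {"Age": "25-34 years old", "LearnCode": None},
]


def count_sources(rows, age=None):
    """Count each learning source, optionally restricted to one age group."""
    counts = Counter()
    for row in rows:
        if age is not None and row["Age"] != age:
            continue
        if not row["LearnCode"]:
            continue  # skip respondents who left the question unanswered
        counts.update(part.strip() for part in row["LearnCode"].split(";"))
    return counts


overall = count_sources(responses)
young = count_sources(responses, age="18-24 years old")
print(overall.most_common(2))
print(young.most_common(1))
```

Because one respondent can contribute to several sources, the counts sum to more than the number of respondents; shares should be computed against respondents, not against the total count.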


Key Features:

a. Copy a zipped folder containing data in different formats

b. Unzip the folder and write the data to a container in the storage account

c. Developer Survey Insights


4. Title: Uber Taxi Dataset Analysis

Project Description: Build a robust Data Factory pipeline that fetches data files one by one from the NYC TLC site using a ForEach activity and copies them to a container in a storage account. Mount the container on Databricks and pre-process the data to make it accurate and reliable. Then build insights for a better understanding of the data, such as:

Taxi Demand Analysis:

• What is the overall demand for Uber rides during different time-periods (days of the week, hours of the day, etc.)?

• How does the passenger count vary during peak and off-peak hours?

Payment Analysis:

• What is the distribution of payment types (cash, credit card, etc.)?

• Is there any correlation between payment types and trip distance?
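Demand by time period is essentially a count of trips bucketed by weekday and hour of the pickup timestamp. Here is a minimal sketch in plain Python with hypothetical timestamps; the timestamp format is an assumption, and real trip files carry many more columns.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical pickup timestamps for illustration only.
pickups = [
    "2023-01-02 08:15:00",
    "2023-01-02 08:45:00",
    "2023-01-02 17:30:00",
    "2023-01-07 23:05:00",
]


def demand_profile(timestamps):
    """Return trip counts keyed by weekday name and by hour of day."""
    by_day, by_hour = Counter(), Counter()
    for raw in timestamps:
        moment = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
        by_day[DAY_NAMES[moment.weekday()]] += 1
        by_hour[moment.hour] += 1
    return by_day, by_hour


by_day, by_hour = demand_profile(pickups)
print(by_day.most_common(1), by_hour.most_common(1))
```

The same bucketing, applied to passenger counts instead of trip counts, would address the peak versus off-peak question.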

Key Features:

a. ForEach activity in data factory

b. Dynamic filepath creation

c. Uber Taxi Insights
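Dynamic file path creation pairs naturally with a ForEach loop: the loop supplies one item per iteration (for example a month), and the path is assembled from it. In Data Factory this is done with pipeline expressions; the Python below is only an equivalent sketch of the idea. The host, file prefix, and naming pattern are placeholders, not the real source location.

```python
def monthly_file_urls(base_url: str, prefix: str, year: int, months) -> list:
    """Build one URL per month, e.g. <base>/<prefix>_2023-01.parquet."""
    return [f"{base_url}/{prefix}_{year}-{month:02d}.parquet" for month in months]


# Placeholder host and prefix; substitute the real source location and naming pattern.
urls = monthly_file_urls("https://example.com/trip-data", "tripdata", 2023, range(1, 4))
for url in urls:
    print(url)
```

Zero-padding the month (`01`, `02`, ...) matters because a file name built with `1` instead of `01` would not match a source that uses two-digit months.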


Who Should Enroll:

  • Aspiring Data Engineers

  • Developers Interested in Cloud

  • Businesses that want to start using the cloud

Prerequisites: No prior experience with Azure is required, but a basic understanding of Python and PySpark will be helpful.

What You'll Gain:

  • A Solid Grasp of Azure Basics

  • Hands-on experience deploying and using multiple Azure services

  • Integrating different sources to fetch data dynamically

  • Exposure to transforming source data and building insights in the cloud


Join us on a transformative journey into the world of Azure Projects for Data Engineering. Unlock your potential, elevate your skills, and become a driving force in the data revolution. Enroll now and take the first step toward success in data engineering with Azure.

What You'll Learn

A Solid Grasp of Azure Basics

Hands-on experience deploying and using multiple Azure services

Integrating different sources to fetch data dynamically

Exposure to transforming source data and building insights in the cloud

Requirements

  • No prior experience with Azure is required, but a basic understanding of Python and PySpark will be helpful.

Course Content

6 sections

Introduction

5 lectures
Introduction to Azure
04:22
Importance of Cloud Computing
05:50
Introduction to Azure Projects
02:46
Outline
04:39
Links for the Course's Materials and Codes
00:10

Account Setup

6 lectures
Links for the Course's Materials and Codes
00:10
Resource Group
06:13
Storage Account
06:19
Azure Data Factory
01:56
Azure Databricks
01:27
Azure Key Vault
03:25

Project 1 PlayStore Dataset

16 lectures
Links for the Course's Materials and Codes
00:10
Play Store Dataset Analysis
03:29
Linked Services
15:43
Copy Data Activity
15:47
Create Compute
04:47
Create Secret Scope
06:23
Mounting Container
14:33
Multiple Ways of Reading Data
17:03
Data PreProcessing
15:59
Install Rating Analysis
17:20
Free Paid Price Analysis
08:37
What service is integrated with Azure Key Vault to provide an encrypted security layer in the project?
1 question
Which file format is dynamically fetched from GitHub in the project?
1 question
Which Azure service is used to mount the container containing the Play Store dataset?
1 question
What type of analysis is performed to determine the most rated category of apps?
1 question
What is the main purpose of integrating Key Vault with other services in the project?
1 question

Project 2 Olympics Dataset

13 lectures
Links for the Course's Materials and Codes
00:10
Olympics Dataset Analysis
04:59
Fetch Data from Kaggle
14:19
Write to Blob
09:19
Data PreProcessing
09:28
Gender Level Analysis
12:36
National Level Analysis
14:01
Sports Level Analysis
14:46
From where is the dataset fetched in the project?
1 question
Which storage account is used to write the Olympics dataset from Databricks?
1 question
What type of analysis is performed to understand the historical performance of countries in the Olympics?
1 question
Which API is used to fetch the Olympics dataset?
1 question
Which service is used to pre-process the Olympics dataset in Databricks?
1 question

Project 3 Stack Overflow Dataset

12 lectures
Links for the Course's Materials and Codes
00:10
Stack Overflow Dataset Analysis
06:26
Copy Zip Data
12:15
Unzip Data
10:47
PreProcessing
15:40
Education Level Analysis
16:31
Learn How to Code Analysis
17:55
What is one of the primary focuses of the Stack Overflow Dataset Analysis project?
1 question
Which Azure service is used to build a dynamic robust data factory pipeline in the project?
1 question
What type of data is fetched from the Stack Overflow official website in the project?
1 question
What is one of the key insights derived from the Stack Overflow Dataset Analysis project?
1 question
What is the main purpose of unzipping the folder containing Stack Overflow data?
1 question

Project 4 Uber Taxi Dataset

14 lectures
Links for the Course's Materials and Codes
00:10
Uber Taxi Dataset Analysis
05:07
Introduction to Project
06:10
Copy Activity
06:51
ForEach Activity
12:08
PreProcessing
13:47
Taxi Demand Analysis
18:42
Payment Analysis
20:24
What is the main source of data for the Uber Taxi Dataset Analysis project?
1 question
Which activity is used in the data factory pipeline to fetch data files from the NYC TLC site?
1 question
What is the purpose of pre-processing the data in Databricks?
1 question
Which analysis helps in understanding the overall demand for Uber rides during different time periods?
1 question
What is the main purpose of using the forEach activity in the data factory pipeline?
1 question
Farewell
01:20

