CUDA programming Masterclass with C++

Course category: Programming Languages

Learn parallel programming on GPUs with CUDA, from basic concepts to advanced algorithm implementations.

50.000 VND

1.499.000 VND

Full set of lectures

Convenient online learning

Fast activation in 2-5 minutes

Automatic payment

Downloads allowed

Description

This course is all about CUDA programming. We start with the basic concepts, including the CUDA programming model, execution model, and memory model, and then show you how to implement advanced algorithms using CUDA. CUDA programming is all about performance, so throughout this course you will learn multiple optimization techniques and how to use them when implementing algorithms. We also discuss profiling techniques extensively, along with tools from the CUDA toolkit including nvprof, nvvp, CUDA Memcheck, and CUDA-GDB. This course contains the following sections.

Introduction to CUDA programming and CUDA programming model

CUDA Execution model

CUDA memory model: Global memory

CUDA memory model: Shared and Constant memory

CUDA streams

Tuning CUDA instruction level primitives

Algorithm implementation with CUDA

CUDA tools

The course also includes many programming exercises and quizzes. Working through all of them will help you digest the concepts we discuss here.

This course is the first in the CUDA master class series we are currently working on, so the knowledge you gain here is essential for following the later courses as well.

What you will learn

All the basic knowledge about CUDA programming

Ability to design and implement optimized parallel algorithms

Basic workflow of parallel algorithm design

Advanced CUDA concepts

Requirements

Basic C or C++ programming knowledge
How to use the Visual Studio IDE
CUDA toolkit
NVIDIA GPU
You should be familiar with the basic setup of a C++ project, how to change project properties, etc.

Course content

8 sections

Introduction to CUDA programming and CUDA programming model

20 lectures

Very very important

07:48

Introduction to parallel programming

08:50

Parallel computing and Super computing

07:19

Let's investigate some background.

4 questions

How to install CUDA toolkit and first look at CUDA program

06:12

Basic elements of CUDA program

16:50

Organization of threads in a CUDA program - threadIdx

08:38

Organization of threads in a CUDA program - blockIdx, blockDim, gridDim

06:14

Programming exercise 1

00:29

Unique index calculation using threadIdx, blockIdx and blockDim

09:20

Unique index calculation for 2D grid 1

05:53

Unique index calculation for 2D grid 2

05:09

Memory transfer between host and device

11:13

Programming exercise 2

01:04

Sum array example with validity check

09:13

Sum array example with error handling

04:32

Sum array example with timing

08:18

Extend sum array implementation to sum up 3 arrays

1 question

Device properties

05:30

Summary

04:17

CUDA Execution model

16 lectures

Understand the device better

08:46

All about warps

09:43

Warp divergence

12:28

Resource partitioning and latency hiding 1

05:35

Resource partitioning and latency hiding 2

10:41

Occupancy

11:16

Profile driven optimization with nvprof

12:04

Parallel reduction as synchronization example

19:08

Parallel reduction as warp divergence example

10:11

Parallel reduction with loop unrolling

07:03

Parallel reduction as warp unrolling

06:48

Reduction with complete unrolling

04:09

Performance comparison of reduction kernels

05:18

CUDA Dynamic parallelism

10:03

Reduction with dynamic parallelism

05:33

Summary

04:36

CUDA memory model

12 lectures

CUDA memory model

06:49

Different memory types in CUDA

09:04

Memory management and pinned memory

07:19

Zero copy memory

08:45

Unified memory

04:39

Global memory access patterns

12:55

Global memory writes

03:53

AOS vs SOA

06:03

Matrix transpose

19:34

Matrix transpose with unrolling

06:21

Matrix transpose with diagonal coordinate system

08:36

Summary

03:00

CUDA Shared memory and constant memory

13 lectures

Introduction to CUDA shared memory

09:04

Shared memory access modes and memory banks

09:06

Row major and Column major access to shared memory

08:51

Static and Dynamic shared memory

04:19

Shared memory padding

05:44

Parallel reduction with shared memory

04:44

Synchronization in CUDA

03:38

Matrix transpose with shared memory

11:53

CUDA constant memory

13:10

Matrix transpose with Shared memory padding

05:47

CUDA warp shuffle instructions

14:59

Parallel reduction with warp shuffle instructions

03:50

Summary

02:10

CUDA Streams

8 lectures

Introduction to CUDA streams and events

06:25

How to use CUDA asynchronous functions

07:10

How to use CUDA streams

10:28

Overlapping memory transfer and kernel execution

05:23

Stream synchronization and blocking behavior of the NULL stream

06:57

Explicit and implicit synchronization

02:31

CUDA events and timing with CUDA events

06:03

Creating inter stream dependencies with events

04:31

Performance Tuning with CUDA instruction level primitives

4 lectures

Introduction to different types of instructions in CUDA

04:01

Floating point operations

06:46

Standard and intrinsic functions

08:29

Atomic functions

08:22

Parallel Patterns and Applications

6 lectures

Scan algorithm introduction

05:38

Simple parallel scan

08:24

Work efficient parallel exclusive scan

09:33

Work efficient parallel inclusive scan

07:41

Parallel scan for large data sets

04:52

Parallel Compact algorithm

07:49

Bonus: Introduction to Image processing with CUDA

6 lectures

Introduction part 1

08:04

Introduction part 2

11:41

Digital image processing

09:39

Digital image fundamentals : Human perception

11:10

Digital image fundamentals : Image formation

15:22

OpenCV installation

06:28

Student reviews

No reviews yet


