Description

This course is an introduction to speaker recognition techniques.


Speaker recognition lies at the intersection of audio processing, biometrics, and machine learning, and has a wide range of applications. You can find it in smartphones, smart home devices, and various commercial services.


In this course, we will start with the history of speaker recognition techniques, to see how they evolved from simple human efforts to modern intelligent systems based on deep learning.


We will cover the basics of acoustics, perception, audio processing, signal processing, and feature extraction, so you don't need a background in these domains. We will also introduce popular machine learning approaches, such as Gaussian mixture models, support vector machines, factor analysis, and neural networks.


We will focus on how to build speaker recognition systems based on acoustic features and machine learning models, with an emphasis on modern speaker recognition with deep learning, including the different options for inference logic, loss functions, and neural network topologies.


We will also talk about data processing techniques such as data cleansing, data augmentation, and data fusion.


The course includes many hands-on exercises and coding examples to help you master the topics it introduces, and a final project that guides you through building your own speaker recognition system from scratch.


If you are a college student interested in AI or signal processing, or a software engineer, system architect, or product manager working with related technologies, then this course is for you!

What you will learn

Basic concepts and core algorithms in speaker recognition

Audio processing and acoustics

Machine learning and deep learning basics

Coding practice and toolkits for audio and speech

Python and PyTorch for machine learning

Building a speaker recognition system from scratch

Requirements

  • College-level mathematics
  • Experience with machine learning or coding is a plus

Course content

10 sections

Introduction to this course

6 lectures
Should I take this course?
03:17
Expected outcome from this course
02:31
About this course
2 questions
How to max your win from this course
01:50
Max your win from this course
2 questions
Syllabus
01:44

The History of Voice Identity Techniques

8 lectures
What is voice identity
06:17
The earliest voice-id techniques
06:52
Voice identity concepts and earliest systems
4 questions
Get started with the Audacity software
2 questions
The development of voice-id techniques
10:05
The new age of voice-id techniques
06:07
Development and new age of voice-id techniques
3 questions
Brainstorm about applications of speaker recognition
1 question

Fundamentals of Audio Processing

13 lectures
Audio and acoustics
07:30
Audio and acoustics
3 questions
Hearing and perception 1
06:10
Hearing and perception 2
05:54
Mel scale
1 question
Hearing and perception
3 questions
Audio signal processing
09:02
Audio coding and formats
11:17
Audio signals
4 questions
Parse a WAV file
5 questions
Convert SPHERE file to WAV
1 question
Learning to use SoX
08:28
Convert MP4 file to FLAC
1 question

Acoustic Feature Extraction

9 lectures
Short-time analysis
11:53
Time domain features
06:07
Short-time analysis and time domain features
5 questions
Zero cross rate
1 question
Frequency domain features
08:10
Discrete Fourier Transform
1 question
Commonly used features
09:23
Acoustic features
4 questions
Visualize MFCC features of a YouTube video
2 questions

Fundamentals of Speaker Recognition

13 lectures
Intro to speaker recognition 1
08:15
Intro to speaker recognition 2
09:42
Intro to speaker recognition
4 questions
System workflow of speaker recognition
05:49
Similarity scoring
08:28
Cosine similarity in Python
1 question
System workflow and similarity scoring
3 questions
Evaluation and metrics 1
08:51
Evaluation and metrics 2
08:06
Equal error rate (EER) in Python
1 question
Score normalization
05:14
Evaluation metrics and score normalization
4 questions
T-norm in Python
1 question

Early Speaker Recognition Approaches

13 lectures
Gaussian mixture models 1
08:17
Gaussian mixture models 2
05:37
Gaussian mixture models
3 questions
Gaussian mixture models 3
06:14
Fit a GMM to data
2 questions
Universal background model
06:21
Support vector machines 1
05:04
Support vector machines 2
06:27
GMM-UBM and GMM-SVM
3 questions
Factor analysis
07:31
Joint factor analysis
04:25
i-vector
07:16
Factor analysis, JFA, and i-vector
3 questions

Deep Learning Basics

13 lectures
Intro to deep learning 1
07:04
Intro to deep learning 2
06:57
Intro to deep learning
3 questions
Feed-forward neural networks
06:05
Convolutional neural networks
09:25
Feed-forward and CNN
3 questions
1-dimensional convolutions
1 question
Recurrent neural networks
07:49
RNN
2 questions
Attention and transformer
07:50
Attention
2 questions
Deep learning with PyTorch
06:42
Sequence classification with PyTorch
4 questions

Speaker Recognition with Deep Learning

13 lectures
Indirect use of neural networks
07:50
Direct use of neural networks
04:16
Indirect and direct use of neural networks
1 question
Inference 1
07:02
Inference 2
10:02
Inference
2 questions
Loss function 1
08:17
Softmax function
1 question
Cross entropy loss
1 question
Loss function 2
08:22
Triplet loss
1 question
Loss function 3
11:29
Loss function
3 questions

Data Processing in Speaker Recognition

12 lectures
Data requirement
09:05
Data requirement
2 questions
Data preprocessing
09:17
Data preprocessing
3 questions
Data augmentation 1
08:39
Data augmentation 2
09:03
Data augmentation 3
05:49
Data augmentation
4 questions
Data augmentation with Python
1 question
Data fusion
06:33
Common datasets
06:07
Evaluation trial list
07:39

Final Project

5 lectures
Build a speaker recognition system from scratch
7 questions
Template project code explained
14:52
Parallelism
09:20
Represent any dataset with CSV
05:55
What's next?
00:45
