Description

Within the field of Computer Vision is the sub-area of Optical Character Recognition (OCR), which aims to transform images into text. OCR converts images containing typed, handwritten or printed text into characters that a machine can process. Scanned or photographed documents can be converted into text that can be edited in any tool, such as Microsoft Word. A common application is automatic form reading: you send a photo of your credit card or driver's license, and the system reads your data so you do not need to type it manually. A self-driving car can use OCR to read traffic signs, and a parking lot can grant access by reading the license plates of cars!

To introduce you to this area, this course teaches you in practice how to use OCR libraries to recognize text in images and videos, with all the code implemented step by step in the Python programming language! We are going to use Google Colab, so you do not have to worry about installing libraries on your machine, as everything will be developed online using Google's GPUs! You will also learn how to build your own OCR from scratch using Deep Learning and Convolutional Neural Networks! Below you can check the main topics of the course:

  • Recognition of texts in images and videos using Tesseract, EasyOCR and EAST

  • Search for specific terms in images using regular expressions

  • Techniques for improving image quality, such as: thresholding, color inversion, grayscale, resizing, noise removal, morphological operations and perspective transformation

  • EAST architecture and EasyOCR library for better performance in natural scenes

  • Training an OCR from scratch using TensorFlow and modern Deep Learning techniques, such as Convolutional Neural Networks

  • Application of natural language processing techniques in the texts extracted by OCR (word cloud and named entity recognition)

  • License plate reading

These are just some of the main topics! By the end of the course, you will know everything you need to create your own text recognition projects using OCR!
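
As a rough illustration of the regular-expression topic listed above, the sketch below searches a block of text for dates and e-mail addresses. The `ocr_text` string is hypothetical and stands in for whatever an OCR engine would return; the patterns are deliberately simple assumptions, not the ones used in the course.

```python
import re

# Hypothetical text, standing in for the string an OCR engine would return.
ocr_text = """Invoice 2023-118
Date: 14/03/2023
Contact: billing@example.com
Due: 28/03/2023"""

# Dates written as DD/MM/YYYY.
date_pattern = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")

# A deliberately simple e-mail pattern, not a full address validator.
email_pattern = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")

dates = date_pattern.findall(ocr_text)
emails = email_pattern.findall(ocr_text)

print(dates)
print(emails)
```

Real OCR output tends to contain misread characters, so in practice patterns like these usually need to be more tolerant than the ones shown here.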

What you will learn

Use Tesseract, EAST and EasyOCR tools for text recognition in images and videos

Understand the differences between OCR in controlled and natural environments

Apply image pre-processing techniques to improve image quality, such as: thresholding, inversion, resizing, morphological operations and noise reduction

Use EAST architecture and EasyOCR library for better performance in natural scenes

Train an OCR from scratch using Deep Learning and Convolutional Neural Networks

Apply natural language processing techniques to the texts extracted by OCR (word cloud and named entity recognition)

Read license plates
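
To give a feel for two of the pre-processing techniques named above, here is a minimal sketch of simple thresholding and color inversion on a tiny grayscale image. It uses plain Python lists as a stand-in for the image arrays an image library would provide; the function names `to_binary` and `invert` and the default threshold of 127 are assumptions made for illustration.

```python
def to_binary(gray, threshold=127):
    """Simple thresholding: pixels above the threshold become white (255),
    all others become black (0)."""
    return [[255 if px > threshold else 0 for px in row] for row in gray]


def invert(image):
    """Color inversion for 8-bit pixels: dark becomes light and vice versa."""
    return [[255 - px for px in row] for row in image]


# Hypothetical 2x3 grayscale image with values in the range 0-255.
gray = [
    [12, 200, 90],
    [130, 40, 250],
]

binary = to_binary(gray)
inverted = invert(binary)

print(binary)
print(inverted)
```

Whether dark text on a light background or the inverse works better depends on the OCR engine and the image, which is why both operations are worth knowing.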

Requirements

  • Programming logic
  • Basic Python programming

Course content

13 sections

Introduction

3 lectures
Course content
12:57
Introduction to OCR
06:37
Course materials
00:11

OCR with Tesseract

11 lectures
Introduction to Tesseract
12:09
Preparing the environment
07:56
First text recognition
02:28
Support for other languages
10:46
Page segmentation mode (PSM)
11:02
Page orientation detection
04:35
Selection of texts 1
09:06
Selection of texts 2
14:28
Selection of texts 3
10:21
Search using regular expressions
11:14
Detections in natural scenarios
06:28

Techniques for image pre-processing

16 lectures
Grayscale
07:22
Thresholding - intuition
12:22
Simple thresholding
06:36
Thresholding with Otsu method
06:36
Adaptive thresholding
06:27
Gaussian adaptive thresholding
04:47
Color inversion
04:37
Resizing - intuition
05:31
Resizing - implementation
05:37
Morphological operations - intuition
03:48
Morphological operations - implementation
10:56
Noise removal - intuition
15:58
Noise removal - implementation
08:16
Text recognition with OCR
04:07
HOMEWORK
00:08
Homework solution
04:21

OCR with EAST for natural scenes

6 lectures
EAST - introduction
09:19
Pre-processing the image
12:25
Loading the neural network
10:29
Decoding the image 1
07:59
Decoding the image 2
14:38
Text recognition
05:54

Training a custom OCR

18 lectures
Importing the libraries
04:40
MNIST 0-9 dataset
11:11
Kaggle A-Z dataset
11:12
Joining the datasets
05:33
Pre-processing the data
17:52
Building the neural network
14:03
Training the neural network
07:58
Evaluating the neural network
12:04
Saving the neural network
03:17
Testing with images
10:39
Preparing the environment
05:45
Pre-processing the image
07:40
Contour detection
14:16
Processing the detections 1
12:05
Processing the detections 2
07:37
Character recognition
12:37
Problems with 0 and O, 1 and l, 5 and S
06:31
Problems with undetected texts
05:51

Natural scenarios with EasyOCR

5 lectures
Preparing the environment
05:37
Text recognition
02:14
Writing the results on the image
13:42
Other languages - French and Chinese
05:58
Text recognition (background)
08:22

OCR in videos

5 lectures
Preparing the environment
07:51
Video settings
11:39
Processing the video
04:42
OCR with EAST and Tesseract
13:04
OCR with EasyOCR
05:44

Project 1: Searching for specific terms

7 lectures
Preparing the environment
06:16
Text recognition
06:30
Searching for texts
05:58
Word cloud
12:53
Named entity recognition
03:37
Search for texts in images
10:27
Saving the results
04:54

Project 2: Scanner + OCR

6 lectures
Preparing the environment
05:40
Contour detection
08:37
Perspective transformation
11:37
OCR with Tesseract
03:41
Improving image quality
08:31
Putting it all together
02:41

Project 3: License plate reading

3 lectures
Pre-processing the image
09:02
Text recognition
05:22
Improving image quality
02:17

Extra content 1: artificial neural networks

8 lectures
Biological fundamentals
05:42
Single layer perceptron
19:23
Multilayer perceptron – sum and activation functions
14:20
Multilayer perceptron – error calculation
05:19
Gradient descent
09:49
Delta parameter
08:09
Updating weights with backpropagation
14:03
Bias, error, stochastic gradient descent, and more parameters
17:56

Extra content 2: convolutional neural networks

5 lectures
Introduction to convolutional neural networks
07:18
Convolutional operation
10:04
Pooling
05:28
Flattening
06:31
Dense neural network
05:10

Final remarks

2 lectures
Final remarks
01:53
BONUS
01:32
