What is Computer Vision?

According to Ballard and Brown in 1982, Computer Vision is an interdisciplinary scientific field that deals with how computers can gain high-level understanding from digital images or videos. From the perspective of engineering, it seeks to understand and automate tasks that the human visual system can do.

At the 19th CERN School of Computing, T. S. Huang stated that Computer Vision has a dual goal. From the biological science point of view, Computer Vision aims to come up with computational models of the human visual system. From the engineering point of view, Computer Vision aims to build autonomous systems that can perform some of the tasks the human visual system performs (and even surpass it in many cases).

These two goals are intimately related. The properties and characteristics of the human visual system often give inspiration to engineers who are designing Computer Vision systems. Conversely, Computer Vision algorithms can offer insights into how the human visual system works.

Sub-domains of Computer Vision include scene reconstruction, object detection, event detection, video tracking, object recognition, 3D pose estimation, learning, indexing, motion estimation, visual servoing, 3D scene modeling, and image restoration (Morris, 2004).

Why use Python for Computer Vision?

  1. Easy to use
  2. Open source
  3. Python has become the language of scientific computing
  4. Easy for visualization and debugging
  5. It can be directly integrated with web frameworks (as well as GUIs)

Delphi adds Powerful GUI Features and Functionalities to Python

In this tutorial, we’ll build Windows Apps with extensive Computer Vision capabilities by integrating Python’s Computer Vision libraries with Embarcadero’s Delphi, using Python4Delphi (P4D).

A small disclaimer: we have used some publicly-available images for the fair-use purposes of education so we can teach you how face recognition works. The copyright of the images remains with the owner and we acknowledge the source and their ownership.

P4D empowers Python users with Delphi’s award-winning VCL functionalities for Windows, which enable us to build native Windows apps 5x faster. This integration lets us create a modern GUI with a Windows 10 look and responsive controls for our Python Computer Vision applications. Python4Delphi also comes with an extensive range of demos, use cases, and tutorials.

We’re going to cover the following…

How to use OpenCV, Mahotas, Face Recognition, EasyOCR, and Keras Python libraries to perform Computer Vision tasks

All of them will be integrated with Python4Delphi to create Windows Apps with Computer Vision capabilities.

Prerequisites

Before we begin to work, download and install the latest Python for your platform. Follow the Python4Delphi installation instructions mentioned here. Alternatively, you can check out the easy instructions found in the Getting Started With Python4Delphi video by Jim McKeeth.

Time to get Started!

First, open and run our Python GUI using project Demo01 from Python4Delphi with RAD Studio. Then insert the script into the lower Memo, click the Execute button, and get the result in the upper Memo. You can find the Demo01 source on GitHub. The behind-the-scenes details of how Delphi manages to run your Python code in this amazing Python GUI can be found at this link.

Open Demo01.dproj.
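
As a quick first script for the lower Memo, the following sketch prints the Python version that Demo01 is running and lists which of the libraries used in this tutorial are not yet installed. The import names in REQUIRED_MODULES and the helper missing_modules are our own assumptions for illustration; they are not part of Python4Delphi.

```python
import importlib.util
import sys

# Assumed import names for the libraries covered in this tutorial.
REQUIRED_MODULES = ["cv2", "mahotas", "face_recognition", "easyocr", "tensorflow"]


def missing_modules(names):
    """Return the module names that the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The output should appear in the upper Memo after clicking Execute.
print("Python", sys.version.split()[0])
missing = missing_modules(REQUIRED_MODULES)
print("Missing modules:", ", ".join(missing) if missing else "none")
```

If a module is reported as missing, install it before running the matching section below.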

How do I perform Computer Vision with OpenCV on Windows?

OpenCV (Open Source Computer Vision Library) is an open-source Computer Vision and Machine Learning software library. OpenCV was built to provide a common infrastructure for Computer Vision applications and to accelerate the use of machine perception in commercial products. OpenCV supports various programming languages including Python.

OpenCV has more than 2500 optimized algorithms, including a comprehensive set of both classic and state-of-the-art Computer Vision and Machine Learning algorithms. These algorithms can be used to detect and recognize faces, identify objects, classify human actions in videos, track camera movements, track moving objects, extract 3D models of objects, produce 3D point clouds from stereo cameras, stitch images together to produce a high-resolution image of an entire scene, find similar images in an image database, remove red eyes from images taken using flash, follow eye movements, recognize scenery and establish markers to overlay it with augmented reality, and more.

First, here is how you can get OpenCV to work with Python4Delphi to create a GUI with Computer Vision and Machine Learning capabilities. OpenCV can be installed from PyPI with pip install opencv-python.

Note: This is an unofficial pre-built CPU-only OpenCV package for Python.

Don’t forget to add the path where OpenCV is installed to the System Environment Variables:

System Environment Variable Examples

The following OpenCV code example performs a perspective transformation of an image (run this inside the lower Memo of the Python4Delphi Demo01 GUI):
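
A minimal sketch of a perspective transformation, assuming OpenCV and NumPy are installed. The function names apply_homography and warp_quadrilateral, and their parameters, are hypothetical helpers written for this illustration. The first shows what a 3x3 perspective matrix does to a single pixel coordinate; the second uses OpenCV's getPerspectiveTransform and warpPerspective to straighten a four-cornered region of an image.

```python
def apply_homography(matrix, x, y):
    """Map one pixel coordinate through a 3x3 perspective matrix."""
    xh = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    yh = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
    # Dividing by w is what distinguishes a perspective from an affine transform.
    return xh / w, yh / w


def warp_quadrilateral(image_path, src_corners, out_width, out_height):
    """Warp the region bounded by src_corners onto an upright rectangle.

    src_corners must be ordered top-left, top-right, bottom-right, bottom-left.
    """
    # Imported here so the helper above runs without OpenCV installed.
    import cv2
    import numpy as np

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    src = np.float32(src_corners)
    dst = np.float32([[0, 0], [out_width, 0],
                      [out_width, out_height], [0, out_height]])
    matrix = cv2.getPerspectiveTransform(src, dst)
    warped = cv2.warpPerspective(image, matrix, (out_width, out_height))
    return matrix, warped


print(apply_homography([[2, 0, 10], [0, 2, 20], [0, 0, 2]], 5, 5))  # (10.0, 15.0)
```

To try it on a real picture, call warp_quadrilateral with the path to your own image and the four corner coordinates of the region you want to straighten, then save the returned array with cv2.imwrite.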

Here is the result in the Python GUI:

OpenCV Demo with Python4Delphi in Windows.

Read more: https://pythongui.org/learn-to-build-a-python-gui-for-computer-vision-tasks-with-powerful-opencv-library-in-a-delphi-windows-app/

How do I perform Computer Vision with Mahotas on Windows?

Mahotas is a fast computer vision algorithms library (all implemented in C++ for speed) that operates over NumPy arrays. Mahotas supports Python 2.7 and 3.4+.

Currently, Mahotas has over 100 functions for image processing and computer vision and it keeps growing.

Here are some notable algorithms provided by Mahotas:

  • Watershed
  • Convex points calculations
  • Hit & miss, thinning
  • Zernike & Haralick, LBP, and TAS features
  • Speeded-Up Robust Features (SURF), a form of local features
  • Thresholding
  • Convolution
  • Sobel edge detection
  • Spline interpolation
  • SLIC superpixels

Are you looking for a powerful Computer Vision library and a way to build a nice GUI for it? This section will show you how to get started!

First, here is how you can get Mahotas. It can be installed from PyPI with pip install mahotas.

The following Mahotas code example uses the Ridler-Calvard threshold to transform an image (run this inside the lower Memo of the Python4Delphi Demo01 GUI):
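
A minimal sketch of Ridler-Calvard thresholding, assuming Mahotas and NumPy are installed. The helper ridler_calvard_threshold is our own plain-Python illustration of the iterative idea (repeatedly place the threshold midway between the mean of the darker pixels and the mean of the brighter pixels); it is not Mahotas' implementation and may differ slightly from it. The hypothetical binarize_with_mahotas shows the library route through mahotas.thresholding.rc.

```python
def ridler_calvard_threshold(pixels, tolerance=0.5):
    """Iterative-selection threshold over a flat sequence of grey levels."""
    values = [float(p) for p in pixels]
    threshold = sum(values) / len(values)
    while True:
        below = [v for v in values if v <= threshold]
        above = [v for v in values if v > threshold]
        if not below or not above:
            # All pixels fall on one side; nothing left to separate.
            return threshold
        updated = (sum(below) / len(below) + sum(above) / len(above)) / 2
        if abs(updated - threshold) < tolerance:
            return updated
        threshold = updated


def binarize_with_mahotas(image_path):
    """Return a boolean mask of the pixels brighter than the RC threshold."""
    # Imported here so the helper above runs without Mahotas installed.
    import mahotas
    import numpy as np

    grey = mahotas.imread(image_path, as_grey=True).astype(np.uint8)
    threshold = mahotas.thresholding.rc(grey)
    return grey > threshold


print(ridler_calvard_threshold([0, 0, 0, 0, 100]))  # 50.0
```

Call binarize_with_mahotas with the path to your own image to get the black-and-white mask.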

Here is the Mahotas result in the Python GUI:

Mahotas Demo with Python4Delphi in Windows.

How do I perform Computer Vision with Face Recognition on Windows?

The Face Recognition library, billed as the world’s simplest face recognition library, can recognize and manipulate faces from Python or from the command line.

Face Recognition is built on dlib’s state-of-the-art face recognition, which uses deep learning. The model has an accuracy of 99.38% on the Labeled Faces in the Wild benchmark.

This library also provides a simple face_recognition command-line tool that lets us do face recognition on a folder of images from the command line.

This section will guide you through combining Python4Delphi with the Face Recognition library, inside Delphi and C++Builder, from installing Face Recognition with pip to using it to recognize all faces in any given image!

First, here is how you can get Face Recognition. It can be installed with pip install face_recognition.

You might encounter an error when installing Face Recognition, caused by dlib (one of its requirements). Please refer to this link for solutions.

Next, we will test the Face Recognition library to detect faces in this image:

Use the following code to recognize faces in any image, using the Histogram of Oriented Gradients (HOG)-based model (run this inside the lower Memo of the Python4Delphi Demo01 GUI):
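
A minimal sketch of HOG-based face detection, assuming the face_recognition package is installed. The function names find_faces, box_to_rect, and describe_faces are hypothetical helpers for this illustration. face_locations returns each face as a (top, right, bottom, left) tuple, so the sketch also converts that box into the more familiar x, y, width, height form.

```python
def find_faces(image_path):
    """Return (top, right, bottom, left) boxes for every face in the image."""
    # Imported here so the helpers below run without face_recognition installed.
    import face_recognition

    image = face_recognition.load_image_file(image_path)
    return face_recognition.face_locations(image, model="hog")


def box_to_rect(box):
    """Convert a (top, right, bottom, left) box into (x, y, width, height)."""
    top, right, bottom, left = box
    return left, top, right - left, bottom - top


def describe_faces(boxes):
    """Build printable lines summarizing the detected faces."""
    lines = [f"Found {len(boxes)} face(s)."]
    for index, box in enumerate(boxes, start=1):
        x, y, width, height = box_to_rect(box)
        lines.append(f"Face {index}: x={x}, y={y}, width={width}, height={height}")
    return lines


# Example with one made-up box; replace it with find_faces("your_image.jpg").
print("\n".join(describe_faces([(10, 110, 60, 30)])))
```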

Face Recognition Python4Delphi Results

Face Recognition Demo with Python4Delphi in Windows.

How do I perform Computer Vision with EasyOCR on Windows?

EasyOCR, as the name suggests, is a Python package that allows computer vision developers to effortlessly perform Optical Character Recognition. EasyOCR provides end-to-end, ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.

When it comes to OCR, using EasyOCR is by far the most straightforward way to apply Optical Character Recognition:

  • The EasyOCR package can be installed with a single pip command.
  • The dependencies of the EasyOCR package are minimal, making it easy to configure your OCR development environment.
  • Once EasyOCR is installed, only one import statement is required to import the package into your project.
  • From there, all you need is two lines of code to perform OCR — one to initialize the Reader class and then another to OCR the image via the readtext function.

First, here is how you can get EasyOCR. It can be installed with pip install easyocr.

Next, we will test the EasyOCR library to detect both Chinese and English characters in this image:

The following shows basic usage of EasyOCR to detect both Chinese and English characters in the sample image above (run this inside the lower Memo of the Python4Delphi Demo01 GUI):
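
A minimal sketch of the two-step EasyOCR workflow (create a Reader, then call readtext), assuming the easyocr package is installed. The language code ch_sim (Simplified Chinese) is our assumption about which Chinese script the sample uses, and read_text and confident_text are hypothetical helper names. readtext returns a list of (bounding box, text, confidence) entries, which the second helper filters by confidence.

```python
def read_text(image_path, languages=("ch_sim", "en")):
    """Run OCR on an image and return (bounding_box, text, confidence) tuples."""
    # Imported here so the helper below runs without EasyOCR installed.
    import easyocr

    # The Reader loads its models once; the first run may download them.
    reader = easyocr.Reader(list(languages))
    return reader.readtext(image_path)


def confident_text(results, min_confidence=0.5):
    """Keep only the recognized strings at or above the confidence cutoff."""
    return [text for _box, text, confidence in results
            if confidence >= min_confidence]


# Example with made-up results; replace them with read_text("your_image.jpg").
sample = [([[0, 0], [50, 0], [50, 20], [0, 20]], "hello", 0.92),
          ([[0, 30], [50, 30], [50, 50], [0, 50]], "n0ise", 0.18)]
print(confident_text(sample))  # ['hello']
```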

EasyOCR Optical Character Recognition Result

EasyOCR Demo with Python4Delphi in Windows.

Amazing, isn’t it? You can make your computer recognize Chinese and English characters, along with the rest of EasyOCR’s 80+ supported languages.

How do I perform Computer Vision with Keras on Windows?

Keras is a high-level neural networks API for Python that acts as an interface for the TensorFlow library. As a central part of the tightly connected TensorFlow 2.0 ecosystem, Keras covers every step of the Machine Learning workflow, from data management to hyperparameter training to deployment solutions.

Do you want to use Keras to solve Computer Vision problems, and build a nice GUI for it? This section will show you a demo of the Keras image segmentation model trained from scratch on the Oxford Pets dataset.

First, here is how you can get Keras. Keras ships with TensorFlow 2, which can be installed with pip install tensorflow.

Download the dataset by running the following commands in the Windows command prompt:

Download the Oxford Pets Dataset.

The following is a code example of Keras to perform image segmentation with a U-Net-like architecture (run this inside the lower Memo of Python4Delphi Demo01 GUI). The code used in this section is authored by François Chollet.
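
A heavily simplified sketch of a U-Net-like segmentation model, assuming TensorFlow 2 is installed. This is not François Chollet's code: the function build_tiny_unet, its layer sizes, and the defaults of 160x160 inputs and 3 classes are our own assumptions for illustration. It shows the core shape of the idea (downsample, upsample, and reconnect the encoder features), plus a hypothetical plain-Python helper that turns per-pixel class scores into a mask.

```python
def build_tiny_unet(img_size=(160, 160), num_classes=3):
    """Build a very small encoder-decoder network with one skip connection."""
    # Imported here so the helper below runs without TensorFlow installed.
    from tensorflow import keras
    from tensorflow.keras import layers

    inputs = keras.Input(shape=img_size + (3,))
    # Encoder: extract features, then halve the resolution.
    encoded = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
    pooled = layers.MaxPooling2D(2)(encoded)
    # Bottleneck at half resolution.
    middle = layers.Conv2D(64, 3, padding="same", activation="relu")(pooled)
    # Decoder: restore the resolution and reuse the encoder features.
    upsampled = layers.UpSampling2D(2)(middle)
    merged = layers.Concatenate()([upsampled, encoded])
    decoded = layers.Conv2D(32, 3, padding="same", activation="relu")(merged)
    # One softmax score per class for every pixel.
    outputs = layers.Conv2D(num_classes, 3, padding="same",
                            activation="softmax")(decoded)
    return keras.Model(inputs, outputs)


def probabilities_to_mask(probabilities):
    """Turn H x W x C per-pixel class scores into an H x W class-index mask."""
    return [[max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
            for row in probabilities]


scores = [[[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]],
          [[0.2, 0.2, 0.6], [0.3, 0.4, 0.3]]]
print(probabilities_to_mask(scores))  # [[1, 0], [2, 1]]
```

With NumPy arrays, the same mask can be computed with np.argmax(prediction, axis=-1). The model returned by build_tiny_unet still has to be compiled and trained on the dataset before its predictions mean anything.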

The Input Image vs its Segmentation Mask Result in the Python4Delphi GUI

Keras Demo with Python4Delphi in Windows.

Want to know more? Check out Python4Delphi, which lets you easily build Python GUIs for Windows using Delphi.