Solving Some Image Processing and Computer Vision Problems with Python libraries

In this article, a few image processing / computer vision problems and their solutions  with python libraries (scikit-image, cv2) will be discussed. Some of the problems are from the exercises from this book (available on amazon). This blog will be continued here.

Removing Gaussian Noise from images by computing mean and median images with scikit-image

  1. Start with an input image.
  2. Create n (e.g, n=100) noisy images by adding i.i.d. Gaussian noise (with zero mean) to the original image, with scikit-image.
  3. Compute the mean (median) of the noisy images.
  4. Compare PSNR with the original image.
  5. Vary n and compare the results.
from skimage import img_as_float
from skimage.util import random_noise
from skimage.measure import compare_psnr
from skimage.io import imread
import matplotlib.pylab as plt
import numpy as np

im = img_as_float(imread('../new images/parrot.jpg')) # original image
np.random.seed(0)
# generate n noisy images from the original image by adding Gaussian noise
n = 25
images = np.zeros((n, im.shape[0], im.shape[1], im.shape[2]))
sigma = 0.2
for i in range(n):
    images[i,...] = random_noise(im, var=sigma**2)

im_mean = images.mean(axis=0)
im_median = np.median(images, axis=0)
plt.figure(figsize=(20,16))
plt.subplots_adjust(left=.02, right=.98, bottom=.001, top=.96, wspace=.05, hspace=.01)
plt.subplot(221), plt.imshow(im), plt.axis('off'), plt.title('Original image', size=20)
plt.subplot(222), plt.imshow(images[0]), plt.axis('off'), plt.title('Noisy PSNR: ' + str(compare_psnr(im, images[0])), size=20)
plt.subplot(223), plt.imshow(im_mean), plt.axis('off'), plt.title('Mean PSNR: ' + str(compare_psnr(im, im_mean)), size=20)
plt.subplot(224), plt.imshow(im_median), plt.axis('off'), plt.title('Median PSNR: ' + str(compare_psnr(im, im_median)), size=20)
plt.show()

The next figure shows the original image, a noisy image generated from it by adding Gaussian noise (with 0 mean) to it and the images obtained by taking mean / median over all the n noisy images generated. As can be seen, the Gaussian noise in the images gets cancelled out by taking mean / median.

with n = 25

p25.png

with n=100
p100.png


plt.hist(images[:,100,100,0], color='red', alpha=0.2, label='red')
plt.hist(images[:,100,100,1], color='green', alpha=0.2, label='green')
plt.hist(images[:,100,100,2], color='blue', alpha=0.2, label='blue')
plt.legend()
plt.grid()
plt.show()

The next figure shows how a pixel value (that can be considered a random variable) for a particular location in different noisy images follows approximately a Gaussian distribution.

Distribution of a pixel value at location (100,100) in the noisy images

pg.png


ns = [25, 50, 100, 200]
# mean_psnrs contain the PSNR values for different n
plt.plot(ns, mean_psnrs, '.--', label='PSNR (mean)')
plt.plot(ns, median_psnrs, '.--', label='PSNR (median)')
plt.legend()
plt.xlabel('n'),  plt.ylabel('PSNR')
plt.show()

The following figure shows that the PSNR improves with large n (since by SLLN / WLLN, the sample mean converges to population mean 0 of the Gaussian noise). Also, for median the improvement in the image quality is higher for larger values of n.

psnr

Tracking Pedestrians with HOG-SVM with OpenCV / scikit-image

  1. Start with a video with pedestrians.
  2. Capture the video / extract frames from the video.
  3. For each frame
    1. Create HOG scale pyramid of the frame image.
    2. At each scale, use a sliding window to extract the corresponding block from the frame, compute the HOG descriptor features.
    3. Use cv2‘s HOGDescriptor_getDefaultPeopleDetector() – a pre-trained SVM classifier on the HOG descriptor to classify whether the corresponding block contains a pedestrian or not.
    4. Run non-max-suppression to get rid of multiple detection of the same person.
    5. Use cv2‘s  detectMultiScale() function to implement steps 3-4.

The code is adapted from the code here and here.


# HOG descriptor using default people (pedestrian) detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

# run detection, using a spatial stride of 4 pixels,
# a scale stride of 1.02, and zero grouping of rectangles
# (to demonstrate that HOG will detect at potentially
# multiple places in the scale pyramid)
(foundBoundingBoxes, weights) = hog.detectMultiScale(frame, winStride=(4, 4), padding=(8, 8), scale=1.02, finalThreshold=0, useMeanshiftGrouping=False)

# convert bounding boxes from format (x1, y1, w, h) to (x1, y1, x2, y2)
rects = np.array([[x, y, x + w, y + h] for (x, y, w, h) in foundBoundingBoxes])

# run non-max suppression on the boxes based on an overlay of 65%
nmsBoundingBoxes = non_max_suppression(rects, probs=None, overlapThresh=0.65)

cv2 functions are used to extract HOG descriptor features and pedestrian detection with SVM,  whereas scikit-image functions are used to visualize the HOG features. The animations below display the original video, what HOG sees and  the detected pedestrians after non-max suppression. Notice there are a few false positive detection.

Original Videoped1

HOG-descriptor features video (what HOG sees)pedh1Original Video with detected Pedestrians ped1o

Face Detection with HaarCascade pre-trained AdaBoost classifiers with OpenCV

  1. Capture video with webcam with cv2.VideoCapture().
  2. For each frame, use the pre-trained Adaboost Cascade classifiers (the haarcascade_frontalface_default classifier for face detection and haarcascade_eye_tree_eyeglasses classifier for better detection of the eyes with glasses, from the corresponding xml files that come with cv2’s installation) using Haar-like features with cv2.CascadeClassifier().
  3. First detect the face(s) with the detectMultiScale() function and draw a bounding box. Then detect the eyes inside a detected face with the same function.
  4. The following python code snippet shows how to detect faces and eyes with cv2. The code is adapted from here.

 


# read the cascade classifiers from the xml files from the correct path into face_cascade  # and eye_cascade
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
# return bounding box of the face(s) if one is detected
faces = face_cascade.detectMultiScale(gray, 1.03, 5)
for (x,y,w,h) in faces:
   frame = cv2.rectangle(frame,(x,y),(x+w,y+h),(255,0,0),2)
   roi_gray = gray[y:y+h, x:x+w]
   roi_color = frame[y:y+h, x:x+w]
   eyes = eye_cascade.detectMultiScale(roi_gray)
   for (ex,ey,ew,eh) in eyes:
      cv2.rectangle(roi_color,(ex,ey),(ex+ew,ey+eh),(0,255,0),2)

The next animation shows the results of face detection when scalefactor 1.03 was used to create the scale pyramid.  As can be seen, the eyes with the glasses on and some small faces from the photos are not detected at this scale.

fcr.gif

The next animation shows the results of face detection when scalefactor 1.3 was used to create the scale pyramid.  As can be seen, the eyes with/without the glasses on as well as most of the small faces from the photos are detected at this scale most of the time.

me

kiff

Object Tracking with OpenCV trackers

ftfttmilfttcrt

Object Saliency Detection with OpenCV

saliency

Linear / QR Barcode Generation and Detection with OpenCV

me_barcodeme_barcode_detected

me_barcode_detected

bard

Image Segmentation with Random Walk with scikit-image

rw_segmentation1.png

 

Image Segmentation with Grab-Cut with OpenCV

gr_whale.png

Segmentation with SLIC + RAG in scikit-image

me_seg

Homography with scikit-image / OpenCV

books

me10.jpg
homo1

shutterstock

me12.jpg

me_shutter

 

boimela

 

 

homo.gif

Face Morphing (Beier-Neely morphing) with Pystasm

me14     sheldon

fk.png

ad

Object Detection with YOLO DarkNet / Keras / OpenCV (Deep Learning model)

yolo_me

Semantic Segmentation with ENet / DeepLab (Deep Learning  model)

Input video and the segmented Output video
result

Input video and the segmented Output video
cycle

Neural Style Transfer with OpenCV / Torch (Deep Learning Model)

Input  & Output with style (Starry night)me_style

Text Detection with EAST (Deep Learning Model)

st

result

 

Image Colorization with Deep Learning (OpenCV / Caffe)

pcolor

result

OCR + Text Recognition with EAST + Tesseract

tes

 

Advertisements