# Solving Some Image Processing and Computer Vision Problems with Python libraries

In this article, a few image processing / computer vision problems and their solutions with Python libraries (scikit-image, cv2) will be discussed. Some of the problems are exercises from this book (available on Amazon). This blog will be continued here.

## Removing Gaussian Noise from images by computing mean and median images with scikit-image

2. Create n (e.g., n = 100) noisy images by adding i.i.d. Gaussian noise (with zero mean) to the original image, with scikit-image.
3. Compute the mean (median) of the noisy images.
4. Compare PSNR with the original image.
5. Vary n and compare the results.
```
from skimage import img_as_float
from skimage.io import imread
from skimage.util import random_noise
# compare_psnr was removed from skimage.measure in newer scikit-image releases;
# skimage.metrics.peak_signal_noise_ratio is its replacement
from skimage.metrics import peak_signal_noise_ratio
import matplotlib.pylab as plt
import numpy as np

def psnr(reference, test):
    # the images are floats in [0, 1], so the peak value (data_range) is 1
    return peak_signal_noise_ratio(reference, test, data_range=1.0)

im = img_as_float(imread('../new images/parrot.jpg')) # original image
np.random.seed(0)
# generate n noisy images from the original image by adding Gaussian noise
n = 25
images = np.zeros((n,) + im.shape)
sigma = 0.2
for i in range(n):
    images[i,...] = random_noise(im, var=sigma**2)

im_mean = images.mean(axis=0)          # pixel-wise mean over the n noisy images
im_median = np.median(images, axis=0)  # pixel-wise median over the n noisy images
plt.figure(figsize=(20,16))
plt.subplots_adjust(left=.02, right=.98, bottom=.001, top=.96, wspace=.05, hspace=.01)
plt.subplot(221), plt.imshow(im), plt.axis('off'), plt.title('Original image', size=20)
# show the first noisy image as a representative of the n noisy images
plt.subplot(222), plt.imshow(images[0]), plt.axis('off'), plt.title('Noisy PSNR: ' + str(psnr(im, images[0])), size=20)
plt.subplot(223), plt.imshow(im_mean), plt.axis('off'), plt.title('Mean PSNR: ' + str(psnr(im, im_mean)), size=20)
plt.subplot(224), plt.imshow(im_median), plt.axis('off'), plt.title('Median PSNR: ' + str(psnr(im, im_median)), size=20)
plt.show()
```

The next figure shows the original image, one of the noisy images generated from it by adding zero-mean Gaussian noise, and the images obtained by taking the pixel-wise mean / median over all n noisy images. As can be seen, the Gaussian noise is largely averaged out by taking the mean / median.
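
Why averaging helps can be made precise with a short derivation. This is only a sketch, assuming the noise is exactly i.i.d. Gaussian with variance σ² and ignoring the clipping of pixel values to [0, 1] that `random_noise` applies by default. The symbols x (a clean pixel value), y_k (its k-th noisy observation) and ε_k (the noise sample) are introduced here for illustration.

```latex
\begin{align*}
y_k &= x + \varepsilon_k, \qquad \varepsilon_k \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2), \quad k = 1, \dots, n \\
\bar{y} &= \frac{1}{n} \sum_{k=1}^{n} y_k = x + \frac{1}{n} \sum_{k=1}^{n} \varepsilon_k
\;\Rightarrow\; \mathbb{E}[\bar{y}] = x, \quad \operatorname{Var}(\bar{y}) = \frac{\sigma^2}{n} \\
\mathrm{PSNR} &= 10 \log_{10} \frac{\mathrm{MAX}^2}{\mathrm{MSE}}
\;\Rightarrow\; \mathrm{PSNR}_{\text{mean}} \approx \mathrm{PSNR}_{\text{noisy}} + 10 \log_{10} n
\end{align*}
```

Under these assumptions the mean image should gain roughly 14 dB for n = 25 and 20 dB for n = 100 over a single noisy image. For Gaussian noise the sample median has asymptotic variance πσ² / (2n), so the median image is expected to trail the mean image by about 2 dB. Because of clipping, the measured values will deviate somewhat from these figures.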

Figures: results with n = 25 and with n = 100.

```
plt.hist(images[:,100,100,0], color='red', alpha=0.2, label='red')
plt.hist(images[:,100,100,1], color='green', alpha=0.2, label='green')
plt.hist(images[:,100,100,2], color='blue', alpha=0.2, label='blue')
plt.legend()
plt.grid()
plt.show()

```

The next figure shows how a pixel value (that can be considered a random variable) for a particular location in different noisy images follows approximately a Gaussian distribution.

Figure: distribution of a pixel value at location (100,100) in the noisy images.

```
ns = [25, 50, 100, 200]
# mean_psnrs and median_psnrs hold the PSNR values for the different n (computed beforehand)
plt.plot(ns, mean_psnrs, '.--', label='PSNR (mean)')
plt.plot(ns, median_psnrs, '.--', label='PSNR (median)')
plt.legend()
plt.xlabel('n'),  plt.ylabel('PSNR')
plt.show()

```
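
The plotting snippet above uses `mean_psnrs` and `median_psnrs` without showing how they are obtained. The following is a minimal sketch of one way to compute them, not the author's code. To stay self-contained it makes several assumptions: a synthetic gradient image stands in for the parrot image, the noise is generated with NumPy instead of `random_noise`, and `psnr` and `denoising_psnrs` are hypothetical helper names.

```python
import numpy as np

def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB (hypothetical helper; data_range is the peak pixel value)."""
    mse = np.mean((reference - test) ** 2)
    return 10 * np.log10(data_range ** 2 / mse)

def denoising_psnrs(im, ns, sigma=0.2, seed=0):
    """Return (mean_psnrs, median_psnrs), one PSNR value per n in ns."""
    rng = np.random.default_rng(seed)
    mean_psnrs, median_psnrs = [], []
    for n in ns:
        # n noisy copies: additive zero-mean Gaussian noise, clipped to [0, 1]
        noise = rng.normal(0.0, sigma, size=(n,) + im.shape)
        images = np.clip(im[np.newaxis] + noise, 0.0, 1.0)
        mean_psnrs.append(psnr(im, images.mean(axis=0)))
        median_psnrs.append(psnr(im, np.median(images, axis=0)))
    return mean_psnrs, median_psnrs

# synthetic stand-in for the original image: a 64x64 RGB gradient in [0.2, 0.8]
ramp = np.linspace(0.2, 0.8, 64)
im = np.stack([np.tile(ramp, (64, 1))] * 3, axis=-1)

ns = [25, 50, 100, 200]
mean_psnrs, median_psnrs = denoising_psnrs(im, ns)
for n, p_mean, p_median in zip(ns, mean_psnrs, median_psnrs):
    print(f"n={n:3d}  PSNR(mean)={p_mean:.2f} dB  PSNR(median)={p_median:.2f} dB")
```

The two returned lists can be passed directly to the `plt.plot` calls above.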

The following figure shows that the PSNR improves as n grows (by the strong / weak law of large numbers, the sample mean of the noise converges to its population mean 0). For the median, too, the improvement in image quality is larger for larger values of n.

## Tracking Pedestrians with HOG-SVM with OpenCV / scikit-image

2. Capture the video / extract frames from the video.
3. For each frame:
    1. Create a HOG scale pyramid of the frame image.
    2. At each scale, use a sliding window to extract the corresponding block from the frame and compute its HOG descriptor features.
    3. Use cv2's HOGDescriptor_getDefaultPeopleDetector(), a pre-trained SVM classifier on the HOG descriptor, to classify whether the corresponding block contains a pedestrian or not.
    4. Run non-max suppression to get rid of multiple detections of the same person.
    5. Use cv2's detectMultiScale() function to implement steps 3-4.

The code is adapted from the code here and here.

```
import cv2
import numpy as np
# assumption: non_max_suppression is the helper from the imutils package
from imutils.object_detection import non_max_suppression

# frame is assumed to be a BGR frame read from the video with cv2.VideoCapture()

# HOG descriptor using default people (pedestrian) detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

# run detection, using a spatial stride of 4 pixels,
# a scale stride of 1.02, and zero grouping of rectangles
# (to demonstrate that HOG will detect at potentially
# multiple places in the scale pyramid)
# (the keyword is finalThreshold in OpenCV 3.x; newer releases may name it groupThreshold)
(foundBoundingBoxes, weights) = hog.detectMultiScale(frame, winStride=(4, 4), padding=(8, 8), scale=1.02, finalThreshold=0, useMeanshiftGrouping=False)

# convert bounding boxes from format (x1, y1, w, h) to (x1, y1, x2, y2)
rects = np.array([[x, y, x + w, y + h] for (x, y, w, h) in foundBoundingBoxes])

# run non-max suppression on the boxes based on an overlap of 65%
nmsBoundingBoxes = non_max_suppression(rects, probs=None, overlapThresh=0.65)
```
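
The snippet above relies on a `non_max_suppression` helper that it does not define. The idea behind non-max suppression can be sketched as a greedy procedure: keep the highest-scoring box, drop every remaining box that overlaps it too much, and repeat. The function below is a simplified illustration and not the library's implementation. Its name `nms_sketch`, the intersection-over-union overlap measure, and the example boxes and scores are all assumptions made for this sketch.

```python
import numpy as np

def nms_sketch(boxes, scores=None, overlap_thresh=0.65):
    """Greedy non-max suppression on (x1, y1, x2, y2) boxes; returns the kept boxes."""
    boxes = np.asarray(boxes, dtype=float)
    if len(boxes) == 0:
        return boxes.astype(int)
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    # process boxes from highest to lowest score (fall back to box area when no scores are given)
    ranking = areas if scores is None else np.asarray(scores, dtype=float)
    order = np.argsort(ranking)[::-1]
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(best)
        # intersection of the best box with every remaining box
        iw = np.clip(np.minimum(x2[best], x2[rest]) - np.maximum(x1[best], x1[rest]), 0, None)
        ih = np.clip(np.minimum(y2[best], y2[rest]) - np.maximum(y1[best], y1[rest]), 0, None)
        inter = iw * ih
        iou = inter / (areas[best] + areas[rest] - inter)
        # drop the boxes that overlap the best box more than the threshold
        order = rest[iou <= overlap_thresh]
    return boxes[keep].astype(int)

# three overlapping detections of one person plus one separate detection (made-up values)
rects = np.array([[10, 10, 60, 110], [12, 12, 62, 112], [8, 9, 58, 108], [200, 20, 250, 120]])
scores = np.array([0.9, 0.8, 0.7, 0.95])
kept = nms_sketch(rects, scores, overlap_thresh=0.65)
print(kept)
```

Here the three near-identical boxes collapse into the one with the highest score, while the separate box is kept.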

cv2 functions are used to extract the HOG descriptor features and to detect pedestrians with the SVM, whereas scikit-image functions are used to visualize the HOG features. The animations below display the original video, what HOG sees, and the detected pedestrians after non-max suppression. Notice that there are a few false positive detections.
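
The HOG visualization is not shown in the code above. A minimal sketch of how it could be produced with scikit-image's `hog()` function follows. The synthetic input window and the parameter values (9 orientations, 8x8-pixel cells, 2x2-cell blocks) are assumptions chosen for illustration, not necessarily the settings used for the animations.

```python
import numpy as np
from skimage.feature import hog

# synthetic 128x64 grayscale "detection window" with a bright vertical bar
# (a stand-in for a crop of a video frame; any 2-D float image works)
window = np.zeros((128, 64))
window[16:112, 24:40] = 1.0

# features: the HOG descriptor vector
# hog_image: a rendering of the per-cell orientation histograms
features, hog_image = hog(window, orientations=9, pixels_per_cell=(8, 8),
                          cells_per_block=(2, 2), visualize=True)
print(features.shape, hog_image.shape)
```

Displaying `hog_image` with `plt.imshow(hog_image, cmap='gray')` for every frame would give a "what HOG sees" view similar to the one described above.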

Animations: original video, HOG-descriptor features video (what HOG sees), and original video with detected pedestrians.

1. Capture video with the webcam with cv2.VideoCapture().
2. For each frame, use the pre-trained AdaBoost cascade classifiers based on Haar-like features with cv2.CascadeClassifier(): the haarcascade_frontalface_default classifier for face detection and the haarcascade_eye_tree_eyeglasses classifier for better detection of eyes with glasses, both loaded from the corresponding xml files that come with cv2’s installation.
3. First detect the face(s) with the detectMultiScale() function and draw a bounding box. Then detect the eyes inside each detected face with the same function.
4. The following Python code snippet shows how to detect faces and eyes with cv2. The code is adapted from here.

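```
import cv2

# load the pre-trained Haar cascades from the xml files that come with cv2
# (cv2.data.haarcascades is the folder used by the opencv-python package; adjust the path if needed)
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_eye_tree_eyeglasses.xml')

# frame is assumed to be a BGR frame read from cv2.VideoCapture()
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
# return bounding box of the face(s) if one is detected
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.03)
for (x,y,w,h) in faces:
    frame = cv2.rectangle(frame,(x,y),(x+w,y+h),(255,0,0),2)
    roi_gray = gray[y:y+h, x:x+w]
    roi_color = frame[y:y+h, x:x+w]
    # detect the eyes only inside the detected face region
    eyes = eye_cascade.detectMultiScale(roi_gray)
    for (ex,ey,ew,eh) in eyes:
        cv2.rectangle(roi_color,(ex,ey),(ex+ew,ey+eh),(0,255,0),2)
```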

The next animation shows the results of face detection when a scaleFactor of 1.03 was used to create the scale pyramid. As can be seen, the eyes with the glasses on and some small faces from the photos are not detected at this scale.

The next animation shows the results of face detection when a scaleFactor of 1.3 was used to create the scale pyramid. As can be seen, the eyes with / without the glasses on, as well as most of the small faces from the photos, are detected at this scale most of the time.
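
To get a feel for what the scale factor controls, the sketch below counts how many window sizes a scale pyramid contains for the two factors. It is a simplified model and not OpenCV's implementation, which also honours `minSize` / `maxSize`. The helper name `pyramid_window_sizes`, the 24x24 base window and the 480x640 frame size are assumptions made for illustration.

```python
def pyramid_window_sizes(frame_shape, scale_factor, base_window=24):
    """Window sizes (in pixels) visited when the window grows by scale_factor
    until it no longer fits in the frame. Simplified model, not OpenCV's code."""
    height, width = frame_shape
    sizes, scale = [], 1.0
    while base_window * scale <= min(height, width):
        sizes.append(int(round(base_window * scale)))
        scale *= scale_factor
    return sizes

for factor in (1.03, 1.3):
    sizes = pyramid_window_sizes((480, 640), factor)
    print(f"scaleFactor={factor}: {len(sizes)} levels, first sizes {sizes[:5]}")
```

A factor close to 1 evaluates many more window sizes per frame, which costs more time and changes which candidate detections are produced, so the factor is worth tuning for the video at hand.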

## Object Tracking with OpenCV trackers

## Object Saliency Detection with OpenCV

## Linear / QR Barcode Generation and Detection with OpenCV

## Image Segmentation with Random Walk with scikit-image

## Image Segmentation with Grab-Cut with OpenCV

## Segmentation with SLIC + RAG in scikit-image

## Homography with scikit-image / OpenCV

## Face Morphing (Beier-Neely morphing) with Pystasm

## Object Detection with YOLO DarkNet / Keras / OpenCV (Deep Learning model)

## Semantic Segmentation with ENet / DeepLab (Deep Learning model)

Videos: input video and the segmented output video.

## Text Detection with EAST (Deep Learning Model)

## Image Colorization with Deep Learning (OpenCV / Caffe)

## OCR + Text Recognition with EAST + Tesseract