Detection of a Human Object with HOG Descriptor Features using SVM (Primal QuadProg implementation using CVXOPT) in Python

In this article, first how to extract the HOG descriptor from an image will be discuss. Then how a support vector machine binary classifier can be trained on a dataset containing labeled images (using the extracted HOG descriptor features) and later how the SVM model can be used (along with a sliding window) to predict whether or not a human object exists in a test image will be described.  How SVM can be represented as a Primal Quadratic Programming problem and can be solved with CVXOPT that will also be discussed. This problem appeared as an assignment problem in this Computer Vision course from UCF.

Problem 1: Compute HOG features

Let’s first Implement Histogram of Orientated Gradients (HOG). The dataset to be used is the INRIA Person Dataset from here. The dataset consists of positive and negative examples for training as well as testing images. Let us do the following:

i. Take 2003 positive training images of size 96×160
ii. Take 997 negative training images of size 96×160
iii. Compute HOG for positive and negative examples.
iv. Show the visualization of HOG for some positive and negative examples.

The Histograms of Oriented Gradients for Human Detection (HOG) is a very heavily cited paper by N. Dalal and B. Triggs from CVPR 2005. The following figure shows the  algorithm proposed by them can be used to compute the HOG features for a 96×160 image:


The next python code snippet shows some helper functions to compute the hog features:

import numpy as np
from scipy import signal
import scipy.misc

def s_x(img):
    kernel = np.array([[-1, 0, 1]])
    imgx = signal.convolve2d(img, kernel, boundary='symm', mode='same')
    return imgx
def s_y(img):
    kernel = np.array([[-1, 0, 1]]).T
    imgy = signal.convolve2d(img, kernel, boundary='symm', mode='same')
    return imgy

def grad(img):
    imgx = s_x(img)
    imgy = s_y(img)
    s = np.sqrt(imgx**2 + imgy**2)
    theta = np.arctan2(imgx, imgy) #imgy, imgx)
    theta[theta<0] = np.pi + theta[theta<0]
    return (s, theta)


The following figures animations show some positive and negative training examples along with the HOG features computed using the algorithm.


Positive Example 1

The next animation shows how the HOG features are computed using the above algorithm.


Positive Example 2


The next animation shows how the HOG features are computed using the above algorithm.



Positive Example 3


The next animation shows how the HOG features are computed using the above algorithm.

Negative Example 1


The next animation shows how the HOG features are computed using the above algorithm.


Problem 2: Use sklearn’s SVC and 80-20 validation to compute accuracy on the held-out training images dataset using the extracted HOG features.

Before implementing SVC on our own with primal quadratic programming solver, let’s use the scikit-learn SVC implementation (with linear kernel) to train a support vector classifier on the training positive and negative examples using the HOG features extracted  from the training images with 80-20 validation and compute accuracy of classification on the held-out images.

The following python code does exactly that, with the X matrix containing the 1620 HOG features extracted from each image and the corresponding label (pos/neg, depending on whether human is present or not), with 98.5% accuracy on the held-out dataset.

import time
from sklearn.metrics import accuracy_score
from sklearn.cross_validation import train_test_split
from sklearn.svm import SVC
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, train_size=0.8, random_state=123)
timestamp1 = time.time()
clf = SVC(C=1, kernel='linear'), ytrain)
print("%d support vectors out of %d points" % (len(clf.support_vectors_), len(Xtrain)))
timestamp2 = time.time()
print "sklearn LinearSVC took %.2f seconds" % (timestamp2 - timestamp1)
ypred = clf.predict(Xtest)
print('accuracy', accuracy_score(ytest, ypred))

430 support vectors out of 2400 points
sklearn LinearSVC took 3.40 seconds
accuracy 0.985

The next figures show the confusion matrices for the prediction on the held-out dataset with the SVC model learnt.




Problem 3: Implement SVM by solving the Primal form of the problem using Quadratic Programming

Let’s implement Support Vector Machine (SVM) using Quadratic Programming. We shall use python’s CVXOPT package for this purpose. Let’s do the following:

i. Try to understand each input term in cvxopt.solvers.qp.
ii. Formulate soft- margin primal SVM in term of inputs of cvxopt.solvers.qp
iii. Show ‘P’, ‘Q’, ‘G”, ‘h’, ‘A’ and ‘b’ Matrices.
iv. Obtain parameter vector ‘w’ and bias term ‘b’ using cvxopt.solvers.qp


To be done


Problem 4: Detect Human in testing images using trained model (‘w’, ‘b’) from the last problem

Let’s use the coefficients learnt by the SVM model from the training dataset and do the following:

i. Take at least 5 testing images from Test/pos.
ii. Test the trained model over testing images. Testing can be performed using
w*feature vector + b.
iii. Use sliding window approach to obtain detection at each location in the image.
iv. Perform non-maximal suppression and choose the highest scored location.
v. Display the bounding box at the final detection.


To be done