Solving Some Image Processing Problems with Python libraries

In this article, a few popular image processing problems are discussed along with their solutions, using Python image processing libraries. Some of the problems are taken from the exercises of this book (available on Amazon).

 

Image Transformations and Warping

 

0. Display RGB image color channels in 3D

  1. A gray-scale image can be thought of as a 2-D function f(x,y) of the pixel locations (x,y), mapping each pixel to its corresponding gray level (e.g., an integer in [0,255]).
  2. For an RGB image there are 3 such functions: f_R(x,y), f_G(x,y) and f_B(x,y).
  3. matplotlib’s 3-D plot functions can be used to plot each function.

The following python code shows how to plot the RGB channels separately in 3D:

import numpy as np
import matplotlib.pylab as plt
from mpl_toolkits.mplot3d import Axes3D
from skimage.io import imread

def plot_3d(X, Y, Z, title, cmap):
    # plot the channel pixel values as a surface in 3D
    fig = plt.figure(figsize=(10, 10))
    ax = fig.add_subplot(111, projection='3d')
    X, Y = np.meshgrid(X, Y)
    ax.plot_surface(X, Y, Z, cmap=cmap)
    ax.set_title(title, size=20)
    plt.show()

im = imread('../images/parrot.jpg')
Y = np.arange(im.shape[0])
X = np.arange(im.shape[1])
Z1 = im[...,0]
Z2 = im[...,1]
Z3 = im[...,2]
plot_3d(X, Y, Z1, cmap='Reds', title='3D plot for the Red Channel')
plot_3d(X, Y, Z2, cmap='Greens', title='3D plot for the Green Channel')
plot_3d(X, Y, Z3, cmap='Blues', title='3D plot for the Blue Channel')

The RGB image
parrot.jpg

parrot_redparrot_greenparrot_blue

1. Wave Transform

  1. Use scikit-image’s warp() function to implement the wave transform.
  2. Note that the wave transform can be expressed with the following equations:
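The transform, as implemented below (warp() expects the inverse map, taking each output pixel coordinate (x, y) to the input coordinate it samples from, with amplitude 20 and period 64 as used in the code), is:

```latex
x' = x, \qquad y' = y + 20 \sin\left(\frac{2 \pi x}{64}\right)
```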

We shall use the mandrill image to demonstrate the wave transform. The next python code fragment shows how to do it:


import numpy as np
import matplotlib.pylab as plt
from skimage.io import imread
from skimage.transform import warp

def wave(xy):
    xy[:, 1] += 20*np.sin(2*np.pi*xy[:, 0]/64)
    return xy

im = imread('images/mandrill.jpg')
im = warp(im, wave)
plt.imshow(im)
plt.show()

The next figure shows the original mandrill input image and the output image obtained after applying the wave transform.

mandrill      mandrill_w.png

mw.gif

2. Swirl Transform

  1. Use scikit-image’s warp() function to implement the swirl transform.
  2. Note that the swirl transform can be expressed with the following equations:
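As reconstructed from the implementation below, with (x_0, y_0) the center of the swirl and R its radius, each pixel is rotated about the center by an angle proportional to its distance r from the center:

```latex
r = \sqrt{(x - x_0)^2 + (y - y_0)^2}, \qquad a = \frac{\pi r}{R}, \\
x' = x_0 + (x - x_0)\cos a + (y - y_0)\sin a, \qquad
y' = y_0 - (x - x_0)\sin a + (y - y_0)\cos a
```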


We shall use the mandrill image to demonstrate the swirl transform. The next python code fragment shows how to do it:


import numpy as np
import matplotlib.pylab as plt
from skimage.io import imread
from skimage.transform import warp

def swirl(xy, x0, y0, R):
    r = np.sqrt((xy[:,1]-x0)**2 + (xy[:,0]-y0)**2)
    a = np.pi * r / R
    # compute both new coordinates from the original values before overwriting
    y1 = (xy[:, 1]-x0)*np.cos(a) + (xy[:, 0]-y0)*np.sin(a) + x0
    x1 = -(xy[:, 1]-x0)*np.sin(a) + (xy[:, 0]-y0)*np.cos(a) + y0
    xy[:, 1], xy[:, 0] = y1, x1
    return xy

im = imread('../images/mandrill.jpg')
im = warp(im, swirl, map_args={'x0':112, 'y0':112, 'R':512})
plt.imshow(im)
plt.axis('off')
plt.show()

The next figure shows the original mandrill input image and the output image obtained after applying the swirl transform.

mandrill   ms.png

ms.gif

Compare this with the output of the scikit-image swirl() function.
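A minimal call to the built-in version is sketched below; scikit-image's bundled astronaut sample image is used here (since the mandrill file is local to the examples above), and the strength value is an arbitrary choice:

```python
import matplotlib.pylab as plt
from skimage import data
from skimage.transform import swirl as sk_swirl

im = data.astronaut()  # bundled sample image
im_s = sk_swirl(im, center=(112, 112), strength=5, radius=512)
plt.imshow(im_s)
plt.axis('off')
plt.show()
```

Note that swirl() returns a float image with values scaled to [0,1], with the same spatial shape as the input.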

3. Very simple Face morphing with α-blending

  1. Start from one face image (e.g., let image1 be the face of Messi) and end into another image (let image2 be the face of Ronaldo) iteratively, creating some intermediate images in between.
  2. At each iteration create an image as a linear combination of the two image numpy ndarrays: (1 - α) * image1 + α * image2.
  3. Iteratively increase α from 0 to 1.

The following code block shows how to implement it using matplotlib’s image and pylab modules.

import numpy as np
import matplotlib.pylab as plt
import matplotlib.image as mpimg

im1 = mpimg.imread("../images/messi.jpg") / 255 # scale RGB values in [0,1]
im2 = mpimg.imread("../images/ronaldo.jpg") / 255
i = 1
plt.figure(figsize=(18,15))
for alpha in np.linspace(0,1,20):
    plt.subplot(4,5,i)
    plt.imshow((1-alpha)*im1 + alpha*im2)
    plt.axis('off')
    i += 1
plt.subplots_adjust(wspace=0.05, hspace=0.05)
plt.show()

The next animation shows the simple face morphing:
fm.gif
There are more sophisticated techniques to improve the quality of morphing, but this is the simplest one.

4. Creating Instagram-like Gotham Filter

The Gotham filter

The Gotham filter is computed by applying the following operations on an image (the steps are taken from here). The corresponding python code and the intermediate input/output images are shown along with each operation, starting with the following input image:

city2.jpg

  1. A mid-tone red contrast boost
    from PIL import Image
    import numpy as np
    import matplotlib.pylab as plt
    im = Image.open('../images/city.jpg') # pixel values in [0,255]
    r, g, b = im.split()
    red_levels = [0., 12.75, 25.5, 51., 76.5, 127.5, 178.5, 204., 229.5, 242.25, 255.]
    r1 = Image.fromarray((np.reshape(np.interp(np.array(r).ravel(), np.linspace(0,255,len(red_levels)), red_levels), (im.height, im.width))).astype(np.uint8), mode='L')
    plt.figure(figsize=(20,15))
    plt.subplot(221)
    plt.imshow(im)
    plt.title('original', size=20)
    plt.axis('off')
    plt.subplot(222)
    im1 = Image.merge('RGB', (r1, g, b))
    plt.imshow(im1)
    plt.axis('off')
    plt.title('with red channel interpolation', size=20)
    plt.subplot(223)
    plt.hist(np.array(r).ravel(), density=True)
    plt.subplot(224)
    plt.hist(np.array(r1).ravel(), density=True)
    plt.show()
    

    gotham1.png

  2. Make the blacks a little bluer
    plt.figure(figsize=(20,10))
    plt.subplot(121)
    plt.imshow(im1)
    plt.title('last image', size=20)
    plt.axis('off')
    b1 = Image.fromarray(np.clip(np.array(b) + 7.65, 0, 255).astype(np.uint8))
    im1 = Image.merge('RGB', (r1, g, b1))
    plt.subplot(122)
    plt.imshow(im1)
    plt.axis('off')
    plt.title('with transformation', size=20)
    plt.tight_layout()
    plt.show()
    

    gotham2.png

  3. A small sharpening
    from PIL.ImageEnhance import Sharpness
    plt.figure(figsize=(20,10))
    plt.subplot(121)
    plt.imshow(im1)
    plt.title('last image', size=20)
    plt.axis('off')
    im2 = Sharpness(im1).enhance(3.0)
    plt.subplot(122)
    plt.imshow(im2)
    plt.axis('off')
    plt.title('with transformation', size=20)
    plt.tight_layout()
    plt.show()
    

    gotham3.png

  4. A boost in blue channel for lower mid-tones
  5. A decrease in blue channel for upper mid-tones
    blue_levels = [0., 11.985, 30.09, 64.005, 81.09, 99.96, 107.1, 111.945, 121.125, 143.055, 147.9, 159.885, 171.105, 186.915, 215.985, 235.875, 255.]
    b2 = Image.fromarray((np.reshape(np.interp(np.array(b1).ravel(), np.linspace(0,255,len(blue_levels)), blue_levels), (im.height, im.width))).astype(np.uint8), mode='L')
    plt.figure(figsize=(20,15))
    plt.subplot(221)
    plt.imshow(im2)
    plt.title('last image', size=20)
    plt.axis('off')
    plt.subplot(222)
    im3 = Image.merge('RGB', (r1, g, b2))
    plt.imshow(im3)
    plt.axis('off')
    plt.title('with blue channel interpolation', size=20)
    plt.subplot(223)
    plt.hist(np.array(b1).ravel(), density=True)
    plt.subplot(224)
    plt.hist(np.array(b2).ravel(), density=True)
    plt.show()
    

gotham4.png

The output image obtained after applying the Gotham filter is shown below:

gotham_out

Down-sampling with anti-aliasing using Gaussian Filter

  1. Start with a large gray-scale image and reduce the image size 16 times, by reducing both height and width by 4 times.
  2. Select every 4th pixel in the x and the y direction from the original image to compute the values of the pixels in the smaller image.
  3. Before down-sampling apply a Gaussian filter (to smooth the image) for anti-aliasing.
  4. Compare the quality of the output image obtained by down-sampling without a Gaussian filter (with aliasing).

The next code block performs the above steps. Since the Gaussian blur is a low-pass filter, it removes the high frequencies from the original input image and thereby lowers the image's maximum frequency. By the sampling theorem, the reduced sampling rate then still stays above the Nyquist rate of the blurred image, so aliasing is avoided.


import numpy as np
import matplotlib.pylab as plt
from scipy.ndimage import gaussian_filter
from skimage.io import imread
from skimage.color import rgb2gray

im = rgb2gray(imread('images/umbc.png'))
print(im.shape)
plt.figure(figsize=(20,20))
plt.imshow(im)
plt.show()
plt.figure(figsize=(20,20))
im_blurred = gaussian_filter(im, sigma=2.5)
plt.imshow(im_blurred)
plt.show()
n = 4 # create an image 16 times smaller in size
w, h = im.shape[0] // n, im.shape[1] // n
im_small = np.zeros((w,h))
for i in range(w):
   for j in range(h):
      im_small[i,j] = im[n*i, n*j] # sample every 4th pixel (with aliasing)
plt.figure(figsize=(20,20))
plt.imshow(im_small)
plt.show()
im_small = np.zeros((w,h))
for i in range(w):
   for j in range(h):
      im_small[i,j] = im_blurred[n*i, n*j] # sample from the blurred image (anti-aliased)
plt.figure(figsize=(20,20))
plt.imshow(im_small)
plt.show()

Original Image
orig_umbc.png

    Image blurred with Gaussian Filter LPF blur_umbc.png

Down-sampled Image from the original image (with aliasing)

alias_umbc.png

Down-sampled Image from the blurred image (with anti-aliasing)
anti_alias_umbc
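The same anti-aliasing argument can be checked in 1-D; a small sketch (the signal frequencies and rates below are arbitrary choices): a 200 Hz sine sampled at 512 Hz aliases to 56 Hz once decimated to 128 Hz, unless the frequencies above the new Nyquist limit (64 Hz) are removed first.

```python
import numpy as np

fs, n = 512, 512                  # 1 second of signal sampled at 512 Hz
t = np.arange(n) / fs
x = np.sin(2*np.pi*10*t) + np.sin(2*np.pi*200*t)  # 10 Hz + 200 Hz components

# naive decimation to fs/4 = 128 Hz: the Nyquist limit is now 64 Hz
x_alias = x[::4]
spec = np.abs(np.fft.rfft(x_alias))
print(np.where(spec > 1)[0])      # peaks at 10 Hz and at the 56 Hz alias of 200 Hz

# remove everything above 64 Hz first (ideal LPF in the frequency domain)
X = np.fft.rfft(x)
X[65:] = 0
x_lp = np.fft.irfft(X)
spec_aa = np.abs(np.fft.rfft(x_lp[::4]))
print(np.where(spec_aa > 1)[0])   # only the 10 Hz peak survives
```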

Some Applications of DFT

 

0. Fourier Transform of a Gaussian Kernel is another Gaussian Kernel

Also, the spread in the frequency domain is inversely proportional to the spread in the spatial domain (known as Heisenberg’s inequality). Here is the proof:

f9.png

The following animation shows an example visualizing the Gaussian contours in spatial and corresponding frequency domains:

gaussian.gif
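This can also be verified numerically; a small 1-D sketch (the σ values are arbitrary): the magnitude spectrum of a Gaussian is again a Gaussian, and the product of the spatial and frequency spreads stays constant.

```python
import numpy as np

n = 1024
x = np.arange(n) - n // 2

for sigma in [5., 10., 20.]:
    g = np.exp(-x**2 / (2 * sigma**2))                   # spatial-domain Gaussian
    G = np.abs(np.fft.fftshift(np.fft.fft(np.fft.ifftshift(g))))
    # measure the spread of |G| as its standard deviation around the center
    freq_sigma = np.sqrt(np.sum(x**2 * G) / np.sum(G))
    print(sigma, freq_sigma, sigma * freq_sigma)         # product is ~ n / (2*pi)
```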

 

1. Using DFT to up-sample an image

  1. Let’s use the lena gray-scale image.
  2. First double the size of the by padding zero rows/columns at every alternate positions.
  3. Use FFT followed by an LPF.
  4. Finally use IFFT to get the output image.

 

The following code block shows the python code for implementing the steps listed above:


import numpy as np
import numpy.fft as fp
import matplotlib.pyplot as plt
from skimage.io import imread

im = np.mean(imread('images/lena.jpg'), axis=2)
im1 = np.zeros((2*im.shape[0], 2*im.shape[1]))
print(im.shape, im1.shape)
for i in range(im.shape[0]):
    for j in range(im.shape[1]):
        im1[2*i,2*j] = im[i,j]

def padwithzeros(vector, pad_width, iaxis, kwargs):
    vector[:pad_width[0]] = 0
    vector[-pad_width[1]:] = 0
    return vector

# the LPF kernel
kernel = [[0.25, 0.5, 0.25], [0.5, 1, 0.5], [0.25, 0.5, 0.25]]
# enlarge the kernel to the shape of the image
kernel = np.pad(kernel, (((im1.shape[0]-3)//2,(im1.shape[0]-3)//2+1), ((im1.shape[1]-3)//2,(im1.shape[1]-3)//2+1)), padwithzeros)

plt.figure(figsize=(15,10))
plt.gray() # show the filtered result in grayscale

freq = fp.fft2(im1)
freq_kernel = fp.fft2(fp.ifftshift(kernel))
freq_LPF = freq*freq_kernel # by the Convolution theorem
im2 = fp.ifft2(freq_LPF)
freq_im2 = fp.fft2(im2)

plt.subplot(2,3,1)
plt.imshow(im)
plt.title('Original Image', size=20)
plt.subplot(2,3,2)
plt.imshow(im1)
plt.title('Padded Image', size=20)
plt.subplot(2,3,3)
plt.imshow( (20*np.log10( 0.1 + np.abs(fp.fftshift(freq)))).astype(int), cmap='jet')
plt.title('Original Image Spectrum', size=20)
plt.subplot(2,3,4)
plt.imshow( (20*np.log10( 0.1 + np.abs(fp.fftshift(freq_kernel)))).astype(int), cmap='jet')
plt.title('Image Spectrum of the LPF', size=20)
plt.subplot(2,3,5)
plt.imshow( (20*np.log10( 0.1 + np.abs(fp.fftshift(freq_im2)))).astype(int), cmap='jet')
plt.title('Image Spectrum after LPF', size=20)
plt.subplot(2,3,6)
plt.imshow(im2.real.astype(np.uint8)) # discard the tiny imaginary part (a numerical artifact)
plt.title('Output Image', size=20)
plt.show()

The next figure shows the output. As can be seen, the LPF removed the high frequency components from the Fourier spectrum of the padded image, and with a subsequent inverse Fourier transform we get a decent enlarged image.

lena_LPF_fft.png

 

2. Frequency Domain Gaussian Filter

  1. Read an input image and use DFT to obtain its 2D frequency array.
  2. Create a small Gaussian 2D kernel (to be used as an LPF) in the spatial domain and pad it to enlarge it to the image dimensions.
  3. Use DFT to obtain the Gaussian kernel in the frequency domain.
  4. Use the Convolution theorem to convolve the LPF with the input image in the frequency domain.
  5. Use IDFT to obtain the output image.
  6. Plot the frequency spectrum of the image, the Gaussian kernel and the image obtained after convolution in the frequency domain, in 3D.

The following code block shows the python code:


import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from skimage.color import rgb2gray
from skimage.io import imread
from scipy import signal
import scipy.fftpack as fp

im = rgb2gray(imread('images/temple.jpg'))
kernel = np.outer(signal.gaussian(im.shape[0], 10), signal.gaussian(im.shape[1], 10))
freq = fp.fft2(im)
assert(freq.shape == kernel.shape)
freq_kernel = fp.fft2(fp.ifftshift(kernel))
convolved = freq*freq_kernel # by the Convolution theorem
im_blur = fp.ifft2(convolved).real
im_blur = 255 * im_blur / np.max(im_blur)

# center the frequency response
plt.imshow( (20*np.log10( 0.01 + np.abs(fp.fftshift(freq_kernel)))).astype(int), cmap='coolwarm')
plt.colorbar()
plt.show()

plt.figure(figsize=(20,20))
plt.imshow(im, cmap='gray')
plt.show()

from mpl_toolkits.mplot3d import Axes3D
from matplotlib.ticker import LinearLocator, FormatStrFormatter
# ... code for 3D visualization of the spectrums

The original color temple image (time / spatial domain)

temple.JPG

The temple image (frequency domain)

temple_freq

The Gaussian Kernel LPF in 2D (frequency domain)

gaussian_freq2d

The Gaussian Kernel LPF (frequency domain)
gaussian_freq

The smoothed temple image with the LPF (frequency domain)

temple_freq_LPF

If we set the standard deviation of the LPF Gaussian kernel to 10, we get the output shown in the next figures. As can be seen, the frequency response value drops much more quickly away from the center.

The smoothed temple image with the LPF with higher s.d. (frequency domain)

temple_freq_LPF2

The output image after convolution (spatial / time domain)

temple_blurred

3. Using the inverse filter to restore a motion-blurred image

  1. First create a motion blur kernel of a given shape.
  2. Convolve the kernel with an input image in the frequency domain.
  3. Get the motion-blurred image in the spatial domain with IDFT.
  4. Compute the inverse filter kernel and convolve with the blurred image in the frequency domain.
  5. Get the convolved image back in the spatial domain.
  6. Plot all the images and kernels in the frequency domain.

The following code block shows the python code:


im = rgb2gray(imread('../images/madurga.jpg'))
# create the motion blur kernel
size = 21
kernel = np.zeros((size, size))
kernel[int((size-1)/2), :] = np.ones(size)
kernel = kernel / size
# pad the kernel to the image shape (padwithzeros as defined earlier)
kernel = np.pad(kernel, (((im.shape[0]-size)//2,(im.shape[0]-size)//2+1), ((im.shape[1]-size)//2,(im.shape[1]-size)//2+1)), padwithzeros)

# blur the image in the frequency domain
freq = fp.fft2(im)
freq_kernel = fp.fft2(fp.ifftshift(kernel))
convolved = freq*freq_kernel
im_blur = fp.ifft2(convolved).real
im_blur = im_blur / np.max(im_blur)

epsilon = 10**-6

# the inverse filter kernel, regularized with epsilon to avoid division by zero
freq_blur = fp.fft2(im_blur)
freq_inv_kernel = 1 / (epsilon + freq_kernel)

convolved = freq_blur*freq_inv_kernel
im_restored = fp.ifft2(convolved).real
im_restored = im_restored / np.max(im_restored)

plt.figure(figsize=(18,12))
plt.subplot(221)
plt.imshow(im)
plt.title('Original image', size=20)
plt.axis('off')
plt.subplot(222)
plt.imshow(im_blur)
plt.title('Blurred image with motion blur kernel', size=20)
plt.axis('off')
plt.subplot(223)
plt.imshow(im_restored)
plt.title('Restored image with inverse filter', size=20)
plt.axis('off')
plt.subplot(224)
plt.imshow(im_restored - im)
plt.title('Diff restored & original image', size=20)
plt.axis('off')
plt.show()

# Plot the surface of the frequency responses here

madurga_inverse

Frequency response of the input image

madurga_freq.png

(log) Frequency response of the motion blur kernel (LPF)

motion_blur_freq.png

Input image convolved with the motion blur kernel (frequency domain)

madurga_convolved1

(log) Frequency response of the inverse frequency filter kernel (HPF)

motion_blur_freq

Motion-blurred image convolved with the inverse frequency filter kernel (frequency domain)

madurga_convolved.png

 

4. Impact of noise on the inverse filter

  1. Add some random noise to the Lena image.
  2. Blur the image with a Gaussian kernel.
  3. Restore the image using inverse filter.

With the original image

Let’s first blur and apply the inverse filter on the noiseless blurred image. The following figures show the outputs:

lena_inverse

(log) Frequency response of the input image
lena_freq.png

(log) Frequency response of the Gaussian blur kernel (LPF)
lena_gaussian_freq

(log) Frequency response of the blurred image

lena_blurred_spectrum.png

(log) Frequency response of the inverse kernel (HPF)

gauss_inverse_kernel.png

Frequency response of the output image
lena_inverse_spectrum

Adding noise to the original image

The following python code can be used to add Gaussian noise to an image:


from skimage.util import random_noise
im = random_noise(im, var=0.1)


The next figures show the noisy lena image, the blurred image with a Gaussian Kernel and the restored image with the inverse filter. As can be seen, being a high-pass filter, the inverse filter enhances the noise, typically corresponding to high frequencies.

 

lena_noisy_inverse
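The amplification effect can be reproduced on a small synthetic image; a minimal sketch (the square test image, blur σ and noise level are arbitrary choices): the inverse filter recovers the noiseless blurred image almost perfectly, but fails badly once even a little noise is added.

```python
import numpy as np
import numpy.fft as fp

rng = np.random.RandomState(0)
im = np.zeros((64, 64))
im[16:48, 16:48] = 1                        # a synthetic white square

# Gaussian blur kernel (sigma = 3), applied via the Convolution theorem
x = np.arange(64) - 32
g = np.exp(-(x[:, None]**2 + x[None, :]**2) / (2 * 3.0**2))
g /= g.sum()
freq_kernel = fp.fft2(fp.ifftshift(g))
blurred = fp.ifft2(fp.fft2(im) * freq_kernel).real
noisy = blurred + 0.01 * rng.randn(64, 64)  # small additive Gaussian noise

# inverse filter, regularized with epsilon as before
epsilon = 10**-6
freq_inv_kernel = 1 / (epsilon + freq_kernel)
restored = fp.ifft2(fp.fft2(blurred) * freq_inv_kernel).real
restored_noisy = fp.ifft2(fp.fft2(noisy) * freq_inv_kernel).real

print(np.abs(restored - im).mean())         # small reconstruction error
print(np.abs(restored_noisy - im).mean())   # orders of magnitude larger
```

At high frequencies the Gaussian kernel's response is essentially zero, so the inverse filter multiplies those bins by roughly 1/epsilon, and the flat spectrum of the noise gets blown up.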

5. Use a notch filter to remove periodic noise from the following half-toned car image.

halftone.png

  1. Use DFT to obtain the frequency spectrum of the image.
  2. Block the high frequency components that are most likely responsible for the periodic noise.
  3. Use IDFT to come back to the spatial domain.

 


import numpy as np
import matplotlib.pylab as plt
from scipy import fftpack
from skimage.io import imread

im = imread('images/halftone.png')
w, h = im.shape
F1 = fftpack.fft2((im).astype(float))
F2 = fftpack.fftshift(F1)
# block the periodic-noise peaks (locations found by inspecting the spectrum)
for i in range(60, w, 135):
   for j in range(100, h, 200):
     if not (i == 330 and j == 500):
        F2[i-10:i+10, j-10:j+10] = 0
for i in range(0, w, 135):
   for j in range(200, h, 200):
     if not (i == 330 and j == 500):
        F2[max(0,i-15):min(w,i+15), max(0,j-15):min(h,j+15)] = 0
plt.figure(figsize=(6.66,10))
plt.imshow( (20*np.log10( 0.1 + np.abs(F2))).astype(int), cmap=plt.cm.gray)
plt.show()
im1 = fftpack.ifft2(fftpack.ifftshift(F2)).real
plt.figure(figsize=(10,10))
plt.imshow(im1, cmap='gray')
plt.axis('off')
plt.show()


Frequency response of the input image
car_freq

Frequency response of the input image with blocked frequencies with notch

car_notch_blocked

Output image

car_notch

With a low-pass-filter (LPF):

Frequency response of the input image with blocked frequencies with LPF
car_freq_notch

Output image
car_notch

Histogram Matching with color images

As described here, here is the algorithm:

  1. The cumulative histogram is computed for each image dataset, see the figure below.
  2. Any particular value (x_i) in the input image data to be adjusted has a cumulative histogram value G(x_i).
  3. This in turn is matched with the cumulative distribution value in the reference (template) image dataset, namely H(x_j). The input data value x_i is then replaced by x_j.

 

import numpy as np
import matplotlib.pylab as plt
from skimage.io import imread

def cdf(im):
    # compute the normalized cumulative histogram of a single channel
    c = np.cumsum(np.histogram(im, bins=256, range=(0,255))[0])
    return c / c[-1]

def hist_matching(c, c_t, im):
    # for every input gray level, pick the template level with the closest CDF value
    pixels = np.interp(c, c_t, np.arange(256))
    return pixels[im].astype(np.uint8)

im = imread('images/lena.jpg')
im_t = imread('images/vstyle.png')
im1 = np.zeros(im.shape).astype(np.uint8)
plt.figure(figsize=(20,10))
for i in range(3):
    c = cdf(im[...,i])
    c_t = cdf(im_t[...,i])
    im1[...,i] = hist_matching(c, c_t, im[...,i])
    c1 = cdf(im1[...,i])
    col = 'r' if i == 0 else ('g' if i == 1 else 'b')
    plt.plot(np.arange(256), c, col + ':', label='input ' + col.upper(), linewidth=5)
    plt.plot(np.arange(256), c_t, col + '--', label='template ' + col.upper(), linewidth=5)
    plt.plot(np.arange(256), c1, col + '-', label='output ' + col.upper(), linewidth=2)
plt.title('CDF', size=20)
plt.legend(prop={'size': 15})
plt.show()

plt.figure(figsize=(10,10))
plt.imshow(im1[...,:3])
plt.axis('off')
plt.show()

Input image                                                 
lena

Template Image
vstyle

Output Image
lena_starry

The following figure shows how the histogram of the input image is matched with the histogram of the template image.

lena_starry_cdf

Another example:

Input image                                                 

me1

Template Image

treebirds

Output Image

me_t

me_hm

Mathematical Morphology

1. Automatically cropping an image

  1. Let’s use the following image. The image has an unnecessary white background outside the molecule of the organic compound. L_2d
  2. First convert the image to a binary image and compute the convex hull of the molecule object.
  3. Use the convex hull image to find the bounding box for cropping.
  4. Crop the original image with the bounding box.

The next python code shows how to implement the above steps:


import numpy as np
import matplotlib.pylab as plt
from PIL import Image
from skimage.io import imread
from skimage.color import rgb2gray
from skimage.morphology import convex_hull_image

im = imread('../images/L_2d.jpg')
plt.imshow(im)
plt.title('input image')
plt.show()
im1 = 1 - rgb2gray(im)
threshold = 0.5
im1[im1 <= threshold] = 0 # binarize the image with the threshold
im1[im1 > threshold] = 1
chull = convex_hull_image(im1)
plt.imshow(chull)
plt.title('convex hull in the binary image')
plt.show()
imageBox = Image.fromarray((chull*255).astype(np.uint8)).getbbox()
cropped = Image.fromarray(im).crop(imageBox)
cropped.save('L_2d_cropped.jpg')
plt.imshow(cropped)
plt.title('cropped image')
plt.show()

chem1chem2chem3

This can also be found here.

2. Opening and Closing are Dual operations in mathematical morphology

  1. Start with a binary image and apply opening operation with some structuring element (e.g., a disk) on it to obtain an output image.
  2. Invert the image (to change the foreground to background and vice versa) and apply closing operation on it with the same structuring element to obtain another output image.
  3. Invert the second output image obtained and observe that it’s same as the first output image.
  4. Thus applying opening operation to the foreground of a binary image is equivalent to applying closing operation to the background of the same image with the same structuring element.

The next python code shows the implementation of the above steps.


import matplotlib.pylab as plt
from skimage.io import imread
from skimage.color import rgb2gray
from skimage.morphology import binary_opening, binary_closing, disk
from skimage.util import invert

im = rgb2gray(imread('../new images/circles.jpg'))
im[im <= 0.5] = 0 # binarize the image
im[im > 0.5] = 1
plt.gray()
plt.figure(figsize=(20,10))
plt.subplot(131)
plt.imshow(im)
plt.title('original', size=20)
plt.axis('off')
plt.subplot(1,3,2)
im1 = binary_opening(im, disk(12))
plt.imshow(im1)
plt.title('opening with disk size ' + str(12), size=20)
plt.axis('off')
plt.subplot(1,3,3)
im1 = invert(binary_closing(invert(im), disk(12)))
plt.imshow(im1)
plt.title('closing with disk size ' + str(12), size=20)
plt.axis('off')
plt.show()

As can be seen, the two output images obtained are exactly the same.

opening_closing_dual

Floyd-Steinberg Dithering (to convert a grayscale to a binary image)

The next figure shows the algorithm for error diffusion dithering.

fsa.png


import numpy as np
import matplotlib.pylab as plt
from skimage.io import imread
from skimage.color import rgb2gray

def find_closest_palette_color(oldpixel):
    return int(round(oldpixel / 255)*255) # quantize to 0 or 255

im = rgb2gray(imread('../my images/godess.jpg'))*255
pixel = np.copy(im)
h, w = im.shape # height (rows) and width (columns)

# scan top-to-bottom, left-to-right, diffusing the quantization error
# to the yet-unprocessed neighbors with the Floyd-Steinberg weights
for y in range(h):
    for x in range(w):
        oldpixel = pixel[y][x]
        newpixel = find_closest_palette_color(oldpixel)
        pixel[y][x] = newpixel
        quant_error = oldpixel - newpixel
        if x < w-1:
           pixel[y][x + 1] += quant_error * 7 / 16
        if y < h-1 and x > 0:
           pixel[y + 1][x - 1] += quant_error * 3 / 16
        if y < h-1:
           pixel[y + 1][x] += quant_error * 5 / 16
        if y < h-1 and x < w-1:
           pixel[y + 1][x + 1] += quant_error * 1 / 16

plt.figure(figsize=(10,20))
plt.imshow(pixel, cmap='gray')
plt.axis('off')
plt.show()

The input image (gray-scale)

godess_gray.jpg

The output Image (binary)

godess_binary.png

The next animation shows how another input grayscale image gets converted to an output binary image using the error diffusion dithering.

swan.gif

Sharpen a color image

  1. First blur the image with an LPF (e.g., Gaussian Filter).
  2. Compute the detail image as the difference between the original and the blurred image.
  3. Now the sharpened image can be computed as a linear combination of the original image and the detail image. The next figure illustrates the concept.

f8

The next python code shows how this can be implemented in python:


from scipy import misc, ndimage
import matplotlib.pyplot as plt
import numpy as np

def rgb2gray(im):
    return np.clip(0.2989 * im[...,0] + 0.5870 * im[...,1] + 0.1140 * im[...,2], 0, 1)

im = misc.imread('../my images/me.jpg')/255
im_blurred = ndimage.gaussian_filter(im, (5,5,0))
im_detail = np.clip(im - im_blurred, 0, 1)
fig, axes = plt.subplots(nrows=2, ncols=3, sharex=True, sharey=True, figsize=(15, 15))
axes = axes.ravel()
axes[0].imshow(im)
axes[0].set_title('Original image', size=15)
axes[1].imshow(im_blurred)
axes[1].set_title('Blurred image, sigma=5', size=15)
axes[2].imshow(im_detail)
axes[2].set_title('Detail image', size=15)
alpha = [1, 5, 10]
for i in range(3):
    im_sharp = np.clip(im + alpha[i]*im_detail, 0, 1)
    axes[3+i].imshow(im_sharp)
    axes[3+i].set_title('Sharpened image, alpha=' + str(alpha[i]), size=15)
for ax in axes:
    ax.axis('off')
fig.tight_layout()
plt.show()

The next figure shows the output of the above code block. As can be seen, the output gets more sharpened as the value of alpha increases.

me_unsharp2.png

The next animation shows how the image gets more and more sharpened with increasing alpha.

me_sharp

Edge Detection with LOG and Zero-Crossing Algorithm by Marr and Hildreth

The following figure shows LOG filter and its DOG approximation.

f1.png

LOG.png

In order to detect edges as a binary image, Marr and Hildreth proposed finding the zero-crossings in the LoG-convolved image. The edge pixels can be identified by looking at the signs of the LoG-smoothed image values, after converting it to a binary image; the algorithm is as follows:

Algorithm to compute the zero-crossing 

  1. First convert the LOG-convolved image to a binary image, by replacing the pixel values by 1 for positive values and 0 for negative values.
  2. In order to compute the zero crossing pixels, we need to simply look at the boundaries of the non-zero regions in this binary image.
  3. Boundaries can be found by finding any non-zero pixel that has an immediate neighbor which is zero.
  4. Hence, for each pixel, if it is non-zero, consider its 8 neighbors, if any of the neighboring pixels is zero, the pixel can be identified as an edge.

The next python code and the output images / animations generated show how to detect the edges from the  zebra image with LOG + zero-crossings:


import numpy as np
from scipy import ndimage
import matplotlib.pyplot as plt
from skimage.io import imread
from skimage.color import rgb2gray

def any_neighbor_zero(img, i, j):
    for k in range(-1,2):
      for l in range(-1,2):
         if img[i+k, j+l] == 0:
            return True
    return False

def zero_crossing(img):
  img[img > 0] = 1
  img[img <= 0] = 0
  out_img = np.zeros(img.shape)
  for i in range(1, img.shape[0]-1):
    for j in range(1, img.shape[1]-1):
      if img[i,j] > 0 and any_neighbor_zero(img, i, j):
        out_img[i,j] = 255
  return out_img

img = rgb2gray(imread('../images/zebras.jpg'))

fig = plt.figure(figsize=(25,15))
plt.gray() # show the filtered result in grayscale

for sigma in range(2, 10, 2):
    plt.subplot(2, 2, sigma//2)
    result = ndimage.gaussian_laplace(img, sigma=sigma)
    plt.imshow(zero_crossing(result))
    plt.axis('off')
    plt.title('LoG with zero-crossing, sigma=' + str(sigma), size=30)

plt.tight_layout()
plt.show()

Original Input Image
zebras.jpg

Output with edges detected with LOG + zero-crossing at different sigma scales
zera_log_zc.png

zebra_log

With another input image
me6.jpg

Output with edges detected with LOG + zero-crossing at different sigma scales

me6_log_zc.png

me_log.gif

Constructing the Gaussian Pyramid with scikit-image transform module’s reduce function and Laplacian Pyramid from the Gaussian Pyramid and the expand function

The Gaussian Pyramid can be computed with the following steps:

  1. Start with the original image.
  2. Iteratively compute the image at each level of the pyramid first by smoothing the image (with gaussian filter) and then downsampling it .
  3. Stop at a level where the image size becomes sufficiently small (e.g., 1×1).

The Laplacian Pyramid can be computed with the following steps:

  1. Start with the Gaussian Pyramid and with the smallest image.
  2. Iteratively compute the difference image in between the image at the current level and the image obtained by first upsampling and then smoothing the image (with gaussian filter) from the previous level of the Gaussian Pyramid.
  3. Stop at a level where the image size becomes equal to the original image size.

The next python code shows how to create a Gaussian Pyramid from an image.

import numpy as np
import matplotlib.pyplot as plt
from skimage.io import imread
from skimage.color import rgb2gray
from skimage.transform import pyramid_reduce, pyramid_expand, resize

def get_gaussian_pyramid(image):
    rows, cols, dim = image.shape
    gaussian_pyramid = [image]
    while rows > 1 and cols > 1:
        image = pyramid_reduce(image, downscale=2)
        gaussian_pyramid.append(image)
        print(image.shape)
        rows //= 2
        cols //= 2
    return gaussian_pyramid

def get_laplacian_pyramid(gaussian_pyramid):
    laplacian_pyramid = [gaussian_pyramid[len(gaussian_pyramid)-1]]
    for i in range(len(gaussian_pyramid)-2, -1, -1):
        image = gaussian_pyramid[i] - resize(pyramid_expand(gaussian_pyramid[i+1]), gaussian_pyramid[i].shape)
        laplacian_pyramid.append(np.copy(image))
    laplacian_pyramid = laplacian_pyramid[::-1]
    return laplacian_pyramid

image = imread('../images/antelops.jpeg')
gaussian_pyramid = get_gaussian_pyramid(image)
laplacian_pyramid = get_laplacian_pyramid(gaussian_pyramid)

w, h = 20, 12
for i in range(3):
    plt.figure(figsize=(w,h))
    p = gaussian_pyramid[i]
    plt.imshow(p)
    plt.title(str(p.shape[0]) + 'x' + str(p.shape[1]), size=20)
    plt.axis('off')
    w, h = w / 2, h / 2
    plt.show()

w, h = 10, 6
for i in range(1,4):
    plt.figure(figsize=(w,h))
    p = laplacian_pyramid[i]
    plt.imshow(rgb2gray(p), cmap='gray')
    plt.title(str(p.shape[0]) + 'x' + str(p.shape[1]), size=20)
    plt.axis('off')
    w, h = w / 2, h / 2
    plt.show()

 

Some images from the Gaussian Pyramid

a1
a2
a3
a4

Some images from the Laplacian Pyramid

a5a6

a7

Blending images with Gaussian and Laplacian pyramids

Here is the algorithm:

f6.png

Blending the following input images A, B with mask image M

Input Image A (Goddess Durga)
madurga1

Input Image B (Lord Shiva)
mahadeb

Mask Image M
mask1

 

The following python code creates the output image I shown below:


import numpy as np
from skimage.io import imread
from skimage.transform import pyramid_expand, resize

A = imread('../images/madurga1.jpg')/255
B = imread('../images/mahadeb.jpg')/255
M = imread('../images/mask1.jpg')/255
# get the Gaussian and Laplacian pyramids with the functions defined earlier
pyramidA = get_laplacian_pyramid(get_gaussian_pyramid(A))
pyramidB = get_laplacian_pyramid(get_gaussian_pyramid(B))
pyramidM = get_gaussian_pyramid(M)

pyramidC = []
for i in range(len(pyramidM)):
   im = pyramidM[i]*pyramidA[i] + (1-pyramidM[i])*pyramidB[i]
   pyramidC.append(im)

def reconstruct_image_from_laplacian_pyramid(pyramid):
    # start from the smallest level, then repeatedly expand and add the next detail image
    im = pyramid[-1]
    for i in range(len(pyramid)-2, -1, -1):
        im = pyramid[i] + resize(pyramid_expand(im), pyramid[i].shape)
    return im

I = reconstruct_image_from_laplacian_pyramid(pyramidC)

Output Image I (Ardhanarishwar)
ardhanariswara_output.png

The following animation shows how the output image is formed:

ardhanariswar

 

ardhanariswar1

Another blending (horror!) example (from prof. dmartin)

horrorhorror1

 
