In this article a few popular image processing problems along with their solutions are going to be discussed. Python image processing libraries are going to be used to solve these problems. Some of the problems are from the exercises from this book (available on amazon).

## Image Transformations and Warping

### 0. Display RGB image color channels in 3D

- A
*gray-scale*image can be thought of a 2-D function*f(x,y)*of the pixel locations*(x,y),*that maps each pixel into its corresponding*gray level*(an integer in [0,255], e.g.,). - For an RGB image there are 3 such functions,
*f_R(x,y), f_G(x.y), f_B(x.y)*. - matplotlib’s 3-D plot functions can be used to plot each function.

The following python code shows how to plot the RGB channels separately in 3D:

import matplotlib.pylab as plt from mpl_toolkits.mplot3d import Axes3D def plot_3d(X, Y, Z, title, cmap): # implement this function to plot the channel pixel values in 3D plt.show() im = imread('../images/parrot.jpg') Y = np.arange(im.shape[0]) X = np.arange(im.shape[1]) Z1 = im[...,0] Z2 = im[...,1] Z3 = im[...,2] plot_3d(X, Y, Z1, cmap='Reds', title='3D plot for the Red Channel') plot_3d(X, Y, Z2, cmap='Greens', title='3D plot for the Green Channel') plot_3d(X, Y, Z3, cmap='Blues', title='3D plot for the Blue Channel')

**The RGB image
**

### 1. Wave Transform

- Use
*scikit-image’s warp()*function to implement the*wave*transform. - Note that wave transform can be expressed with the following equations:

We shall use the *madrill* image to implement the wave transform. The next python code fragment shows how to do it:

def wave(xy): xy[:, 1] += 20*np.sin(2*np.pi*xy[:, 0]/64) return xy from skimage.io import imread from skimage.transform import warp import matplotlib.pylab as plt im = imread('images/mandrill.jpg') im = warp(im, wave) plt.imshow(im) plt.show()

The next figure shows the original *mandrill* input image and the output image obtained after applying the *wave* transform.

### 2. Swirl Transform

- Use
*scikit-image’s warp()*function to implement the*swirl*transform. - Note that swirl transform can be expressed with the following equations

We shall use the *madrill* image to implement the wave transform. The next python code fragment shows how to do it:

def swirl(xy, x0, y0, R): r = np.sqrt((xy[:,1]-x0)**2 + (xy[:,0]-y0)**2) a = np.pi * r / R xy[:, 1] = (xy[:, 1]-x0)*np.cos(a) + (xy[:, 0]-y0)*np.sin(a) + x0 xy[:, 0] = -(xy[:, 1]-x0)*np.sin(a) + (xy[:, 0]-y0)*np.cos(a) + y0 return xy im = imread('../images/mandrill.jpg') im = warp(im, swirl, map_args={'x0':112, 'y0':112, 'R':512}) plt.imshow(im) plt.axis('off') plt.show()

The next figure shows the original *mandrill* input image and the output image obtained after applying the *swirl *transform.

Compare this with the output of the *scikit-image swirl()* function.

### 3. Very simple Face morphing with α-blending

- Start from one face image (e.g., let image
_{1}be the face of Messi) and end into another image (let image_{2}be the face of Ronaldo) iteratively, creating some intermediate images in between. - At each iteration create an image by using a
*linear combination*of the two image*numpy ndarrays*given by

*α*from 0 to 1.

The following code block shows how to implement it using matplotlib’s image and pylab modules.

im1 = mpimg.imread("../images/messi.jpg") / 255 # scale RGB values in [0,1] im2 = mpimg.imread("../images/ronaldo.jpg") / 255 i = 1 plt.figure(figsize=(18,15)) for alpha in np.linspace(0,1,20): plt.subplot(4,5,i) plt.imshow((1-alpha)*im1 + alpha*im2) plt.axis('off') i += 1 plt.subplots_adjust(wspace=0.05, hspace=0.05) plt.show()

The next animation shows the simple face morphing:

There are more sophisticated techniques to improve the quality of morphing, but this is the simplest one.

### 4. Creating Instagram-like Gotham Filter

#### The Gotham filter

The Gotham filter is computed as follows (the steps taken from here), applying the following operations on an image, the corresponding python code, input and output images are shown along with the operations (with the following input image):

- A mid-tone red contrast boost
from PIL import Image import numpy as np import matplotlib.pylab as plt im = Image.open('../images/city.jpg') # pixel values in [0,255] r, g, b = im.split() red_levels = [0., 12.75, 25.5, 51., 76.5, 127.5, 178.5, 204., 229.5, 242.25, 255.] r1 = Image.fromarray((np.reshape(np.interp(np.array(r).ravel(), np.linspace(0,255,len(red_levels)), red_levels), (im.height, im.width))).astype(np.uint8), mode='L') plt.figure(figsize=(20,15)) plt.subplot(221) plt.imshow(im) plt.title('original', size=20) plt.axis('off') plt.subplot(222) im1 = Image.merge('RGB', (r1, g, b)) plt.imshow(im1) plt.axis('off') plt.title('with red channel interpolation', size=20) plt.subplot(223) plt.hist(np.array(r).ravel(), normed=True) plt.subplot(224) plt.hist(np.array(r1).ravel(), normed=True) plt.show()

- Make the blacks a little bluer
plt.figure(figsize=(20,10)) plt.subplot(121) plt.imshow(im1) plt.title('last image', size=20) plt.axis('off') b1 = Image.fromarray(np.clip(np.array(b) + 7.65, 0, 255).astype(np.uint8)) im1 = Image.merge('RGB', (r1, g, b1)) plt.subplot(122) plt.imshow(im1) plt.axis('off') plt.title('with transformation', size=20) plt.tight_layout() plt.show()

- A small sharpening
from PIL.ImageEnhance import Sharpness plt.figure(figsize=(20,10)) plt.subplot(121) plt.imshow(im1) plt.title('last image', size=20) plt.axis('off') im2 = Sharpness(im1).enhance(3.0) plt.subplot(122) plt.imshow(im2) plt.axis('off') plt.title('with transformation', size=20) plt.tight_layout() plt.show()

- A boost in blue channel for lower mid-tones
- A decrease in blue channel for upper mid-tones
blue_levels = [0., 11.985, 30.09, 64.005, 81.09, 99.96, 107.1, 111.945, 121.125, 143.055, 147.9, 159.885, 171.105, 186.915, 215.985, 235.875, 255.] b2 = Image.fromarray((np.reshape(np.interp(np.array(b1).ravel(), np.linspace(0,255,len(blue_levels)), blue_levels), (im.height, im.width))).astype(np.uint8), mode='L') plt.figure(figsize=(20,15)) plt.subplot(221) plt.imshow(im2) plt.title('last image', size=20) plt.axis('off') plt.subplot(222) im3 = Image.merge('RGB', (r1, g, b2)) plt.imshow(im3) plt.axis('off') plt.title('with blue channel interpolation', size=20) plt.subplot(223) plt.hist(np.array(b1).ravel(), normed=True) plt.subplot(224) plt.hist(np.array(b2).ravel(), normed=True) plt.show()

The output image obtained after applying the *Gotham filter* is shown below:

## Down-sampling with anti-aliasing using Gaussian Filter

- Start with a large gray-scale image and reduce the image size 16 times, by reducing both height and width by 4 times.
- Select every 4th pixel in the x and the y direction from the original image to compute the values of the pixels in the smaller image.
- Before down-sampling apply a Gaussian filter (to smooth the image) for anti-aliasing.
- Compare the quality of the output image obtained by down-sampling without a Gaussian filter (with aliasing).

The next code block performs the above steps. Since the Gaussian blur is a low-pass filter, it removes the high frequencies from the original input image, hence it’s possible to achieve sampling rate above the **Nyquist** rate (by **sampling theorem**) to avoid **aliasing**.

from scipy.ndimage import gaussian_filter im = rgb2gray(imread('images/umbc.png')) print(im.shape) plt.figure(figsize=(20,20)) plt.imshow(im) plt.show() plt.figure(figsize=(20,20)) im_blurred = gaussian_filter(im, sigma=2.5) #(5,5,1) plt.imshow(im_blurred) plt.show() n = 4 # create and image 16 times smaller in size w, h = im.shape[0] // n, im.shape[1] // n im_small = np.zeros((w,h)) for i in range(w): for j in range(h): im_small[i,j] = im[n*i, n*j] plt.figure(figsize=(20,20)) plt.imshow(im_small) plt.show() im_small = np.zeros((w,h)) for i in range(w): for j in range(h): im_small[i,j] = im_blurred[n*i, n*j] plt.figure(figsize=(20,20)) plt.imshow(im_small) plt.show()

**Original Image**

** Image blurred with Gaussian Filter LPF**

** Down-sampled Image from the original image (with aliasing)**

** Down-sampled Image from the blurred image (with anti-aliasing)**

## Some Applications of DFT

### 0. Fourier Transform of a Gaussian Kernel is another Gaussian Kernel

Also, the spread in the frequency domain inversely proportional to the spread in the spatial domain (known as **Heisenberg’s inequality**). Here is the proof:

The following animation shows an example visualizing the Gaussian contours in spatial and corresponding frequency domains:

### 1. Using DFT to up-sample an image

- Let’s use the
*lena gray-scale*image. - First double the size of the by padding zero rows/columns at every alternate positions.
- Use
*FFT*followed by an*LPF.* - Finally use
*IFFT*to get the output image.

The following code block shows the python code for implementing the steps listed above:

import numpy as np import numpy.fft as fp import matplotlib.pyplot as plt im = np.mean(imread('images/lena.jpg'), axis=2) im1 = np.zeros((2*im.shape[0], 2*im.shape[1])) print(im.shape, im1.shape) for i in range(im.shape[0]): for j in range(im.shape[1]): im1[2*i,2*j] = im[i,j] def padwithzeros(vector, pad_width, iaxis, kwargs): vector[:pad_width[0]] = 0 vector[-pad_width[1]:] = 0 return vector # the LPF kernel kernel = [[0.25, 0.5, 0.25], [0.5, 1, 0.5], [0.25, 0.5, 0.25]] # enlarge the kernel to the shape of the image kernel = np.pad(kernel, (((im1.shape[0]-3)//2,(im1.shape[0]-3)//2+1), ((im1.shape[1]-3)//2,(im1.shape[1]-3)//2+1)), padwithzeros) plt.figure(figsize=(15,10)) plt.gray() # show the filtered result in grayscale freq = fp.fft2(im1) freq_kernel = fp.fft2(fp.ifftshift(kernel)) freq_LPF = freq*freq_kernel # by the Convolution theorem im2 = fp.ifft2(freq_LPF) freq_im2 = fp.fft2(im2) plt.subplot(2,3,1) plt.imshow(im) plt.title('Original Image', size=20) plt.subplot(2,3,2) plt.imshow(im1) plt.title('Padded Image', size=20) plt.subplot(2,3,3) plt.imshow( (20*np.log10( 0.1 + fp.fftshift(freq))).astype(int), cmap='jet') plt.title('Original Image Spectrum', size=20) plt.subplot(2,3,4) plt.imshow( (20*np.log10( 0.1 + fp.fftshift(freq_kernel))).astype(int), cmap='jet') plt.title('Image Spectrum of the LPF', size=20) plt.subplot(2,3,5) plt.imshow( (20*np.log10( 0.1 + fp.fftshift(freq_im2))).astype(int), cmap='jet') plt.title('Image Spectrum after LPF', size=20) plt.subplot(2,3,6) plt.imshow(im2.astype(np.uint8)) # the imaginary part is an artifact plt.title('Output Image', size=20)

The next figure shows the output. As can be seen from the next figure, the LPF removed the high frequency components from the Fourier spectrum of the padded image and with a subsequent inverse Fourier transform we get a decent enlarged image.

### 2. Frequency Domain Gaussian Filter

- Use an input image and use DFT to create the frequency 2D-array.
- Create a small Gaussian 2D Kernel (to be used as an LPF) in the spatial domain and pad it to enlarge it to the image dimensions.
- Use DFT to obtain the Gaussian Kernel in the frequency domain.
- Use the Convolution theorem to convolve the LPF with the input image in the frequency domain.
- Use IDFT to obtain the output image.
- Plot the frequency spectrum of the image, the gaussian kernel and the image obtained after convolution in the frequency domain, in 3D.

The following code block shows the python code:

import matplotlib.pyplot as plt from matplotlib import cm from skimage.color import rgb2gray from skimage.io import imread import scipy.fftpack as fp im = rgb2gray(imread('images/temple.jpg')) kernel = np.outer(signal.gaussian(im.shape[0], 10), signal.gaussian(im.shape[1], 10)) freq = fp.fft2(im) assert(freq.shape == kernel.shape) freq_kernel = fp.fft2(fp.ifftshift(kernel)) convolved = freq*freq_kernel # by the Convolution theorem im_blur = fp.ifft2(convolved).real im_blur = 255 * im_blur / np.max(im_blur) # center the frequency response plt.imshow( (20*np.log10( 0.01 + fp.fftshift(freq_kernel))).real.astype(int), cmap='coolwarm') plt.colorbar() plt.show() plt.figure(figsize=(20,20)) plt.imshow(im, cmap='gray') plt.show() from mpl_toolkits.mplot3d import Axes3D from matplotlib.ticker import LinearLocator, FormatStrFormatter # ... code for 3D visualization of the spectrums

#### The original color temple image (time / spatial domain)

**The temple image (frequency domain)**

#### The Gaussian Kernel LPF in 2D (frequency domain)

#### The Gaussian Kernel LPF (frequency domain)

#### The smoothed temple image with the LPF (frequency domain)

If we set the standard deviation of the LPF Gaussian kernel to be 10 we get the following output as shown in the next figures. As can be seen, the frequency response value drops much quicker from the center.

#### The smoothed temple image with the LPF with higher s.d. (frequency domain)

**The output image after convolution (spatial / time domain)**

3. **Using the inverse filter to restore a motion-blurred image**

- First create a motion blur kernel of a given shape.
- Convolve the kernel with an input image in the frequency domain.
- Get the motion-blurred image in the spatial domain with IDFT.
- Compute the inverse filter kernel and convolve with the blurred image in the frequency domain.
- Get the convolved image back in the spatial domain.
- Plot all the images and kernels in the frequency domain.

The following code block shows the python code:

im = rgb2gray(imread('../images/madurga.jpg')) # create the motion blur kernel size = 21 kernel = np.zeros((size, size)) kernel[int((size-1)/2), :] = np.ones(size) kernel = kernel / size kernel = np.pad(kernel, (((im.shape[0]-size)//2,(im.shape[0]-size)//2+1), ((im.shape[1]-size)//2,(im.shape[1]-size)//2+1)), padwithzeros) freq = fp.fft2(im) freq_kernel = fp.fft2(fp.ifftshift(kernel)) convolved1 = freq1*freq_kernel1 im_blur = fp.ifft2(convolved1).real im_blur = im_blur / np.max(im_blur) epsilon = 10**-6 freq = fp.fft2(im_blur) freq_kernel = 1 / (epsilon + freq_kernel1) convolved = freq*freq_kernel im_restored = fp.ifft2(convolved).real im_restored = im_restored / np.max(im_restored) plt.figure(figsize=(18,12)) plt.subplot(221) plt.imshow(im) plt.title('Original image', size=20) plt.axis('off') plt.subplot(222) plt.imshow(im_blur) plt.title('Blurred image with motion blur kernel', size=20) plt.axis('off') plt.subplot(223) plt.imshow(im_restored) plt.title('Restored image with inverse filter', size=20) plt.axis('off') plt.subplot(224) plt.imshow(im_restored - im) plt.title('Diff restored &amp;amp;amp;amp;amp; original image', size=20) plt.axis('off') plt.show() # Plot the surface of the frequency responses here

**Frequency response of the input image**

**(log) Frequency response of the motion blur kernel (LPF)**

**Input image convolved with the motion blur kernel (frequency domain)**

**(log) Frequency response of the inverse frequency filter kernel (HPF)**

**Motion-blurred image convolved with the inverse frequency filter kernel (frequency domain)**

**4. Impact of noise on the inverse filter**

- Add some random noise to the Lena image.
- Blur the image with a Gaussian kernel.
- Restore the image using inverse filter.

**With the original image**

Let’s first blur and apply the inverse filter on the noiseless blurred image. The following figures show the outputs:

**(log) Frequency response of the input image**

**(log) Frequency response of the Gaussian blur kernel (LPF)**

**(log) Frequency response of the blurred image**

**(log) Frequency response of the inverse kernel (HPF)**

**Frequency response of the output image**

**Adding noise to the original image**

The following python code can be used to add Gaussian noise to an image:

from skimage.util import random_noise im = random_noise(im, var=0.1)

**
**The next figures show the noisy lena image, the blurred image with a Gaussian Kernel and the restored image with the inverse filter. As can be seen, being a high-pass filter, the inverse filter enhances the noise, typically corresponding to high frequencies.

#### 5. Use a notch filter to remove periodic noise from the following half-toned car image.

- Use DFT to obtain the frequency spectrum of the image.
- Block the high frequency components that are most likely responsible fro noise.
- Use IDFT to come back to the spatial domain.

from scipy import fftpack im = imread('images/halftone.png') F1 = fftpack.fft2((im).astype(float)) F2 = fftpack.fftshift(F1) for i in range(60, w, 135): for j in range(100, h, 200): if not (i == 330 and j == 500): F2[i-10:i+10, j-10:j+10] = 0 for i in range(0, w, 135): for j in range(200, h, 200): if not (i == 330 and j == 500): F2[max(0,i-15):min(w,i+15), max(0,j-15):min(h,j+15)] = 0 plt.figure(figsize=(6.66,10)) plt.imshow( (20*np.log10( 0.1 + F2)).astype(int), cmap=plt.cm.gray) plt.show() im1 = fp.ifft2(fftpack.ifftshift(F2)).real plt.figure(figsize=(10,10)) plt.imshow(im1, cmap='gray') plt.axis('off') plt.show()

Frequency response of the input image

**Frequency response of the input image with blocked frequencies with notch**

**Output image**

With a low-pass-filter (LPF):

**Frequency response of the input image with blocked frequencies with LPF**

**Output image**

## Histogram Matching with color images

As described here, here is the algorithm:

- The cumulative histogram is computed for each image dataset, see the figure below.
- For any particular value (x
_{i}) in the input image data to be adjusted has a cumulative histogram value given by G(x_{i}). - This in turn is the cumulative distribution value in the reference (template) image dataset, namely H(x
_{j}). The input data value x_{i}is replaced by x_{j}.

im = imread('images/lena.jpg') im_t = imread('images/vstyle.png') im1 = np.zeros(im.shape).astype(np.uint8) plt.figure(figsize=(20,10)) for i in range(3): c = cdf(im[...,i]) c_t = cdf(im_t[...,i]) im1[...,i] = hist_matching(c, c_t, im[...,i]) # implement this function with the above algorithm c1 = cdf(im1[...,i]) col = 'r' if i == 0 else ('g' if i == 1 else 'b') plt.plot(np.arange(256), c, col + ':', label='input ' + col.upper(), linewidth=5) plt.plot(np.arange(256), c_t, col + '--', label='template ' + col.upper(), linewidth=5) plt.plot(np.arange(256), c1, col + '-', label='output ' + col.upper(), linewidth=2) plt.title('CDF', size=20) plt.legend(prop={'size': 15}) plt.show() plt.figure(figsize=(10,10)) plt.imshow(im1[...,:3]) plt.axis('off') plt.show()

**Input image **

** Template Image**

**Output Image
**

The following figure shows how the histogram of the input image is matched with the histogram of the template image.

Another example:

**Input image **

**Template Image**

**Output Image
**

## Mathematical Morphology

### 1. Automatically cropping an image

- Let’s use the following image. The image has unnecessary white background outside the molecule of the organic compound.
- First convert the image to a binary image and compute the convex hull of the molecule object.
- Use the convex hull image to find the bounding box for cropping.
- Crop the original image with the bounding box.

The next python code shows how to implement the above steps:

from PIL import Image from skimage.io import imread from skimage.morphology import convex_hull_image im = imread('../images/L_2d.jpg') plt.imshow(im) plt.title('input image') plt.show() im1 = 1 - rgb2gray(im) threshold = 0.5 im1[im1 threshold] = 1 chull = convex_hull_image(im1) plt.imshow(chull) plt.title('convex hull in the binary image') plt.show() imageBox = Image.fromarray((chull*255).astype(np.uint8)).getbbox() cropped = Image.fromarray(im).crop(imageBox) cropped.save('L_2d_cropped.jpg') plt.imshow(cropped) plt.title('cropped image') plt.show()

This can also be found here.

### 2. Opening and Closing are Dual operations in mathematical morphology

- Start with a binary image and apply opening operation with some structuring element (e.g., a disk) on it to obtain an output image.
- Invert the image (to change the foreground to background and vice versa) and apply closing operation on it with the same structuring element to obtain another output image.
- Invert the second output image obtained and observe that it’s same as the first output image.
- Thus applying opening operation to the foreground of a binary image is equivalent to applying closing operation to the background of the same image with the same structuring element.

The next python code shows the implementation of the above steps.

from skimage.morphology import binary_opening, binary_closing, disk from skimage.util import invert im = rgb2gray(imread('../new images/circles.jpg')) im[im 0.5] = 1 plt.gray() plt.figure(figsize=(20,10)) plt.subplot(131) plt.imshow(im) plt.title('original', size=20) plt.axis('off') plt.subplot(1,3,2) im1 = binary_opening(im, disk(12)) plt.imshow(im1) plt.title('opening with disk size ' + str(12), size=20) plt.axis('off') plt.subplot(1,3,3) im1 = invert(binary_closing(invert(im), disk(12))) plt.imshow(im1) plt.title('closing with disk size ' + str(12), size=20) plt.axis('off') plt.show()

As can be seen the output images obtained are exactly same.

## Floyd-Steinberg Dithering (to convert a grayscale to a binary image)

The next figure shows the algorithm for error diffusion dithering.

def find_closest_palette_color(oldpixel): return int(round(oldpixel / 255)*255) im = rgb2gray(imread('../my images/godess.jpg'))*255 pixel = np.copy(im) w, h = im.shape for x in range(w): for y in range(h): oldpixel = pixel[x][y] newpixel = find_closest_palette_color(oldpixel) pixel[x][y] = newpixel quant_error = oldpixel - newpixel if x + 1 &amp;amp;amp;amp;lt; w-1: pixel[x + 1][y] = pixel[x + 1][y] + quant_error * 7 / 16 if x &amp;amp;amp;amp;gt; 0 and y &amp;amp;amp;amp;lt; h-1: pixel[x - 1][y + 1] = pixel[x - 1][y + 1] + quant_error * 3 / 16 if y &amp;amp;amp;amp;lt; h-1: pixel[x ][y + 1] = pixel[x ][y + 1] + quant_error * 5 / 16 if x &amp;amp;amp;amp;lt; w-1 and y &amp;amp;amp;amp;lt; h-1: pixel[x + 1][y + 1] = pixel[x + 1][y + 1] + quant_error * 1 / 16 plt.figure(figsize=(10,20)) plt.imshow(pixel, cmap='gray') plt.axis('off') plt.show()

**The input image (gray-scale)**

**The output Image (binary)**

The next animation shows how an another input grayscale image gets converted to output binary image using the error diffusion dithering.

## Sharpen a color image

- First blur the image with an LPF (e.g., Gaussian Filter).
- Compute the detail image as the difference between the original and the blurred image.
- Now the sharpened image can be computed as a linear combination of the original image and the detail image. The next figure illustrates the concept.

The next python code shows how this can be implemented in python:

from scipy import misc, ndimage import matplotlib.pyplot as plt import numpy as np def rgb2gray(im): return np.clip(0.2989 * im[...,0] + 0.5870 * im[...,1] + 0.1140 * im[...,2], 0, 1) im = misc.imread('../my images/me.jpg')/255 im_blurred = ndimage.gaussian_filter(im, (5,5,0)) im_detail = np.clip(im - im_blurred, 0, 1) fig, axes = plt.subplots(nrows=2, ncols=3, sharex=True, sharey=True, figsize=(15, 15)) axes = axes.ravel() axes[0].imshow(im) axes[0].set_title('Original image', size=15) axes[1].imshow(im_blurred) axes[1].set_title('Blurred image, sigma=5', size=15) axes[2].imshow(im_detail) axes[2].set_title('Detail image', size=15) alpha = [1, 5, 10] for i in range(3): im_sharp = np.clip(im + alpha[i]*im_detail, 0, 1) axes[3+i].imshow(im_sharp) axes[3+i].set_title('Sharpened image, alpha=' + str(alpha[i]), size=15) for ax in axes: ax.axis('off') fig.tight_layout() plt.show()

The next figure shows the output of the above code block. As cane be seen, the output gets more sharpened as the value of alpha gets increased.

The next animation shows how the image gets more and more sharpened with increasing alpha.

## Edge Detection with LOG and Zero-Crossing Algorithm by Marr and Hildreth

The following figure shows LOG filter and its DOG approximation.

In order to detect edges as a binary image, finding the zero-crossings in the LoG-convolved image was proposed by Marr and Hildreth. Identification of the edge pixels can be done by viewing the sign of the LoG-smoothed image by defining it as a binary image, the algorithm is as follows:

**Algorithm to compute the zero-crossing **

- First convert the LOG-convolved image to a binary image, by replacing the pixel values by 1 for positive values and 0 for negative values.
- In order to compute the zero crossing pixels, we need to simply look at the boundaries of the non-zero regions in this binary image.
- Boundaries can be found by finding any non-zero pixel that has an immediate neighbor which is is zero.
- Hence, for each pixel, if it is non-zero, consider its 8 neighbors, if any of the neighboring pixels is zero, the pixel can be identified as an edge.

The next python code and the output images / animations generated show how to detect the edges from the zebra image with LOG + zero-crossings:

from scipy import ndimage, misc import matplotlib.pyplot as plt from scipy.misc import imread from skimage.color import rgb2gray def any_neighbor_zero(img, i, j): for k in range(-1,2): for l in range(-1,2): if img[i+k, j+k] == 0: return True return False def zero_crossing(img): img[img > 0] = 1 img[img 0 and any_neighbor_zero(img, i, j): out_img[i,j] = 255 return out_img img = rgb2gray(imread('../images/zebras.jpg')) fig = plt.figure(figsize=(25,15)) plt.gray() # show the filtered result in grayscale for sigma in range(2,10, 2): plt.subplot(2,2,sigma/2) result = ndimage.gaussian_laplace(img, sigma=sigma) plt.imshow(zero_crossing(result)) plt.axis('off') plt.title('LoG with zero-crossing, sigma=' + str(sigma), size=30) plt.tight_layout() plt.show()

**Original Input Image**

**Output with edges detected with LOG + zero-crossing at different sigma scales**

With another input image

**Output with edges detected with LOG + zero-crossing at different sigma scales**

## Constructing the Gaussian Pyramid with scikit-image transform module’s reduce function and Laplacian Pyramid from the Gaussian Pyramid and the expand function

The *Gaussian Pyramid* can be computed with the following steps:

- Start with the original image.
- Iteratively compute the image at each level of the pyramid first by smoothing the image (with gaussian filter) and then downsampling it .
- Stop at a level where the image size becomes sufficiently small (e.g., 1×1).

The *Laplacian Pyramid* can be computed with the following steps:

- Start with the Gaussian Pyramid and with the smallest image.
- Iteratively compute the difference image in between the image at the current level and the image obtained by first upsampling and then smoothing the image (with gaussian filter) from the previous level of the Gaussian Pyramid.
- Stop at a level where the image size becomes equal to the original image size.

The next python code shows how to create a Gaussian Pyramid from an image.

import numpy as np import matplotlib.pyplot as plt from skimage.io import imread from skimage.color import rgb2gray from skimage.transform import pyramid_reduce, pyramid_expand, resize def get_gaussian_pyramid(image): rows, cols, dim = image.shape gaussian_pyramid = [image] while rows&amp;gt; 1 and cols &amp;gt; 1: image = pyramid_reduce(image, downscale=2) gaussian_pyramid.append(image) print(image.shape) rows //= 2 cols //= 2 return gaussian_pyramid def get_laplacian_pyramid(gaussian_pyramid): laplacian_pyramid = [gaussian_pyramid[len(gaussian_pyramid)-1]] for i in range(len(gaussian_pyramid)-2, -1, -1): image = gaussian_pyramid[i] - resize(pyramid_expand(gaussian_pyramid[i+1]), gaussian_pyramid[i].shape) laplacian_pyramid.append(np.copy(image)) laplacian_pyramid = laplacian_pyramid[::-1] return laplacian_pyramid image = imread('../images/antelops.jpeg') gaussian_pyramid = get_gaussian_pyramid(image) laplacian_pyramid = get_laplacian_pyramid(gaussian_pyramid) w, h = 20, 12 for i in range(3): plt.figure(figsize=(w,h)) p = gaussian_pyramid[i] plt.imshow(p) plt.title(str(p.shape[0]) + 'x' + str(p.shape[1]), size=20) plt.axis('off') w, h = w / 2, h / 2 plt.show() w, h = 10, 6 for i in range(1,4): plt.figure(figsize=(w,h)) p = laplacian_pyramid[i] plt.imshow(rgb2gray(p), cmap='gray') plt.title(str(p.shape[0]) + 'x' + str(p.shape[1]), size=20) plt.axis('off') w, h = w / 2, h / 2 plt.show()

### Some images from the Gaussian Pyramid

### Some images from the Laplacian Pyramid

## Blending images with Gaussian and Laplacian pyramids

Here is the algorithm:

Blending the following input images A, B with mask image M

**Input Image A (Goddess Durga)**

**Input Image B (Lord Shiva)**

Mask Image M

with the following python code creates the output image I shown below

A = imread('../images/madurga1.jpg')/255 B = imread('../images/mahadeb.jpg')/255 M = imread('../images/mask1.jpg')/255 # get the Gaussian and Laplacian pyramids, implement the functions pyramidA = get_laplacian_pyramid(get_gaussian_pyramid(A)) pyramidB = get_laplacian_pyramid(get_gaussian_pyramid(B)) pyramidM = get_gaussian_pyramid(M) pyramidC = [] for i in range(len(pyramidM)): im = pyramidM[i]*pyramidA[i] + (1-pyramidM[i])*pyramidB[i] pyramidC.append(im) # implement the following function to construct an image from its laplacian pyramids I = reconstruct_image_from_laplacian_pyramid(pyramidC)

**Output Image I (Ardhanarishwar)**

The following animation shows how the output image is formed:

Another blending (horror!) example (from prof. dmartin)