Transformations

What is Transformation?

Transformation means change. Here we mean applying a geometric change to an image or shape.

We use transformations to correct distortions or perspective issues that arise from the point of view an image was captured from.

Types of Transformations

Affine Transformations

Translation
Rotation
Scaling

Non Affine / Projective / Perspective Transformations

Homography

You can imagine your viewpoint being skewed.

Difference between Affine and Non Affine Transformations

Projective transformations do not preserve parallelism, length, or angle.
Affine transformations, unlike projective ones, preserve parallelism (but not, in general, length or angle).
Both preserve collinearity and incidence.
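
To make the parallelism claim concrete, here is a minimal NumPy sketch. The matrices `A` and `P` and the helper names are made up for illustration; the only difference that matters is the last row, which is `[0, 0, 1]` for an affine matrix.

```python
import numpy as np

def apply_homography(H, points):
    """Map an (N, 2) array of points through a 3x3 matrix H."""
    pts = np.hstack([points, np.ones((len(points), 1))])  # homogeneous coordinates
    mapped = pts @ H.T
    return mapped[:, :2] / mapped[:, 2:3]  # divide by w to get back to 2D

# Corners of a unit square: the top and bottom sides are parallel
square = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# Affine matrix (hypothetical values): last row is [0, 0, 1]
A = np.array([[2.0, 0.5, 10.0],
              [0.3, 1.5, -4.0],
              [0.0, 0.0, 1.0]])

# Projective matrix (hypothetical values): the last row makes w depend on x
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.5, 0.0, 1.0]])

def sides_stay_parallel(H):
    a, b, c, d = apply_homography(H, square)
    u, v = b - a, d - c
    cross = u[0] * v[1] - u[1] * v[0]  # 2D cross product is 0 for parallel vectors
    return bool(abs(cross) < 1e-9)

affine_keeps_parallel = sides_stay_parallel(A)
projective_keeps_parallel = sides_stay_parallel(P)
print(affine_keeps_parallel, projective_keeps_parallel)
```

The affine matrix keeps the two sides parallel, while the projective one tilts them towards each other.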

Affine Transformations

Translation

What is Translation?

Translation is an affine transform that simply shifts the position of an image.

Code Implementation

We use cv2.warpAffine to implement these transformations.

import cv2
import numpy as np

image = cv2.imread('images/input.jpg')

# Store height and width of the image
height, width = image.shape[:2]

quarter_height, quarter_width = height/4, width/4

#       | 1 0 Tx |    Tx represents the shift along the x-axis (horizontal)
#  T  = | 0 1 Ty |    Ty represents the shift along the y-axis (vertical)

# T is our translation matrix
T = np.float32([[1, 0, quarter_width], [0, 1,quarter_height]])

# We use warpAffine to transform the image using the matrix, T
img_translation = cv2.warpAffine(image, T, (width, height))
cv2.imshow('Translation', img_translation)
cv2.waitKey()
cv2.destroyAllWindows()


# Let's take a look at T

print(T)
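
To see what T does to a single pixel coordinate, here is a small NumPy sketch. The helper names and the shift of (160, 120) are assumptions for illustration, not part of OpenCV.

```python
import numpy as np

def translation_matrix(tx, ty):
    """Build the 2x3 translation matrix described above."""
    return np.float32([[1, 0, tx], [0, 1, ty]])

def transform_point(M, x, y):
    """Apply a 2x3 affine matrix to the point (x, y)."""
    return M @ np.array([x, y, 1.0])

T = translation_matrix(160, 120)
moved = transform_point(T, 10, 20)
print(moved)  # the point (10, 20) shifted by (160, 120)
```

Every pixel is moved by the same (Tx, Ty), which is why a translation keeps the image content unchanged apart from its position.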

Rotation

What is Rotation?

Rotation spins the image around a chosen point by a given angle.

Code Implementation

We use cv2.getRotationMatrix2D to implement these transformations.

cv2.getRotationMatrix2D((rotation_center_x, rotation_center_y), angle_of_rotation, scale)

import cv2
import numpy as np

image = cv2.imread('images/input.jpg')
height, width = image.shape[:2]

# Divide by two to rotate the image around its centre (90 is the angle in degrees, .5 is the scale)
rotation_matrix = cv2.getRotationMatrix2D((width/2, height/2), 90, .5)

rotated_image = cv2.warpAffine(image, rotation_matrix, (width, height))

cv2.imshow('Rotated Image', rotated_image)
cv2.waitKey()
cv2.destroyAllWindows()
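
For intuition, the 2x3 matrix can be rebuilt by hand. This is a NumPy sketch of the formula given in the OpenCV documentation for getRotationMatrix2D; the function name `rotation_matrix_2d` and the centre (320, 240) are assumptions for illustration.

```python
import numpy as np

def rotation_matrix_2d(center, angle_deg, scale):
    """2x3 rotation matrix about `center`; positive angles are counterclockwise."""
    cx, cy = center
    theta = np.deg2rad(angle_deg)
    alpha = scale * np.cos(theta)
    beta = scale * np.sin(theta)
    return np.array([
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ])

M = rotation_matrix_2d((320, 240), 90, 0.5)

# The centre of rotation stays where it is
center_after = M @ np.array([320, 240, 1.0])

# A point 100 px to the right of the centre ends up 50 px above it
# (rotated 90 degrees counterclockwise and scaled by 0.5)
corner_after = M @ np.array([420, 240, 1.0])
print(center_after, corner_after)
```

Note that image y coordinates grow downwards, so "above" means a smaller y value.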

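Another method for simple rotations uses the cv2.transpose function. Transposing alone swaps rows and columns, which mirrors the image across its main diagonal; following it with a vertical flip gives a true rotation of 90 degrees to the left.

# Other option to rotate: transpose, then flip vertically (90 degrees counterclockwise)
img = cv2.imread('images/input.jpg')

rotated_image = cv2.flip(cv2.transpose(img), 0)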

cv2.imshow('Rotated Image - Method 2', rotated_image)
cv2.waitKey()
cv2.destroyAllWindows()

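You can also flip the image by using the cv2.flip function. A flip code of 1 mirrors the image horizontally, 0 flips it vertically, and -1 flips both axes, which is the same as rotating by 180 degrees.

# Let's now do a horizontal flip.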
flipped = cv2.flip(image, 1)
cv2.imshow('Horizontal Flip', flipped) 
cv2.waitKey()
cv2.destroyAllWindows()
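
To see how transposing and flipping relate to rotation, here is a small NumPy sketch on a 2x3 array. It uses NumPy equivalents rather than the cv2 calls, and np.rot90 serves only as a reference to compare against.

```python
import numpy as np

img = np.array([[1, 2, 3],
                [4, 5, 6]])

transposed = img.T                   # mirror across the main diagonal
rot90_ccw = np.flipud(img.T)         # transpose, then vertical flip
horizontal_flip = np.fliplr(img)     # mirror left-right
rot180 = np.flipud(np.fliplr(img))   # flip both axes

print(np.array_equal(rot90_ccw, np.rot90(img)))    # 90 degrees counterclockwise
print(np.array_equal(rot180, np.rot90(img, 2)))    # 180 degrees
print(np.array_equal(transposed, np.rot90(img)))   # a transpose alone is not a rotation
```

A single mirror (transpose or one flip) is never a rotation; it takes two mirrors to make one.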

Scaling

What is Scaling?

It is basically re-sizing the image.

Interpolation

Interpolation is a method of constructing new data points within the range of a discrete set of known data points.

cv2.INTER_AREA - Good for shrinking or down sampling

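cv2.INTER_NEAREST - Fastest, lowest quality

cv2.INTER_LINEAR - Good for zooming or up sampling (default)

cv2.INTER_CUBIC - Better quality, slower (bicubic over a 4x4 pixel neighborhood)

cv2.INTER_LANCZOS4 - Best quality, slowest (Lanczos over an 8x8 pixel neighborhood)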
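
The difference between the two simplest methods is easiest to see in one dimension. This is a NumPy sketch with made-up sample values, not the OpenCV implementation: nearest neighbour copies the closest known sample, linear takes a weighted average of the two surrounding samples.

```python
import numpy as np

known_x = np.array([0.0, 1.0, 2.0])      # positions of the known samples
known_y = np.array([0.0, 10.0, 20.0])    # their values
new_x = np.array([0.0, 0.25, 0.75, 1.6, 2.0])  # positions we want values for

# Nearest neighbour: pick the value of the closest known sample
nearest_idx = np.abs(new_x[:, None] - known_x[None, :]).argmin(axis=1)
nearest = known_y[nearest_idx]

# Linear: weighted average of the two neighbouring samples
linear = np.interp(new_x, known_x, known_y)

print(nearest)
print(linear)
```

Nearest neighbour produces blocky steps, while linear produces a smooth ramp, which matches how the two look on an enlarged image.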

How do I choose an image interpolation method? (Emgu/OpenCV)

Code Implementation

We use cv2.resize to implement resizing.

cv2.resize(image, dsize (output image size), fx (x scale), fy (y scale), interpolation)

import cv2
import numpy as np

# load our input image
image = cv2.imread('images/input.jpg')

# Let's make our image 3/4 of its original size (linear interpolation is the default)
image_scaled = cv2.resize(image, None, fx=0.75, fy=0.75)
cv2.imshow('Scaling - Linear Interpolation', image_scaled) 
cv2.waitKey()

# Let's double the size of our image
img_scaled = cv2.resize(image, None, fx=2, fy=2, interpolation = cv2.INTER_CUBIC)
cv2.imshow('Scaling - Cubic Interpolation', img_scaled)
cv2.waitKey()

# Let's skew the re-sizing by setting exact dimensions
img_scaled = cv2.resize(image, (900, 400), interpolation = cv2.INTER_AREA)
cv2.imshow('Scaling - Skewed Size', img_scaled) 
cv2.waitKey()

cv2.destroyAllWindows()

Image Pyramids

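Image pyramids are another scaling method: cv2.pyrDown halves each dimension of the image and cv2.pyrUp doubles it.

They are useful when scaling images in object detection.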

import cv2

image = cv2.imread('images/input.jpg')

smaller = cv2.pyrDown(image)
larger = cv2.pyrUp(smaller)

cv2.imshow('Original', image )

cv2.imshow('Smaller ', smaller )
cv2.imshow('Larger ', larger )
cv2.waitKey(0)
cv2.destroyAllWindows()
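
The idea of one pyramid level can be sketched in plain NumPy. This is a simplification and an assumption on my part: cv2.pyrDown smooths with a Gaussian kernel before dropping rows and columns, whereas this sketch averages 2x2 blocks, and the up-sampling just repeats pixels.

```python
import numpy as np

def downsample_2x(img):
    """Halve each dimension by averaging 2x2 blocks (simplified pyramid step)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2  # crop to an even size
    blocks = img[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.mean(axis=(1, 3))

def upsample_2x(img):
    """Double each dimension by repeating every pixel."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

image = np.arange(16, dtype=np.float64).reshape(4, 4)
smaller = downsample_2x(image)
larger = upsample_2x(smaller)

print(smaller)
print(larger.shape, np.array_equal(larger, image))
```

Going down and back up returns an image of the original size but not the original detail, which is why the 'Larger' window looks blurrier than 'Original'.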

Cropping

Cropping images refers to extracting a segment of the image.

How do we implement it?

Images loaded by OpenCV are NumPy arrays, so we crop with NumPy array slicing.

import numpy as np

Code Implementation

import cv2
import numpy as np

image = cv2.imread('images/input.jpg')
height, width = image.shape[:2]

# Let's get the starting pixel coordinates (top left of cropping rectangle)
start_row, start_col = int(height * .25), int(width * .25)

# Let's get the ending pixel coordinates (bottom right)
end_row, end_col = int(height * .75), int(width * .75)

# Simply use indexing to crop out the rectangle we desire
cropped = image[start_row:end_row , start_col:end_col]

cv2.imshow("Original Image", image)
cv2.waitKey(0) 
cv2.imshow("Cropped Image", cropped) 
cv2.waitKey(0) 
cv2.destroyAllWindows()
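
One detail worth knowing about cropping by indexing: a NumPy slice is a view into the original array, not a copy. A minimal sketch on a synthetic 8x8 image (the sizes and values are arbitrary):

```python
import numpy as np

image = np.zeros((8, 8), dtype=np.uint8)

view = image[2:6, 2:6]          # slicing returns a view into the same memory
copy = image[2:6, 2:6].copy()   # .copy() makes an independent array

view[:] = 255                   # this also changes the original image
copy[:] = 100                   # this does not

print(image[3, 3], image[0, 0], copy[0, 0])
print(np.shares_memory(view, image), np.shares_memory(copy, image))
```

If you plan to draw on or modify a crop without touching the source image, take a `.copy()` of the slice first.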

Operations

Arithmetic Operations

These are simple operations that allow us to directly add to or subtract from the color intensity.

cv2.add and cv2.subtract calculate the per-element sum or difference of two arrays, clipping the result to the 0-255 range. The overall effect is increasing or decreasing brightness.

Code Implementation

Again, numpy is required.

import cv2
import numpy as np

image = cv2.imread('images/input.jpg')

# Create a matrix of ones, then multiply it by a scalar of 75 
# This gives a matrix with the same dimensions as our image with all values being 75
M = np.ones(image.shape, dtype = "uint8") * 75 
# M = np.ones(image.shape, dtype = "uint8") * 175 
# ^ Try this code!

# We use this to add this matrix M, to our image
# Notice the increase in brightness
added = cv2.add(image, M)
cv2.imshow("Added", added)

# Likewise we can also subtract
# Notice the decrease in brightness
subtracted = cv2.subtract(image, M)
cv2.imshow("Subtracted", subtracted)

cv2.waitKey(0)
cv2.destroyAllWindows()

M = np.ones(image.shape, dtype = "uint8") * 75
print(M)
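
It is worth seeing why cv2.add is used instead of a plain `+`. cv2.add saturates at 255, while NumPy's uint8 addition wraps around. The sketch below reproduces both behaviours with NumPy only; the pixel values are made up.

```python
import numpy as np

pixels = np.array([100, 200, 250], dtype=np.uint8)
M = np.full(pixels.shape, 75, dtype=np.uint8)

# Plain NumPy uint8 addition wraps around modulo 256
wrapped = pixels + M

# Saturating arithmetic: widen the type, compute, then clip back into 0..255
saturated_add = np.clip(pixels.astype(np.int16) + M, 0, 255).astype(np.uint8)
saturated_sub = np.clip(pixels.astype(np.int16) - M, 0, 255).astype(np.uint8)

print(wrapped)         # bright pixels wrap to dark values
print(saturated_add)   # bright pixels stay at 255
print(saturated_sub)
```

With wraparound, the brightest areas of an image would suddenly turn dark, which is rarely what you want when adjusting brightness.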

Added vs Subtracted

Bitwise Operations / Masking

We use bitwise operations for masking.

Code Implementation

Again, numpy is required.

import cv2
import numpy as np

# If you're wondering why only two dimensions: this is a grayscale image.
# If we were making a colored image, we'd use
# rectangle = np.zeros((300, 300, 3), np.uint8)

# Making a square (a negative thickness draws a filled shape)
square = np.zeros((300, 300), np.uint8)
cv2.rectangle(square, (50, 50), (250, 250), 255, -1)
cv2.imshow("Square", square)
cv2.waitKey(0)

# Making a half ellipse (start angle 0, end angle 180)
ellipse = np.zeros((300, 300), np.uint8)
cv2.ellipse(ellipse, (150, 150), (150, 150), 30, 0, 180, 255, -1)
cv2.imshow("Ellipse", ellipse)
cv2.waitKey(0)

cv2.destroyAllWindows()

^ image should be in BW. I forgot to change the colormap :baby_chick:

Try experimenting with some bitwise operations

# Shows only where they intersect
And = cv2.bitwise_and(square, ellipse)
cv2.imshow("AND", And)
cv2.waitKey(0)

#^ Press any key to continue...

# Shows where either square or ellipse is 
bitwiseOr = cv2.bitwise_or(square, ellipse)
cv2.imshow("OR", bitwiseOr)
cv2.waitKey(0) 

#^ Press any key to continue...

# Shows where either exists by itself
bitwiseXor = cv2.bitwise_xor(square, ellipse)
cv2.imshow("XOR", bitwiseXor)
cv2.waitKey(0)

#^ Press any key to continue...

# Shows everything that isn't part of the square
bitwiseNot_sq = cv2.bitwise_not(square)
cv2.imshow("NOT - square", bitwiseNot_sq)
cv2.waitKey(0)

#^ Press any key to continue...

### Notice the last operation inverts the image totally

cv2.destroyAllWindows()
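
Masking a real image works the same way as the shapes above. Here is a NumPy-only sketch on a tiny synthetic grayscale image; the array values and the mask region are made up for illustration.

```python
import numpy as np

image = np.arange(16, dtype=np.uint8).reshape(4, 4) + 100  # pixel values 100..115
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 255   # white marks the region to keep

# 255 is all ones in binary, so AND keeps the pixel; 0 clears it
masked = np.bitwise_and(image, mask)

# NOT swaps the kept and cleared regions
inverted_mask = np.bitwise_not(mask)
outside = np.bitwise_and(image, inverted_mask)

print(masked)
print(outside)
```

This is why masks are drawn in pure white (255) on black (0): any other value would alter the bits of the pixels it is combined with.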

Blurring and Sharpening

Convolution

What is Convolution?

A Convolution is a mathematical operation performed on two functions producing a third function which is typically a modified version of one of the original functions.

In Computer Vision we use kernels to specify the size of the region over which we run our manipulating function on our image.

$Output\space Image = Image \space \circledast \space Function_{Kernel\space Size}$

Why mention Convolution?

Convolutions are found in many applications in science, engineering and mathematics.

For example, in image processing:

In digital image processing, convolutional filtering plays an important role in many algorithms for edge detection and related processes.
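
To show what sliding a kernel over an image means, here is a naive NumPy sketch of 2D convolution. It is written for clarity, not speed, and the function name `convolve2d` is my own; it uses no padding, so the output is smaller than the input.

```python
import numpy as np

def convolve2d(image, kernel):
    """Naive 'valid' 2D convolution of a grayscale image with a kernel."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]   # true convolution flips the kernel
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for y in range(out_h):
        for x in range(out_w):
            window = image[y:y + kh, x:x + kw]      # region under the kernel
            out[y, x] = np.sum(window * flipped)    # weighted sum of that region
    return out

image = np.arange(25, dtype=np.float64).reshape(5, 5)
box = np.ones((3, 3)) / 9.0        # 3x3 averaging kernel

result = convolve2d(image, box)
print(result)
```

On this smooth ramp image, averaging each 3x3 window gives back the centre pixel, so the result equals the inner 3x3 part of the input. For symmetric kernels like this one, flipping the kernel makes no difference.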

Blurring

What is Blurring?

Blurring is an operation where we average the pixels within a region.

Code Implementation

cv2.filter2D(image, -1, kernel)

import cv2
import numpy as np

image = cv2.imread('images/elephant.jpg')
cv2.imshow('Original Image', image)
cv2.waitKey(0)

# Creating our 3 x 3 kernel
kernel_3x3 = np.ones((3, 3), np.float32) / 9

# We use cv2.filter2D to convolve the kernel with an image 
blurred = cv2.filter2D(image, -1, kernel_3x3)
cv2.imshow('3x3 Kernel Blurring', blurred)
cv2.waitKey(0)

# Creating our 7 x 7 kernel
kernel_7x7 = np.ones((7, 7), np.float32) / 49

blurred2 = cv2.filter2D(image, -1, kernel_7x7)
cv2.imshow('7x7 Kernel Blurring', blurred2)
cv2.waitKey(0)

cv2.destroyAllWindows()

Other commonly used blurring methods

cv2.blur - Averaging values over a specified window

cv2.GaussianBlur - Gaussian blurring effect

cv2.medianBlur - Takes Median of all the pixels

cv2.bilateralFilter - Noise removal while keeping edges sharp

import cv2
import numpy as np

image = cv2.imread('images/elephant.jpg')

# Averaging is done by convolving the image with a normalized box filter. 
# This takes the mean of the pixels under the box and replaces the central element with it
# The box size must be positive
blur = cv2.blur(image, (3,3))
cv2.imshow('Averaging', blur)
cv2.waitKey(0)

# Instead of a box filter, use a Gaussian kernel (its width and height must be odd and positive)
Gaussian = cv2.GaussianBlur(image, (7,7), 0)
cv2.imshow('Gaussian Blurring', Gaussian)
cv2.waitKey(0)

# Takes median of all the pixels under kernel area and central 
# element is replaced with this median value
median = cv2.medianBlur(image, 5)
cv2.imshow('Median Blurring', median)
cv2.waitKey(0)

# Bilateral is very effective in noise removal while keeping edges sharp
bilateral = cv2.bilateralFilter(image, 9, 75, 75)
cv2.imshow('Bilateral Blurring', bilateral)
cv2.waitKey(0)
cv2.destroyAllWindows()

Image De-noising

Image de-noising reduces noise in an image while trying to preserve detail. Cellphone cameras and digital cameras use algorithms of this kind. The OpenCV functions below implement Non-Local Means denoising.

cv2.fastNlMeansDenoising() - works with a single grayscale image

cv2.fastNlMeansDenoisingColored() - works with a color image.

cv2.fastNlMeansDenoisingMulti() - works with image sequence captured in short period of time (grayscale images)

cv2.fastNlMeansDenoisingColoredMulti() - same as above, but for color images.

import numpy as np
import cv2

image = cv2.imread('images/elephant.jpg')

# Parameters after None: the filter strength 'h' (5-10 is a good range),
# then hForColorComponents (set to the same value as h),
# then templateWindowSize (7) and searchWindowSize (21)
dst = cv2.fastNlMeansDenoisingColored(image, None, 6, 6, 7, 21)

cv2.imshow('Fast Means Denoising', dst)
cv2.waitKey(0)

cv2.destroyAllWindows()

Sharpening

What is Sharpening?

Sharpening is the opposite of blurring: it strengthens or emphasizes edges in an image.

Code Implementation

By altering our kernels we can implement sharpening, which has the effect of strengthening or emphasizing edges in an image.

import cv2
import numpy as np

image = cv2.imread('images/input.jpg')
cv2.imshow('Original', image)

# Create our sharpening kernel. We don't normalize since
# the values in the matrix sum to 1
kernel_sharpening = np.array([[-1,-1,-1], 
                              [-1,9,-1], 
                              [-1,-1,-1]])

# applying different kernels to the input image
sharpened = cv2.filter2D(image, -1, kernel_sharpening)

cv2.imshow('Image Sharpening', sharpened)

cv2.waitKey(0)
cv2.destroyAllWindows()
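
Why a kernel that sums to 1 sharpens without changing overall brightness can be checked on two tiny patches. This is a NumPy sketch that evaluates the kernel at the centre pixel only; the patch values and the helper name `filter_center` are made up.

```python
import numpy as np

kernel_sharpening = np.array([[-1, -1, -1],
                              [-1,  9, -1],
                              [-1, -1, -1]], dtype=np.float64)

def filter_center(patch, kernel):
    """Weighted sum of a 3x3 patch, i.e. the filter output at its centre pixel."""
    return float(np.sum(patch * kernel))

flat = np.full((3, 3), 50.0)               # uniform region
edge = np.array([[50.0, 50.0, 100.0],      # dark centre next to a brighter column
                 [50.0, 50.0, 100.0],
                 [50.0, 50.0, 100.0]])

flat_out = filter_center(flat, kernel_sharpening)   # unchanged: 50
edge_out = filter_center(edge, kernel_sharpening)   # pushed far darker: -100

print(kernel_sharpening.sum(), flat_out, edge_out)
```

Flat regions pass through unchanged, while a pixel on the dark side of an edge is pushed darker (a negative result is clipped to 0 in an 8-bit image), which increases the contrast across the edge.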

Thresholding

What is Thresholding?

Thresholding is the act of converting an image to a binary form.

Code Implementation

cv2.threshold(image, ThresholdValue, MaxValue, ThresholdType)

Threshold Types:

cv2.THRESH_BINARY

cv2.THRESH_BINARY_INV

cv2.THRESH_TRUNC

cv2.THRESH_TOZERO

cv2.THRESH_TOZERO_INV

import cv2
import numpy as np

# Load our image as greyscale 
image = cv2.imread('images/gradient.jpg',0)
cv2.imshow('Original', image)

# Values at or below 127 go to 0 (black), everything above goes to 255 (white)
ret,thresh1 = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)
cv2.imshow('1 Threshold Binary', thresh1)

# Values below 127 go to 255 and values above 127 go to 0 (reverse of above)
ret,thresh2 = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY_INV)
cv2.imshow('2 Threshold Binary Inverse', thresh2)

# Values above 127 are truncated (held) at 127 (the 255 argument is unused)
ret,thresh3 = cv2.threshold(image, 127, 255, cv2.THRESH_TRUNC)
cv2.imshow('3 THRESH TRUNC', thresh3)

# Values below 127 go to 0, above 127 are unchanged  
ret,thresh4 = cv2.threshold(image, 127, 255, cv2.THRESH_TOZERO)
cv2.imshow('4 THRESH TOZERO', thresh4)

# Reverse of above: below 127 is unchanged, above 127 goes to 0
ret,thresh5 = cv2.threshold(image, 127, 255, cv2.THRESH_TOZERO_INV)
cv2.imshow('5 THRESH TOZERO INV', thresh5)
cv2.waitKey(0) 
    
cv2.destroyAllWindows()
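
The five threshold types can be written out as one-line rules. This is a NumPy sketch of the formulas, assuming the strict `src > thresh` comparison that the OpenCV documentation describes; the string names for the types are my own.

```python
import numpy as np

def threshold(src, thresh, maxval, kind):
    """Apply one of the five basic threshold rules to an array."""
    above = src > thresh
    if kind == "binary":
        return np.where(above, maxval, 0)
    if kind == "binary_inv":
        return np.where(above, 0, maxval)
    if kind == "trunc":
        return np.where(above, thresh, src)
    if kind == "tozero":
        return np.where(above, src, 0)
    if kind == "tozero_inv":
        return np.where(above, 0, src)
    raise ValueError(f"unknown threshold type: {kind}")

pixels = np.array([0, 100, 127, 128, 255])
kinds = ["binary", "binary_inv", "trunc", "tozero", "tozero_inv"]
results = {k: threshold(pixels, 127, 255, k).tolist() for k in kinds}

for kind, values in results.items():
    print(kind, values)
```

Note that a pixel exactly equal to the threshold (127 here) counts as "not above".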

Adaptive thresholding

The Smarter way of doing thresholding.

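cv2.adaptiveThreshold(image, MaxValue, AdaptiveType, ThresholdType, BlockSizeOddNumber, ConstantSubtractedFromMean)

Adaptive Threshold Types:

ADAPTIVE_THRESH_MEAN_C

ADAPTIVE_THRESH_GAUSSIAN_C

THRESH_OTSU (not an adaptive type: it is a flag passed to cv2.threshold that picks a single global threshold automatically)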

import cv2
import numpy as np

# Load our new image
image = cv2.imread('images/Origin_of_Species.jpg', 0)

cv2.imshow('Original', image)
cv2.waitKey(0) 

# Values at or below 127 go to 0 (black), everything above goes to 255 (white)
ret,thresh1 = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)
cv2.imshow('Threshold Binary', thresh1)
cv2.waitKey(0) 

# It's good practice to blur images as it removes noise
image = cv2.GaussianBlur(image, (3, 3), 0)

# Using adaptiveThreshold
thresh = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_MEAN_C, 
                               cv2.THRESH_BINARY, 3, 5) 
cv2.imshow("Adaptive Mean Thresholding", thresh) 
cv2.waitKey(0) 

# Otsu's method picks the threshold automatically, so the 0 passed here is ignored
_, th2 = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imshow("Otsu's Thresholding", th2) 
cv2.waitKey(0) 

# Otsu's thresholding after Gaussian filtering
blur = cv2.GaussianBlur(image, (5,5), 0)
_, th3 = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imshow("Gaussian Otsu's Thresholding", th3) 
cv2.waitKey(0) 

cv2.destroyAllWindows()
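
How Otsu's method chooses its threshold can be sketched in NumPy: try every possible threshold and keep the one that best separates the two groups of pixels (maximum between-class variance). This is a straightforward illustration on a synthetic image, not OpenCV's implementation, and `otsu_threshold` is a name I made up.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold t that maximises between-class variance (pixels <= t vs. > t)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0 = prob[:t + 1].sum()          # fraction of pixels in the dark class
        w1 = 1.0 - w0                    # fraction in the bright class
        if w0 == 0 or w1 == 0:
            continue                     # one class is empty, nothing to separate
        mu0 = (levels[:t + 1] * prob[:t + 1]).sum() / w0
        mu1 = (levels[t + 1:] * prob[t + 1:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

# Synthetic image: two dark rows (20, 30) and two bright rows (200, 210)
gray = np.array([[20] * 8, [30] * 8, [200] * 8, [210] * 8], dtype=np.uint8)

t = otsu_threshold(gray)
binary = np.where(gray > t, 255, 0).astype(np.uint8)
print(t)
print(binary[:, 0])
```

The chosen threshold falls between the dark and bright groups, so no manual value like 127 is needed.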

Dilation and Erosion, Opening and Closing

What is Dilation and Erosion, Opening and Closing?

TLDR:

Dilation - Add pixels to the boundaries of objects in an image
Erosion - Removes pixels at the boundaries of objects in an image
Opening - Erosion then Dilation
Closing - Dilation then Erosion

In OpenCV

OpenCV treats WHITE as the object (foreground).

So on an image with dark objects on a white background, dilation and erosion might do the reverse of what you expect.

Code Implementation

import cv2
import numpy as np

image = cv2.imread('images/opencv_inv.png', 0)

cv2.imshow('Original', image)
cv2.waitKey(0)

# Let's define our kernel size
kernel = np.ones((5,5), np.uint8)

# Now we erode
erosion = cv2.erode(image, kernel, iterations = 1)
cv2.imshow('Erosion', erosion)
cv2.waitKey(0)

# Now we dilate
dilation = cv2.dilate(image, kernel, iterations = 1)
cv2.imshow('Dilation', dilation)
cv2.waitKey(0)

# Opening - Good for removing noise
opening = cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel)
cv2.imshow('Opening', opening)
cv2.waitKey(0)

# Closing - Good for filling small holes inside objects
closing = cv2.morphologyEx(image, cv2.MORPH_CLOSE, kernel)
cv2.imshow('Closing', closing)
cv2.waitKey(0)


cv2.destroyAllWindows()
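
The four operations can be sketched in NumPy with a 3x3 square kernel: dilation takes the maximum of each neighbourhood, erosion takes the minimum. The helper names are my own, and the border handling (treating pixels outside the image as neutral) is an assumption of this sketch.

```python
import numpy as np

def _windows(img, pad_value):
    """Stack the nine 3x3-shifted copies of img, padding the border with pad_value."""
    padded = np.pad(img, 1, mode="constant", constant_values=pad_value)
    h, w = img.shape
    return np.stack([padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)])

def dilate(img):
    return _windows(img, 0).max(axis=0)     # white grows

def erode(img):
    return _windows(img, 255).min(axis=0)   # white shrinks

img = np.zeros((7, 7), dtype=np.uint8)
img[2:5, 2:5] = 255      # a 3x3 white square (the object)
img[0, 6] = 255          # a single white pixel of noise

eroded = erode(img)
dilated = dilate(img)
opened = dilate(erode(img))   # opening: erosion then dilation
closed = erode(dilate(img))   # closing: dilation then erosion

print(np.count_nonzero(img), np.count_nonzero(eroded),
      np.count_nonzero(dilated), np.count_nonzero(opened))
```

Opening removes the isolated noise pixel and restores the square to its original size, which is why it is a common clean-up step.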

Edge Detection & Image Gradients

What is Edge Detection?

Edges can be defined as sudden changes (discontinuities) in an image, and they can encode just as much information as pixels.

Edge Detection Algorithms

3 Main Types of Edge Detection

Sobel - to emphasize vertical or horizontal edges
Laplacian - gets all orientations
Canny - optimal due to low error rate, well defined edges and accurate detection

Canny Edge Detection Algorithm

Developed by John F. Canny in 1986

Applies Gaussian blurring

Finds intensity gradient of the image

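Applies non-maximum suppression (removes pixels that are not edges)

Hysteresis - Applies two thresholds (a pixel above the upper threshold is an edge, a pixel below the lower threshold is not, and a pixel in between is an edge only if it is connected to a pixel above the upper threshold)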
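
The hysteresis step can be sketched as a flood fill: start from the "strong" pixels and grow into neighbouring "weak" pixels. This is a simplified illustration on a made-up gradient array, not OpenCV's implementation; the function name and the 8-neighbour connectivity are assumptions of the sketch.

```python
import numpy as np
from collections import deque

def hysteresis(gradient, low, high):
    """Keep strong pixels (> high) plus weak pixels (>= low) connected to them."""
    strong = gradient > high
    weak = (gradient >= low) & ~strong
    edges = strong.copy()
    queue = deque(zip(*np.nonzero(strong)))   # start from every strong pixel
    h, w = gradient.shape
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and weak[ny, nx] and not edges[ny, nx]:
                    edges[ny, nx] = True      # weak pixel touching an edge becomes an edge
                    queue.append((ny, nx))
    return edges

gradient = np.array([
    [0,   0,  0,  0, 0,  0],
    [0, 150, 80, 70, 0,  0],   # one strong pixel followed by two weak ones
    [0,   0,  0,  0, 0, 90],   # an isolated weak pixel
    [0,  30,  0,  0, 0,  0],   # below the lower threshold
])

edges = hysteresis(gradient, low=50, high=120)
print(edges.astype(int))
```

The two weak pixels attached to the strong one survive, while the isolated weak pixel is discarded even though its gradient is larger than theirs.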

Code Implementation

import cv2
import numpy as np

image = cv2.imread('images/input.jpg',0)

height, width = image.shape

# Extract Sobel Edges
# The arguments after the depth are dx, dy: (1, 0) is the x derivative, (0, 1) is the y derivative
sobel_x = cv2.Sobel(image, cv2.CV_64F, 1, 0, ksize=5)
sobel_y = cv2.Sobel(image, cv2.CV_64F, 0, 1, ksize=5)

cv2.imshow('Original', image)
cv2.waitKey(0)
cv2.imshow('Sobel X', sobel_x)
cv2.waitKey(0)
cv2.imshow('Sobel Y', sobel_y)
cv2.waitKey(0)

sobel_OR = cv2.bitwise_or(sobel_x, sobel_y)
cv2.imshow('sobel_OR', sobel_OR)
cv2.waitKey(0)

laplacian = cv2.Laplacian(image, cv2.CV_64F)
cv2.imshow('Laplacian', laplacian)
cv2.waitKey(0)





# Canny needs two values: threshold1 and threshold2. Any gradient value larger than threshold2
# is considered to be an edge. Any value below threshold1 is considered not to be an edge.
# Values in between threshold1 and threshold2 are classified as edges or non-edges based on how their
# intensities are "connected". In this case, any gradient values below 50 are considered non-edges
# whereas any values above 120 are considered edges.

# Canny Edge Detection uses gradient values as thresholds
# The first threshold (50) is the lower bound, the second (120) is the upper bound
canny = cv2.Canny(image, 50, 120)
cv2.imshow('Canny', canny)
cv2.waitKey(0)

cv2.destroyAllWindows()
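
As a small check of which derivative responds to which edge direction, here is a NumPy sketch using the standard 3x3 Sobel kernels (the code above uses ksize=5, so this is a simplified stand-in). The patch values are made up.

```python
import numpy as np

# Standard 3x3 Sobel kernels: x derivative and y derivative
sobel_x_kernel = np.array([[-1, 0, 1],
                           [-2, 0, 2],
                           [-1, 0, 1]], dtype=np.float64)
sobel_y_kernel = sobel_x_kernel.T

# A patch containing a vertical edge: dark on the left, bright on the right
patch = np.array([[0, 0, 255],
                  [0, 0, 255],
                  [0, 0, 255]], dtype=np.float64)

# Filter response at the centre pixel
gx = float(np.sum(patch * sobel_x_kernel))   # large: intensity changes along x
gy = float(np.sum(patch * sobel_y_kernel))   # zero: nothing changes along y

print(gx, gy)
```

So the x derivative highlights vertical edges and the y derivative highlights horizontal ones.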

Mapping of points between images

Non Affine / Projective / Perspective Transformations

Getting Perspective Transform

import cv2
import numpy as np
import matplotlib.pyplot as plt

image = cv2.imread('images/scan.jpg')

cv2.imshow('Original', image)
cv2.waitKey(0)

# Coordinates of the 4 corners of the document in the original image
points_A = np.float32([[320,15], [700,215], [85,610], [530,780]])

# Coordinates of the 4 corners of the desired output
# We use the ratio of A4 paper, 1 : 1.41
points_B = np.float32([[0,0], [420,0], [0,594], [420,594]])
 
# Use the two sets of four points to compute 
# the Perspective Transformation matrix, M    
M = cv2.getPerspectiveTransform(points_A, points_B)
 
warped = cv2.warpPerspective(image, M, (420,594))
 
cv2.imshow('warpPerspective', warped)
cv2.waitKey(0)
cv2.destroyAllWindows()
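
What the perspective matrix encodes can be shown by solving for it directly. The sketch below sets up the eight linear equations given by the four point pairs and solves them with NumPy, fixing the bottom-right entry to 1. It illustrates the idea under that assumption and is not necessarily how OpenCV computes it; the function names are my own.

```python
import numpy as np

def perspective_from_points(src, dst):
    """Solve for the 3x3 matrix mapping four src points onto four dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each point pair contributes two equations in the eight unknowns
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=np.float64),
                        np.array(rhs, dtype=np.float64))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(M, x, y):
    """Map one point through M, dividing by the third coordinate."""
    u, v, w = M @ np.array([x, y, 1.0])
    return u / w, v / w

points_A = [(320, 15), (700, 215), (85, 610), (530, 780)]
points_B = [(0, 0), (420, 0), (0, 594), (420, 594)]

M = perspective_from_points(points_A, points_B)
mapped = [warp_point(M, x, y) for x, y in points_A]
print(np.round(mapped, 3))
```

Each source corner lands on its target corner, which is the property cv2.warpPerspective relies on when it remaps every pixel.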

Affine Transform

In affine transforms you only need 3 point pairs to obtain the correct transform.

import cv2
import numpy as np
import matplotlib.pyplot as plt

image = cv2.imread('images/ex2.jpg')
rows,cols,ch = image.shape

cv2.imshow('Original', image)
cv2.waitKey(0)

# Coordinates of 3 points in the original image
points_A = np.float32([[320,15], [700,215], [85,610]])

# Coordinates of the same 3 points in the desired output
# We use the ratio of A4 paper, 1 : 1.41
points_B = np.float32([[0,0], [420,0], [0,594]])
 
# Use the two sets of three points to compute 
# the Affine Transformation matrix, M    
M = cv2.getAffineTransform(points_A, points_B)

warped = cv2.warpAffine(image, M, (cols, rows))
 
cv2.imshow('warpAffine', warped)
cv2.waitKey(0)
cv2.destroyAllWindows()
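
Three point pairs are enough because a 2x3 affine matrix has six unknowns and each pair gives two equations. A minimal NumPy sketch of solving for it (an illustration of the idea, with a function name I made up, not OpenCV's implementation):

```python
import numpy as np

def affine_from_points(src, dst):
    """Solve for the 2x3 affine matrix mapping three src points onto three dst points."""
    src_h = np.hstack([np.asarray(src, dtype=np.float64), np.ones((3, 1))])  # rows [x, y, 1]
    dst = np.asarray(dst, dtype=np.float64)
    # Solve src_h @ M.T = dst for M
    return np.linalg.solve(src_h, dst).T

points_A = [(320, 15), (700, 215), (85, 610)]
points_B = [(0, 0), (420, 0), (0, 594)]

M = affine_from_points(points_A, points_B)

# Check: mapping the source points through M gives the destination points
src_h = np.hstack([np.array(points_A, dtype=np.float64), np.ones((3, 1))])
mapped = (M @ src_h.T).T
print(np.round(mapped, 3))
```

The three source points must not lie on one line, otherwise the system has no unique solution.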