Tutorial: Finding Rectangles with NO OpenCV

Task: For any given image write a program to find all rectangular objects which are darker than the background. Draw the located rectangles on the output image.
Method: Linear Hough transform with advanced filtering
Restrictions: Solid colour rectangles only, Many wide-bordered rectangles slow down the program
Tutorial type: Algorithm oriented, written in Python
Requirements: Python 2.7, Numpy 1.9.2, Matplotlib 1.4.3, OpenCV 2.3 (used for Canny only!)

Intro

The tutorial code is written in Python, but it's well readable and can be implemented in Matlab, R, JAVA or any other programming language. In order to understand the subject more all the functions are built bottom-up with no outside resources used. Some examples of how the program performs are presented below:

  • Example1
  • Example2
  • Example3
  • Example4

To find partly not visible rectangles there is possibility to slightly modify the code.

Prepare Image

First we open an image, convert it to grayscale, blur and find edges. This is the only place where we will be using OpenCV. Image blurring and finding edges are crucial to the success of finding rectangles.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
import matplotlib.pyplot as plt    #Used for plotting
import matplotlib.image as mpimg   #Used for image read
import cv2                         #Used for Canny only!
import numpy as np                 #Basic calculations library

import customFunctions             #We will define some functions ourselves

image_rgb = mpimg.imread('rectangles/rectangles.jpg')    #Reading image to array
image_gray = customFunctions.rgb2gray(image_rgb)         #Converting rgb to gray

image_gray = customFunctions.blurImage(image_gray)       #Blurring gray image with custom function similar to OpenCV.filter2d
                                                         #find it in paragraph Other Custom Functions Used
edged = cv2.Canny(image_gray, 30, 130)                   #Finding edges, This is the only place we use OpenCV

#Sometimes images have unnecessary lines at the edges and we don't whant to find them
##Zero out the border 5px
borderLen = 5                         #The width to zero out the borders, counted in pixels
lenx, leny = edged.shape

edged[0:borderLen,0:leny] = 0
edged[lenx-borderLen:lenx,0:leny] = 0
edged[0:lenx,0:borderLen] = 0
edged[0:lenx,leny-borderLen:leny] = 0
plt.imshow(edged)                     #Take a look at the edged picture

##Save edged picture to jpg
fig, ax1 = plt.subplots(ncols=1, nrows=1, figsize=(8, 4))
ax1.set_axis_off()
ax1.imshow(edged, cmap="bone")
fig.savefig("Rectangles_edged.jpg")

Here is the saved output files for both original and edged versions.

  • Rectangles
  • Rectangles_Edged

Hough Transform and Clustering

Hough Transform

For Hough transformation we will define our own function:

1
2
3
4
5
6
rho,theta = customFunctions.houghLines(edged,
                                       rho_res=1,
                                       theta_res=1,
                                       thresholdVotes=30,
                                       filterMultiple=5,
                                       thresholdPixels=0)

The full function looks like this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
def houghLines(edged,rho_res,theta_res,thresholdVotes,filterMultiple,thresholdPixels=0):
    
    rows, columns = edged.shape
    theta = np.linspace(-90.0, 0.0, np.ceil(90.0/theta_res) + 1.0)
    theta = np.concatenate((theta, -theta[len(theta)-2::-1]))
    
    #defining empty Matrix in Hough space, where x is for theta and y is x*cos(theta)+y*sin(theta)
    diagonal = np.sqrt((rows - 1)**2 + (columns - 1)**2)
    q = np.ceil(diagonal/rho_res)
    nrho = 2*q + 1
    rho = np.linspace(-q*rho_res, q*rho_res, nrho)
    houghMatrix = np.zeros((len(rho), len(theta)))
    
    #Here we populate houghMatrix
    for rowId in range(rows):                               #for each x in edged
        for colId in range(columns):                        #for each y in edged
          if edged[rowId, colId]>thresholdPixels:           #edged has values 0 or 255 in our case
            #for each theta we calculate rhoVal, then locate it in Hough space plane
            for thId in range(len(theta)):
              rhoVal = colId*np.cos(theta[thId]*np.pi/180.0) + \
                  rowId*np.sin(theta[thId]*np.pi/180)
              rhoIdx = np.nonzero(np.abs(rho-rhoVal) == np.min(np.abs(rho-rhoVal)))[0]
              houghMatrix[rhoIdx[0], thId] += 1  
            
   
   #cluster and filter multiple dots in Houghs plane
    if filterMultiple>0:
        clusterDiameter=filterMultiple
        values=np.transpose(np.array(np.nonzero(houghMatrix>thresholdVotes)))
        filterArray=[]
        filterArray.append(0)
        totalArray=[]
        for i in range (0, len(values)):
            if i in filterArray[1::]:
                continue
            tempArray=[i]
            for j in range (i+1, len(values)):
                if j in filterArray[1::]:
                    continue
                for k in range (0, len(tempArray)):
                    if getLength(values[tempArray[k]],values[j])<clusterDiameter:
                        filterArray.append(j)
                        tempArray.append(j)
                        break
            totalArray.append(tempArray)
        
        #leave the highest value in each cluster
        for i in range (0, len(totalArray)):
             for j in range (0, len(totalArray[i])):
                 if j==0:
                     highest=houghMatrix[values[totalArray[i][j]][0],values[totalArray[i][j]][1]]
                     ii=i
                     jj=j
                 else:
                     if houghMatrix[values[totalArray[i][j]][0],values[totalArray[i][j]][1]]>=highest:
                         highest=houghMatrix[values[totalArray[i][j]][0],values[totalArray[i][j]][1]]
                         houghMatrix[values[totalArray[ii][jj]][0],values[totalArray[ii][jj]][1]]=0
                         ii=i
                         jj=j
                     else:
                         houghMatrix[values[totalArray[i][j]][0],values[totalArray[i][j]][1]]=0
                    
    return (np.where(houghMatrix>thresholdVotes)[0]-q)*rho_res, theta[np.where(houghMatrix>thresholdVotes)[1]]*np.pi/180.0

Some questions that may arise from the code:
Why Hough matrix has such dimensions? Dimension 1 ranges are defined by maximum value for sin(x)+cos(x) which is sqrt(2). So it's maximum is (numColumnsImage+numRowsImage)*sqrt(2). Dimension 2 is theta which ranges from -90 to 90 in degrees. Every dimension has it's resolution which is taken as parameter to the function, which divides the number of points we will count.
Why we use thresholdPixels? We want to consider only points that are edges and in some cases edged image may be not boolean and we want to filter it by threshold.
Why we use filtering? Well, it's unnecessary, but it speeds up the program significantly. On the other hand it sometimes might produce wrong results, for example when we have multiple rectangles in the image, where their contour lines are very nearby in Hough space (same angle, or nearby r value [r=x*cos(th)+y*sin(th)]).
Next chapter will describe filtering in details.

Clustering and Filtering

We have 7 real lines in our image. We may choose appropriate thresholdVotes parameter to filter out unnecessary points in Hough plane, but usually the process is automatic and the program does not know the correct thresholdVotes value. For this reason we got 1060 of points after Hough transform and only 26 points after using clustering and filtering. So we have 19 false-positives for now. Points higher than thresholdVotes in Hough plane before and after filtering look like that:

  • Hough_Points_Before_Filtering
  • Hough_Points_Filtered

The results on the image looks like this - found lines before and after filtering:
  • Found_Lines_Before_Filtering
  • Found_Lines_Filtered

For easy plotting found lines in Hough space knowing rho and theta we define our own function plotHoughLines:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
def plotHoughLines(rho,theta,image):
  
    a = np.cos(theta)
    b = np.sin(theta)
    x0 = a*rho
    y0 = b*rho

    fig2, ax1 = plt.subplots(ncols=1, nrows=1)
    ax1.imshow(image)
    
    for i in range (0, len(rho)):   
        ax1.plot( [x0[i] + 1000*(-b[i]), x0[i] - 1000*(-b[i])],
                  [y0[i] + 1000*(a[i]), y0[i] - 1000*(a[i])], 
                  'xb-',linewidth=3)
    
    ax1.set_ylim([image.shape[0],0])
    ax1.set_xlim([0,image.shape[1]])
    
    plt.show()

Find Corners from Hough Lines

Continue the main program using found rho and theta for every found line. Next steps are to find all parallel lines, to pair them, find perpendicular pairs, and solve linear equations to find corners at each intersection.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
##Parameters for finding parallel lines
theta2=theta+np.pi/2
theta3=theta-np.pi/2
difference=np.pi/90
differenceRho=2

#find parallel pairs
accumParallel=[]
for i in range (0, len(theta)):
    for j in range (i+1, len(theta)):
        if theta[j]>(theta[i]-difference) and theta[j]<(theta[i]+difference):
             if rho[j]<(rho[i]-differenceRho) or rho[j]>(rho[i]+differenceRho):
                accumParallel.append([i,j])

fourLines=[]

#find perpendicular pairs for parallel pairs
for i in range (0, len(accumParallel)):
    for j in range (1, len(accumParallel)):
        if     (theta2[accumParallel[j][0]]>(theta[accumParallel[i][0]]-difference) and theta2[accumParallel[j][0]]<(theta[accumParallel[i][0]]+difference)) \
            or (theta2[accumParallel[j][1]]>(theta[accumParallel[i][0]]-difference) and theta2[accumParallel[j][1]]<(theta[accumParallel[i][0]]+difference)) \
            or (theta2[accumParallel[j][0]]>(theta[accumParallel[i][1]]-difference) and theta2[accumParallel[j][0]]<(theta[accumParallel[i][1]]+difference)) \
            or (theta2[accumParallel[j][1]]>(theta[accumParallel[i][1]]-difference) and theta2[accumParallel[j][1]]<(theta[accumParallel[i][1]]+difference)) \
            or (theta3[accumParallel[j][0]]>(theta[accumParallel[i][0]]-difference) and theta3[accumParallel[j][0]]<(theta[accumParallel[i][0]]+difference)) \
            or (theta3[accumParallel[j][1]]>(theta[accumParallel[i][0]]-difference) and theta3[accumParallel[j][1]]<(theta[accumParallel[i][0]]+difference)) \
            or (theta3[accumParallel[j][0]]>(theta[accumParallel[i][1]]-difference) and theta3[accumParallel[j][0]]<(theta[accumParallel[i][1]]+difference)) \
            or (theta3[accumParallel[j][1]]>(theta[accumParallel[i][1]]-difference) and theta3[accumParallel[j][1]]<(theta[accumParallel[i][1]]+difference)):
                fourLines.append(np.concatenate([accumParallel[i],accumParallel[j]],1))
    
fourLines=customFunctions.unique(fourLines)

#solve 4 of equations for 4 corners
#rho_j = x cos(theta_j) + y sin(theta_j)
#rho_k = x cos(theta_k) + y sin(theta_k)
corners=[]
for quads in range (0, len(fourLines)):
    cornersTemp=[]
    for lines in range (0,4):
        if lines in [0,1]:
            linesi=lines
            nexti=2
        if lines == 2:
            linesi=0
            nexti=3
        if lines == 3:
            linesi=1
            nexti=1
        b=np.array([rho[fourLines[quads][linesi]],rho[fourLines[quads][linesi+nexti]]])
        a=np.array([[np.cos(theta[fourLines[quads][linesi]]),
                     np.sin(theta[fourLines[quads][linesi]])],
                    [np.cos(theta[fourLines[quads][linesi+nexti]]),
                     np.sin(theta[fourLines[quads][linesi+nexti]])]])
        ans = np.linalg.solve(a, b)
        cornersTemp.append([int(ans[0]),int(ans[1])])
    corners.append(cornersTemp)

#reorder corners
corners=customFunctions.reorderPoints(corners)

Filter Wrong and Duplicate Rectangles

This part consists of four types of filtering to avoid false-positives. We will discuss each of them one by one.

Remove if small: This is the most simple check, where the rectangle is deleted if any of it's sides is shorter than specific value. In our case we choose it to be 20 pixels.
Remove if standard deviation inside rectangle is high: If standard deviation is high, that may mean that rectangles has ended somewhere inside area of our corners. But this check is the one which restricts the code work only with solid rectangles. 

Remove if brighter: The requirement by task is fulfilled by checking the outer color of the rectangle. It is completed by selecting four points outside rectangle in the line from the midpoint to all for middles of sides.
Remove if many in same place: After all the filtering there still is a problem where many rectangles are correctly found, but the lines differs by few pixels. We run extra check to leave only rectangles best fitting the edged image.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
#check if valid rectangles
for i in range (len(corners)-1,-1,-1):

    minx=np.min(np.array(corners[i])[:,0])
    maxx=np.max(np.array(corners[i])[:,0])
    miny=np.min(np.array(corners[i])[:,1])
    maxy=np.max(np.array(corners[i])[:,1]) 
    
    height=customFunctions.getLength(corners[i][0],corners[i][1])
    width=customFunctions.getLength(corners[i][2],corners[i][1])
    
    #remove wrongs if small
    if height<20 or width<20 or maxy-miny<10 or maxx-minx<10:
        del corners[i]
        continue
    
    
    xlin=np.array(np.linspace(corners[i][0][0],corners[i][2][0],20)).astype(int)
    ylin=np.array(np.linspace(corners[i][0][1],corners[i][2][1],20)).astype(int)
    xlin2=np.array(np.linspace(corners[i][1][0],corners[i][3][0],20)).astype(int)
    ylin2=np.array(np.linspace(corners[i][1][1],corners[i][3][1],20)).astype(int)
    
    #remove wrong rectangle if std
    std=np.std(np.concatenate([image_gray[(ylin[2:-2]),(xlin[2:-2])],image_gray[(ylin2[2:-2]),(xlin2[2:-2])]]))
    if std>7:
        del corners[i]
        continue
    
    #remove wrongs if brighter
    averageInside=np.average(np.concatenate([image_gray[(ylin[2:-2]),(xlin[2:-2])],image_gray[(ylin2[2:-2]),(xlin2[2:-2])]]))
    corners[i].append(corners[i][0])
    delete=0
    pixelsFromBorder=5
    middlePoint=np.array([(corners[i][0][0]+corners[i][2][0])/2,(corners[i][0][1]+corners[i][2][1])/2])
    for j in range (0,4):        
        y=pixelsFromBorder*np.sin(customFunctions.getAngle([0,0],[1,0],(np.array(corners[i][j+1])+np.array(corners[i][j]))/2-middlePoint,False))
        x=pixelsFromBorder*np.cos(customFunctions.getAngle([0,0],[1,0.01],(np.array(corners[i][j+1])+np.array(corners[i][j]))/2-middlePoint,False))
        a=(np.array(corners[i][j+1])+np.array(corners[i][j]))/2+[int(x),int(y)]
        if image_gray[a[1],a[0]]-averageInside<5:
            delete=1
    if delete==1:
        del corners[i]
        continue
#        
#Removing duplicates
#Here we remove duplicate rectangles by finding the overlapping ones
sumi=np.zeros(len(corners))
middlePoint=np.zeros((len(corners),2))
for i in range (len(corners)-1,-1,-1):
    middlePoint[i]=np.array([(corners[i][0][0]+corners[i][2][0])/2,(corners[i][0][1]+corners[i][2][1])/2])
    #check for edges
    for j in range (0,4):
        x1=corners[i][j][0]
        x2=corners[i][j+1][0]
        y1=corners[i][j][1]
        y2=corners[i][j+1][1]
        if x1!=x2:
            m = (y2-y1)/(1.0*(x2-x1))
            q = y2-m*x2
            x = np.linspace(np.min([x1,x2]),np.min([np.max([x1,x2]),edged.shape[1]-5]),np.absolute(x2-x1)+1)
            y = m*x+q
        else:
            y = np.linspace(np.min([y1,y2]),np.min([np.max([y1,y2]),edged.shape[1]-5]),np.max([y1,y2])-np.min([y1,y2])+1)
            x = x1*np.ones(len(y))
        sumi[i] += np.sum(edged[np.round(y).astype(int),np.round(x).astype(int)])/255.0

maxDistance=10
corners2=[]
corners2.append(len(corners)-1)
for i in range (len(corners)-2,-1,-1):
    found=0
    for j in range (len(corners2)-1,-1,-1):
        if customFunctions.getLength(middlePoint[corners2[j]],middlePoint[i])<=maxDistance:
            found=1
            if sumi[i]>sumi[corners2[j]]:
                corners2[j]=i
    if found==0:
        corners2.append(i)
        
fig2, ax1 = plt.subplots(ncols=1, nrows=1, figsize=(8, 4))
ax1.imshow(image_rgb)
ax1.set_axis_off()
for i in corners2:   
    corners[i].append(corners[i][0])
    for j in range (0,4):
        ax1.plot( [corners[i][j][0],corners[i][j+1][0]],[corners[i][j][1],corners[i][j+1][1]], 'xb-',linewidth=3)

ax1.set_ylim([image_gray.shape[0],0])
ax1.set_xlim([0,image_gray.shape[1]])
fig2.savefig("Example4.jpg")
if len(corners)>0:
    plt.show()
else:
    print ('No rectangles were found')

 Here is the final result:

  • Rectangles_Found

Other Custom Functions Used

Here we define some other custom function that we used in the tutorial.

Length between two points (given coordinates) in Cartesian plane

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
def getLength(startPoint,secondPoint):
    ##Inputs:
    #startPoint - [x,y]
    #secondPoint - [x,y]

    ##Outputs:
    #lenv - length between two points

    v1x=secondPoint[0]-startPoint[0]
    v1y=secondPoint[1]-startPoint[1]
    lenv=np.sqrt(v1x*v1x+v1y*v1y)
    return lenv

 Find all unique rows in a list by first reordering each row ascending

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
def unique(a):
    ##Inputs:
    #a - list of 1xN arrays

    ##Outputs:
    #b - filtered array

    #Example
    # a=[array([ 1,  3, 12, 17]),
    #    array([ 1,  3, 17, 12]),
    #    array([ 1,  3, 18, 20])]
    # b -> [array([ 1,  3, 12, 17]),
    #       array([ 1,  3, 18, 20])]


    b=np.array(a)
    a=np.sort(np.array(a))
    order = np.lexsort(a.T)
    a = a[order]
    b = b[order]
    diff = np.diff(a, axis=0)
    ui = np.ones(len(a), 'bool')
    ui[1:] = (diff != 0).any(axis=1) 
    return b[ui]

Reorder corners of quadrangle in such a way that nearby corners follows each other

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
def reorderPoints(corners):
    ##Inputs:
    #corners - list of corners (look at example)

    ##Outputs:
    #array - reordered corners array

    #Example
    #corenrs=[[153, 104], [255, 98], [178, 144], [231, 58]]
    #array -> [[153, 104], [178, 144], [255, 98], [231, 58]]
    
    array=[]
    for i in range (0, len(corners)):
        tempArray=[]
        length1=getLength(corners[i][0],corners[i][1])
        length2=getLength(corners[i][0],corners[i][2])
        length3=getLength(corners[i][0],corners[i][3])
        lenArr=np.array([length1,length2,length3])
        tempArray.append(corners[i][0])
        tempArray.append(corners[i][1+np.where(np.array(lenArr)==np.min(lenArr))[0][0]])
        lenArr[np.where(np.array(lenArr)==np.min(lenArr))[0][0]]+=-0.00001 #in case of square
        tempArray.append(corners[i][1+np.where(np.array(lenArr)==np.max(lenArr))[0][0]])
        tempArray.append(corners[i][1+np.where(np.array(lenArr)==np.median(lenArr))[0][0]])
        array.append(tempArray)
    return array

Get angle in radians between two vectors defined by points in coordinate system (absolute value or signed value)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
def getAngle(startPoint,secondPoint,thirdPoint, absol=True):
    #Gets angle between vectors (startPoint,secondPoint) and vector
    #(secondPoint,thirdPoint)
    
    ##Inputs:
    #startPoint - [x,y]
    #secondPoint - [x,y]
    #thirdPoint - [x,y]

    ##Outputs:
    #angle - angle between two vectors in radians 

    
    v1x=secondPoint[0]-startPoint[0]
    v1y=secondPoint[1]-startPoint[1]
    v2x=thirdPoint[0]-startPoint[0]
    v2y=thirdPoint[1]-startPoint[1]
    
    lenv1=np.sqrt(v1x*v1x+v1y*v1y)
    lenv2=np.sqrt(v2x*v2x+v2y*v2y)
    
    angle=np.arccos((v1x*v2x+v1y*v2y)/(lenv1*lenv2))
    
    a=1
    if absol==False:
        a = np.sign((v1x) * (v2y) - (v1y) * (v2x))
    
    if np.absolute(angle) < 0.02:
        angle=0
    return a*angle

Convert rgb image to grayscale

1
2
3
4
5
6
def rgb2gray(image_rgb):

    r, g, b = image_rgb[:,:,0], image_rgb[:,:,1], image_rgb[:,:,2]
    image_gray = np.round(0.2989 * r + 0.5870 * g + 0.1140 * b).astype(np.uint8)

    return image_gray

Blur image using flat 2x2 kernel (averaging)

1
2
3
4
5
6
7
8
9
def blurImage(image_gray):
    kernel = np.ones((2,2),np.float32)/4                     #Blurring kernel
    #We will skip first line and first column to keep it more simple, it's zeroed out anyway.
    #now for every pixel we change it with the average of 4 pixels(as the kernel): itself, pixel to left
    #pixel up, and pixel up-left. It drifts some edges one pixel to bottom-down, but it does not matter as
    #long as we use edged picture for the future work
    res=sp.convolve2d(image_gray,kernel,mode='same')
    
    return np.round(res).astype(np.uint8)

Download the Code

The source code is available on GitHub.