In the previous article, we detected the skew with a method that works, but not for all documents. When a document had different characteristics, such as higher contrast or a single block of text, the threshold used in the function failed to detect the lines of text. In addition, the search for horizontal or vertical structural elements failed for inclinations of about 40 to 60 degrees, clockwise or counter-clockwise.
Although the Hough transform is a great tool for finding lines in an image, we have to vary its parameters to get enough lines to compute an accurate skew angle.
In this article, we change the method: we detect and correct the skew with the Hough transform, without morphological operations.
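To make the idea concrete, here is a minimal pure-Python sketch of the voting principle behind the Hough transform. It is not the OpenCV implementation: the function name hough_peak and the synthetic points are assumptions made for illustration only. Each point votes for every line (rho, theta) that could pass through it, and the most voted pair gives the dominant line.

```python
import math
from collections import Counter

def hough_peak(points, rho_res=1.0):
    """Return (theta_deg, rho_bin, votes) of the most voted line through points."""
    votes = Counter()
    for theta_deg in range(180):                  # 1-degree angular resolution
        t = math.radians(theta_deg)
        cos_t, sin_t = math.cos(t), math.sin(t)
        for x, y in points:
            # normal form of a line: rho = x*cos(theta) + y*sin(theta)
            rho = x * cos_t + y * sin_t
            votes[(theta_deg, round(rho / rho_res))] += 1
    (theta_deg, rho_bin), count = votes.most_common(1)[0]
    return theta_deg, rho_bin, count

# synthetic "text line": 200 points on a line tilted by 3 degrees
slope = math.tan(math.radians(3))
points = [(x, 5 + x * slope) for x in range(200)]

theta_deg, rho_bin, count = hough_peak(points)
# theta is the angle of the normal to the line, so a level line has theta = 90
skew_deg = theta_deg - 90
print(theta_deg, skew_deg, count)
```

Because theta measures the normal to the line, the skew is read relative to 90 degrees, which is why the rest of this article keeps subtracting pi/2.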
Since we do not know the inclination or its direction, we have to make working assumptions. Given one document inclined at 45° and another inclined at -135°, we cannot determine which rotation to apply without analyzing the orientation of the text.
We assume that the majority of the documents will have an inclination between -45 and 45 degrees with text orientation right side up. If the skew is outside those limits, the document could be upside down after the skew correction. The analysis of the orientation of the document will later make it possible to restore it.
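The ambiguity can be illustrated with a few lines of Python. This is only a sketch, and the helper names fold_to_half_turn and within_assumed_range are ours: the direction of a line is only defined modulo 180 degrees, so two pages that differ by half a turn produce the same line angles.

```python
def fold_to_half_turn(angle_deg):
    """Fold an angle into [-90, 90), the range in which a line direction is unique."""
    return (angle_deg + 90) % 180 - 90

def within_assumed_range(angle_deg, limit_deg=45):
    """True when the skew lies inside the working assumption of this article."""
    return -limit_deg <= angle_deg <= limit_deg

# a page tilted by 45 degrees and one tilted by -135 degrees give the same line angle
same_line = fold_to_half_turn(45) == fold_to_half_turn(-135)
print(fold_to_half_turn(45), fold_to_half_turn(-135), same_line)
print(within_assumed_range(45), within_assumed_range(-135))
```

Lines alone cannot tell the two pages apart, which is why the orientation of the text has to be analyzed separately.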
The following graph shows the principles that will be used to detect the inclination of a document.
For each line, if the angle theta given by the HoughLines function is:
Below is the script, page_findskew.py, in which the method is implemented.
Usage: python page_findskew.py --image image --output folder
In the original script, lines 1-21 are comments.
# import the necessary packages
import argparse # argument parser
import numpy as np # fundamental package for scientific computing
import math # mathematical functions defined by the C standard
import cv2 # computer vision image processing
import matplotlib.pyplot as plt # 2D plotting
import os # operating system dependent functionality
import h7as.utils as h7as # Some functions defined as this study progresses.
We have put shared functions such as convert_to_grayscale and puttext_with_bgcolor in the module “h7as.utils”.
# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
help="path to the input image")
ap.add_argument("-o", "--output", required=True,
help="path to the output folder")
args = vars(ap.parse_args())
These are the usual arguments: the path to the image to load and the path to the folder in which to save the corrected image.
# load the image
img = cv2.imread(args["image"])
if img is not None:
    rc, skew, imgenhanced = h7as.find_skew_and_straighten(img, skew_limit=np.pi/4)
    # display original image and enhanced one side by side
    # create the figure and the axes
    f, axarr = plt.subplots(1, 2)
    plt.suptitle("Handwritten from registers of civil status - image enhancement - angle %.4f" % (math.degrees(skew) - 90))
    # adjust spacing
else:
    print("Image not found: ", args["image"])
We start by loading the image, then we check that it has been found.
After this check, we call the skew detection and correction function with the image and the skew limit.
The limit selects the search range: either between pi/4 and 3 pi/4, or between 0 and pi.
If everything went fine, we save the straightened image and display the result as below.
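The saving step is not shown in the listing above. As a minimal sketch, the output path could be derived from the two command-line arguments as follows. The helper name build_output_path and the "_straightened" suffix are assumptions of ours, not part of the original script.

```python
import os

def build_output_path(image_path, output_folder, suffix="_straightened"):
    """Place the corrected image in output_folder, keeping the original extension."""
    base, ext = os.path.splitext(os.path.basename(image_path))
    return os.path.join(output_folder, base + suffix + ext)

out_path = build_output_path(os.path.join("scans", "page_01.jpg"), "out")
print(out_path)
# the corrected image could then be written with cv2.imwrite(out_path, imgenhanced)
```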
Let’s go a little further and look at the find_skew_and_straighten function.
# function to find the skew of an image if any and untilt it
def find_skew_and_straighten(img, skew_limit=np.pi/4):
    # find skew, passing the caller's limit through
    rc, skew = find_skew(img, skew_limit=skew_limit)
    # set rotation angle
    theta_to_rotate = (math.degrees(skew) - 90)
    # rotate the image if skewed
    if skew != 0.0:
        (h, w) = img.shape[:2]
        center = (w // 2, h // 2)
        map_matrix = cv2.getRotationMatrix2D(center, theta_to_rotate, 1.0)
        # replicate the border to fill the corners exposed by the rotation
        imgrotated = cv2.warpAffine(img, map_matrix, (w, h),
                                    borderMode=cv2.BORDER_REPLICATE)
    else:
        imgrotated = img
    return rc, skew, imgrotated
This function calls find_skew with the same parameters, then rotates the image if the skew found is not zero. The rotation angle is computed in degrees relative to the desired orientation of pi / 2.
It then returns a status code, the angle in radians, and the image, rotated or not.
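The conversion from the Hough angle to the rotation angle can be checked in isolation. The helper name rotation_from_theta is ours; the formula is the one used in the function above.

```python
import math

def rotation_from_theta(theta):
    """Rotation in degrees relative to pi/2, the Hough angle of a level text line."""
    return math.degrees(theta) - 90

level = rotation_from_theta(math.pi / 2)         # level line: no rotation needed
tilted = rotation_from_theta(math.radians(93))   # about 3 degrees
print(level, round(tilted, 6))
```

A returned skew of pi/2 therefore means that the text lines are already level, while a skew of 0.0 is the value find_skew returns when nothing was detected, in which case the image is left untouched.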
The OpenCV 3.3 documentation explains how to set up the rotation matrix and the affine transformation.
We choose to replicate the border when rotating the image.
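For readers who want to see what the rotation matrix contains, here is a pure-Python sketch of the 2x3 matrix described in the OpenCV documentation for getRotationMatrix2D. The helper names rotation_matrix_2d and apply_affine are ours; in the script itself we simply call OpenCV.

```python
import math

def rotation_matrix_2d(center, angle_deg, scale=1.0):
    """2x3 affine matrix rotating around center; positive angles are counter-clockwise."""
    cx, cy = center
    a = scale * math.cos(math.radians(angle_deg))
    b = scale * math.sin(math.radians(angle_deg))
    return [[a, b, (1 - a) * cx - b * cy],
            [-b, a, b * cx + (1 - a) * cy]]

def apply_affine(m, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    x, y = point
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

center = (50, 40)
m = rotation_matrix_2d(center, 90)
fixed = apply_affine(m, center)      # the center stays where it is
moved = apply_affine(m, (60, 40))    # a point right of the center moves above it
print(fixed, moved)
```

Since the image origin is the top-left corner, a point that moves from the right of the center to above it has turned counter-clockwise, which matches the sign convention of the OpenCV function.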
Now, let’s go through the main function for tilt search: find_skew.
# function to find the skew of an image if any
def find_skew(img, upper_threshold=600, angle_res=np.pi/180, skew_limit=np.pi/4):
    # angle in radian
    angle = 0.
    # convert the image to gray scale
    imggrayed = convert_to_grayscale(img)
    # adaptive thresholding on reverse gray scale image
    # ADAPTIVE_THRESH_MEAN_C : threshold value is the mean of neighbourhood area.
    # Block Size - It decides the size of neighbourhood area.
    # C - It is just a constant which is subtracted from the mean or weighted mean calculated.
    imgbw = cv2.adaptiveThreshold(~imggrayed, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 15, -2)
    # variables initialization
    left_tendency, right_tendency, no_tendency = 0, 0, 0
    anglestoleft, anglestoright = 0., 0.
    lhorizontal_count, lvertical_count = 0, 0
    half_pi = math.pi / 2
    quarter_pi = math.pi / 4
    three_quarter_pi = 3 * (math.pi / 4)
    # look for lines, lowering the accumulator threshold until enough are found
    for tresh in range(upper_threshold, 100, -10):
        lines = cv2.HoughLines(imgbw, 1, angle_res, tresh)
        if lines is not None and len(lines) > 20:
            break
# find if the rotation is clockwise or anti clockwise.
for line in lines:
for rho,theta in line:
# is it a vertical line or horizontal one ?
# this is just for information, not used
if theta >= quarter_pi and theta <= three_quarter_pi:
lvertical_count += 1
elif theta < quarter_pi or theta > three_quarter_pi:
lhorizontal_count += 1
# is the skew clockwise or counterclockwise ?
if theta in (0.,half_pi,math.pi,three_quarter_tau):
no_tendency += 1
indicator += "-N"
elif theta > three_quarter_pi:
elif theta > half_pi:
left_tendency += 1
anglestoleft += theta
right_tendency += 1
anglestoright += theta
# compute the mean of the angle to rotate the image depending on left and right tendency
mean_anglestoleft = anglestoleft / left_tendency if left_tendency > 0 else 0
mean_anglestoright = anglestoright / right_tendency if right_tendency > 0 else 0
# images with skew greater than 45° or less than -45°
# could be up side down after rotation if skew_limit == np.pi / 4
# if skew_limit == np.pi all skew clockwise will be up side down
if no_tendency > right_tendency and \
no_tendency > left_tendency:
angle = half_pi
elif right_tendency > left_tendency:
angle = mean_anglestoright
if skew_limit == np.pi / 4:
angle = mean_anglestoleft
elif mean_anglestoleft > math.pi:
angle = math.pi - mean_anglestoleft
angle = (mean_anglestoleft - mean_anglestoright) / 2
except TypeError as te:
print('lines is empty. No lines found.')
return False, angle
return True, angle
This function takes four parameters:
In lines 7 to 13, we convert the image to a binary one for the Hough transform.
On line 7, we convert the image to grayscale; on line 13, we apply an adaptive threshold to the inverted grayscale image.
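To show what the mean adaptive threshold does, here is a slow pure-Python sketch of the rule: a pixel becomes white when it exceeds the mean of its neighbourhood minus C. It is an illustration under our own assumptions (the function name adaptive_threshold_mean, nested lists instead of arrays, borders handled by clamping), not the OpenCV code, and it may differ from OpenCV in rounding details.

```python
def adaptive_threshold_mean(gray, block_size=15, c=-2, max_value=255):
    """gray is a list of rows of ints; returns a binary image of the same size."""
    h, w = len(gray), len(gray[0])
    r = block_size // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    # clamp coordinates so border pixels are replicated
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += gray[yy][xx]
            threshold = total / (block_size * block_size) - c
            out[y][x] = max_value if gray[y][x] > threshold else 0
    return out

# light page (250) with a single dark ink pixel (30)
page = [[250] * 5 for _ in range(5)]
page[2][2] = 30
# invert the image, as ~imggrayed does for 8-bit values
inverted = [[255 - v for v in row] for row in page]
binary = adaptive_threshold_mean(inverted, block_size=3)
for row in binary:
    print(row)
```

With a negative C, as in our call, the threshold sits slightly above the local mean, so flat background stays black and only the ink, which is bright after inversion, turns white.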
In lines 16-18, we initialize our variables:
In lines 25-28, we call HoughLines in a loop until we find more than 20 lines, which is enough to compute a mean skew angle. The accumulator threshold is decreased by 10 at each iteration, and we leave the loop when the lower limit of 100 is reached, even if we have not found more than 20 lines. Is it a greedy function? Certainly. We will fine-tune it later, if needed.
In our tests, we mostly found more than 20 lines at the first iteration. In some cases, we had to go down to a threshold of 110 to find what we were looking for. Note that with this code we could end up with fewer than 20 lines at the last iteration. Does it matter?
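The behaviour of this loop can be studied without any image by replacing HoughLines with a stand-in. In the sketch below, sweep_thresholds, fake_detect and sparse_detect are hypothetical names used for illustration; only the loop structure mirrors our function.

```python
def sweep_thresholds(detect, upper=600, lower=100, step=10, min_lines=20):
    """Lower the threshold until detect() returns more than min_lines lines.

    detect stands in for cv2.HoughLines: it maps a threshold to a list or None.
    """
    lines, threshold = None, None
    for threshold in range(upper, lower, -step):
        lines = detect(threshold)
        if lines is not None and len(lines) > min_lines:
            break
    return threshold, lines

def fake_detect(threshold):
    """Stand-in detector: the lower the threshold, the more lines come back."""
    count = (600 - threshold) // 10
    return [(0.0, 0.0)] * count if count else None

def sparse_detect(threshold):
    """Stand-in detector that never finds more than 5 lines."""
    return [(0.0, 0.0)] * 5

found_at, found = sweep_thresholds(fake_detect)
last_tried, few = sweep_thresholds(sparse_detect)
print(found_at, len(found))
print(last_tried, len(few))
```

The second call shows the situation raised above: the last threshold tried is 110, since the lower limit of 100 is excluded from the range, and the caller still receives the few lines found at that point.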
Below we show an example of what we get.
In lines 31-77, we analyze the results returned by the HoughLines function, if any. If it does not return any lines, we leave the find_skew function, returning False and an angle of zero.
In lines 32-46, we iterate through the result and count the lines against the distribution schema shown above.
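The counting step can be sketched as a small classifier. Two points are assumptions of ours rather than the exact code of find_skew: angles outside the pi/4 to 3 pi/4 window are set aside, and the test for a level line uses a tolerance. The tolerance is there because HoughLines returns single-precision angles, so an exact comparison with math.pi / 2 may fail.

```python
import math
import struct
from collections import Counter

HALF_PI = math.pi / 2
QUARTER_PI = math.pi / 4
THREE_QUARTER_PI = 3 * math.pi / 4

def classify_theta(theta, tol=1e-6):
    """Return 'none', 'left', 'right' or 'outside' for one Hough angle in radians."""
    if abs(theta - HALF_PI) <= tol:
        return "none"                  # level line: no tendency
    if theta < QUARTER_PI or theta > THREE_QUARTER_PI:
        return "outside"               # assumption: set aside when the limit is pi/4
    # as in find_skew, angles above pi/2 count toward the left tendency
    return "left" if theta > HALF_PI else "right"

def as_float32(value):
    """Round a Python float to single precision, the precision of HoughLines output."""
    return struct.unpack("f", struct.pack("f", value))[0]

thetas = [as_float32(math.radians(d)) for d in (90, 93, 93, 94, 88, 170)]
tally = Counter(classify_theta(t) for t in thetas)
print(tally)
print(as_float32(HALF_PI) == HALF_PI)   # exact equality is lost in single precision
```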
In lines 49-50, we compute the average of the angles to the right and to the left. We could use another computation to get a more accurate result. We will see later whether we really need better accuracy.
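As an example of such an alternative, the median is less sensitive than the mean to a stray line, such as a ruled column or a stroke crossing the page. The sketch below uses made-up angles and is not part of our function.

```python
import math
import statistics

# four text lines near 93 degrees and one stray line at 120 degrees
thetas = [math.radians(d) for d in (92.8, 93.0, 93.1, 93.2, 120.0)]

mean_deg = math.degrees(statistics.mean(thetas))
median_deg = math.degrees(statistics.median(thetas))
print(round(mean_deg, 2), round(median_deg, 2))
```

Here the single outlier pulls the mean more than five degrees away from the text lines, while the median stays on them.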
In lines 55-74, the result of the function is computed from the number of horizontal and vertical lines found, the estimated direction of rotation, and of course our limits. As we said before, the result could be an upside-down document. The next function, which finds the right orientation, will solve this issue.
This solution is a little more flexible than the previous one, but we could certainly find a better one. For the time being, it meets our objective as part of our study.
In our next article, we will see a solution to determine the orientation of a document.