Skew detection n°2

In the previous article, we detected the skew with a method that works, but not for all documents. When a document had different characteristics, such as a higher contrast or a single block of text, the threshold used in the function failed to detect the lines of text. In addition, the search for horizontal or vertical structural elements failed for inclinations of roughly 40 to 60 degrees, whether clockwise or counterclockwise.

Although the Hough transform is a great way to find lines in an image, we have to vary its parameters to get enough lines to compute an accurate skew angle.

In this article, we change the method: we detect the skew with the Hough transform and correct it, but without morphological operations.

Hababa, Yemen, 2000

Since we know neither the inclination nor its direction, we have to make working assumptions. Given a first document with an inclination of 45° and a second document with an inclination of -135°, we could not determine which rotation to apply without analyzing the orientation of the text.

We assume that the majority of the documents will have an inclination between -45 and 45 degrees with text orientation right side up. If the skew is outside those limits, the document could be upside down after the skew correction. The analysis of the orientation of the document will later make it possible to restore it.

The following figure shows the principles used to detect the inclination of a document.

Figure 1 – Lines classification principles

For each line, the angle theta returned by the HoughLines function is used as follows:

  • if theta is less than pi/2, the skew is classified as clockwise, otherwise as counterclockwise;
  • if theta is less than pi/4 or greater than 3 pi/4, the line is classified as horizontal;
  • if theta is greater than pi/4 and less than 3 pi/4, the line is classified as vertical.
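The rules above can be sketched as a small helper. This is only an illustration and not the code of page_findskew.py: the function name classify_line and the string labels are assumptions, and the labels simply follow the naming used in this article. The rules do not say what happens when theta is exactly pi/4 or 3 pi/4, so the sketch arbitrarily treats those values as vertical.

```python
import math


def classify_line(theta):
    """Classify one Hough line from its theta angle (radians, 0 <= theta < pi).

    Hypothetical helper: returns (direction, orientation) following the
    rules listed in the article.
    """
    # First rule: direction of the skew.
    direction = "clockwise" if theta < math.pi / 2 else "counterclockwise"

    # Second and third rules: horizontal or vertical line.
    # Assumption: theta exactly equal to pi/4 or 3*pi/4 counts as vertical.
    if theta < math.pi / 4 or theta > 3 * math.pi / 4:
        orientation = "horizontal"
    else:
        orientation = "vertical"

    return direction, orientation
```

For example, classify_line(math.pi / 6) returns ("clockwise", "horizontal").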

The method is implemented in the script page_findskew.py, shown below.

It is used as follows: python page_findskew.py --image image --output folder

In the original script the lines 1-21 are comments.

We have put shared functions like convert_to_grayscale or puttext_with_bgcolor in module “h7as.Utils”.

We find the usual arguments: the path to the image to load and the path to the folder where the corrected image is saved.
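A minimal sketch of how these two arguments could be parsed with argparse is given here. Only the option names --image and --output come from the usage line above; the function name parse_args, the help texts and the choice to make both options required are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the two command-line options described in the article.

    Hypothetical helper: passing argv=None reads sys.argv as usual.
    """
    parser = argparse.ArgumentParser(
        description="Detect and correct the skew of a document image.")
    parser.add_argument("--image", required=True,
                        help="path to the image to load")
    parser.add_argument("--output", required=True,
                        help="folder where the corrected image is saved")
    return parser.parse_args(argv)
```

Calling parse_args(["--image", "page.png", "--output", "out"]) gives an object whose image and output attributes hold the two paths.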

We start by loading the image, then we check that it has been found.

After this check, we call the skew detection and correction function with the following parameters:

  • the image
  • the tilt limit, either pi / 4 or pi.

The search is carried out either between pi/4 and 3 pi/4, or between 0 and pi.

If everything went fine, we save the straightened image and display the result as below.

Figure 3 – Image before and after processing.

Let’s go a little further and look at the find_skew_and_straighten function.

This function calls another function, find_skew, with the same parameters, then rotates the image if the skew found is not zero. The rotation angle is computed in degrees relative to the desired orientation of pi / 2.

It then returns a status code, the angle in radians, and the image, rotated or not.

The OpenCV 3.3 documentation explains in detail how to set up the rotation matrix and the affine transformation.

We choose to replicate the border when rotating the image.

Now, let’s go through the main function for tilt search: find_skew.

This function takes four parameters:

  • The image to process,
  • The accumulator starting threshold used to search for lines with the Hough transform (default value 600),
  • The angle resolution of the accumulator in radians (default value pi / 180),
  • The skew limit searched (default value pi / 4).

Lines 7 to 13, we convert the image to a binary one for the Hough transform function.

Line 7, we convert the image to grayscale, and line 13 we apply an adaptive threshold to the inverted grayscale image.

Lines 16-18, we initialize our variables:

  • left_tendency, right_tendency are accumulators to count the clockwise and counterclockwise lines,
  • anglestoleft, anglestoright sum the clockwise and counterclockwise theta given by the HoughLines function,
  • lhorizontal_count, lvertical_count are accumulators to count horizontal and vertical lines.

Lines 25-28, we call the HoughLines function in a loop until we find at least 20 lines, which is enough to compute a mean skew angle. The accumulator threshold is decreased by 10 at each iteration, and we leave the loop if we still have fewer than 20 lines when the threshold reaches the lower limit of 100. Is it a greedy function? Certainly. We will fine-tune it later, if needed.

In our tests, we mostly found more than 20 lines at the first iteration. In some cases, we had to go down to a threshold of 110 to find what we were looking for. Note that with this code we could end up with fewer than 20 lines after the last iteration. Does it matter?
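The search loop described above can be sketched as follows. This is a reconstruction from the description, not the original code: the function name find_lines, its parameter names, the rho resolution of 1 pixel and the detect argument are assumptions. The detect argument only exists so that the loop can be tried without OpenCV; by default it is cv2.HoughLines, which returns None when no line is found.

```python
import math


def find_lines(binary, start_threshold=600, min_threshold=100, step=10,
               min_lines=20, rho=1, theta_resolution=math.pi / 180,
               detect=None):
    """Lower the accumulator threshold until enough lines are found.

    Hypothetical helper: returns (lines, threshold) where threshold is the
    last value tried. lines may be None or hold fewer than min_lines
    entries if the lower limit was reached.
    """
    if detect is None:
        import cv2  # imported here so the loop can run with a stand-in
        detect = cv2.HoughLines

    threshold = start_threshold
    while True:
        lines = detect(binary, rho, theta_resolution, threshold)
        enough = lines is not None and len(lines) >= min_lines
        # Stop when we have enough lines or when the next step would go
        # below the lower limit.
        if enough or threshold - step < min_threshold:
            return lines, threshold
        threshold -= step
```

Returning the threshold along with the lines makes it easy to log how far the search had to go for a given document.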

Below we show an example of what we get.

Figure 4 – Original image and the lines found with the HoughLines function.

Lines 31-77, we analyze the results returned by the HoughLines function, if any. If it returns no lines, we leave the find_skew function, returning False and an angle of zero.

Lines 32-46, we iterate through the result and count the lines according to the classification scheme shown above.

Lines 49-50, we compute the average of the angles to the right and to the left. Other computations could give a more accurate result; we will see later whether we really need better accuracy.
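The counting and averaging can be illustrated with the short sketch below. It is an assumption-laden illustration rather than the original code: the function name mean_angles and the returned dictionary are hypothetical, and returning 0.0 for an empty group is a choice made here to avoid a division by zero.

```python
import math


def mean_angles(thetas):
    """Average theta separately for clockwise and counterclockwise lines.

    Hypothetical helper: thetas is an iterable of angles in radians.
    """
    cw_count = ccw_count = 0
    cw_sum = ccw_sum = 0.0

    for theta in thetas:
        if theta < math.pi / 2:      # classified as clockwise
            cw_count += 1
            cw_sum += theta
        else:                        # classified as counterclockwise
            ccw_count += 1
            ccw_sum += theta

    return {
        "clockwise_count": cw_count,
        "counterclockwise_count": ccw_count,
        # Assumption: an empty group yields a mean of 0.0.
        "clockwise_mean": cw_sum / cw_count if cw_count else 0.0,
        "counterclockwise_mean": ccw_sum / ccw_count if ccw_count else 0.0,
    }
```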

Lines 55-74, the result of the function is calculated from the number of horizontal and vertical lines found, the estimated direction of rotation and, of course, our limits. As we said before, the result could end up being an upside-down document. The next function, which finds the right orientation, will solve this issue.

This solution is a little more flexible than the previous one, but we could certainly find a better one. For the time being, it meets our objective as part of our study.

In our next article, we will see a solution to determine the orientation of a document.


