Abstract |
: |
In this work, we propose a method for identifying the baseline, an imaginary line running through the entire length of a text-line, with reference to which the vertical extents of Persian characters can be accurately interpreted. When a handwritten Persian text-line is curved, its baseline is curved as well. We propose a novel piece-wise painting scheme that prepares patches of black and white blocks all along the text-line and identifies a set of candidate points; a curve is then regressed through these candidate points to trace the baseline. The baseline is subsequently stretched straight horizontally, and the characters are de-tilted so that the text-line aligns properly with the horizontal baseline. The proposed algorithm is evaluated on 108 Persian handwritten text-lines containing 3612 subwords; experimental analysis shows that 91.2% of the subwords are accurately aligned. The scheme is further tested on another dataset containing 600 text-lines [13], where it achieves more accurate results than those reported in the state of the art for the same dataset. The effectiveness of aligning text-lines linearly is demonstrated by comparing the OCR readability of tilted printed English text-lines with that of the corresponding transformed text-lines obtained using the proposed procedure.
|
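The pipeline summarized in the abstract (candidate points along the text-line, a regressed baseline curve, then horizontal straightening) can be illustrated with a minimal sketch. The abstract does not give the details of the piece-wise painting scheme, so everything below is an assumption made for illustration only: the strip width, the use of the densest ink row in each vertical strip as a candidate point, the quadratic polynomial fit, and the function names `candidate_points`, `fit_baseline`, and `straighten` are all hypothetical. The sketch only shifts columns vertically and does not attempt the character de-tilting step.

```python
import numpy as np

def candidate_points(img, strip_width=8):
    """One (x, y) candidate per vertical strip: the row with the most ink.
    ASSUMPTION: this stands in for the paper's piece-wise painting step."""
    xs, ys = [], []
    for x0 in range(0, img.shape[1], strip_width):
        strip = img[:, x0:x0 + strip_width]
        profile = strip.sum(axis=1)          # ink pixels per row in this strip
        if profile.max() == 0:               # skip strips without any ink
            continue
        xs.append(x0 + strip.shape[1] / 2.0) # horizontal centre of the strip
        ys.append(int(profile.argmax()))     # densest row (first one on ties)
    return np.array(xs), np.array(ys)

def fit_baseline(xs, ys, width, degree=2):
    """Regress a polynomial through the candidates (degree is an assumption)
    and evaluate it at every column of the image."""
    coeffs = np.polyfit(xs, ys, degree)
    return np.polyval(coeffs, np.arange(width))

def straighten(img, baseline):
    """Shift each column vertically so the fitted baseline becomes horizontal.
    Pixels shifted outside the image are dropped, not wrapped around."""
    height = img.shape[0]
    target = int(round(float(np.median(baseline))))
    out = np.zeros_like(img)
    for x in range(img.shape[1]):
        shift = target - int(round(float(baseline[x])))
        lo, hi = max(0, -shift), min(height, height - shift)
        if hi > lo:
            out[lo + shift:hi + shift, x] = img[lo:hi, x]
    return out

def ink_row_span(img):
    """Number of rows between the topmost and bottommost ink pixel."""
    rows = np.flatnonzero(img.any(axis=1))
    return int(rows.max() - rows.min() + 1)

# Synthetic curved "text-line": a 3-pixel-thick parabolic stroke (1 = ink).
img = np.zeros((60, 200), dtype=np.uint8)
for x in range(img.shape[1]):
    y = int(round(20 + 0.001 * (x - 100) ** 2))
    img[y - 1:y + 2, x] = 1

xs, ys = candidate_points(img)
baseline = fit_baseline(xs, ys, img.shape[1])
straight = straighten(img, baseline)
```

On this synthetic stroke, `ink_row_span(straight)` should come out smaller than `ink_row_span(img)`, which is a rough proxy for the text-line having been flattened. Real handwriting would need a more robust candidate-point rule than the densest row, since ascenders, descenders, and dots distort the per-strip profile.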