ABSTRACT
Title: Segmentation of Telugu Touching Conjunct Consonants Using Overlapping Bounding Boxes
Authors: J. Bharathi, Dr. P. Chandrasekar Reddy
Keywords: Conjunct Consonants, Overlapping Bounding Boxes, Connected Components, Telugu OCR, Touching Point Identifier, Partial Projection Profile
Issue Date: June 2013
Abstract:
Telugu is an ancient language spoken by about 84.6 million people in Andhra Pradesh. Its script has a circular orthography with few horizontal and slanted strokes. A large body of Telugu literature exists in printed form and needs to be preserved by scanning it and converting it into editable text. Segmentation of touching characters is a major issue in any OCR system: segmenting words into individual glyphs by Connected Component Analysis yields poor results when characters touch. Touching conjunct consonants are the main case that must be handled properly to improve OCR accuracy. This paper presents an overlapping bounding box approach for segmenting conjunct consonants, together with an algorithm for identifying the correct touching location. An accuracy of 91.27% is achieved.
Page(s): 538-546
ISSN: 0975-3397
Source: Vol. 5, Issue 06
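
The abstract names an overlapping bounding box approach but does not describe its steps. As a rough illustration of the underlying idea only, and not the authors' algorithm, the sketch below assumes each connected component has already been reduced to an axis-aligned box `(x0, y0, x1, y1)` in inclusive pixel coordinates, and reports which pairs of boxes overlap. The names `Box`, `boxes_overlap`, and `overlapping_pairs` are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

# Hypothetical representation: (x0, y0, x1, y1), inclusive pixel coordinates.
Box = Tuple[int, int, int, int]


def boxes_overlap(a: Box, b: Box) -> bool:
    """Return True if two axis-aligned boxes share at least one pixel."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    # Boxes overlap when their extents intersect on both axes.
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1


def overlapping_pairs(boxes: List[Box]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j), i < j, of boxes that overlap."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(boxes), 2)
        if boxes_overlap(a, b)
    ]


# Assumed example data: the first two boxes overlap, the third is isolated.
example_boxes = [(0, 0, 10, 10), (8, 5, 20, 25), (30, 0, 40, 10)]
pairs = overlapping_pairs(example_boxes)
```

In a segmentation pipeline, pairs flagged this way would be candidates for closer inspection, for instance a base consonant and a conjunct placed below or beside it; how the touching location is then identified is specific to the paper and is not reproduced here.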
|