|
ABSTRACT
| Title |
: |
Discovering suffixes: A Case Study for Marathi Language |
| Authors |
: |
Mudassar M. Majgaonker, Tanveer J Siddiqui |
| Keywords |
: |
Marathi morphology, Marathi stemmer,
Unsupervised stemmer, Rule-based stemmer, Natural language
processing |
| Issue Date |
: |
November 2010 |
| Abstract |
: |
Suffix stripping is a pre-processing step required in a
number of natural language processing applications. A stemmer is
the tool used to perform this step. This paper presents and
evaluates a rule-based and an unsupervised Marathi stemmer.
The rule-based stemmer uses a set of manually extracted
suffix-stripping rules, whereas the unsupervised approach learns
suffixes automatically from a set of words extracted from raw
Marathi text. The performance of the two stemmers is
compared on a test dataset of 1500 manually stemmed
words.
|
| Page(s) |
: |
2716-2720 |
| ISSN |
: |
0975-3397 |
| Source |
: |
Vol. 2, Issue 8 |
|
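
The two approaches named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the thresholds `min_stem_len` and `min_stems`, the suffix-learning heuristic (keep an ending if it attaches to several stems that also occur as standalone words), and the romanized toy word list are all assumptions made here for illustration only.

```python
from collections import defaultdict


def strip_suffix(word, suffixes, min_stem_len=2):
    """Rule-based step: remove the longest listed suffix, if any matches."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        # Only strip when enough of the word is left to act as a stem.
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


def learn_suffixes(words, min_stem_len=2, min_stems=2):
    """Unsupervised step (assumed heuristic): try every split of every word
    and keep endings that follow at least `min_stems` distinct stems which
    themselves appear in the word list."""
    vocab = set(words)
    stems_for = defaultdict(set)
    for word in vocab:
        for i in range(min_stem_len, len(word)):
            stem, suffix = word[:i], word[i:]
            if stem in vocab:
                stems_for[suffix].add(stem)
    return {s for s, stems in stems_for.items() if len(stems) >= min_stems}


# Toy word list in romanized form (hypothetical data, not from the paper).
words = ["ghar", "gharat", "gharacha", "shahar", "shaharat", "shaharacha"]
learned = learn_suffixes(words)
stems = [strip_suffix(w, learned) for w in words]
```

The same `strip_suffix` function serves both stemmers in this sketch: a rule-based stemmer would pass a hand-written suffix list, while the unsupervised one passes the output of `learn_suffixes`. Real Devanagari text would additionally need care with combining vowel signs when splitting words, which this sketch does not address.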