ABSTRACT

Title	:	Discovering suffixes: A Case Study for Marathi Language
Authors	:	Mudassar M. Majgaonker, Tanveer J Siddiqui
Keywords	:	Marathi morphology, Marathi stemmer, Unsupervised stemmer, Rule-based stemmer, Natural language processing
Issue Date	:	November 2010
Abstract	:	Suffix stripping is a pre-processing step required in a number of natural language processing applications. A stemmer is the tool used to perform this step. This paper presents and evaluates two Marathi stemmers, one rule-based and one unsupervised. The rule-based stemmer uses a set of manually extracted suffix-stripping rules, whereas the unsupervised approach learns suffixes automatically from a set of words extracted from raw Marathi text. The performance of both stemmers is compared on a test dataset of 1500 manually stemmed words.
Page(s)	:	2716-2720
ISSN	:	0975-3397
Source	:	Vol. 2, Issue 8
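
The rule-based approach named in the abstract can be illustrated with a minimal longest-match suffix stripper. This is only a sketch under stated assumptions: the suffix list, the function name `strip_suffix`, and the `min_stem_len` guard are hypothetical and are not the rule set or procedure from the paper, and the returned string is simply what remains after stripping, not a linguistically validated root.

```python
# Illustrative suffix list only; NOT the rules extracted in the paper.
SUFFIXES = ["मध्ये", "ला", "ने", "चा"]


def strip_suffix(word, suffixes=SUFFIXES, min_stem_len=2):
    """Strip the longest matching suffix, if enough of the word remains.

    min_stem_len is an assumed guard (counted in Unicode code points)
    that stops very short words from being stripped to almost nothing.
    """
    # Try longer suffixes first so the longest match wins.
    for suffix in sorted(suffixes, key=len, reverse=True):
        stem = word[: len(word) - len(suffix)]
        if word.endswith(suffix) and len(stem) >= min_stem_len:
            return stem
    # No rule applied: return the word unchanged.
    return word


if __name__ == "__main__":
    for w in ["शाळेमध्ये", "ला"]:
        print(w, "->", strip_suffix(w))
```

An unsupervised stemmer, by contrast, would have to derive a list like `SUFFIXES` from raw text instead of taking it as hand-written input.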