Item Details

Using N-Grams to Process Hindi Queries With Transliteration Variations

Natrajan, Anand; Powell, Allison; French, James
Format
Report
Author
Natrajan, Anand
Powell, Allison
French, James
Abstract
Retrieval systems based on N-grams have been used as alternatives to word-based systems. N-grams offer a language-independent technique that allows retrieval based on portions of words. A query that contains misspellings or differences in transliteration can defeat word-based systems. N-gram systems are more resistant to these problems. We present a retrieval system based on N-grams that uses a collection of Hindi songs. Within this retrieval system, we study the effect of varying N on retrievability. Additionally, we present an alternative spell-checking tool based on N-grams. We conclude with a discussion of the number of N-grams produced by different values of N for different languages and a discussion of the choice of N.
Language
English
Date Received
2012-10-29
Published
University of Virginia, Department of Computer Science, 1997
Published Date
1997
Collection
Libra Open Repository
In Copyright