Information retrieval on Turkish texts
Type of publication: | Article |
Citation: | |
Publication status: | Accepted |
Journal: | Journal of the American Society for Information Science and Technology |
Volume: | 59 |
Year: | 2008 |
Pages: | 407-421 |
DOI: | 10.1002/asi.20750 |
Abstract: | In this study, we investigate information retrieval (IR) on Turkish texts using a large-scale test collection that contains 408,305 documents and 72 ad hoc queries. We examine the effects of several stemming options and query-document matching functions on retrieval performance. We show that a simple word truncation approach, a word truncation approach that uses language-dependent corpus statistics, and an elaborate lemmatizer-based stemmer provide similar retrieval effectiveness in Turkish IR. We investigate the effects of a range of search conditions on retrieval performance; these include scalability issues, query and document length effects, and the use of a stopword list in indexing. |
Keywords: | |
Authors | |