Stemming and N-gram matching for term conflation in Turkish texts

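One of the main problems involved in the use of free text for indexing and retrieval is the variation in word forms that is likely to be encountered. The most common types of variation are spelling errors, alternative spellings, multi-word concepts, transliteration, affixes and abbreviations. One way to alleviate this problem is to use a conflation algorithm, a computational procedure that is designed to bring together words that are semantically related, and to reduce them to a single form for retrieval purposes. In this paper, we discuss the use of conflation techniques for Turkish text databases.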
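
As a rough illustration of what an N-gram-based conflation procedure can look like, the sketch below groups word variants by the Dice coefficient over character bigrams. This is an assumption-laden example, not the method evaluated in the paper: the function names (`bigrams`, `dice`, `conflate`), the 0.6 threshold, and the greedy grouping strategy are all hypothetical choices made for illustration, and the Turkish sample words are not taken from the article.

```python
def bigrams(word):
    """Return the set of distinct adjacent character pairs in a word.

    Note: str.lower() is not Turkish-aware (it maps 'I' to 'i', not 'ı'),
    so real Turkish text would need locale-specific case folding.
    """
    w = word.lower()
    return {w[i:i + 2] for i in range(len(w) - 1)}


def dice(a, b):
    """Dice coefficient between the bigram sets of two words (0.0 to 1.0)."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 0.0  # words shorter than two characters have no bigrams
    return 2 * len(x & y) / (len(x) + len(y))


def conflate(words, threshold=0.6):
    """Greedy single-pass grouping (hypothetical strategy).

    Each word joins the first group whose first member is at least
    `threshold` similar; otherwise it starts a new group.
    """
    groups = []
    for w in words:
        for g in groups:
            if dice(w, g[0]) >= threshold:
                g.append(w)
                break
        else:
            groups.append([w])
    return groups


if __name__ == "__main__":
    # "kitap" (book), "kitaplar" (books), "kalem" (pen)
    print(conflate(["kitap", "kitaplar", "kalem"]))
```

With these inputs, "kitap" and "kitaplar" share four bigrams and land in the same group, while "kalem" shares none and forms its own. A stemmer would instead strip the plural suffix "-lar" directly; the bigram approach needs no knowledge of the language's morphology.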

Bibliographic Details
Main Authors: F. Çuna Ekmekçioglu, Michael F. Lynch, Peter Willett
Format: Article
Language: English
Published: University of Borås 1996-01-01
Series: Information Research: An International Electronic Journal
Subjects:
Online Access: http://informationr.net/ir/2-2/paper13.html
collection DOAJ
description One of the main problems involved in the use of free text for indexing and retrieval is the variation in word forms that is likely to be encountered. The most common types of variation are spelling errors, alternative spellings, multi-word concepts, transliteration, affixes and abbreviations. One way to alleviate this problem is to use a conflation algorithm, a computational procedure that is designed to bring together words that are semantically related, and to reduce them to a single form for retrieval purposes. In this paper, we discuss the use of conflation techniques for Turkish text databases.
format Article
id doaj-art-d7f316f0ade64b1e905f493f19b43bda
institution Kabale University
issn 1368-1613
topic free text
indexing
retrieval
information retrieval
word forms
spelling errors
alternative spellings
multi-word concepts
transliteration
affixes
abbreviations
conflation algorithm
Turkish