Processing morphological variants in searches of Latin text

A characteristic of natural-language text databases is that a user must be able to specify all of the variant forms of each query word if high recall is to be achieved. The most common word variants are those arising from morphology, and thus most retrieval systems provide facilities for user...

Bibliographic Details
Main Authors:	Mark Greengrass, Alexander M. Robertson, Robyn Schinke, Peter Willett
Format:	Article
Language:	English
Published:	University of Borås 1996-01-01
Series:	Information Research: An International Electronic Journal
Subjects:	natural language; text databases; query words; recall; word variants; morphology; retrieval systems; truncation; information retrieval; IR; stemming algorithms; stemmers; suffixes; humanities; Latin
Online Access:	http://informationr.net/ir/2-1/paper10.html

_version_	1832569870903934976
author	Mark Greengrass, Alexander M. Robertson, Robyn Schinke, Peter Willett
author_facet	Mark Greengrass, Alexander M. Robertson, Robyn Schinke, Peter Willett
author_sort	Mark Greengrass
collection	DOAJ
description	A characteristic of natural-language text databases is that a user must be able to specify all of the variant forms of each query word if high recall is to be achieved. The most common word variants are those arising from morphology, and thus most retrieval systems provide facilities for user-controlled right-hand (and occasionally left-hand) truncation to allow the retrieval of all words with the same root. A stemming algorithm, or stemmer, is a computational procedure that reduces all words with the same root to a single form by stripping the root of its derivational and inflectional affixes. In most cases only suffixes are stripped, so that a stemmer provides an automatic equivalent of manual right-hand truncation. Thus far, most work on stemmers has focused on present-day languages, but the increasing use of computers in the humanities has resulted in a need for comparable tools to facilitate searching in historical text databases. This paper summarises some of the initial results of a project here in Sheffield to develop such tools for databases of Latin text.
format	Article
id	doaj-art-6c019c74aacb4a27865a221562b2fe01
institution	Kabale University
issn	1368-1613
language	English
publishDate	1996-01-01
publisher	University of Borås
record_format	Article
series	Information Research: An International Electronic Journal
title	Processing morphological variants in searches of Latin text
topic	natural language; text databases; query words; recall; word variants; morphology; retrieval systems; truncation; information retrieval; IR; stemming algorithms; stemmers; suffixes; humanities; Latin
url	http://informationr.net/ir/2-1/paper10.html
work_keys_str_mv	AT markgreengrass processingmorphologicalvariantsinsearchesoflatintext AT alexandermrobertson processingmorphologicalvariantsinsearchesoflatintext AT robynschinke processingmorphologicalvariantsinsearchesoflatintext AT peterwillett processingmorphologicalvariantsinsearchesoflatintext
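
Illustrative note: the description above contrasts manual right-hand truncation with automatic suffix stripping. The following is a minimal sketch of that contrast, not the algorithm developed in the paper. The suffix list, the minimum stem length, and the function names (stem, truncation_match, stem_match) are all assumptions chosen for illustration; a real Latin stemmer would need far more careful rules.

```python
# Hypothetical suffix list for illustration only; NOT the rule set from the paper.
SUFFIXES = ("ibus", "orum", "arum", "um", "am", "ae", "as",
            "is", "os", "us", "a", "e", "i", "o")


def stem(word, min_stem=2):
    """Strip the longest matching suffix, keeping at least min_stem characters."""
    word = word.lower()
    # Try longer suffixes first so "arum" wins over "um".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[:-len(suffix)]
    return word


def truncation_match(pattern, words):
    """Manual right-hand truncation: 'puell*' matches words starting with 'puell'."""
    prefix = pattern.rstrip("*").lower()
    return [w for w in words if w.lower().startswith(prefix)]


def stem_match(query, words):
    """Automatic equivalent: match words whose stem equals the query's stem."""
    target = stem(query)
    return [w for w in words if stem(w) == target]


if __name__ == "__main__":
    words = ["puella", "puellam", "puellarum", "puellis", "dominus"]
    print(truncation_match("puell*", words))
    print(stem_match("puella", words))
```

With truncation the user has to choose the cut point ("puell*") by hand; with the stemmer sketch the same set of forms is found from the full query word "puella", at the cost of depending on the quality of the suffix list.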
