Multilingual person name recognition and transliteration

Bibliographic Details
Main Authors: Bruno Pouliquen, Ralf Steinberger, Camelia Ignat, Irina Temnikova, Anna Widiger
Format: Article
Language: English
Published: Cercle linguistique du Centre et de l'Ouest - CerLICO 2005-12-01
Series: Corela
Online Access: https://journals.openedition.org/corela/1219
Description: We present an exploratory tool that extracts person names from multilingual news collections, matches name variants referring to the same person, and infers relationships between people based on the co-occurrence of their names in related news. A novel feature is the matching of name variants across languages and writing systems, including names written with the Greek, Cyrillic and Arabic writing systems. Due to our highly multilingual setting, we use an internal standard representation for name representation and matching, instead of adopting the traditional bilingual approach to transliteration. This work is part of a news analysis system that clusters an average of 25,000 news articles per day to detect related news within the same and across different languages.
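The idea of matching name variants through a single internal representation, rather than through pairwise bilingual transliteration, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the tiny Cyrillic table, the function names `canonical_form` and `same_person`, and the normalization steps (lowercasing, diacritic removal, collapsing doubled letters) are not taken from the article, which may use quite different rules.

```python
import unicodedata

# Hypothetical, deliberately tiny Cyrillic-to-Latin table.
# An illustrative assumption, not the transliteration rules of the article.
CYRILLIC_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "и": "i", "й": "i", "к": "k", "л": "l", "м": "m", "н": "n",
    "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
}


def canonical_form(name: str) -> str:
    """Map a name to a script-independent internal form (illustrative only)."""
    lowered = name.lower()
    # Step 1: replace known Cyrillic letters; leave all other characters as-is.
    latin = "".join(CYRILLIC_TO_LATIN.get(ch, ch) for ch in lowered)
    # Step 2: strip diacritics by decomposing and dropping combining marks.
    decomposed = unicodedata.normalize("NFD", latin)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Step 3: collapse runs of the same character (e.g. "mm" becomes "m").
    collapsed = []
    for ch in stripped:
        if not collapsed or collapsed[-1] != ch:
            collapsed.append(ch)
    return "".join(collapsed)


def same_person(name_a: str, name_b: str) -> bool:
    """Treat two spellings as variants if their internal forms are identical."""
    return canonical_form(name_a) == canonical_form(name_b)


if __name__ == "__main__":
    print(same_person("Владимир Путин", "Vladimir Putin"))
    print(same_person("Gerhard Schröder", "Gerhard Schroder"))
```

With one shared internal form, each new language or script only needs a single mapping into that form, instead of one transliteration scheme per language pair. Exact equality is the simplest possible comparison; a practical system would likely need a similarity threshold instead.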
ISSN: 1638-573X
Subjects: named entities; transliteration; information extraction; multilingualism; multilingual named entity recognition; automatic (language) processing