Multilingual person name recognition and transliteration
We present an exploratory tool that extracts person names from multilingual news collections, matches name variants referring to the same person, and infers relationships between people based on the co-occurrence of their names in related news. A novel feature is the matching of name variants across...
| Main Authors: | Bruno Pouliquen, Ralf Steinberger, Camelia Ignat, Irina Temnikova, Anna Widiger |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Cercle linguistique du Centre et de l'Ouest - CerLICO, 2005-12-01 |
| Series: | Corela |
| Subjects: | |
| Online Access: | https://journals.openedition.org/corela/1219 |
| _version_ | 1850264910793015296 |
|---|---|
| author | Bruno Pouliquen; Ralf Steinberger; Camelia Ignat; Irina Temnikova; Anna Widiger |
| author_sort | Bruno Pouliquen |
| collection | DOAJ |
| description | We present an exploratory tool that extracts person names from multilingual news collections, matches name variants referring to the same person, and infers relationships between people based on the co-occurrence of their names in related news. A novel feature is the matching of name variants across languages and writing systems, including names written in the Greek, Cyrillic and Arabic scripts. Because our setting is highly multilingual, we represent and match names using a standard internal representation instead of adopting the traditional bilingual approach to transliteration. This work is part of a news analysis system that clusters an average of 25,000 news articles per day to detect related news within the same language and across different languages. |
| format | Article |
| id | doaj-art-ed8783bd069b41dfa9966fbdcd68045b |
| institution | OA Journals |
| issn | 1638-573X |
| language | English |
| publishDate | 2005-12-01 |
| publisher | Cercle linguistique du Centre et de l'Ouest - CerLICO |
| record_format | Article |
| series | Corela |
| title | Multilingual person name recognition and transliteration |
| topic | named entities; transliteration; information extraction; multilingualism; multilingual named entity recognition; automatic (language) processing |
| url | https://journals.openedition.org/corela/1219 |
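
The description mentions matching name variants through an internal standard representation rather than pairwise bilingual transliteration. This record does not say how the authors build that representation, so the following is only a minimal sketch of the general idea under stated assumptions: the character tables, the `to_pivot` and `same_person` helpers, and the 0.85 threshold are all hypothetical, and Arabic is left out for brevity.

```python
import unicodedata
from difflib import SequenceMatcher

# Hypothetical, deliberately tiny character tables (assumption, not the
# authors' rules). A real system would need full script coverage.
CYRILLIC = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "и": "i",
    "к": "k", "л": "l", "м": "m", "н": "n", "о": "o", "п": "p", "р": "r",
    "с": "s", "т": "t", "у": "u", "ш": "sh", "ч": "ch", "ь": "",
}
GREEK = {
    "α": "a", "β": "v", "γ": "g", "δ": "d", "ε": "e", "η": "i", "ι": "i",
    "κ": "k", "λ": "l", "μ": "m", "ν": "n", "ο": "o", "π": "p", "ρ": "r",
    "σ": "s", "ς": "s", "τ": "t", "υ": "u", "ω": "o",
}
PIVOT_TABLE = {**CYRILLIC, **GREEK}


def to_pivot(name: str) -> str:
    """Map a name in any covered script to one shared lowercase Latin form."""
    # Decompose characters and drop combining marks (accents, tonos, breve).
    decomposed = unicodedata.normalize("NFD", name)
    base = "".join(c for c in decomposed if not unicodedata.combining(c)).lower()
    # Replace each known non-Latin letter; leave everything else unchanged.
    latin = "".join(PIVOT_TABLE.get(c, c) for c in base)
    # Collapse doubled characters so "Mohammed" and "Mohamed" coincide.
    out = []
    for c in latin:
        if not out or out[-1] != c:
            out.append(c)
    return "".join(out)


def same_person(a: str, b: str, threshold: float = 0.85) -> bool:
    """Treat two names as variants if their pivot forms are similar enough."""
    ratio = SequenceMatcher(None, to_pivot(a), to_pivot(b)).ratio()
    return ratio >= threshold


print(to_pivot("Владимир Путин"))
print(same_person("Владимир Путин", "Vladimir Poutine"))
```

Because every name is mapped to the same pivot form before comparison, adding a new script means adding one table rather than one transliteration scheme per language pair, which is the appeal of the approach in a highly multilingual setting.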