Large Scale Near-Duplicate Celebrity Web Images Retrieval Using Visual and Textual Features

Near-duplicate image retrieval is a classical research problem in computer vision with many applications, such as image annotation and content-based image retrieval. On the web, near-duplication is more prevalent in queries for celebrities and historical figures, which are of particular interest to...

Bibliographic Details
Main Authors:	Fengcai Qiao, Cheng Wang, Xin Zhang, Hui Wang
Format:	Article
Language:	English
Published:	Wiley 2013-01-01
Series:	The Scientific World Journal
Online Access:	http://dx.doi.org/10.1155/2013/795408

_version_	1832551204147691520
author	Fengcai Qiao, Cheng Wang, Xin Zhang, Hui Wang
collection	DOAJ
description	Near-duplicate image retrieval is a classical research problem in computer vision with many applications, such as image annotation and content-based image retrieval. On the web, near-duplication is more prevalent in queries for celebrities and historical figures, which are of particular interest to end users. Existing methods such as bag-of-visual-words (BoVW) address this problem mainly by exploiting purely visual features. To overcome this limitation, this paper proposes a novel text-based, data-driven reranking framework that utilizes textual features and is combined with state-of-the-art BoVW schemes. Under this framework, the input to the retrieval procedure is still only a query image. To verify the proposed approach, a dataset of 2 million images of 1089 different celebrities, together with their accompanying texts, is constructed. In addition, we comprehensively analyze the different categories of near duplication observed in the constructed dataset. Experimental results on this dataset show that the proposed framework can achieve higher mean average precision (mAP), with an improvement of 21% on average compared with approaches based only on visual features, while not notably prolonging the retrieval time.
format	Article
id	doaj-art-988e36e35a2d46c69deb324e4dbacf36
institution	Kabale University
issn	1537-744X
language	English
publishDate	2013-01-01
publisher	Wiley
record_format	Article
series	The Scientific World Journal
spelling	doaj-art-988e36e35a2d46c69deb324e4dbacf36 | 2025-02-03T06:04:42Z | eng | Wiley | The Scientific World Journal | 1537-744X | 2013-01-01 | 2013 | 10.1155/2013/795408 | 795408 | Large Scale Near-Duplicate Celebrity Web Images Retrieval Using Visual and Textual Features | Fengcai Qiao, Cheng Wang, Xin Zhang, Hui Wang (all: College of Information Systems and Management, National University of Defense Technology, Changsha 410073, China) | http://dx.doi.org/10.1155/2013/795408
title	Large Scale Near-Duplicate Celebrity Web Images Retrieval Using Visual and Textual Features
url	http://dx.doi.org/10.1155/2013/795408

Similar Items