A topic clustering approach to finding similar questions from large question and answer archives.

Bibliographic Details
Main Authors:	Wei-Nan Zhang, Ting Liu, Yang Yang, Liujuan Cao, Yu Zhang, Rongrong Ji
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2014-01-01
Series:	PLoS ONE
Online Access:	https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0071511&type=printable

collection	DOAJ
description	With the rise of Web 2.0, Community Question Answering (CQA) services such as Yahoo! Answers (http://answers.yahoo.com), WikiAnswer (http://wiki.answers.com), and Baidu Zhidao (http://zhidao.baidu.com) have emerged as alternative sources of knowledge and information. Over time, a large number of high-quality, human-written question and answer (Q&A) pairs have accumulated into a comprehensive knowledge base. Unlike search engines, which return long lists of results, CQA services can return correct answers to a question query by automatically finding similar questions that other users have already answered, which greatly improves the efficiency of online information retrieval. However, given a question query, finding similar, well-answered questions is a non-trivial task. The main challenge is the word mismatch between the question query (query) and the candidate question for retrieval (question). To address this problem, we capture the word-level semantic similarity between query and question using topic modeling. We then propose an unsupervised machine-learning approach to finding similar questions in CQA Q&A archives. Experimental results show that the proposed approach significantly outperforms state-of-the-art methods.
id	doaj-art-781b2aff15f74e32b624ca11056c794d
institution	OA Journals
issn	1932-6203
volume	9
issue	3
article_number	e71511
doi	10.1371/journal.pone.0071511
title	A topic clustering approach to finding similar questions from large question and answer archives.
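
The description in this record names word mismatch between query and candidate question as the main challenge and topic modeling as the remedy. The following is a minimal sketch of that idea only, not the authors' method: the `WORD_TOPICS` table, the two topics, and all helper names are hypothetical, and in practice the word-to-topic weights would be estimated from the Q&A archive by a topic model such as LDA.

```python
import math
import re

# Hypothetical P(topic | word) weights over two made-up topics
# (index 0: computers, index 1: cooking). Illustrative values only.
WORD_TOPICS = {
    "laptop":   [0.9, 0.1],
    "notebook": [0.8, 0.2],
    "repair":   [0.7, 0.3],
    "fix":      [0.7, 0.3],
    "pasta":    [0.1, 0.9],
    "cook":     [0.1, 0.9],
}


def tokens(text):
    """Lowercase the text and keep only words present in the toy vocabulary."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w in WORD_TOPICS]


def jaccard(a, b):
    """Lexical overlap between the vocabulary words of two questions."""
    sa, sb = set(tokens(a)), set(tokens(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def topic_vector(text):
    """Average the topic weights of the vocabulary words in the text."""
    words = tokens(text)
    k = len(next(iter(WORD_TOPICS.values())))
    if not words:
        return [0.0] * k
    return [sum(WORD_TOPICS[w][i] for w in words) / len(words) for i in range(k)]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def topic_similarity(a, b):
    """Compare two questions in topic space instead of word space."""
    return cosine(topic_vector(a), topic_vector(b))


query = "How to repair a laptop?"
same_need = "How do I fix my notebook?"
other_need = "How to cook pasta?"

print(jaccard(query, same_need))            # no shared vocabulary words
print(topic_similarity(query, same_need))   # close in topic space
print(topic_similarity(query, other_need))  # far apart in topic space
```

In this toy setup the query and the first candidate share no vocabulary words, so a purely lexical measure cannot relate them, while their averaged topic weights point in nearly the same direction.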
