A topic clustering approach to finding similar questions from large question and answer archives.
With the rise of Web 2.0, Community Question Answering (CQA) services such as Yahoo! Answers (http://answers.yahoo.com), WikiAnswer (http://wiki.answers.com), and Baidu Zhidao (http://zhidao.baidu.com) have emerged as alternatives for knowledge and information acquisition. Over time, a la...
| Main Authors: | Wei-Nan Zhang, Ting Liu, Yang Yang, Liujuan Cao, Yu Zhang, Rongrong Ji |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Public Library of Science (PLoS), 2014-01-01 |
| Series: | PLoS ONE |
| Online Access: | https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0071511&type=printable |
| _version_ | 1850189824295698432 |
|---|---|
| author | Wei-Nan Zhang Ting Liu Yang Yang Liujuan Cao Yu Zhang Rongrong Ji |
| author_facet | Wei-Nan Zhang Ting Liu Yang Yang Liujuan Cao Yu Zhang Rongrong Ji |
| author_sort | Wei-Nan Zhang |
| collection | DOAJ |
| description | With the rise of Web 2.0, Community Question Answering (CQA) services such as Yahoo! Answers (http://answers.yahoo.com), WikiAnswer (http://wiki.answers.com), and Baidu Zhidao (http://zhidao.baidu.com) have emerged as alternatives for knowledge and information acquisition. Over time, a large number of high-quality, human-contributed question and answer (Q&A) pairs have accumulated into a comprehensive knowledge base. Unlike search engines, which return long lists of results, CQA services can return correct answers to question queries by automatically finding similar questions that other users have already answered, which greatly improves the efficiency of online information retrieval. However, given a question query, finding similar, well-answered questions is a non-trivial task. The main challenge is the word mismatch between the question query (query) and the candidate question for retrieval (question). To address this problem, we capture the word-level semantic similarity between query and question with a topic modeling approach, and then propose an unsupervised machine-learning approach to finding similar questions in CQA Q&A archives. The experimental results show that the proposed approach significantly outperforms state-of-the-art methods. |
| format | Article |
| id | doaj-art-781b2aff15f74e32b624ca11056c794d |
| institution | OA Journals |
| issn | 1932-6203 |
| language | English |
| publishDate | 2014-01-01 |
| publisher | Public Library of Science (PLoS) |
| record_format | Article |
| series | PLoS ONE |
| spelling | doaj-art-781b2aff15f74e32b624ca11056c794d; 2025-08-20T02:15:30Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2014-01-01; 9; 3; e71511; 10.1371/journal.pone.0071511; A topic clustering approach to finding similar questions from large question and answer archives.; Wei-Nan Zhang; Ting Liu; Yang Yang; Liujuan Cao; Yu Zhang; Rongrong Ji; https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0071511&type=printable |
| title | A topic clustering approach to finding similar questions from large question and answer archives. |
| title_sort | topic clustering approach to finding similar questions from large question and answer archives |
| url | https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0071511&type=printable |