Novel kernel function for computing the similarity of text

To enhance the performance of detecting similar documents, a novel kernel function named the S_Wang kernel was constructed. Based on the practical requirements of computing text similarity, the S_Wang kernel was built with consideration of the Euclidean distance and angle between vectors that represented t...

Bibliographic Details
Main Authors: Xiu-hong WANG, Shi-guang JU
Format: Article
Language: zho
Published: Editorial Department of Journal on Communications 2012-12-01
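Series: Tongxin xuebao
Subjects: information retrieval; text similarity; kernel function; S_Wang kernel; LSK; Cauchy kernel; CLA kernel
Online Access: http://www.joconline.com.cn/zh/article/doi/10.3969/j.issn.1000-436x.2012.12.006/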
_version_ 1841539862004498432
author Xiu-hong WANG
Shi-guang JU
author_facet Xiu-hong WANG
Shi-guang JU
author_sort Xiu-hong WANG
collection DOAJ
description To enhance the performance of detecting similar documents, a novel kernel function named the S_Wang kernel was constructed. Based on the practical requirements of computing text similarity, the S_Wang kernel was built with consideration of both the Euclidean distance and the angle between the vectors representing the text documents to be compared. It was proved, according to Mercer's theorem, that the function is a valid kernel function. Experimental verification of the performance of the kernels in text document similarity calculation was provided. The results show that the S_Wang kernel achieves significantly better precision and F1 performance than other kernels such as the Cauchy kernel, the Latent Semantic Kernel (LSK), and the CLA kernel. The S_Wang kernel is suitable for text similarity computation.
format Article
id doaj-art-319536049aa44f8a95594f0b9c62fb88
institution Kabale University
issn 1000-436X
language zho
publishDate 2012-12-01
publisher Editorial Department of Journal on Communications
record_format Article
series Tongxin xuebao
title Novel kernel function for computing the similarity of text
topic information retrieval
text similarity
kernel function
S_Wang kernel
LSK
Cauchy kernel
CLA kernel
url http://www.joconline.com.cn/zh/article/doi/10.3969/j.issn.1000-436x.2012.12.006/
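
The abstract describes a kernel that takes both the Euclidean distance and the angle between document vectors into account and is shown to satisfy Mercer's theorem. This record does not give the S_Wang formula, so the sketch below does not reproduce it. As a stand-in, and purely as an assumption for illustration, it multiplies a cosine kernel (angle) by a Cauchy kernel (distance); the product of two valid kernels is again a valid kernel. The function names, the `sigma` parameter, and the toy term-weight vectors are all hypothetical.

```python
import math
import random


def cauchy_kernel(x, y, sigma=1.0):
    """Cauchy kernel: decays with the squared Euclidean distance."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1.0 / (1.0 + d2 / sigma ** 2)


def cosine_kernel(x, y):
    """Cosine of the angle between x and y (0 if either vector is zero)."""
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    if nx == 0.0 or ny == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(x, y)) / (nx * ny)


def distance_angle_kernel(x, y, sigma=1.0):
    """Illustrative stand-in, NOT the S_Wang kernel: angle term times distance term."""
    return cosine_kernel(x, y) * cauchy_kernel(x, y, sigma)


def gram_matrix(vectors, kernel):
    """Pairwise kernel values for a list of document vectors."""
    return [[kernel(u, v) for v in vectors] for u in vectors]


def quadratic_form(K, c):
    """Compute sum_ij c_i c_j K_ij, which Mercer's condition requires to be >= 0."""
    n = len(K)
    return sum(c[i] * c[j] * K[i][j] for i in range(n) for j in range(n))


# Hypothetical term-weight vectors for three short documents.
docs = [[2.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 3.0]]
K = gram_matrix(docs, distance_angle_kernel)

# Spot check of positive semi-definiteness on random coefficient vectors.
rng = random.Random(0)
min_form = min(
    quadratic_form(K, [rng.uniform(-1.0, 1.0) for _ in docs])
    for _ in range(1000)
)
print("smallest quadratic form non-negative:", min_form >= -1e-9)
```

The random spot check is only a numerical sanity test on one Gram matrix, not a proof; the paper's own argument via Mercer's theorem is what establishes validity for the S_Wang kernel.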