HitIct: Chinese corpus for the evaluation of lossless compression algorithms

HitIct, a Chinese corpus for the evaluation of lossless compression algorithms based on ANSI code, is proposed. In accordance with the principles of application representativeness, complementarity, and openness, a large number of candidate files were obtained from the Internet, and t...

Bibliographic Details
Main Authors: CHANG Wei-ling, YUN Xiao-chun, FANG Bin-xing, WANG Shu-peng
Format: Article
Language: Chinese (zho)
Published: Editorial Department of Journal on Communications 2009-01-01
Series: Tongxin xuebao
Subjects: data compression; corpus; lossless compression
Online Access: http://www.joconline.com.cn/zh/article/74651782/
author CHANG Wei-ling
YUN Xiao-chun
FANG Bin-xing
WANG Shu-peng
collection DOAJ
description HitIct, a Chinese corpus for the evaluation of lossless compression algorithms based on ANSI code, is proposed. In accordance with the principles of application representativeness, complementarity, and openness, a large number of candidate files were obtained from the Internet. Average compression ratio, average correlation coefficient, compression ratio correlation coefficient, and standard deviation were then used to select the files that most accurately indicate the overall performance of compression algorithms. Experimental results show that the collection is representative and stable, and that it can serve as a supplementary test set alongside the main benchmarks for comparing compression methods.
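The selection statistics named in the description can be illustrated with a minimal Python sketch. This is not the authors' procedure: the record does not give their formulas, compressor set, or thresholds. The three standard-library compressors, the function names, the definition of ratio as compressed size over original size, and the use of Pearson correlation against the candidate-set average are all assumptions made for illustration.

```python
import bz2
import lzma
import zlib
from statistics import mean, pstdev

# Hypothetical compressor set; the record does not say which algorithms were used.
COMPRESSORS = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def compression_ratios(data: bytes) -> list[float]:
    """Compressed size / original size per compressor (assumes non-empty data)."""
    return [len(fn(data)) / len(data) for fn in COMPRESSORS.values()]


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def score_candidates(files: dict[str, bytes]) -> dict[str, dict[str, float]]:
    """Per-file statistics measured against the candidate-set average profile."""
    ratios = {name: compression_ratios(data) for name, data in files.items()}
    # Average ratio of each compressor over all candidate files.
    profile = [mean(column) for column in zip(*ratios.values())]
    return {
        name: {
            "avg_ratio": mean(r),
            "ratio_std": pstdev(r),
            "corr_with_profile": pearson(r, profile),
        }
        for name, r in ratios.items()
    }
```

As a usage example, `score_candidates({"a.txt": "压缩算法测试语料".encode("gbk") * 200, "b.bin": bytes(range(256)) * 40})` returns one statistics dictionary per file. Encoding the sample text as GBK is itself an assumption about what "ANSI code" means for Chinese text here. Under this sketch, files whose ratio pattern correlates strongly with the average profile would be the more representative candidates.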
format Article
id doaj-art-695d72bf03454bada5a333caa7ad6cb4
institution Kabale University
issn 1000-436X
language zho
publishDate 2009-01-01
publisher Editorial Department of Journal on Communications
record_format Article
series Tongxin xuebao
title HitIct: Chinese corpus for the evaluation of lossless compression algorithms
topic data compression
corpus
lossless compression
url http://www.joconline.com.cn/zh/article/74651782/