ScBlkCom: An Integrated Compression Algorithm for Single-Cell RNA Sequencing Data
High-throughput sequencing advancements have shifted genomic project bottlenecks from data generation to computational storage and analysis. Single-cell RNA-seq (scRNA-seq) data exhibits unique structural features, including extensive labeled sequence identifiers, which conventional compression tool...
| Main Authors: | Zhang Yuanxin, Lu Yang, Sun Xiao, Fan Jue |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | EDP Sciences, 2025-01-01 |
| Series: | BIO Web of Conferences |
| Online Access: | https://www.bio-conferences.org/articles/bioconf/pdf/2025/25/bioconf_icbb2025_03016.pdf |

| author | Zhang Yuanxin, Lu Yang, Sun Xiao, Fan Jue |
|---|---|
| affiliations | Zhang Yuanxin, Sun Xiao: School of Biological Science and Medical Engineering, Southeast University. Lu Yang, Fan Jue: Xinge Yuan Biotechnology Co., Ltd. |
| description | High-throughput sequencing advancements have shifted genomic project bottlenecks from data generation to computational storage and analysis. Single-cell RNA-seq (scRNA-seq) data exhibits unique structural features, including extensive labeled sequence identifiers, which conventional compression tools fail to optimize. This study proposes ScBlkCom, a specialized compression scheme for scRNA-seq data. The method partitions sequencing data into distinct blocks and applies tailored compression strategies: differential encoding for numerical attributes, Huffman coding for categorical labels, and context-adaptive encoding for sequence identifiers. Experiments demonstrate ScBlkCom achieves 84.29% higher compression gain compared to single-module approaches and outperforms generic tools (e.g., GZIP, BZIP2) by 6.44% in compression ratio, while maintaining stable processing speeds. This block-wise adaptive framework effectively addresses scRNA-seq data redundancy, offering enhanced storage efficiency for large-scale single-cell studies. |
| volume | 174 |
| article_number | 03016 |
| doi | 10.1051/bioconf/202517403016 |
| issn | 2117-4458 |
| collection | DOAJ |
| institution | OA Journals |
| id | doaj-art-568aef49223f4860b0be845a94a7d5ff |
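
The abstract names two standard techniques, differential encoding for numerical attributes and Huffman coding for categorical labels. The sketch below illustrates only those two ideas on toy data. It is not the authors' implementation: the function names, the sample records, and the way fields are split are all hypothetical, and the context-adaptive encoding of sequence identifiers is left out because the record does not describe it.

```python
import heapq
from collections import Counter
from itertools import count


def delta_encode(values):
    """Keep the first value, then store each value as its difference from the previous one."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Invert delta_encode with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:  # a lone symbol still needs a one-bit code
        return {next(iter(freq)): "0"}
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(n, next(tie), {sym: ""}) for sym, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        # Merge the two rarest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (n1 + n2, next(tie), merged))
    return heap[0][2]


def huffman_encode(symbols, code):
    return "".join(code[s] for s in symbols)


def huffman_decode(bits, code):
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free, so the first match is the symbol
            out.append(inverse[current])
            current = ""
    return out


# Hypothetical toy block: one numeric field and one categorical field.
positions = [1000, 1004, 1009, 1009, 1020]
labels = ["T", "T", "B", "T", "NK"]

deltas = delta_encode(positions)
code = huffman_code(labels)
bits = huffman_encode(labels, code)

print(deltas)     # [1000, 4, 5, 0, 11]
print(len(bits))  # 7 bits for five labels
```

Splitting the fields first matters because each encoder exploits a different kind of redundancy: sorted or slowly changing numbers yield small deltas, while a skewed label distribution lets the most frequent label take the shortest code.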