Chromosome-level genome assembly and annotation of Gypsophila vaccaria

Abstract Gypsophila vaccaria Sm., a member of the Caryophyllaceae family, is known for its dry mature seeds, which are widely used in traditional Chinese medicine as “Wang Bu Liu Xing”. This study presents a high-quality, chromosome-scale genome assembly of G. vaccaria, integrating Hi-C technology w...

Bibliographic Details
Main Authors: Chaoqiang Zhang, Jiayin Zhang, Bin Yang, Yunchen Zhao, Liang Yin, Enjun Wang, Yaqiu Zhao, Jinglong Li
Format: Article
Language: English
Published: Nature Portfolio 2025-05-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-025-05121-6
author Chaoqiang Zhang
Jiayin Zhang
Bin Yang
Yunchen Zhao
Liang Yin
Enjun Wang
Yaqiu Zhao
Jinglong Li
collection DOAJ
description Gypsophila vaccaria Sm., a member of the family Caryophyllaceae, is known for its dry mature seeds, which are widely used in traditional Chinese medicine as “Wang Bu Liu Xing”. This study presents a high-quality, chromosome-scale genome assembly of G. vaccaria that integrates Hi-C technology with PacBio and Illumina sequencing data. The final assembly is 1.09 Gb in total length, with a contig N50 of 9.73 Mb and a scaffold N50 of 73.3 Mb. Benchmarking Universal Single-Copy Orthologs (BUSCO) completeness was 95.9% in genome mode and 94.9% in protein mode. Notably, 99.93% of the sequences are anchored to 15 pseudo-chromosomes. A total of 21,795 protein-coding genes were predicted, and repetitive elements constitute 80.43% of the assembled genome. This chromosome-level assembly provides a resource for future research, including functional genomics and molecular breeding of G. vaccaria.
format Article
id doaj-art-8010643dc51442d9b67ebb2667df6d12
institution DOAJ
issn 2052-4463
language English
publishDate 2025-05-01
publisher Nature Portfolio
record_format Article
series Scientific Data
spelling doaj-art-8010643dc51442d9b67ebb2667df6d12 2025-08-20T03:08:22Z eng Nature Portfolio Scientific Data 2052-4463 2025-05-01 12119 10.1038/s41597-025-05121-6
Chromosome-level genome assembly and annotation of Gypsophila vaccaria
Chaoqiang Zhang (0), Jiayin Zhang (1), Bin Yang (2), Yunchen Zhao (3), Liang Yin (4), Enjun Wang (5), Yaqiu Zhao (6), Jinglong Li (7)
0 College of Life Sciences and Engineering, Key Laboratory of Hexi Corridor Resources Utilization of Gansu, Hexi University
1 Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, Institute of Biodiversity Science, School of Life Sciences, Fudan University
2 College of Life Sciences and Engineering, Key Laboratory of Hexi Corridor Resources Utilization of Gansu, Hexi University
3 College of Agriculture and Ecological Engineering, Hexi University
4 College of Agriculture and Ecological Engineering, Hexi University
5 College of Agriculture and Ecological Engineering, Hexi University
6 State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences
7 State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Key Laboratory of Herbage and Endemic Crop Biology, Ministry of Education, School of Life Sciences, Inner Mongolia University
https://doi.org/10.1038/s41597-025-05121-6
title Chromosome-level genome assembly and annotation of Gypsophila vaccaria
url https://doi.org/10.1038/s41597-025-05121-6