A stepwise guide for pangenome development in crop plants: an alfalfa (Medicago sativa) case study

Bibliographic Details
Main Authors: Harpreet Kaur, Laura M. Shannon, Deborah A. Samac
Format: Article
Language: English
Published: BMC 2024-10-01
Series: BMC Genomics
Subjects: Crop pangenome; Polyploids; Autotetraploid; Alfalfa; Graph-based pangenome
Online Access: https://doi.org/10.1186/s12864-024-10931-w
author Harpreet Kaur
Laura M. Shannon
Deborah A. Samac
collection DOAJ
description Abstract
Background: The concept of pangenomics and the importance of structural variants are gaining recognition within the plant genomics community. Owing to advances in sequencing and computational technology, it has become feasible to sequence the entire genomes of numerous individuals of a single species at a reasonable cost. Pangenomes have been constructed for many major diploid crops, including rice, maize, soybean, sorghum, pearl millet, peas, sunflower, grapes, and mustards. However, pangenomes for polyploid species are relatively scarce and are available for only a few crops, including wheat, cotton, rapeseed, and potatoes.
Main body: In this review, we explore the various methods used in crop pangenome development, discussing the challenges and implications of these techniques based on insights from published pangenome studies. We offer a systematic guide and discuss the tools available for constructing a pangenome and conducting downstream analyses. Alfalfa, a highly heterozygous, cross-pollinated, autotetraploid forage crop species, is used as an example to discuss the concerns and challenges posed by polyploid crop species. We conducted a comparative analysis using linear and graph-based methods by constructing an alfalfa graph pangenome using three publicly available genome assemblies. To illustrate the intricacies captured by pangenome graphs for a complex crop genome, we used five different gene sequences and aligned them against the three graph-based pangenomes. The comparison of the three graph pangenome methods reveals notable differences in the genomic variation captured by each pipeline.
Conclusion: Pangenome resources are proving invaluable by offering insights into core and dispensable genes, novel gene discovery, and genome-wide patterns of variation. Developing user-friendly online portals for linear pangenome visualization has made these resources accessible to the broader scientific and breeding community. However, challenges remain with graph-based pangenomes, including compatibility with other tools, extraction of sequence for regions of interest, and visualization of the genetic variation captured in pangenome graphs. These issues necessitate further refinement of tools and pipelines to effectively address the complexities of polyploid, highly heterozygous, and cross-pollinated species.
format Article
id doaj-art-0180ce902b7e4f3ab8fa7ddcbe0714f4
institution OA Journals
issn 1471-2164
language English
publishDate 2024-10-01
publisher BMC
record_format Article
series BMC Genomics
spelling doaj-art-0180ce902b7e4f3ab8fa7ddcbe0714f4
2025-08-20T02:18:19Z
eng
BMC
BMC Genomics
1471-2164
2024-10-01
25
1
1
33
10.1186/s12864-024-10931-w
A stepwise guide for pangenome development in crop plants: an alfalfa (Medicago sativa) case study
Harpreet Kaur (Department of Horticultural Science, University of Minnesota)
Laura M. Shannon (Department of Horticultural Science, University of Minnesota)
Deborah A. Samac (USDA-ARS, Plant Science Research Unit)
https://doi.org/10.1186/s12864-024-10931-w
title A stepwise guide for pangenome development in crop plants: an alfalfa (Medicago sativa) case study
topic Crop pangenome
Polyploids
Autotetraploid
Alfalfa
Graph-based pangenome
url https://doi.org/10.1186/s12864-024-10931-w