De novo assembly of plasmodium interspersed repeat (pir) genes from Plasmodium vivax RNAseq data suggests geographic conservation of sub-family transcription

Bibliographic Details
Main Authors: Timothy S. Little, Deirdre A. Cunningham, George K. Christophides, Adam James Reid, Jean Langhorne
Format: Article
Language: English
Published: BMC 2025-05-01
Series: BMC Genomics
Subjects: Malaria; Vivax; Transcriptomics; Pir; Multigene
Online Access: https://doi.org/10.1186/s12864-025-11752-1
_version_ 1849688385086554112
author Timothy S. Little
Deirdre A. Cunningham
George K. Christophides
Adam James Reid
Jean Langhorne
author_facet Timothy S. Little
Deirdre A. Cunningham
George K. Christophides
Adam James Reid
Jean Langhorne
author_sort Timothy S. Little
collection DOAJ
description Abstract
Background: The plasmodium interspersed repeats (pir) multigene family is found across malaria parasite genomes and was first discovered in the human-infecting species Plasmodium vivax, where its members were initially named the virs. Their function remains unknown, although studies have suggested a role in virulence of the asexual blood stages. Sub-families of the P. vivax pir/virs have been identified and are found in isolates from across the world; however, their transcription at different localities and in different stages of the life cycle has not been quantified. Multiple transcriptomic studies of the parasite have been conducted, but many map the pir reads to existing reference genomes (as part of standard bioinformatic practice), which may miss members of the multigene family due to its inherent variability. This obscures our understanding of how the pir sub-families in P. vivax may be contributing to human/vector infection.
Results: To overcome the issue of pir diversity hidden by the use of a reference genome, we employed de novo transcriptome assembly to construct the pir ‘reference’ of different parasite isolates from published and novel RNAseq datasets. For this purpose, a pipeline was written in Nextflow and first tested on data from the rodent-infecting P. c. chabaudi parasite to ascertain its efficacy on a sample with a full, genome-based set of pir gene sequences. The pipeline assembled hundreds of pirs from the studies included. By performing BLAST sequence identity comparisons with reference genome pirs (including P. vivax and related species), we found a clustered network of transcripts which corresponded well with prior sub-family annotations, albeit requiring some updated nomenclature. Mapping the RNAseq datasets to the de novo transcriptome references revealed that the transcription of these updated pir gene sub-families is generally consistent across the different geographical regions. From this transcriptional quantification, a time course of mosquito bloodmeals (after feeding on an infected patient) provided the first evidence of ookinete stage pir transcription in a human-infective malaria parasite.
Conclusions: De novo transcriptome assembly is a valuable tool for understanding highly variable multigene families from Plasmodium spp., and with pipeline software it can be applied more easily and at scale. Despite a global distribution, P. vivax has a conserved pir sub-family structure, both in terms of genome copy number and transcription. We suggest that this indicates important roles of the distinct sub-families, or a genetic mechanism maintaining their preservation. Furthermore, a burst of pir transcription in the mosquito stages of development is the first indication of ookinete pir expression for a human-infective malaria parasite, suggesting a role for the gene family at a new stage of the life cycle.
format Article
id doaj-art-b8f994f2e037401fbdf5c8c623bd901d
institution DOAJ
issn 1471-2164
language English
publishDate 2025-05-01
publisher BMC
record_format Article
series BMC Genomics
spelling doaj-art-b8f994f2e037401fbdf5c8c623bd901d
2025-08-20T03:22:01Z
eng
BMC
BMC Genomics
1471-2164
2025-05-01
261117
10.1186/s12864-025-11752-1
De novo assembly of plasmodium interspersed repeat (pir) genes from Plasmodium vivax RNAseq data suggests geographic conservation of sub-family transcription
Timothy S. Little (The Francis Crick Institute)
Deirdre A. Cunningham (The Francis Crick Institute)
George K. Christophides (Department of Life Sciences, Imperial College London)
Adam James Reid (The Gurdon Institute, University of Cambridge)
Jean Langhorne (The Francis Crick Institute)
https://doi.org/10.1186/s12864-025-11752-1
Malaria
Vivax
Transcriptomics
Pir
Multigene
title De novo assembly of plasmodium interspersed repeat (pir) genes from Plasmodium vivax RNAseq data suggests geographic conservation of sub-family transcription
title_full De novo assembly of plasmodium interspersed repeat (pir) genes from Plasmodium vivax RNAseq data suggests geographic conservation of sub-family transcription
title_fullStr De novo assembly of plasmodium interspersed repeat (pir) genes from Plasmodium vivax RNAseq data suggests geographic conservation of sub-family transcription
title_full_unstemmed De novo assembly of plasmodium interspersed repeat (pir) genes from Plasmodium vivax RNAseq data suggests geographic conservation of sub-family transcription
title_short De novo assembly of plasmodium interspersed repeat (pir) genes from Plasmodium vivax RNAseq data suggests geographic conservation of sub-family transcription
title_sort de novo assembly of plasmodium interspersed repeat pir genes from plasmodium vivax rnaseq data suggests geographic conservation of sub family transcription
topic Malaria
Vivax
Transcriptomics
Pir
Multigene
url https://doi.org/10.1186/s12864-025-11752-1