Pytrf: a python package for finding tandem repeats from genomic sequences
Abstract Background Tandem repeats (TRs) are major sources of genetic variation and important genetic markers. Their expansions are not only involved in gene expression regulation but also associated with many nervous system diseases and cancers. However, there is a lack of an efficient tandem repea...
Saved in:
| Main Authors: | , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: |
BMC
2025-06-01
|
| Series: | BMC Bioinformatics |
| Subjects: | |
| Online Access: | https://doi.org/10.1186/s12859-025-06168-3 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Abstract Background Tandem repeats (TRs) are major sources of genetic variation and important genetic markers. Their expansions are not only involved in gene expression regulation but also associated with many nervous system diseases and cancers. However, there is a lack of an efficient tandem repeat identification tool for seamless integration with larger bioinformatics programs developed with the popular Python language. Results We introduce pytrf, a Python package for identification of both exact and approximate TRs from genomic sequences. It allows seamless embedding into other programs developed by Python or using in Python interactive environment and Jupyter notebooks. It also provides command line tools for assisting users to find tandem repeats from FASTA/Q files. Compared to other tools, the pytrf shows the highest performance in aspect of running time with comparable peak memory usage. Conclusions Pytrf provides simple interfaces and command line tools to facilitate identification of tandem repeats from genomic sequences. Pytrf can easily be installed from PyPI ( https://pypi.org/project/pytrf ) and the source code is freely available at https://github.com/lmdu/pytrf . |
|---|---|
| ISSN: | 1471-2105 |