An alignment-free method for phylogeny estimation using maximum likelihood

Abstract. Background: While alignment has traditionally been the primary approach for establishing homology prior to phylogenetic inference, alignment-free methods offer a simplified alternative, particularly beneficial when handling genome-wide data involving long sequences and complex events such as...

Bibliographic Details
Main Authors: Tasfia Zahin, Md. Hasin Abrar, Mizanur Rahman Jewel, Tahrina Tasnim, Md. Shamsuzzoha Bayzid, Atif Rahman
Format: Article
Language: English
Published: BMC 2025-03-01
Series: BMC Bioinformatics
Subjects: Phylogenetics, Alignment-free, k-mer, Likelihood
Online Access: https://doi.org/10.1186/s12859-025-06080-w
author Tasfia Zahin
Md. Hasin Abrar
Mizanur Rahman Jewel
Tahrina Tasnim
Md. Shamsuzzoha Bayzid
Atif Rahman
collection DOAJ
description Abstract. Background: While alignment has traditionally been the primary approach for establishing homology prior to phylogenetic inference, alignment-free methods offer a simplified alternative, particularly beneficial when handling genome-wide data involving long sequences and complex events such as rearrangements. Moreover, alignment-free methods become crucial for data types like genome skims, where assembly is impractical. However, despite these benefits, alignment-free techniques have not gained widespread acceptance since they lack the accuracy of alignment-based techniques, primarily due to their reliance on simplified models of pairwise distance calculation. Results: Here, we present a likelihood-based alignment-free technique for phylogenetic tree construction. We encode the presence or absence of k-mers in genome sequences in a binary matrix, and estimate phylogenetic trees using a maximum likelihood approach. A likelihood-based alignment-free method for phylogeny estimation is implemented for the first time in software named Peafowl, which is available at https://github.com/hasin-abrar/Peafowl-repo. We analyze the performance of our method on seven real datasets and compare the results with state-of-the-art alignment-free methods. Conclusions: Results suggest that our method is competitive with existing alignment-free tools. This indicates that maximum-likelihood-based alignment-free methods may in the future be refined to outperform alignment-free methods relying on distance calculation, as has been the case in the alignment-based setting.
format Article
id doaj-art-138a6bd42e3d4f599051503ebc5f6571
institution DOAJ
issn 1471-2105
language English
publishDate 2025-03-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj-art-138a6bd42e3d4f599051503ebc5f6571
2025-08-20T03:05:44Z
eng
BMC
BMC Bioinformatics
1471-2105
2025-03-01
26
1
1
14
10.1186/s12859-025-06080-w
An alignment-free method for phylogeny estimation using maximum likelihood
Tasfia Zahin, Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology
Md. Hasin Abrar, Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology
Mizanur Rahman Jewel, Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology
Tahrina Tasnim, Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology
Md. Shamsuzzoha Bayzid, Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology
Atif Rahman, Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology
https://doi.org/10.1186/s12859-025-06080-w
Phylogenetics
Alignment-free
k-mer
Likelihood
title An alignment-free method for phylogeny estimation using maximum likelihood
topic Phylogenetics
Alignment-free
k-mer
Likelihood
url https://doi.org/10.1186/s12859-025-06080-w
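The encoding step summarized in the abstract (presence or absence of k-mers across genomes, stored in a binary matrix) can be illustrated with a small sketch. This is not Peafowl's implementation: the function names, the toy sequences, the choice of k, and the decision to count forward-strand k-mers only are all assumptions made for illustration. The abstract does not describe how k is chosen, how reverse complements are handled, or how the likelihood step is carried out.

```python
from typing import Dict, List, Set, Tuple


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of distinct k-mers in seq (forward strand only, an assumption)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def presence_absence_matrix(
    genomes: Dict[str, str], k: int
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Build a taxa-by-k-mer binary matrix (hypothetical helper).

    Row i corresponds to taxa[i]; column j corresponds to all_kmers[j].
    An entry is 1 if the k-mer occurs in that genome, otherwise 0.
    """
    per_taxon = {name: kmers(seq, k) for name, seq in genomes.items()}
    taxa = sorted(per_taxon)
    all_kmers = sorted(set().union(*per_taxon.values()))
    matrix = [
        [1 if km in per_taxon[taxon] else 0 for km in all_kmers]
        for taxon in taxa
    ]
    return taxa, all_kmers, matrix


if __name__ == "__main__":
    # Toy sequences, invented for this example.
    toy = {"taxonA": "ACGT", "taxonB": "ACGA"}
    taxa, cols, m = presence_absence_matrix(toy, k=3)
    print(cols)
    for name, row in zip(taxa, m):
        print(name, "".join(map(str, row)))
```

Each row of the matrix can be read as a binary character sequence for one taxon, which is the kind of input a maximum likelihood tree search over two-state characters would take. The tree estimation itself is outside the scope of this sketch.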