MC64-ClustalWP2: a highly-parallel hybrid strategy to align multiple sequences in many-core architectures.

Bibliographic Details
Main Authors: David Díaz, Francisco J Esteban, Pilar Hernández, Juan Antonio Caballero, Antonio Guevara, Gabriel Dorado, Sergio Gálvez
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2014-01-01
Series: PLoS ONE
Online Access: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0094044&type=printable
Description: We have developed MC64-ClustalWP2, a new implementation of the Clustal W algorithm that integrates a novel parallelization strategy and significantly increases performance when aligning long sequences on many-core architectures. In such a process, detailed analysis of both the software and the hardware features and peculiarities is of paramount importance to reveal the key points at which the full potential of parallelism in many-core CPU systems can be exploited and optimized. The new parallelization approach focuses on the most time-consuming stages of the algorithm. In particular, the performance of the so-called progressive alignment has improved drastically, owing to a fine-grained approach in which the forward and backward loops were unrolled and parallelized. Another key approach has been the implementation of the new algorithm on a hybrid computing system that integrates an Intel Xeon multi-core CPU and a Tilera Tile64 many-core card. A comparison with other Clustal W implementations reveals the high performance of the new algorithm and strategy on many-core CPU architectures, in a scenario where the sequences to align are relatively long (more than 10 kb) and, hence, many-core GPU hardware cannot be used. Thus, MC64-ClustalWP2 runs multiple alignments more than 18x faster than the original Clustal W algorithm and more than 7x faster than the best x86 parallel implementation to date, and it is publicly available through a web service.
These developments have also been deployed on cost-effective personal computers and should be useful to life-science researchers. Applications include the identification of identities and differences for mutation/polymorphism analyses; biodiversity and evolutionary studies; and the development of molecular markers for paternity testing, germplasm management and protection, breeding assistance, illegal traffic control, fraud prevention and the protection of intellectual property (identification/traceability), including the protected designation of origin, among others.
ISSN: 1932-6203
Collection: DOAJ
Citation: PLoS ONE 9(4): e94044 (2014-01-01). DOI: 10.1371/journal.pone.0094044