Detection of circular permutations by Protein Language Models

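Protein circular permutations are crucial for understanding protein evolution and functionality. Traditional detection methods face challenges: sequence-based approaches struggle to detect distant homologs, while structure-based approaches are limited by the need for structure generation and often treat proteins as rigid bodies. Protein Language Model-based alignment tools use sequence information to detect distant homologs without requiring structural input. However, many current Protein Language Model-based alignment methods rely on sequence alignment algorithms such as the Smith-Waterman algorithm, and their dependency on linear sequence order makes them unsuitable for accurately detecting circular permutation (CP). Our approach, named plmCP, combines classical genetic principles with modern alignment techniques that leverage Protein Language Models to address these limitations. By integrating genetic knowledge, the plmCP method avoids the sequence order dependency, allowing effective detection of circular permutations and supporting protein research and engineering by embracing structural flexibility.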
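
As a rough illustration of why circular permutations are awkward for order-dependent comparison, the following Python sketch rotates a toy sequence and compares it with the original. It is not the plmCP method: the helper names (rotate, longest_common_substring, find_cp_offset) and the toy sequence are assumptions made for this example, and exact string matching stands in for the similarity scoring a real aligner would use. An order-preserving comparison recovers only one of the two swapped segments, whereas searching a doubled copy of the sequence recovers the rotation point.

```python
from typing import Optional


def rotate(seq: str, k: int) -> str:
    """Circularly permute seq so that position k becomes the new start."""
    if not seq:
        return seq
    k %= len(seq)
    return seq[k:] + seq[:k]


def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest contiguous run shared by a and b.

    A toy stand-in for an order-dependent local comparison: it can only
    follow a single diagonal, so it never joins the two swapped segments.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0] * (len(b) + 1)
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def find_cp_offset(a: str, b: str) -> Optional[int]:
    """Return k with rotate(a, k) == b, or None if b is not a rotation of a.

    Uses the doubling trick: every rotation of a is a substring of a + a.
    Exact matching only; diverged homologs would need a similarity score.
    """
    if len(a) != len(b):
        return None
    idx = (a + a).find(b)
    return idx if idx != -1 else None


# Toy sequence (hypothetical): the 20 standard residues, each used once.
seq = "ACDEFGHIKLMNPQRSTVWY"
permuted = rotate(seq, 8)
linear_match = longest_common_substring(seq, permuted)
offset = find_cp_offset(seq, permuted)
```

Here the linear comparison covers only the longer of the two segments, while the doubled-sequence search reports the cut point. The doubling trick is a generic device for exact rotations and is shown only to make the order-dependency problem concrete.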

Bibliographic Details
Main Authors: Yue Hu, Bin Huang, Chun Zi Zang, Jia Jie Xu
Format: Article
Language: English
Published: Elsevier 2025-01-01
Series: Computational and Structural Biotechnology Journal
Subjects: Circular Permutation; Protein Language Models; Protein Structure Alignment
Online Access: http://www.sciencedirect.com/science/article/pii/S2001037024004525
collection DOAJ
description Protein circular permutations are crucial for understanding protein evolution and functionality. Traditional detection methods face challenges: sequence-based approaches struggle with detecting distant homologs, while structure-based approaches are limited by the need for structure generation and often treat proteins as rigid bodies. Protein Language Model-based alignment tools have shown advantages in utilizing sequence information to overcome the challenges of detecting distant homologs without requiring structural input. However, many current Protein Language Model-based alignment methods, which rely on sequence alignment algorithms like the Smith-Waterman algorithm, face significant difficulties when dealing with circular permutation (CP) due to their dependency on linear sequence order. This sequence order dependency makes them unsuitable for accurately detecting CP. Our approach, named plmCP, combines classical genetic principles with modern alignment techniques leveraging Protein Language Models to address these limitations. By integrating genetic knowledge, the plmCP method avoids the sequence order dependency, allowing for effective detection of circular permutations and contributing significantly to protein research and engineering by embracing structural flexibility.
id doaj-art-8ffc9437e5004964bd3da06d79d4a984
institution Kabale University
issn 2001-0370
spelling doaj-art-8ffc9437e5004964bd3da06d79d4a984, 2025-01-06T04:08:36Z, eng, Elsevier, Computational and Structural Biotechnology Journal, ISSN 2001-0370, 2025-01-01, vol. 27, pp. 214-220
Detection of circular permutations by Protein Language Models
Yue Hu (0), Bin Huang (1), Chun Zi Zang (2), Jia Jie Xu (3)
(0) School of Bioengineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong 250300, China; Kyiv College, Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong 250300, China; Corresponding author at: School of Bioengineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong 250300, China
(1) School of Life Sciences, Yunnan Normal University, Kunming, Yunnan 650500, China; Corresponding author.
(2) Kyiv College, Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong 250300, China
(3) School of Bioengineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong 250300, China
http://www.sciencedirect.com/science/article/pii/S2001037024004525
Circular Permutation; Protein Language Models; Protein Structure Alignment
topic Circular Permutation; Protein Language Models; Protein Structure Alignment