Lattice protein folding with variational annealing

Understanding the principles of protein folding is a cornerstone of computational biology, with implications for drug design, bioengineering, and the understanding of fundamental biological processes. Lattice protein folding models offer a simplified yet powerful framework for studying the complexit...

Bibliographic Details
Main Authors: Shoummo A Khandoker, Estelle M Inack, Mohamed Hibat-Allah
Format: Article
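Language: English
Published: IOP Publishing 2025-01-01
Series: Machine Learning: Science and Technology
Subjects: lattice protein folding; constrained combinatorial optimization; recurrent neural networks; autoregressive models; variational annealing
Online Access: https://doi.org/10.1088/2632-2153/adf376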
author Shoummo A Khandoker
Estelle M Inack
Mohamed Hibat-Allah
collection DOAJ
description Understanding the principles of protein folding is a cornerstone of computational biology, with implications for drug design, bioengineering, and the understanding of fundamental biological processes. Lattice protein folding models offer a simplified yet powerful framework for studying the complexities of protein folding, enabling the exploration of energetically optimal folds under constrained conditions. However, finding these optimal folds is a computationally challenging combinatorial optimization problem. In this work, we introduce a novel upper-bound training scheme that employs masking to identify the lowest-energy folds in two-dimensional hydrophobic-polar lattice protein folding. By leveraging dilated recurrent neural networks (RNNs) integrated with an annealing process driven by temperature-like fluctuations, our method accurately predicts optimal folds for benchmark systems of up to 60 beads. Our approach also effectively masks invalid folds from being sampled without compromising the autoregressive sampling properties of RNNs. This scheme is generalizable to three spatial dimensions and can be extended to lattice protein models with larger alphabets. Our findings emphasize the potential of advanced machine learning techniques in tackling complex protein folding problems and a broader class of constrained combinatorial optimization challenges.
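As a rough illustration of the masking idea in the description above, the following Python sketch grows a two-dimensional hydrophobic-polar (HP) fold one move at a time, masks out moves that would revisit an occupied lattice site, and scores the result by counting non-bonded H-H contacts. It is not the authors' implementation: a uniform random choice stands in for the RNN's conditional distribution, no annealing is performed, the energy convention of -1 per H-H contact is the standard HP-model choice and is assumed here, and all names (`valid_move_mask`, `sample_fold`, `hp_energy`) are hypothetical.

```python
import random

# Four lattice moves in 2D: right, up, left, down.
MOVES = [(1, 0), (0, 1), (-1, 0), (0, -1)]


def valid_move_mask(pos, occupied):
    """Return one boolean per move: True if the target site is still free."""
    x, y = pos
    return [(x + dx, y + dy) not in occupied for dx, dy in MOVES]


def sample_fold(n_beads, rng):
    """Place beads one at a time, sampling only from unmasked moves.

    Returns a list of coordinates, or None if the walk gets trapped
    (all four neighbours occupied); this simple one-step mask does not
    prevent that case.
    """
    coords = [(0, 0)]
    occupied = {(0, 0)}
    for _ in range(n_beads - 1):
        mask = valid_move_mask(coords[-1], occupied)
        allowed = [m for m, ok in zip(MOVES, mask) if ok]
        if not allowed:
            return None
        dx, dy = rng.choice(allowed)  # assumption: stand-in for a learned distribution
        nxt = (coords[-1][0] + dx, coords[-1][1] + dy)
        coords.append(nxt)
        occupied.add(nxt)
    return coords


def hp_energy(sequence, coords):
    """Energy = -(number of H-H lattice neighbours not adjacent in the chain)."""
    index = {c: i for i, c in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy
```

For example, folding the four-bead sequence `"HPPH"` around a unit square, `[(0, 0), (1, 0), (1, 1), (0, 1)]`, brings the two H beads into contact, so `hp_energy` returns -1. Because each move is drawn only from the unmasked options given the moves already made, the sampling stays sequential, which is the property the description says the masking preserves.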
format Article
id doaj-art-96a8df40809f411dad78cbdea73a71bd
institution Kabale University
issn 2632-2153
language English
publishDate 2025-01-01
publisher IOP Publishing
record_format Article
series Machine Learning: Science and Technology
spelling doaj-art-96a8df40809f411dad78cbdea73a71bd
2025-08-20T03:39:11Z
eng
IOP Publishing
Machine Learning: Science and Technology
2632-2153
2025-01-01
Volume 6, issue 3, article 035023
10.1088/2632-2153/adf376
Lattice protein folding with variational annealing
Shoummo A Khandoker (https://orcid.org/0000-0001-7367-4485): Department of Computer Science, Indiana University Bloomington, Bloomington, IN 47405, United States of America
Estelle M Inack (https://orcid.org/0000-0002-4672-5512): Perimeter Institute for Theoretical Physics, Waterloo, Ontario, Canada; yiyaniQ, Toronto, Ontario, Canada; Department of Physics and Astronomy, University of Waterloo, Waterloo, Ontario, Canada
Mohamed Hibat-Allah (https://orcid.org/0000-0002-5298-8589): Perimeter Institute for Theoretical Physics, Waterloo, Ontario, Canada; Department of Applied Mathematics, University of Waterloo, Waterloo, Ontario, Canada; Vector Institute, Toronto, Ontario, Canada
https://doi.org/10.1088/2632-2153/adf376
lattice protein folding
constrained combinatorial optimization
recurrent neural networks
autoregressive models
variational annealing
title Lattice protein folding with variational annealing
topic lattice protein folding
constrained combinatorial optimization
recurrent neural networks
autoregressive models
variational annealing
url https://doi.org/10.1088/2632-2153/adf376