Active learning-assisted directed evolution

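Abstract Directed evolution (DE) is a powerful tool to optimize protein fitness for a specific application. However, DE can be inefficient when mutations exhibit non-additive, or epistatic, behavior. Here, we present Active Learning-assisted Directed Evolution (ALDE), an iterative machine learning-assisted DE workflow that leverages uncertainty quantification to explore the search space of proteins more efficiently than current DE methods. We apply ALDE to an engineering landscape that is challenging for DE: optimization of five epistatic residues in the active site of an enzyme. In three rounds of wet-lab experimentation, we improve the yield of a desired product of a non-native cyclopropanation reaction from 12% to 93%. We also perform computational simulations on existing protein sequence-fitness datasets to support our argument that ALDE can be more effective than DE. Overall, ALDE is a practical and broadly applicable strategy to unlock improved protein engineering outcomes.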

Bibliographic Details
Main Authors:	Jason Yang, Ravi G. Lal, James C. Bowden, Raul Astudillo, Mikhail A. Hameedi, Sukhvinder Kaur, Matthew Hill, Yisong Yue, Frances H. Arnold
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-01-01
Series:	Nature Communications
Online Access:	https://doi.org/10.1038/s41467-025-55987-8

_version_	1832594542937767936
author	Jason Yang, Ravi G. Lal, James C. Bowden, Raul Astudillo, Mikhail A. Hameedi, Sukhvinder Kaur, Matthew Hill, Yisong Yue, Frances H. Arnold
author_facet	Jason Yang, Ravi G. Lal, James C. Bowden, Raul Astudillo, Mikhail A. Hameedi, Sukhvinder Kaur, Matthew Hill, Yisong Yue, Frances H. Arnold
author_sort	Jason Yang
collection	DOAJ
description	Abstract Directed evolution (DE) is a powerful tool to optimize protein fitness for a specific application. However, DE can be inefficient when mutations exhibit non-additive, or epistatic, behavior. Here, we present Active Learning-assisted Directed Evolution (ALDE), an iterative machine learning-assisted DE workflow that leverages uncertainty quantification to explore the search space of proteins more efficiently than current DE methods. We apply ALDE to an engineering landscape that is challenging for DE: optimization of five epistatic residues in the active site of an enzyme. In three rounds of wet-lab experimentation, we improve the yield of a desired product of a non-native cyclopropanation reaction from 12% to 93%. We also perform computational simulations on existing protein sequence-fitness datasets to support our argument that ALDE can be more effective than DE. Overall, ALDE is a practical and broadly applicable strategy to unlock improved protein engineering outcomes.
format	Article
id	doaj-art-42793e27879147558f1a8afcca84d70e
institution	Kabale University
issn	2041-1723
language	English
publishDate	2025-01-01
publisher	Nature Portfolio
record_format	Article
series	Nature Communications
spelling	doaj-art-42793e27879147558f1a8afcca84d70e | 2025-01-19T12:31:36Z | eng | Nature Portfolio | Nature Communications | 2041-1723 | 2025-01-01 | 161112 | 10.1038/s41467-025-55987-8 | Active learning-assisted directed evolution | Authors: Jason Yang [0], Ravi G. Lal [1], James C. Bowden [2], Raul Astudillo [3], Mikhail A. Hameedi [4], Sukhvinder Kaur [5], Matthew Hill [6], Yisong Yue [7], Frances H. Arnold [8] | Affiliations: [0] Division of Chemistry and Chemical Engineering, California Institute of Technology; [1] Division of Chemistry and Chemical Engineering, California Institute of Technology; [2] Division of Engineering and Applied Sciences, California Institute of Technology; [3] Division of Engineering and Applied Sciences, California Institute of Technology; [4] Division of Biology and Biological Engineering, California Institute of Technology; [5] Elegen Corp; [6] Elegen Corp; [7] Division of Engineering and Applied Sciences, California Institute of Technology; [8] Division of Chemistry and Chemical Engineering, California Institute of Technology | Abstract Directed evolution (DE) is a powerful tool to optimize protein fitness for a specific application. However, DE can be inefficient when mutations exhibit non-additive, or epistatic, behavior. Here, we present Active Learning-assisted Directed Evolution (ALDE), an iterative machine learning-assisted DE workflow that leverages uncertainty quantification to explore the search space of proteins more efficiently than current DE methods. We apply ALDE to an engineering landscape that is challenging for DE: optimization of five epistatic residues in the active site of an enzyme. In three rounds of wet-lab experimentation, we improve the yield of a desired product of a non-native cyclopropanation reaction from 12% to 93%. We also perform computational simulations on existing protein sequence-fitness datasets to support our argument that ALDE can be more effective than DE. Overall, ALDE is a practical and broadly applicable strategy to unlock improved protein engineering outcomes. | https://doi.org/10.1038/s41467-025-55987-8
spellingShingle	Jason Yang, Ravi G. Lal, James C. Bowden, Raul Astudillo, Mikhail A. Hameedi, Sukhvinder Kaur, Matthew Hill, Yisong Yue, Frances H. Arnold | Active learning-assisted directed evolution | Nature Communications
title	Active learning-assisted directed evolution
title_full	Active learning-assisted directed evolution
title_fullStr	Active learning-assisted directed evolution
title_full_unstemmed	Active learning-assisted directed evolution
title_short	Active learning-assisted directed evolution
title_sort	active learning assisted directed evolution
url	https://doi.org/10.1038/s41467-025-55987-8
work_keys_str_mv	AT jasonyang activelearningassisteddirectedevolution AT raviglal activelearningassisteddirectedevolution AT jamescbowden activelearningassisteddirectedevolution AT raulastudillo activelearningassisteddirectedevolution AT mikhailahameedi activelearningassisteddirectedevolution AT sukhvinderkaur activelearningassisteddirectedevolution AT matthewhill activelearningassisteddirectedevolution AT yisongyue activelearningassisteddirectedevolution AT francesharnold activelearningassisteddirectedevolution

Similar Items