Active learning-assisted directed evolution
Abstract Directed evolution (DE) is a powerful tool to optimize protein fitness for a specific application. However, DE can be inefficient when mutations exhibit non-additive, or epistatic, behavior. Here, we present Active Learning-assisted Directed Evolution (ALDE), an iterative machine learning-a...
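Main Authors: | Jason Yang, Ravi G. Lal, James C. Bowden, Raul Astudillo, Mikhail A. Hameedi, Sukhvinder Kaur, Matthew Hill, Yisong Yue, Frances H. Arnold |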
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2025-01-01 |
Series: | Nature Communications |
Online Access: | https://doi.org/10.1038/s41467-025-55987-8 |
_version_ | 1832594542937767936 |
---|---|
author | Jason Yang, Ravi G. Lal, James C. Bowden, Raul Astudillo, Mikhail A. Hameedi, Sukhvinder Kaur, Matthew Hill, Yisong Yue, Frances H. Arnold |
author_sort | Jason Yang |
collection | DOAJ |
description | Abstract Directed evolution (DE) is a powerful tool to optimize protein fitness for a specific application. However, DE can be inefficient when mutations exhibit non-additive, or epistatic, behavior. Here, we present Active Learning-assisted Directed Evolution (ALDE), an iterative machine learning-assisted DE workflow that leverages uncertainty quantification to explore the search space of proteins more efficiently than current DE methods. We apply ALDE to an engineering landscape that is challenging for DE: optimization of five epistatic residues in the active site of an enzyme. In three rounds of wet-lab experimentation, we improve the yield of a desired product of a non-native cyclopropanation reaction from 12% to 93%. We also perform computational simulations on existing protein sequence-fitness datasets to support our argument that ALDE can be more effective than DE. Overall, ALDE is a practical and broadly applicable strategy to unlock improved protein engineering outcomes. |
format | Article |
id | doaj-art-42793e27879147558f1a8afcca84d70e |
institution | Kabale University |
issn | 2041-1723 |
language | English |
publishDate | 2025-01-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Nature Communications |
spelling | doaj-art-42793e27879147558f1a8afcca84d70e; 2025-01-19T12:31:36Z; eng; Nature Portfolio; Nature Communications; 2041-1723; 2025-01-01; 161112; 10.1038/s41467-025-55987-8; Active learning-assisted directed evolution; Jason Yang (Division of Chemistry and Chemical Engineering, California Institute of Technology); Ravi G. Lal (Division of Chemistry and Chemical Engineering, California Institute of Technology); James C. Bowden (Division of Engineering and Applied Sciences, California Institute of Technology); Raul Astudillo (Division of Engineering and Applied Sciences, California Institute of Technology); Mikhail A. Hameedi (Division of Biology and Biological Engineering, California Institute of Technology); Sukhvinder Kaur (Elegen Corp); Matthew Hill (Elegen Corp); Yisong Yue (Division of Engineering and Applied Sciences, California Institute of Technology); Frances H. Arnold (Division of Chemistry and Chemical Engineering, California Institute of Technology); https://doi.org/10.1038/s41467-025-55987-8 |
title | Active learning-assisted directed evolution |
url | https://doi.org/10.1038/s41467-025-55987-8 |