Large language models design sequence-defined macromolecules via evolutionary optimization
Abstract We demonstrate the ability of a large language model to perform evolutionary optimization for materials discovery. Anthropic’s Claude 3.5 model outperforms an active learning scheme with handcrafted surrogate models and an evolutionary algorithm in selecting monomer sequences to produce tar...
| Main Authors: | Wesley F. Reinhart, Antonia Statt |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2024-11-01 |
| Series: | npj Computational Materials |
| Online Access: | https://doi.org/10.1038/s41524-024-01449-6 |
| _version_ | 1849221094600343552 |
|---|---|
| author | Wesley F. Reinhart; Antonia Statt |
| author_facet | Wesley F. Reinhart; Antonia Statt |
| author_sort | Wesley F. Reinhart |
| collection | DOAJ |
| description | Abstract We demonstrate the ability of a large language model to perform evolutionary optimization for materials discovery. Anthropic’s Claude 3.5 model outperforms an active learning scheme with handcrafted surrogate models and an evolutionary algorithm in selecting monomer sequences to produce targeted morphologies in macromolecular self-assembly. Utilizing pre-trained language models can potentially reduce the need for hyperparameter tuning while offering new capabilities such as self-reflection. The model performs this task effectively with or without context about the task itself, but domain-specific context sometimes results in faster convergence to good solutions. Furthermore, when this context is withheld, the model infers an approximate notion of the task (e.g., calling it a protein folding problem). This work provides evidence of Claude 3.5’s ability to act as an evolutionary optimizer, a recently discovered emergent behavior of large language models, and demonstrates a practical use case in the study and design of soft materials. |
| format | Article |
| id | doaj-art-959e902afa294e5fb418b545b30ec3ed |
| institution | Kabale University |
| issn | 2057-3960 |
| language | English |
| publishDate | 2024-11-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | npj Computational Materials |
| spelling | doaj-art-959e902afa294e5fb418b545b30ec3ed; 2024-11-24T12:35:35Z; eng; Nature Portfolio; npj Computational Materials; 2057-3960; 2024-11-01; 10118; 10.1038/s41524-024-01449-6; Large language models design sequence-defined macromolecules via evolutionary optimization; Wesley F. Reinhart (Department of Materials Science and Engineering, Pennsylvania State University); Antonia Statt (Department of Materials Science and Engineering, Grainger College of Engineering, University of Illinois Urbana-Champaign); https://doi.org/10.1038/s41524-024-01449-6 |
| title | Large language models design sequence-defined macromolecules via evolutionary optimization |
| title_sort | large language models design sequence defined macromolecules via evolutionary optimization |
| url | https://doi.org/10.1038/s41524-024-01449-6 |