Interpretable protein-DNA interactions captured by structure-sequence optimization
Sequence-specific DNA recognition underlies essential processes in gene regulation, yet methods for simultaneous predictions of genomic DNA recognition sites and their binding affinity remain lacking. Here, we present the Interpretable protein-DNA Energy Associative (IDEA) model, a residue-level, in...
| Main Authors: | Yafan Zhang, Irene Silvernail, Zhuyang Lin, Xingcheng Lin |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | eLife Sciences Publications Ltd, 2025-07-01 |
| Series: | eLife |
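| Subjects: | data-driven modeling; structure-sequence integration; protein-DNA binding affinity prediction; genomic binding sites predictions; sequence-specific simulation |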
| Online Access: | https://elifesciences.org/articles/105565 |
| author | Yafan Zhang, Irene Silvernail, Zhuyang Lin, Xingcheng Lin |
| author_facet | Yafan Zhang, Irene Silvernail, Zhuyang Lin, Xingcheng Lin |
| author_sort | Yafan Zhang |
| collection | DOAJ |
| description | Sequence-specific DNA recognition underlies essential processes in gene regulation, yet methods for simultaneous predictions of genomic DNA recognition sites and their binding affinity remain lacking. Here, we present the Interpretable protein-DNA Energy Associative (IDEA) model, a residue-level, interpretable biophysical model capable of predicting binding sites and affinities of DNA-binding proteins. By fusing structures and sequences of known protein-DNA complexes into an optimized energy model, IDEA enables direct interpretation of physicochemical interactions among individual amino acids and nucleotides. We demonstrate that this energy model can accurately predict DNA recognition sites and their binding strengths across various protein families. Additionally, the IDEA model is integrated into a coarse-grained simulation framework that quantitatively captures the absolute protein-DNA binding free energies. Overall, IDEA provides an integrated computational platform that alleviates experimental costs and biases in assessing DNA recognition and can be utilized for mechanistic studies of various DNA-recognition processes. |
| format | Article |
| id | doaj-art-1086ffac8a0d4af4a10b03bd4ed84eaa |
| institution | DOAJ |
| issn | 2050-084X |
| language | English |
| publishDate | 2025-07-01 |
| publisher | eLife Sciences Publications Ltd |
| record_format | Article |
| series | eLife |
| spelling | doaj-art-1086ffac8a0d4af4a10b03bd4ed84eaa; 2025-08-20T03:12:05Z; eng; eLife Sciences Publications Ltd; eLife; ISSN 2050-084X; 2025-07-01; vol. 14; DOI 10.7554/eLife.105565; Interpretable protein-DNA interactions captured by structure-sequence optimization; Yafan Zhang (https://orcid.org/0000-0002-7867-2873; Bioinformatics Research Center, North Carolina State University, Raleigh, United States); Irene Silvernail (https://orcid.org/0009-0003-3070-974X; Department of Physics, North Carolina State University, Raleigh, United States); Zhuyang Lin (https://orcid.org/0009-0009-0480-7024; Bioinformatics Research Center, North Carolina State University, Raleigh, United States); Xingcheng Lin (https://orcid.org/0000-0002-9378-6174; Bioinformatics Research Center, North Carolina State University, Raleigh, United States; Department of Physics, North Carolina State University, Raleigh, United States); https://elifesciences.org/articles/105565; data-driven modeling; structure-sequence integration; protein-DNA binding affinity prediction; genomic binding sites predictions; sequence-specific simulation |
| title | Interpretable protein-DNA interactions captured by structure-sequence optimization |
| title_sort | interpretable protein dna interactions captured by structure sequence optimization |
| topic | data-driven modeling; structure-sequence integration; protein-DNA binding affinity prediction; genomic binding sites predictions; sequence-specific simulation |
| url | https://elifesciences.org/articles/105565 |