Deep neural networks have an inbuilt Occam’s razor
Abstract The remarkable performance of overparameterized deep neural networks (DNNs) must arise from an interplay between network architecture, training algorithms, and structure in the data. To disentangle these three components for supervised learning, we apply a Bayesian picture based on the func...
Main Authors: | Chris Mingard, Henry Rees, Guillermo Valle-Pérez, Ard A. Louis |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio 2025-01-01 |
Series: | Nature Communications |
Online Access: | https://doi.org/10.1038/s41467-024-54813-x |
_version_ | 1832594545589616640 |
---|---|
author | Chris Mingard, Henry Rees, Guillermo Valle-Pérez, Ard A. Louis |
author_sort | Chris Mingard |
collection | DOAJ |
description | Abstract The remarkable performance of overparameterized deep neural networks (DNNs) must arise from an interplay between network architecture, training algorithms, and structure in the data. To disentangle these three components for supervised learning, we apply a Bayesian picture based on the functions expressed by a DNN. The prior over functions is determined by the network architecture, which we vary by exploiting a transition between ordered and chaotic regimes. For Boolean function classification, we approximate the likelihood using the error spectrum of functions on data. Combining this with the prior yields an accurate prediction for the posterior, measured for DNNs trained with stochastic gradient descent. This analysis shows that structured data, together with a specific Occam’s razor-like inductive bias towards (Kolmogorov) simple functions that exactly counteracts the exponential growth of the number of functions with complexity, is a key to the success of DNNs. |
format | Article |
id | doaj-art-50237731e90c41d8a3e376afdbb93068 |
institution | Kabale University |
issn | 2041-1723 |
language | English |
publishDate | 2025-01-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Nature Communications |
spelling | doaj-art-50237731e90c41d8a3e376afdbb93068; 2025-01-19T12:31:24Z; eng; Nature Portfolio; Nature Communications; 2041-1723; 2025-01-01; 16119; 10.1038/s41467-024-54813-x; Deep neural networks have an inbuilt Occam’s razor; Chris Mingard (0), Henry Rees (1), Guillermo Valle-Pérez (2), Ard A. Louis (3); (0) Rudolf Peierls Centre for Theoretical Physics, University of Oxford; (1) Rudolf Peierls Centre for Theoretical Physics, University of Oxford; (2) Rudolf Peierls Centre for Theoretical Physics, University of Oxford; (3) Rudolf Peierls Centre for Theoretical Physics, University of Oxford; Abstract The remarkable performance of overparameterized deep neural networks (DNNs) must arise from an interplay between network architecture, training algorithms, and structure in the data. To disentangle these three components for supervised learning, we apply a Bayesian picture based on the functions expressed by a DNN. The prior over functions is determined by the network architecture, which we vary by exploiting a transition between ordered and chaotic regimes. For Boolean function classification, we approximate the likelihood using the error spectrum of functions on data. Combining this with the prior yields an accurate prediction for the posterior, measured for DNNs trained with stochastic gradient descent. This analysis shows that structured data, together with a specific Occam’s razor-like inductive bias towards (Kolmogorov) simple functions that exactly counteracts the exponential growth of the number of functions with complexity, is a key to the success of DNNs.; https://doi.org/10.1038/s41467-024-54813-x |
spellingShingle | Chris Mingard; Henry Rees; Guillermo Valle-Pérez; Ard A. Louis; Deep neural networks have an inbuilt Occam’s razor; Nature Communications |
title | Deep neural networks have an inbuilt Occam’s razor |
url | https://doi.org/10.1038/s41467-024-54813-x |