Pierfrancesco Urbani (IPhT)


Calendar: Seminars

Date: 7 April 2026, 10:45 - 11:45

Location: Room 523, corridor 12-13, 5th floor

Description

Separation of timescales controls feature learning and overfitting in large neural networks

To understand the inductive bias and generalization capabilities of large, overparameterized machine learning models, it is essential to analyze the out-of-equilibrium dynamics of their training algorithms. Using dynamical mean field theory, we investigate the learning dynamics of large two-layer neural networks. Our findings reveal that, for networks of large width, the training process exhibits a separation of timescales. This leads to several key observations: 1. The emergence of a slow timescale linked to the growth of a carefully defined complexity measure of the network; 2. An inductive bias favoring low complexity when the initial model complexity is sufficiently small; 3. A dynamical decoupling between the feature learning and overfitting phases; 4. A non-monotonic trend in the test error, characterized by a "feature unlearning" regime at later stages of training.

Laboratoire de Physique Théorique de la Matière Condensée
