Meeting


Random matrices and tensors for inference and learning

Date: 29 May 2026
Time: 10:30 - 17:00
Venue: Amphi Charles Hermite, Institut Henri Poincaré, 11 rue Pierre et Marie Curie, Paris

Scientific themes:
  • Theory and methods

Participating GdRs:
Organizers:
  • Jose Henrique De Morais Goulart (IRIT)
  • Xiaoyi Mai (IMT)

Please note that, to guarantee all registered participants access to the meeting rooms, registration for meetings is free but mandatory.

Registration

18 members of the GdR IASIS and 35 non-members are registered for this meeting.

Room capacity: 150 people. 97 places remaining.

Announcement

This scientific day, co-organized by the GdR IASIS and the RT Mathématiques et Physique (MEGA axis), is devoted to the role of random matrices and random tensors in statistical inference and machine learning. These random objects provide a powerful and fruitful theoretical framework for studying and designing estimators and algorithms suited to high-dimensional data, and are therefore well matched to the regimes typical of problems in these application areas.

Conversely, problems arising in inference and machine learning are a rich source of new models and theoretical questions for the study of random matrices and tensors. They stimulate the development of innovative techniques at the interface of probability, statistics, physics and optimization, contributing to a fruitful dialogue between theory and applications.

Invited speakers

  • Hugo Lebeau (INRIA)
  • Bruno Loureiro (DIENS, CNRS)
  • Vanessa Piccolo (IdePHICS, EPFL)

Call for contributions

We invite PhD students, postdocs and early-career researchers to submit a proposal to present their work at the event. Presentations will last 20 to 25 minutes, followed by 5 to 10 minutes of questions. These talks are an opportunity to share recent research and to stimulate discussion of the challenges and advances in the theme of the day.

Presentations, including slides, are expected to be in English (preferably, to allow exchanges with the invited speakers).

If you would like to participate, please send an e-mail to henrique.goulart@irit.fr and raphael.butez@univ-lille.fr by 30 April, with a title and an abstract for your proposed talk.

Organizers

  • Raphaël Butez (Laboratoire Paul Painlevé)
  • Guillaume Dubach (CMLS)
  • Henrique Goulart (IRIT)
  • Xiaoyi Mai (IMT)

Programme

10:20 - 10:30: Opening

10:30 - 12:00: Mini-course by Bruno Loureiro (ENS Paris, CNRS)

12:00 - 14:00: Lunch break

14:00 - 14:45: Invited talk by Vanessa Piccolo (EPFL)

14:45 - 15:15: Coffee break

15:15 - 16:00: Invited talk by Hugo Lebeau (INRIA, ENS Lyon)

16:00 - 16:30: Contributed talk by Andrea Combette (ENS Lyon)

16:30 - 17:00: Contributed talk by Lucas Morisset (X, QRT)

Abstracts

Bruno Loureiro (ENS Paris, CNRS)

Invited mini-course: Some recent developments on random matrix theory for machine learning 

Abstract: In this mini-tutorial, I will review some recent results on the analysis of non-linear models motivated by machine learning, using tools from random matrix theory. Starting from the analysis of two-layer neural networks at initialisation (a.k.a. the random features model), we will discuss the notion of Gaussian universality, which allows one to treat non-linear functions of random matrices effectively with standard tools. We will then discuss the problem of feature learning, in which the network weights are trained and develop correlations with the data, and how it can be treated with ideas that generalise universality. This will allow us to show the advantage of feature learning over kernel methods. Finally, I will discuss some of the more recent progress on the analysis of the spectrum of trained two-layer neural networks.


Vanessa Piccolo (EPFL)

Invited talk: Heavy-tailed random features models: new spectral phenomena

Abstract: In recent years, models from machine learning have motivated the study of nonlinear random matrices, that is, random matrices involving the entrywise application of a deterministic nonlinear function. In this talk, we will focus on matrices of the form YY* with Y = f(WX). Here, W and X are random rectangular matrices with i.i.d. centered entries, representing the weights and data in a two-layer feed-forward neural network, and f is a nonlinear activation function. This setting is commonly known as the random features model. 

When the entries of both the weights and the inputs are light-tailed, the asymptotic behavior of the eigenvalues is by now well understood and coincides with that of a simple Gaussian-equivalent model. In this talk, I will instead focus on the regime where the weights are heavy-tailed, based on recent joint work with Alice Guionnet. This regime is motivated by empirical observations in trained neural networks, where learned weights often exhibit strong correlations and heavy-tailed distributions. We will show that, in this context, the spectral behavior departs significantly from the light-tailed regime, leading to new spectral phenomena, with a richer combinatorial structure in the moment expansion.
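The matrices YY* with Y = f(WX) described above can be simulated numerically. The following is only an illustrative sketch, not the construction used in the talk: the 1/sqrt(d) scaling, the 1/n normalization, the choice f = tanh, and the use of a Student-t law with 1.5 degrees of freedom as a stand-in for heavy-tailed weights are all assumptions made here, and the normalization studied in the underlying work may differ.

```python
import numpy as np


def random_features_spectrum(p, d, n, weights="gaussian", f=np.tanh, seed=0):
    """Eigenvalues of (1/n) Y Y^T with Y = f(W X / sqrt(d)).

    W is p x d (weights), X is d x n (data). All scalings are assumptions
    chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((d, n))  # i.i.d. centered, light-tailed data
    if weights == "gaussian":
        W = rng.standard_normal((p, d))  # light-tailed weights
    elif weights == "heavy":
        # Student-t with df < 2: symmetric, centered in law, infinite variance
        W = rng.standard_t(df=1.5, size=(p, d))
    else:
        raise ValueError("weights must be 'gaussian' or 'heavy'")
    Y = f(W @ X / np.sqrt(d))  # entrywise nonlinearity
    return np.linalg.eigvalsh(Y @ Y.T / n)  # ascending eigenvalues


if __name__ == "__main__":
    for kind in ("gaussian", "heavy"):
        ev = random_features_spectrum(p=200, d=300, n=400, weights=kind)
        print(kind, "largest eigenvalues:", np.round(ev[-3:], 3))
```

Plotting histograms of the two returned arrays is one simple way to compare the light-tailed and heavy-tailed cases empirically.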


Hugo Lebeau (INRIA, ENS Lyon)

Invited talk: A Random Matrix Approach to Low-Multilinear-Rank Tensor Approximation

Abstract: This work presents a comprehensive understanding of the estimation of a planted low-rank signal from a general spiked tensor model near the computational threshold. Relying on standard tools from the theory of large random matrices, we characterize the large-dimensional spectral behavior of the unfoldings of the data tensor and exhibit relevant signal-to-noise ratios governing the detectability of the principal directions of the signal. These results allow us to accurately predict the reconstruction performance of truncated multilinear SVD (MLSVD) in the non-trivial regime. This is particularly important since it serves as an initialization of the higher-order orthogonal iteration (HOOI) scheme, whose convergence to the best low-multilinear-rank approximation depends entirely on its initialization. We give a sufficient condition for the convergence of HOOI and show that the number of iterations before convergence tends to 1 in the large-dimensional limit.
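For readers unfamiliar with the two algorithms named in this abstract, here is a minimal NumPy sketch of truncated MLSVD (leading singular vectors of each unfolding) and of HOOI initialized with it. It is a textbook-style illustration written for this page, not the speaker's code; the function names and the fixed iteration count `n_iter` are hypothetical choices.

```python
import numpy as np


def unfold(T, mode):
    """Mode-`mode` unfolding: a matrix whose rows are indexed by that mode."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)


def mode_product(T, M, mode):
    """Multiply tensor T by matrix M (shape r x T.shape[mode]) along `mode`."""
    out = np.tensordot(M, T, axes=(1, mode))  # contracted axis replaced, placed first
    return np.moveaxis(out, 0, mode)


def truncated_mlsvd(T, ranks):
    """Leading left singular vectors of each unfolding of T."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    return factors


def hooi(T, ranks, n_iter=5):
    """Higher-order orthogonal iteration, started from truncated MLSVD."""
    factors = truncated_mlsvd(T, ranks)
    for _ in range(n_iter):
        for mode in range(T.ndim):
            G = T
            for other in range(T.ndim):
                if other != mode:  # project on the current factors of the other modes
                    G = mode_product(G, factors[other].T, other)
            U, _, _ = np.linalg.svd(unfold(G, mode), full_matrices=False)
            factors[mode] = U[:, :ranks[mode]]
    return factors


def reconstruct(T, factors):
    """Low-multilinear-rank approximation of T given orthonormal factors."""
    core = T
    for mode, U in enumerate(factors):
        core = mode_product(core, U.T, mode)
    approx = core
    for mode, U in enumerate(factors):
        approx = mode_product(approx, U, mode)
    return approx
```

A typical experiment would plant a rank-one signal in Gaussian noise and compare `np.linalg.norm(T - reconstruct(T, factors))` for the factors returned by `truncated_mlsvd` and by `hooi`.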


Andrea Combette (ENS Lyon)

Contributed talk 1: Initialization at Criticality to Control Property Propagation in Neural Networks

Abstract: Understanding how information propagates in very deep neural networks is essential for designing architectures that remain trainable as depth increases. In our previous work, A New Initialisation to Control Gradients in Sinusoidal Neural Networks (ICLR 2026), we introduced an initialisation scheme for SIREN networks that controls both gradient variance and the Fourier spectrum of the network output. This led us to study the large-depth limit of neural networks more generally and to address the central question of this work: how do correlations, gradients, and the Neural Tangent Kernel spectrum propagate through depth?

To answer this question, we develop a theoretical framework in the thermodynamic sequential limit, corresponding to the infinite-width mean-field regime. This framework provides a unified description of these propagation mechanisms by combining mean-field analysis with tools from free probability, following the approach introduced by Pennington et al. [1].

This framework allowed us to identify an initialisation strategy based on orthogonal weights that applies to a broad class of activation functions. At this initialisation, the network operates at criticality: signal propagation remains non-trivial even at large depth, the first two moments of the NTK spectrum remain stable, and gradients decay algebraically, in contrast with previous statements. These results provide a principled way to design deep networks whose signal, gradient, and kernel properties are jointly explained by this work.

Reference:

[1] J. Pennington, S. Schoenholz, and S. Ganguli. “Resurrecting the Sigmoid in Deep Learning through Dynamical Isometry.” 
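As a toy illustration of why orthogonal weights are natural candidates for such an initialisation, the sketch below compares the singular values of the input-output Jacobian of a deep linear network under orthogonal and Gaussian initialisation. This is a deliberately simplified setting assumed here (no activation function, square layers, hypothetical function names), not the framework of the talk: for a linear network the Jacobian is just the product of the weight matrices.

```python
import numpy as np


def random_orthogonal(n, rng):
    """Haar-distributed orthogonal matrix via QR with a sign correction."""
    Q, R = np.linalg.qr(rng.standard_normal((n, n)))
    return Q * np.sign(np.diag(R))  # rescale column j by the sign of R[j, j]


def jacobian_singular_values(n, depth, init, seed=0):
    """Singular values of W_depth ... W_1 for a deep *linear* network."""
    rng = np.random.default_rng(seed)
    J = np.eye(n)
    for _ in range(depth):
        if init == "orthogonal":
            W = random_orthogonal(n, rng)
        elif init == "gaussian":
            W = rng.standard_normal((n, n)) / np.sqrt(n)  # variance 1/n entries
        else:
            raise ValueError("init must be 'orthogonal' or 'gaussian'")
        J = W @ J
    return np.linalg.svd(J, compute_uv=False)


if __name__ == "__main__":
    for init in ("orthogonal", "gaussian"):
        sv = jacobian_singular_values(n=100, depth=30, init=init)
        print(init, "min/max singular value:", sv.min(), sv.max())
```

With orthogonal layers the product is itself orthogonal, so every singular value equals 1 at any depth, whereas the Gaussian product becomes increasingly ill-conditioned. The nonlinear case treated in the abstract requires the mean-field and free-probability analysis it describes.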


Lucas Morisset (Ecole Polytechnique, QRT)

Contributed talk 2: Characterizing the Generalization Error of Random Feature Regression with Arbitrary Data-Augmentation

Abstract: In this work, we aim to characterize the effect that modern data augmentation schemes have on the generalization error of large deep learning models. While data augmentation is now a standard ingredient of modern machine learning pipelines, its theoretical understanding remains limited. As a tractable testbed, we study random feature regression, which has recently attracted significant interest because it captures several qualitative and quantitative properties of large neural networks. By leveraging tools from random matrix theory, we derive deterministic equivalents for the generalization error under broad augmentation schemes, including settings where the augmented samples are strongly dependent. This allows us to quantify explicitly how data augmentation reshapes the bias-variance trade-off, and to identify regimes in which it improves performance as well as regimes in which it can be detrimental. On the technical side, we develop anisotropic deterministic equivalents for key quantities of the augmented problem, including the resolvent of the sample covariance matrix of the augmented data and the trained linear readout of random feature regression. More broadly, our results provide, to the best of our knowledge, the first sharp asymptotic characterization of generalization in the presence of strong sample dependence.

Joint work with Alain Durmus (Ecole Polytechnique) & Adrien Hardy (QRT).



