Announcement


[Stage M2+PhD] Self-supervised spiking neural networks for object detection and segmentation

13 January 2026


Category: PhD positions; Internship positions

More information:

★ The internship will take place within the FOX team of the CRIStAL laboratory at the University of Lille.

Note: This is primarily a PhD offer, but it is possible to begin with a Master's-2 internship (the internship is not compulsory).

Summary:
The project aims to design a neuromorphic system capable of real-time detection of spatio-temporal patterns within the activity of Spiking Neural Networks (SNNs). SNNs process information through discrete firing events (spikes) over time. This temporal coding enables sparse, event-driven computation, thereby offering exceptional energy efficiency.
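To illustrate this temporal coding, the sketch below simulates a leaky integrate-and-fire (LIF) neuron, the building block most commonly used in SNNs; the threshold, leak factor, and hard-reset scheme are illustrative choices, not the project's actual neuron model.

```python
import numpy as np

def lif_neuron(input_current, threshold=1.0, leak=0.9):
    """Simulate a leaky integrate-and-fire (LIF) neuron over discrete
    timesteps: the membrane potential leaks, integrates the input, and
    emits a binary spike whenever it crosses the threshold, then resets."""
    v = 0.0
    spikes = []
    for i_t in input_current:
        v = leak * v + i_t        # leaky integration of input current
        if v >= threshold:
            spikes.append(1)      # discrete firing event (spike)
            v = 0.0               # hard reset after firing
        else:
            spikes.append(0)
    return np.array(spikes)

# A constant sub-threshold drive charges the membrane over ~3 steps,
# so the neuron fires sparsely: [0, 0, 1, 0, 0, 1, 0, 0, 1, 0].
out = lif_neuron(np.full(10, 0.4))
```

The sparse binary output is what makes downstream computation event-driven: synapses only need to do work when a spike arrives.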
In parallel, Self-Supervised Learning (SSL) has revolutionized the training of classical artificial networks by eliminating the need for manual annotations, allowing models to learn generalizable representations directly from raw data. Despite its success, SSL has not yet been effectively transposed to the spiking domain. Most prior research relies either on supervised fine-tuning or on the adaptation of pretext tasks derived from ANNs, failing to exploit the unique temporal dynamics inherent to SNNs.

More importantly, no prior work has demonstrated a fully self-supervised SNN framework capable of scaling beyond simple, low-resolution tasks (such as MNIST/CIFAR) to more complex contexts involving higher resolutions or dense prediction tasks.

In this project, we will develop a fully self-supervised framework capable of achieving competitive performance during large-scale pre-training (ImageNet-1K) and effectively transferring to major downstream tasks, such as object detection on COCO or semantic segmentation, without relying on annotated supervision [1]. In short, the project is built upon the convergence of three major axes of current research:

  • The temporal dynamics of spikes, which we plan to leverage both as a natural source of temporal diversity for rich representation learning over time [2] and as a basis for temporal alignment.
  • The biological modeling of spiking neural networks and their temporal dynamics.
  • The asynchronous, event-driven processing inherent to neuromorphic architectures.

Furthermore, the proposed model must be compatible with both CNN-based and Vision Transformer-based SNN architectures. It must generalize across both static datasets (e.g., ImageNet-1K, CIFAR-10) and neuromorphic datasets (e.g., CIFAR10-DVS), while ensuring efficient transfer to downstream tasks.

Within the framework of the self-supervised model defined above, we will explore various deep neural network architectures.
Spiking ResNet: We will first study classical deep Spiking Neural Network (SNN) architectures. Deep learning methods have recently been introduced to SNNs, and deep SNNs have achieved performance close to that of classical Artificial Neural Networks (ANNs) on certain simple classification datasets. We will work on a spiking version of ResNet that enables a genuine implementation of residual learning, which is expected to address the issues of vanishing and exploding gradients [3].
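As a hedged illustration of the spike-element-wise (SEW) residual idea of [3], the sketch below combines the inner branch and the identity branch with an element-wise OR, one of the combination functions discussed in that paper; the single stateless linear layer is a toy stand-in for the paper's convolutional layers and neuron dynamics.

```python
import numpy as np

def heaviside_spike(v, threshold=1.0):
    """Binary spike: fire wherever the membrane potential reaches threshold."""
    return (v >= threshold).astype(np.float32)

def sew_residual_block(spikes_in, weight, threshold=1.0):
    """Spike-element-wise (SEW) residual block sketch: the inner branch's
    spikes are combined with the identity branch by an element-wise OR,
    so the block realizes an identity mapping whenever the inner branch
    is silent. A toy linear layer stands in for conv layers."""
    v = spikes_in @ weight                    # toy synaptic layer (stateless)
    spikes_out = heaviside_spike(v, threshold)
    return np.maximum(spikes_out, spikes_in)  # OR: either branch's spike passes

rng = np.random.default_rng(0)
x = (rng.random((4, 8)) > 0.7).astype(np.float32)  # sparse input spikes
w = np.zeros((8, 8), dtype=np.float32)             # silent inner branch
y = sew_residual_block(x, w)
# With zero weights the inner branch never fires, so y equals x:
# the residual OR preserves the identity mapping through depth.
```

Because the identity mapping survives even when the inner branch is silent, gradients can propagate through many such blocks, which is the property that residual learning is meant to secure.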

Spiking Vision Transformers:
Recent research has proposed spiking transformers that adapt the Transformer self-attention mechanism to the spiking paradigm for a variety of computer vision and speech tasks, such as object tracking, depth estimation, and speech recognition.

However, current spiking transformers [4] rely predominantly on purely spatial attention, neglecting the intrinsically dynamic and temporal nature of spiking events. The literature indicates that purely spatial attention captures only spatial relationships within each timestep, thereby ignoring object features that evolve over time.

In this project, we will implement spatio-temporal attention, integrating spatial and temporal information simultaneously within the self-attention mechanism. The objective is also to maintain a computational complexity equivalent to that of existing spiking Transformers.
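The contrast between the two attention schemes can be sketched as follows; the shared, untrained projections are purely illustrative, and real spiking transformers wrap learned projections and spiking neurons around the attention.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(spikes):
    """Attention restricted to the N tokens of each timestep: (T, N, D) -> (T, N, D)."""
    T, N, D = spikes.shape
    out = np.empty_like(spikes)
    for t in range(T):
        q = k = v = spikes[t]                      # toy shared projection
        out[t] = softmax(q @ k.T / np.sqrt(D)) @ v
    return out

def spatio_temporal_attention(spikes):
    """Attention over all T*N spike tokens jointly: a token at timestep t
    can also attend to features from every other timestep."""
    T, N, D = spikes.shape
    x = spikes.reshape(T * N, D)
    q = k = v = x                                  # toy shared projection
    out = softmax(q @ k.T / np.sqrt(D)) @ v
    return out.reshape(T, N, D)

rng = np.random.default_rng(0)
spikes = (rng.random((4, 5, 8)) > 0.8).astype(np.float32)  # binary spikes (T, N, D)
y_spatial = spatial_attention(spikes)
y_st = spatio_temporal_attention(spikes)
```

Note that the joint formulation costs O((T·N)²) per layer versus O(T·N²) for purely spatial attention, which is why matching the complexity of existing spiking Transformers requires a factorized or otherwise restricted form of spatio-temporal attention, as in [4].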

Integrating predictive coding into the aforementioned learning frameworks, particularly within spiking neural networks:

Machine learning techniques based on SNNs are designed to consume significantly less energy during information processing, mimicking the biological brain. This efficiency is achieved by emulating biological neural coding through discrete electrical discharges (spikes).

In [5], a variant called "Predictive Coding Light" (PCL), an alternative to classical predictive coding, is presented in the context of ANNs. In this project, we plan to examine predictive-coding-based techniques in greater detail, specifically seeking to improve the foundations of PCL within a self-supervised learning framework. We will begin by integrating predictive coding approaches (such as PCL) into self-supervised learning models built around Spiking-ResNet and Spiking-Transformer.

Subsequently, we will focus on strengthening and optimizing these PCL approaches within the context of self-supervised learning based on these advanced spiking architectures.
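To make the core idea concrete, the sketch below implements generic predictive-coding inference (not the actual PCL algorithm of [5], which modifies this scheme): a latent code is iteratively refined to minimize the error between its top-down prediction and the input, without any annotation.

```python
import numpy as np

def predictive_coding_inference(x, W, steps=50, lr=0.5):
    """Refine a latent code z so that the top-down prediction W @ z
    explains the input x, by gradient descent on the squared prediction
    error ||x - W @ z||**2 (the core predictive-coding loop)."""
    z = np.zeros(W.shape[1])
    for _ in range(steps):
        error = x - W @ z              # bottom-up prediction error
        z = z + lr * (W.T @ error)     # latent update shrinks the error
    return z

rng = np.random.default_rng(1)
# Orthonormal columns keep this toy generative model well-conditioned.
W, _ = np.linalg.qr(rng.standard_normal((6, 3)))
z_true = rng.standard_normal(3)
x = W @ z_true                         # input generated by a known latent
z_hat = predictive_coding_inference(x, W)  # recovers z_true to ~1e-15
```

The signal driving learning is the prediction error itself, which is why predictive coding fits naturally into a self-supervised setting: no labels are involved at any point.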

Desired Profile

  • Final-year Master’s student (M2) or engineering school student, specializing in machine learning, computer vision, or a related field.
  • Solid knowledge of computer vision, machine learning, and deep learning.
  • Programming skills (Python).
  • Ability to work independently, with rigor and a critical mindset.


Internship Location:
CAMPUS Haute-Borne – CNRS IRCICA-IRI-RMN
Parc Scientifique de la Haute Borne
50 Avenue Halley, BP 70478
59658 Villeneuve d’Ascq Cedex, France

Application:

If you are interested in this opportunity, please send the following documents to
Dr. Tanmoy MONDAL (tanmoy.mondal@univ-lille.fr) and
Chaabane DJERABA (chabane.djeraba@univ-lille.fr):

  • Curriculum Vitae (CV)
  • Cover letter
  • Academic transcripts from Bachelor’s / Master’s / Engineering school, including class ranking
  • Name and contact details of at least one reference who may be contacted if necessary

References

  1. Anonymous, S4NN: Scalable Self-Supervised Spiking Neural Networks, submitted to The Fourteenth International Conference on Learning Representations, 2025.
  2. S. Barchid, Avancées en vision neuromorphique : représentation événementielle, réseaux de neurones impulsionnels supervisés et pré-entraînement auto-supervisé, PhD thesis, Université de Lille, 2023.
  3. W. Fang, Z. Yu, Y. Chen, T. Huang, T. Masquelier, and Y. Tian, Deep Residual Learning in Spiking Neural Networks, arXiv [cs.NE]. 2022.
  4. Lee, D., Li, Y., Kim, Y., Xiao, S., Panda, P. (2025). Spiking Transformer with Spatial-Temporal Attention. CVPR, 13948–13958. https://doi.org/10.1109/CVPR52734.2025.01302
  5. N’dri, A.W., Barbier, T., Teulière, C. et al. Predictive Coding Light. Nat Commun 16, 8880 (2025). https://doi.org/10.1038/s41467-025-64234-z
