Announcement


[M2 Internship] Object detection in open worlds using domain generalization and active learning

22 January 2026


Category: Internship positions


★ The internship will take place within the FOX team of the CRIStAL laboratory at the University of Lille.

Summary:
The success of supervised object detection methods relies on the assumption that the training and test data are drawn from the same distribution. In many real-world applications, however, this assumption is violated, and object detectors suffer from performance degradation due to a phenomenon known as domain shift. Such shifts arise from environmental variations, leading to differences in contrast, illumination, texture, and other visual properties.

An important research direction aimed at mitigating the impact of domain shift is Unsupervised Domain Adaptation (UDA). Given annotated data from a source domain and unannotated data from a target domain, UDA methods aim to align the source and target distributions so that a model trained on the source generalizes well to the target domain.

A clear limitation of UDA methods is that they require collecting data (even unannotated) from each target domain in advance and retraining the model for each of them. A more realistic, but also more challenging, setting for addressing domain shift is domain generalization (DG): the goal is to learn a model that generalizes well from multiple source domains available during training to unseen target domains. Recent studies show that very little work has been devoted to domain generalization for object detection. In this context, we study the problem of generalized object detection from a single source domain (Single-DGOD), where only one source domain is available for training and the objective is to train a detector capable of generalizing well to multiple unseen target domains.

In addition, annotation is costly and time-consuming. It is therefore logical to annotate only the images that will provide the greatest benefit when used for training. But how can we determine which images to choose? Given a large pool of unlabeled data, Active Learning (AL) aims to sample the data that would maximally improve model performance if annotated and used for training. While active learning techniques are well established in image classification, designing an effective AL strategy for object detection is significantly more challenging: detection involves both object localization and classification, and quantifying uncertainty jointly over these two tasks is difficult (for example, some objects may be easy to localize but hard to classify). Moreover, measuring similarity between images is complex when they contain multiple objects with diverse characteristics.

Research Directions:

We propose to address these limitations through the following directions:

  1. Developing a Vision Transformer-based model for object detection capable of generalizing to unseen target domains: We will explore Vision Transformer architectures from the perspective of domain generalization, a topic that remains insufficiently explored in the literature. Our approach is inspired by domain generalization (DG) methods for classification, which show that simulating new domains during training helps disentangle domain-specific features from semantic ones. We will begin by augmenting annotated input samples to simulate new domains, thereby increasing the diversity available within the single source domain. The augmentation strategy aims to disrupt domain-specific statistical regularities while preserving the high-level semantic concepts shared across domains (a minimal sketch of such an augmentation is given after this list).
  2. A plug-and-play active learning module for selecting data to be annotated:
    Several studies have addressed active learning for object detection, but they typically rely on modifications to the detector architecture and training pipeline. Motivated by this limitation, we propose to develop a plug-and-play active learning module for object detection under domain generalization that requires no specific changes to detector architectures or training pipelines and is therefore compatible with a wide range of object detectors. To this end, we plan to combine sample-diversity criteria with uncertainty-based scores computed on the unlabeled data (a minimal scoring sketch is given after this list).
  3. Ensuring continual learning when integrating the active learning module into a domain-generalized object detection pipeline:
    Each active learning iteration must incorporate continual learning mechanisms so that the framework progressively assimilates newly annotated data. At the same time, the model must adapt gradually and reliably to new target domains while avoiding catastrophic forgetting of previously acquired knowledge (one possible regularizer is sketched after this list).
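
To give a concrete flavour of direction 1, the sketch below perturbs per-channel image statistics (a proxy for contrast, illumination, and colour cast) while leaving object shapes and layout untouched. It is only an illustrative assumption of one possible augmentation; the function name and hyperparameters are not a fixed part of the proposal.

    import torch

    def perturb_domain_statistics(images: torch.Tensor,
                                  std_scale: float = 0.2,
                                  mean_shift: float = 0.2) -> torch.Tensor:
        """Simulate a new visual domain by re-styling per-channel statistics.

        images: (B, C, H, W) tensor, assumed to be scaled to roughly [0, 1].
        The perturbation alters low-level appearance (illumination, contrast,
        colour cast) while preserving the semantic content of the scene.
        """
        b, c, _, _ = images.shape
        mu = images.mean(dim=(2, 3), keepdim=True)            # original per-channel mean
        sigma = images.std(dim=(2, 3), keepdim=True) + 1e-6    # original per-channel std

        # Sample new "domain" statistics around the original ones.
        new_sigma = sigma * (1.0 + std_scale * torch.randn(b, c, 1, 1, device=images.device))
        new_mu = mu + mean_shift * torch.randn(b, c, 1, 1, device=images.device)

        # Normalize with the original statistics, then re-style with the sampled ones.
        out = (images - mu) / sigma * new_sigma + new_mu
        return out.clamp(0.0, 1.0)

Since bounding boxes and class labels are unaffected by such photometric perturbations, the existing annotations of the source domain can be reused as-is for the augmented samples.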
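
For direction 2, the module only consumes detector outputs (class scores and an image-level embedding), which is what makes it plug-and-play. The sketch below is one possible instantiation with hypothetical helper names; the exact uncertainty and diversity measures are part of the work to be carried out during the internship.

    import torch

    def detection_entropy(class_probs: torch.Tensor) -> torch.Tensor:
        """Mean entropy over the predicted boxes of one image.

        class_probs: (num_boxes, num_classes) softmax scores taken from any
        detector's outputs, so no architectural change is needed.
        """
        eps = 1e-12
        ent = -(class_probs * (class_probs + eps).log()).sum(dim=1)
        return ent.mean() if ent.numel() > 0 else torch.tensor(0.0)

    def select_for_annotation(features: torch.Tensor,
                              uncertainties: torch.Tensor,
                              budget: int,
                              alpha: float = 0.5) -> list:
        """Greedily pick `budget` images, trading off uncertainty and diversity.

        features:      (N, D) image-level embeddings of the unlabeled pool.
        uncertainties: (N,) per-image scores, e.g. from detection_entropy.
        Each step selects the image maximizing
            alpha * uncertainty + (1 - alpha) * distance_to_already_selected,
        i.e. a k-center-greedy style diversity term combined with uncertainty.
        """
        n = features.size(0)
        selected = []
        min_dist = torch.full((n,), float("inf"), device=features.device)
        unc = (uncertainties - uncertainties.min()) / (uncertainties.max() - uncertainties.min() + 1e-12)

        for _ in range(budget):
            div = min_dist / (min_dist.max() + 1e-12) if selected else torch.ones(n, device=features.device)
            score = alpha * unc + (1.0 - alpha) * div
            if selected:
                score[selected] = -float("inf")   # never pick the same image twice
            idx = int(score.argmax())
            selected.append(idx)
            # Update each pool image's distance to its closest selected image.
            min_dist = torch.minimum(min_dist, (features - features[idx]).norm(dim=1))
        return selected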
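
For direction 3, one simple and widely used way to limit catastrophic forgetting between active learning rounds is to distil the predictions of a frozen snapshot of the previous detector into the current one. This is only an assumed illustration of such a mechanism; the actual continual learning strategy is part of the research work.

    import torch
    import torch.nn.functional as F

    def distillation_regularizer(new_logits: torch.Tensor,
                                 old_logits: torch.Tensor,
                                 temperature: float = 2.0) -> torch.Tensor:
        """KL divergence between the current detector's class predictions and
        those of a frozen snapshot taken before the latest annotation round.

        new_logits, old_logits: (num_boxes, num_classes) classification logits
        computed on the same proposals/queries. Added to the usual detection
        loss, this term discourages drifting away from previously acquired
        knowledge while fitting the newly annotated samples.
        """
        t = temperature
        new_log_probs = F.log_softmax(new_logits / t, dim=1)
        old_probs = F.softmax(old_logits / t, dim=1)
        return F.kl_div(new_log_probs, old_probs, reduction="batchmean") * (t * t)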

Desired Profile:

  • Final-year Master’s student (M2) or engineering student specializing in machine learning, computer vision, or a related field.
  • Knowledge of computer vision, machine learning, and deep learning.
  • Programming skills (Python).
  • Autonomy, rigor, and critical thinking skills.


Address of the Internship:
CAMPUS Haute-Borne CNRS IRCICA-IRI-RMN
Parc Scientifique de la Haute Borne, 50 Avenue Halley, BP 70478, 59658 Villeneuve d’Ascq Cedex, France.

Application:

If this proposal interests you, please send the following documents to Dr. Tanmoy MONDAL (tanmoy.mondal@univ-lille.fr):

  • CV
  • Motivation Letter
  • Transcripts of grades obtained in Bachelor’s/Master’s/Engineering school as well as class ranking
  • Name and contact details of at least one reference person who can be contacted if necessary

References

  1. Ji, Y., Huang, Z., Wang, H., & Lee, Y. J. (n.d.). Customizing Domain Adapters for Domain Generalization. 934–944.
  2. Alijani, S., Fayyad, J., & Najjaran, H. (2024). Vision transformers in domain adaptation and domain generalization: a study of robustness. In Neural Computing and Applications (Vol. 36, Issue 29, pp. 17979–18007).
