15 January 2024
Category: Internship
M2 Master internship
Title: Development of Spatiotemporal Attention Mechanisms for Enhanced Motion Segmentation in Video Sequences
Context and motivation:
Deep learning models proposed for motion segmentation often fail to exploit spatial and temporal information jointly and effectively [1-4]. Static image-based models may fail to account for temporal dependencies, while purely temporal models may disregard essential spatial cues. Both shortcomings limit segmentation accuracy, especially in scenarios involving object occlusions, motion, and complex interactions.
Proposed solution:
The proposed internship project aims to bridge this gap by developing a spatiotemporal attention model that jointly considers spatial and temporal information. The model will be integrated into deep foreground-segmentation networks to improve the quality and robustness of their results. The attention mechanism should adaptively weight the importance of spatial regions and of frames within a video sequence, highlighting the regions and frames most likely to belong to the foreground. A minimal sketch of one possible realization is given below.
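As a rough, non-authoritative illustration of what such a mechanism might look like, the PyTorch sketch below combines a per-pixel spatial attention map with a per-frame temporal weighting. It is a minimal sketch under assumptions of our own (video feature volumes shaped batch x time x channels x height x width); all class, module, and parameter names are hypothetical and do not describe the project's actual design.

# Hypothetical sketch of a spatiotemporal attention block; assumes video
# features shaped (batch, time, channels, height, width). Names are
# illustrative, not the project's specification.
import torch
import torch.nn as nn


class SpatiotemporalAttention(nn.Module):
    """Jointly weights spatial regions and frames of a video feature volume."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Spatial branch: a per-pixel saliency map for each frame.
        self.spatial = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, 1, kernel_size=1),
            nn.Sigmoid(),
        )
        # Temporal branch: one scalar weight per frame, from pooled features.
        self.temporal = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, T, C, H, W)
        b, t, c, h, w = x.shape
        frames = x.reshape(b * t, c, h, w)

        # Spatial attention: emphasize regions likely to be foreground.
        s = self.spatial(frames)                            # (B*T, 1, H, W)
        frames = frames * s

        # Temporal attention: softmax-normalized weight per frame.
        pooled = frames.mean(dim=(2, 3)).reshape(b, t, c)   # (B, T, C)
        w_t = torch.softmax(self.temporal(pooled), dim=1)   # (B, T, 1)

        out = frames.reshape(b, t, c, h, w) * w_t.unsqueeze(-1).unsqueeze(-1)
        return out


if __name__ == "__main__":
    # Example: 2 clips of 8 frames with 64-channel feature maps.
    feats = torch.randn(2, 8, 64, 32, 32)
    attn = SpatiotemporalAttention(channels=64)
    print(attn(feats).shape)  # torch.Size([2, 8, 64, 32, 32])

Decoupling the two branches keeps the module lightweight and easy to drop into an existing segmentation backbone; the intern could equally explore joint 3D-convolutional or transformer-style attention instead.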
Training data, testing, and evaluation:
Requirements:
Prospective interns should have a strong background in computer vision and deep learning, and experience with deep learning frameworks such as PyTorch or TensorFlow. Experience with video data and video processing is a plus. Strong programming skills in Python are essential.
Practical information:
1. Location: CIAD laboratory, Montbéliard, France.
2. This internship is remunerated.
Application:
Send a curriculum vitae, referees' contact details, and grades for the last two years, before 30 January 2024, to:
Ibrahim.kajo@utbm.fr
yassine.ruichek@utbm.fr
References:
[1] Zheng, W., Wang, K., & Wang, F. Y. (2020). A novel background subtraction algorithm based on parallel vision and Bayesian GANs. Neurocomputing, 394, 178-200.
[2] Tezcan, O., Ishwar, P., & Konrad, J. (2020). BSUV-Net: A fully-convolutional neural network for background subtraction of unseen videos. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (pp. 2774-2783).
[3] Mandal, M., Dhar, V., Mishra, A., Vipparthi, S. K., & Abdel-Mottaleb, M. (2020). 3DCD: Scene independent end-to-end spatiotemporal feature learning framework for change detection in unseen videos. IEEE Transactions on Image Processing, 30, 546-558.
[4] Kajo, I., Kas, M., Ruichek, Y., & Kamel, N. (2023). Tensor based completion meets adversarial learning: A win–win solution for change detection on unseen videos. Computer Vision and Image Understanding, 226, 103584.