7 November 2024
Category: Intern
M2 internship
Title: Enhancing and Diversifying Low-Resolution Videos using AI-driven Image Generation
Context and motivation:
This internship project aims to develop a method for enhancing and diversifying low-resolution videos using large image generation models, specifically Stable Diffusion [1]. Many existing computer vision datasets contain low-quality video footage, which limits the effectiveness of model training and real-world applications [2]. The intern will use AI to upscale video resolution and enrich visual variety, improving both the quality and the diversity of these datasets. The core objective is to apply AI-driven image generation techniques to alter scenes within videos by adjusting environments, modifying object appearances, and experimenting with different visual styles, while maintaining coherence and realism across frames. For example, low-resolution videos of common settings such as urban areas or natural scenes could be enhanced to high-definition quality, diversified into various conditions, or reimagined with distinct visual themes, such as futuristic, seasonal, or artistic transformations.
Proposed Solution:
To achieve these goals, the project will employ ComfyUI [3], an open-source tool that facilitates complex workflows in Stable Diffusion. ComfyUI’s node-based system allows users to construct custom pipelines for image processing and frame transformation, making it a flexible platform for tasks like image upscaling, style transfer, and scene editing. Importantly, ComfyUI enables developers to create customized nodes for specialized processes, allowing the intern to optimize and streamline the video transformation process.
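As an illustration of what a customized node might look like, the sketch below defines a small frame-subsampling node of the kind that could sit at the start of a video pipeline. It follows the class layout used by ComfyUI custom nodes (an `INPUT_TYPES` class method, `RETURN_TYPES`, `FUNCTION`, and a `NODE_CLASS_MAPPINGS` dictionary). The node name `EveryNthFrame`, the `step` parameter, and the `video/utils` category are hypothetical choices made for this example, not part of the project specification. It assumes that ComfyUI passes an `IMAGE` input as a batch of frames indexed along the first dimension.

```python
class EveryNthFrame:
    """Hypothetical custom node: keep every `step`-th frame of a batch."""

    @classmethod
    def INPUT_TYPES(cls):
        # Declares the sockets and widgets the node exposes in the graph.
        return {
            "required": {
                "images": ("IMAGE",),
                "step": ("INT", {"default": 2, "min": 1, "max": 64}),
            }
        }

    RETURN_TYPES = ("IMAGE",)   # one output socket carrying a frame batch
    FUNCTION = "select"         # name of the method ComfyUI should call
    CATEGORY = "video/utils"    # hypothetical menu location

    def select(self, images, step):
        # Assumption: `images` is a frame batch indexed along its first
        # dimension, so slicing with a stride subsamples the frames.
        # Outputs are returned as a tuple, one entry per RETURN_TYPES item.
        return (images[::step],)


# Registration tables that ComfyUI reads when loading a custom node package.
NODE_CLASS_MAPPINGS = {"EveryNthFrame": EveryNthFrame}
NODE_DISPLAY_NAME_MAPPINGS = {"EveryNthFrame": "Every N-th Frame"}
```

A node like this would typically be placed in its own folder under ComfyUI's `custom_nodes` directory. More specialized nodes for upscaling or style transfer would follow the same structure with different inputs and processing logic.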
Key Tasks and Responsibilities:
Requirements:
Prospective interns should have a strong background in computer vision and deep learning, as well as experience with deep learning frameworks such as PyTorch or TensorFlow. Experience with video data and video processing is a plus. Strong programming skills in Python are essential.
Practical information:
1. Location: CIAD laboratory, Montbéliard, France.
2. This internship is paid.
Application:
Send a curriculum vitae, contact details for referees, and grades for the last two years before 15 December 2024 to:
Ibrahim.kajo@utbm.fr
yassine.ruichek@utbm.fr
References:
[1] Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10684-10695).
[2] Kajo, I., Kas, M., Ruichek, Y., & Kamel, N. (2023). Tensor based completion meets adversarial learning: A win-win solution for change detection on unseen videos. Computer Vision and Image Understanding, 226, 103584.
[3] ComfyUI. (2023). ComfyUI: The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface. https://github.com/comfyanonymous/ComfyUI