★ The internship will take place in the FOX team of the CRIStAL laboratory at the University of Lille.
Summary:
Novel-view synthesis has gained significant momentum in recent years, driven by advances in neural rendering and radiance field representations. Among these approaches, 3D Gaussian Splatting (3DGS) has emerged as a particularly compelling technique, offering an excellent trade-off between visual quality, training efficiency, and real-time rendering performance through an explicit, primitive-based scene representation.
Despite these advantages, 3DGS remains highly dependent on the availability and density of input viewpoints. When observations are sparse or certain viewpoints are missing, the rendering quality degrades significantly, often producing artifacts, blurred structures, or missing content. This limitation becomes especially pronounced in large-scale or real-world captures (e.g., mobile or vehicle-mounted acquisition systems), where exhaustive view coverage is impractical.
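To make the "explicit, primitive-based" nature of 3DGS concrete, the toy sketch below blends 1D Gaussian primitives front-to-back with alpha compositing. This is purely illustrative (the function names and the 1D setting are our own simplification, not the actual 3DGS renderer, which operates on projected 3D Gaussians with learned covariances).

```python
import numpy as np

def gaussian_weight(x, mean, scale):
    # Unnormalized Gaussian falloff of a primitive at position x.
    return np.exp(-0.5 * ((x - mean) / scale) ** 2)

def composite(x, primitives):
    # primitives: list of (mean, scale, opacity, color), sorted front to back.
    # Classic front-to-back alpha compositing with accumulated transmittance.
    color, transmittance = 0.0, 1.0
    for mean, scale, opacity, c in primitives:
        alpha = opacity * gaussian_weight(x, mean, scale)
        color += transmittance * alpha * c
        transmittance *= (1.0 - alpha)
    return color

# Two overlapping primitives; the front one dominates at its center.
prims = [(0.0, 1.0, 0.8, 1.0), (0.5, 1.0, 0.5, 0.0)]
value = composite(0.0, prims)  # 0.8: front contributes 0.8, back is black
```

The same accumulation, run per pixel over millions of projected 3D Gaussians, is what gives 3DGS its real-time rendering speed.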
Motivation:
The quality of rendered novel views in 3DGS is fundamentally constrained by the lack of geometric and photometric information in under-observed regions. Purely geometric or hierarchical strategies, while effective for scalability and efficiency, cannot fully compensate for missing visual evidence.
Generative models offer a promising alternative: by learning strong priors over scene appearance, they can hallucinate plausible content for unseen viewpoints. However, generative novel-view synthesis typically produces images at limited resolution and may lack the fine-grained details required for high-quality rendering.
To address this challenge, we propose a hybrid pipeline combining generative novel-view synthesis with super-resolution, aiming to recover both global scene consistency and high-frequency details suitable for rendering and visualization.
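The proposed two-stage pipeline can be sketched as follows. Both stages are stand-ins (averaging for the generative model, nearest-neighbor upsampling for the super-resolution network); the function names and shapes are hypothetical, chosen only to show how the stages would chain together.

```python
import numpy as np

def synthesize_novel_view(sparse_views, pose):
    # Placeholder for a generative novel-view model (e.g., a diffusion
    # model conditioned on sparse inputs or a 3DGS render); here we
    # simply average the available views as a stand-in.
    return np.mean(sparse_views, axis=0)

def super_resolve(image, scale=2):
    # Stand-in for a learned super-resolution network: nearest-neighbor
    # upsampling by repeating pixels along both spatial axes.
    return image.repeat(scale, axis=0).repeat(scale, axis=1)

# Toy usage: two 4x4 grayscale "views" at a hypothetical target pose.
views = [np.ones((4, 4)), np.zeros((4, 4))]
low_res = synthesize_novel_view(views, pose=None)   # 4x4, all 0.5
high_res = super_resolve(low_res, scale=2)          # 8x8
```

In the actual internship, the first stage would hallucinate plausible content for under-observed viewpoints and the second would restore the high-frequency detail the generative stage cannot produce at its native resolution.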
Objectives of the Internship:
The goal of this internship is to explore and develop methods that improve the rendering quality of 3D Gaussian Splatting in sparsely observed viewpoints by leveraging modern generative and super-resolution techniques.
The main objectives are:
- Generative Novel-View Synthesis:
  - Design or adapt generative models capable of synthesizing plausible unseen viewpoints from sparse input images or 3DGS representations.
  - Ensure geometric and appearance consistency with existing views and Gaussian primitives.
- Super-Resolution for Viewpoint Enhancement:
  - Apply and adapt deep-learning-based super-resolution methods to improve the visual quality of synthesized and low-quality rendered views.
  - Focus on enhancing fine details while preserving global appearance and scene coherence.
- Consistency-Aware Rendering Integration:
  - Address temporal and photometric inconsistencies arising from independently super-resolving multiple viewpoints.
  - Incorporate smooth rendering constraints to ensure stable, visually coherent transitions between viewpoints.
- Evaluation and Perceptual Quality:
  - Analyze the limitations of traditional image-quality metrics in this context.
  - Emphasize perceptual quality and rendering plausibility, aligned with human visual perception.
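As a concrete illustration of why traditional metrics fall short, the sketch below computes PSNR (the helper is our own, using the standard formula): a global brightness shift that viewers barely notice receives the same poor score as a structured distortion with equal mean-squared error, which is why perceptually aligned metrics matter for this project.

```python
import numpy as np

def psnr(ref, test, max_val=1.0):
    # Peak signal-to-noise ratio: 10 * log10(max_val^2 / MSE).
    mse = np.mean((ref - test) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

# A uniform +0.1 brightness shift is perceptually mild, yet PSNR
# penalizes it exactly as it would any distortion with MSE = 0.01.
ref = np.linspace(0.0, 0.9, 10)
shifted = ref + 0.1
score = psnr(ref, shifted)  # 20 dB, a "poor" score for a benign shift
```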
Desired Profile:
- Final-year Master's student (M2) or engineering student specializing in machine learning, computer vision, neural rendering, or a related field.
- Solid knowledge of computer vision, machine learning, and deep learning.
- Familiarity with generative models (diffusion models, GANs, or neural implicit representations).
- Experience with image super-resolution and perceptual learning.
- Programming skills (Python).
- Autonomy, rigor, and critical thinking.
Address of the Internship:
CAMPUS Haute-Borne CNRS IRCICA-IRI-RMN
Parc Scientifique de la Haute Borne, 50 Avenue Halley, BP 70478, 59658 Villeneuve d'Ascq Cedex, France.
How to Apply:
If you are interested in this proposal, please send the following documents to Dr. Tanmoy MONDAL (tanmoy.mondal@univ-lille.fr) and Damien MARCHAL (damien.marchal@univ-lille.fr):
- CV
- Motivation Letter
- Transcripts of grades obtained in Bachelor’s/Master’s/Engineering school as well as class ranking
- Name and contact details of at least one reference person who can be contacted if necessary
