M2 Internship
Location: Clermont-Ferrand, France
Host institute: EnCoV Lab, Faculty of Medicine, University of Clermont Auvergne and CNRS
Duration: 3 to 6 months
Supervisors: Rasoul Sharifian, Dr. Navid Rabbani, Prof. Adrien Bartoli
Stipend: €4.35 net per hour, approximately €600 per month
Brief Description of the project:
Depth estimation plays an important role in Minimally Invasive Surgery (MIS), where it enables applications such as Augmented Reality (AR) and lesion measurement. Although recent learning-based Monocular Single-shot Depth Estimation (MoSDE) methods have shown promising results on urban scenes, they suffer from a domain gap in MIS environments [1, 2], which present challenges such as non-constant brightness and organ deformation. One approach to addressing these conditions is to exploit stereo MIS datasets: reconstruct dense disparity maps [3] and use them as pseudo ground truth to train or fine-tune MoSDE models for MIS [2]. The intern will work closely with researchers to first explore and evaluate existing stereo depth prediction models, then potentially fine-tune these models on surgical datasets to improve performance, and finally use the stereo-derived pseudo ground truth to train monocular depth estimators. Practical work with stereo cameras and lidar depth sensors may also be part of the project, providing hands-on experience with real-world applications.
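To give candidates a concrete idea of the pseudo ground-truth step, here is a minimal sketch, not the group's actual pipeline. It assumes a rectified stereo pair with a known focal length (in pixels) and baseline (in metres), and uses the standard pinhole relation depth = focal length × baseline / disparity. The function names, the `min_disparity_px` threshold, and the use of NumPy rather than PyTorch are illustrative assumptions.

```python
import numpy as np


def disparity_to_pseudo_depth(disparity_px, focal_length_px, baseline_m,
                              min_disparity_px=0.5):
    """Convert a rectified-stereo disparity map (pixels) to depth (metres).

    Returns (depth, valid). Pixels with disparity below the hypothetical
    threshold min_disparity_px are marked invalid and given depth 0.
    """
    disparity_px = np.asarray(disparity_px, dtype=np.float64)
    valid = disparity_px >= min_disparity_px
    depth = np.zeros_like(disparity_px)
    # Pinhole stereo relation: Z = f * B / d, applied to valid pixels only.
    depth[valid] = focal_length_px * baseline_m / disparity_px[valid]
    return depth, valid


def masked_l1(pred_depth, pseudo_depth, valid):
    """Mean absolute error over pixels that have a valid pseudo label."""
    pred_depth = np.asarray(pred_depth, dtype=np.float64)
    if not valid.any():
        return 0.0
    return float(np.abs(pred_depth - pseudo_depth)[valid].mean())


if __name__ == "__main__":
    # Toy 2x2 disparity map; the zero entry mimics a failed stereo match.
    disparity = np.array([[10.0, 0.0], [20.0, 5.0]])
    depth, valid = disparity_to_pseudo_depth(disparity, 1000.0, 0.005)
    print(depth)
    print(masked_l1(depth + 0.1, depth, valid))
```

In a real training setup the masked loss would typically be computed on PyTorch tensors, and a scale-invariant variant may be preferable when the monocular model predicts relative rather than metric depth.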
The internship is hosted by the EnCoV research group at the University of Clermont Auvergne and CNRS, in collaboration with the company SurgAR, and offers a collaborative and innovative research environment. The successful candidate may have the opportunity to pursue a PhD or to take up an employment contract with an innovative startup.
References:
[1] Cui, Beilei, et al. "EndoDAC: Efficient Adapting Foundation Model for Self-Supervised Depth Estimation from Any Endoscopic Camera." International Conference on Medical Image Computing and Computer-Assisted Intervention, 2024.
[2] Budd, Charlie, and Tom Vercauteren. "Transferring Relative Monocular Depth to Surgical Vision with Temporal Consistency." International Conference on Medical Image Computing and Computer-Assisted Intervention, 2024.
[3] Teed, Zachary, and Jia Deng. "RAFT: Recurrent All-Pairs Field Transforms for Optical Flow." ECCV, 2020.
Required software skills:
Skills: Computer Vision, Deep Learning
Programming languages: Python
Frameworks: PyTorch, OpenCV
Application:
Please send your application, along with your CV, to: rasoul.sharifian@surgar-surgery.com.