GENERAL INFORMATION
Position: End-of-study project
Duration: 5 to 6 months, starting in February 2026 (flexible)
Institution: IMT Mines Alès, France (potential collaboration with other institutions)
Supervisor: Hajer Fradi
SUMMARY OF THE PROJECT
Project Keywords: DVS sensors, pose estimation, motion, deep learning, robotics
Project Summary: The E-Pose project focuses on developing deep learning methods for human pose estimation and motion analysis from event-based data. The aim is to evaluate the effectiveness of this approach for understanding human movement in complex and dynamic environments.
PROJECT PROPOSAL
General Context
This project will be carried out at the EuroMov laboratory, which conducts research at the intersection of movement, health, and data sciences. The laboratory hosts the Alès Imaging and Human Metrology (AIHM) platform, which supports experimentation and evaluation related to the capture and modeling of human motion and its environment.
Motivation
Bio-inspired Dynamic Vision Sensors (DVS), also known as event cameras, represent a paradigm shift in visual sensing [1]. Unlike conventional cameras, which capture frames at fixed rates, DVS sensors asynchronously record brightness changes. This alternative way of encoding visual information comes with several advantages, including microsecond-level latency, high dynamic range, robustness to motion blur, better preservation of privacy, and lower energy and computational requirements. These properties make DVS cameras particularly suitable for real-time perception in collaborative or dynamic environments.
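To make the event encoding concrete, the following minimal Python sketch (purely illustrative, not a project deliverable) accumulates a stream of events, each an (x, y, t, polarity) tuple, into a voxel-grid tensor that standard deep learning models can consume. The 346x260 resolution is an assumption matching a DAVIS346-class sensor.

    import numpy as np

    # Minimal sketch: accumulate events into a spatio-temporal voxel grid.
    # x, y: integer pixel coordinates; t: timestamps; p: polarities (+/-).
    # Sensor resolution (346x260) is an assumption, not project-specified.
    def events_to_voxel_grid(x, y, t, p, num_bins=5, height=260, width=346):
        grid = np.zeros((num_bins, height, width), dtype=np.float32)
        # Normalize timestamps to [0, num_bins) so events spread across bins.
        span = max(t.max() - t.min(), 1e-9)
        bins = ((t - t.min()) / span * (num_bins - 1e-6)).astype(int)
        # ON events contribute +1, OFF events -1, to their spatio-temporal bin.
        np.add.at(grid, (bins, y, x), np.where(p > 0, 1.0, -1.0))
        return grid

A dense tensor of this kind (or an event frame or time surface) can then be processed by conventional convolutional architectures.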
Although DVS cameras have been widely used in autonomous driving and drone navigation [2, 3], their potential in human-centered applications remains underexplored, though interest is growing [4]. This project aims to investigate event-based vision for pose estimation to analyze and understand human motion in complex or dynamic environments. Pose estimation from event data can provide rich cues for shared spatial awareness between humans and robots, enabling more responsive and natural interaction [5, 6].
Objectives
- Develop an end-to-end deep learning architecture for human pose estimation from event-based data (a minimal sketch follows below).
- Explore and discuss the potential of fusion with other conventional modalities (RGB, depth, IMU) to improve robustness and generalization.
- Evaluate the proposed approach on public datasets and, if possible, through real-world experiments, to demonstrate its effectiveness in dynamic or complex environments.
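As a rough illustration of the first objective, one plausible shape for such an architecture is a small encoder-decoder CNN mapping an event voxel grid to per-joint heatmaps. The layer widths and the 13-joint output (matching DHP19) are illustrative assumptions, not a prescribed design.

    import torch
    import torch.nn as nn

    # Illustrative sketch only: encoder-decoder CNN from event voxel grids
    # to per-joint heatmaps. Layer sizes and joint count are assumptions.
    class EventPoseNet(nn.Module):
        def __init__(self, in_bins=5, num_joints=13):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(in_bins, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            )
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(64, num_joints, 4, stride=2, padding=1),
            )

        def forward(self, voxel_grid):
            # voxel_grid: (B, in_bins, H, W) with H, W divisible by 4;
            # returns heatmaps of shape (B, num_joints, H, W).
            return self.decoder(self.encoder(voxel_grid))

Joint locations can be read off as heatmap argmaxes; the fusion mentioned in the second objective could be explored by concatenating features from modality-specific encoders before decoding.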
Expected Outcomes
- A proof-of-concept pipeline for event-based human pose estimation.
- Quantitative evaluation of the proposed approach on public datasets such as DHP19 or EventHPE (one commonly used metric is sketched below).
- Preliminary experiments on locally captured event data to assess the system's robustness in dynamic or real-world conditions.
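For the quantitative evaluation, a metric commonly reported on 3D human pose benchmarks such as DHP19 is MPJPE (Mean Per-Joint Position Error). A minimal sketch, assuming predictions and ground truth are given as arrays of 3D joint coordinates:

    import numpy as np

    # Sketch of MPJPE: mean Euclidean distance between predicted and
    # ground-truth joints, in the dataset's units (typically mm).
    def mpjpe(pred, gt):
        # pred, gt: (num_frames, num_joints, 3) joint coordinate arrays.
        return np.linalg.norm(pred - gt, axis=-1).mean()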
References |
[1] Gallego, G., Delbrück, T., Orchard, G., Bartolozzi, C., Taba, B., Censi, A., ... & Scaramuzza, D. (2020). Event-based vision: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(1), 154-180.
[2] Fradi, H., & Papadakis, P. (2024, December). Advancing object detection for autonomous vehicles via general purpose event-RGB fusion. In 2024 Eighth IEEE International Conference on Robotic Computing (IRC) (pp. 147-153). IEEE.
[3] Souissi, A., Fradi, H., & Papadakis, P. (2025). Towards event-driven, end-to-end UAV tracking using deep reinforcement learning. ACM MM 2025 Workshop UAVM.
[4] Adra, M., Melcarne, S., Mirabet-Herranz, N., & Dugelay, J. L. (2025). Event-based solutions for human-centered applications: A comprehensive review. arXiv preprint arXiv:2502.18490.
[5] Cho, H., Kim, T., Jeong, Y., & Yoon, K. J. (2024). A benchmark dataset for event-guided human pose estimation and tracking in extreme conditions. Advances in Neural Information Processing Systems, 37, 134826-134840.
[6] Zou, S., Mu, Y., Ji, W., Wang, Z. A., Zuo, X., Wang, S., ... & Cheng, L. (2025). Highly efficient 3D human pose tracking from events with spiking spatiotemporal transformer. IEEE Transactions on Circuits and Systems for Video Technology.
APPLICATION
Candidate Profile
The candidate should be in the final year of a Master's or engineering degree. The balance between research and development will be determined based on the candidate's profile.
- Strong programming skills in Python are required.
- Experience with, and a strong interest in, deep learning frameworks, particularly PyTorch, is expected.
- Good oral and written communication skills in English.
How to apply
Candidates with a strong interest in computer vision, deep learning, and event-based human motion analysis, whose profiles meet the requirements above, are invited to send a CV and academic transcripts as soon as possible, preferably before October 28, to hajer.fradi@mines-ales.fr.