Announcement

M2 internship: Human action recognition using the fusion of spatio-temporal data and scene interpretation

13 November 2024


Category: Intern


Host laboratory: Connaissance et Intelligence Artificielle Distribuées (CIAD) – http://www.ciad-lab.fr.

Keywords: Human action recognition, classification, video data, deep learning, scene interpretation, robots, autonomous vehicles.

Contacts: Abderrazak Chahi (abderrazak.chahi@utbm.fr), Yassine Ruichek (yassine.ruichek@utbm.fr)

Description of the internship topic:

Human action recognition in video sequences is a particularly difficult problem due to variations in the appearance and motion of people and actions, changing camera viewpoints, moving backgrounds, occlusions, noise, and the enormous volume of video data. Detecting and understanding human activity or motion in video is essential for a variety of applications, such as video surveillance and anomaly detection in crowded scenes, and safe, cooperative interaction between humans and robots in shared workspaces. Action and motion recognition can also be used in intelligent and/or autonomous vehicles to detect driver behavior and improve road safety. Over the past decade, significant progress has been made in action and motion recognition using spatiotemporal representations of video sequences, optical flow information, and the fusion of the two.

The objective of this project is to develop new machine learning approaches that fuse spatiotemporal information with a scene understanding model to produce a state-adaptive representation of the scene. The scene state understanding model will extract situation data (interpretation, context, circumstances, etc.) related to the different states and conditions in the scene. This model leverages deep dual-stream architectures to integrate human context (such as postures, expressions, or interactions) and environmental context (background, objects) for a more comprehensive understanding of the scene. Experiments will be conducted on well-known video datasets for action recognition. We also plan to apply the developed methods on at least one of the laboratory's experimental platforms (automated vehicles and robots equipped with perception and localization sensors and communication interfaces).
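
To make the dual-stream idea concrete, the following is a minimal, illustrative PyTorch sketch of a two-branch model: a spatio-temporal stream encodes a short RGB clip, a scene-context stream encodes a single context frame, and both are fused by concatenation before classification. The module names, feature dimensions, and the late-fusion choice are assumptions made for this example only, not the project's prescribed architecture.

# Illustrative sketch (not the project's actual architecture): a dual-stream model
# fusing a spatio-temporal clip encoder with a per-frame scene-context encoder.
# All module names, feature sizes, and the late-fusion-by-concatenation choice are
# assumptions made for this example.
import torch
import torch.nn as nn


class DualStreamActionNet(nn.Module):
    def __init__(self, num_classes: int, st_dim: int = 256, ctx_dim: int = 128):
        super().__init__()
        # Spatio-temporal stream: 3D convolutions over an RGB clip (B, 3, T, H, W).
        self.spatiotemporal = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(64, st_dim),
        )
        # Scene-context stream: 2D convolutions over a single context frame (B, 3, H, W),
        # standing in for human/environment context (postures, background, objects).
        self.context = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, ctx_dim),
        )
        # Late fusion by concatenation, followed by the action classifier.
        self.classifier = nn.Linear(st_dim + ctx_dim, num_classes)

    def forward(self, clip: torch.Tensor, context_frame: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.spatiotemporal(clip), self.context(context_frame)], dim=1)
        return self.classifier(fused)


if __name__ == "__main__":
    model = DualStreamActionNet(num_classes=10)
    clip = torch.randn(2, 3, 16, 112, 112)        # batch of 16-frame RGB clips
    context_frame = torch.randn(2, 3, 112, 112)   # one context frame per clip
    print(model(clip, context_frame).shape)       # torch.Size([2, 10])

In practice, either branch could be replaced by a pretrained backbone, and richer fusion schemes (attention, gating) could be explored; the sketch only shows how the two information sources meet in a single trainable model.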

References:

  • Kumie, G. A., Habtie, M. A., Ayall, T. A., Zhou, C., Liu, H., Seid, A. M., & Erbad, A. (2024). Dual-attention network for view-invariant action recognition. Complex & Intelligent Systems, 10(1), 305-321.
  • Karim, M., Khalid, S., Aleryani, A., Khan, J., Ullah, I., & Ali, Z. (2024). Human action recognition systems: A review of the trends and state-of-the-art. IEEE Access.
  • Bianco, V., Finisguerra, A., & Urgesi, C. (2024). Contextual Priors Shape Action Understanding before and beyond the Unfolding of Movement Kinematics. Brain Sciences, 14(2), 164.
  • Qiu, H., & Hou, B. (2024). Multi-grained clip focus for skeleton-based action recognition. Pattern Recognition, 148, 110188.

Candidate Profile:

  • Holding or currently preparing a Master's degree in computer science, computer vision, deep learning, robotics, or a related field.
  • Advanced knowledge of and practical experience with object-oriented programming (C++, Python) and machine learning tools (deep learning platforms: PyTorch, TensorFlow, MATLAB) are required.
  • Knowledge of the ROS framework will be appreciated.
  • An advanced level of written and spoken English is required.

Applications (CV, transcripts, reference letters, …) should be sent to Abderrazak Chahi (abderrazak.chahi@utbm.fr) and Yassine Ruichek (yassine.ruichek@utbm.fr) - Deadline: December 25, 2024

Starting date: February/March 2025