8 January 2024
Category: Internship (Stagiaire)
Title: Frugal Learning / Zero-Shot Learning for Image Semantic Segmentation
Application to COVERED (CollabOratiVE Robot Environment Dataset) for 3D Semantic Segmentation [1]
Scientific fields: Computer science
Keywords: Machine Learning; Deep Learning; Frugality; Semantic Segmentation; RGB Color Image; Human-Robot Collaboration (HRC); Multi-LiDAR; Real Industrial Environment; Human-Machine Interaction
Ahed ALBOODY (Associate Professor in Computer Science – CESI LINEACT) https://lineact.cesi.fr/en/cv-chercheurs/alboody-ahed/
Email: aalboody@cesi.fr
Associate Professor (Enseignant-Chercheur) at CESI École d'Ingénieurs. Research interests: Computer Vision; Machine Learning; Deep Learning.
Recently, the concept of frugal machine learning has been introduced in [1, 2] in order to define what a frugal machine learning methodology should be and how to evaluate the frugality of models. Frugal Artificial Intelligence (AI) promises to train machine learning algorithms and deep learning models using less data (zero examples, one example, or little data) and less computing power (frugality in power/energy efficiency), while guaranteeing robustness and performance in an application domain (computer vision, image semantic segmentation, etc.). A major challenge arises when learning models have to be trained on tasks for which zero or only one training example is available. This scarcity of data is a major obstacle to training learning methods (deep learning in particular). Among other things, learning from zero/few data in computer vision applications requires developing frugal learning models capable of operating (in training as in inference) on constrained computing architectures (edge computing: GPU & TPU). Several aspects of frugality should be considered: data frugality, based on zero-/few-shot learning methods; frugality in terms of deep learning model complexity and computing capacity; and frugality in terms of energy efficiency and low-power AI.
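To make these frugality aspects concrete, the following minimal Python/PyTorch sketch (illustrative only, not part of the proposal itself) shows simple indicators that could be reported for any candidate segmentation model: parameter count, memory footprint of the weights, and wall-clock inference latency. The toy model is a placeholder, not a proposed architecture.

```python
# Minimal sketch of simple frugality indicators for a candidate model (assumption:
# PyTorch is used; the toy model below is only a placeholder).
import time
import torch
import torch.nn as nn

def frugality_report(model: nn.Module, input_shape=(1, 3, 512, 512), device="cpu"):
    model = model.to(device).eval()
    n_params = sum(p.numel() for p in model.parameters())                      # model size
    weight_mb = sum(p.numel() * p.element_size() for p in model.parameters()) / 1e6
    x = torch.randn(*input_shape, device=device)
    with torch.no_grad():
        t0 = time.perf_counter()
        model(x)                                                               # one forward pass
        latency = time.perf_counter() - t0
    return {"params": n_params, "weights_MB": weight_mb, "latency_s": latency}

# Toy fully convolutional model, for illustration only.
toy = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 21, 1))
print(frugality_report(toy))
```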
Image semantic segmentation [3, 4, 5] is a fundamental task in image processing, pattern recognition and computer vision, and a key issue in many applications, e.g., industrial imaging, medical imaging, microscopy and remote sensing. Zero-Shot Semantic Segmentation (ZS3) [6, 7, 8, 9], for example, aims to segment novel categories that have not been seen during training. In [7], the authors proposed ZegFormer, the first framework that decouples zero-shot semantic segmentation into: 1) class-agnostic segmentation and 2) segment-level zero-shot classification.
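As an illustration of this decoupling idea (a simplified sketch of our own, not the actual ZegFormer implementation), the second stage can be viewed as matching the pooled feature of each class-agnostic segment against per-class embeddings, which allows labels unseen during training to be assigned:

```python
# Sketch of segment-level zero-shot classification: pool image features inside each
# class-agnostic mask and match them to class embeddings (e.g., text embeddings of
# seen and unseen class names). All tensors here are random placeholders.
import torch
import torch.nn.functional as F

def classify_segments(pixel_feats, masks, class_embeddings):
    """pixel_feats: (D, H, W) image features; masks: (K, H, W) binary segment masks;
    class_embeddings: (C, D) one embedding per class name (seen or unseen)."""
    D = pixel_feats.shape[0]
    feats = pixel_feats.reshape(D, -1)                                   # (D, H*W)
    m = masks.reshape(masks.shape[0], -1).float()                        # (K, H*W)
    seg_feats = (m @ feats.T) / m.sum(dim=1, keepdim=True).clamp(min=1)  # (K, D) mean-pooled
    sims = F.cosine_similarity(seg_feats.unsqueeze(1),
                               class_embeddings.unsqueeze(0), dim=-1)    # (K, C)
    return sims.argmax(dim=1)                                            # predicted class per segment

# Toy usage: 2 segments, 4 classes, random features and embeddings.
pred = classify_segments(torch.randn(64, 32, 32),
                         torch.randint(0, 2, (2, 32, 32)),
                         torch.randn(4, 64))
print(pred)
```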
In this internship, we will study frugality in the context of machine/deep learning (ML/DL) for image semantic segmentation. The objective of this proposal is to propose and develop frugal learning models (zero-shot learning) based on deep learning architectures for image semantic segmentation that provide efficient results while being structured to offer reduced time and space complexity. Whether in deep neural networks or in other machine learning models, the space and time complexity is mainly due to the linear part of the models, involving large matrices or tensors of data and parameters; a key challenge is to reduce this particular aspect (see the sketch below). More precisely, we will consider frugality aspects (data, model complexity, time consumption, etc.) to propose our own models by taking inspiration from the following recent works: 1) the design of ultra-lightweight models [4, 5]; 2) frugality with respect to data/image labels and zero-shot image segmentation [7-14].
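As a concrete illustration of reducing the complexity of this linear part, one possible direction (an assumption on our side, not the proposal's final design) is to replace a dense weight matrix with a low-rank factorization:

```python
# Sketch: replace a dense d_out x d_in weight matrix with a rank-r factorization,
# cutting parameters from d_in*d_out to r*(d_in + d_out).
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """Approximates nn.Linear(d_in, d_out) with two thin factors of rank r."""
    def __init__(self, d_in: int, d_out: int, rank: int):
        super().__init__()
        self.down = nn.Linear(d_in, rank, bias=False)   # (r x d_in) factor
        self.up = nn.Linear(rank, d_out, bias=True)     # (d_out x r) factor

    def forward(self, x):
        return self.up(self.down(x))

dense = nn.Linear(1024, 1024)
lowrank = LowRankLinear(1024, 1024, rank=64)
count = lambda m: sum(p.numel() for p in m.parameters())
print(count(dense), "->", count(lowrank))   # ~1.05M -> ~0.13M parameters
```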
The proposed models will be trained on datasets for 3D semantic segmentation, in particular COVERED (CollabOratiVE Robot Environment Dataset) [15]. In the emerging Industry 5.0 paradigm, conventional robots are being replaced with more intelligent and flexible collaborative robots (cobots). Safe and efficient collaboration between cobots and humans largely relies on the cobot's comprehensive semantic understanding of the dynamic surroundings of industrial environments. Despite its importance for such applications, 3D semantic segmentation of collaborative robot workspaces still lacks sufficient research. Finally, to measure accuracy and study performance, we will apply the proposed models to datasets for Human-Robot Collaboration (HRC) in real industrial environments [16].
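For this performance study, segmentation quality on point-cloud data is commonly reported with per-class IoU and mean IoU; a minimal sketch follows (the class names used in the example are illustrative placeholders, not the official COVERED label set):

```python
# Sketch of per-class IoU / mean IoU over per-point labels, as typically reported
# for 3D semantic segmentation benchmarks.
import numpy as np

def mean_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int) -> float:
    """pred, gt: (N,) integer class labels, one per 3D point."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:                      # ignore classes absent from both
            ious.append(inter / union)
    return float(np.mean(ious)) if ious else 0.0

# Toy usage with 5 placeholder classes (e.g., floor, robot arm, human, table, background).
rng = np.random.default_rng(0)
gt = rng.integers(0, 5, size=10_000)
pred = np.where(rng.random(10_000) < 0.8, gt, rng.integers(0, 5, size=10_000))
print(f"mIoU = {mean_iou(pred, gt, 5):.3f}")
```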
The expected work should contribute to the following research directions:
1. Proposing new frugal learning models based on zero-shot learning and lightweight models;
2. Studying the properties of such models (complexity, expressivity, frugality);
3. Applications of frugal learning models to datasets for 3D semantic segmentation.
The working plan is divided into two phases:
In the first phase (about two months), the student will produce the state of the art (SOTA) of frugality in the context of machine/deep learning applied to the 3D image semantic segmentation task.
In the second phase (about four months), the student will propose contributions to the following research directions:
1. Proposing new frugal learning models based on zero-shot learning and lightweight models;
2. Studying the properties of such models (complexity, expressivity, frugality);
3. Applying these frugal learning models to datasets for the semantic segmentation task.
Several scientific productions are expected, in the form of an international peer-reviewed conference paper or an indexed journal paper:
1. A publication on the literature review of frugal learning for semantic segmentation;
2. A publication on our proposed new frugal model, its performance and its evaluation on Human-Robot Collaboration datasets for semantic segmentation in a real industrial environment.
CESI LINEACT (EA 7527), the Digital Innovation Laboratory for Business and Learning at the Service of the Competitiveness of Territories, anticipates and accompanies the technological transformations of the sectors and services related to industry and construction. CESI's historical proximity to companies is a determining factor for our research activities and has led us to focus our efforts on applied research, close to companies and in partnership with them. A human-centered approach coupled with the use of technologies, as well as the territorial network and the links with training, have allowed us to build transversal research: it places the human being, with their needs and uses, at the center of its research questions and addresses the technological angle through these contributions.
Its research is organized according to two interdisciplinary scientific themes and two application areas.
1. Theme 1, "Learning and Innovation", is mainly concerned with Cognitive Sciences, Social Sciences and Management Sciences, Training Sciences and Techniques, and Innovation Sciences. The main scientific objectives of this theme are to understand the effects of the environment, and more particularly of situations instrumented by technical objects (platforms, prototyping workshops, immersive systems, etc.), on the learning, creativity and innovation processes.
2. Theme 2, "Engineering and Digital Tools", is mainly concerned with Digital Sciences and Engineering. The main scientific objectives of this theme concern the modeling, simulation, optimization and data analysis of industrial or urban systems. The research work also focuses on the associated decision support tools and on the study of digital twins coupled with virtual or augmented environments.
These two themes develop and intersect their research in the two application areas of the Industry of the Future and the City of the Future, supported by research platforms, mainly the one in Rouen dedicated to the Factory of the Future and the one in Nanterre dedicated to the Factory and Building of the Future.
Link to the laboratory website: https://lineact.cesi.fr/en/
Work location: CESI – Campus Nice
Address: Campus Sud des Métiers, 13 avenue Simone Veil, 06200 Nice, Provence-Alpes-Côte d’Azur, France
Starting Date: Preferably from February to July 2024
Duration: 6 months
Please send your application to aalboody@cesi.fr with the subject “CESI-Internship Frugal_Learning”
[1] Lingjiao Chen, Matei Zaharia, and James Y. Zou, "FrugalML: How to use ML Prediction APIs more accurately and cheaply," in Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020 (NeurIPS 2020), December 6-12, 2020, Virtual, Hugo Larochelle, Marc'Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, and Hsuan-Tien Lin, Eds., 2020.
[2] Mikhail Evchenko, Joaquin Vanschoren, Holger H. Hoos, Marc Schoenauer, and Michèle Sebag, "Frugal Machine Learning," arXiv:2111.03731 [cs, eess], Nov. 2021.
[3] Shervin Minaee, Yuri Y. Boykov, Fatih Porikli, Antonio J. Plaza, Nasser Kehtarnavaz, and Demetri Terzopoulos, "Image segmentation using deep learning: A survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1–1, 2021.
[4] Linjie Wang, Quan Zhou, Chenfeng Jiang, Xiaofu Wu, and Longin Jan Latecki, "DRBANET: A Lightweight Dual-Resolution Network for Semantic Segmentation with Boundary Auxiliary," arXiv:2111.00509 [cs], Oct. 2021.
[5] Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, and Ping Luo, "SegFormer: Simple and efficient design for semantic segmentation with transformers," CoRR, vol. abs/2105.15203, 2021.
[6]W. Ren, Y. Tang, Q. Sun, C. Zhao and Q. -L. Han, "Visual semantic segmentation based on few/zero-shot learning: An overview," in IEEE/CAA Journal of Automatica Sinica, doi: 10.1109/JAS.2023.123207.
[7]J. Ding, N. Xue, G. Xia and D. Dai, "Decoupling Zero-Shot Semantic Segmentation," 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 2022, pp. 11573-11582, doi: 10.1109/CVPR52688.2022.01129.
[8]B. Michele, A. Boulch, G. Puy, M. Bucher and R. Marlet, "Generative Zero-Shot Learning for Semantic Segmentation of 3D Point Clouds," 2021 International Conference on 3D Vision (3DV), London, United Kingdom, 2021, pp. 992-1002, doi: 10.1109/3DV53792.2021.00107.
[9]N. Kato, T. Yamasaki and K. Aizawa, "Zero-Shot Semantic Segmentation via Variational Mapping," 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Seoul, Korea (South), 2019, pp. 1363-1370, doi: 10.1109/ICCVW.2019.00172.
[10]Z. Gu, S. Zhou, L. Niu, Z. Zhao and L. Zhang, "From Pixel to Patch: Synthesize Context-Aware Features for Zero-Shot Semantic Segmentation," in IEEE Transactions on Neural Networks and Learning Systems, vol. 34, no. 10, pp. 7689-7703, Oct. 2023, doi: 10.1109/TNNLS.2022.3145962.
[11]Gu, Zhangxuan et al. “Context-aware Feature Generation For Zero-shot Semantic Segmentation.” Proceedings of the 28th ACM International Conference on Multimedia (2020).
[12] Maxime Bucher, Tuan-Hung Vu, Matthieu Cord, and Patrick Pérez, "Zero-shot semantic segmentation," in Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019 (NeurIPS 2019), December 8-14, 2019, Vancouver, BC, Canada, Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d'Alché-Buc, Emily B. Fox, and Roman Garnett, Eds., 2019, pp. 466–477.
[13]Y. Zheng, J. Wu, Y. Qin, F. Zhang and L. Cui, "Zero-Shot Instance Segmentation," 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 2021, pp. 2593-2602, doi: 10.1109/CVPR46437.2021.00262.
[14]Chen, YC., Lai, CF. An intuitive pre-processing method based on human–robot interactions: zero-shot learning semantic segmentation based on synthetic semantic template. J Supercomput 79, 11743–11766 (2023). https://doi.org/10.1007/s11227-023-05068-8
[15]C. Munasinghe, F. M. Amin, D. Scaramuzza and H. W. van de Venn, "COVERED, CollabOratiVE Robot Environment Dataset for 3D Semantic segmentation," 2022 IEEE 27th International Conference on Emerging Technologies and Factory Automation (ETFA), Stuttgart, Germany, 2022, pp. 1-4, doi: 10.1109/ETFA52439.2022.9921525.
[16]J. Fan, P. Zheng and C. K. M. Lee, "A Multi-Granularity Scene Segmentation Network for Human-Robot Collaboration Environment Perception," 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Kyoto, Japan, 2022, pp. 2105-2110, doi: 10.1109/IROS47612.2022.9981684.
[1] Dataset repository: Fatemeh-MA/COVERED – Dataset for 3D semantic segmentation in robotic industrial environment (github.com).