Meeting

Detection of Generated Content

Date: 4 April 2025
Time: 09:00 - 17:00
Venue: CNRS - Délégation Paris Michel-Ange, Campus Gérard Mégié, 3 rue Michel-Ange - 75794 Paris cedex 16, Amphithéâtre Curie

Scientific axes:

Multimedia coding and security

GdRs involved:

GdR Sécurité Informatique

Organizers:

- Jan Butora (CRIStAL)
- Vincent Itier (CRIStAL)
- Pauline Puteaux (CRIStAL)
- Eva Giboulot (INRIA - Rennes)

As a reminder, registration for meetings is free but mandatory, so that all registered participants are guaranteed access to the meeting rooms.

Registration

37 members of the GdR IASIS and 86 non-members are registered for this meeting.

Room capacity: 194 people. 71 seats remaining.

Registration for this day is closed.

Announcement

Joint day between the GdR IASIS and the GdR Sécurité Informatique

In a context where generating content with artificial intelligence is becoming ever simpler and more accessible, the ability to distinguish generated content from content that is not generated is essential. The working group (GT) “Détection de Contenus Générés (DCG)” organizing this event focuses on the analysis and detection of content synthesized by deep learning models, such as large language models (LLMs) for text and diffusion models for images, among many other applications.

Detecting generated content goes beyond simple differentiation; it also lies at the heart of the fight against digital forgeries. Indeed, only part of a piece of content may be generated, which makes manipulations harder to identify. These AI-based approaches are gradually taking over from traditional image and video manipulation methods (Photoshop, Gimp, etc.) and are now integrated directly into these tools (such as Adobe Firefly in Photoshop). This evolution raises important issues for the reliability, security, and ethics of digital content.

The event focuses on several key themes around AI-based content detection and manipulation:

– Detection of AI-generated content: text, audio, video, and other formats.
– Localization of AI manipulations: identifying where and how artificial intelligence was used to modify or create content.
– AI watermarking: techniques that embed digital watermarks to trace the origin of generated content.
– Explainability of AI detectors and classifiers: understanding how detection systems work and making them transparent, in order to improve their reliability and interpretability.

Call for participation

We invite PhD students, postdoctoral researchers, and early-career researchers to submit a proposal to present their work at the event. Presentations will last 15 to 20 minutes, followed by 5 minutes of questions and answers. These talks will be an opportunity to share recent research and to stimulate discussion of the challenges and advances in the detection and manipulation of AI-generated content.

Presentations may be given in French or in English, but slides must be in English.

If you would like to participate, please send an e-mail to jan.butora@cnrs.fr before 13 March to submit your proposal and obtain more details on the submission process.

We strongly encourage participation from the emerging academic community and look forward to exploring together the latest advances in this exciting field.

Organizers

– Jan Butora (CNRS, CRIStAL), jan.butora@cnrs.fr
– Eva Giboulot (Inria, IRISA), eva.giboulot@inria.fr
– Vincent Itier (IMT Nord Europe, CERI SN, CRIStAL), vincent.itier@imt-nord-europe.fr
– Pauline Puteaux (CNRS, CRIStAL/LIRMM), pauline.puteaux@cnrs.fr

Speakers

– Ewa Kijak (Université de Rennes 1, IRISA), https://people.irisa.fr/Ewa.Kijak/
– Christian Riess (Friedrich-Alexander-University of Erlangen-Nuremberg), https://www.cs1.tf.fau.de/christian-riess/

The day receives logistical support from the PEPR Cybersécurité, COMPROMIS project.
Website: https://www.pepr-cybersecurite.fr/projet/compromis/

Program

09:00: Opening

09:10 - 10:10: Invited speaker Christian Riess – “Neural Waste Processing: Detecting Generated Images with Pre-Trained Neural Networks”, Friedrich-Alexander-Universität

10:10 - 10:35: Coffee break

10:35 - 11:00: Raphaël Couronné – “Building a digital common to benchmark AI-generated image detectors”, PEReN

11:00 - 11:25: Minh Thong Doi – “DeepFake Detection based on Noise Residuals”, CERI SN, CRIStAL

11:25 - 11:50: Alexandre Libourel – “DeTOX: Combatting deepfakes of (French) celebrities”, EURECOM

11:50 - 12:15: Etienne Levecque – “Manipulation localization via JPEG compatibility attack”, LIST3N, Université de Technologie de Troyes

----- Lunch break -----

14:15 - 15:15: Invited speaker Ewa Kijak – “From detection to characterization: what can we tell about an image modification?”, Université de Rennes 1, IRISA

15:15 - 15:40: Coffee break

15:40 - 16:05: Lilian Bour – “Latent representation manipulation for face editing”, GREYC, Université Caen Normandie

16:05 - 16:30: Matthieu Dubois – “MOSAIC: Machine-generated text detection in an unsupervised setting”, Sorbonne Université, CNRS, ISIR

16:30 - 16:55: Aghilas Sini – “Curriculum Learning for Fake Speech detection (CLeFS)”, LIUM, Le Mans Université

16:55: Closing remarks

Abstracts

Neural Waste Processing: Detecting Generated Images with Pre-Trained Neural Networks

Invited speaker: Christian Riess (Friedrich-Alexander-Universität)

In the recent past, the detection of generated images became one of the hottest topics in image forensics: given an input image, a machine learning system has to decide whether the image was captured with an actual camera or generated with a neural network. It quickly turned out that it is relatively easy to train a detector for one specific set of real and generated images, but that it is relatively difficult to train a detector on one set of images and apply it on another. This so-called generalization problem is arguably the largest obstacle in the deployment of forensic detectors as a service.

In this talk, we will look at one particular design paradigm for forensic detectors that generalizes comparably well: crafting a forensic detector around a neural network that has been trained for another task. As such, the detection of generated images becomes more of an indirect task that recycles some otherwise irrelevant artifacts of the pretrained network. Various facets of this idea can be seen in different scientific works, and we will add some investigations of our own towards a better understanding of the underlying design pattern.

Building a digital common to benchmark AI-generated image detectors

Raphaël Couronné (Pôle d'Expertise de la Régulation Numérique (PEReN))

When addressing disinformation campaigns on the internet, determining whether content is AI-generated is crucial for understanding the tactics employed. However, current state-of-the-art open-source classification algorithms may not be both high-performing and consistent enough to identify fake content in real-world scenarios. In this work, we limited the scope to the binary classification of fully AI-generated images. Our main contribution is a code library designed to benchmark current state-of-the-art open-source classification methods. As a proof of concept, we performed an initial benchmark with 13 already trained classifiers and a balanced validation dataset. We neither fine-tuned nor retrained the classifiers, in order to stay as close as possible to their original behavior. We first looked at performance degradation against key cofactors within the validation dataset. Then, we assessed performance degradation by testing on academic datasets closer to real-world conditions. We focused on the two best classifiers on the benchmark, as well as an ensemble of all classifiers. To prioritize precision over recall, we calibrated the thresholds of these three classifiers using the validation dataset.
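As a rough illustration of the threshold calibration mentioned in this abstract, the sketch below picks the lowest score threshold whose precision on a validation set reaches a target value. The function name `calibrate_threshold`, the target precision, and the input layout are assumptions made for illustration; they are not taken from the benchmark library presented in the talk.

```python
def calibrate_threshold(scores, labels, target_precision=0.95):
    """Return the lowest threshold whose validation precision meets the target.

    scores: detector outputs, higher means "more likely AI-generated".
    labels: 1 for generated images, 0 for real ones.
    Returns None if no threshold reaches the target precision.
    """
    best = None
    # Candidate thresholds are the observed scores, from highest to lowest.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        precision = tp / (tp + fp)  # tp + fp >= 1 because t is an observed score
        if precision >= target_precision:
            best = t  # a lower threshold keeps more recall, so keep scanning
    return best
```

Choosing the lowest threshold that still meets the precision target keeps as much recall as possible while honouring the precision constraint.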

DeepFake Detection based on Noise Residuals

Minh Thong Doi (CERI SN, CRIStAL)

Deepfake technology presents significant challenges, particularly in fraud, misinformation, and evidence tampering. As deepfakes become more prevalent, effective detection methods are essential. We introduce DJIN, a deepfake detection model designed to retain noise components by avoiding pooling layers in the initial stages. Pre-trained on ImageNet for steganography using the JIN version, DJIN demonstrates superior performance compared to CoDE, Corvi et al., and CLIP. DJIN outperforms these models in handling high-quality images and processing images of all sizes. Given that deepfake generators typically produce high-quality outputs, an explainability analysis shows that DJIN leverages image noise by focusing more on darker areas in real images and brighter areas in generated ones.

DeTOX: Combatting deepfakes of (French) celebrities

Alexandre Libourel (EURECOM)

Politicians and government leaders are prime targets for deepfake attacks, where a single manipulated video can severely damage reputations or even pose national security risks. Attackers exploit vast amounts of publicly available audio and video data to generate highly realistic deepfakes. Current deepfake generation models successfully replicate the physical biometrics of an individual, such as facial structure and voice, but fail to reproduce their behavioral biometrics, including characteristic facial expressions and movement patterns. This gap can be exploited to improve the performance of Person of Interest (POI) deepfake detection. By learning a POI’s unique facial dynamics and gestures, these detectors can identify inconsistencies where deepfakes fail to capture genuine behavioral traits. Unlike previous approaches relying on Facial Action Units, an incomplete representation of expressivity, our method models POI-specific behaviors using a state-of-the-art face-reenactment encoder coupled with a transformer to extract behavioral patterns. Furthermore, our method does not require any deepfake samples during training. This independence from specific deepfake generation techniques makes our approach robust and adaptable, particularly for protecting high-profile individuals most vulnerable to deepfake threats. This work is part of the DeTOX project, a French defense program from the ’Astrid: guerre Cognitive’ consortium. The project aims at detecting deepfakes of French VIPs by leveraging generation weaknesses in video, audio, and audio-video synchronization.

Manipulation localization via JPEG compatibility attack

Etienne Levecque (LIST3N, Université de Technologie de Troyes)

JPEG compression and decompression act as many-to-one functions: every block they output has at least one antecedent. However, when local modifications are made to a JPEG image, such as inpainting using a generator, this property can be disrupted. In this case, the manipulated block may lack any antecedent, rendering it incompatible. If we know the exact JPEG pipeline of the image, we can attempt to recover one antecedent for each block. If this fails for certain blocks, it proves that these blocks have been altered, allowing us to pinpoint the manipulation down to the block level. Conversely, if all blocks have at least one antecedent, we can consider the image to be authentic. This approach is powerful and explainable, and its false positive rate depends on the number of iterations and converges to zero.
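A minimal sketch of the antecedent test described in this abstract, under strong simplifying assumptions that are not taken from the talk: a single grayscale 8x8 block, a floating-point DCT, a known quantization table `q`, and only the nearest candidate antecedent is tried. All function names are hypothetical. When the check fails for fine quantization steps, the result is inconclusive, because a thorough search would have to examine further candidates, which this sketch does not attempt.

```python
import math

N = 8
# Orthonormal 8x8 DCT-II basis matrix.
C = [[(math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N))
      * math.cos((2 * x + 1) * u * math.pi / (2 * N))
      for x in range(N)] for u in range(N)]


def matmul(a, b):
    """Product of two 8x8 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


CT = [list(row) for row in zip(*C)]  # transpose of C


def decompress(coeffs, q):
    """Simplified decoder: dequantize, inverse DCT, level shift, round, clip."""
    deq = [[coeffs[i][j] * q[i][j] for j in range(N)] for i in range(N)]
    spatial = matmul(matmul(CT, deq), C)
    return [[min(255, max(0, round(v + 128))) for v in row] for row in spatial]


def nearest_antecedent(block, q):
    """Quantized DCT coefficients closest to the given pixel block."""
    shifted = [[p - 128 for p in row] for row in block]
    dct = matmul(matmul(C, shifted), CT)
    return [[round(dct[i][j] / q[i][j]) for j in range(N)] for i in range(N)]


def has_obvious_antecedent(block, q):
    """True if the nearest candidate decompresses exactly to the block."""
    return decompress(nearest_antecedent(block, q), q) == block
```

A block produced by `decompress` passes the check, while changing one of its pixels by a few grey levels typically makes it fail when the quantization steps are coarse. Running the check on every 8x8 block of an image would give a block-level map of suspicious regions.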

From detection to characterization: what can we tell about an image modification?

Invited speaker: Ewa Kijak (Université de Rennes 1, IRISA)

Latent representation manipulation for face editing

Lilian Bour (GREYC)

Generative models can modify facial attributes such as smiles, beards, and gender, but their efficiency and evaluation remain open questions. This presentation introduces a comprehensive evaluation framework for facial editing models, focusing on three key aspects: image quality, identity preservation, and attribute entanglement. Evaluation of three editing models is presented across two datasets and future research directions are highlighted. Image quality is assessed using standard metrics, including Structural Similarity Index Measure (SSIM), Learned Perceptual Image Patch Similarity (LPIPS), and Fréchet Inception Distance (FID). Additionally, no-reference facial quality assessment methods specifically designed for face evaluation are explored, offering a novel approach to evaluating these models. Identity preservation is measured using facial recognition models, while attribute entanglement is analyzed through facial attribute classification models. The results reveal that image quality can be low for certain models, with noticeable artifacts affecting the final output. Identity preservation is generally well maintained, ensuring that the edited images retain key facial characteristics. However, attribute entanglement remains a challenge, leading to unintended modifications of non-target attributes.

MOSAIC: Machine-generated text detection in an unsupervised setting

Matthieu Dubois (Sorbonne Université, CNRS, ISIR)

Curriculum Learning for Fake Speech detection (CLeFS)

Aghilas Sini (LIUM)

This work addresses the inability of deepfake speech detection systems to anticipate new attacks, owing to the rapid rise and evolution of speech synthesis technologies. We propose to use Curriculum Learning as a training mechanism to design a system that can evolve and is more robust to emerging attacks.

Our experiments were conducted under the constraints of the international ASVSpoof5 challenge, namely a ban on pre-trained models and a restriction to the training data provided by the organizers. To meet these requirements, we adopted AASIST, a deep neural binary classifier specialized in spoofing detection.

As a first step, we extended the binary classifier into a multi-class model in order to anticipate the emergence of new attacks. Concretely, we propose to train a multi-class classifier on all the attacks and the bona fide class present in the dataset, then to interpret the learned model as a binary classifier at inference time.
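One possible way to read a multi-class output as a binary decision at inference time is sketched below. The abstract does not say how this interpretation is done, so the rule used here (treating the total probability of all attack classes as the spoof score) is an assumption, as are the function names, the position of the bona fide class, and the threshold.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def binary_decision(logits, bona_fide_index=0, threshold=0.5):
    """Collapse multi-class outputs (bona fide + one class per attack)
    into a bona fide / spoof decision."""
    probs = softmax(logits)
    p_bona_fide = probs[bona_fide_index]
    p_spoof = 1.0 - p_bona_fide  # total probability mass of all attack classes
    label = "spoof" if p_spoof >= threshold else "bona fide"
    return label, p_spoof
```

For instance, `binary_decision([3.0, 0.1, 0.2])` labels the utterance as bona fide, because most of the probability mass sits on the bona fide class.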

In our approach, we implemented several strategies specific to the Curriculum Learning paradigm, exploring three key hyperparameters: optimizer, sampler, and accumulation. The results show performance comparable to that of approaches without the Curriculum Learning mechanism. The experiments opened discussions on the value of resetting the optimizer, reordering the training data (sampler), and retaining data during training (accumulation).
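The accumulation option can be pictured with the short sketch below. It is purely illustrative: the function `curriculum_stages` and the representation of each stage as a list of samples are assumptions, not the CLeFS implementation, and the optimizer and sampler strategies are not modelled.

```python
def curriculum_stages(stages, accumulate=True):
    """Yield the training pool used at each curriculum stage.

    stages: list of lists of samples, ordered from the first stage to the last.
    accumulate: if True, each stage also keeps the data of earlier stages;
                if False, each stage trains only on its own data.
    """
    pool = []
    for stage in stages:
        pool = pool + list(stage) if accumulate else list(stage)
        yield list(pool)  # copy, so callers cannot mutate the internal pool
```

With two stages `[["a"], ["b", "c"]]`, accumulation yields `["a"]` and then `["a", "b", "c"]`, whereas without accumulation the second stage sees only `["b", "c"]`.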

This study opens new perspectives, notably taking speaker-specific characteristics into account more finely and adapting training to the types of attack.
