[StageM2] – Quantifying Dataset Influence in Generative Models via Subspace Analysis
Context and Motivation
Generative AI models such as GANs and diffusion models are trained on massive, often opaque image datasets [1, 2]. These datasets may include personal photos, artwork, or copyrighted media, raising important concerns about attribution, copyright, and content ownership. Existing auditing techniques, such as membership…
