
Figure: example of sound-field reconstruction in the bright zone with our neural model for personal sound zones (sound field at 500 Hz, from Hrant Arzumanyan's M.Sc. thesis).
Context and objectives
Personal Sound Zones (PSZs) make it possible to control the acoustic signal delivered to a given spatial area [1]. They have applications in many contexts, such as delivering personalized audio content inside vehicles. Generating PSZs requires a microphone array and a loudspeaker array to control the acoustic field. Historically, PSZ methods have relied on constrained optimization: the goal is to reproduce the target signal as accurately as possible in the target zone (the bright zone) while minimizing the acoustic level in a control zone (the dark zone). These classical approaches work well in fixed setups but lack robustness in adaptive scenarios, i.e., when the bright zone no longer matches the position assumed in the initial setup. This typically occurs in cars when the driver adjusts the seat position.
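As a rough illustration of the classical formulation described above, the sketch below solves a regularized pressure-matching problem at a single frequency with NumPy. It is a penalized least-squares relaxation, not a strictly constrained problem, and it is not the setup used in the project: the free-field point-source model, the 2-D geometry, the eight-loudspeaker line array, the plane-wave target and the regularization weight are all assumptions chosen for the example.

```python
import numpy as np

C_SOUND = 343.0  # speed of sound in air (m/s)
FREQ = 500.0     # single frequency considered in this example (Hz)
REG = 1e-6       # Tikhonov regularization weight (assumed value)


def free_field_transfer(sources, points, freq=FREQ, c=C_SOUND):
    """Free-field point-source transfer matrix, shape (n_points, n_sources)."""
    k = 2.0 * np.pi * freq / c  # wavenumber
    # Pairwise distances between control points and loudspeakers
    dist = np.linalg.norm(points[:, None, :] - sources[None, :, :], axis=-1)
    return np.exp(-1j * k * dist) / (4.0 * np.pi * dist)


def zone_grid(center, half_width=0.1, n=3):
    """Square n-by-n grid of control points around a 2-D center."""
    offsets = np.linspace(-half_width, half_width, n)
    gx, gy = np.meshgrid(offsets, offsets)
    return np.stack([gx.ravel() + center[0], gy.ravel() + center[1]], axis=1)


def pressure_matching(G_b, G_d, p_t, reg=REG):
    """Minimize ||G_b q - p_t||^2 + ||G_d q||^2 + reg * ||q||^2 over q."""
    n_src = G_b.shape[1]
    A = G_b.conj().T @ G_b + G_d.conj().T @ G_d + reg * np.eye(n_src)
    return np.linalg.solve(A, G_b.conj().T @ p_t)


def acoustic_contrast_db(G_b, G_d, q):
    """Ratio of mean squared pressure in the bright and dark zones, in dB."""
    e_bright = np.mean(np.abs(G_b @ q) ** 2)
    e_dark = np.mean(np.abs(G_d @ q) ** 2)
    return 10.0 * np.log10(e_bright / e_dark)


# Hypothetical geometry: 8 loudspeakers on a line, two zones 1.5 m away
speakers = np.stack([np.linspace(-0.7, 0.7, 8), np.zeros(8)], axis=1)
bright = zone_grid((-0.5, 1.5))
dark = zone_grid((0.5, 1.5))
G_b = free_field_transfer(speakers, bright)
G_d = free_field_transfer(speakers, dark)

# Target: unit-amplitude plane wave travelling along +y in the bright zone
k = 2.0 * np.pi * FREQ / C_SOUND
p_t = np.exp(-1j * k * bright[:, 1])

# Loudspeaker weights optimized for the nominal bright-zone position
q = pressure_matching(G_b, G_d, p_t)
print(f"contrast, nominal zone: {acoustic_contrast_db(G_b, G_d, q):.1f} dB")

# Same weights evaluated after the bright zone has moved by 0.2 m
G_b_moved = free_field_transfer(speakers, zone_grid((-0.3, 1.5)))
print(f"contrast, moved zone:   {acoustic_contrast_db(G_b_moved, G_d, q):.1f} dB")
```

Comparing the two printed values gives a quick way to see how filters computed for one position behave once the listener moves. This is the situation the adaptive methods in this internship target.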
This internship will focus on using deep neural networks (DNNs) to learn the mapping from a target pressure in the bright zone to the filters applied at each loudspeaker. Building on the literature on DNNs for PSZs [2, 3, 4] and on a previous internship, the intern will work on data generation, model design and training, and the evaluation of a neural network for adaptive sound zones. The internship will be organized as follows:
- Literature review on the use of DNNs for personal sound zones and familiarization with the classical methods
- Getting started with the existing code (Python, PyTorch)
- Improving the data generation procedure to increase the dataset size and enable moving-zone evaluation
- Designing a new DNN approach based on a conditional variational autoencoder (VAE) to adapt the model to the moving bright zone scenario
- Optionally, exploring other architectures and approaches
- Evaluating these models on simulated and real data in the context of moving zones
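To give an idea of what a conditional VAE could look like in this setting, here is a minimal PyTorch sketch. It is only an illustration under stated assumptions, not the model to be developed during the internship: the class name `ConditionalVAE`, the layer sizes, the choice of condition vector (target pressure plus a 2-D zone offset) and the random placeholder data are all hypothetical.

```python
import torch
from torch import nn


class ConditionalVAE(nn.Module):
    """Toy conditional VAE over loudspeaker filters, given a condition vector."""

    def __init__(self, filter_dim, cond_dim, latent_dim=8, hidden_dim=64):
        super().__init__()
        self.latent_dim = latent_dim
        # Encoder sees the filters and the condition, outputs mean and log-variance
        self.encoder = nn.Sequential(
            nn.Linear(filter_dim + cond_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 2 * latent_dim),
        )
        # Decoder reconstructs the filters from a latent code and the condition
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + cond_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, filter_dim),
        )

    def forward(self, filters, cond):
        stats = self.encoder(torch.cat([filters, cond], dim=-1))
        mu, logvar = stats.chunk(2, dim=-1)
        # Reparameterization trick: z = mu + sigma * eps
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        recon = self.decoder(torch.cat([z, cond], dim=-1))
        return recon, mu, logvar

    @torch.no_grad()
    def sample(self, cond):
        """Generate filters for a new condition by sampling the prior."""
        z = torch.randn(cond.shape[0], self.latent_dim, device=cond.device)
        return self.decoder(torch.cat([z, cond], dim=-1))


def cvae_loss(recon, filters, mu, logvar, beta=1.0):
    """Reconstruction error plus KL divergence to a standard normal prior."""
    recon_loss = nn.functional.mse_loss(recon, filters)
    kl = -0.5 * torch.mean(
        torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
    )
    return recon_loss + beta * kl


# Assumed sizes: 8 complex loudspeaker weights (real + imaginary parts) and a
# condition made of 9 complex target pressures plus a 2-D bright-zone offset.
torch.manual_seed(0)
model = ConditionalVAE(filter_dim=16, cond_dim=20)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

filters = torch.randn(4, 16)  # placeholder data, not real filters
cond = torch.randn(4, 20)     # placeholder conditions

# One training step
recon, mu, logvar = model(filters, cond)
loss = cvae_loss(recon, filters, mu, logvar)
optimizer.zero_grad()
loss.backward()
optimizer.step()

new_filters = model.sample(cond)  # filters proposed for the given conditions
```

In this sketch the zone position is part of the condition, so filters for a new position could be generated at inference time by changing the condition without retraining.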
Laboratories and team
The intern will be supervised by members of two laboratories at Le Mans Université: the LAUM and the LIUM. The former is recognized worldwide for its work in acoustics, with expertise in numerous topics including sound field control and reconstruction. The latter is historically recognized for its work in speech processing and is now broadening its research to audio processing in general.
Théo Mariotte is an Assistant Professor at LIUM. He works on audio and speech processing, with a focus on interpretability and multi-microphone processing methods (contact: theo.mariotte@univ-lemans.fr).
Manuel Melon is a Full Professor at LAUM. He works on electroacoustics and sound field control, and has supervised several PhD theses on the topic of sound zones.
Marie Tahon is a Full Professor at LIUM whose research focuses on expressive speech analysis and synthesis, with an interest in interpretable neural approaches.
About Le Mans
Coming to Le Mans might sound scary! But trust us, living here can be a wonderful experience. The university is small, which makes it easy to meet people from diverse scientific communities. The active PhD student association (open to everyone) helps too. Living in Le Mans is inexpensive, and the city has a rich cultural offering (a national theater, for example). Finally, if you want to get away for the weekend, many places can be reached by direct train (Paris (1h), Lyon (3h), Marseille (5h), Strasbourg (4h)…).
Bibliography
[1] T. Betlehem, W. Zhang, M. A. Poletti, and T. D. Abhayapala, "Personal Sound Zones: Delivering interface-free audio to multiple listeners," IEEE Signal Process. Mag., vol. 32, no. 2, pp. 81-91, Mar. 2015, doi: 10.1109/MSP.2014.2360707.
[2] Y. Qiao and E. Choueiri, "SANN-PSZ: Spatially Adaptive Neural Network for Head-Tracked Personal Sound Zones," IEEE Transactions on Audio, Speech and Language Processing, 2025.
[3] S. Zhao, Q. Zhu, E. Cheng, and I. S. Burnett, "A room impulse response database for multizone sound field reproduction (L)," The Journal of the Acoustical Society of America, vol. 152, no. 4, pp. 2505-2512, Oct. 2022, doi: 10.1121/10.0014958.
[4] G. Pepe, L. Gabrielli, S. Squartini, L. Cattani, and C. Tripodi, "Deep Learning for Individual Listening Zone," in 2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP), Tampere, Finland: IEEE
