The fifth edition of the Symposium on Security & Privacy in Speech Communication focuses on speech and voice, the media through which we express ourselves. Because speech can be used to command virtual assistants, to convey emotion, or to identify oneself, the symposium invites participants to address the question: how can we strengthen security and privacy for speech representations in user-centric human/machine interaction? Interdisciplinary exchange is therefore in high demand, and the symposium aims to bring together researchers and practitioners across multiple disciplines, specifically signal processing, cryptography, security, human-computer interaction, law, ethics, and anthropology. It provides a venue for ongoing research stemming from the past successful workshops of the ISCA special interest group on Security & Privacy in Speech Communication (SPSC-SIG), where the views of the technological and humanities communities nurture one another to develop multidisciplinary and interdisciplinary skills.
For the general symposium, we welcome contributions on related topics, as well as progress reports, project dissemination, theoretical discussions, and work in progress. Guests from academia, industry, and public institutions, as well as interested students, are welcome to attend the conference without making a contribution of their own. In addition to the regular submissions, participants of the VoicePrivacy Attacker Challenge are invited to submit extended versions of their challenge contributions.
Although we aim to meet all of you on-site, we also offer the option of virtual presentations during the symposium.
This year, registration is once again handled through the official Interspeech registration system. The fee is €65 for regular members and €45 for student members; it includes participation in the workshop, coffee breaks, and lunch. Virtual participation is free.
If you wish to attend only this workshop (without registering for Interspeech 2025) or want to participate online, please send an email to ingo.siegert@ovgu.de. The event is open to everyone, regardless of their involvement in the VoicePrivacy Challenge or the SPSC Symposium.
Abstract: The proliferation of generative AI has introduced sophisticated synthetic speech capabilities, fundamentally challenging traditional methods of verifying speaker authenticity. Most existing approaches rely on using AI to detect AI-generated content, a reactive strategy that becomes increasingly ineffective as synthetic techniques improve. In this presentation, I propose a proactive alternative: instead of attempting to detect machine-generated speech, we focus on affirmatively proving that speech originates from a live human speaker. This approach emphasizes the development of tamper-resistant speech interfaces capable of verifying that the speech signal itself is produced by the articulatory movements of a present, living human, rather than synthesized by an AI system. I will introduce a novel hardware-software solution that operates entirely at the edge, combining radar and microphone inputs to capture biosignals such as articulatory motion, glottal activity, heartbeat, and respiration. These signals are analyzed in real time to confirm the physical presence and biological origin of speech, producing a continuous “proof-of-personhood” signal that persists throughout the interaction. Unlike current detection-based methods, this system is invariant to advances in generative AI, offering a more robust foundation for trust in speech communication. The presentation will cover technical performance under standard and adversarial conditions, explore ethical and privacy considerations, and examine the implications for human communication in high-stakes domains.
Bio: Visar Berisha is a professor at Arizona State University with a joint appointment in the School of Electrical, Computer and Energy Engineering and the College of Health Solutions. His interdisciplinary research lies at the intersection of engineering and human health, focusing on how neurological conditions affect human behavior—particularly through speech. His team has developed AI-driven tools for detecting clinically relevant behavioral changes, which are now FDA-registered and used internationally by pharmaceutical companies and healthcare providers. His work exemplifies how machine learning and signal processing can advance digital health monitoring at scale.
We are glad to announce the preliminary program. All times are in Central European Summer Time (CEST) (UTC+2).
Time | Event |
---|---|
08:30 - 09:00 | Welcome with coffee and tea |
09:00 - 09:05 | Start of the Symposium |
09:05 - 10:05 | Keynote: Visar Berisha. Affirming the Human Origin of Speech: Biosignal-based Authentication for Trustworthy Communication in the Age of Deepfakes |
10:05 - 10:45 | Oral Paper Session |
| Victor Moreno, João Lima, Flávio Simões, Ricardo Violato, Mário Uliani Neto, Fernando Runstein and Paula Costa. Revealing Cross-Lingual Bias in Synthetic Speech Detection under Controlled Conditions |
| Michele Panariello, Sarina Meyer, Pierre Champion, Xiaoxiao Miao, Massimiliano Todisco, Ngoc Thang Vu and Nicholas Evans. The Risks and Detection of Overestimated Privacy Protection in Voice Anonymisation |
10:45 - 11:15 | Break |
11:15 - 12:15 | Oral Paper Session |
| Tom Bäckström and Fedor Vitiugin. Beyond User-centric: Modelling Privacy and Fairness Effects of Speech Interfaces on Community- and Society-Levels |
| Carlos Franzreb, Arnab Das, Tim Polzehl and Sebastian Möller. Optimizing the Dataset for the Privacy Evaluation of Speaker Anonymizers |
| Xi Xuan, Yang Xiao, Rohan Kumar Das and Tomi Kinnunen. Multilingual Source Tracing of Speech Deepfakes: A First Benchmark |
12:15 - 13:15 | Lunch break |
13:15 - 14:15 | Oral Paper Session |
| Rayane Bakari, Olivier Le Blouch, Nicholas Evans, Nicolas Gengembre, Michele Panariello and Massimiliano Todisco. The influence of non-timbral cues in voice anonymisation and evaluation |
| Jarno Van Arkel, Martha Larson and Emmanuel Vincent. Video games and speech privacy: A case study of Fortnite |
| Bartłomiej Marek, Piotr Kawa and Piotr Syga. Are audio DeepFake detection models polyglots? |
14:15 - 14:30 | Break |
14:30 - 17:00 | Poster Session |
| Seoyoung Park, Thien-Phuc Doan and Souhwan Jung. An Imperceptible Adversarial Watermarking to Prevent Voice Cloning |
| Thomas Thebaud, Nicholas Mehlman, Yaohan Guan, Laureano Moro-Velazquez, Jesus Villalba Lopez, Shrikanth Narayanan and Najim Dehak. PPX-Anon: Prosody, Pitch and X-Vectors for De-Anonymization; our submission to the Voice Attacker Challenge 2024 |
| Ivanina Ivanova, Abhay Dayal Mathur, Nicoline Nymand-Andersen and Nils Holzenberger. Defence Against the Deepfake Arts: Improving Audio Deepfake Detection With Context Awareness |
| Sarina Meyer and Ngoc Thang Vu. Use Cases for Voice Anonymization |
| Jule Pohlhausen and Joerg Bitzer. Revisiting the Privacy of Low-Frequency Speech Signals: Exploring Resampling Methods, Evaluation Scenarios, and Speaker Characteristics |
approx. 18:00 | Evening Gathering at Café Restaurant de V |
| After a short 30-minute walk, we will wind down the evening together. To help us keep track of the participants, we kindly ask you to register using the following link. |
All times are given with respect to the UTC+02:00 zone (Central European Summer Time, CEST).
You can use a time zone converter to check the times in your local time zone.
Paper submission opens
Paper submission deadline
Acceptance Notification
Final (camera ready) paper submission
Symposium
The Symposium is held at the Aula Conference Center at TU Delft.
TU Delft is approximately 30 km from the Rotterdam Ahoy Convention Centre, where Interspeech 2025 takes place. Plan your trip using https://9292.nl/en/. Take a train to Delft Central Station and go up the escalators to the ground floor to exit the station. From there, either walk to the Aula Conference Center (2 km) or follow the signs inside the station to the bus departures, which are located next to the station. The closest bus stop to the venue is called “Christiaan Huygensweg, Delft” (make sure to choose “Christiaan Huygensweg” in Delft and not in another city). On both train and bus, you can pay with a contactless credit or debit card: touch it to the reader at the beginning and at the end of your trip.
Lecture Room C
Aula Conference Center (Aula Congrescentrum)
Mekelweg 5
Delft University of Technology (TU Delft)
2628 CC Delft, Netherlands
On the following map you can see the Symposium Location (1) and the Interspeech Location (2).
Restaurant “Café de V”, Voorstraat 9, Delft, https://www.cafe-de-v.nl
Ingo SIEGERT, Otto von Guericke University Magdeburg, Germany
Sneha DAS, Technical University of Denmark, Denmark
Natalia TOMASHENKO, Inria, France
Martha LARSON, Radboud University, The Netherlands
Mehtab UR RAHMAN, Radboud University, The Netherlands
(alphabetical)
Ajinkya Kulkarni, Idiap Research Institute, Switzerland
Brij Mohan Lal Srivastava, Nijta, France
Candy Olivia Mawalim, Japan Advanced Institute of Science and Technology, Japan
David Boyle, Imperial College London, UK
Emmanuel Vincent, Inria, France
Gerald Penn, University of Toronto, Canada
Hemlata Tak, Pindrop, USA
Ingo Siegert, Otto von Guericke University Magdeburg, Germany
Jennifer Williams, University of Southampton, UK
Junichi Yamagishi, National Institute of Informatics, Japan
Korbinian Riedhammer, Nuremberg Institute of Technology, Germany
Lin Zhang, National Institute of Informatics, Japan
Md Sahidullah, TCG CREST & Academy of Scientific and Innovative Research (AcSIR), India
Natalia Tomashenko, Inria, France
Nick Evans, EURECOM, France
Pierre Champion, Inria, France
Sarina Meyer, University of Stuttgart, Germany
Sebastian Le Maguer, University of Helsinki, Finland
Simon King, University of Edinburgh, UK
Sneha Das, Technical University of Denmark, Denmark
Tim Polzehl, DFKI, Germany
Tom Bäckström, Aalto University, Finland
Xiaoxiao Miao, Singapore Institute of Technology, Singapore
Xin Wang, National Institute of Informatics, Japan
You (Neil) Zhang, University of Rochester, USA
Ziqian Luo, Carnegie Mellon University, USA