Yixuan Xiao

I am currently a PhD student at the University of Stuttgart, supervised by Prof. Dr. Thang Vu. I started my PhD in April 2024. My research interests lie in speech processing tasks such as audio deepfake detection and speech synthesis. My contact info can be found here. I received my M.Sc. in Computational Linguistics, also from the University of Stuttgart. My Master’s thesis, titled “Mitigating Text Domain Mismatch in ASR Systems through Prompt-based Learning”, was also supervised by Prof. Dr. Thang Vu.

Prior to this, I worked as a senior algorithm engineer on Baidu’s Speech Team (specializing in high-performance computing) and on NetEase Youdao’s AI Team (specializing in ASR and computer-aided pronunciation training). Earlier, I completed a taught Master’s programme in Artificial Intelligence at the University of Edinburgh and a B.Sc. in Computer Science and Technology at Beijing Institute of Technology, supervised by Prof. Dr. Xianling Mao.

Teaching

Courses

I am/have been the (co-)lecturer for the following courses:

Publications

Supervision

Thesis Topics

NOTE: Due to a heavy teaching load in 2026SS (one 4 SWS lab course and two or possibly three Master’s thesis projects), I will not be able to supervise new students. Thank you for your understanding.

I have previously supervised the following topics:

  1. Audio Deepfake Detection: Model training and analysis; requires familiarity with our codebase IMS-ADD.
  2. Codec-based Speech Synthesis: Prompting and fine-tuning TTS or ALM models; audio reconstruction using neural audio codecs.
  3. Speech Analysis: Analyzing speech to better understand speech models. Example tools include librosa, openSMILE, Parselmouth, speechmetrics, and SpeechBrain. Relevant models can sometimes be found on HuggingFace, e.g., speech enhancement models.

HiWi Position

NOTE: Because the supervision workload is expected to be high in 2026SS, I also won’t be able to supervise research projects, so no research-oriented HiWi positions are available.