About Me

I’m a PhD student in Electrical and Computer Engineering at Johns Hopkins University, advised by Prof. Mounya Elhilali, with an expected graduation in 2026. Prior to this, I earned two Bachelor’s degrees and a Master’s from Tsinghua University.

Outside academia, I’m also a music producer and independent artist.

I’m open to collaborations in audio and speech signal processing, as well as music technology. Feel free to connect with me on LinkedIn.

Audio Researcher

Audio/music generation and speech synthesis
General audio understanding and analysis

Music Producer

Hip-hop/pop producer with 10M+ streams across platforms
Creator of music production tutorials on Bilibili

News

2024.10: 🎉 EzAudio Space was on the 🔥 trending board of Hugging Face Spaces
2023.10: 🎉 So excited to give an oral presentation on Diff-Pitcher at WASPAA!

Selected Publications

Audio/Speech/Music Generation

2025 pre-print CapSpeech: Enabling Downstream Applications in Style-Captioned Text-to-Speech | Helin Wang*, Jiarui Hai*, Dading Chong, Karan Thakkar, Tiantian Feng, Dongchao Yang, Junhyeok Lee, Laureano Moro-Velazquez, Jesus Villalba, Zengyi Qin, Shrikanth Narayanan, Mounya Elhilali, Najim Dehak | [paper] [page] [code] [space]
2025 Interspeech EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer | Jiarui Hai, Yong Xu, Hao Zhang, Chenxing Li, Helin Wang, Mounya Elhilali, Dong Yu | [paper] [page] [code] [space]
2025 ICASSP SSR Speech: Towards Stable, Safe, and Robust Zero-shot Text-based Speech Editing and Synthesis | Helin Wang, Meng Yu, Jiarui Hai, Chen Chen, Yuchen Hu, Rilin Chen, Najim Dehak, Dong Yu | [paper] [page] [code]
2024 Interspeech DreamVoice: Text-Guided Voice Conversion | Jiarui Hai*, Karan Thakkar*, Helin Wang, Zengyi Qin, Mounya Elhilali | [paper] [code] [page]
2023 WASPAA Diff-Pitcher: Diffusion-based Singing Voice Pitch Correction | Jiarui Hai, Mounya Elhilali | [paper] [page] [code]

Audio/Speech/Music Separation

2025 pre-print SoloSpeech: Enhancing Intelligibility and Quality in Target Speech Extraction through a Cascaded Generative Pipeline | Helin Wang, Jiarui Hai, Dongchao Yang, Chen Chen, Kai Li, Junyi Peng, Thomas Thebaud, Laureano Moro-Velazquez, Jesus Villalba, Najim Dehak | [paper] [page] [code] [space]
2025 ICASSP SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer | Helin Wang*, Jiarui Hai*, Yen-Ju Lu, Karan Thakkar, Mounya Elhilali, Najim Dehak | [paper] [page] [code]
2024 Interspeech Noise-robust Speech Separation with Fast Generative Correction | Helin Wang, Jesus Villalba, Laureano Moro-Velazquez, Jiarui Hai, Thomas Thebaud, Najim Dehak | [paper]
2024 ICASSP DPM-TSE: A Diffusion Probabilistic Model for Target Sound Extraction | Jiarui Hai*, Helin Wang*, Dongchao Yang, Karan Thakkar, Najim Dehak, Mounya Elhilali | [paper] [page] [code]

Audio/Speech/Music Understanding

2024 ICASSP Investigating Self-Supervised Deep Representations for EEG-based Auditory Attention Decoding | Karan Thakkar, Jiarui Hai, Mounya Elhilali | [paper]
2023 ASRU Boosting Modality Representation with Pre-trained Models and Multi-task Training for Multimodal Sentiment Analysis | Jiarui Hai*, Yu-Jeh Liu*, Mounya Elhilali | [paper]
2022 ICASSP Progressive Teacher-Student Training Framework for Music Tagging | Rui Lu, Baigong Zheng, Jiarui Hai, Fei Tao, Zhiyao Duan, Ji Liu | [paper]

Education

2022.08 - Present, Johns Hopkins University, Baltimore, United States
- Doctor of Philosophy | Electrical and Computer Engineering
- Advisor: Prof. Mounya Elhilali
2020.08 - 2022.06, Tsinghua University, Beijing, China
- Master of Engineering | Civil Engineering
- Big Data Program Member | Big Data Research Center
2016.08 - 2020.06, Tsinghua University, Beijing, China
- Bachelor of Engineering | Civil Engineering
- Bachelor of Science | Business Analytics

Experience

2024.05 - 2024.08, Tencent Americas, Bellevue, USA
- Research Intern | AI Lab
- Mentors: Dr. Yong Xu, Dr. Hao Zhang, and Dr. Dong Yu
2021.06 - 2022.01, Kuaishou, Beijing, China
- Music Technology Intern | AI Platform
- Advisor: Prof. Zhiyao Duan
2021.06 - 2021.09, University of Notre Dame, Notre Dame, United States
- Summer Research | Department of Psychology
- Advisor: Prof. Zhiyong Zhang
2019.06 - 2019.09, University of Hong Kong, Hong Kong, China
- Summer Research | Business School
- Advisor: Prof. Hailiang Chen

Academic Services

Conference reviewer for ICASSP, Interspeech, and SLT
Workshop reviewer for ICLR 2025 Workshop DeLTa and NeurIPS 2024 Workshop Audio Imagination
Journal reviewer for IJCV

Music Activities

2021.08, Hosted a lecture about music production at Modern Sky Studio
2021.05, Worked on the production of a rap song for the TV show HipHop Bank
2021.04, Top New Producer (1%), BeatsHome Hip-hop Production Contest, China

Jiarui Hai | 海家瑞