About Me

I’m a PhD student in Electrical and Computer Engineering at Johns Hopkins University, advised by Prof. Mounya Elhilali, with an expected graduation in 2026. Prior to this, I earned two Bachelor’s degrees and a Master’s from Tsinghua University.

Outside academia, I’m also a music producer and independent artist.

I’m open to collaborations in audio and speech signal processing, as well as music technology. Feel free to connect with me on LinkedIn.

Audio Researcher

  • Audio/music generation and speech synthesis
  • General audio understanding and analysis

Music Producer

  • Hip-hop/pop producer with 10M+ streams across platforms
  • Creator of music production tutorials on Bilibili

News

  • 2024.10: 🎉 EzAudio Space was on the 🔥 trending board of Hugging Face Spaces
  • 2023.10: 🎉 So excited to give an oral presentation on Diff-Pitcher at WASPAA!

Selected Publications

Audio/Speech/Music Generation

  • 2025 pre-print CapSpeech: Enabling Downstream Applications in Style-Captioned Text-to-Speech | Helin Wang*, Jiarui Hai*, Dading Chong, Karan Thakkar, Tiantian Feng, Dongchao Yang, Junhyeok Lee, Laureano Moro-Velazquez, Jesus Villalba, Zengyi Qin, Shrikanth Narayanan, Mounya Elhilali, Najim Dehak | [paper] [page] [code] [space]
  • 2025 Interspeech EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer | Jiarui Hai, Yong Xu, Hao Zhang, Chenxing Li, Helin Wang, Mounya Elhilali, Dong Yu | [paper] [page] [code] [space]
  • 2025 ICASSP SSR Speech: Towards Stable, Safe, and Robust Zero-shot Text-based Speech Editing and Synthesis | Helin Wang, Meng Yu, Jiarui Hai, Chen Chen, Yuchen Hu, Rilin Chen, Najim Dehak, Dong Yu | [paper] [page] [code]
  • 2024 Interspeech DreamVoice: Text-Guided Voice Conversion | Jiarui Hai*, Karan Thakkar*, Helin Wang, Zengyi Qin, Mounya Elhilali | [paper] [page] [code]
  • 2023 WASPAA Diff-Pitcher: Diffusion-based Singing Voice Pitch Correction | Jiarui Hai, Mounya Elhilali | [paper] [page] [code]

Audio/Speech/Music Separation

  • 2025 pre-print SoloSpeech: Enhancing Intelligibility and Quality in Target Speech Extraction through a Cascaded Generative Pipeline | Helin Wang, Jiarui Hai, Dongchao Yang, Chen Chen, Kai Li, Junyi Peng, Thomas Thebaud, Laureano Moro-Velazquez, Jesus Villalba, Najim Dehak | [paper] [page] [code] [space]
  • 2025 ICASSP SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer | Helin Wang*, Jiarui Hai*, Yen-Ju Lu, Karan Thakkar, Mounya Elhilali, Najim Dehak | [paper] [page] [code]
  • 2024 Interspeech Noise-robust Speech Separation with Fast Generative Correction | Helin Wang, Jesus Villalba, Laureano Moro-Velazquez, Jiarui Hai, Thomas Thebaud, Najim Dehak | [paper]
  • 2024 ICASSP DPM-TSE: A Diffusion Probabilistic Model for Target Sound Extraction | Jiarui Hai*, Helin Wang*, Dongchao Yang, Karan Thakkar, Najim Dehak, Mounya Elhilali | [paper] [page] [code]

Audio/Speech/Music Understanding

  • 2024 ICASSP Investigating Self-Supervised Deep Representations for EEG-based Auditory Attention Decoding | Karan Thakkar, Jiarui Hai, Mounya Elhilali | [paper]
  • 2023 ASRU Boosting Modality Representation with Pre-trained Models and Multi-task Training for Multimodal Sentiment Analysis | Jiarui Hai*, Yu-Jeh Liu*, Mounya Elhilali | [paper]
  • 2022 ICASSP Progressive Teacher-Student Training Framework for Music Tagging | Rui Lu, Baigong Zheng, Jiarui Hai, Fei Tao, Zhiyao Duan, Ji Liu | [paper]

Education

  • 2022.08 - Present, Johns Hopkins University, Baltimore, United States
  • 2020.08 - 2022.06, Tsinghua University, Beijing, China
    • Master of Engineering | Civil Engineering
    • Big Data Program Member | Big Data Research Center
  • 2016.08 - 2020.06, Tsinghua University, Beijing, China
    • Bachelor of Engineering | Civil Engineering
    • Bachelor of Science | Business Analytics

Experience

  • 2024.05 - 2024.08, Tencent Americas, Bellevue, USA
  • 2021.06 - 2022.01, Kuaishou, Beijing, China
  • 2021.06 - 2021.09, University of Notre Dame, Notre Dame, United States
  • 2019.06 - 2019.09, University of Hong Kong, Hong Kong, China

Academic Services

  • Conference reviewer for ICASSP, Interspeech, SLT
  • Workshop reviewer for ICLR 2025 Workshop DeLTa, NeurIPS 2024 Workshop Audio Imagination
  • Journal reviewer for IJCV

Music Activities

Copyright © Jiarui Hai.