On academic job market 2026. Research interests: Audio Processing, Machine Learning, Foundation Models for
I am currently a postdoctoral fellow at Emory Nursing Center for Data Science. This website is not up to date. I was a PhD candidate in the Department of Electrical and Computer Engineering at Johns Hopkins University, working on deep-learning-based speech enhancement and automatic speaker recognition. I was fortunate to work with Prof. Najim Dehak and Prof. Jesús Villalba. I am also affiliated with the Center for Language and Speech Processing and the Human Language Technology Center of Excellence. Previously, I interned at Tencent America with Dr. Shi-Xiong Zhang in Dr. Dong Yu's speech group, at INRIA with Dr. Antoine Deleforge in Dr. Remi Gribonval's PANAMA group, and at New York University with Prof. Siddharth Garg. I obtained my bachelor's and master's degrees in Electrical Engineering from the Indian Institute of Technology (IIT) Kanpur, where I worked with Prof. Tanaya Guha and Prof. Rajesh Hegde. I also hold a minor in Artificial Intelligence from the Dept. of Computer Science and Engineering at IIT Kanpur.
During my PhD, I worked on building speech enhancement solutions with a focus on 1) state-of-the-art effectiveness (validated on real, noisy test sets from the NIST Speaker Recognition Evaluation), 2) speaker-identity preservation, and 3) transfer learning. I also worked on domain adaptation and bandwidth extension (super-resolution) to bridge the domain gap between (low-bandwidth) telephone and (high-fidelity) microphone audio. In addition, I collaborated on problems such as spoken language recognition, adversarial signals (attack & defense), and automatic speech recognition. In terms of techniques, we relied on Generative Adversarial Networks (GANs), self-supervised models, and perceptual losses.
I am interested in working more broadly on problems in deep learning (self-supervised learning, multi-modal learning) and human language technology (speech processing, natural language processing).