I am an undergraduate at the School of Mechanical Engineering, Shanghai Jiao Tong University. My advisor is Prof. Chen Xie.

🎓 Education

2022.09 - present, Shanghai Jiao Tong University, School of Mechanical Engineering, Shanghai, B.S.
2025.09 - 2026.01, National Taiwan University, Department of Computer Science, Taiwan, Intern

📝 Publications & Patents

Hao-Hui Xie, Ho-Lam Chung, Yi-Cheng Lin, Ke-Han Lu, Wenze Ren, Xie Chen, and Hung-yi Lee. “TW-Sound580K: A Regional Audio-Text Dataset with Verification-Guided Curation for Localized Audio-Language Modeling.” Under Review, Interspeech 2026, arXiv:2603.05094. [Paper]
Tengjie Zhu, Guanyu Cai, Yang Zhaohui, Guanzhu Ren, Hao-Hui Xie, ZiRui Wang, Junsong Wu, Jingbo Wang, Xiaokang Yang, Yao Mu, and Yichao Yan. “CLOT: Closed-Loop Global Motion Tracking for Whole-Body Humanoid Teleoperation.” Under Review, RSS 2026, arXiv:2602.15060. [Paper]

Attended the 17th Asian Conference on Machine Learning (ACML 2025) in Taipei, Taiwan.

2025.04 - present, ScaleLab at Shanghai Jiao Tong University, Advisor: Yao Mark Mu
2025.07 - 2025.08, Zhejiang Lingqiao Intelligent Technology Co., Ltd., AI & Embedded Systems Intern
2025.09 - 2026.01, Speech Processing Lab, National Taiwan University, Advisor: Hung-yi Lee
- Fine-tuned large-scale Taiwanese speech–language models using the DeSTA2.5 framework, integrating Llama3-8B as the text backbone and Whisper-v3 as the acoustic encoder. Implemented LoRA-based alignment and timestamp-aware fine-tuning to enhance accent adaptation and improve ASR robustness across diverse regional dialects.
- Designed and optimized a full multi-GPU training and evaluation pipeline, including data preprocessing, feature extraction, LoRA parameter-efficient training, and distributed evaluation across speech benchmarks such as the TAU-TW Benchmark. Achieved a relative improvement of 8–10% in recognition accuracy over baseline models.
- Developed and experimented with speech–language alignment strategies that combine contrastive loss and multi-level feature fusion, enabling the model to perform cross-lingual reasoning and contextually grounded transcription under complex acoustic variations (e.g., Taiwanese Mandarin, Hokkien, and mixed-accent speech).
2026.01 - present, X-LANCE Lab, Shanghai Jiao Tong University, Advisor: Chen Xie

openclaw-sjtu: An AI campus assistant for SJTU students, built on the OpenClaw skill framework. It covers 21+ features, including homework tracking, course reviews, Shuiyuan community summarization, and a PPT generator with SJTU templates.