I am an undergraduate student in the School of Mechanical Engineering at Shanghai Jiao Tong University. My advisor is Prof. Chen Xie.
🎓 Education
- 2022.09 - present, Shanghai Jiao Tong University, School of Mechanical Engineering, Shanghai, B.S.
- 2025.09 - 2026.01, National Taiwan University, Department of Computer Science, Taiwan, Intern
📝 Publications & Patents
- Hao-Hui Xie, Ho-Lam Chung, Yi-Cheng Lin, Ke-Han Lu, Wenze Ren, Xie Chen, and Hung-yi Lee. “TW-Sound580K: A Regional Audio-Text Dataset with Verification-Guided Curation for Localized Audio-Language Modeling.” Under Review, Interspeech 2026, arXiv:2603.05094. [Paper]
- Tengjie Zhu, Guanyu Cai, Yang Zhaohui, Guanzhu Ren, Hao-Hui Xie, ZiRui Wang, Junsong Wu, Jingbo Wang, Xiaokang Yang, Yao Mu, and Yichao Yan. “CLOT: Closed-Loop Global Motion Tracking for Whole-Body Humanoid Teleoperation.” Under Review, RSS 2026, arXiv:2602.15060. [Paper]
🏅 Honors & Awards
- 2023.02 Mathematical Contest in Modeling (MCM), Meritorious Winner (M Award)
- 2023.12 SJTU Academic Excellence Scholarship (Category C)
- 2024.12 SJTU Academic Excellence Scholarship (Category B)
- 2024.05 SJTU Outstanding Student Leader
📌 Academic Conferences
- Attended the 17th Asian Conference on Machine Learning (ACML 2025) in Taipei, Taiwan.
💻 Research Experience
- 2025.04 - present, ScaleLab at Shanghai Jiao Tong University, Advisor: Yao Mark Mu
- 2025.07 - 2025.08, Zhejiang Lingqiao Intelligent Technology Co., Ltd., AI & Embedded Systems Intern
- 2025.09 - 2026.01, Speech Processing Lab, National Taiwan University, Advisor: Hung-yi Lee
  - Fine-tuned large-scale Taiwanese speech–language models using the DeSTA2.5 framework, integrating Llama3-8B as the text backbone and Whisper-v3 as the acoustic encoder. Implemented LoRA-based alignment and timestamp-aware fine-tuning to enhance accent adaptation and improve ASR robustness across diverse regional dialects.
  - Designed and optimized a full multi-GPU training and evaluation pipeline, including data preprocessing, feature extraction, parameter-efficient LoRA training, and distributed evaluation across speech benchmarks such as the TAU-TW Benchmark. Achieved a relative improvement of 8–10% in recognition accuracy over baseline models.
  - Developed and experimented with speech–language alignment strategies that combine a contrastive loss with multi-level feature fusion, enabling the model to perform cross-lingual reasoning and contextually grounded transcription under complex acoustic variations (e.g., Taiwanese Mandarin, Hokkien, and mixed-accent speech).
- 2026.01 - present, X-LANCE Lab, Shanghai Jiao Tong University, Advisor: Chen Xie
🚀 Projects
- openclaw-sjtu — An AI campus assistant for SJTU students, built on the OpenClaw skill framework. Covers 21+ features including homework tracking, course reviews, Shuiyuan community summarization, and a PPT generator with SJTU templates.