2024
DSA Seminar
TITLE
Spatial Audio, Spatial Audio-Visual and Visual Learning
TIME
Oct. 11, 2024
10:00 AM - 11:00 AM (Beijing Time)
VENUE
E3-202, Guangzhou Campus
ZOOM ID
Zoom Meeting ID: 962 2017 7186
Passcode: dsat
ABSTRACT
Humans rely extensively on both audio and visual information to perceive the physical world. Although audio and visual signals almost always co-exist, existing research focuses predominantly on the visual signal, and research on its acoustic counterpart has lagged far behind. One important reason for this trend is that acoustic signals can easily be converted into 2D images with transforms such as the short-time Fourier transform. In this talk, I will first address the research question: is treating spatial acoustic signals as 2D images optimal? I will explore how to design novel neural networks that learn directly from raw audio waveforms (rather than 2D images) or that continuously model spatial acoustic effects (ICML-21, AISTATS-23, ICML-24). Furthermore, I will show an audio-visual multimodal learning framework in which audio and vision are only weakly correlated, reflecting real-world scenarios such as gas leaks (WACV-24). I will also present a visual topological learning framework for embodied AI (RSS-23). Finally, I will conclude by discussing several potential research directions.
SPEAKER BIO
Yuhang HE
Computer Science, University of Oxford
Yuhang He is a final-year Ph.D. student in Computer Science at the University of Oxford. Prior to his Ph.D., he spent several years in industrial research at companies such as Baidu. During his Ph.D. studies, he completed two internships, one at Mitsubishi Electric Research Lab (MERL) and the other at Microsoft in Munich, Germany. He has published in top-tier conferences including ICML, AISTATS, RSS, and WACV. His current research interest lies in audio-vision-X multimodal spatial intelligence learning, with the ultimate goal of achieving (or even surpassing) human-level spatial intelligence. His research combines practical applications with theoretical analysis. In his spare time, he enjoys running marathons and practicing street photography.
Follow us for more updates
DSA Thrust