Talk | Multimodal Enhanced 3D Perception and Its Applications

Digest   2024-09-05 17:01   Guangdong


Host: Prof. Hui Huang





As a fundamental 3D representation, point clouds are widely used in tasks such as autonomous driving, embodied AI, and biomolecular structure prediction and design. Although 3D perception has made substantial progress, multimodal enhanced 3D perception is urgently needed, especially with the rise of LLMs and VLMs. Starting from point cloud acquisition, this talk first presents a point cloud down-sampling and recovery algorithm based on reversible networks, which greatly improves storage and communication efficiency. With point cloud data efficiently obtained, we then study classic tasks such as point cloud shape classification, 3D detection and tracking, and large-scene 3D semantic segmentation, occupancy prediction, and reconstruction, moving from single-modality to multimodal settings.

Our algorithms have achieved excellent results in many public international competitions, e.g., the innovation award for occupancy prediction in the CVPR24 Autonomous Grand Challenge, first place in SemanticKITTI semantic segmentation, and first place in CVPR 2023 HOI4D segmentation. Finally, we extend these 3D perception technologies to downstream applications such as talking face generation and protein and small-molecule/RNA binding prediction.


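Dr. Zhen Li (李镇) is currently an Assistant Professor at the School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), Assistant Dean of the Future Network of Intelligence Institute, and a Presidential Young Scholar. He received his Ph.D. in Computer Science from The University of Hong Kong (2014-2018) and was a visiting scholar at the University of Chicago in 2018. His honors include the 2023 Wu Wenjun Artificial Intelligence Outstanding Youth award, selection for the 7th Young Elite Scientists Sponsorship Program of the China Association for Science and Technology in 2021, first place in the CVPR 2023 HOI4D challenge, first place in the 2022 SemanticKITTI semantic segmentation competition, Best Paper Finalist at IROS 2023, second place in the ICCV 2021 Urban3D challenge, and first place worldwide in CASP12 contact map prediction. He has also received research funding from national, provincial, and municipal programs as well as from industry.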

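Dr. Li leads the Deep Bit Lab at CUHK-Shenzhen (https://mypage.cuhk.edu.cn/academics/lizhen/). The lab's main research directions are 3D vision understanding and its applications (including but not limited to point cloud analysis and multimodal joint analysis) and fundamental theory and algorithms such as deep learning. It also works to bring 2D/3D AI algorithms to interdisciplinary fields, autonomous driving, industrial vision, and similar settings. He has published more than 60 papers in well-known international journals and conferences in this area, including top journals such as Cell Systems, Nature Communications, T-PAMI, TMI, TVCG, and TNNLS, and top conferences such as CVPR, ICCV, ECCV, NeurIPS, ICLR, IROS, ACM MM, AAAI, IJCAI, and MICCAI. Dr. Li serves as an associate editor for IEEE Transactions on Mobile Computing and IROS, and as a reviewer for numerous top journals and conferences. He is also a member of the Brain Science and Brain-Inspired Intelligence Committee of the Guangdong Academicians Federation, and a committee member of academic organizations including VALSE, MICS, the Machine Vision Technical Committee of the China Society of Image and Graphics, and the 3DV Technical Committee.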

Visual Computing Research Center



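The Visual Computing Research Center at Shenzhen University is dedicated to raising the level of scientific research and higher education in visual computing. Building on computer graphics, computer vision, human-computer interaction, machine learning, robotics, visualization, and visual analytics, it promotes deep cross-disciplinary integration and innovation. For details, see the official website: vcc.tech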