
Long Chen
Department of Computer Science and Engineering (CSE)
School of Engineering (SENG)
The Hong Kong University of Science and Technology (HKUST)
Email: longchen AT ust.hk
Office: Room CYT-3003, Cheng Yu Tung Building, HKUST, Clear Water Bay
Dr. Long CHEN (Chinese: 陈隆) is an assistant professor in the Department of CSE at the Hong Kong University of Science and Technology (HKUST), where he leads a computer vision and machine learning research group, the LONG Group. Before joining HKUST, he was a postdoctoral research scientist at the DVMM Lab, Columbia University, and a senior research scientist at Tencent AI Lab. He obtained his Ph.D. degree in Computer Science from the DCD Lab, Zhejiang University. During his Ph.D. studies, he was also a visiting student at the MReal Lab, Nanyang Technological University (NTU), and the NExT Center, National University of Singapore (NUS). He obtained his B.Eng. degree from Dalian University of Technology.
His primary research directions are Computer Vision, Machine Learning, Multimedia, and Artificial Intelligence.
General Research Interests: He aims to build efficient multimodal AI systems that realize "human-like" multimodal understanding and generation. By "human-like", we mean that vision systems should be equipped with three types of abilities: 1) Explainable: the model should rely on the right explicit evidence when making decisions, i.e., be right for the right reasons. 2) Robust: the model should remain robust when only low-quality training data is available (e.g., biased, noisy, or limited training samples). 3) Universal: the model design should be relatively universal, i.e., effective across a variety of tasks. Meanwhile, with the rapid development of foundation models such as large language models (LLMs), vision-language models (VLMs), and visual generation models (e.g., diffusion models), our group is also very interested in several relevant cutting-edge directions: 4) building more explainable, robust, and universal vision models with the help of pretrained models (LLMs, diffusion models); 5) designing more efficient and stronger multimodal LLMs; and 6) investigating the inherent weaknesses of existing LLMs and diffusion models.
Recent Research Directions:
- Efficient Finetuning of Foundation Models: Parameter-efficient Tuning ([IterIS, CVPR’25], [ComPro, IJCV’24]), Memory-efficient Tuning ([UniPT, CVPR’24], [SHERL, ECCV’24]), Modality-efficient Tuning ([PathWeave, NeurIPS’24]), Reinforcement Learning with Human/AI Feedback (RLHF/RLAIF) ([B2-DiffuRL, CVPR’25], [Fast RL, EMNLP’24], [R3HF, arXiv’24]).
- Visual Generation and Editing: Image Generation/Editing ([CLIPDrag, ICLR’25], [Free-Event, arXiv’24]), Video Generation/Editing ([DisPose, ICLR’25], [Ca2-VDM, arXiv’25]), 3D Mesh Generation ([Nautilus, arXiv’25]), 3D Gaussian Editing ([VcEdit, ECCV’24]).
- Open-world/vocabulary Perception: Object Detection ([Survey, TPAMI’24], [CCKT-Det, ICLR’25]), Scene Graph Generation ([NICEST, TPAMI’24], [INOVA, arXiv’25], [RECORD, NeurIPS’23]), Compositional Classification ([PLO, arXiv’23]), Image Classification ([Diff-II, CVPR’25]), Pose Estimation ([Di2Pose, NeurIPS’24]), Situation Recognition ([LEX, ACMMM’24]).
- Multimodal Understanding and Reasoning: Interleaved Generation ([CoMM, CVPR’25]), Multimodal Editing ([DECap, ECCV’24]), Visual Question Answering ([IdealGPT, EMNLP’23 Findings]).
Research Group: LONG Group @ HKUST CSE
1. Given the current funding situation, we have only extremely limited openings for postdocs, research assistants, and visiting students. (Please also highlight if you have other funding sources or support.)
2. As for Ph.D. and M.Phil. positions, we always have openings all year round.
3. To further increase diversity, Ph.D./M.Phil. applicants from overseas and Hong Kong are strongly encouraged to apply.
Recent Teaching:
- 2025 Spring: COMP6411C: Advanced Topics in Multimodal Machine Learning
- 2024 Fall: COMP4901Z: Reinforcement Learning
- 2024 Spring: COMP6411C: Advanced Topics in Multimodal Machine Learning
News
| Date | News |
|---|---|
| Feb 2025 | I will serve as an Associate Editor for IEEE Transactions on Image Processing (TIP). |
| Feb 2025 | I will serve as an Area Chair for NeurIPS 2025 and ACM MM 2025. |
| Feb 2025 | We will organize a tutorial on Multimodal LLMs at CVPR 2025. |
| Dec 2024 | I will serve as an Area Chair for ICML 2025. |
| Dec 2024 | I will give a talk in the AAAI 2025 New Faculty Highlights program. |
| Nov 2024 | I will serve as a Senior PC for IJCAI 2025. |
| Nov 2024 | Our research group had its 4th group outing: hiking the MacLehose Trail (Section 2), again! |
| Sep 2024 | I was ranked among the World’s Top 2% Most-cited Scientists (single-year 2023) by Stanford University. |
| Sep 2024 | I will serve as an Area Chair for CVPR 2025. |
| Aug 2024 | I will serve as an Area Chair for ICLR 2025. |
| Jul 2024 | Two students received HKUST RedBird PhD Awards. Congrats to Chaolei and Jiazhen! |
| Jun 2024 | I will serve as a Senior PC for AAAI 2025. |
Recent Publications
- arXiv preprint (arXiv). [arXiv]
- arXiv preprint (arXiv). [arXiv]
- [New!!] Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing. arXiv preprint (arXiv). [arXiv] [Codes]
- arXiv preprint (arXiv). [arXiv] [Codes]
- arXiv preprint (arXiv). [arXiv]
- Computer Vision and Pattern Recognition (CVPR), 2025. [Codes]
- Computer Vision and Pattern Recognition (CVPR), 2025. [Codes]
- [New!!] Inversion Circle Interpolation: Diffusion-based Image Augmentation for Data-scarce Classification. Computer Vision and Pattern Recognition (CVPR), 2025. [Codes]
- International Conference on Learning Representations (ICLR), 2025. [Codes]
- International Conference on Learning Representations (ICLR), 2025. [Codes]
- International Conference on Learning Representations (ICLR), 2025. [Codes]
- Neural Information Processing Systems (NeurIPS), 2024.
- Neural Information Processing Systems (NeurIPS), 2024. [Codes]
- Empirical Methods in Natural Language Processing (EMNLP), 2024.
- European Conference on Computer Vision (ECCV), 2024. [Website]
- European Conference on Computer Vision (ECCV), 2024.
- Computer Vision and Pattern Recognition (CVPR), 2024. [Codes]
- International Conference on Learning Representations (ICLR), 2024. [Codes]
- IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), 2024. [Codes]
- IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), 2024. [Codes] (extension of CVPR’22 work)
- IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), 2024. [Codes] (extension of ICLR’22 work)