👋 About Me

I am currently a PhD student at Tsinghua University, devoted to advancing AGI by enhancing the understanding capabilities of multimodal large language models. My main focus is on reinforced reasoning, including latent reasoning and spatial reasoning. Prior to this, I conducted research at the Massachusetts Institute of Technology (MIT) and earned my B.S. in Electronic Engineering from Sun Yat-sen University.

From 2026, I worked as a Research Intern (Qingyun Top Talent Program) at Hunyuan Group, Tencent, Inc., and subsequently joined YouTu Lab in the same role (Qingyun Top Talent Program). At the end of 2025, I was at LongCat Foundation Group (Beidou Top Talent Program), Meituan, Inc. Before that, I worked as a Research Intern (Jindouyun Top Talent Program) at ByteDance. From 2024 to 2025, I worked as a Research Intern in the AI/ML Group of Microsoft Research Asia (MSRA). I also interned at QQ Foundation Group, Tencent, Inc., in 2024.

From 2023 to 2025, I served as CTO and Co-founder of Beijing OneXOne Tech Co., Ltd., an AI education startup that completed Series A funding.

I am currently seeking collaboration and job opportunities. Feel free to reach out at czq23@mails.tsinghua.edu.cn.

🔥 News

  • 2026.05:  🎉 Four papers (OmniVideo-R1, GTASR, Reasoning-VLA, L²-VMAS) have been accepted by ICML 2026.
  • 2026.04:  🎉 Released the technical report Script-a-Video with the Tencent Hunyuan Team.
  • 2026.02:  🎉 Three papers (3DThinker, MACT (Highlight), VisMem) have been accepted by CVPR 2026.
  • 2026.01:  🎉 One paper (ViF) has been accepted by ICLR 2026.
  • 2025.11:  🎉 Two papers (SIFThinker, ChildBench (Oral)) have been accepted by AAAI 2026.
  • 2025.06:  🎉 One paper (VisRL) has been accepted by ICCV 2025.
  • 2025.04:  🎉 One paper has been accepted by CVPRW 2025 (Highlight).
  • 2025.03:  🎉 One paper (DV-Matcher) has been accepted by CVPR 2025.
  • 2024.04:  🎉 One paper (Three-Phases-LoRA) has been accepted by ICANN 2024.

📖 Education

Sep. 2026 – Dec. 2027: Ph.D., Data Science and Information Technology, Tsinghua University (THU), Beijing, China.
Dec. 2024 – Jul. 2025: Research Intern, Multisensory Intelligence Group, Massachusetts Institute of Technology (MIT), Remote.
Sep. 2023 – Jun. 2026: M.Sc. (transferred to the Ph.D. program), Data Science and Information Technology, Tsinghua University (THU), Beijing, China.
Sep. 2019 – Jun. 2023: B.Sc., Electronic Information Science and Technology, Sun Yat-sen University, Guangzhou, China. (Rank 2/123)

📑 Technical Report

Report
Script-a-Video
Script-a-Video: Deep Structured Audio-visual Captions via Factorized Streams and Relational Grounding
Tencent Hunyuan Team · Zhangquan Chen (Core Contributor)
Technical report, 2026

📝 Selected Publications

ICCV
VisRL
VisRL: Intention-Driven Visual Perception via Reinforced Reasoning
Zhangquan Chen, Xufang Luo, Dongsheng Li
International Conference on Computer Vision (ICCV) & CVPRW Highlight, 2025

CVPR
DV-Matcher
DV-Matcher: Deformation-based Non-Rigid Point Cloud Matching Guided by Pre-trained Visual Features
Zhangquan Chen, Puhua Jiang, Ruqi Huang
Computer Vision and Pattern Recognition (CVPR), 2025

AAAI
SIFThinker
SIFThinker: Spatially-Aware Image Focus for Visual Reasoning
Zhangquan Chen, Ruihui Zhao, Chuwei Luo, Mingze Sun, Xinlei Yu, Yangyang Kang, Ruqi Huang
AAAI Conference on Artificial Intelligence (AAAI), 2026

CVPR
3DThinker
Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views
Zhangquan Chen, Manyuan Zhang, Xinlei Yu, Xufang Luo, Mingze Sun, Zihao Pan, Yan Feng, Peng Pei, Xunliang Cai, Ruqi Huang
Computer Vision and Pattern Recognition (CVPR), 2026

ICML
OmniVideo-R1
OmniVideo-R1: Reinforcing Audio-visual Reasoning with Query Intention and Modality Attention
Zhangquan Chen, Jiale Tao, Ruihuang Li, Yihao Hu, Ruitao Chen, Zhantao Yang, Xinlei Yu, Haodong Jing, Manyuan Zhang, Shuai Shao, Biao Wang, Qinglin Lu, Ruqi Huang
International Conference on Machine Learning (ICML), 2026

arXiv
4DThinker
4DThinker: Thinking with 4D Imagery for Dynamic Spatial Understanding
Zhangquan Chen, Manyuan Zhang, Xinlei Yu, Xiang An, Bo Li, Xin Xie, ZiDong Wang, Mingze Sun, Shuang Chen, Hongyu Li, Xiaobin Hu, Ruqi Huang
arXiv 2026

arXiv
NFR
NFR: Neural Feature-Guided Non-Rigid Shape Registration
Zhangquan Chen, Puhua Jiang, Mingze Sun, Ruqi Huang
arXiv 2025

ICANN
Three-Phases-LoRA
A Three-Phases-LORA Finetuned Hybrid LLM Integrated with Strong Prior Module in the Education Context
Zhangquan Chen, Chunjiang Liu, Haobin Duan
International Conference on Artificial Neural Networks (ICANN), 2024

arXiv
Meow-Omni 1
Meow-Omni 1: A Multimodal Large Language Model for Feline Ethology
Jucheng Hu†, Zhangquan Chen†, Yulin Chen, Chengjie Hong, Liang Zhou, Tairan Wang, Sifei Li, Giulio Zhu, Feng Zhou, Yiheng Zeng, Suorong Yang, Dongzhan Zhou
(† equal contribution) arXiv 2026

arXiv
A-GRAE
Unveiling Implicit Advantage Symmetry: Why GRPO Struggles with Exploration and Difficulty Adaptation
Zhiqi Yu†, Zhangquan Chen†, Mengting Liu, Heye Zhang, Liangqiong Qu
(† equal contribution) arXiv 2026

arXiv
The Latent Space survey
The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook
Xinlei Yu†, Zhangquan Chen†, Yongbo He†, Tianyu Fu†, Cheng Yang†, Chengming Xu†, Yue Ma†, Xiaobin Hu†, Zhe Cao, Jie Xu, Guibin Zhang, Jiale Tao, Jiayi Zhang, Siyuan Ma, Kaituo Feng, Haojie Huang, Youxing Li, Ronghao Chen, Huacan Wang, Chenglin Wu, Zikun Su, Xiaogang Xu, Kelu Yao, Kun Wang, Chen Gao, Yue Liao, Ruqi Huang, Tao Jin, Zhucun Xue, Cheng Tan, Jiangning Zhang, Wenqi Ren, Yanwei Fu, Yong Liu, Yu Wang, Xiangyu Yue, Yu-Gang Jiang, Shuicheng Yan
(† equal contribution) arXiv 2026

AAAI
ChildBench
Easy for Children, Hard for AI: The Limits of Multimodal LLMs in Early Childhood Learning
Jingping Liu, Xueyan Wu, Hanxuan Chen, Ziyan Liu, Zhangquan Chen, Ronghao Chen, Huacan Wang
AAAI Conference on Artificial Intelligence (AAAI Oral), 2026

ICML
GTASR
Joint Geometric and Trajectory Consistency Learning for One-Step Real-World Super-Resolution
Chengyan Deng, Zhangquan Chen, Li Yu, Kai Zhang, Xue Zhou, Wang Zhang
International Conference on Machine Learning (ICML), 2026

ICLR
ViF
Visual Multi-Agent System: Mitigating Hallucination Snowballing via Visual Flow
Xinlei Yu, Chengming Xu, Guibin Zhang, Yongbo He, Zhangquan Chen, Zhucun Xue, Jiangning Zhang, Yue Liao, Xiaobin Hu, Yu-Gang Jiang, Shuicheng Yan
International Conference on Learning Representations (ICLR), 2026

arXiv
CALMARS
Adversarial Robustness for Unified Multi-Modal Encoders via Efficient Calibration
Chih-Ting Liao, Zhangquan Chen, Chunlei Meng, Tzu-Yu Huang, Xin Cao, Xu Zheng
arXiv 2025

CVPR
MACT
Visual Document Understanding and Reasoning: A Multi-Agent Collaboration Framework with Agent-Wise Adaptive Test-Time Scaling
Xinlei Yu, Chengming Xu, Zhangquan Chen, Yudong Zhang, Shilin Lu, Cheng Yang, Jiangning Zhang, Shuicheng Yan, Xiaobin Hu
Computer Vision and Pattern Recognition (CVPR Highlight), 2026

ICML
Reasoning-VLA
Reasoning-VLA: A Fast and General Vision-Language-Action Reasoning Model for Autonomous Driving
Dapeng Zhang, Zhenlong Yuan, Zhangquan Chen, Chih-Ting Liao, Yinda Chen, Fei Shen, Qingguo Zhou, Tat-Seng Chua
International Conference on Machine Learning (ICML), 2026

ICML
L²-VMAS
Dual Latent Memory for Visual Multi-agent System
Xinlei Yu, Chengming Xu, Zhangquan Chen, Bo Yin, Cheng Yang, Yongbo He, Yihao Hu, Jiangning Zhang, Cheng Tan, Xiaobin Hu, Shuicheng Yan
International Conference on Machine Learning (ICML), 2026

arXiv
SpaMEM
SpaMEM: Benchmarking Dynamic Spatial Reasoning via Perception-Memory Integration in Embodied Environments
Chih-Ting Liao, Xi Xiao, Chunlei Meng, Zhangquan Chen, Yitong Qiao, Weilin Zhou, Tianyang Wang, Xu Zheng, Xin Cao
arXiv 2026

CVPR
VisMem
VisMem: Latent Vision Memory Unlocks Potential of Vision-Language Models
Xinlei Yu, Chengming Xu, Guibin Zhang, Zhangquan Chen, Yudong Zhang, Yongbo He, Peng-Tao Jiang, Jiangning Zhang, Xiaobin Hu, Shuicheng Yan
Computer Vision and Pattern Recognition (CVPR), 2026

arXiv
OmniZoo
Topology-Agnostic Animal Motion Generation from Text Prompt
Keyi Chen, Mingze Sun, Zhenyu Liu, Zhangquan Chen, Ruqi Huang
arXiv 2025

arXiv
EvoFSM
EvoFSM: Controllable Self-Evolution for Deep Research with Finite State Machines
Shuo Zhang, Chaofa Yuan, Ryan Guo, Xiaomin Yu, Rui Xu, Zhangquan Chen, Zinuo Li, Zhi Yang, Shuhao Guan, Zhenheng Tang, Sen Hu, Liwen Zhang, Ronghao Chen, Huacan Wang
arXiv 2026

💼 Internships

  • May 2026 – Present: Tencent · Research Intern (Qingyun Top Talent Program) · YouTu Lab · Shennong Research Center
  • Dec 2025 – Apr 2026: Tencent · Research Intern (Qingyun Top Talent Program) · Hunyuan Multimodal LLM Group
  • Aug 2025 – Dec 2025: Meituan · Research Intern (Beidou Top Talent Program) · M17 LongCat Foundation Model Group
  • Apr 2025 – Aug 2025: ByteDance · Research Intern (Jindouyun Top Talent Program) · Multimodal LLM Content Understanding Group
  • Dec 2024 – Mar 2025: Microsoft Research Asia · Research Intern (Rising Star Award) · AI/ML Group
  • May 2024 – Aug 2024: Tencent · Research Intern · QQ Multimodal LLM Group
  • Apr 2023 – 2025: Beijing OneXOne Tech Co., Ltd. · Chief Technology Officer and Co-founder

🎖 Honors and Awards

🏅 Honor Recognitions

  • Sun Yat-sen University Outstanding Graduate
  • Sun Yat-sen University Outstanding Graduation Thesis (Rank 1)
  • Sun Yat-sen University Outstanding Role Model
  • Sun Yat-sen University Outstanding League Member (2020, 2021)
  • Suzhou Innovation and Entrepreneurship Leading Talent
  • Microsoft Research Asia Rising Star Award (Top 10%)

💰 Scholarship Awards

  • National Scholarship (2022)
  • Tsinghua University Scholarship (2025)
  • Shanghai Institute of Organic Chemistry, CAS Scholarship (2020)
  • Sun Yat-sen University Outstanding Undergraduate Scholarship (2020, 2021, 2022)
  • Sun Yat-sen University Discipline Competition Scholarship

🏆 Academic Competitions

  • Outstanding Winner (1st globally, Top 0.03%), 7th SW International College Mathematical Modeling Competition (2022)
  • First Prize, National College Mathematical Modeling Competition (2020)
  • Second Prize, 12th National College Mathematics Competition (2020)
  • Second Prize, Asia-Pacific Mathematical Modeling Competition (2021)
  • Honorable Mention, American Mathematical Contest in Modeling (2021)
  • Second Prize, National CYC Cup Mathematical Modeling Competition (2022)
  • Third Prize, Programming Competition (2020)
  • Merit Award, National Olympiad in Mathematics (2020)
  • Merit Award, Electronic Design Competition (2020)

⭐ Talent Programs

  • Tencent Qingyun Top Talent Program (腾讯青云人才计划)
  • ByteDance Jindouyun Top Talent Program (字节筋斗云人才计划)
  • Jingdong Top Young Technical Genius (TGT) Program (京东TGT人才计划)
  • Meituan Beidou Top Talent Program (美团北斗人才计划)
  • ModelBest “Ahead Four” Top Talent Program (面壁前进四人才计划)

💬 Services

Reviewer: ICML (silver reviewer), NeurIPS, AAAI, ECCV and other top-tier conferences/journals in computer vision and machine learning.
