Mushui Liu   刘木水

Alibaba Group, Zhejiang University
Hangzhou, China

Email: lms@zju.edu.cn

Biography

I am currently a Researcher at Alibaba Group through the T-Star Lab Talent Program, where I also serve as a corporate postdoctoral fellow under the supervision of Prof. Jun Xiao and Dr. Ying Chen. My research focuses on multimodal artificial intelligence, especially image generation, unified models, video understanding and generation, and representation learning.

I received my Ph.D. degree from the College of Information Science & Electronic Engineering at Zhejiang University in June 2025, where I was supervised by Prof. Yunlong Yu. I also received my B.S. degree from Zhejiang University in 2020.

During my Ph.D. studies, I had a wonderful time interning at Taobao and Tmall Group, NetEase Fuxi AI Lab, Disney Hulu, and ByteDance.

💬 Our team is hiring research interns with strong engineering skills and a strong interest in AIGC. Feel free to email me (lms@zju.edu.cn) if you are interested in the above topics. Remote collaboration is welcome.

News

Selected Publications [Google Scholar]

# co-first author | * corresponding author.

  1. TFCustom: Customized Image Generation with Time-Aware Frequency Feature Guidance
    Mushui Liu, Dong She, Jingxuan Pang, Qihan Huang, Jiacheng Ying, Wanggui He, Yuanlei Hou, Siming Fu
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR, Highlight), 2025.
  2. Alleviating Sparse Rewards by Modeling Step-Wise and Long-Term Sampling Effects in Flow-Based GRPO
    Yunze Tong#, Mushui Liu#,*, Canyu Zhao, Wanggui He, Shiyi Zhang, Hongwei Zhang, Peng Zhang, Jinlong Liu, Ju Huang, Jiamang Wang, Hao Jiang, Pipei Huang
    International Conference on Machine Learning (ICML), 2026.
  3. LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation
    Mushui Liu, Yuhang Ma, Zhen Yang, Jun Dan, Yunlong Yu, Zeng Zhao, Bai Liu, Changjie Fan, Zhipeng Hu
    The 39th Annual AAAI Conference on Artificial Intelligence (AAAI), 2025.
  4. Envisioning Class Entity Reasoning by Large Language Models for Few-shot Learning
    Mushui Liu, Fangtai Wu, Bozheng Li, Ziqian Lu, Yunlong Yu, Xi Li
    The 39th Annual AAAI Conference on Artificial Intelligence (AAAI), 2025.
  5. Mint: Multi-Modal Chain of Thought in Unified Generative Models for Enhanced Image Generation
    Yi Wang#, Mushui Liu#, Wanggui He, Hanyang Yuan, Longxiang Zhang, Ziwei Huang, Guanghao Zhang, Wenkai Fang, Haoze Jiang, Shengxuming Zhang, Dong She, Jinlong Liu, Weilong Dai, Mingli Song, Hao Jiang, Jie Song
    European Conference on Computer Vision (ECCV), 2026.
  6. DCoAR: Deep Concept Injection into Unified Autoregressive Models for Personalized Text-to-Image Generation
    Fangtai Wu#, Mushui Liu#, Weijie He, Zhao Wang, Yunlong Yu
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2026.
  7. MetaFusion: Instance-Aware Meta-Prompting and Token Refinement for VLMs Generalization
    Mushui Liu, Fangtai Wu, Ziqian Lu, Zhao Wang, Yunlong Yu, Jungong Han, Zhongfei Zhang
    IEEE Transactions on Artificial Intelligence (TAI), 2026.
  8. MOSAIC: Multi-Subject Personalized Generation via Correspondence-Aware Alignment and Disentanglement
    Dong She#, Siming Fu#, Mushui Liu#, Qiaoqiao Jin, Hualiang Wang, Mu Liu, Jidong Jiang
    International Conference on Learning Representations (ICLR), 2026.
  9. FUSE: Fine-Grained and Semantic-Aware Learning for Unified Image Understanding and Generation
    Peng Zhang#, Wanggui He#, Mushui Liu#, Wenyi Xiao, Siyu Zou, Yuan Li, Xingjian Wang, Guanghao Zhang, Yanpeng Liu, Weilong Dai, Jinlong Liu, Shuyi Ying, Ruikai Zhou, Yunlong Yu, Yubo Tao, Hai Lin, Hao Jiang
    The 40th Annual AAAI Conference on Artificial Intelligence (AAAI), 2026.
  10. CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augmentation
    Guozhen Zhang#, Tao Zhong#, Yan Xia#, Mushui Liu#, Zhelun Yu, Haoyuan Li, Wanggui He, Fangxun Shu, Dong She, Yi Wang, Hao Jiang
    The 40th Annual AAAI Conference on Artificial Intelligence (AAAI), 2026.
  11. Mars: Mixture of Auto-Regressive Models for Fine-grained Text-to-Image Synthesis
    Wanggui He#, Siming Fu#, Mushui Liu#, Xierui Wang, Wenyi Xiao, Fangxun Shu, Yi Wang, Lei Zhang, Zhelun Yu, Haoyuan Li, Ziwei Huang, LeiLei Gan, Hao Jiang
    The 39th Annual AAAI Conference on Artificial Intelligence (AAAI), 2025.
  12. Frame Order Matters: A Temporal Sequence-Aware Model for Few-Shot Action Recognition
    Bozheng Li#, Mushui Liu#, Gaoang Wang, Yunlong Yu
    The 39th Annual AAAI Conference on Artificial Intelligence (AAAI), 2025.
  13. RectifiedHR: Enable Efficient High-Resolution Image Generation via Energy Rectification
    Zhen Yang, Guibao Shen, Liang Hou, Mushui Liu, Luozhou Wang, Xin Tao, Pengfei Wan, Di Zhang, Ying-Cong Chen
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR, Findings), 2026.
  14. Improving Zero-Shot Generalization for CLIP with Variational Adapter
    Ziqian Lu, Fangtai Shen, Mushui Liu, Yunlong Yu, Zhao Wang, Xi Li, Jungong Han
    European Conference on Computer Vision (ECCV), 2024.
  15. OmniCLIP: Adapting CLIP for Video Recognition with Spatial-Temporal Omni-Scale Feature Learning
    Mushui Liu, Bozheng Li, Yunlong Yu
    European Conference on Artificial Intelligence (ECAI, Oral), 2024.
  16. Variational Adapter: Improving CLIP in Data-Imbalanced Scenarios
    Ziqian Lu, Mushui Liu, Yunlong Yu, Zhao Wang, Xi Li, Jungong Han
    IEEE Transactions on Circuits and Systems for Video Technology (IEEE TCSVT), 2025.
  17. Synth-CLIP: Synthetic Data Make CLIP Generalize Better in Data-Limited Scenarios
    Mushui Liu, Weijie He, Ziqian Lu, Jun Dan, Yunlong Yu, Yingming Li, Xi Li, Jungong Han
    Neural Networks (Neural Networks), 2025.
  18. Tolerant Self-Distillation for Image Classification
    Mushui Liu, Yunlong Yu, Zhong Ji, Jungong Han, Zhongfei Zhang
    Neural Networks (Neural Networks), 2024.
  19. Fully Fine-Tuned CLIP Models are Efficient Few-Shot Learners
    Mushui Liu, Bozheng Li, Jun Dan, Ziqian Lu, Zhao Wang, Yunlong Yu
    Knowledge-Based Systems (KBS), 2025.
  20. Hybrid mask generation for infrared small target detection with single-point supervision
    Weijie He, Mushui Liu, Yunlong Yu
    Neurocomputing (Neurocomputing), 2025.
  21. Lightweight MIMO-WNet for single image deblurring
    Mushui Liu, Yunlong Yu, Yingming Li, Zhong Ji, Wen Chen, Yang Peng
    Neurocomputing (Neurocomputing), 2023.
  22. RestorerID: Towards Tuning-Free Face Restoration with ID Preservation
    Jiacheng Ying#, Mushui Liu#, Zhe Wu, Runming Zhang, Zhu Yu, Siming Fu, Si-Yuan Cao, Chao Wu, Yunlong Yu, Huiliang Shen
    In Submission.
  23. DyST-XL: Dynamic Layout Planning and Content Control for Compositional Text-to-Video Generation
    Mushui Liu, Weijie He, Yunlong Yu, Zhao Wang, Chao Wu
    In Submission.
  24. CustomVideoX: 3D Reference Attention Driven Dynamic Adaptation for Zero-Shot Customized Video Diffusion Transformers
    Dong She#, Mushui Liu#, Jingxuan Pang, Jin Wang, Zhen Yang, Wanggui He, Guanghao Zhang, Yi Wang, Qihan Huang, Haobin Tang, Yunlong Yu, Siming Fu
    In Submission.
  25. CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation
    Mushui Liu, Jun Dan, Ziqian Lu, Yunlong Yu, Yingming Li, Xi Li
    In Submission.

Academic Services