Senior Research Scientist

Adobe Research

Seattle, WA

Document Understanding, Image Generation, Unsupervised Learning, Multimodal Learning, Self-Supervised Learning, Instruction Tuning, Cross-Lingual, KV Cache Optimization, Multimodal LLM, Object Detection, CUDA, Diffusion LLM, Video Captioning, Computer Vision, LLM, Video Understanding, Reasoning, Text-to-Image, Layout, Large Language Models, Attention Mechanism, CNN, Open Vocabulary, Cross-Modal Retrieval, Image Captioning, NLP, Diffusion Models, Vision-Language, Entity Segmentation, Efficient Inference

My name is Jiuxiang Gu (顾久祥). I am a Senior Research Scientist at Adobe Research in Seattle. I received my Ph.D. from Nanyang Technological University, Singapore (2016.1–2019.5), under the supervision of Prof. Jianfei Cai, Dr. Gang Wang, and Prof. Tsuhan Chen.

I currently serve as an Area Chair for ICLR 2025 and WACV 2024/2025, a Senior Program Committee Member for IJCAI 2021–2024, and a Program Committee Member for AAAI 2021–2023, NAACL 2021, and others.

My research journey began in hardware design. From 2010 to 2015, I worked as an ASIC design engineer. In 2015, I made the transition to Artificial Intelligence. My current research interests include:

  • Multimodal Foundation Models (LLM, MLLM, Diffusion LLM/MLLM, Text-to-Image/Video/3D Generation, Document Intelligence)
  • Efficient Architecture & Scaling (Pruning, Quantization, KV Cache Optimization, Edge Deployment)
  • Reasoning & Alignment (Chain-of-Thought, Hidden Thinking, Self-supervised Learning, Post-training)
  • Impact & Production (contributions to Adobe Firefly and Acrobat AI Assistant)

Open to collaborations and internships in the above areas.

📧 Feel free to reach out: jigu@adobe.com / gu.jiuxiang@gmail.com

Selected Publications

2026

  1. CVPR 2026
    Sparse-LaViDa: Sparse Multimodal Discrete Diffusion Language Models
    Shufan Li, Jiuxiang Gu, Kangning Liu, and 4 more authors
  2. ICLR 2026
    LaViDa-O: Elastic Large Masked Diffusion Models for Unified Multimodal Understanding and Generation
    Shufan Li, Jiuxiang Gu, Kangning Liu, and 4 more authors

2025

  1. AAAI 2025
    Numerical Pruning for Efficient Autoregressive Models
    Xuan Shen, Zhao Song, Yufa Zhou, and 12 more authors

2024

  1. ICLR 2024 Oral
    LRM: Large Reconstruction Model for Single Image to 3D
    Yicong Hong, Kai Zhang, Jiuxiang Gu, and 7 more authors
  2. ICLR 2024
    ADoPD: A Large-Scale Document Page Decomposition Dataset
    Jiuxiang Gu, Xiangxi Shi, Jason Kuen, and 5 more authors

2021

  1. NeurIPS 2021
    UniDoc: Unified Pretraining Framework for Document Understanding
    Jiuxiang Gu, Jason Kuen, Vlad I Morariu, and 5 more authors

2018

  1. AAAI 2018 Oral
    Stack-Captioning: Coarse-to-Fine Learning for Image Captioning
    Jiuxiang Gu, Jianfei Cai, Gang Wang, and 1 more author
  2. CVPR 2018 Spotlight
    Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models
    Jiuxiang Gu, Jianfei Cai, Shafiq Joty, and 2 more authors
  3. Pattern Recognition, 2018
    Recent Advances in Convolutional Neural Networks
    Jiuxiang Gu, Zhenhua Wang, Jason Kuen, and 8 more authors