Jiuxiang Gu
Latest
ADoPD: A Large-Scale Document Page Decomposition Dataset (ICLR 2024)
Fourier Circuits in Neural Networks: Unlocking the Potential of Large Language Models in Mathematical Reasoning and Modular Arithmetic (arXiv 2024)
LRM: Large reconstruction model for single image to 3D (ICLR 2024)
SOHES: Self-supervised Open-world Hierarchical Entity Segmentation (ICLR 2024)
Selective reflection-tuning: Student-selected data recycling for LLM instruction-tuning (arXiv 2024)
A Critical Analysis of Document Out-of-Distribution Detection (EMNLP 2023)
AIMS: All-inclusive multi-level segmentation for anything (NeurIPS 2023)
Customization Assistant for Text-to-image Generation (arXiv 2023)
DocEdit: language-guided document editing (AAAI 2023)
Improving a Named Entity Recognizer Trained on Noisy Data with a Few Clean Instances (arXiv 2023)
LayerDoc: layer-wise extraction of spatial hierarchical structure in visually-rich documents (WACV 2023)
Learning the visualness of text using large vision-language models (EMNLP 2023)
LLaVAR: Enhanced visual instruction tuning for text-rich image understanding (arXiv 2023)
Reflection-tuning: Data recycling improves LLM instruction-tuning (arXiv 2023)
CA-SSL: Class-agnostic semi-supervised learning for detection and segmentation (ECCV 2022)
Delving into out-of-distribution detection with vision-language representations (NeurIPS 2022)
DocLayoutTTS: Dataset and Baselines for Layout-informed Document-level Neural Speech Synthesis (INTERSPEECH 2022)
DocTime: A document-level temporal dependency graph parser (NAACL 2022)
EI-CLIP: Entity-aware interventional contrastive learning for e-commerce cross-modal retrieval (CVPR 2022)
FedKC: Federated knowledge composition for multilingual natural language understanding (ACMWeb 2022)
Fine-grained entity segmentation (arXiv 2022)
High-quality entity segmentation (arXiv 2022)
Improving the reliability for confidence estimation (ECCV 2022)
Interactive image generation with natural-language feedback (AAAI 2022)
Learning adaptive axis attentions in fine-tuning: Beyond fixed sparse attention patterns (ACL 2022)
MGDoc: Pre-training with multi-granular hierarchy for document image understanding (2022)
Meta spatio-temporal debiasing for video scene graph generation (ECCV 2022)
Open world entity segmentation (TPAMI 2022)
Open-vocabulary instance segmentation via robust cross-modal pseudo-labeling (CVPR 2022)
TiGAN: Text-based interactive image generation and manipulation (AAAI 2022)
Towards language-free training for text-to-image generation (CVPR 2022)
User-Entity Differential Privacy in Learning Natural Language Models (Big Data 2022)
Exploiting semantic embedding and visual feature for facial action unit detection (CVPR 2021)
Multi-scale aligned distillation for low-resolution detection (CVPR 2021)
SelfDoc: Self-supervised document representation learning (CVPR 2021)
Towards interpreting and mitigating shortcut learning behavior of NLU models (arXiv 2021)
UNISON: Unpaired cross-lingual image captioning (AAAI 2021)
UniDoc: Unified pretraining framework for document understanding (NeurIPS 2021)
Unsupervised cross-lingual image captioning (AAAI 2021)
Finding it at another side: A viewpoint-adapted matching encoder for change captioning (ECCV 2020)
Resilient load restoration in microgrids considering mobile energy storage fleets: A deep reinforcement learning approach (PESGM 2020)
Self-supervised relationship probing (NeurIPS 2020)
Scene graph generation with external knowledge and image reconstruction (CVPR 2019)
Unpaired image captioning via scene graph alignments (ICCV 2019)
Watch It Twice: Video Captioning with a Refocused Video Encoder (ACMMM 2019)
Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models (CVPR 2018)
Recent advances in convolutional neural networks (Pattern Recognition 2018)
Stack-Captioning: Coarse-to-Fine Learning for Image Captioning (AAAI 2018)
Unpaired image captioning by language pivoting (ECCV 2018)
Video Captioning with Boundary-aware Hierarchical Language Decoding and Joint Video Prediction (Neurocomputing 2018)
An empirical study of language CNN for image captioning (ICCV 2017)