Jiuxiang Gu
Ruiyi Zhang
Latest
A Multi-LLM Debiasing Framework (arXiv 2024)
ADOPD: A Large-Scale Document Page Decomposition Dataset (The Twelfth International Conference on Learning Representations 2024)
ARTIST: Improving the Generation of Text-rich Images by Disentanglement (arXiv 2024)
Customization assistant for text-to-image generation (Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2024)
LLaVA-Read: Enhancing Reading Ability of Multimodal Language Models (arXiv 2024)
MMR: Evaluating Reading Ability of Large Multimodal Models (arXiv 2024)
SOHES: Self-supervised Open-world Hierarchical Entity Segmentation (The Twelfth International Conference on Learning Representations 2024)
Self-Cleaning: Improving a Named Entity Recognizer Trained on Noisy Data with a Few Clean Instances (Findings of the Association for Computational Linguistics (NAACL) 2024)
TRINS: Towards Multimodal Language Models that Can Read (Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2024)
Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation (arXiv 2024)
A Critical Analysis of Document Out-of-Distribution Detection (Findings of the Association for Computational Linguistics (EMNLP) 2023)
LLaVAR: Enhanced visual instruction tuning for text-rich image understanding (arXiv 2023)
Learning adaptive axis attentions in fine-tuning: Beyond fixed sparse attention patterns (Findings of the Association for Computational Linguistics (ACL) 2022)
TiGAN: Text-based interactive image generation and manipulation (Proceedings of the AAAI Conference on Artificial Intelligence (AAAI) 2022)
Towards language-free training for text-to-image generation (Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022)