Jiuxiang Gu
Ani Nenkova
Latest
ADOPD: A Large-Scale Document Page Decomposition Dataset (The Twelfth International Conference on Learning Representations 2024)
SOHES: Self-supervised Open-world Hierarchical Entity Segmentation (The Twelfth International Conference on Learning Representations 2024)
Self-Cleaning: Improving a Named Entity Recognizer Trained on Noisy Data with a Few Clean Instances (Findings of the Association for Computational Linguistics (NAACL) 2024)
LayerDoc: layer-wise extraction of spatial hierarchical structure in visually-rich documents (Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV) 2023)
DocLayoutTTS: Dataset and Baselines for Layout-informed Document-level Neural Speech Synthesis (INTERSPEECH 2022)
Doctime: A document-level temporal dependency graph parser (Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) 2022)
Learning adaptive axis attentions in fine-tuning: Beyond fixed sparse attention patterns (Findings of the Association for Computational Linguistics (ACL) 2022)
Learning the Visualness of Text Using Large Vision-Language Models (Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP) 2023)
MGDoc: Pre-training with multi-granular hierarchy for document image understanding (Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP) 2022)
Unidoc: Unified pretraining framework for document understanding (Advances in Neural Information Processing Systems (NeurIPS) 2021)