LLaVA-Read: Enhancing Reading Ability of Multimodal Language Models (arXiv 2024)

Publication
arXiv