ECCV

Finding It at Another Side: A Viewpoint-Adapted Matching Encoder for Change Captioning

The prevalent approach to the image captioning is an encoder-decoder framework, where the combination of convolutional neural networks (CNNs) and recurrent neural networks (RNNs) is the de-facto. In contrast, CNNs are exploited in sequence learning …

Unpaired Image Captioning by Language Pivoting

Image captioning is a multimodal task involving computer vision and natural language processing, where the goal is to learn a mapping from the image to its natural language description. In general, the mapping function is learned from a training set …