Video Captioning

Video Captioning with Boundary-aware Hierarchical Language Decoding and Joint Video Prediction

The explosion of video data on the internet requires effective and efficient technology to generate captions automatically for people who are not able to watch the videos. Despite the great progress in video captioning research, particularly on video …

Finding It at Another Side: A Viewpoint-Adapted Matching Encoder for Change Captioning

The prevalent approach to image captioning is an encoder-decoder framework, in which the combination of convolutional neural networks (CNNs) and recurrent neural networks (RNNs) is the de facto standard. In contrast, CNNs are exploited in sequence learning …

Watch It Twice: Video Captioning with a Refocused Video Encoder

With the rapid growth of video data and the increasing demands of various applications such as intelligent video search and assistance for visually impaired people, the video captioning task has recently received a lot of attention in computer vision …