Weakly-supervised image captioning based on rich contextual information
Zheng H.-T., Wang Z., Ma N., Chen J., Xiao X.
Published in Multimedia Tools and Applications (Springer Science and Business Media LLC)
2018
Volume: 77
Issue: 14
Pages: 18583-18599
Abstract
Automatic generation of image descriptions is a challenging task that attracts broad attention in artificial intelligence. Inspired by methods from computer vision and natural language processing, different approaches have been proposed to solve the problem. However, captions generated by existing approaches lack sufficient contextual information to describe the corresponding images completely, because the labeled captions in the training sets only describe images at a basic level and carry few contextual annotations. In this paper, we propose a Weakly-supervised Image Captioning Approach (WICA) to generate captions containing rich contextual information, without requiring complete annotations of contextual information in the datasets. We utilize encoder-decoder neural networks to extract basic captioning features and leverage object detection networks to identify contextual features. Then, we encode the two levels of features with a phrase-based language model in order to generate captions with rich contextual information. Comprehensive experimental results reveal that the proposed model outperforms existing baselines in terms of the richness and reasonableness of contextual information in image captioning. © 2017, Springer Science+Business Media, LLC.
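The abstract describes a two-stream design: an encoder-decoder stream for basic captioning features and an object-detection stream for contextual features, fused before caption generation. The sketch below is an illustrative reconstruction of that idea in PyTorch, not the authors' WICA code: the class name ContextualCaptioner, the fusion scheme, and the LSTM word decoder (substituted for the paper's phrase-based language model) are all assumptions made for clarity.

```python
# Illustrative two-stream captioning sketch (NOT the paper's implementation).
# All names and design choices here are assumptions for demonstration only.
import torch
import torch.nn as nn
import torchvision


class ContextualCaptioner(nn.Module):
    """Fuses global CNN features (basic captioning stream) with detector
    outputs (contextual stream) as the initial state of a caption decoder."""

    def __init__(self, vocab_size: int, embed_dim: int = 256, hidden_dim: int = 512):
        super().__init__()
        # Stream 1: basic captioning features from a CNN image encoder.
        backbone = torchvision.models.resnet18(weights=None)
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])  # drop classifier head
        self.enc_proj = nn.Linear(512, hidden_dim)

        # Stream 2: contextual features from an object detector
        # (a soft histogram over its 91 COCO label slots stands in for
        # richer per-object context vectors).
        self.detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None)
        self.detector.eval()  # treated as a frozen auxiliary module
        self.ctx_proj = nn.Linear(91, hidden_dim)

        # Caption decoder: a plain LSTM word decoder, used here instead of the
        # phrase-based language model mentioned in the abstract.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, images: torch.Tensor, captions: torch.Tensor) -> torch.Tensor:
        # Global visual features: (B, 512) -> (B, H).
        feats = self.encoder(images).flatten(1)
        h0 = self.enc_proj(feats)

        # Contextual object evidence from the frozen detector.
        with torch.no_grad():
            dets = self.detector([img for img in images])
        ctx = torch.stack([
            d["scores"].new_zeros(91).index_add_(0, d["labels"], d["scores"])
            if d["labels"].numel() > 0 else images.new_zeros(91)
            for d in dets
        ])                                           # (B, 91) soft label histogram
        c0 = self.ctx_proj(ctx)

        # Fuse the two streams as the decoder's initial hidden/cell state.
        state = (h0.unsqueeze(0), c0.unsqueeze(0))   # each (1, B, H)
        emb = self.embed(captions)                   # (B, T, E)
        out, _ = self.lstm(emb, state)
        return self.out(out)                         # (B, T, vocab_size)
```

In practice both the encoder and the detector would load pretrained weights; they are left uninitialized here only so the sketch runs without downloads.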
About the journal
Journal: Multimedia Tools and Applications
Publisher: Springer Science and Business Media LLC
ISSN: 1380-7501
Open Access: No