A novel image captioning model with visual-semantic similarities and visual representations re-weighting