Video and Image Question Answering: building a bridge between visual content analysis and reasoning on textual data