Efficient text-to-video retrieval via multi-modal multi-tagger derived pre-screening