Text this: Gaze Estimation Based on a Multi-Stream Adaptive Feature Fusion Network