A Lightweight Multi-Scale Model for Speech Emotion Recognition