Representation Learning for Grounded Spatial Reasoning