Hybrid Deep Learning Framework for Eye-in-Hand Visual Control Systems