Text this: A SWIN-based vision transformer for high-fidelity and high-speed imaging experiments at light sources