Fusion-Based Damage Segmentation for Multimodal Building Façade Images from an End-to-End Perspective