Multi-Stage Audio-Visual Fusion for Dysarthric Speech Recognition With Pre-Trained Models