English listening and speaking ability evaluation model fusing computer vision and speech recognition algorithms