Human motion recognition based on feature fusion and residual networks