A large-scale dataset for Chinese historical document recognition and analysis