A criterion for selecting the appropriate one from the trained models for model‐based offline policy evaluation