Text this: Beyond the ROC Curve: Activity Monitoring to Evaluate Deep Learning Models in Clinical Settings