Object detection and tracking in video sequences: formalization, metrics and results