Activity-Based Scene Decomposition for Topology Inference of Video Surveillance Network