Monocular 3D object detection with thermodynamic loss and decoupled instance depth