Transformer-Based Model for Monocular Visual Odometry: A Video Understanding Approach