Text this: Representation Learning for Vision-Based Autonomous Driving via Probabilistic World Modeling