Text this: Global-Mapping-Consistency-Constrained Visual-Semantic Embedding for Interpreting Autonomous Perception Models