Invariant Representation Learning in Multimedia Recommendation with Modality Alignment and Model Fusion