Hierarchical cross-modal attention and dual audio pathways for enhanced multimodal sentiment analysis