A Composite Recognition Method Based on Multimode Mutual Attention Fusion Network