A performance-driven hybrid text-image classification model for multimodal data