A Multidisciplinary Multimodal Aligned Dataset for Academic Data Processing