Tumor detection in breast cancer pathology patches using a Multi-scale Multi-head Self-attention Ensemble Network on Whole Slide Images