Large-vocabulary forensic pathological analyses via prototypical cross-modal contrastive learning