Head information bottleneck (HIB): leveraging information bottleneck for efficient transformer head attribution and pruning