Semantic-enhanced panoptic scene graph generation through hybrid and axial attentions