Automatic summarization of cooking videos using transfer learning and transformer-based models