A multi-model longitudinal assessment of ChatGPT performance on medical residency examinations