Hybrid learning strategies: integrating supervised and reinforcement techniques for railway wheel wear management with limited measurement data