M-Learning: Heuristic Approach for Delayed Rewards in Reinforcement Learning