Two-stage deep reinforcement learning method for agile optical satellite scheduling problem