Hybrid Online and Offline Reinforcement Learning for Tibetan Jiu Chess
In this study, hybrid state-action-reward-state-action (SARSA(λ)) and Q-learning algorithms are applied at different stages of an Upper Confidence Bounds Applied to Trees (UCT) search for Tibetan Jiu chess. Q-learning is also used to update all the nodes on the search path when each game ends. A learning stra...
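The abstract's terminal update ("update all the nodes on the search path when each game ends") can be sketched as an end-of-game Q-learning backup along the path from leaf to root. This is a minimal illustration, not the authors' implementation: the `Node` class and the `ALPHA`/`GAMMA` values are assumptions for the sketch.

```python
# Hedged sketch of a terminal Q-learning backup over a search path.
# Node, ALPHA, and GAMMA are illustrative assumptions, not taken from the paper.

ALPHA = 0.1   # learning rate (assumed)
GAMMA = 0.9   # discount factor (assumed)

class Node:
    """One state-action node on the tree-search path."""
    def __init__(self):
        self.q = 0.0        # estimated action value
        self.children = []  # successor nodes, used for the off-policy max

def backup_path(path, final_reward):
    """Apply a Q-learning update to every node on the path once the
    game has ended, propagating from the leaf back to the root."""
    # The leaf receives the game outcome directly as its target.
    path[-1].q += ALPHA * (final_reward - path[-1].q)
    # Interior nodes bootstrap off-policy from their best child,
    # which is the defining max operator of Q-learning.
    for node in reversed(path[:-1]):
        best_next = max(child.q for child in node.children)
        node.q += ALPHA * (GAMMA * best_next - node.q)
```

A single backward pass like this touches each path node once per finished game, which is what distinguishes it from the per-move updates SARSA(λ) would perform during play.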
| Main Authors: | , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Wiley, 2020-01-01 |
| Series: | Complexity |
| Online Access: | http://dx.doi.org/10.1155/2020/4708075 |