Reinforcement learning algorithm based on minimum state method and average reward
In allusion to the problem that Q-Learning,which was used discount reward as the evaluation criterion,could not show the affect of the action to the next situation,AR-Q-Learning was put forward based on the average reward and Q-Learning.In allusion to the curse of dimensionality,which meant that the...
Saved in:
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | zho |
Published: |
Editorial Department of Journal on Communications
2011-01-01
|
Series: | Tongxin xuebao |
Subjects: | |
Online Access: | http://www.joconline.com.cn/zh/article/74419758/ |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|