For cyclical tasks with absorbing goal states, a reasonable approach is to use reinforcement learning algorithms based on the average-reward model.
USDA analysts continue to work on details for the Average Crop Revenue Election (ACRE) program as signup for the 2009 Direct and Counter-cyclical Program (DCP) moves forward.