To plan several moves ahead, a model known as a Markov decision process is commonly used when the number of possible world states is reasonably small.
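As a rough illustration of planning over a small state space, the sketch below runs value iteration, one standard way to solve a Markov decision process. The three-state chain, the action names "stay" and "go", the rewards, and the discount factor are all hypothetical and chosen only for this example.

```python
# Hypothetical toy MDP (an assumption for illustration): states 0, 1, 2,
# where state 2 is terminal.
# TRANSITIONS[state][action] = list of (probability, next_state, reward)
TRANSITIONS = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 0.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "go": [(1.0, 2, 1.0)]},
    2: {},  # terminal state: no actions available
}
GAMMA = 0.9  # assumed discount factor


def action_value(outcomes, values, gamma):
    """Expected return of one action, given its possible outcomes."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions, gamma, tol=1e-9):
    """Repeat Bellman optimality backups until the values stop changing."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # terminal states keep value 0
                continue
            best = max(action_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def greedy_policy(transitions, gamma, values):
    """Pick, in each non-terminal state, the action with the highest value."""
    return {
        s: max(actions, key=lambda a: action_value(actions[a], values, gamma))
        for s, actions in transitions.items()
        if actions
    }


values = value_iteration(TRANSITIONS, GAMMA)
policy = greedy_policy(TRANSITIONS, GAMMA, values)
```

Sweeping every state on each pass is only practical because the state space is small, which is the condition the sentence above describes.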
2
Any system that can be described in this manner is a Markov process.
3
Reinforcement learning based on a Markov decision process is a form of online learning that can be applied to a single-agent environment.
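One common instance of this idea is tabular Q-learning, in which a single agent updates its estimates online after every step. The following is a minimal sketch of that technique, not a method taken from the sentence above; the chain environment, the action names, and all hyperparameters are assumptions made for illustration.

```python
import random

ACTIONS = ("stay", "go")  # hypothetical action names
TERMINAL = 2              # hypothetical terminal state


def step(state, action):
    """Hypothetical deterministic environment: a chain 0 -> 1 -> 2."""
    if action == "stay":
        return state, 0.0
    next_state = state + 1
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2,
               max_steps=100, seed=0):
    """Learn action values online with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(TERMINAL) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward = step(state, action)
            # The terminal state has no future value.
            if next_state == TERMINAL:
                future = 0.0
            else:
                future = max(q[(next_state, a)] for a in ACTIONS)
            # Update the estimate immediately after each transition.
            q[(state, action)] += alpha * (reward + gamma * future
                                           - q[(state, action)])
            state = next_state
            if state == TERMINAL:
                break
    return q


q = q_learning()
learned_policy = {s: max(ACTIONS, key=lambda a: q[(s, a)])
                  for s in range(TERMINAL)}
```

Each update uses only the transition the agent has just experienced, which is what makes the learning online rather than a batch computation over a known model.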