In this paper, the policy iteration method is used to obtain the solution.
An appropriate selection of basis functions directly influences the learning performance of a policy iteration method during value function approximation.