Intelligent Engineering Systems through Artificial Neural Networks, Volume 20
Most reinforcement learning algorithms are model-free: they do not compute the transition probabilities, and the agent seeks to make decisions without building a transition probability model. We focus on model-based (also called model-building) algorithms, which attempt to build the model while optimizing the decision-making process. Model-based algorithms have an advantage over model-free algorithms in that their behavior is more stable and robust. A second aspect of robustness and stability concerns the variability in the value of the performance measure returned by the algorithm. We present a new model-building algorithm that builds the transition probability model simultaneously with the value function, and a new variance-penalizing algorithm that exhibits robustness with respect to the performance measure.
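To make the two ideas concrete, the following is a minimal illustrative sketch in Python. It is not the algorithm presented in the chapter, whose details are not given here. It assumes a small finite state and action space, estimates transition probabilities from visit counts, and refreshes a Q-value from that estimated model after each observed transition. The class name `ModelBuilder`, the function `variance_penalized_score`, the discount factor `gamma`, and the penalty weight `theta` are all hypothetical names chosen for illustration. The score shown (mean minus `theta` times variance) is one common way to penalize variability and is likewise an assumption.

```python
from collections import defaultdict


class ModelBuilder:
    """Hypothetical sketch: a count-based transition model updated
    together with Q-values (not the chapter's actual algorithm)."""

    def __init__(self, n_states, n_actions, gamma=0.9):
        self.n_states = n_states
        self.n_actions = n_actions
        self.gamma = gamma                      # assumed discount factor
        self.counts = defaultdict(int)          # (s, a, s') -> visit count
        self.reward_sum = defaultdict(float)    # (s, a, s') -> summed reward
        self.visits = defaultdict(int)          # (s, a) -> visit count
        self.q = [[0.0] * n_actions for _ in range(n_states)]

    def transition_prob(self, s, a, s_next):
        """Empirical estimate of P(s' | s, a); 0.0 if (s, a) is unvisited."""
        n = self.visits[(s, a)]
        return self.counts[(s, a, s_next)] / n if n else 0.0

    def observe(self, s, a, r, s_next):
        """Update the model with one transition, then refresh Q(s, a)
        using the current estimated model."""
        self.counts[(s, a, s_next)] += 1
        self.reward_sum[(s, a, s_next)] += r
        self.visits[(s, a)] += 1

        total = 0.0
        for j in range(self.n_states):
            c = self.counts[(s, a, j)]
            if c == 0:
                continue
            p = c / self.visits[(s, a)]             # estimated probability
            r_bar = self.reward_sum[(s, a, j)] / c  # estimated mean reward
            total += p * (r_bar + self.gamma * max(self.q[j]))
        self.q[s][a] = total


def variance_penalized_score(rewards, theta):
    """Mean reward minus theta times the (population) variance.
    One assumed form of a variance-penalized performance measure."""
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((x - mean) ** 2 for x in rewards) / n
    return mean - theta * var
```

In this sketch, calling `observe(s, a, r, s_next)` after each step of the agent keeps the model and the value estimates in step with each other, while `variance_penalized_score` ranks a reward sequence lower as its spread grows, for a fixed mean.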