Abstract
Reinforcement learning methods are achieving great success in solving complex decision and control tasks across diverse domains and applications. In the context of control tasks, these methods seek to produce optimized controllers by gathering experience through interaction with the target system. Such controllers, often referred to as agents, can be generated without rigorous physical modeling, parameter identification, or detailed knowledge of the system. Moreover, these approaches can potentially operate several actuators with heterogeneous control signal types, such as proportional valves and switching valves. Together with the generic interaction-based training concept, this makes them promising tools for the control of complex hydraulic applications. The general applicability of reinforcement learning-based controllers to hydraulic control tasks has been demonstrated in the literature. However, several design options have not been investigated in detail, especially regarding the selection of algorithms and the simultaneous coordination of multiple actuators. This paper therefore presents a study of different concepts for the reinforcement learning-based control of multiple actuators in a hydraulic press. Furthermore, solutions using a stochastic on-policy actor-critic algorithm (PPO) and a deterministic off-policy actor-critic algorithm (TD3) are compared in this context. The results show that a setup with an individual agent for each valve slightly outperforms solutions in which one agent controls multiple valves. Moreover, TD3 appears to yield better results on the given task than PPO.