Abstract
In human–robot collaborative assembly, leveraging multimodal commands for intuitive robot control remains challenging, from command translation to efficient collaborative operation. This article investigates multimodal data-driven robot control for human–robot collaborative assembly. Built on function blocks, a programming-free human–robot interface is designed to fuse multimodal human commands and accurately trigger defined robot control modalities. Deep learning is explored to develop a command classification system for low-latency, high-accuracy robot control, in which a spatial-temporal graph convolutional network is designed to translate brainwave command phrases into robot commands reliably and accurately. Multimodal data-driven high-level robot control during assembly is then facilitated by event-driven function blocks, with the high-level commands serving as triggering events for the execution of algorithms for fine robot manipulation and assembly-feature-based collaborative assembly. Finally, a partial car engine assembly deployed to a robot team is chosen as a case study to demonstrate the effectiveness of the developed system.
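As a rough illustration of the kind of spatial-temporal graph convolutional classification mentioned above, the sketch below maps a window of multichannel brainwave features to robot command logits. It is not the authors' implementation: the network depth, channel sizes, electrode adjacency matrix, and command count are all assumed placeholders.

```python
# Illustrative sketch only (not the paper's implementation): a minimal
# spatial-temporal graph convolutional classifier for brainwave command phrases.
import torch
import torch.nn as nn


class STGCNBlock(nn.Module):
    """One spatial graph convolution followed by a temporal convolution."""

    def __init__(self, in_ch, out_ch, adjacency, t_kernel=9):
        super().__init__()
        self.register_buffer("A", adjacency)            # (nodes, nodes) sensor graph
        self.spatial = nn.Conv2d(in_ch, out_ch, 1)      # per-node feature mixing
        self.temporal = nn.Conv2d(out_ch, out_ch, (t_kernel, 1),
                                  padding=((t_kernel - 1) // 2, 0))
        self.act = nn.ReLU()

    def forward(self, x):                               # x: (batch, ch, time, nodes)
        x = self.spatial(x)
        x = torch.einsum("nctv,vw->nctw", x, self.A)    # aggregate over graph edges
        return self.act(self.temporal(x))


class BrainwaveCommandClassifier(nn.Module):
    def __init__(self, adjacency, in_ch=1, num_commands=6):
        super().__init__()
        self.block1 = STGCNBlock(in_ch, 32, adjacency)
        self.block2 = STGCNBlock(32, 64, adjacency)
        self.head = nn.Linear(64, num_commands)

    def forward(self, x):
        x = self.block2(self.block1(x))
        x = x.mean(dim=(2, 3))                          # pool over time and nodes
        return self.head(x)                             # command logits


if __name__ == "__main__":
    nodes = 8                                           # assumed number of EEG channels
    A = torch.eye(nodes)                                # placeholder adjacency (self-loops only)
    model = BrainwaveCommandClassifier(A)
    eeg_window = torch.randn(2, 1, 128, nodes)          # (batch, features, time, channels)
    print(model(eeg_window).shape)                      # -> torch.Size([2, 6])
```

In the workflow described in the abstract, the predicted command class from such a network would serve as one of the high-level triggering events consumed by the event-driven function blocks that orchestrate fine robot manipulation.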