Multi-robot systems have received increasing attention in the robotics community over the past decade. The most important issue in this area is multi-robot coordination, which concerns how to make multiple autonomous robots cooperate or compete with each other to complete a common task. Because of its complexity, conventional planning-based or behavior-based approaches do not work well for multi-robot coordination, especially in a dynamic, unknown environment. Machine learning is therefore becoming a promising way to help robots operate in an unknown dynamic environment and progressively improve their performance. Most multi-robot researchers have selected the Q-learning algorithm for this purpose because of its simplicity and low computational requirements. However, directly extending the single-agent Q-learning algorithm to multiple robots violates its Markov assumption, which results in slow convergence and failure to learn a good cooperative policy. In this paper, the team Q-learning algorithm, which was originally designed for the framework of Stochastic Games (SG), is proposed for decision-making in a purely cooperative multi-robot project: multi-robot object transportation. First, the basic ideas of the Stochastic Games framework and the team Q-learning algorithm are introduced. Next, the algorithm is applied to a multi-robot object transportation task, and the implementation details are presented. Computer simulation results are presented to demonstrate that the team Q-learning algorithm works well for decision-making in the proposed multi-robot system. Finally, the effects of some parameters of team Q-learning are assessed and some interesting conclusions are drawn. In particular, the simulation results show that training helps improve the performance of multi-robot decision-making, but its effect is very limited.
In addition, it is pointed out that team Q-learning results in a huge learning space when the number of robots exceeds ten, which indicates that a new Q-learning algorithm integrating single-agent Q-learning and team Q-learning urgently needs to be developed for multi-robot systems.
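
The growth of the learning space noted above can be illustrated with a short calculation. The following is a minimal sketch under stated assumptions, not the implementation used in this paper: it assumes a tabular team Q-function indexed by state and joint action, and the names `num_states`, `actions_per_robot`, `alpha`, and `gamma` are hypothetical parameters introduced only for illustration. The update shown is the standard team Q-learning rule, in which the maximization runs over joint actions of all robots rather than over the actions of a single robot.

```python
from itertools import product


def joint_actions(actions_per_robot, num_robots):
    """Enumerate every joint action as a tuple with one action index per robot."""
    return list(product(range(actions_per_robot), repeat=num_robots))


def joint_action_count(actions_per_robot, num_robots):
    """Number of joint actions: it grows exponentially with the number of robots."""
    return actions_per_robot ** num_robots


def q_table_size(num_states, actions_per_robot, num_robots):
    """Number of entries in a tabular team Q-function (states x joint actions)."""
    return num_states * joint_action_count(actions_per_robot, num_robots)


def team_q_update(q, state, joint_action, reward, next_state, candidates,
                  alpha=0.1, gamma=0.9):
    """One team Q-learning update on a dict-based Q-table.

    q          : dict mapping (state, joint_action) -> value (missing entries are 0)
    candidates : iterable of all joint actions available in next_state
    alpha      : learning rate (assumed value)
    gamma      : discount factor (assumed value)
    """
    # Maximize over joint actions of the whole team in the next state.
    best_next = max(q.get((next_state, a), 0.0) for a in candidates)
    old = q.get((state, joint_action), 0.0)
    q[(state, joint_action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, joint_action)]
```

With the illustrative values of 100 states and 4 actions per robot, `q_table_size(100, 4, 2)` is 1,600 entries, whereas `q_table_size(100, 4, 10)` is 104,857,600 entries, which suggests why a purely joint-action formulation becomes impractical as the team grows.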

