## Abstract

This paper presents a hierarchical nonlinear control algorithm for the real-time planning and control of cooperative locomotion of legged robots that collaboratively carry objects. An innovative network of reduced-order models subject to holonomic constraints, referred to as interconnected linear inverted pendulum (LIP) dynamics, is presented to study cooperative locomotion. The higher level of the proposed algorithm employs a supervisory controller, based on event-based model predictive control (MPC), to effectively compute the optimal reduced-order trajectories for the interconnected LIP dynamics. The lower level of the proposed algorithm employs distributed nonlinear controllers to reduce the gap between reduced- and full-order complex models of cooperative locomotion. In particular, the distributed controllers are developed based on quadratic programing (QP) and virtual constraints to impose the full-order dynamical models of each agent to asymptotically track the reduced-order trajectories while having feasible contact forces at the leg ends. The paper numerically investigates the effectiveness of the proposed control algorithm via full-order simulations of a team of collaborative quadrupedal robots, each with a total of 22 degrees-of-freedom. The paper finally investigates the robustness of the proposed control algorithm against uncertainties in the payload mass and changes in the ground height profile. Numerical studies show that the cooperative agents can transport unknown payloads whose masses are up to 57%, 97%, and 137% of a single agent's mass with a team of two, three, and four legged robots.

## 1 Introduction

Legged robots can form collaborative robot (corobot) teams that assist humans in labor-intensive tasks such as construction, manufacturing, and assembly. The evolution of legged robots that cooperatively manipulate/transport objects can be described by high-dimensional and inherently unstable complex systems. Although powerful computational approaches have allowed the deployment of distributed control algorithms for complex robot systems, state-of-the-art techniques are tailored to the control of multirobot systems (MRSs) (see, e.g., Refs. [1] and [2]) composed of collaborative robotic arms and multifingered hands [3], aerial vehicles [4,5], and ground vehicles [6–8], but *not* sophisticated legged machines that cooperatively transport objects.

The *overarching goal* of this paper is to develop a hierarchical computational algorithm to enable the real-time planning and control of cooperative locomotion for multi-agent legged robotic systems that carry objects. The higher level of the proposed algorithm employs a supervisory control, based on event-based model predictive control (MPC), to generate optimal trajectories for individual agents. In particular, the MPC is formulated for the optimal control of an interconnected network of holonomically constrained reduced-order systems, developed based on linear inverted pendulum (LIP) models, subject to having feasible individual ground reaction forces (GRFs). To reduce the gap between the network of reduced- and full-order complex models of cooperative locomotion, distributed nonlinear controllers, based on quadratic programing (QP) and virtual constraints, are implemented at the lower level of the proposed algorithm to impose the full-order dynamics of each agent to asymptotically track the optimal trajectories while keeping the GRFs at all contacting leg ends in the friction cone. It is shown that the proposed control approach can generate and robustly stabilize cooperative locomotion patterns for multi-agent quadrupedal robotic systems in the presence of model uncertainties arising from unknown payloads and ground height variations (see Fig. 1).

### 1.1 Related Work, Motivation, and Challenges.

Gait planning for complex dynamical models of cooperative locomotion is a significant challenge arising from the hybrid nature of models, nonlinearities, high dimensionality, and strong interactions amongst the agents. Hybrid systems theory has provided powerful techniques for modeling and analyzing dynamic locomotion of single legged machines [9–18]. Advanced nonlinear control algorithms have been developed to address the hybrid nature of locomotion such as hybrid reduction [19], controlled symmetries [15], transverse linearization [16], and hybrid zero dynamics (HZD) [10,20]. The HZD approach considers a set of kinematic constraints, referred to as virtual constraints, to coordinate the links of the robots during locomotion. Virtual constraints are asymptotically imposed by the action of a feedback control law (e.g., input–output (I-O) linearization [21]) and have been validated for stable locomotion of bipedal robots [10,11,22–25] and powered prosthetic legs [26,27]. The HZD approach formulates the gait planning problem as an offline nonlinear programing problem [24] which *cannot* address the real-time planning for cooperative locomotion.

Various powerful MPC-based approaches have been introduced for the real-time planning and robust control of solitary legged machines, including the LIP-based approach [28–34], single rigid body dynamics approach [35–37], nonlinear MPC [38], policy-regularized MPC [39], and QP-based whole-body control [40,41]. Quadrupedal robots that cooperatively transport an object can be described by a set of legged agents that are coupled to each other and the object via a set of holonomic constraints. The *challenge* is to develop real-time optimal control algorithms for such a complex and inherently unstable robotic system that control locomotion with many degrees-of-freedom (DOFs). Existing MPC approaches for legged robots are typically formulated as QPs to be solved every time sample—this makes the extension of these MPC-based techniques for composite mechanical systems arising from cooperative locomotion of quadrupedal robots computationally intensive. We would like to deploy innovative MPC techniques in the context of *networked systems* that can reduce the computational burden to allow real-time planning and coordination of sophisticated corobot teams. One approach to tackle this challenge is through the development of *event-based* MPC techniques [33,42], in which MPC problems are solved at particular time samples, referred to as events (e.g., beginning of each domain), rather than every time sample. Our previous work [33] presented a nonlinear control approach, based on event-based MPC, for the robust and stable locomotion of single quadrupedal robots. We would like to extend this framework for the motion control of cooperative locomotion.

### 1.2 Objectives and Contributions.

The *objectives* and *contributions* of this paper are as follows. We present an innovative and interconnected network of reduced-order models to address the real-time planning for cooperative locomotion of quadrupedal robots that carry an object. The proposed network of reduced-order models is nonlinear and developed based on the interconnection of LIP models subject to holonomic constraints. We then present a hierarchical nonlinear computational algorithm for the motion control of legged corobots (see Fig. 2). At the higher level of the control scheme, we present a supervisory predictive control algorithm, based on event-based MPC, to compute the optimal center of mass (COM) trajectories for the network of reduced-order models subject to the feasibility of the individual net GRFs. To address the nonlinearity of the interconnected reduced-order models and to formulate a convex optimal control problem, the event-based MPC is employed using the linearized dynamics at the beginning of each continuous-time domain (i.e., event samples). At the lower level of the control scheme, distributed nonlinear controllers, based on QP and virtual constraints, are developed to impose the full-order dynamical model of each agent to asymptotically track the prescribed optimal COM trajectories while keeping all individual GRFs at the contacting leg ends feasible. To demonstrate the effectiveness of the proposed control algorithm, a series of extensive and full-order numerical simulations is presented for cooperative locomotion of a team of two, three, and four advanced quadrupedal robots, Vision 60, augmented with Kinova arms, each with a total of 22DOFs (see Fig. 1). It is numerically shown that the proposed control scheme can generate and robustly stabilize locomotion patterns for a team of quadrupedal robots that carry different objects in the presence of model uncertainties in terms of payloads and unknown ground height variation. The numerical studies show that the agents can cooperatively transport unknown payloads whose masses are up to 57%, 97%, and 137% of a singles agent's mass with a team of two, three, and four legged corobots.

The approach of this paper completely differs from Ref. [43] in that this paper presents an innovative network of reduced-order models together with a supervisory MPC for the real-time planning of cooperative locomotion, whereas [43] only studied the stabilization of preplanned and offline trajectories. In addition, the approach of Ref. [43] studied the locomotion of two agents via the hybrid systems formulation that cannot be easily extended to the locomotion of multiple agents. We finally remark that the objective of this paper is to study the development of hierarchical control algorithms for stable and robust cooperative locomotion subject to the holonomic constraints that can arise from cooperative transportation problems. Hence, the main focus of the paper is on *cooperative locomotion*, and robotic manipulators are being used toward this goal.

## 2 Interconnected Linear Inverted Pendulum Dynamics

The objective of this section is to derive an interconnected network of reduced-order models for the cooperative locomotion of $N\u22652$ legged corobots that carry an object. The reduced-order network, referred to as the *interconnected LIP dynamics*, will be utilized for the real-time trajectory planning in Sec. 3. Here, we consider an open path graph for the network of LIP dynamics with *N* vertices and *N* – 1 edges. In particular, all inner vertices have degree 2 except the end vertices 1 and *N* that have degree 1 (see Fig. 2). The vertices of the graph represent the agents, and the edges represent interconnection between agents. In our notation, $N(i)$ denotes the set of all agents that are adjacent to the agent $i\u2208V:={1,\u2026,N}$.

*Remark 1* (Path graphs). The reason for the assumption of open path graph is to simplify the presentation of the interaction forces and the interconnected LIP dynamics in Theorem 1. This also allows us to easily present the proposed control scheme.

where $ri:=col(rix,riy)\u2208\mathbb{R}2$ represents the Cartesian coordinates of the COM of the agent *i* in the horizontal plane with respect to the inertial world frame, $\u2113$ denotes the height of the COM, *g* is the gravitational constant, $ui:=col(uix,uiy)\u2208\mathbb{R}2$ represents the Cartesian coordinates of the center of pressure (COP), $fi\u2208\mathbb{R}2$ denotes the external force on the COM of the agent *i*, and *m* is the total mass of the agent

*We suppose that there are holonomic constraints amongst the adjacent agents as follows:*

*for all*$i\u2208V$*and*$j\u2208N(i)$*and some constants c _{ij} > 0. The forces between agents i and j are further assumed to be*$fij=\u2212fji=(ri\u2212rj)\u2009\lambda ij$

*, where λ*$\lambda ij=\lambda ji$

_{ij}denotes the Lagrange multipliers with the symmetry property*(see Fig. 2).*

We remark that from Assumption 1 as well as the symmetry condition, there are *N* – 1 independent Lagrange multipliers *λ _{ij}* to be determined. For future purposes, we show these independent Lagrange multipliers as a vector $\lambda :=col(\lambda ij\u2009|\u2009i=1,\u2026,N\u22121,j=i+1)\u2208\mathbb{R}N\u22121$, where “$col$” represents the column operator. Throughout this paper, the boldface variables will correspond to the global variables of the interconnected LIP network.

*Remark 2*. Assumption 1 considers the holonomic constraints amongst the COMs of the adjacent agents as the LIP dynamics cannot address the moments about the COM. In particular, the addition of robotic manipulators to the reduced-order model can result in moments around the COM generated by the arms. Hence, the interconnected LIP dynamics do not include the manipulator models. However, we remark that the full-order dynamical model of cooperative locomotion in Sec. 5 will consider the holonomic constraints amongst the manipulators' end effectors (EEs). We further assume that the EE's motion with respect to the body is almost static. The numerical results of Sec. 5 will show the adequacy and validity of this assumption for the development of the supervisory MPC. The numerical results will also show that the proposed control algorithms can bridge the gap between the developed interconnected LIP model and the detailed full-order model. Section 6.2 will discuss this assumption and the results with more details.

in which $eij:=ri\u2212rj\u2208\mathbb{R}2$. In addition, $b:=col(bi\u2009|\u2009i=1,\u2026,N\u22121)\u2208\mathbb{R}N\u22121$, where $bi:=mg\u2113(ri\u2212rj)\u22a4(ui\u2212uj)\u2212mg\u2113||ri\u2212rj||2\u2212m||r\u02d9i\u2212r\u02d9j||2$ with $j=i+1$. In what follows, we study the conditions under which there is a unique solution $\lambda $ for the algebraic Eq. (5).

Theorem 1 (Uniqueness of $\lambda $). Suppose that $N\u22652$ and Assumption 1 is met. Then, the matrix $\Lambda N(r)$ is positive definite if $ri\u2260rj$ (or, equivalently, $eij\u22600$) for all $i\u2208V$ and $j\u2208N(i)$.

*Proof.* See Appendix A.1.▪

*I*

_{2}and $02$ denote the identity and zero matrices of order two, respectively. We remark that according to the construction procedure, the state manifold for the interconnected LIP dynamics can be expressed as

for some *c _{ij}* > 0 which is invariant under the flow of Eq. (6). In addition, we can show that $M$ is a $2N+2$-dimensional embedded submanifold of $\mathbb{R}4N$.

*Example 1*. For the case of two agents, the interconnected LIP dynamics can be expressed as the following nonlinear system:

in which *T _{s}* denotes the sampling time and $x[k]$ and $u[k]$ represent the state vector and control inputs at the time sample $k\u2208Z\u22650:={0,1,\u2026}$, respectively.

## 3 Supervisory Predictive Control

The objective of this section is to develop a supervisory control algorithm, based on the interconnected LIP dynamics, MPC, and convex optimization, to effectively plan and coordinate multi-agent legged robots in real-time.

Models of legged locomotion are hybrid and can be illustrated as directed graphs. In this representation, continuous-time dynamics are represented by vertices of the graph to describe the evolution of the system by the Lagrangian dynamics. The edges of the graph then represent the instantaneous and discrete-time transitions amongst the continuous-time dynamics to model the possible and abrupt changes in the state vector according to the rigid impacts of the leg ends with the environment. In this paper, we consider a general locomotion (walking) pattern for the quadrupedal agents with start and stop conditions as a directed graph $Gw=(Vw,Ew)$ (see Fig. 3), where the vertices set $Vw$ represents the continuous-time domains (e.g., double-, triple-, and quadruple-contact domains) and edges set $Ew\u2282Vw\xd7Vw$ denotes the discrete-time transitions (e.g., impacts and take-offs) (see Fig. 3(b)). We further suppose that there are $md\u22651$ continuous-time domains and each continuous-time domain consists of $ng\u22651$ grid points (i.e., time samples) (see Fig. 3(a)). In this paper, we consider a general *aperiodic* locomotion pattern. Hence, domains are enumerated to show the successive continuous-time domains from start to stop. Consequently, there can be two distinct domains with the same stance legs (e.g., domains 2 and $md\u22121$ in Fig. 3(b)). The domain indicator function is then defined as $\zeta :Z\u22650\u2192{1,2,\u2026,md}$ by $\zeta (k):=\u230akng\u230b+1$ for $0\u2264k<md\u2009ng$ and $\zeta (k):=md$ for $k\u2265md\u2009ng$ to assign the domain index for every time sample. Here, $\u230a\xb7\u230b$ represents the floor function.

*i*in the domain $\zeta (k)$ (see Fig. 2). In addition, the net GRF acting on the agent

*i*must be in the friction cone. These friction cone conditions together with the dynamics (6) can be expressed as the following nonlinear inequality constraints:

*Problem 1* (Real-time planning of agents). Let us consider the locomotion pattern $Gw$ with a given set of desired footholds encoded in the convex hulls (i.e., support polygons) $U\zeta (k)i$ for all agents $i\u2208V$. For a given initial state $x0\u2208M$ and final state $xf\u2208M$, the planning problem consists of finding an optimal augmented control input $u[k]$ in real-time that steers the interconnected LIP dynamics (7) from $x0$ to $xf$ subject to the constraints (8) and (9).

*Remark 3* (Computation of footholds). We remark that the desired footholds are computed at the beginning of the locomotion and are used during the locomotion to form the support polygons in Problem 1. One way to compute the desired footholds is as follows. We can first consider a straight line connecting the initial position of each agent to its final position in the horizontal plane. We then generate a sequence of footholds along this line via a proper step length.

To address Problem 1, we consider a supervisory predictive control that has access to the global positions (i.e., reduced-order sates) of all agents (i.e., *r _{i}* for $i\u2208V$) via a direct communication network [1]. We then extend the event-based MPC approach of [33]—that generates optimal trajectories for locomotion of a single agent—to address the motion planning problem for cooperative locomotion of multi-agent robots. In the proposed approach, the supervisory predictive control is solved at the event samples, taken at the beginning of continuous-time domains, to reduce the computational burden of the networked system.

*Remark 4* (Supervisory MPC). The supervisory predictive control can be either solved on one of the agents' onboard computers or all agents' computers. The first approach would result in a heterogeneous team with a leader, and the latter one would result in a homogeneous team. In the first approach, the supervisory MPC is solved at the beginning of each continuous-time domain for the leader. In contrast, in the second approach, the MPC is solved at the beginning of continuous-time domains for each agent. Although the first approach generally requires less computation burden than the second one, our numerical results show that both of these techniques are computationally tractable for cooperative locomotion of a team of legged robots with up to four agents. In particular, the computation time for the supervisory MPC in these techniques takes less than 1 (ms) (see Sec. 5 for details).

*k*+

*j*predicted at time

*k*, and $uk+j|k$ denotes the input of the LIP network (i.e., COPs) at time

*k*+

*j*computed at time

*k*. In addition, the Jacobian matrices and affine term are computed from Eq. (7) and updated at every event sample according to $A:=\u2202F\u2202x(x[k],u[k\u22121])\u2208\mathbb{R}4N\xd74N,\u2009B:=\u2202F\u2202u(x[k],u[k\u22121])\u2208\mathbb{R}4N\xd72N$, and $d:=F(x[k],u[k\u22121])\u2212A\u2009x[k]\u2212B\u2009u[k\u22121]\u2208\mathbb{R}4N$. An analogous technique can be used to estimate (9) as the following affine inequality:

where $Uk+Nh\u22121|k:=col(uk|k,\u2026,uk+Nh\u22121|k)\u2208\mathbb{R}2NNh$. Here, the terminal and stage cost functions are given by $p(xk+Nh|k):=||xk+Nh|k\u2212xk+Nh|kdes||P2$ and $\u2113(xk+j|k,uk+j|k):=||xk+j|k\u2212xk+j|kdes||Q2+||uk+j|k||R2$, respectively, for some positive definite matrices $P\u2208\mathbb{R}4N\xd74N,\u2009Q\u2208\mathbb{R}4N\xd74N$, and $R\u2208\mathbb{R}2N\xd72N$. In addition, $xk+j|kdes$ denotes a desired state trajectory and $||z||P2:=z\u22a4P\u2009z$. We remark that the supervisory event-based MPC can be translated into QP. Let $(xk+j|k\u22c6,uk+j|k\u22c6)$ denote the optimal solution over the control horizon. Then the optimal COM trajectory of the agents over one continuous-time domain (i.e., $xk+j|k\u22c6$ for $j=0,\u2026,ng\u22121$) will be utilized as the reference trajectory to be tracked by the low-level distributed controllers in Sec. 4. The MPC problem will be solved again at the beginning of the next continuous-time domain.

## 4 Distributed Controllers

The objective of this section is to present the low-level distributed controllers to impose the full-order dynamical models of individual agents to asymptotically track the optimal reduced-order trajectories prescribed by the supervisory predictive control while having feasible contact forces. Here, we extend the virtual constraints controller of Ref. [33] for the development of distributed controllers for multi-agent systems. More specifically, the distributed low-level controller formulates the I-O linearization problem as a QP that addresses the feasibility of each agent's individual GRFs at the contacting leg ends while tracking the optimal COM trajectories for cooperative locomotion and desired swing leg path of each agent. We remark that, unlike the supervisory MPC, the distributed low-level controller only considers the full-order dynamical model of each agent. Hence, it does not require the full state measurements of the other agents.

In this paper, each legged agent is assumed to consist of the 18-DOF quadrupedal robot Vision 60, manufactured by Ghost Robotics (Philadelphia, PA),^{2} plus a 4DOF Kinova arm (see Fig. 4) for the locomotion and manipulation purposes. The total mass of this augmented agent is 35 (kg). The DOFs on Vision 60 are composed of six underactuated DOFs for the absolute position and orientation of the robot plus 12 actuated DOFs associated with the legs. More specifically, each leg of the robot consists of six actuated DOFs for the hip roll, hip pitch, and knee joints. All DOFs of the Kinova arm are further assumed to be actuated. The detailed view of the joint arrangement and DOFs of the robot are represented in Fig. 4. In our notation, the local configuration vector and local control inputs (i.e., joint-level torques) for the agent $i\u2208V$ are denoted by $qi:=col(ptorso,i,\varphi torso,i,qbody,i)\u2208\mathbb{R}22$ and $\tau i\u2208\mathbb{R}16$, respectively, where $ptorso,i\u2208\mathbb{R}3$ and $\varphi torso,i\u2208\mathbb{R}3$ describe the absolute position and orientation of the torso for the agent *i* with respect to the inertial frame (see Fig. 4). The remaining portion, $qbody,i\u2208\mathbb{R}16$ then represents the body variables of the robot that form the shape of the robot. Finally, let $zi=col(qi,q\u02d9i)\u2208\mathbb{R}44$ and $Fi\u2208\mathbb{R}3\u2113c,i$ denote the local full-order states and contact forces at the leg ends of the agent. Here, $\u2113c,i$ represents the number of contacting legs with the ground.

*i*:

where $h0(qi)$ represents the set of holonomic quantities to be controlled, referred to as the controlled variables, and $hd,i(t)$ denotes the desired evolution of the controlled variables. The controlled variables, $h0(qi)$, are chosen as the orientation of the agent (i.e., roll, pitch, and yaw) together with its COM position, the Cartesian coordinates of the swing leg ends, and the Cartesian coordinates of the manipulator's EE in the inertial world frame. The desired evolution of the COM position in $hd,i(t)$ is defined as a Bézier polynomial [44] that passes through the discrete and optimal reduced-order trajectory generated by the supervisory predictive control. In particular, we consider a Bézier polynomial whose coefficients are solved via least-squares at the beginning of each domain such that the polynomial has the best fit to the optimal COM trajectory of the agent *i* for the current domain. For the swing leg ends, $hd,i(t)$ is taken as a Bézier foot trajectory in the task space starting from the previous foothold with zero velocity and ending at the next preplanned foothold with zero velocity. Finally, the desired evolution for the EE's Cartesian coordinates is chosen as the desired COM trajectory plus a constant vector representing the EE's position with respect to the torso.

*τ*, we consider the full-order and

_{i}*floating-base*dynamics of the agent

*i without*considering the interaction forces arising from manipulation. Although the low-level distributed controllers do not consider the interaction forces amongst the EE and objects for simplifying the controller synthesis, the full-order simulation models of the cooperative locomotion in Sec. 5 will consider these interacting forces to illustrate the validity of this assumption and robustness of the proposed control algorithms. We now consider the following local dynamics for the controller synthesis:

*i*, $Hi(qi,q\u02d9i)\u2208\mathbb{R}22$ represents the Coriolis, centrifugal, and gravitational forces of the agent

*i*, and $\u03d2i\u2208\mathbb{R}22\xd716$ is the input distribution matrix. Furthermore, $Jc,i(qi)\u2208\mathbb{R}3\u2113c,i\xd722$ represents the contact Jacobian matrix. For future purposes, the local dynamics (14) can be written in the state-space form as follows:

*y*in Eq. (13) along the full-order dynamics of the agent

_{i}*i*described in Eq. (15) results in the following output dynamics:

*K*and

_{P}*K*are positive definite matrices. Closed form expressions for the Lie derivatives can be found in Appendix A.2. In addition, the local controller assumes a rigid contact model between the stance leg ends of the agent and the ground. In particular, the acceleration of the stance leg ends is assumed to be zero which can be expressed as

_{D}*τ*, we are interested in solving for $(\tau i,Fi)$ subject to Eqs. (16) and (18) such that (1) the contact forces belong to the friction cone (i.e., $Fi\u2208FC$) while having feasible torques (i.e., $\tau min\u2264\tau \u2264\tau max$), and (2) the local torques are minimum two-norm. Hence, we setup the following set of distributed real-time QPs that can be solved at 1 kHz:

_{i}Here, *δ* is a defect variable added to the output dynamics (16) to guarantee the existence of a feasible solution in two different scenarios. (1) If the coefficient matrix loses rank at particular configurations, there may not be a pair of control torques and GRFs, i.e., $(\tau i,Fi)$, that satisfies (16) and (18). To tackle this issue, we introduce the defect variable *δ* to make the equality constraints feasible. (2) If the torques and GRFs, i.e., $(\tau i,Fi)$, do not belong to the admissible sets (i.e., the inequality constraints are violated), the defect variable *δ* can again help us to find a feasible solution. To reduce the effect of the defect variable *δ* on the output dynamics, we minimize its two-norm via a large weighting factor in the cost function. More specifically, the cost function (19) tries to minimize a weighted sum of the two-norms of the local torques and the defect variable, where $\omega >0$ is the weighting factor. We remark that using the defect variable *δ*, the output dynamics become $y\xa8i+KD\u2009y\u02d9i+KP\u2009yi=\u2212\delta (t)$, which is input-to-state stable [45]. Hence, if $\delta (t)$ remains bounded, the output profile *y*(*t*) will be also bounded. This will be analyzed more in the numerical simulations of Sec. 5. The optimal solutions of these QPs are finally denoted by $\tau i=\Gamma i(t,zi)$ for $i\u2208V$ and are employed as local whole-body motion controllers.

## 5 Numerical Simulations

The objective of this section is to numerically verify the effectiveness of the proposed hierarchical control algorithm for cooperative transportation of objects by a team of composite robotic agents as shown in Fig. 1. We study both reduced- and full-order coupled models of legged agents to show the stability of locomotion patterns for the closed-loop system. We further investigate the robustness of the closed-loop system in the presence of unknown payloads and uncertainty in the ground height profile.

• *Control parameters:* We consider the cooperative locomotion of two and three agents with trot gaits including start and stop domains. We have observed that for every sampling time *T _{s}* in $[60,80]$ (ms) with

*n*= 4 grids per domain, the proposed control scheme can stabilize the locomotion patterns. Here, we choose

_{g}*T*= 80 (ms). The control horizon for the supervisory predictive control is taken as $Nh=nhng=4$ which considers one domain ahead. The other parameters for the supervisory predictive control are tuned as $P=103\u2009I4N\xd74N,\u2009Q=I4N\xd74N,\u2009R=10\u221210I2N\xd72N$ which stabilize the cooperative motion. We have numerically observed that for $l\u2208[0.35,0.55]$ (m), the robots behave safely, and the joint-level torques remain in an acceptable range. For the purpose of this paper, we choose

_{s}*l*=

*0.5 (m). The friction coefficient is assumed to be $\mu =0.6$. The supervisory predictive control is solved in an event-based manner (i.e., at the beginning of each domain), that is approximately every $ngTs=0.32$ s. Analogous to Ref. [33, Remark 1], we make use of a sparse QP structure to effectively solve the MPC (12). We can show that the number of decision variables for the sparse QP are $8NNh$ and $10NNh$ during the middle and start/stop domains, respectively.*

• *Reduced-order coupled models*: The evolution of the COM and COP for forward trot gaits of the individual agents in the interconnected LIP dynamics with *N *=* *2 and *N *=* *3 agents is depicted in Figs. 5(a)–5(b) and Figs. 5(c)–5(e), respectively. Here we make use of matlab for simulating the interconnected LIP dynamics (7) subject to the holonomic constraints and the supervisory predictive control (12). The initial configurations of the LIP models are chosen as $r1[0]=(0,0)\u22a4(m)$ and $r2[0]=(0,1)\u22a4(m)$ for *N *=* *2 agents. Moreover, the initial positions of the LIP models are taken as $r1[0]=(0.2,1)\u22a4(m),\u2009r2[0]=(0,0)\u22a4(m)$, and $r3[0]=(\u22120.5,\u22121)\u22a4(m)$ for *N *=* *3 agents. The step length for *N *=* *2 and *N *=* *3 is chosen as $(0.05,0)(m)$ and $(0.03,0)(m)$ in $\mathbb{R}2$, respectively. The target points are taken as the geometric center of the contact points in the last (i.e., stop) domain. Convergence to the target points with different number of agents and after *m _{d}* = 20 continuous-time domains is clear.

• *Full-order coupled models*: Next, we study the full-order complex model of cooperative locomotion with the proposed hierarchical control algorithm over *m _{d}* = 50 domains in RaiSim [46]. Here, we assume massless bars to be carried by the EEs of Vision 60 agents augmented with Kinova arms as shown in Fig. 1. The contact between the bar and the EE of Kinova arm is considered as a point contact. Based on this contact condition, the wrench between the object and EE of the arm only consists of the interaction forces. The QP arising from the supervisory predictive control is solved with qpSWIFT [47]. The average computation time of the higher-level QP on a laptop computer with an Intel

^{®}Core™ i7-10750H CPU 2.60 GHz and 16 GB RAM is 0.35 (ms) and 0.59 (ms) for

*N*=

*2 and*

*N*=

*3 agents, respectively. The distributed and low-level controllers of Eq. (19) are also solved with qpSWIFT in 1 kHz and the weighting factor*

*ω*is chosen as 10

^{7}.

The numerical simulation results for the stable cooperative locomotion of robots are provided in Figs. 6 and 7. Figures 6(a)–6(d) and 7(a)–7(c) illustrate the evolution of the virtual constraints and torque inputs for the individual agents during collaborative and forward trot gaits with *N *=* *2 and *N *=* *3 agents, respectively. Here, the speed of cooperative locomotion for two and three agents is 0.15 (m/s) and 0.1 (m/s). From these figures, we observe that the control inputs (joint-level torques) for all agents are bounded. In addition, the outputs (i.e., virtual constraints) remain bounded during the cooperative locomotion. We remark that Figs. 6 and 7 depict the first three components of the virtual constraints (i.e., *y _{x}*,

*y*, and

_{y}*y*) that represents the COM tracking error. In particular, these figures show that the COM of the full-order dynamical model of each agent tracks the optimal and reduced-order COM trajectory generated by the supervisory MPC. We also remark that the range of the control inputs (torques) is bounded between –5 (N·m) and 5 (N·m) by the low-level nonlinear controller. Finally, the control inputs in Figs. 6 and 7 depict the motor torques

_{z}*before*the gearbox system. Figure 8 depicts the evolution of the two-norm of the defect variable

*δ*for the cooperative locomotion of two agents. From this figure, we observe that

*δ*remains very small.

• *Robustness analysis*: To demonstrate the robustness of the proposed control algorithm against uncertainties, we assume that the mass of the bars between the adjacent agents' EEs is increased to 0.5 (kg) which is *unknown* for the controller. We further assume that additional unknown payloads of 20 (kg), 34(kg), and 48(kg) are cooperatively transported by two, three, and four agents on their torsos, respectively. Figures 9(a)–9(d) and Figs. 10(a)–10(c) depict the evolution of the virtual constraints and torque inputs for *N *=* *2 and *N *=* *3 agents, respectively. Furthermore, Figs. 11(a)–11(d) shows the evolution of the virtual constraints and torque inputs for *N *=* *4 agents. From these figures, it is observed that the control inputs (i.e., torques) and outputs (i.e., virtual constraints) remain bounded during the cooperative locomotion with uncertainties. Hence, the proposed control algorithm is capable of addressing the uncertainty arising from the payload mass. To demonstrate the effectiveness of the proposed control algorithms based on the interconnected LIP dynamics, we study the same numerical simulations with the MPC control algorithm of Ref. [33], in which the MPC is designed for individual robots without considering the interconnected LIP dynamics. Snapshots of the simulation results for cooperative locomotion of agents with and without the proposed approach of this paper are depicted in Figs. 12–14 to visualize the successes and failures. It is clear that the agents *cannot* have robustly stable cooperative locomotion while using their own MPC without considering the interconnected LIP dynamics.

To show the robustness of the controller against the change in the ground height profile, we study the cooperative locomotion with *N *=* *2 and *N *=* *3 agents on uneven terrain. Here, we assume that the ground height profile changes in a random manner in the discrete set ${\xb11,\xb12}$ (cm). The evolution of the virtual constraints and torque inputs together with the convergence of the robots to the target points is depicted in Figs. 15 and 16 for two and three agents, respectively. From the figures, it is observed that the control inputs and outputs remain bounded during the cooperative locomotion. Figures 17 and 18 depict the snapshots of the cooperative locomotion patterns with the proposed control algorithm. In addition, Figs. 17 and 18 compare the robustness and performance of the proposed control solutions with the individual MPC algorithms that do not consider the interaction forces for the path planning. Animations of these simulations can be found online.^{3}

## 6 Discussion

The numerical simulations of the reduced- and full-order models show the effectiveness of the proposed supervisory predictive control algorithm in generating stable cooperative locomotion patterns for multi-agent legged robots. The proposed hierarchical control algorithm developed based on the interconnected LIP dynamics allows robustly stable cooperative locomotion of multi-agent legged robots subject to holonomic constraints whereas the same legged machines cannot perform stable cooperative locomotion patterns without the proposed algorithm. The objective of this section is to analyze the results and to discuss the limitations of the proposed control approach.

### 6.1 Robustness Against Uncertainties.

The developed control algorithm enhances the level of robustness of the coupled full-order dynamical systems against uncertainties arising from the unknown payloads as well as ground height profile changes. Here, we make use of the number of steps that the robots can take as metrics to evaluate the performance of the closed-loop system. In particular, success occurs if the robots can reach the target points in a specified number of domains, denoted by *m _{d}*. Otherwise, it is a “failure (instability)” (e.g., the robots may fall before reaching the target point or in a number of domains less than

*m*).

_{d}*6.1.1 Unknown Payloads.* For the payload simulations, we consider cooperative locomotion over *m _{d}* = 50 continuous-time domains. The objective is to evaluate the performance of cooperative locomotion for a team of $N\u2208{2,3,4}$ agents. As described in Sec. 5 and Figs. 9–14, the developed control algorithms allow transporting unknown and much heavier objects than the maximum payload of a single agent (i.e., 12 (kg)). More specifically, legged corobots can cooperatively carry the payloads and arrive at the target positions in 50 domains. In contrast, the same interconnected agents without the supervisory MPC cannot depart from the initial positions due to the lack of consideration of the interaction forces at the planner level. Our numerical studies show that the cooperative system with the supervisory MPC algorithm can transport 20 (kg), 34 (kg), and 48 (kg) with two, three, and four agents, respectively. In other words, the agents can transport

*unknown*payloads whose masses are up to 57%, 97%, and 137% of a singles agent's mass with a team of two, three, and four legged corobots.

In addition to the payloads mentioned above on the agents' torso, we consider payloads amongst the EEs in Sec. 5. The maximum weight for this load follows the Kinova arm's payload limitation (i.e., 0.5 (kg)). If we do not consider this limitation, our numerical simulations show that the proposed control algorithm can transport payloads of 4.5 (kg), 9 (kg), and 13.5 (kg) between EEs with a team of two, three, and four corobots, respectively.

*6.1.2 Varying Ground Height Profiles.* For this set of simulations, we study cooperative locomotion on uneven terrains over *m _{d}* = 100 continuous-time domains. The objective is to evaluate the performance of cooperative locomotion for a team of $N\u2208{2,3}$ agents. Our numerical studies show that the proposed control algorithm can result in stable cooperative locomotion on unknown terrains with ground height changes in the discrete set ${\xb11,\xb12}$ (cm). In particular, we simulated 100 different ground height profiles with discontinuities within the above set. In all of these simulations, the agents can successfully reach the final target in the specified number of domains. In contrast, the interconnected system without the supervisory MPC algorithm always fails on these terrains.

### 6.2 Limitations and Analysis of Results

*6.2.1 Linearization of the Interconnected Linear Inverted Pendulum Dynamics.* The interconnected LIP dynamics in Eq. (6) are nonlinear. In order to formulate a convex optimal control problem, the supervisory control algorithm first linearizes the dynamics, and then solves an MPC problem for the linearized dynamics. Our numerical simulations in Fig. 5 depict the behavior of the nonlinear dynamics subject to the supervisory MPC. From this figure, we observe that the optimal control problem, formulated for the linearized dynamics, can stabilize the target points for the original and interconnected LIP dynamics. In particular, the states of the nonlinear system remain bounded and asymptotically converge to the target points. We also remark that the supervisory MPC does not use a constant Jacobian linearization for the entire period of locomotion. Instead, it linearizes the dynamics around the current point at the beginning of each continuous-time domain. This makes the linearization error zero (i.e., resets it) at the beginning of each domain, which in turn reduces the gap between the states of the linearized and nonlinear dynamics. Furthermore, the adequacy of this linearization technique is validated in the full-order and complex models of cooperative locomotion. In particular, the virtual constraint plots in Figs. 6 and 7 show that the actual COM positions of the agents follow the desired COM trajectories, based on the linearized dynamics, and the error remains bounded. For future research, we will investigate nonlinear MPC algorithms that can address the path planning problem for the interconnected reduced-order models without linearization.

*6.2.2 Limitations of the Reduced-order Linear Inverted Pendulum Dynamics.* In this paper, we make use of the LIP dynamics to form the interconnected reduced-order network. One of the limitations of the LIP model is that it cannot capture moments about the COM. Hence, the arms model is not used in the interconnected LIP dynamics as the forces generated by the grippers due to the holonomic constraints can create moments about the COMs. Consequently, the holonomic constraints are represented amongst the COMs in the reduced-order model. However, we remark that the actual holonomic constraints are imposed between the EEs in the full-order simulations of Sec. 5. In the numerical simulations of Sec. 5, the arm joints are not locked. However, we define some virtual constraints to control the EE's Cartesian coordinates in the task space. The desired trajectory for the EE's position is taken as the desired COM trajectory, generated by the supervisory MPC, plus a constant value that represents the relative motion of the EE with respect to the body. Hence, the EE's relative motion with respect to the body becomes almost static. Consequently, the process of initiating the grasping motion is not addressed in this work. This can limit the general problem of loco-manipulation during cooperative locomotion of multi-agent robots. For future research, we will investigate alternative networks of reduced-order models that can be integrated with simple arm models for manipulation purpose.

The alternative limitation of the LIP model is that it cannot address dynamic locomotion. Furthermore, the height of the COM is assumed to be constant in the LIP dynamics that can limit locomotion on rough terrains. This motivates the use of alternative reduced-order models for future research. However, this may also increase the complexity of the reduced-order models and the computational burden of the supervisory MPC.

*6.2.3 Considerations for the Real-World Implementation*. The proposed controllers of this work assume that the agents can share their “reduced-order” states for the path planning purpose. In particular, the higher level of the control algorithm (i.e., supervisory MPC) is assumed to have access to all the reduced-order states of the agents (i.e., positions of the COMs). This can be realized via a direct communication network [1]. However, the low-level controllers for the whole-body motion control are distributed and do *not* require full-order state sharing. Our recent preliminary work [48, Chap. 4] shows that the QP-based low-level controllers can stabilize the locomotion of single-agent legged robots in practice. We will experimentally evaluate the performance of the proposed controllers for multi-agent legged robots in future work.

*6.2.4. Dynamic Locomotion:* The numerical simulations of this paper have shown that the interconnected LIP dynamics and the event-based supervisory MPC are sufficient to have robustly stable cooperative locomotion of legged robots. Although state-of-the-art single quadrupedal robots have dynamic gaits, the nature of single-agent locomotion and collaborative locomotion of multi-agent systems for cooperative transportation is completely different. Hence, the proposed control algorithms are validated for cooperative locomotion with quasi-static gaits but not dynamic gaits. The developed control approach would likely need to be altered to address more agile locomotion patterns in complex environments for future work.

## 7 Conclusion

This paper presented a hierarchical nonlinear control algorithm for the real-time planning and control of legged robots that collaboratively carry objects. We presented an innovative network of reduced-order models subject to holonomic constraints, referred to as the interconnected LIP dynamics, to address the motion planning problem of cooperative locomotion. The properties of the interconnected LIP dynamics were studied to formulate a supervisory control as the higher-level planner in the proposed control algorithm. The supervisory control is formulated as an event-based predictive control to steer the interconnected LIP dynamics subject to the feasibility of the net GRFs of individual agents. At the lower level of the proposed control scheme, distributed nonlinear controllers, based on QP and virtual constraints, were developed to impose the full-order dynamical model of each agent to asymptotically track the optimal reduced-order trajectories, prescribed by the supervisory predictive control, while having feasible contact forces at the leg ends. The effectiveness and robustness of the proposed nonlinear control scheme were demonstrated and investigated via full-order numerical simulations of a team of two, three, and four collaborative quadrupedal robots, each with a total of 22DOFs, while carrying different objects in the presence of uncertainties.

In this work, we considered path graphs to describe the shape of the objects to be carried. For future work, we will investigate more sophisticated shapes and graphs subject to holonomic constraints. In addition, we will investigate the design of distributed predictive controllers at the higher level of the proposed control scheme to reduce the computational burden further. This work also focused on the cooperative locomotion of legged robots, and the manipulators were used to enable the agents' holonomic constraints. Hence, the robotic manipulators' grasping mechanism was not studied. For future work, we will investigate alternative networks of reduced-order models that can represent the interaction wrenches for the loco-manipulation purpose. Furthermore, we will extend the desired foothold planning framework to more agile cooperative locomotion in aggressive environments.

## Acknowledgment

The content is solely the responsibility of the authors and does not necessarily represent the official views of the NSF.

## Funding Data

Division of Civil, Mechanical and Manufacturing Innovation, National Science Foundation (NSF) (Grant No. CMMI-1923216; Funder ID: 10.13039/100000147).

Division of Electrical, Communications and Cyber Systems, National Science Foundation (NSF) (Grant No. ECCS-1924617; Funder ID: 10.13039/100000148).

## Nomenclature

- A, B, d =
Jacobian linearization of the interconnected LIP dynamics

- $D(.),\u2009H(.)$ =
mass-inertia matrix, Coriolis, centrifugal, and gravitational terms

- $F$ =
discrete-time dynamics for the network of LIPs

- F
_{i}= ground reaction forces on the agent i

- $Gw,\u2009Vw,\u2009Ew$ =
directed graph of the locomotion pattern, vertices set, and edges set

- $h0(qi),\u2009hd,i(t)$ =
controlled variables for the agent i, and desired evolution of the controlled variables for the agent i

- $\u2113c,i$ =
number of the contacting legs with the ground for the agent i

- $LgiLfi,LwiLfi,Lfi2$ =
lie derivatives along the full-order floating-base dynamics of the agent i

- m
_{d}, n_{g}= total number of continuous-time domains, Number of grid points for each continuous-time domain

- N =
total number of agent

- n
_{h}, N_{h}= number of continuous-time domains to plan over, control horizon

- p
_{i}= Cartesian coordinates of the stance leg ends of the agent i

- $ptorso,i$ =
absolute position of the torso of the agent i with respect to the inertial frame

- q
_{i}= configuration vector of the agent i consisting of $ptorso,i,\u2009\varphi torso,i$, and $qbody,i$

- $qbody,i$ =
shape (body) variables of the agent i

- r
_{i}, r = Cartesian coordinates of the COM of the agent i, Column vector consisting of all ris

- T
_{s}= time step between grid points known as the sampling time

- u
_{i}, u = Cartesian coordinates of the COP of the agent i, Column vector consisting of all uis

- $V$ =
vertices set of the open path graph for the interconnected LIP dynamics

- x =
augmented state vector of the interconnected LIP dynamics consisting of r and $r\u02d9$

- y
_{i}= local outputs of the agent i

- z
_{i}= full state variables of the agent i consisting of qi and $q\u02d9i$

- δ =
defect variable for the low-level QP

- $\zeta (k),\u2009U\zeta (k)i$ =
current domain indicator function for the sample time k, and Convex hull of the contacting points with the ground for the agent i

- λ
_{ij}, $\lambda $ = Lagrange multipliers between agents i and j, Column vector consisting of independent λijs

- τ
_{i}= local torque inputs for the agent i

- $\varphi torso,i$ =
absolute orientation of the torso of the agent i with respect to the inertial frame

- ω =
weighting factor of the low-level QP

- $\u03d2(.),\u2009Jc(.)$ =
input distribution matrix, and Contact Jacobian matrix

## Footnotes

### Appendix

##### A.1 Proof of Theorem 1.

where in the fifth line, we have made use of the norm property, that is, $ek,k+1\u22a4ek+1,k+2\u2009\alpha k\u2009\alpha k+1\u2264||ek,k+1||||ek+1,k+2|||\alpha k||\alpha k+1|$. We remark that the last result in Eq. (A1) is indeed positive. To clarify this point, let us assume that the $|\alpha k|=||ek+1,k+2||||ek,k+1|||\alpha k+1|$ for all $k=1,\u2026,N\u22122$. Then $\u2211k=1N\u22122(||ek,k+1|||\alpha k|\u2212||ek+1,k+2|||\alpha k+1|)2=0$. However, because of the term $||e12||2\alpha 12+||eN\u22121,N||2\alpha N\u221212$, the quadratic function $\alpha \u22a4\Lambda N\u2009\alpha $ is strictly positive which completes the proof.▪