Abstract
External and internal convertible (EIC) form-based motion control is one of the effective designs for simultaneous trajectory tracking and balance of underactuated balance robots. Under certain conditions, however, the EIC-based control design is shown to lead to uncontrolled robot motion. To overcome this issue, we present a Gaussian process (GP)-based data-driven learning control for underactuated balance robots with the EIC modeling structure. Two GP-based learning controllers are presented by using the EIC property. The partial EIC (PEIC)-based control design partitions the robotic dynamics into a fully actuated subsystem and a reduced-order underactuated subsystem. The null-space EIC (NEIC)-based control compensates for the uncontrolled motion in a subspace, while the other closed-loop dynamics are not affected. Under the PEIC- and NEIC-based controls, the tracking and balance tasks are guaranteed, and the convergence rate and bounded errors are achieved without the uncontrolled motion caused by the original EIC-based control. We validate the results and demonstrate the GP-based learning control design using two inverted pendulum platforms.
1 Introduction
An underactuated balance robot possesses fewer control inputs than the number of degrees-of-freedom (DOFs) [1,2]. Motion control of underactuated balance robots requires both the trajectory tracking of the actuated subsystem and balance control of the unactuated, unstable subsystem [3–5]. Inverting the nonminimum phase unactuated nonlinear dynamics brings additional challenges in causal feedback control design. Several modeling and control methods have been proposed for these robots and their applications [4–10]. Orbital stabilization methods were used for balancing underactuated robots [1,11–13], with applications to bipedal robots [14] and cart-inverted pendulums [1]. Energy shaping-based control was also designed for underactuated balance robots [15,16]. One feature of those methods is that the achieved balance-enforced trajectory is not unique and cannot be prescribed explicitly [1,11–13]. In Refs. [5] and [17], a simultaneous trajectory tracking and balance control of underactuated balance robots was proposed by using the property of the external and internal convertible (EIC) form of the robot dynamics. The EIC-based control has been demonstrated as one of the effective approaches to achieve fast convergence with guaranteed performance.
The above-mentioned control designs require an accurate model of the robot dynamics, and the control performance deteriorates under model uncertainties or external disturbances. Machine learning-based methods provide an efficient tool for robot modeling and control [18,19]. In particular, Gaussian process (GP) regression is an effective learning approach that generates a nearly analytical structure and bounded prediction errors [7,19–21]. The development of GP-based performance-guaranteed control for underactuated balance robots has been reported in Refs. [4], [20], and [22]. In Ref. [4], the control design was conducted in two steps: a GP-based inverse dynamics controller was designed for the unactuated subsystem to achieve balance, and a model predictive control (MPC) was used to simultaneously track the given reference trajectory and estimate the balance equilibrium manifold (BEM). The GP prediction uncertainties were incorporated into the control design to enhance the control robustness. The work in Ref. [5] followed the sequential control design in the EIC-based framework, and the controller was adaptive to the prediction uncertainties. The training data were selected to reduce the computational complexity.
This work takes advantage of the structured GP modeling approach in Refs. [5] and [7] and presents an integration of EIC-based control with GP models. We first present the conditions under which uncontrolled motions exist under the original EIC-based control design for underactuated balance robots. We identify these conditions and design the stable GP-based learning control with the properly selected nominal robot dynamic model. Two different controllers, called partial- and null-space-EIC (i.e., PEIC- and NEIC), are presented to improve the closed-loop performance. The PEIC-based control constructs a virtual inertia matrix to reshape the dynamics coupling between the actuated and unactuated subsystems. The EIC-induced uncontrolled motion is eliminated, and the robotic system behaves as a combined fully actuated subsystem and a reduced-order unactuated subsystem. Alternatively, the compensation effect in the NEIC-based control is applied to the uncontrolled coordinates in the null space, while the other part of the stable system motion stays unchanged. The PEIC- and NEIC-based controls achieve guaranteed robust performance with a fast convergence of the closed-loop tracking errors.
The control tasks considered in this work include both the trajectory tracking for the actuated subsystem and platform balance for the unstable subsystem. The interconnection between these two subsystems lies in an implicit dynamic relationship that needs to be estimated in real time. The control problem considered here differs from existing work in the literature. Most existing approaches, such as orbital stabilization and energy shaping, focus on stabilization only, that is, the trajectory of the actuated subsystem is not prescribed, and the main control task is to stabilize the unstable subsystem. The main contribution of this work lies in the new GP-based learning control of underactuated balance robots using the EIC structural properties. Compared with the approaches in Refs. [5] and [17], this work reveals underlying design properties and limitations of the original EIC-based control for underactuated balance robots. Compared with the work in Refs. [4] and [23], the proposed method takes advantage of the attractive EIC modeling properties for control design and does not use MPC, which is computationally demanding. Compared with other learning control methods such as reinforcement learning, the proposed control integrates the robot's dynamics property (i.e., EIC structure) and the GP-based model learning. By integrating physics knowledge into model learning, we identify the conditions for nominal model selection, and the proposed control is designed with guaranteed performance. This paper is an extension of the previous conference submission [24] with new design, analysis, and experiments. In particular, the NEIC-based control design and experiments were not presented in Ref. [24].
The rest of the paper is outlined as follows. We introduce the EIC-based control and present the problem statement in Sec. 2. Section 3 presents the GP-based robot dynamics. The PEIC- and NEIC-based controls are presented in Sec. 4. The stability analysis is discussed in Sec. 5. The experimental results are presented in Sec. 6, and finally Sec. 7 summarizes the concluding remarks.
2 External and Internal Convertible-Based Robot Control and Problem Statement
2.1 Robot Dynamics and External and Internal Convertible-Based Control.
for actuated () and unactuated () subsystems, respectively. Subscripts “aa (uu)” and “ua (au)” indicate the variables related to the actuated (unactuated) coordinates and coupling effects, respectively. For presentation convenience, we introduce , and , and the dependence of D, C, and G on q and is dropped. Subsystems and are also referred to as the external and internal subsystems, respectively [4,17].
The control goal is to steer the actuated coordinate to follow a given desired trajectory for , while the unactuated, unstable subsystem is balanced at an unknown equilibrium . Therefore, we need to estimate in real time to simultaneously achieve trajectory tracking (for ) and platform balance (for ). Note that not every trajectory can be followed given the underactuated dynamics and balance requirement. Such a property has been explicitly discussed for the autonomous bikebot example in Ref. [25]. In this work, we assume that the given trajectory is well planned and the control exists. Designing and planning feasible trajectories is beyond the scope of this work.
where . is obtained by inverting . Obtaining requires accurate system dynamics and needs to invert the nonminimum phase dynamics , which is challenging for noncausal control design.
where is used as the virtual control input in , that is, under .
Figure 1(a) illustrates the above sequential EIC-based control design. It has been shown in Ref. [17] that the control guarantees both and convergence to a neighborhood of the origin exponentially if the high-order approximation terms of the closed-loop systems are affine with error e. Therefore, the EIC-based control achieves trajectory tracking for and balancing task for simultaneously.
2.2 Motion Property Under External and Internal Convertible-Based Control.
Control design (5) uses a mapping from low-dimensional (m) to high-dimensional (n) spaces (i.e., ). Under control (6) with properly selected control gains, it has been shown in Ref. [17] that there exists a finite time T > 0, and for small number for t > T. Therefore, given the negligible error, we obtain .
where and are unitary orthogonal matrices. and with singular values . We partition V into the block matrix and . Since , the null space of is .
where , and . Note that still serves as a complete set of generalized coordinates for . Using the new coordinate , we have the following motion property under the original EIC-based control for , and the proof is given in Appendix A1.
No control input appears for coordinates in as shown in Eq. (9b), and only m actuated coordinates in are under active control, as shown in Eq. (9a). The results in Lemma 1 reveal the motion property of under the original EIC-based control design. The uncontrolled motion occurs for a special class of underactuated balance robots under the conditions in Lemma 1. If the unactuated motion is only related to m (out of n) control inputs, the motion (9b) vanishes, and the EIC-based control works well. In Ref. [5], the EIC-based control worked properly for the rotary inverted pendulum with . In Refs. [4] and [25], the EIC-based control also worked well for the bikebot with n = 2 (planar motion) and m = 1 (roll motion), because the roll motion depends on the steering control only (i.e., not on the velocity control) and therefore does not satisfy the conditions of Lemma 1. In Sec. 6, we show an example of the three-link inverted pendulum platform that demonstrates the uncontrolled motion under the original EIC-based control.
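The null-space partition used in this section can be illustrated numerically. The following is a minimal sketch, assuming a generic m × n coupling matrix `A` with n > m (a hypothetical stand-in, not the paper's specific matrix); it computes the singular value decomposition, splits the right singular vectors into a row-space block and a null-space block, and checks that the null-space block is annihilated by `A`.

```python
import numpy as np

def null_space_partition(A, tol=1e-10):
    """Split the right singular vectors of an m x n matrix into
    a row-space block V_r and a null-space block V_n."""
    # full_matrices=True returns the complete n x n matrix of right singular vectors
    U, s, Vt = np.linalg.svd(A, full_matrices=True)
    V = Vt.T
    rank = int(np.sum(s > tol))
    V_r = V[:, :rank]   # directions that A maps to nonzero outputs
    V_n = V[:, rank:]   # directions that A maps to zero (null space)
    return U, s, V_r, V_n

# Hypothetical example: m = 1 row, n = 2 columns
rng = np.random.default_rng(0)
A = rng.standard_normal((1, 2))
U, s, V_r, V_n = null_space_partition(A)

# Any motion along V_n is invisible through A
residual = np.linalg.norm(A @ V_n)
```

In this sketch, a command mapped through `A` can only influence the directions in `V_r`; motion along `V_n` receives no contribution, which mirrors the uncontrolled coordinates discussed above.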
With the above-discussed motion property under the EIC-based control, we consider the following problem.
Problem Statement: The goal of robot control is to design an enhanced EIC-based learning control to drive the actuated coordinate to follow a given profile and simultaneously the unactuated coordinate to be stabilized on the estimated . The uncontrolled motion presented in Lemma 1 should be avoided for robot dynamics (2).
3 Gaussian Process-Based Robot Dynamics Model
We build a GP-based robot dynamics model that will be used for control design in Sec. 4.
3.1 Gaussian Process-Based Robot Dynamics Model.
where for i = j, and are hyperparameters.
We build GP models to estimate , where and are for and , respectively. The training data are sampled from as and .
To quantify the GP prediction error, the following property for is obtained directly from Theorem 6 in Ref. [26].
where denotes the probability of an event, , and its ith entry is . A similar conclusion holds for with .
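To make the GP prediction step concrete, the following is a minimal sketch of GP regression with a predictive mean and variance. It assumes the common squared-exponential kernel with an additive noise term on the diagonal; the hyperparameter names `sigma_f`, `length`, and `sigma_n`, as well as the one-dimensional toy data, are illustrative assumptions and not the values used in this work.

```python
import numpy as np

def se_kernel(X1, X2, sigma_f=1.0, length=1.0):
    """Squared-exponential kernel matrix between the rows of X1 and X2."""
    d2 = (np.sum(X1**2, axis=1)[:, None]
          + np.sum(X2**2, axis=1)[None, :]
          - 2.0 * X1 @ X2.T)
    return sigma_f**2 * np.exp(-0.5 * np.maximum(d2, 0.0) / length**2)

def gp_predict(X, y, X_star, sigma_f=1.0, length=1.0, sigma_n=0.1):
    """Posterior mean and latent-function variance at the test inputs X_star."""
    # Noise enters only on the diagonal (i = j) of the training covariance
    K = se_kernel(X, X, sigma_f, length) + sigma_n**2 * np.eye(len(X))
    K_s = se_kernel(X, X_star, sigma_f, length)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = sigma_f**2 - np.sum(v**2, axis=0)
    return mean, var

# Toy data (hypothetical): noisy samples of sin(x)
rng = np.random.default_rng(1)
X = rng.uniform(-3.0, 3.0, size=(50, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.standard_normal(50)

# One test point inside the data range and one far outside it
X_star = np.array([[0.5], [10.0]])
mean, var = gp_predict(X, y, X_star, sigma_n=0.05)
```

The predictive variance is small near the training data and approaches the prior variance far from it. A variance-dependent control gain, as used later in the paper, would therefore act more conservatively in poorly sampled regions.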
3.2 Nominal Model Selection.
The nominal model plays an important role in the EIC control. We consider the following conditions for choosing the nominal model to overcome the uncontrolled motion under the learning control.
: is positive definite, , where constants ;
: ; and
: nonconstant kernel of .
With and , the generalized inversions of , and exist, which are used to compute the auxiliary controls. We can select to ensure . To see the requirement of , we rewrite . By Eq. (9), under the updated control , where is the ith column of V. Note that the part of dynamics is free of control if V is constant. Although is stabilized on converges to only in an m-dimensional subspace and the other dimensional motion uncontrolled. If the system is stable, the uncontrolled motion cannot be fixed in the configuration space throughout the entire control process. Therefore, a nonconstant kernel is needed.
Conditions – provide sufficient nominal model selection criteria. The commonly used nominal model in Refs. [5] and [7] is with . The constant nominal model is used in Ref. [7] as the system is fully actuated. It is not difficult to satisfy the nominal model conditions in practice. First, the nonlinear term is canceled by feedback linearization, and can be used. Matrix captures the robots' inertia property. The mass and length of robot links are usually available or can be measured. Meanwhile, the dynamics coupling for revolute joints shows up in the inertia matrix as trigonometric functions of the relative joint angles. Therefore, the diagonal elements can be filled with mass or inertia estimates, and the off-diagonal entries can be constructed with trigonometric functions multiplying inertia constants.
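The construction described above can be sketched as follows. This is an illustrative example only: the inertia constants and coupling coefficients are hypothetical, and the check covers only positive definiteness and configuration dependence, not the complete set of nominal model conditions.

```python
import numpy as np

def nominal_inertia(q, diag=(0.05, 0.02, 0.01), c12=0.005, c13=0.002, c23=0.004):
    """Hypothetical nominal inertia matrix for a three-joint robot.

    Diagonal entries are rough inertia estimates; off-diagonal entries are
    inertia constants multiplied by cosines of relative joint angles.
    """
    q = np.asarray(q, dtype=float)
    D = np.diag(np.asarray(diag, dtype=float))
    D[0, 1] = D[1, 0] = c12 * np.cos(q[1])
    D[0, 2] = D[2, 0] = c13 * np.cos(q[1] + q[2])
    D[1, 2] = D[2, 1] = c23 * np.cos(q[2])
    return D

# Check positive definiteness over randomly sampled configurations
rng = np.random.default_rng(0)
samples = rng.uniform(-np.pi, np.pi, size=(1000, 3))
min_eig = min(np.linalg.eigvalsh(nominal_inertia(q)).min() for q in samples)

# The model is configuration dependent (nonconstant)
D_a = nominal_inertia([0.0, 0.0, 0.0])
D_b = nominal_inertia([0.0, 0.5, -0.3])
```

With these hypothetical constants the matrix is strictly diagonally dominant, so it stays positive definite for every configuration while its off-diagonal coupling varies with the joint angles.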
4 Gaussian Process-Enhanced External and Internal Convertible-Based Control
In this section, we propose two enhanced controllers using the GP model , i.e., PEIC- and NEIC-based control. The PEIC-based control aims to eliminate uncontrolled motion under the original EIC-based control by reassigning the dynamics coupling, while the NEIC-based control directly manages the uncontrolled motion in a transformed space; see Figs. 1(b) and 1(c).
4.1 Robust Auxiliary Control.
where and are control gains with parameters . The variance of GP prediction captures the uncertainty in robot dynamics and is updated online with sensor measurements.
where is the unactuated subsystem tracking error relative to the estimated BEM. Similar to and depend on with the parameters by .
for constants , where denotes the eigenvalue operator.
Note that , and is overactuated given . If depends on the same number of control inputs, column vectors in should be zero. Thus, the EIC-based control is applied between the same number of actuated and unactuated coordinates. The uncontrolled motion is avoided.
4.2 Partial External and Internal Convertible-Based Control Design.
where , and . Apparently, is virtually independent of , and the dynamics coupling exists only between and .
where . Clearly, the unactuated subsystem only depends on (or ) under the PEIC design as illustrated in Fig. 1(b). The following lemma presents the qualitative assessment of the PEIC-based control, and the proof is given in Appendix A2.
Lemma 3. If conditions to are satisfied and is stable under the EIC-based control design, is stable under the PEIC-based control.
4.3 Null-Space External and Internal Convertible-Based Control Design.
where is the control design that drives pai to , and is the transformed reference trajectory. The design of drives to the origin in . A straightforward yet effective design of can be , where . Compared to the PEIC-based control, plays a role similar to that of the coordinates. In the new coordinates, is associated with only.
The following result gives the property of the NEIC-based control, and the proof is given in Appendix A3.
Lemma 4. For , if satisfies conditions to and is stable under the original EIC-based control, under the NEIC-based control is also stable. Meanwhile, is unchanged compared to that under the EIC-based control.
The proofs of Lemmas 3 and 4 show that the inputs and follow the control design guidelines. Both the PEIC- and NEIC-based controllers preserve the structured form of the EIC design. Figures 1(b) and 1(c) illustrate the overall flowchart of the PEIC- and NEIC-based control design, respectively. To take advantage of the EIC-based structure, we follow the design guideline to make sure that motion of unactuated coordinates only depends on m inputs in configuration space (PEIC-based control) or transformed space (NEIC-based control). The input is re-used for uncontrolled motion under the NEIC-based control. The PEIC-based control assigns the balance task to a partial group of the actuated coordinates.
5 Control Stability Analysis
5.1 Closed-Loop Dynamics.
Obtaining BEM with Eq. (17) under is equivalent to inverting Eq. (21c). Thus, . Substituting the above equation into the dynamics yields , where and denotes the higher order terms.
with , and .
where .
where is the residual that contains higher order terms. denotes the total perturbations.
where , and .
5.2 Stability Results.
for given positive definite matrix , where is the constant part of A in Eq. (24) and does not depend on variances or . and .
We denote the corresponding Lyapunov function candidates for the NEIC- and PEIC-based controls as V1 and V2, respectively. The stability results are summarized as follows with the proof given in Appendix A4.
and the error e converges to a small ball around the origin, where γi is the convergence rate, ρi and are the perturbation terms, and .
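The role of the Lyapunov equation and the resulting convergence rate can be illustrated with a small numerical sketch. It assumes a hypothetical constant, Hurwitz error-dynamics matrix `A0` (a second-order system with gains `kp` and `kd`), a hypothetical perturbation bound `rho`, and the deterministic inequality dV/dt ≤ −γV + ρ; the stability result in this paper holds in probability and includes GP-dependent terms that are not modeled here.

```python
import numpy as np

def solve_lyapunov(A, Q):
    """Solve A^T P + P A = -Q for symmetric P via Kronecker products."""
    n = A.shape[0]
    I = np.eye(n)
    M = np.kron(A.T, I) + np.kron(I, A.T)
    P = np.linalg.solve(M, -Q.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # remove round-off asymmetry

# Hypothetical constant part of the closed-loop error dynamics
kp, kd = 4.0, 4.0
A0 = np.array([[0.0, 1.0], [-kp, -kd]])
Q = np.eye(2)
P = solve_lyapunov(A0, Q)

# With V = e^T P e, the unperturbed system satisfies dV/dt <= -gamma * V
gamma = np.min(np.linalg.eigvalsh(Q)) / np.max(np.linalg.eigvalsh(P))
rho = 0.01  # hypothetical bound on the perturbation term

def lyapunov_bound(V0, t):
    """Upper bound on V(t) obtained from dV/dt <= -gamma * V + rho."""
    return V0 * np.exp(-gamma * t) + rho / gamma * (1.0 - np.exp(-gamma * t))
```

Under these assumptions, V decays at rate `gamma` and settles inside the level `rho / gamma`, which corresponds to the error converging to a small ball around the origin.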
6 Experimental Results
Two inverted pendulum platforms are used to conduct experiments to validate the control design. The results from each platform demonstrate different aspects of the control design.²
6.1 Two Degree-of-Freedom Rotary Inverted Pendulum
Figure 2(a) shows a 2DOF rotary inverted pendulum that was fabricated by Quanser Inc., Markham, ON, Canada. The base joint (θ1) is actuated by a DC motor, and the inverted pendulum joint (θ2) is unactuated, i.e., . We use this platform to illustrate the original EIC-based control and also compare the performance under different nominal models and controllers. The robot dynamic model is given in Ref. [27] and is also found in Appendix B1.
where for angle θi, i = 1, 2. The training data were sampled and obtained by applying control input , where and was the combination of sinusoidal waves with different amplitudes and frequencies. We chose this input to excite the system, and the gain k was selected without the need to balance the platform. It is difficult to guarantee that the system is fully excited. However, we changed the frequency of sinusoidal waves and obtained the motion data around the target trajectory.
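An excitation signal of the kind described above, a combination of sinusoids with different amplitudes and frequencies, can be generated as in the following sketch. The amplitudes, frequencies, duration, and 400 Hz sampling rate are illustrative assumptions, and the feedback part of the input is omitted.

```python
import numpy as np

def multisine(t, amplitudes, freqs_hz, phases=None):
    """Sum of sinusoids evaluated at the time samples t."""
    t = np.asarray(t, dtype=float)
    a = np.asarray(amplitudes, dtype=float)
    f = np.asarray(freqs_hz, dtype=float)
    p = np.zeros_like(a) if phases is None else np.asarray(phases, dtype=float)
    # Rows index the sinusoid components, columns index the time samples
    waves = a[:, None] * np.sin(2.0 * np.pi * f[:, None] * t[None, :] + p[:, None])
    return waves.sum(axis=0)

fs = 400.0                   # assumed sampling rate in Hz
t = np.arange(4000) / fs     # 10 s of samples
u_excite = multisine(t, amplitudes=[0.5, 0.3, 0.2], freqs_hz=[0.2, 0.5, 1.1])
```

Changing the entries of `freqs_hz` between runs is one simple way to collect motion data around the target trajectory at different frequencies.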
We trained the GP regression models using a total of 500 data points randomly selected from a large dataset. We designed the control gains as , and . The variances Σa and Σu were updated online with new measurements. The reference trajectory was rad. The control was implemented at 400 Hz in a MATLAB/Simulink real-time system. Both the velocity and acceleration are needed for control design and GP training and prediction. To reduce the influence of measurement noise on control design, BEM estimation, and GP agent training, a sliding window was used to filter the velocity measurement online. The acceleration was obtained through real-time differentiation. The same technique was also used for the three-link inverted pendulum in Sec. 6.2.
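The signal conditioning described above can be sketched as a causal sliding-window average followed by a backward difference. The window length, noise level, and test signal below are hypothetical, and obtaining the raw velocity by differentiating a position measurement is an assumption made only for this example.

```python
import numpy as np

def sliding_window_mean(x, window):
    """Causal moving average: each output uses the current and past samples only."""
    x = np.asarray(x, dtype=float)
    c = np.cumsum(np.concatenate(([0.0], x)))
    out = np.empty_like(x)
    for k in range(len(x)):
        lo = max(0, k - window + 1)
        out[k] = (c[k + 1] - c[lo]) / (k + 1 - lo)
    return out

def backward_difference(x, dt):
    """Real-time derivative estimate using only the current and previous sample."""
    x = np.asarray(x, dtype=float)
    dx = np.empty_like(x)
    dx[0] = 0.0
    dx[1:] = np.diff(x) / dt
    return dx

fs = 400.0                      # control rate reported in the text
dt = 1.0 / fs
t = np.arange(2000) * dt
rng = np.random.default_rng(0)

# Hypothetical joint angle with small measurement noise
theta = 0.5 * np.sin(2.0 * np.pi * 0.2 * t)
theta_meas = theta + 1e-4 * rng.standard_normal(t.size)

vel_true = 0.5 * 2.0 * np.pi * 0.2 * np.cos(2.0 * np.pi * 0.2 * t)
vel_raw = backward_difference(theta_meas, dt)
vel_filt = sliding_window_mean(vel_raw, window=10)
acc = backward_difference(vel_filt, dt)
```

The window length trades noise attenuation against delay: a longer window smooths the velocity further but lags the true signal more, which matters for fast reference trajectories.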
Figures 3(a) and 3(b) show the tracking of θ1 and balance of θ2 under the EIC-based control. With either or , the base link joint θ1 closely followed the reference trajectory , and the pendulum link joint θ2 was stabilized around its equilibrium as well. The tracking error was reduced further, and the pendulum closely followed the small variation under . With , the tracking errors became large when the base link changed rotation direction; see Fig. 3(c) at t = 10, 17, and 22 s. Both the time-varying and constant nominal models worked for the EIC-based learning control.
Table 1 further lists the tracking errors (mean and one standard deviation) under both GP models. For comparison purposes, we also conducted additional experiments to implement the original EIC-based control and the GP-based MPC design in Ref. [4]. The tracking and balance errors under the EIC-based learning control with model are the smallest. In particular, with the time-varying model , the mean values of tracking errors and e2 were reduced by 75% and 65%, respectively, in comparison with those under the original EIC-based control. Compared with the MPC method in Ref. [4], the tracking errors with nominal model are at the same level.
Figure 3(d) shows the control performance with nominal model under disturbance. At t = 17 s, an impact disturbance (by manually pushing the pendulum link) was applied, and the joint angles changed rapidly with rad and rad. The control gains increased () to respond to the disturbance. As a result, the pendulum motion tracked the BEM closely and maintained the pendulum balance after the impact disturbance. Figure 3(e) shows the calculated Lyapunov function candidate V(t) and its envelope (i.e., ) during the experiment. Figure 3(f) shows the error trajectory in the – plane. The solid/dashed line shows the error trajectory before/after impact disturbance. The tracking error converged quickly into the error bound. After the disturbance was applied at t = 17 s, both the Lyapunov function and errors grew dramatically. As the control gains increased, the errors quickly converged back to the estimated bound again.
6.2 Three Degree-of-Freedom Rotary Inverted Pendulum.
where . The control gains were , where GP variances and Σu were updated online in real-time. The reference trajectory was chosen as and rad.
For the PEIC-based control, we chose and , and the NEIC-based control was . Figure 4 shows the experimental results under the PEIC- and NEIC-based control. Under both controllers, the actuated joints (θ1 and θ2) followed the given reference trajectories ( and ) closely, and the unactuated joint (θ3) was balanced around the BEM () as shown in Figs. 4(a) and 4(b). The pendulum link motion displayed a similar pattern for both controllers. However, the tracking error e1 under the PEIC-based control (i.e., from −0.05 to 0.05 rad) was much smaller than that under the NEIC-based control (i.e., from −0.15 to 0.15 rad); see Figs. 4(c) and 4(d). The balance task in the PEIC-based control was assigned to joint θ2, and joint θ1 is viewed as virtually independent of θ2 and θ3. Joint θ1 achieved almost-perfect tracking control regardless of the errors for θ2 and θ3. The compensation effect in the null space appeared in the entire configuration space, and any motion error in the unactuated joints affected the motion of all actuated joints. Similar to the previous example, Fig. 4(e) shows the error trajectory profile in the – plane. Figure 4(f) shows the Lyapunov function profiles under the PEIC- and NEIC-based controls.
Figure 5 shows the motion of the actuated coordinate in the transformed coordinate under various controllers. Under the PEIC- and NEIC-based controls, the variables followed the reference profile as shown in Figs. 5(a) and 5(b). Figure 5(c) shows the motion profile under the original EIC-based control. In the first 2 s, joint θ3 followed the BEM under the EIC-based control, and coordinates displayed a similar motion pattern. However, coordinate showed divergent behavior and led to complete failure. Therefore, as analyzed previously, the system became unstable under the EIC-based control though conditions to were satisfied.
In the NEIC-based control, drives the uncontrolled motion variable to its reference trajectory. To further reduce the tracking error, we can increase the α value. Figure 6 shows the experimental results of the error profiles under α values varying from 0.5 to 1.5. With a large α value, the tracking error of the actuated coordinates was reduced. Table 2 further lists the steady-state errors (in joint angles) under the NEIC-based control with various α values, the PEIC-based control, and the physical model-based control design. Under the NEIC-based control with , the system was stabilized; when increasing the α value to 1 and 1.5, the mean tracking errors were reduced by 50% and 70% for θ1, respectively, and by 40% for θ2. Since control input did not affect the balance task of the unactuated subsystem, the tracking errors for θ3 remained at the same level. It is of interest that the control effort (i.e., last column in Table 2) shows only a slight increase with large α values.
Controller | θ1 error (rad) | θ2 error (rad) | θ3 error (rad) | Overall error | Control effort |
---|---|---|---|---|---|
PEIC (GP) | 0.0302 ± 0.0178 | 0.0566 ± 0.0685 | 0.1182 ± 0.0160 | 0.1343 ± 0.0166 | 5.7659 |
NEIC (GP, ) | 0.1395 ± 0.0946 | 0.1166 ± 0.0512 | 0.0303 ± 0.0209 | 0.2001 ± 0.0770 | 5.9022 |
NEIC (GP, ) | 0.0756 ± 0.0481 | | 0.0195 ± 0.0152 | 0.1101 ± 0.0499 | 5.7089 |
NEIC (GP, ) | 0.0376 ± 0.0302 | 0.0792 ± 0.0482 | 0.0207 ± 0.0169 | 0.0972 ± 0.0470 | 5.7305 |
PEIC (model) | 0.2168 ± 0.1165 | 0.2398 ± 0.1649 | 0.0179 ± 0.0140 | 0.3587 ± 0.1307 | 5.7978 |
NEIC (model, ) | 0.1374 ± 0.0922 | 0.1237 ± 0.0597 | 0.0455 ± 0.0385 | 0.2095 ± 0.0769 | 5.8452 |
6.3 Discussion.
For the rotary pendulum example, we have n = m, and the null space vanishes. The compensation effect is no longer needed by the NEIC-based control, i.e., and . In this case, the PEIC- and NEIC-based controls degenerate to the EIC-based control. For the 3DOF inverted pendulum, the control inputs u1 and u2 act on the θ3 joint through and . Therefore, as shown in Lemma 1, the uncontrolled motion exists since all controls show up in dynamics. This observation explains why the original EIC-based control failed to balance the three-link inverted pendulum. If the dynamics is related to m control inputs (through ) for n > m, such as the bikebot dynamics in Refs. [4] and [25], only m external controls are updated, and the EIC-based control works well without any uncontrolled motion.
For the PEIC-based control, the robot dynamics were partitioned into , which contains a fully actuated system , and a reduced-order underactuated system . The EIC-based control is applied to and only. The dynamics of in general does not depend on any specific m actuated coordinates, since the mapping is time-varying across different control cycles. In the NEIC-based control design, and become an underactuated subsystem, and is fully actuated.
In practice, no specific rules are defined to select out of coordinates, and therefore, there are a total of options to select different coordinates. We take advantage of such a property to optimize tracking performance for selected coordinates. In the 3DOF pendulum case, we assigned the balance task of θ3 to θ2 motion. The length of link 1 was only 0.09 m and was much shorter than the length of link 2 (0.23 m). The coupling effect between θ2 and θ3 was much stronger than that between θ1 and θ3; see D13 and D23 in Appendix B2. Thus, it was efficient to use the motion of θ2 as a virtual control input to balance θ3. When the PEIC-based controller was implemented with , the system could not achieve the desired performance and became unstable. We also implemented the proposed controller with the physical model. The control errors are listed in Table 2. Compared with the learning-based controllers, the model-based control resulted in larger errors. Since the mechanical friction and other unstructured effects were not considered, the physical model might not accurately capture the robot dynamics. The results confirmed the advantages of the proposed learning-based control approaches.
The unique feature of the proposed control, compared with other learning-based control approaches [18,22], lies in the integration of the robot's inherent dynamics property (EIC structure) and the GP-based model learning. By integrating physics knowledge into model learning, we identified the conditions for nominal model selection. The overall model learning and control design framework forms a white-box-like, physics-informed control, which differs from the reinforcement learning-based policy search approach [18]. The solution also has the potential to further incorporate the bounded GP prediction error for a robust control [4].
7 Conclusion
This paper presented a new learning-based modeling and control framework for underactuated balance robots. The proposed design was an extension and improvement of the EIC-based control with GP-enabled robot dynamics. The proposed new robot controllers preserved the structural design of the original EIC-based control and achieved both tracking and balance tasks. The PEIC-based control reshaped the coupling between the actuated and unactuated coordinates. The robot dynamics were transformed into a fully actuated subsystem and a reduced-order underactuated balance subsystem. The NEIC-based control compensated for uncontrolled motion in a subspace. We validated and demonstrated the new control design on two experimental platforms and confirmed that stability and balance were guaranteed. The comparison with the physical model-based EIC control and the MPC design confirmed superior performance in terms of the error bound. Extension of the GP-based learning control design to highly underactuated balance robots is one of the ongoing research directions.
Funding Data
U.S. National Science Foundation (NSF) (Award No. CNS-1932370; Funder ID: 10.13039/100000001).
Data Availability Statement
No data, models, or code were generated or used for this paper.
Nomenclature
- = tracking, balance, and overall errors
- = transformed and in p coordinates
- = controlled and uncontrolled coordinates
- = coordinates for actuated and unactuated subsystems
- = partitioned actuated coordinates in (n − m)- and m-dimensions
- = actual and estimated BEMs
- = robot dynamics
- = nominal and GP-based robot dynamics
- = EIC-, PEIC-, NEIC-based control inputs
- = trajectory tracking and balanced-embedded control inputs
- = BEM stabilization control input
- = trajectory tracking control inputs for and
- = control input for
- = convergence rate and error bound
- = estimation errors of actuated and unactuated dynamics
Appendix A: Proofs
A1 Proof of Lemma 1.
where is used based on the fact that is a rectangular diagonal matrix.
The BEM depends only on , that is, the control effect in is not used when obtaining the BEM.
A2 Proof of Lemma 3.
where . Since is not obtained in the way as in Eq. (5), i.e., and is under active control. Meanwhile, drives in , given that and are designed to drive . Therefore, if the unperturbed system under the original EIC-based control is stable, it is also stable under the PEIC-based control.
A3 Proof of Lemma 4.
Clearly, dynamics is unchanged compared to Eq. (9).
A4 Proof for Theorem 1.
We present the stability proof for the PEIC- and NEIC-based controls using the Lyapunov method.
PEIC-Based Control: Plugging Eq. (24) into and considering Eq. (32), we obtain , where . The bounded variance leads to the bounded eigenvalue of matrix . Given the fact that , the eigenvalues of are real numbers.
With the bounded perturbations ρ1 and ω1, the closed-loop system dynamics can be shown to be stable in probability as . Through further analysis, we obtain a nominal estimate of the error convergence as and the error bound estimate with .
NEIC-Based Control: Without loss of generality, we select . We take as the Lyapunov function candidate for . If the control gains are the same as those in the PEIC-based control and α = 1 for the compensation effect, . We choose the control gains properly such that . The system can be shown to be stable as , where , and is defined in the same way as ω1, containing the GP prediction uncertainties. A nominal estimate of the error convergence and final error bound can also be obtained.
To show , i = 1, 2, the control gains should be properly selected. With a small predefined error limit as a stop criterion in BEM estimation, ci values can be shown as . Given the explicit form, di are estimated for and Q, P is obtained by solving Eq. (32). The matrix depends on the control gains associated with the reduction variance. Since the variance is bounded, we design such that satisfies the inequality and then . Thus, the stability is obtained.
Appendix B: Dynamics Model of Underactuated Balance Robots
B1 Rotary Inverted Pendulum.
where lr, Jr, and dr are the length, mass inertia, and viscous damping coefficient of the base link; lp, Jp, and dp are the corresponding parameters of the pendulum; mp is the pendulum mass; g is the gravitational constant; and are robot constants. The values of these parameters can be found in Ref. [27]. The control input is the motor voltage, i.e., u = Vm.
B2 Three-Link Inverted Pendulum.
where mi, li, and Ji are the mass, length, and mass inertia of each link, and . Matrix C is obtained as , where Christoffel symbols . The physical parameters are kg, kg, kg, m, m, m, kg m2, kg m2, and kg m2.
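The construction of the Coriolis matrix from Christoffel symbols can be checked numerically. The sketch below uses the standard textbook construction with central finite differences; the index convention and the two-link inertia matrix `inertia_two_link` (with constants `a`, `b`, `c`) are assumptions chosen for illustration and are not the three-link model of this appendix.

```python
import numpy as np

def inertia_two_link(q, a=0.30, b=0.05, c=0.10):
    """Hypothetical inertia matrix of a planar two-link arm."""
    c2 = np.cos(q[1])
    return np.array([[a + 2.0 * b * c2, c + b * c2],
                     [c + b * c2, c]])

def inertia_gradient(D_fun, q, h=1e-6):
    """dD[i] approximates the partial derivative of D with respect to q_i."""
    q = np.asarray(q, dtype=float)
    n = q.size
    dD = np.zeros((n, n, n))
    for i in range(n):
        dq = np.zeros(n)
        dq[i] = h
        dD[i] = (D_fun(q + dq) - D_fun(q - dq)) / (2.0 * h)
    return dD

def coriolis_matrix(D_fun, q, qdot):
    """C[k, j] = sum_i c_ijk * qdot_i with Christoffel symbols of the first kind."""
    dD = inertia_gradient(D_fun, q)
    n = len(q)
    C = np.zeros((n, n))
    for k in range(n):
        for j in range(n):
            for i in range(n):
                c_ijk = 0.5 * (dD[i][k, j] + dD[j][k, i] - dD[k][i, j])
                C[k, j] += c_ijk * qdot[i]
    return C

q = np.array([0.3, -0.7])
qdot = np.array([1.2, -0.4])
C = coriolis_matrix(inertia_two_link, q, qdot)

# Time derivative of D along the motion, and the matrix that should be skew-symmetric
D_dot = np.tensordot(qdot, inertia_gradient(inertia_two_link, q), axes=1)
N = D_dot - 2.0 * C
```

With this construction, `N` is skew-symmetric up to numerical error, which is a convenient sanity check when deriving or coding the C matrix for a new platform.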
Footnotes
The video of the experiment is available at https://www.youtube.com/watch?v=ZOYb0UW3KS8