## Abstract

Standing balance is a simple motion task for healthy humans but the actions of the central nervous system (CNS) have not been described by generalized and sufficiently sophisticated control laws. While system identification approaches have been used to extracted models of the CNS, they either focus on short balance motions, leading to task-specific control laws, or assume that the standing balance system is linear. To obtain comprehensive control laws for human standing balance, complex balance motions, long duration tests, and nonlinear controller models are all needed. In this paper, we demonstrate that trajectory optimization with the direct collocation method can achieve these goals to identify complex CNS models for the human standing balance task. We first examined this identification method using synthetic motion data and showed that correct control parameters can be extracted. Then, six types of controllers, from simple linear to complex nonlinear, were identified from 100 s of motion data from randomly perturbed standing. Results showed that multiple time-delay paths and nonlinear properties are both needed in order to fully explain human feedback control of standing balance.

## 1 Introduction

Standing balance is a simple motion task which allows researchers to easily investigate how the human central nervous system (CNS) uses feedback control. Understanding the CNS in human motions can not only help clinical treatments of movement disorders but also can inspire the design of controllers for assistive devices, such as exoskeletons and prostheses, to generate human-like movements. Numerous studies have treated the CNS in human standing balance as a postural feedback controller and system identification methods have been used to find its mathematical model [1–13]. In the majority of these studies, external mechanical stimuli, such as push/pull forces and standing platform motions, were used to evoke participants' body sway motions at variety statuses. Indirect identification approaches were normally used, in which a closed-loop mathematical model of the human standing balance system is required. It has been reported that the indirect approach (closed-loop system identification) can avoid the bias caused by the open-loop identifications that just rely on the information of CNS inputs (joint motions) and outputs (joint torques) [14,15]. Both nonparametric and parametric system identification methods have been used. However, limitations exist in both study directions which prevented the identification of good generalized models of the postural control system in the CNS.

In nonparametric postural controller identifications, the CNS is described by a frequency response function (FRF) [11,12,16]. The frequency domain identification method was often used to find the FRF of the closed-loop model, including the CNS that can best explain the experimental data [15,17–20]. Because the length of data does not matter when transferring to frequency domain through the Fourier transformation [21], this approach can be easily applied on long duration standing balance data that recorded from experiments where the random or multisine external perturbations were applied. In general, longer duration motion data can provide more information about the CNS, which allows identification of more complex, or more generalized models. However, the frequency domain approach requires multiple perturbation sources to identify a multi-input and multi-output system [21], which is typical for human standing balance task since at least the ankle and hip strategies were used. Developing the hardware to provide perturbations across multiple body segments is difficult for both research and clinical applications. In addition, this approach treats the system as linear, which eliminates the ability to identify nonlinear properties of the CNS. Nonlinear effects, such as threshold and saturation, are known to exist in sensory and spinal circuits. Furthermore, the human body has many nonlinear components, e.g., nonlinear multibody dynamics and nonlinear muscle properties. Studies have shown that healthy humans use different feedback control gains to control posture under different amplitudes of external perturbations [4,7], indicating that the generalized standing balance controller is nonlinear.

Parametric identification has been used to find gains and other parameters in a predefined control structure that best explain the experimental data. It has the advantage that a single stimulus or perturbation source can be used [22]. Trajectory optimization can be used to solve this identification problem [4], to find controller parameters that best explain the measured output of the system. In this approach, both the plant dynamics and the CNS controller can be nonlinear since identification is performed in the time-domain. However, only simple linear controllers have been identified, using data from short duration balance motions that were perturbed by short ramp displacements [4,7,23]. One difficulty is that the shooting method [24] used in these studies have difficulty with long duration trajectory optimizations, as the standing balance is naturally an unstable system. Eigenvalue constraints have been used which increased the chance of getting stable motions [4], but these are not necessary conditions for the stability of nonlinear systems. Shooting methods also have a tendency to find local optima [25] when data have detailed features in the time domain. Identifications on long duration data are essential to get generalized motion controllers that can explain human responses in different circumstances, such as under multiple ramp perturbations that have different amplitudes. Only recently, parametric identification has been done on long duration randomly perturbed balance data [22]. However, the objective was to compare the FRFs in the frequency domain, which still requires the assumption that the system is linear.

The goal of this study is to show that trajectory optimization with the direct collocation method can identify generalized and nonlinear standing balance controllers from long duration motion data. Direct collocation has recently been used in predictive simulations of human movement [26–28], performing better than shooting methods, but to our knowledge, has not been used for identification of human feedback control. Therefore, in this paper we first validate this method by doing identification on simulated data where the feedback control parameters are known (simulation study). In the simulation study, we also investigated the effect of data length on the accuracy of identified control gains. Then, we identify six types of feedback controller, from simple linear to complex nonlinear, on 100 s of measured human standing balance motions (experimental study).

## 2 Methods

An indirect identification approach was used in this study [3,4,17,22], in which a mathematical model was created to represent the standing balance system and an optimization method was used to fit to experimental data by optimizing the model parameters. The mathematical model was a closed-loop system, where a body dynamics (plant) model and a feedback controller model were both included. Body dynamics was simplified as a double-link pendulum without the knee joint, since ankle and hip strategies are mostly used for standing balance [29,30]. Equations of the body dynamics model can be found in Appendix A. The feedback control parameters *P* were optimized to produce the best fit between the output of the simulated system and the experimental data (Fig. 1).

### 2.1 Standing Balance Experiments.

Experiments were performed on eight able-bodied participants (seven males, one female, age 18–35 years) with approval from the Institutional Review Board of Cleveland State University (study No. IRB-FY2018-40). Participants were asked to stand on an R-Mill instrumented treadmill (Forcelink, Amsterdam, the Netherlands) with their arms crossed in front of their chest and instructed to keep balance without taking a step. The perturbation consisted of random anterior–posterior platform motions. The perturbation signal was designed using random square pulses with five position values ([−5, −2.5, 0, 2.5, 5] cm), and six durations ([0.25, 0.5, 0.75, 1.0, 1.25, 1.5] s). Amplitudes and durations were randomly combined to generate a 300 s perturbation signal. This random signal was sent to the treadmill control module in the d-flow software (Version 3.26.0, Motek, Amsterdam, the Netherlands) and executed through the treadmill's “sway” actuation. Twenty-seven reflective markers [31] were placed on each participant to record their reactions using a 10-camera motion capture system (Osprey, Motion Analysis Corp., Rohnert Park, CA). Hip and ankle joint angles were calculated from these recorded marker data after applying a double second-order Butterworth filter with the cutoff frequency of 16 Hz. Dynamic parameters of the human body model were generated for each participant, in which the segment lengths were calculated based on the recorded marker data and mass properties were calculated based on the body weight and the anthropometry table of Winter [32]. Five markers were placed on the rigid frame of the treadmill to record the executed sway motion. After applying the 16 Hz low pass filter, acceleration was calculated using a three-point finite difference on the averaged coordinates of these five markers [33]. All data were sampled at 100 Hz.

The experiment consisted of a 300 s quiet standing trial, followed by a 300 s perturbation trial. Then participants were asked to sit down and rest for 300 s. After the rest period, a 300 s perturbation trial, using the same perturbation as the previous trial, was recorded. Finally, participants were asked to do another trial of 300 s quiet standing. General information of the participants are shown in Table 1. The detailed description of the experiment, collected data, data processing, as well as its preliminary analysis were included in the Zenodo project.^{2}

Id | Gender | Age (yr) | Height (m) | Mass (kg) |
---|---|---|---|---|

1 | Male | 22 | 1.60 | 74.29 ± 0.26 |

2 | Female | — | — | 48.37 ± 0.21 |

3 | Male | 18 | 1.80 | 79.12 ± 0.20 |

4 | Male | 27 | 1.78 | 63.10 ± 0.16 |

5 | Male | 32 | 1.79 | 70.56 ± 0.19 |

6 | Male | 35 | 1.65 | 58.24 ± 0.27 |

7 | Male | 28 | 1.75 | 68.75 ± 0.17 |

8 | Male | 27 | 1.63 | 60.33 ± 0.19 |

Id | Gender | Age (yr) | Height (m) | Mass (kg) |
---|---|---|---|---|

1 | Male | 22 | 1.60 | 74.29 ± 0.26 |

2 | Female | — | — | 48.37 ± 0.21 |

3 | Male | 18 | 1.80 | 79.12 ± 0.20 |

4 | Male | 27 | 1.78 | 63.10 ± 0.16 |

5 | Male | 32 | 1.79 | 70.56 ± 0.19 |

6 | Male | 35 | 1.65 | 58.24 ± 0.27 |

7 | Male | 28 | 1.75 | 68.75 ± 0.17 |

8 | Male | 27 | 1.63 | 60.33 ± 0.19 |

### 2.2 Simulated Motion Data.

The identification method was first validated through a simulation study. The closed-loop mathematical model in the indirect identification platform was used to generate simulated data. The dynamic parameters of participant 4 were used for the double-link pendulum model. The perturbation input was the measured platform acceleration in the first perturbation trial of participant 4. The postural feedback controller, a full state feedback proportional-derivative (PD) controller with time delay, was the same as in the identification study by Goodworth and Peterka [22]. This controller is specified in Eq. (3). Pink noise (similar to Goodworth's study) was added at the motion feedback loop of each joint as sensor noise. One hundred seconds of simulation data was generated using the midpoint Euler method [34] and sampled at 100 Hz. Ten different realizations of sensor noise (with different random number seeds) were used to simulate ten different trials. Figure 2 shows one trial of simulated motion data with one realization of sensor noise under the external perturbation.

### 2.3 Controller Identification on Simulated Data.

Controller identifications on the simulated motion data were done through the described indirect approach. Five lengths (10, 30, 50, 70, and 90 s) of the simulation data (Fig. 2) were selected for the identification to check the effects of data length on the identification results. Lower and upper bounds of the identified control parameters were set as 0 and 2 times their true values. Ten optimizations were performed on each controller identification problem (controller identification on each period of data) with random initial guesses within the bounds to increase the chance of finding global optimum. The identification result with the lowest objective function among ten optimizations was selected as the best result for the identification problem and was shown in the results section.

### 2.4 Controller Identification on Experimental Data.

One hundred seconds of experimental data was used from the middle of each perturbation trial to avoid the adaptation period at the beginning, while also minimizing the effect of fatigue at the end. The quiet standing trials were not used. Data from the last six participants (Table 1) were used, since the first two participants were in the preliminary testing of the experiment protocol. In total, 12 periods of experimental data (six participants and two perturbation trials for each participant) were used to identify the standing balance controllers.

Six controller types, from simple linear to complex nonlinear, were identified on each selected data trial. Four of them are linear: the PD controller, the full-state proportional-derivative (FPD) controller, the full-state proportional-derivative feedback with time delay controller (FPDTD) (which was used by Goodworth [22]), and the linear state combination with time delay (LSCTD) controller. The other two are nonlinear: the neural network (NN) controller and the neural network with time delay (NNTD) controller.

The linear controllers were mathematically described as:

where $T1(t)$ and $T2(t)$ are the ankle and hip joint torques; $\theta 1(t)$ and $\theta 2(t)$ are the ankle and hip joint angles; *r*_{1} and *r*_{2} are the reference angles of the ankle and hip joints; *K*_{11} and *B*_{11} are the proportional and derivative gains for the ankle joint; *K*_{22} and *B*_{22} are the proportional and derivative gains for the hip joint; *K*_{21} and *B*_{21} are the proportional and derivative gains for the ankle joint from the hip joint angle and angular velocity feedback; *K*_{12} and *B*_{12} are the proportional and derivative gains for the hip joint from the ankle joint angle and angular velocity feedback; $Kpas1$ and $Kpas2$ are the passive proportional gains for the ankle and hip joints, respectively; $td1$ and $td2$ are the delay time for the ankle and hip joints in the FPDTD controller; $\theta 1(t\u2212i\xb7\Delta t)$ and $\theta 2(t\u2212i\xb7\Delta t)$ are the inputs of the LSCTD controller, representing the ankle and hip joint angles at $i\xb7\Delta t$ time prior to the current time *t*; *K ^{i}* and

*B*are the proportional and derivative gains that multiply with the state at $i\xb7\Delta t$ time prior to the current time

^{i}*t*; and

*D*is the number of delayed state feedback paths. The number of time delays

*D*was 3 and they were multiples of $\Delta t$ = 60 ms.

The NN controller was defined as regular feed-forward neural network [35] with one hidden layer and four hidden nodes. It is a nonlinear controller (policy), since its activation function is the smoothed leaky-ReLU function: $f(x)=x+0.7(x\u2212x2+0.00012)$. Inputs of the NN controller were the states of the closed-loop model (two joint angles and two angular velocities) and a constant node (unit input). The hidden layer included a constant node (unit input) also. Outputs of the NN controller were the two joint torques. This controller had 30 parameters (weights) that were optimized.

The NNTD controller also used the regular feed-forward neural network but with one hidden layer and eight hidden nodes. Inputs of the NNTD controller were the current state information and time-delayed state feedback from 60, 120, and 180 ms prior, as in the LSCTD controller. Outputs of the NNTD controller were the two joint torques. The activation function used was the same as in the NN controller. This controller had 154 parameters (weights) that were optimized.

### 2.5 Trajectory Optimization With Direct Collocation.

*N*discrete time points and the Midpoint Euler approximation for state derivative $x\u02d9$ was used to convert the system dynamics into nonlinear constraints. The controller identification problem can then be written as a large-scale nonlinear program (NLP)

where *h* is the time-step (20 ms); *x _{i}* is the state at node

*i*, comprised of ankle and hip joint angles and angular velocities: $x=[\theta 1,\theta \u02d91,\theta 2,\theta \u02d92]$; $\theta 1n,m,\theta 2n,m$ are the measured joint angles at time point

*n*; $f(x,x\u02d9,P,a)=0$ represents the dynamics of the closed-loop model, with platform acceleration

*a*(

*t*). Details are provided in Appendix A. The objective function, objective gradient, dynamic constraints, and constraint gradients were coded in Python (3.6.0). An interior point optimizer (Ipopt) [39] was used to solve the large-scale nonlinear program. Optimizations of the linear controllers were performed on a local laptop. The other optimizations were performed on a Linux cluster in the Ohio Supercomputer Center [40]. The optimization code can be found in this GitHub repository.

^{3}

For each identification problem, ten optimizations were performed with random initial guesses of control parameters. The motion data were used as initial guess for the state trajectory. An identification problem was defined as the identification of one type of controller on one trial of experimental data. The result with the lowest objective function was selected as the solution for the identification problem. For the NNTD controller type, only one initial guess was used for each identification problem due to the long solution time. In total, 72 identification problems (12 periods of data and 6 types of controllers) were solved, resulting in 612 optimizations.

For the PD and FPD controllers, stochastic identification [41] was used to help identify practically stable controllers by enforcing the closed-loop model generates practically stable motions with multiple episodes of process noise. Two and three episodes were used in the stochastic identifications for the PD and FPD controllers, respectively.

## 3 Results

### 3.1 Identification Results on Simulated Data.

Figure 3 shows the means and standard deviations of the identified control gains among ten realizations of sensor noise (identified gains in each realization of sensor noise were from the best fit result in ten random initial guess optimizations). The leftmost bar in each subplot indicates the “true” value of the corresponding control gain that was used to generate the simulated motion data. The other five bars from left to right in each subplot indicate the identified control gains from five lengths of simulated data (10, 30, 50, 70, and 90 s). In general, the identified control gains were close to the true values, except for the two passive proportional gains. However, the differences between the true values and identified results were small when adding the passive and active proportional control gains together (bottom two subplots). Another significant result was that with the increase of the data length, the standard deviations of identified gains among different realizations of sensor noise were decreasing.

Bias errors and variability across ten noise realizations, for all identified control parameters, are shown in Fig. 4. This result was from the identification of 50 s of simulated data. The bias errors of all identified parameters were less than 2.5% of the true values, except the two passive proportional gains. The variations in identified control gains among ten realizations of noise were below 10% of the averaged values, except for the two passive proportional gains.

### 3.2 Identification Results on Experimental Data.

Figure 5 shows the root-mean-square (RMS) of the measured motions (first two box plots) and the fits (difference) between data and the outputs from the identified closed-loop models with six controller types (other box plots). With increasing controller complexity, the RMS tended to decrease, indicating a better fit. The RMS value of the PD controller was about half of the variation in the data, showing that the PD controller can only explain about 50% of the variance in standing posture. The RMS value of the NNTD controller was about one fourth of the variation in experimental data, showing that the NNTD controller can explain about 75% amplitude of the standing balance motion. Considering there was only one optimization in the NNTD controller identification, whereas the PD controller result was the best of ten optimizations, the RMS of NNTD controller could be even lower. Fits of ankle and hip joint motions of participant 3 are shown in Appendix B.

The identified control gains of the PD, FPD, and FPDTD controllers are shown in Figs. 6–8. Before averaging, gains were normalized to a typical male with weight of 70 kg and height of 1.75 m [42]. Joint torques were normalized by the product of body mass and height; joint angular velocities were normalized by the square root of gravity divided by the body height. In general, the identified feedback control gains were mostly consistent among subjects and two perturbation trials. In all three types of controllers, proportional control gains *K*_{11} ($K11+Kpas1$) of the ankle joint had larger values than the hip joint's proportional gains *K*_{22} ($K22+Kpas2$). The $K21,B21,K12,B12$ in the FPD and FPDTD controller types were significantly different from zero. In the FPDTD controller, the time delay parameter of the ankle joint was smaller than the hip joint's, which is not as expected, as the distance for the nervous signal translation is longer for ankle joint so that larger delay time should exist. The proportional feedback gains *K*_{22} ($K22+Kpas2$) of the hip joint had higher values in the second perturbation trial than the first perturbation trial. Parameter values for the NN and NNTD controllers are not shown because no meaningful interpretation is possible.

## 4 Discussion

### 4.1 Identifications on Simulated Data.

Identifications on simulated data showed that the identification method (trajectory optimization with direct collocation) was able to find correct control parameters. More importantly, the identified control gains had less bias error and variation comparing to Goodworth and Peterka's identification study [22]. One possible reason is that the external stimulus in this study was much larger. The peak value of the stimulus in their study was only 5 N·m. We used a stimulus which the peak value of equivalent torque perturbation reached around 30 N·m for the hip joint and 50 N·m for the ankle joint. In general, large perturbation generates a low noise-to-signal ratio (NSR) which helps find true control parameters. This indicates that the external stimulus used in this study was large enough to collect useful human reaction data from experiments. This is consistent with recent results of Schut et al. [43] who showed that the 10 cm peak-to-peak perturbation amplitude is sufficient to collect low NSR data in a standing balance experiment.

Another possible reason that Goodworth and Peterka's results have larger variation is their frequency-domain approach to parameter optimization. Optimizations were performed to fit the FRF of the closed-loop model to the FRF of the experimental data. It is hard to calculate the analytical derivatives of the FRF with respect to the identified parameters, which limited the optimization performance. In our work, the gradient of the objective function and the Jacobian matrix of dynamic constraints were provided analytically helped the optimizer quickly find good solutions.

### 4.2 Identifications on Experimental Data.

The RMS fit errors for the six controller types suggest that both multiple time delay paths and nonlinear properties are needed to fully explain the human standing balance control under random external perturbations. The FPD controller had lower RMS than the PD controller indicating that cross-joint feedback is important for human standing balance. However, the FPDTD controller did not have a significant lower RMS than the FPD controller, suggesting that the additional time delay component was not sufficient to better explain the CNS. The LSCTD controller had a more generalized time delay structure, which generated better fit with experimental data than the FPDTD controller with one time delay for each torque. The NN controller also had lower RMS than the FPD controller, showing the importance of nonlinearity in the controller. Finally, the NNTD controller had both generalized time delay and nonlinearity, which resulted the lowest RMS among all six types of controllers. This helps explain previous studies [4,44] which found that control gains depended on the amplitude of the ramp perturbations, suggesting nonlinearity.

Feedback control gains identified in this study were similar to those found by Park et al. [4]. The proportional control gains were larger for ankle torque than for the hip torque in all shown controller types, possibly because both need to balance the large trunk mass, which is farther away from the ankle. The nonzero cross-joint control gains confirmed that humans use cross-joint feedback to control standing balance. Cross-joint feedback is likely neural, rather than due to elastic properties of two-joint muscles because there are no muscles that cross both hip and ankle.

Identification results from the second perturbation trial were generally more consistent between participants than those from the first perturbation trial. We observed that participants had a relatively smaller body sway motion in second trial, indicating that participants had adapted and no longer needed to explore to optimize their performance. This learning effect may also be responsible for the larger hip feedback gains in the second perturbation trial.

Due to the long computation time, only one optimization was applied in the NNTD controller identification, possibly resulting in a local minimum. With a more thorough optimization, the fit might even have been better. Neural network controllers have a general nonlinear property which is good for testing the potential of nonlinear controllers. However, it may not be a good controller in practice because the control performance is not predictable beyond the observed range of joint motions. Controller performance could be evaluated by simulating the identified system with other perturbation inputs that were not used for the identification and comparing to measured motions. This was not included in this study because the NNTD controller was just used to demonstrate the ability of the proposed identification method. For practical nonlinear controllers, one might also use deterministic nonlinear structures, such as polynomial functions, in which the identified controller parameters have a physical meaning and their performance is more predictable when operating in different conditions.

The identified cross-joint control gains *K*_{12} in the FPD controller type were mostly negative, which suggesting a positive cross-joint feedback. However, the cross-joint feedback gains were all positive in the identified FPDTD controllers. This might be because the FPD controller is too simple to represent the human CNS. We would suggest future studies can examine to what extent simple models can extract unbiased information about the control system.

The identified controllers in the experimental study were all from male participants (the last six). Thus, the control parameters showed in this paper may not represent the overall properties of the standing balance CNS of all humans due to the potential differences across gender. However, this paper is mainly focusing on proposing a new identification method. The standing balance controllers of females can be directly identified through the proposed identification method once the motion data of female participants was collected.

In the experimental study, only the middle period (150–250 s) of the recorded balance data was used for the controller identification. This was done because each trial started with a slow drift in the posture, possible due to an adaptation process. This was needed because we assume constant controller parameters. The initial period, however, may contain important information that is clinically relevant. Proper identification would require modeling the dynamics of the adaptation process, which is conceptually possible, but beyond the scope of this study and left for future work.

The identification approach used in this study does not require assuming a linear model and can extract human postural controllers from a long duration experimental data. As far as we aware, it is the first time that complex nonlinear controllers were identified from 100 s of standing balance data under random perturbations. This method can be applied to more complex controllers and motion tasks, such as identification of balance control during human walking. We would like to encourage others who are trying to understand the human motion control using this identification method to explore complex control structures from complex human motions.

## 5 Conclusions

In this work, we showed that trajectory optimization with the direct collocation method can correctly identify control parameters from synthetic motion data. Using 100 s of standing balance kinematic data, under random perturbations, six types of postural controllers, from simple linear to complex nonlinear, were identified. Identification results suggested that the human CNS, even in the simple standing balance task, has nonlinear properties and multiple time delay paths.

## Acknowledgment

Thanks go to Dr. Jason Moore also, for the fundamental work of the postural controller identification and toolboxes that he developed and contributed: Opty, Sympy, and PyDy.

## Funding Data

National Science Foundation (Grants Nos. 1344954 and 1544702; Funder ID: 10.13039/100000001).

## Footnotes

### Dynamic Model of the Human Standing Balance

where *θ*_{1} and *θ*_{2} represent the ankle and hip joint angles, respectively; *l _{L}* represents the length of leg;

*m*and

_{L}*m*represent the masses of leg and trunk;

_{T}*d*and

_{L}*d*represent the center of mass location in leg and trunk; $\tau 1$ and $\tau 2$ represent the joint torques at ankle and hip joints;

_{T}*g*is the gravity; and

*a*is the acceleration of the standing plate due to external perturbation, defined positive in the posterior direction.

where *f*_{1} and *f*_{2} are the control equations of the ankle and hip joints; **P**_{1}and **P**_{2} represent the control gains inside the ankle and hip controllers.

where **q** = [*θ*_{1}; *θ*_{2}] includes the joint angles of the system; **M**(**q**) is the mass matrix; $C(q,q\u02d9)$ represents the Coriolis and centrifugal forces; **G**(**q**) is the gravity term; **D**(**q**;**a**) is the joint torque term that caused by mechanical perturbation; and $F(q,q\u02d9,P)$ is the state feedback control term.

where $x=[q,q\u02d9]$ represents the system state.

### Motion Fit of the Identified Six Types of Controllers

Motion fits of the identified six types of controllers of participant 3 are shown here (Fig. 9). Top two subplots are the external stimulus signal (standing platform translation). Left side subplots are the ankle joint motion fits. Right side subplots are the hip joint motion fits. It is clear that the fits got better with the increase of the controller complexity.