Abstract
In this study, we developed an offline, hierarchical intent recognition system for inferring the timing and direction of a human operator's motion intent in unstructured environments. There is increasing demand for robot agents to assist in dynamic, rapid motions that constantly evolve and require quick, accurate estimation of a user's direction of travel. An experiment was conducted in a motion capture space with six subjects performing threat evasion in eight directions, and their mechanical and neuromuscular signals were recorded for use in our intent recognition system (XGBoost). Compared against current analytical methods, our system demonstrated superior performance, with direction-of-travel estimates available 140 ms earlier in the movement and an 11.6 deg reduction in error. The results also showed that we could predict the start of movement 100 ms before it actually occurred, allowing any physical system time to start up. Our direction estimation had an optimal performance of 8.8 deg, or 2.4% of the 360 deg range of travel, using three-axis kinetic data. The performance of other sensors and their combinations indicates additional possibilities for obtaining low estimation error. These findings are promising, as they can inform the design of a wearable robot aimed at assisting users in dynamic motions in environments with oncoming threats.
1 Introduction
1.1 Overview.
Unstructured environments, such as construction sites, search-and-rescue operations, and war-zone areas, are dynamic and uncertain. Performing tasks in such situations can be dangerous for a human operator, and thus, there is a significant need for smart robotics to enhance human safety and task completion efficiency. This has led to a recent push for robot agents to collaborate in human–robot teaming as wearable robots, humanoids, unmanned ground vehicles, and swarm robots [1–3]. When environments or tasks become more dynamic, wearable robots offer particular promise because they can leverage the superior agility and mobility of their human operators to provide physical assistance that augments human safety. Existing work in the field of wearable robots has focused heavily on using different controllers to provide assistance for steady-state locomotion [2,3]. While analytical methods and control theory have dominated the field, intent recognition can provide more accurate estimation of when and how to apply assistance to augment the performance of human operators. More recently, intent recognition has been introduced to the field to estimate the needed assistance in repeated locomotion, such as walking or running, or specific scenarios, such as lifting or pathology-specific ambulation [4]. Given the adaptability of intent recognition, such machine learning techniques can address many of the challenges posed by the dynamic, nonlinear motions performed by human operators.
We center the interaction between the human operator and wearable robot on human behavior during dynamic motions. The novelty of this study is determining human intent during dynamic motions so that wearable robots can react and execute assistance intelligently. If a robot misunderstands intention, it may impose forces that are counterproductive or even dangerous. We aim to rapidly classify the timing and direction of travel of the human operator using intent recognition and to determine which on-board sensors are most critical. The main contribution of this study is the design of an intent recognition system that can (1) hierarchically predict when a human intends to move and estimate the intended direction of movement, (2) reduce the error of direction classification compared to analytical methods for fast, accurate estimation of directions of travel, and (3) quantify the contribution of candidate sensors for this motion.
In this study, analytical methods (referred to as the baseline) were used to determine directions of travel and were computed with temporal information, such as integrating inertial data. We hypothesize that intent recognition machine learning algorithms will reduce the estimation error rate when classifying direction of movements compared to the baseline. In addition, another novel aspect of our approach is predicting future desire to initiate movement, a capability not possible using a purely analytical method. This study utilizes the following sensors to investigate performance: electromyography (EMG), inertial measurement units (IMUs), force plates, and motion capture markers. The system overview, shown in Fig. 1, relies on a multisensor collection of a human subject performing dynamic, nonlinear motions to predict the timing of movement start and estimate their direction of travel with our machine learning techniques.

Mechanical and physiological behavior captured through a human dynamic motion collection provides an intent recognition system with the ability to optimize and estimate future motions
1.2 Background.
Control systems and analytical approaches have been used to track key metrics, such as position and orientation of robot agents [5–7]. Recent work has shown great promise with kinetic and inertial derivations [8,9]. However, such methods may not be fast or accurate enough to predict and estimate dynamic behaviors that evolve as the movement progresses. In this study, we investigate the performance of such a baseline compared to our intent recognition system.
Intent recognition systems based on machine learning have been successful in the field of robotics for pattern recognition, socially assistive robots, and human–robot interaction [10,11]. As a relatively recent development, machine learning for human movement and locomotion appears in systems that provide assistance for various pathologies, prosthetic devices, and exoskeletons for walking using classification techniques [4,12,13]. As these studies become more specific to a certain type of use, there is a gap in the human motion intent field in designing similar systems for more rapid, dynamic motions of the lower limbs [14–16]. Because intent recognition can enable rapid direction tracking of transient responses in other applications, we examine sudden movements based on human motion intent, since such rapid locomotion has not yet been explored with these techniques [17,18].
The design of intent recognition systems for dynamic motions requires an initial prediction of when the movement is going to start during a human operator's reaction time, followed by an immediate estimate of the direction of travel. Mechanical and physiological inputs can be used concurrently to capture the unique attributes of this motion and enhance estimation of the intended direction of travel, based on previous work in classifying modes of locomotion [14,15,19]. Studies have investigated physiological methods on the lower limbs to understand the various components of movement [20–22]. Contact methods of monitoring muscle activation aid in measuring the quick response necessary for determining human motion intention and can serve as inputs into intent recognition systems [23,24]. Moreover, they can start an intent recognition system during a predictive range, which lies between the start of the human agent's reaction and the actual start of movement [18].
From the mechanical perspective, previous studies examine lower limb dynamics, such as inverse kinematics and inverse kinetics, to reveal information about the components and orientation of specific motions [25,26]. In human motion analysis, ground reaction forces (GRFs), center of mass velocity, and center of pressure (CoP) have been key in analyses of various cases; therefore, their inclusion would provide important information for the intent recognition system [27,28]. Center of mass velocity has been a primary metric studied for gait tracking and was utilized in this work to determine when movement starts with precise and reliable motion capture [29]. Inertial sensors have also illustrated stable position tracking in unstructured environments [30,31].
We developed a hierarchical architecture using mechanical and physiological sensors to predict movement start and estimate which direction an operator intended to move to provide rapid, optimized assistance for dynamic motions in unstructured environments. We obtained the overall system performance, which was defined as how fast and accurately we could estimate direction compared to an inertial baseline.
2 Methods
2.1 Experiment Procedure.
A set of experiments was conducted to understand how subjects perform dynamic motions by collecting outputs from a set of external sensors during such movements. This study focused on direction-dependent motions. Six able-bodied subjects gave written, informed consent for a protocol that was approved by the Georgia Institute of Technology’s Institutional Review Board (H18363).
Each subject stood in the middle of a six force plate configuration, which represented the center of a labeled circle (r = 2 m). Figure 2 demonstrates the procedure of the experiment. Eight directions were chosen at 45 deg increments. After the subject stood at rest for a randomized time (1 s–10 s), a visual instruction, a top-down arrow indicating one of the eight directions, was displayed on a television. The subject was told to escape the labeled circle in the given direction as fast as possible. The labeled circle ensured that jumping and hopping were not performed, since they were not representative of the more distinct, multiple-step threat-evasion strategy. After successfully crossing the circle, the trial ended and the subject returned to the centered position. Each subject had a training period to become accustomed to the environment. The eight directions and a null condition were each tested ten times, resulting in 90 trials per subject. The order of directions tested was randomized to prevent directional bias.

Subjects, starting from rest, rapidly escaped a prelabeled circle in the direction randomly displayed on a television. A subject is completing a trial in the 0 deg direction.
GRF and CoP were captured with six force plates (Bertec, Columbus, OH). Three IMUs and 14 channels of EMG were placed on each subject (Delsys Avanti & Trigno Platforms, Natick, MA). A 43-reflective marker set, modified from the Cleveland Clinic Standard Lower Limb Set, was also used in the motion capture space (Vicon, Oxford, UK). Sensor placements are illustrated in Fig. 3 [32].

Sensor and marker locations are annotated on anterior and posterior sides. EMG locations per lower limb were tibialis anterior, rectus femoris, gastrocnemius lateralis, biceps femoris, tensor fasciae latae, adductor magnus, gluteus maximus. IMU locations were the lower back and the left and right upper thighs.

2.2 Signal Processing.
The collected sensor data shown in Fig. 3 are filtered as follows:
EMG: Muscle activation, bandpass (20–400 Hz)
IMU: Three-axis accelerometer (Accel), no applied filter
IMU: Three-axis gyroscope (Gyro), no applied filter
Force plates: GRF, lowpass (20 Hz)
Force plates: CoP, lowpass (20 Hz)
Motion capture: Marker trajectories, lowpass (6 Hz)
In this study, EMG, IMU, GRF, and CoP signals were either already relative or transformed to the local body frame to be consistent with readings from most wearable sensors [33]. OpenSim v4.0 was used to calculate the sagittal and frontal hip joint angles [34]. GRFs were integrated to obtain impulse over time. For this dataset, we examined each trial for all directions from 1 s prior to the start of movement until just before a limb contacted the ground outside the force plate configuration.
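For illustration, a minimal sketch of this filtering stage using zero-phase Butterworth filters in SciPy; the filter order and sampling rates are assumptions, as the paper does not report them:

```python
import numpy as np
from scipy.signal import butter, filtfilt

# Assumed sampling rates and filter order; the paper does not report them.
FS_EMG = 2000.0  # Hz, plausible for surface EMG
FS_FP = 1000.0   # Hz, plausible for force plates

def bandpass(x, lo, hi, fs, order=4):
    """Zero-phase Butterworth bandpass (e.g., 20-400 Hz for EMG)."""
    b, a = butter(order, [lo, hi], btype="bandpass", fs=fs)
    return filtfilt(b, a, x)

def lowpass(x, cutoff, fs, order=4):
    """Zero-phase Butterworth lowpass (e.g., 20 Hz for GRF/CoP)."""
    b, a = butter(order, cutoff, btype="lowpass", fs=fs)
    return filtfilt(b, a, x)

# Synthetic stand-ins for raw channels.
emg_filt = bandpass(np.random.randn(4000), 20.0, 400.0, FS_EMG)
grf_filt = lowpass(np.random.randn(2000), 20.0, FS_FP)

# GRF integrated over time yields impulse, as described above.
impulse = np.cumsum(grf_filt) / FS_FP
```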
To determine the key sensor information required for dynamic motions, the experiment data were grouped into three categories based on data retrievable from current sensors commonly used in wearable devices. These sensor groups are as follows:
All kinetic: Vertical kinetic + Shear kinetic
Vertical kinetic: Vertical GRF (one axis) + CoP
Shear kinetic: Horizontal GRF (two axes) + impulse
EMG: Muscle activation
Kinematic: IMU (Gyro, Accel) + hip joint angles
2.3 Intent Recognition Pipeline.
An intent recognition pipeline using machine learning was created to predict and estimate threat-evasive behavior. Common practices in machine learning were utilized [4].
Supervised Learning: Type of model that trains on known inputs, or features, and known outputs, or labels. Performance is found by testing the trained model on inputs whose outputs were withheld from training.
Binary Classification: Model that has only two possible outputs.
Multiclass Classification: Model that has a known number of possible outputs >2.
Feature Engineering: Extraction technique to determine interesting attributes from a window of data to yield a set of representative values, or features, of that window.
Dimensionality Reduction: Reduction of dimensions in the feature space by selecting the optimal features that provide the most information.
Forward Feature Selection: Type of dimensionality reduction that iteratively adds features to the model to determine which set of features best improve the model’s performance [35].
Sweep and Tuning: Optimization technique to find the best model parameters and hyperparameters through an exhaustive grid search.
Cross Validation: Evaluation method to determine a model’s overall robustness by rotating out different testing sets.
After signal processing, the experiment outputs served as system inputs and, as illustrated in Fig. 4, passed through their respective feature engineering techniques [36]. The system was optimized with dimensionality reduction, a sweep of window sizes and increment lengths, and hyperparameter tuning before leading into the hierarchical structure to classify movement start and direction of travel.
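As a hedged illustration of the feature engineering step, the sketch below extracts simple time-domain features from sliding windows of one channel; the specific window size, increment, and feature set are placeholders standing in for the swept values described above:

```python
import numpy as np

def window_features(channel, fs, win_ms=200.0, inc_ms=20.0):
    """Slide a window along one processed channel and extract simple
    time-domain features per window. The 200 ms window and 20 ms
    increment are placeholders for the swept values; the useful
    features are determined later by forward feature selection."""
    win = int(win_ms * fs / 1000.0)
    inc = int(inc_ms * fs / 1000.0)
    feats = []
    for start in range(0, channel.size - win + 1, inc):
        w = channel[start:start + win]
        feats.append([w.mean(), w.std(), w.min(), w.max(),
                      np.sqrt(np.mean(w ** 2))])  # RMS
    return np.asarray(feats)  # shape: (n_windows, 5)

# Example: one second of a 1 kHz channel -> one feature row per 20 ms.
rows = window_features(np.random.randn(1000), fs=1000.0)
```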

Intent recognition pipeline illustrates the steps required from system inputs to feature engineering to optimization of algorithms to predict movement start and estimate direction of travel
The system was broken down into two models: (1) a timing model to predict when the movement started from a resting position at the millisecond level and (2) a direction model to estimate in what direction the subject intended to move. A selection of a machine learning algorithm for each of these models was needed. A case study of machine learning in human motion intent recognition has demonstrated that XGBoost, a new and robust machine learning algorithm, had the best performance in steady and transitional states of movement against current state-of-the-art models in the field [37]. XGBoost is a parallel tree boosting system that uses ensemble learning and gradient boosting to efficiently develop a set of trees from a supervised learning approach. Its benefits include regularization to prevent overfitting, controllable pruning of tree complexity, and working with sparse datasets [38]. It was selected for both timing and direction models.
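A minimal sketch of the two-model hierarchy using the xgboost Python package; the feature matrices, labels, and hyperparameter values below are placeholders, not the study's actual configuration:

```python
import numpy as np
from xgboost import XGBClassifier

# Placeholder feature matrices (n_windows x n_features) and labels.
X_timing = np.random.rand(1000, 27)
y_timing = np.random.randint(0, 2, 1000)   # 0 = at rest, 1 = moving
X_dir = np.random.rand(800, 27)
y_dir = np.random.randint(0, 8, 800)       # eight 45-deg direction classes

# (1) Timing model: binary classification (intends to move or not).
timing_model = XGBClassifier(objective="binary:logistic",
                             n_estimators=100, max_depth=4,
                             learning_rate=0.1)
timing_model.fit(X_timing, y_timing)

# (2) Direction model: multiclass over the eight directions,
# activated only once movement start is predicted.
direction_model = XGBClassifier(objective="multi:softmax",
                                n_estimators=100, max_depth=4,
                                learning_rate=0.1)
direction_model.fit(X_dir, y_dir)

# Hierarchical inference: query direction only after movement is predicted.
if timing_model.predict(X_timing[-1:])[0] == 1:
    heading_deg = 45 * direction_model.predict(X_dir[-1:])[0]
```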
2.3.1 Timing Model: Prediction and Sensor Optimization.
The timing model used a form of supervised learning, binary classification (intended to move or not). Its ground truth label for start of movement, or the absolute kinematic start, was based on when the subject's center of mass velocity exceeded a set threshold relative to the resting position's velocity, using motion capture marker trajectories. Feature extraction was completed on the system inputs, excluding impulse. This omission was necessary because GRF readings did not change at rest, so impulse was negligible. The primary metrics used for the performance analysis were the total classification error over the entire motion and the classification error during the transition window between no movement and movement.
Because the highest classification error in the binary timing model was likely to occur in this transition window, the timing model went through a prediction analysis in the transition phase, where absolute kinematic start labels were changed to switch earlier than the actual start of movement. The estimation, 0 ms prior, was compared to predictions, 60 ms, 100 ms, 200 ms, and 300 ms prior to absolute kinematic start. These times were chosen to illustrate the progressive trend of predictions. Increment size for the model was 20 ms; therefore, selected predictive times were multiples of this increment. Forward feature selection was run at the best predictive time to determine the critical features that obtain low error. The sensor groups were analyzed at the best predictive time by their classification errors during the transition window. A one-way analysis of variance (ANOVA) was performed to determine statistical differences between the sensor groups with additional one-way ANOVA tests to compare different groupings.
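The predictive relabeling can be sketched as shifting the binary ground-truth transition earlier by a multiple of the 20 ms increment; the `labels` array here is a hypothetical per-window ground truth:

```python
import numpy as np

INC_MS = 20  # window increment size from the paper

def shift_labels_earlier(labels, shift_ms):
    """Move the 0->1 transition of a binary movement label earlier in
    time so the model learns to *predict* movement start. shift_ms
    must be a multiple of the 20 ms window increment."""
    assert shift_ms % INC_MS == 0
    n = shift_ms // INC_MS
    shifted = labels.copy()
    onset = int(np.argmax(labels))      # index of first 1 (movement start)
    shifted[max(0, onset - n):onset] = 1
    return shifted

# Example: windows at 20 ms increments; movement begins at window 50.
labels = np.zeros(100, dtype=int)
labels[50:] = 1
labels_100ms = shift_labels_earlier(labels, 100)  # onset moved 5 windows earlier
```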
2.3.2 Direction Model: Dimensionality Reduction, Estimation Over Time, and Estimation Per Direction.
Once the absolute kinematic start was reached, the direction model was activated. This model was formulated as a supervised, multiclass classification trained on the features in Fig. 4, with labels assigned as the arrow displayed on the television representing the intended direction of movement. Each direction represented a class. The primary performance metric for this model was the mean absolute error (MAE) of the estimated direction in degrees.
Forward feature selection was executed to determine the optimal number of features for best performance and its contributing sensors. The results of the direction model were cross validated (k = 5) during dimensionality reduction and hyperparameter optimization. The hyperparameters tuned were the learning rate, maximum allowable tree depth, and minimum gain to split a tree node.
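A sketch of this tuning step using scikit-learn's grid search with k = 5 cross validation; the grid values and data are placeholders, while the three tuned hyperparameters match those named above (in xgboost's API, the minimum gain to split a node is `gamma`):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Placeholder direction-model data: 27 features per window, 8 classes.
X_dir = np.random.rand(400, 27)
y_dir = np.random.randint(0, 8, 400)

param_grid = {
    "learning_rate": [0.05, 0.1, 0.3],  # placeholder grid values
    "max_depth": [3, 4, 6],             # maximum allowable tree depth
    "gamma": [0.0, 0.1, 1.0],           # minimum gain to split a tree node
}
search = GridSearchCV(XGBClassifier(objective="multi:softmax"),
                      param_grid, cv=5, scoring="accuracy")
search.fit(X_dir, y_dir)
print(search.best_params_)
```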
The performance of the direction model was evaluated, with all directions compounded, by the average estimation errors of various sensor groups and their combinations. A one-way ANOVA with a Bonferroni correction (α = 0.05) for pairwise comparison was conducted to distinguish which group of sensors significantly reduced the estimation error. Error over time was also analyzed to determine which sensor suite could provide a precise, continued estimation as the dynamic motion evolved. Directions were then examined separately, and the optimal error for each direction was reported at the time when 95% of the stabilized estimation error was reached.
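Because the classes are directions on a circle, a natural way to compute the degree MAE uses wrap-around angular differences; the paper does not state its exact computation, so the following is one plausible sketch:

```python
import numpy as np

def angular_mae(pred_deg, true_deg):
    """Mean absolute angular error with wrap-around, so that 350 deg vs.
    0 deg counts as 10 deg of error, not 350 deg."""
    diff = np.abs(np.asarray(pred_deg) - np.asarray(true_deg)) % 360.0
    return np.mean(np.minimum(diff, 360.0 - diff))

print(angular_mae([350, 45], [0, 90]))  # -> (10 + 45) / 2 = 27.5
```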
2.3.3 Kinetic Baseline: Direction Estimation Over Time and Performance Per Direction.
After developing and optimizing the timing and direction models of the intent recognition system, the next step was to compare our architecture against the direction estimation of an analytical approach common in similar works. After preliminary testing of analytical methods available with wearable sensors (data not shown), we chose the best performing method, which was to calculate the total impulse from GRFs. The baseline was established as a kinetic response because of its advantageous performance, both spatially and temporally, unlike other sensors [39]. Therefore, the comparison of the kinetic baseline against the intent recognition system was not biased in favor of the proposed system by using poor performing baselines. The direction of travel was determined from the impulse components as θ = atan2(impulse_y, impulse_x).
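A minimal sketch of this kinetic baseline under the stated formula, assuming preprocessed horizontal GRF arrays and a hypothetical sampling rate:

```python
import numpy as np

def baseline_heading_deg(grf_x, grf_y, fs):
    """Kinetic baseline: integrate the two horizontal GRF components to
    impulse (rectangle rule) and take theta = atan2(impulse_y, impulse_x),
    returned in degrees on [0, 360)."""
    impulse_x = np.sum(grf_x) / fs  # N*s
    impulse_y = np.sum(grf_y) / fs  # N*s
    return np.degrees(np.arctan2(impulse_y, impulse_x)) % 360.0

# Example with a synthetic push toward 45 deg at a hypothetical 1 kHz rate.
fs = 1000.0
grf_x = np.full(500, 100.0)  # N
grf_y = np.full(500, 100.0)  # N
print(baseline_heading_deg(grf_x, grf_y, fs))  # -> 45.0
```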
The direction model was studied against the kinetic baseline to determine which method efficiently achieved low error of direction estimation over time and for each direction. With each direction examined separately, eight paired t-tests evaluated the statistical differences in MAE between the direction model and baseline. The reported error for the baseline was captured at the time when the direction model had reached 95% of its stabilized response to represent a precise comparison in time. Analysis was broken down further by cardinal (0 deg, 90 deg, 180 deg, and 270 deg) and diagonal (45 deg, 135 deg, 225 deg, and 315 deg) directions.
3 Results
3.1 Timing Model: Prediction and Sensor Optimization.
The timing model’s performance in the transition phase in Fig. 5 shows an increasing classification error as predictions were pushed further back before absolute kinematic start. As earlier predictive times were examined, there was an increasing spread in the times at which errors occurred and a linear increase in the total classification error of the entire motion. Classification during the transition window also worsened as earlier times were examined. At 300 ms prior, the transition classification error constituted the majority of the total classification error.

The transition phase was examined for estimation (0 ms prior) and predictions (60 ms prior, 100 ms prior, 200 ms prior, and 300 ms prior) of the start of threat-evasive movement, or absolute kinematic start. Steady-state error for the estimation system was extremely low; thus, the primary differentiator was error during the specified transition window. Data zoomed to transition window.

The timing model’s forward feature selection at 100 ms prior indicated that the critical features to obtain a total error of 4.8% were vertical GRFs (left and right limbs) and IMU Gyro (left limb). The sensor group breakdown of the timing model in Fig. 6, which was trained at 100 ms prior, illustrated statistical differences in all five standalone sensor groups (p < 0.05). Additional one-way ANOVA tests demonstrated that all kinetic, kinematic, and EMG as well as shear kinetic, kinematic, and EMG had statistical differences, meaning these kinetic groups performed better in error reduction than the EMG and kinematic sensor groups (p < 0.05). However, vertical kinetic, EMG, and kinematic (p = 0.08) as well as only the three kinetic groups (p = 0.93) demonstrated no statistical difference in error reduction. These findings indicate a statistically superior performance of the shear kinetic sensor group both standalone and in combination with the vertical kinetic sensor group.

The timing model’s classification performance was examined by sensor groups at 100 ms prior to absolute kinematic start. Classification errors were averaged over six subjects, and error bars represent ±1 standard error of the mean (SEM).
3.2 Direction Model: Dimensionality Reduction.
The direction model’s forward feature selection demonstrated that the estimation error quickly decreased as features were individually added to the model, illustrated in Fig. 7. The optimal feature set was found when the added features no longer had any significant reduction in error, which corresponded to 13.8 deg MAE. The optimal features were segmented by sensor group to demonstrate the trend of the large contribution of kinetic data. From the set of features that produced the optimal error listed in Fig. 8, shear components of the kinetic data were common earlier in the set, meaning they greatly assisted in the error reduction. In addition, out of the seven muscles per limb examined, only four of the muscles were selected in this set: biceps femoris, adductor magnus, tibialis anterior, and gastrocnemius lateralis. The kinematic contribution in this optimized set included all three IMUs.

Forward feature selection of the direction model was performed, and an optimal error was found at n = 27 features. The breakdown of this feature set is also illustrated.
3.3 Direction Model: Estimation Breakdown and Performance Over Time.
As combinations of the sensor groups in Fig. 9 were examined with the post hoc pairwise comparison, EMG, kinematic, and vertical kinetic groups had worse average degree errors than the kinetic baseline. The grouping of two or more sensors as well as the shear sensors alone produced a lower error than the baseline. The vertical kinetic group showed significant improvement in its estimation error as either kinematic or shear kinetic group was added to form groups of two (vertical kinetic + kinematic: p = 0.02; all kinetic: p < 0.0001), but not when added with EMG (vertical kinetic + EMG: p = 1.0). As groups of three were examined against these groups of two, there was no significant improvement in any of these additions (p > 0.05), except when shear kinetic was included with vertical kinetic and EMG (all kinetic + EMG: p = 0.002).

For all directions compounded, the intent recognition system’s direction model was trained by sensor group and their combinations. Average MAE was reported for each sensor group combination and the kinetic baseline. MAE was averaged over six subjects and error bars represent ±1 SEM.
The addition of shear kinetic to any sensor group to form a group of two had significant reduction in error (p < 0.05 for all combinations of two). However, after shear kinetic was included to form a group of two, the addition of any other sensors to form a group of three or four had no statistically significant improvement in estimation error based on pairwise comparison. This indicated that shear kinetic sensors had optimal performance and greatly reduced estimation error as standalone and in combination with other sensor groups.
As subjects continued in their path, MAE over time was computed offline, as shown in Fig. 10, and demonstrated significantly better temporal performance of the direction model trained on the optimized feature set over the EMG sensor group, kinematic sensor group, and kinetic baseline. The direction model trained on the all kinetic sensor group had a similar performance to the model trained on the optimized feature set.

The baseline’s direction estimation was compared to the intent recognition system’s estimation (XGBoost) over time by sensor group and optimized feature set. Annotated in the legend are these errors averaged across the time shown. The baseline had greater error than the other direction models’ sensor breakdowns in the first 0.3 s of movement. Stabilized errors are shown as dotted lines for the respective analyses.

The kinetic baseline produced a stabilized response 500 ms into the evolution of the dynamic movement with an MAE of 15.5 deg. The optimized feature set direction model had a stabilized response of 6.9 deg at 360 ms. The EMG direction model and kinematic direction model stabilized later in the motion, though both reached a local minimum of error within the first 300 ms of movement. This minimum of MAE consistently coincided with the second toe-off of the stance leg.
3.4 Baseline Versus Direction Model Per Direction.
At 95% of the intent recognition system’s stabilized error shown in Fig. 11, the direction model consistently obtained error <15 deg for each direction, while the baseline varied between 5 deg and 45 deg. The intent recognition system had no significant improvement over the baseline for three of the four cardinal directions (90 deg: p = 0.37, 180 deg: p = 0.05, 270 deg: p = 0.13) but did improve significantly at 0 deg (p = 0.024). Moreover, the intent recognition system performed significantly better than the baseline in three of the four diagonal directions (45 deg: p = 0.01, 225 deg: p = 0.003, 315 deg: p = 0.001), the exception being 135 deg (p = 0.1). This indicated that the intent recognition system was statistically better than the baseline in most diagonal directions and can match or outperform the baseline in all cardinal directions.

For each direction analyzed, the intent recognition system’s error at 95% of the steady-state error was determined and averaged over six subjects. At the time of the intent recognition’s annotated error, the baseline’s error was found to illustrate the respective values according to an early estimation. The intent recognition’s direction estimation (XGBoost) results presented used all available sensors.

4 Discussion
We successfully developed a hierarchical intent recognition system that (1) predicted movement start intent at 100 ms prior to absolute kinematic start with a total classification error of 4.8% and (2) estimated movement direction within 8.8 deg of the correct vector with the optimal sensor suite of all kinetic (shear kinetic and vertical kinetic). In addition, the intent recognition system had significantly lower estimation error, consistent performance in all directions, and reached a steady-state direction earlier than the baseline.
4.1 Predictive Timing for Intent Recognition System.
The timing model’s estimation of the absolute kinematic start performed well, with low total classification error (<3%) and good temporal performance; most of the error came from the −50 ms to 50 ms range surrounding the switch between rest and movement. Comparing this estimation to the predictive times, the percent increase in transition classification error was 17% for 60 ms prior and 40% for 100 ms prior, rising to 162% for 200 ms prior and 328% for 300 ms prior. In addition, 200 ms prior and 300 ms prior had a larger spread in time where error occurred in this transition phase, while 60 ms prior and 100 ms prior did not. Therefore, as predictions are pushed further before the start of movement, the two surrounding phases are disrupted, increasing the overall error of estimating the time of movement start. At 200 ms prior and 300 ms prior, the transition classification error constitutes a substantial portion of the total classification error, resulting in an inaccurate estimation of the switch. At 60 ms prior, the other phases are not as affected, and errors remain low. The findings at 100 ms prior are acceptable as well, and this predictive time looks further ahead of absolute kinematic start. These offline findings indicate that 100 ms prior is the optimal choice for a forward prediction window, as it maintains low transition and total classification errors while still predicting movement start without significantly affecting the other phases. The physical components of a wearable system can begin starting up at this time, earlier than 0 ms prior. If the system is started too early, at 200 ms or 300 ms prior, there is a great loss in resolution; if it is started at 60 ms prior, it forfeits lead time that 100 ms prior provides at comparable performance.
Compared to recent work on human motion estimation, our work provides a quicker, more general model with predictive capabilities. Other studies have examined different classification techniques for recognizing the start of movement. In Lee’s 2015 work, the motion classification of movement had good estimation for all but the deep learning approach [40]. Our XGBoost estimation method, as well as our prediction capabilities, performed better than their examined algorithms during movement estimation, except for the supervised MTRNN. However, XGBoost has lower computation time and is more flexible in its design. In addition, it enables us to predict, rather than merely estimate, when movement is about to occur, which is a new approach in the field. While some works rely heavily on EMG for predicting earlier times for dynamic motions, our findings indicate the all kinetic sensor group is the optimal choice to monitor for movement start with better accuracy [18,41]. If needed, EMG and kinematic sensors can be used to determine the predictive time for our problem, but at the cost of roughly 3% additional error (Fig. 6). With absolute kinematic start anticipated through this offline analysis, we can track dynamic motions over time as they evolve, with minimal delay relative to the actual movements.
4.2 Intent Recognition System’s Direction Estimation and Sensor Contribution.
Once the user has reached the absolute kinematic start, we proceed through the hierarchical architecture to direction estimation. The direction estimation results provide the necessary temporal information, set of features, and sensor contributions to identify the most advantageous sensor suite for accurate, low-error direction estimation of an operator. Kinetic sensors demonstrate a greater contribution to precise direction estimation than kinematic and EMG sensors, as seen in their prevalence in the optimized feature set (Fig. 7), their role in greatly reducing error when added to sensor combinations (Fig. 9), and their significantly lower error over time (Fig. 10), averaging 9 deg.
Our findings show that the inclusion of shear kinetic sensors drives the boost in performance of direction estimation. Shear kinetic sensors can be combined with vertical kinetic sensors to produce the lowest, statistically significant direction estimation error of 8.8 deg, which is 2.4% of the 360 deg range of potential directions. If shear kinetic sensors are readily available, EMG and kinematic sensors would not be required for precise direction estimation; however, shear kinetic sensors are difficult to engineer, and very few wearable sensors and devices have them fully integrated to date. If these sensors are not accessible, our results indicate that combining sensor groups can reduce both averaged and temporal direction estimation error. Specifically, the vertical kinetic and kinematic sensor groups used together produce a direction estimation error of 15.5 deg, or 4.3% of the 360 deg range (Fig. 9). This combination shows no statistical difference when EMG is added to the set.
Previous work has relied on kinematic data for orientation estimation. Aminian’s work on lower limb orientation relied solely on inertial sensors and had a 1.7 deg error in thigh orientation estimation for fast movements over a 30 deg range of joint movement, which is 5.7% error [42]. Our work obtains half that relative error with three-axis kinetic data over a much wider range of movement.
An interesting minimization of error was found to correspond to a specific gait event during threat evasion (Fig. 10). The minimization occurred earliest for the EMG sensor group as muscle activity foreshadows mechanical action. This trackable event of the second toe-off from the stance leg is a possible estimation technique for real-time implementation of this system. Previous literature demonstrates that online gait phase detection is possible with one or more sensors [15].
4.3 Performance Comparison of Baseline and Intent Recognition.
The kinetic baseline method was compared to our architecture’s direction estimation. The kinetic baseline took about 140 ms longer to reach a steady-state response than the direction model with the optimized feature set (Fig. 10). Given the need for rapid, accurate estimation, the baseline method would not suffice to provide correct direction estimation in a timely manner: it had greater error in the first 300 ms of threat evasion and stabilized later, and at a higher steady-state error, than the optimized direction model. Our architecture provides fast, accurate estimates of agile motion much closer to the actual start of threat evasion. Similar to our kinetic baseline, other works have focused on analytical approaches, such as orientation sensors, to estimate direction. There are drawbacks to using an analytical method to inform direction: one study showed increased difficulty estimating negative angles and an error of 18.1 deg in the yaw angle [43]. By fusing sensor groups together, we make our estimation faster and more accurate.
The intent recognition system had a more uniform performance for each direction and was more versatile in its ability to estimate a variety of directions than the baseline. The baseline could be used for moving in the cardinal directions, but it had poorer performance than the intent recognition system for the diagonal directions. Other works have focused only on one plane of travel and inertial components, since these have been shown to be among the best strategies in the literature [9,44]. We have developed a method that is more robust than the analytical approach and allows a wider range of possible directions to be examined with precise direction estimation of dynamic motions.
4.4 Limitations.
The error estimation and performance characterized in this study are reflective of noise and inaccuracies in the sensors used. We chose to include sensors that could be implemented in real-world applications to add to the robustness of translating this system to an online analysis. A few of the sensors were standard wearable sensors that could be embedded easily in wearable applications, such as EMG and IMUs. Other sensor readings were laboratory grade, such as GRFs and some kinematic data, and would likely be less precise in a wearable application.
Although we have shown that 3D GRF (shear and vertical) sensors greatly improve the direction model’s performance in this study, our offline analysis relies on shear sensing from high-precision, six-degrees-of-freedom force plates. Translating such measurements to a wearable system is not easy, but force-sensing insoles may be a viable future option. Researchers continue to develop insoles capable of measuring 3D GRF and have demonstrated promising MAEs [45,46]. Sensor errors will reduce classification performance, and their characteristics (noise and bias) will have to be rigorously analyzed either through perturbation analyses or new experiments.
5 Conclusions
This study contributed the design of an intent recognition system that quantified and identified the candidate sensors best suited for dynamic, rapid motions, hierarchically predicted motion intent offline, and reduced direction classification error once the direction-of-travel estimation pipeline was activated. The study indicated that our intent recognition system can provide critical information in a precise and timely manner to interpret user intention during threat-evasive motions. These findings can inform the design of a wearable device that provides physical assistance for users dynamically evading oncoming threats in unstructured environments, protecting human safety.
Acknowledgment
This research is supported in part by the National Science Foundation (NSF), National Robotics Initiative (NRI) 1830498, and NSF Traineeship Program (NRT) 1545287.
Conflict of Interest
There are no conflicts of interest.