Accurate prediction of muscle and joint contact forces during human movement could improve treatment planning for disorders such as osteoarthritis, stroke, Parkinson's disease, and cerebral palsy. Recent studies suggest that muscle synergies, a low-dimensional representation of a large set of muscle electromyographic (EMG) signals (henceforth called “muscle excitations”), may reduce the redundancy of muscle excitation solutions predicted by optimization methods. This study explores the feasibility of using muscle synergy information extracted from eight muscle EMG signals (henceforth called “included” muscle excitations) to accurately construct muscle excitations from up to 16 additional EMG signals (henceforth called “excluded” muscle excitations). Using treadmill walking data collected at multiple speeds from two subjects (one healthy, one poststroke), we performed muscle synergy analysis on all possible subsets of eight included muscle excitations and evaluated how well the calculated time-varying synergy excitations could construct the remaining excluded muscle excitations (henceforth called “synergy extrapolation”). We found that some, but not all, eight-muscle subsets yielded synergy excitations that achieved >90% extrapolation variance accounted for (VAF). Using the top 10% of subsets, we developed muscle selection heuristics to identify included muscle combinations whose synergy excitations achieved high extrapolation accuracy. For 3, 4, and 5 synergies, these heuristics yielded extrapolation VAF values approximately 5% lower than corresponding reconstruction VAF values for each associated eight-muscle subset. These results suggest that synergy excitations obtained from experimentally measured muscle excitations can accurately construct unmeasured muscle excitations, which could help limit muscle excitations predicted by muscle force optimizations.

## Introduction

The ability to determine muscle and joint contact forces reliably during walking could be useful for developing improved treatments for disorders such as osteoarthritis, stroke, Parkinson's disease, and cerebral palsy [1]. However, experimental approaches do not currently exist for measuring muscle and joint contact forces in vivo under normal conditions. Similarly, computational approaches for predicting these quantities are limited by the redundancy inherent in movement coordination (i.e., many more muscles actuating the skeleton than degrees-of-freedom in the skeleton) [2].

To address the indeterminacy issue, computational studies generally use either optimization or electromyographic (EMG) driven methods to estimate muscle and joint contact forces in musculoskeletal models. Optimization methods include static optimization, which is an inverse dynamic approach that utilizes prescribed kinematics and net joint moments to find muscle excitations, and dynamic optimization, which is a forward dynamic approach that solves for kinematics, net joint moments, and muscle excitations simultaneously [1,3–5]. While these methods can provide unique muscle and joint contact force solutions, they depend on subjective cost functions that make assumptions about what the body is maximizing or minimizing. Furthermore, predicted muscle excitations frequently do not agree well with EMG measurements. In contrast, EMG-driven methods use processed experimental EMG data to directly control muscle excitations in a musculoskeletal model [6–9]. Similar to static optimization, these methods require prescribed kinematics and net joint moments, but unlike optimization methods in general, they avoid the need for an assumed cost function. However, variations in the quality of EMG data, especially for muscles not accessible by surface electrodes, limit these methods [10,11]. Consequently, even EMG-driven methods often use optimization methods to calculate unmeasured muscle excitations, which may be inconsistent with the actual muscle excitations [12].

Muscle synergy analysis has been proposed as a way to reduce the indeterminacy of muscle excitations predicted by optimization methods [13,14]. Muscle synergy analysis is a computational approach that uses non-negative matrix factorization (NMF) to decompose a large number of experimentally measured EMG signals into a smaller number of independent time-varying “synergy excitations,” along with weights that describe how each synergy excitation contributes to each measured muscle excitation [15–17]. Typically, only three to five synergy excitations are needed to account for over 90% of the variability in up to 32 muscle excitations measured experimentally during walking [18–20]. To date, muscle synergy analysis has been used primarily in descriptive studies to analyze experimental EMG data [18,19]. For example, muscle synergy analysis has been applied to EMG data collected from individuals poststroke or with Parkinson's disease during walking revealing that, compared to healthy individuals, a reduced number of synergy excitations is often required to reach 90% variance accounted for (VAF) [20,21]. Recent studies have also incorporated muscle synergy concepts into muscle force optimizations [22–28]. One study improved prediction of knee contact forces by replacing a high-dimensional set of 44 independent muscle excitations with a lower-dimensional set of 5 synergy excitations [27]. The synergy excitations were extracted from 13 experimental EMG signals and were used to construct 44 interdependent muscle excitations. However, no evidence exists that synergy information obtained from a small number of experimentally measured EMG signals can accurately construct muscle excitations associated with unmeasured EMG signals.

This study explored the feasibility of using synergy excitations extracted from eight surface EMG signals collected during walking to accurately construct muscle excitations derived from surface and fine-wire EMG signals collected simultaneously from up to 16 other muscles. EMG data collected from one healthy subject (24 channels from one leg) and one subject poststroke (16 channels from each leg) during treadmill walking were subdivided into multiple subsets of eight commonly collected surface EMG signals and all remaining surface and fine-wire EMG signals. Two novel measures were investigated to quantify the ability of synergy excitations extracted from each eight-muscle subset to construct the remaining muscle excitations not included in the subset, a process we call synergy extrapolation. The most appropriate measure was used to quantify how the choice of muscles in each eight-muscle subset affected construction accuracy for the remaining muscle excitations. Thus, the two main questions we addressed were: (1) Is synergy extrapolation using surface EMG data theoretically possible for walking, and if so, (2) Which muscles accessible by surface EMG provide the best information for synergy extrapolation?

## Methods

### Experimental Data Collection and Reduction.

Experimental EMG data were collected from one healthy subject and one subject poststroke while walking at different speeds on an instrumented treadmill. Institutional review board approval and subject written informed consent were obtained prior to participation. Eleven treadmill walking trials ranging from 0.4 to 1.4 m/s were collected from the healthy subject, and seven trials ranging from 0.2 to 0.8 m/s were collected from the stroke subject. From each trial, ten consecutive gait cycles were chosen for analysis. A total of 24 EMG signals (17 surface, seven fine-wire) were collected from lower extremity muscles of the healthy subject's dominant leg, while a total of 16 EMG signals (12 surface, four fine-wire) were collected from each leg of the stroke subject (Table 1). For the healthy subject, two recordings of rectus femoris activity were taken to fully capture its biarticular action. Whereas the SENIAM project (Surface ElectroMyoGraphy for the Non-Invasive Assessment of Muscles) guidelines [29] state that the rectus femoris electrode should be placed 50% of the way between the superior aspect of the patella and the anterior superior iliac spine, we chose electrode placements located more proximally and distally compared to this location to capture both hip flexion and knee extension components, respectively. EMG signals were band-pass filtered (zero-lag fourth-order Butterworth) from 10 to 400 Hz for surface EMG and 20 to 400 Hz for fine-wire EMG to account for additional soft tissue motion artifacts. All EMG data were then demeaned, full-wave rectified, and low-pass filtered (zero-lag fourth-order Butterworth) at 7/(cycle time) Hz (6 Hz for a typical cycle time of approximately 1.2 s) to account for speed-related changes in motion artifacts [30]. The resulting processed EMG signals defined experimental muscle excitation profiles.
Muscle excitations in EMG-driven studies are typically normalized to the excitation level recorded during a maximum voluntary contraction [6,8,31,32]. Because maximum voluntary contraction trials were not performed for either subject in this study, we instead normalized each muscle excitation by its maximum value observed across all walking trials [9].
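The processing chain described above can be sketched in Python with NumPy and SciPy. This is a minimal sketch under stated assumptions, not the authors' code: the function names `emg_envelope` and `normalize_by_peak` are hypothetical, the sampling rate `fs` is not stated in the text (it must exceed 800 Hz for the 400 Hz band edge to be valid), and the filter order passed to `butter` is one possible reading of "zero-lag fourth-order".

```python
import numpy as np
from scipy.signal import butter, filtfilt


def emg_envelope(raw, fs, cycle_time, fine_wire=False):
    """Turn one raw EMG channel into a smoothed excitation envelope."""
    low_cut = 20.0 if fine_wire else 10.0  # Hz, band edges from the text
    # ASSUMPTION: order-2 design run forward and backward by filtfilt,
    # which removes phase lag and doubles the effective order.
    b, a = butter(2, [low_cut, 400.0], btype="bandpass", fs=fs)
    x = filtfilt(b, a, raw)
    x = np.abs(x - np.mean(x))  # demean, then full-wave rectify
    # Low-pass cutoff scales with gait speed: 7 / (cycle time) Hz.
    b, a = butter(2, 7.0 / cycle_time, btype="lowpass", fs=fs)
    return filtfilt(b, a, x)


def normalize_by_peak(envelopes_by_trial):
    """Scale one muscle's envelopes by its peak over all walking trials."""
    peak = max(np.max(env) for env in envelopes_by_trial)
    return [env / peak for env in envelopes_by_trial]
```

With a cycle time of 1.2 s the low-pass cutoff evaluates to about 5.8 Hz, which matches the approximate 6 Hz value quoted above.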

### Synergy Extrapolation Approach.

In preparation for performing thousands of muscle synergy analyses, we grouped muscle excitations from both subjects into multiple pairs of included and excluded subsets. Muscle excitations placed in an included subset were used in an associated synergy analysis while those placed in an excluded subset were omitted from that synergy analysis. Each included subset was composed of muscle excitations from eight muscles with surface EMG data while the corresponding excluded subset was composed of the remaining 16 (dominant leg of healthy subject) or eight (each leg of stroke subject) muscle excitations from muscles with either surface or fine-wire EMG data. The total number of included subsets for each subject (24,310 for the healthy subject and 495 for each leg of the stroke subject) was determined by the number of possible combinations of eight muscles with surface EMG data. Eight muscle excitations were used in each included subset since this number is consistent with the number of experimental EMG signals commonly used in synergy analysis studies [20,25,33]. In practice, included muscle excitations would be those that can be measured easily using surface electrodes, while excluded muscle excitations would be those that cannot be measured easily due to limitations in number of available EMG channels or the need for fine-wire electrodes.
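The subset counts quoted above follow directly from choosing eight muscles out of the available surface channels. The sketch below checks those counts and enumerates included and excluded pairs; the channel names are placeholders and `included_excluded_pairs` is a hypothetical helper, not part of the original study.

```python
from itertools import combinations
from math import comb


def included_excluded_pairs(surface, fine_wire, n_included=8):
    """Yield (included, excluded) muscle-name tuples for every subset."""
    all_muscles = list(surface) + list(fine_wire)
    for included in combinations(surface, n_included):
        chosen = set(included)
        # Everything not chosen (surface or fine-wire) becomes excluded.
        excluded = tuple(m for m in all_muscles if m not in chosen)
        yield included, excluded


# Placeholder channel names; the counts match the healthy subject.
surface = [f"surf{k}" for k in range(17)]
fine_wire = [f"wire{k}" for k in range(7)]

n_healthy = comb(17, 8)  # 24310 included subsets
n_stroke = comb(12, 8)   # 495 included subsets per leg
first_included, first_excluded = next(included_excluded_pairs(surface, fine_wire))
```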

For each included subset, muscle synergy analysis used NMF to solve the following optimization problem:

$$\min_{W \ge 0,\; H_i \ge 0} \Vert A_i - W H_i \Vert_F \tag{1}$$

where *W* is an *n* time points × *p* synergies matrix containing synergy excitations (presumably representing both included and excluded muscle excitations) in columns, $H_i$ is a *p* synergies × $m_i$ muscles matrix containing the synergy vectors associated with the included muscle excitations in rows, $A_i$ is an *n* time points × $m_i$ muscles matrix containing the included (subscript *i*) muscle excitations in columns, and $\Vert \cdot \Vert_F$ represents the Frobenius norm [34]. Synergy vectors were allowed to vary between gait cycles, and therefore synergy analysis using Eq. (1) was performed on each gait cycle separately. Prior to synergy analysis, we normalized each muscle excitation profile to have unit variance to allow the variations in each muscle to be considered with equal importance in the NMF solution approach [15]. Three to five synergies were extracted from each included muscle excitation subset, which is typical of other studies involving synergy analysis [18–20]. The resulting synergy excitations were normalized by a scaling factor to have a maximum value of one, consistent with the maximum value of muscle excitations used in muscle force optimizations. Corresponding synergy vectors were multiplied by the same scaling factor to keep reconstructed included muscle excitations unaltered.

Several algorithms are available to factor matrix $A_i$ into *W* and $H_i$. In our study, we used a combination of the default algorithms included with the MATLAB nnmf function (The MathWorks Inc., Natick, MA). The multiplicative update algorithm was used to generate initial guesses for *W* and $H_i$, and these initial guesses were then used in the alternating least squares algorithm, which is faster and allows for more evaluations of the objective function. To avoid finding local minima, we performed each synergy analysis ten times using ten different initial guesses for the multiplicative update algorithm, taking the solution with lowest final cost function value as our best estimate of the global minimum. For a detailed explanation of both algorithms, refer to Berry et al. [34].
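The restart-and-normalize procedure can be illustrated with a small NumPy sketch. It does not reproduce the MATLAB nnmf function: plain multiplicative updates stand in for the combined multiplicative update and alternating least squares scheme, and the names `nmf_multiplicative` and `best_of_restarts`, the iteration count, and the random seeds are all assumptions made for illustration.

```python
import numpy as np


def nmf_multiplicative(A, p, n_iter=500, seed=0):
    """Factor non-negative A (n x m) into W (n x p) and H (p x m)."""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    W = rng.random((n, p)) + 1e-3
    H = rng.random((p, m)) + 1e-3
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H, np.linalg.norm(A - W @ H, "fro")


def best_of_restarts(A, p, n_restarts=10):
    """Keep the lowest-cost solution over several random initial guesses."""
    fits = [nmf_multiplicative(A, p, seed=s) for s in range(n_restarts)]
    W, H, cost = min(fits, key=lambda fit: fit[2])
    # Scale each synergy excitation to a peak of one and apply the inverse
    # scaling to its synergy vector so the product W @ H is unchanged.
    scale = W.max(axis=0)
    return W / scale, H * scale[:, None], cost
```

Because the scaling applied to each column of `W` is undone in the matching row of `H`, the reconstructed included muscle excitations are the same before and after normalization.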

Next, we evaluated how well the synergy excitations *W* obtained from each included muscle excitation subset $A_i$ were able to construct each corresponding excluded muscle excitation subset $A_e$, a process we call synergy extrapolation. Synergy extrapolation assumes that the synergy excitations obtained from the included muscle excitations apply equally well to the excluded muscle excitations, though the additional synergy vector weights $H_e$ for the excluded muscle excitations remain unknown. How to find these unknown weights in practice remains a topic of ongoing research. For the present study, our goal was to determine whether synergy extrapolation could work in theory by taking advantage of our knowledge of the excluded muscle excitations in each included–excluded subset pair. Specifically, given *W* as found using Eq. (1) and $A_e$ from the same subset pair, we calculated $H_e$ using linear least-squares regression, as indicated by the following matrix equations:

$$W H_e = A_e \quad \Rightarrow \quad H_e = (W^T W)^{-1} W^T A_e \tag{2}$$

where *W* is the previously found *n* time points × *p* synergies matrix containing synergy excitations in columns, $H_e$ is a *p* synergies × $m_e$ muscles matrix containing the synergy vectors associated with the excluded muscle excitations, and $A_e$ is an *n* time points × $m_e$ muscles matrix containing the excluded (subscript *e*) muscle excitations in columns. In our study, we solved Eq. (2) using the MATLAB backslash operator, which did not force the recovered synergy vector weights to be positive. Once $H_e$ was calculated, we reconstructed the corresponding excluded muscle excitations using matrix multiplication

$$\tilde{A}_e = W H_e \tag{3}$$

where $\tilde{A}_e$ is the synergy-based approximation of $A_e$. We performed this synergy extrapolation approach for every included–excluded muscle subset pair, thereby providing a comprehensive set of extrapolation solutions. Variance accounted for, a measure of goodness of fit that accounts for magnitude and shape, was utilized to evaluate all extrapolation solutions for excluded muscle excitations [15]. Each available data set (healthy subject leg, stroke subject nonparetic, and paretic leg) was analyzed separately. An overview of our entire computational approach is provided in Fig. 1.
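The extrapolation step amounts to an unconstrained least-squares fit followed by a matrix product, which can be sketched in NumPy as follows. The function names are hypothetical, `numpy.linalg.lstsq` stands in for the MATLAB backslash operator, and the VAF formula shown (one minus the ratio of uncentered sums of squares) is an assumed common definition, since the text cites [15] rather than stating the formula.

```python
import numpy as np


def extrapolate(W, A_e):
    """Least-squares synergy vector weights and constructed excitations."""
    # Unconstrained fit, so recovered weights in H_e may be negative.
    H_e, *_ = np.linalg.lstsq(W, A_e, rcond=None)
    return H_e, W @ H_e


def vaf_per_muscle(A, A_hat):
    """Percent VAF for each column (muscle).

    ASSUMPTION: VAF = 100 * (1 - SSE / sum of squared original values).
    """
    sse = np.sum((A - A_hat) ** 2, axis=0)
    sst = np.sum(A ** 2, axis=0)
    return 100.0 * (1.0 - sse / sst)


# Synthetic check: excitations built exactly from W are recovered exactly.
rng = np.random.default_rng(0)
W = rng.random((101, 4))        # 101 time points, 4 synergy excitations
H_true = rng.random((4, 16))    # 16 excluded muscles
A_e = W @ H_true
H_e, A_tilde = extrapolate(W, A_e)
vaf = vaf_per_muscle(A_e, A_tilde)
```

In the study the excluded excitations are measured rather than synthetic, so the fit is not exact and the resulting VAF values fall below 100%.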

### Synergy Extrapolation Evaluation.

Variance accounted for (VAF) values calculated for reconstructed excluded muscle excitations through synergy extrapolation were averaged across excluded muscles and gait cycles to create a single extrapolation VAF for each excluded muscle subset. We sorted included muscle subsets based on the extrapolation VAF of their corresponding excluded muscle subset. The 10% of included muscle subsets that resulted in excluded muscle subsets with the highest extrapolation VAF were identified as the top 10% of subsets. Extrapolation ability was measured in two different ways: (1) the number of subsets that exceeded 90% extrapolation VAF, and (2) the mean extrapolation VAF for the top 10% of subsets at each walking speed. We used these two measures to define “acceptable” muscle subsets, which were then compared to determine which included muscle subsets produced the most accurate extrapolation results.
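Both measures can be computed from an array of per-muscle VAF values for one walking trial. The sketch below assumes a hypothetical array layout (subsets × gait cycles × excluded muscles) and a hypothetical function name; it is meant only to make the two definitions concrete.

```python
import numpy as np


def summarize_subsets(vaf, cutoff=90.0, top_fraction=0.10):
    """Summarize extrapolation ability for one walking trial.

    vaf has shape (n_subsets, n_cycles, n_excluded_muscles), in percent.
    """
    # One extrapolation VAF per subset: mean over cycles and muscles.
    subset_vaf = vaf.mean(axis=(1, 2))
    n_top = max(1, int(round(top_fraction * subset_vaf.size)))
    order = np.argsort(subset_vaf)[::-1]  # best subset first
    top = order[:n_top]
    return {
        "n_above_cutoff": int(np.sum(subset_vaf > cutoff)),
        "top_indices": top,
        "mean_top_vaf": float(subset_vaf[top].mean()),
    }
```

The first entry depends on the chosen cutoff, whereas the mean over the top 10% does not, which is the distinction the two measures are designed to expose.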

Using the measure that best described extrapolation ability, we developed four different heuristics for selecting included muscle subsets likely to extrapolate with the highest accuracy possible. Since these heuristics were based on the results of the preceding synergy analysis, we describe the four heuristics in detail within the Results section and describe here only the rationale for their development. To define the number of synergies for developing heuristics, we determined how many synergies were needed such that extrapolation VAF exceeded 90% on average for all three data sets. For each data set, we calculated the frequency with which each included muscle appeared in an acceptable subset across all trials. Next, we developed heuristics based on overall muscle frequency and importance of having muscles in different functional groups. Included muscle subsets selected from these heuristics were evaluated based on their ability to accurately construct excluded muscle excitations using the presented synergy extrapolation method, noting that under normal conditions, the excluded muscle excitations would not be known. In addition, the individual reconstruction VAF values for included muscle excitations chosen by each heuristic were calculated and averaged together for three to five extracted synergy excitations. Reconstruction VAF results for included muscles were compared to extrapolation VAF results for excluded muscles to determine the number of synergy excitations required to achieve a desired extrapolation accuracy.
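The muscle frequency calculation can be sketched with the standard library. The helper name `muscle_frequency` is hypothetical, and the three-muscle subsets in the example are invented toy data (real included subsets contain eight muscles).

```python
from collections import Counter


def muscle_frequency(top_subsets_by_trial):
    """Fraction of acceptable subsets, pooled over trials, per muscle."""
    counts = Counter()
    total = 0
    for subsets in top_subsets_by_trial:
        for subset in subsets:
            counts.update(set(subset))  # count each muscle once per subset
            total += 1
    return {muscle: n / total for muscle, n in counts.most_common()}


# Toy data only: two trials with a handful of small "acceptable" subsets.
trial_a = [("Sol", "TibAnt", "GlutMax"), ("Sol", "VasLat", "TibAnt")]
trial_b = [("Sol", "GasMed", "VasLat")]
freq = muscle_frequency([trial_a, trial_b])
```

Muscles with the highest frequencies are natural candidates for an included subset, subject to also covering the different functional groups.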

## Results

Overall, synergy excitations extracted from subsets of eight included muscle excitations derived from surface EMG data were able to construct the remaining excluded muscle excitations derived from surface and fine-wire EMG data with high accuracy, as demonstrated by both measures of extrapolation ability. The number of eight-muscle subsets able to achieve 90% extrapolation VAF increased as the number of synergies was increased (Table 2). Acceptable combinations were found for most walking speeds when using four and five synergy excitations, while almost no acceptable combinations were found when using three synergy excitations. While variations were also observed across walking speeds, these variations were found to be sensitive to small reductions in the VAF cutoff value (Fig. 2). For example, for the paretic leg of the stroke subject analyzed using five synergies, three walking speeds had virtually no combinations that achieved the 90% VAF cutoff value. However, when the VAF cutoff was reduced to 85%, more than 200 muscle combinations met the VAF requirement for all walking speeds. Furthermore, reducing the VAF cutoff to 80% allowed over 90% of muscle combinations to meet the cutoff for every trial in all three experimental data sets. Variations across speeds were also observed in the mean extrapolation VAF for the top 10% of subsets, although not as acutely (Fig. 3). While this measure is contrary to cutoffs traditionally used in synergy analysis, it eliminates the subjectivity of a 90% VAF value and may be a better indicator of extrapolation ability. Consequently, we chose the mean extrapolation VAF for top 10% of subsets as the most appropriate measure for identifying heuristics that produce high extrapolation VAF.

Muscle selection heuristics developed using muscle frequency in the top 10% of subsets across all walking trials (Table 3) led to reliable identification of eight-muscle subsets with high extrapolation accuracy (Table 4). Muscle combinations selected by all four heuristics achieved average extrapolation VAF of 86.9%, 89.7%, and 91.9% for three, four, and five synergy excitations, respectively, with only small variations observed between heuristics (Table 5). These combinations achieved average reconstruction VAF of 91.8%, 95.7%, and 97.8% for three, four, and five synergy excitations, respectively. Thus, average reconstruction VAF values were greater than extrapolation VAF values by an average of 5.5%. The combinations of eight muscles selected by heuristics represented all primary lower extremity muscle functions during walking.

## Discussion

This study addressed two primary questions: (1) Is synergy extrapolation—using synergy excitations obtained from one set of muscle excitations to construct another set of muscle excitations—theoretically possible for walking, and (2) if so, which muscles accessible by surface EMG provide the most information for synergy extrapolation? Our results clearly demonstrate that numerous subsets of eight muscle excitations derived from surface EMG data (i.e., the included muscles) during walking are able to accurately construct between eight and 16 additional muscle excitations derived from surface and fine-wire EMG data collected simultaneously (i.e., the excluded muscles), indicating that synergy extrapolation is theoretically possible for walking. However, many included muscle subsets did not yield high extrapolation VAF values, indicating that the choice of included muscles for calculating the necessary synergy excitations is important. We also developed heuristics for selecting combinations of eight commonly collected surface EMG signals that provided high extrapolation accuracy. Based on these results, synergy extrapolation may be useful for limiting predicted muscle excitations in muscle force optimizations, since the problem of finding a time-varying muscle excitation for each muscle can be reduced to the problem of finding only 3, 4, or 5 time-varying synergy excitations with associated synergy vector weights. It may also be useful for limiting predicted muscle excitations in EMG-driven models with missing EMG signals, since the problem of finding a time-varying muscle excitation for each unmeasured muscle is reduced to the problem of finding only 3, 4, or 5 synergy vector weights. Furthermore, because our results were consistent between our two subjects, our findings may be equally applicable to healthy individuals as well as individuals with neurological impairments.

While our study demonstrates that synergy extrapolation is theoretically possible, it does not provide practical information regarding how to find the unknown synergy vector weights. In real-life conditions, some muscles will have available surface (and possibly fine-wire) EMG data while many others will not. To evaluate whether or not synergy extrapolation could work on a theoretical basis, we divided our large number of available EMG signals into included and excluded categories, where the included muscles represented those from which experimental EMG could be collected—thus synergy analysis could be performed, while the excluded muscles represented those from which experimental EMG data would be more difficult to collect. We took advantage of our knowledge of excluded muscle excitations to determine how well they could be constructed using synergy excitations obtained from the included muscles. Of course, finding the unknown synergy vector weights for the excluded muscles this way would not be possible in practice. Determining how to find these unknown weights remains a topic of ongoing research.

At a minimum, by showing that synergy extrapolation can work for walking, we have demonstrated that it is reasonable to reduce the indeterminacy in muscle force optimization problems by using a small number of synergy excitations to construct all muscle excitations [27]. For example, consider a static optimization problem for walking where a 5 degrees-of-freedom leg model is controlled by 50 muscle-tendon actuators and the gait cycle is divided into 100 time points. Theoretically, we would have 5 × 100 = 500 inverse dynamic moments to be matched by 50 × 100 = 5000 unknown muscle excitation values (ignoring activation dynamics), which is underdetermined by a factor of 10. If we now model all muscle excitations using 5 synergies, we would have 5 × 100 = 500 unknown synergy excitation values and 5 × 50 = 250 unknown synergy vector weights for a total of 750 unknowns. For this situation, the problem would be underdetermined by only a factor of 1.5, making the design space much smaller and the solution much closer to unique. However, because the unknown synergy vector weights apply to all time frames, the optimization problem must now be solved over all time frames simultaneously, increasing the computational cost of the solution process significantly (e.g., Refs. [27,35]). If the unknown synergy excitations could be obtained from experimentally measured muscle excitations, they could serve as subject-specific basis functions to constrain and simplify the construction of all muscle excitations [26]. In this case, only the synergy vector weights would be unknown, and the problem formulation could theoretically become overdetermined.
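The counting argument above can be checked with a few lines of Python. The function name and dictionary keys are hypothetical; the default values are the ones used in the example (5 degrees of freedom, 50 muscles, 100 time points, 5 synergies).

```python
def redundancy(n_dof=5, n_muscles=50, n_time=100, n_synergies=5):
    """Ratio of unknowns to inverse dynamic moment equations."""
    equations = n_dof * n_time                       # 500 moments to match
    independent = n_muscles * n_time                 # 5000 excitation values
    synergy = n_synergies * n_time + n_synergies * n_muscles  # 500 + 250
    weights_only = n_synergies * n_muscles           # 250 weights
    return {
        "independent": independent / equations,      # 10.0
        "synergy": synergy / equations,              # 1.5
        "weights_only": weights_only / equations,    # 0.5
    }


ratios = redundancy()
```

A ratio below one for the weights-only case corresponds to the overdetermined formulation mentioned above, in which the synergy excitations are taken from experimental data and only the synergy vector weights remain unknown.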

While synergy-based muscle force optimizations seem promising, technical challenges remain with implementing such approaches. When synergy solutions are calculated, a constraint must be placed on the synergy excitations *W* or the synergy vectors *H* such that the product *WH* is unique. Some investigators require that the maximum value of each synergy vector be one [36,37], while others require that the maximum value of each synergy excitation be one [38]. The MATLAB nnmf algorithm normalizes the magnitude of each synergy vector to be one, which means that the amplitude of the corresponding synergy excitations can vary significantly as a function of the number of muscle excitations included in the analysis. In a muscle force optimization problem, the simplest approach is to add a constraint requiring the magnitude of each synergy vector to be one, as done in a recent study [28]. Forcing all predicted synergy vectors or synergy excitations to have a maximum value of one would be more difficult to implement, since maximum value constraints are not differentiable. When experimental EMG data are available from muscles with high extrapolation VAF, how to use the calculated synergy excitations also requires further investigation. No reliable method currently exists for normalizing experimental muscle excitations, and the shapes of calculated synergy excitations are affected by how the experimental muscle excitations are normalized. Thus, it is not clear how synergy excitations calculated from experimental muscle excitations should be used to construct muscle excitations predicted by muscle force optimizations.
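The need for a normalization constraint can be seen from a short identity. The diagonal matrix $D$ and its entries $d_k$ are notation introduced here for illustration only. For any diagonal matrix with positive entries,

$$W H = (W D)(D^{-1} H), \qquad D = \mathrm{diag}(d_1, \ldots, d_p), \quad d_k > 0,$$

so the pair $(WD,\; D^{-1}H)$ reconstructs the muscle excitations exactly as well as $(W, H)$. Each normalization convention corresponds to one choice of $D$: setting $d_k = 1 / \max_t W_{tk}$ gives synergy excitations with a maximum value of one, whereas setting $d_k$ equal to the Euclidean norm of row $k$ of $H$ gives synergy vectors with unit magnitude.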

Despite these challenges, at least three recent studies have attempted to use synergy-based methods to estimate muscle forces using static optimization. Sartori et al. [26] constructed 34 muscle excitations using 5 “excitation primitives” fitted to 5 synergy excitations obtained from 16 experimentally measured muscle EMG signals. Each excitation primitive was a task-generic Gaussian-shaped impulsive curve representing data collected from two subjects performing four different gait tasks. Using this low-dimensional representation, the authors were able to match experimental hip, knee, and ankle moments and experimental muscle excitations reasonably well for both subjects over all four gait tasks. Walter et al. [27] constructed 44 muscle excitations using 5 synergy excitations extracted from 13 experimental muscle EMG signals collected from a subject implanted with a force-measuring knee replacement. They showed that use of 5 synergy excitations instead of 44 independent muscle excitations improved the accuracy of knee contact force predictions. However, muscle excitations predicted by the 5 synergy excitations were only slightly closer to the 13 experimental muscle excitations than were those predicted by the 44 independent muscle excitations. Serrancolí et al. [35] found 44 muscle excitations that could be closely fit by 5 synergy excitations derived from ten experimental muscle EMG signals collected from a different instrumented knee subject. Neuromusculoskeletal model parameter values were calibrated using a two-level optimization approach, and the calibrated model predicted knee contact and leg muscle forces for walking trials withheld from the calibration process. Keeping predicted muscle excitations close to an experimental synergy solution resulted in accurate knee contact force predictions only when model parameter values were well calibrated. 
The results of the present study provide an experimental justification for using synergy-based methods to construct muscle excitations, including those of deep muscles requiring fine-wire EMG, in these recent muscle force optimization studies.

We defined two measures, 90% extrapolation VAF and mean extrapolation VAF for the top 10% of subsets, to evaluate the ability of synergy excitations derived from included muscle subsets to construct muscle excitations not included in the synergy analysis. In this way, we hoped to eliminate the potential bias of a single measure in selecting acceptable muscle subsets. We averaged the individual muscle VAF values into a single extrapolation VAF to facilitate interpretation of extrapolation ability for any muscle combination. The 90% extrapolation VAF measure was defined based on typical VAF cutoffs used in synergy analyses reported in the literature [18–20]. This measure was able to identify subsets that extrapolated well, but it omitted subsets that might be deemed acceptable if a small reduction in cutoff was allowed. This observation motivated us to use the mean extrapolation VAF for the top 10% of subsets for the remainder of the study. This measure provided finer granularity and showed clearly that a large number of subsets were on the cusp of achieving 90% extrapolation VAF.

The four muscle selection heuristics in Table 4 were defined in an attempt to choose, in a methodical manner, subsets of muscle excitations that contain the most information (i.e., that extrapolate well). Heuristics were chosen to reflect obvious interpretations of the muscle frequency results in Table 3. All heuristics performed well, identifying muscle combinations representative of all primary lower extremity functions (e.g., a hip extensor, a hip adductor, a knee flexor, a knee extensor, either head of biarticular gastrocnemius, soleus, tibialis anterior, and peroneus longus). Reconstruction VAF values were greater than extrapolation VAF values by 5.5%, on average, indicating that knowledge of reconstruction VAF may be useful for estimating the level of extrapolation VAF and that extrapolation VAF can be expected to be slightly lower than reconstruction VAF.

Though our muscle selection heuristics were generated using EMG data from only two subjects, the available EMG data were extensive, and our findings provide unique information about which muscle EMG signals should be prioritized to maximize information content (Table 5). Overall, our muscle selection heuristics suggest that given a limited number of EMG channels, researchers should collect surface EMG data from muscles that would commonly be selected: uniarticular and biarticular flexor and extensor muscles from each major muscle group. For uniarticular muscles, these include a hip extensor (GlutMax), a knee extensor (VasLat over VasMed), an ankle plantarflexor (Sol), and an ankle dorsiflexor (TibAnt). No uniarticular hip flexor (Iliopsoas) or uniarticular knee flexor (BiFemShort) was included as these muscles are difficult to measure with surface electrodes. For biarticular muscles, selected muscles include a posterior thigh muscle (SemiMemb, SemiTend, or BiFemLong, with no clear preference), possibly an anterior thigh muscle (RF), and a posterior calf muscle (GasMed or GasLat). Our muscle selection heuristics also suggest that researchers should consider collecting surface EMG data from some muscles that are not commonly selected: frontal plane muscles (AddLong or AddMag, TFL, and PerLong) spanning all three joints. Identification of these additional muscles is not surprising, given the unique stabilizing roles they play in the frontal plane. Other frontal plane stabilizers (GlutMed, Grac) were not identified, possibly due to the difficulty in measuring these muscles reliably with surface EMG. However, from the muscle frequency results in Table 3, if we added one more muscle to the list, it appears that GlutMed would be a reasonable choice. Interestingly, VasLat appears to be preferable over VasMed, while GasMed and GasLat, or SemiMemb and BiFemLong, appear in the same list, suggesting that EMG data from these muscles may contain more independent information than commonly believed.
While these muscle selection heuristics are useful, we emphasize that they should be viewed with caution given the limited number of subjects analyzed.

Our methodology was motivated by the potential utility of muscle synergies in predictive modeling applications. Rather than calculating a VAF value for the entire constructed muscle set, VAF values were calculated for each individual muscle and then averaged. In this way, a poorly constructed muscle would have greater influence on the measures of extrapolation ability. This behavior was desired since a poorly constructed muscle excitation can significantly influence a predictive walking optimization, especially if the muscle is a major contributor to a particular lower extremity joint moment. Since muscle excitations in walking simulations are constrained to be between 0 and 1, synergy excitations obtained from synergy analysis of included muscle subsets were normalized to be between 0 and 1 prior to extrapolation. Interestingly, when synergy vectors were normalized to be between 0 and 1 instead, with the corresponding synergy excitations scaled accordingly, synergy vectors found by synergy extrapolation were less consistent with those found directly from synergy analysis of the excluded muscle excitations.
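The effect of averaging per-muscle VAF values can be illustrated with a minimal sketch. It assumes VAF is computed as 100(1 − SSE/SST) with an uncentered total sum of squares, which is a common convention in the synergy literature but is an assumption here; the function names and the two synthetic signals are hypothetical and are not taken from the study data.

```python
import math

def vaf(original, constructed):
    """VAF (%) for one signal, assuming an uncentered total sum of squares."""
    sse = sum((o - c) ** 2 for o, c in zip(original, constructed))
    sst = sum(o ** 2 for o in original)
    return 100.0 * (1.0 - sse / sst)

def mean_per_muscle_vaf(originals, constructions):
    """Compute VAF for each muscle separately, then average across muscles."""
    values = [vaf(originals[m], constructions[m]) for m in originals]
    return sum(values) / len(values)

def pooled_vaf(originals, constructions):
    """Compute a single VAF over all muscles and time points together."""
    sse = sum((o - c) ** 2
              for m in originals
              for o, c in zip(originals[m], constructions[m]))
    sst = sum(o ** 2 for m in originals for o in originals[m])
    return 100.0 * (1.0 - sse / sst)

# Hypothetical signals: one large-amplitude muscle constructed perfectly,
# one small-amplitude muscle constructed as all zeros (i.e., very poorly).
t = [i / 100 for i in range(101)]
large = [math.sin(math.pi * x) ** 2 for x in t]
small = [0.1 * v for v in large]
originals = {"large": large, "small": small}
constructions = {"large": list(large), "small": [0.0] * len(t)}

per_muscle = mean_per_muscle_vaf(originals, constructions)
pooled = pooled_vaf(originals, constructions)
print(f"mean per-muscle VAF: {per_muscle:.1f}%")
print(f"pooled VAF: {pooled:.1f}%")
```

In this contrived case the pooled value stays near 99% because the small-amplitude muscle contributes little to the total sum of squares, whereas the per-muscle average drops to 50%, so the poorly constructed muscle is far more visible.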

While VAF is a useful and commonly accepted measure of muscle excitation reconstruction quality, how well a reconstructed muscle excitation approximates the original muscle excitation may not be fully captured in a single percentage value. To illustrate this issue, we took sample EMG data from the paretic leg of the stroke subject walking at 0.7 m/s and applied our synergy extrapolation procedure. For this example, a single included muscle subset was selected using our muscle selection heuristics. Figure 4 shows the synergy-constructed muscle excitations using 3, 4, and 5 synergies for muscles in the included and excluded subset pair, and VAF values corresponding to each approximation are provided in Table 6. As the number of synergies used in the analyses was increased, VAF values increased, and all muscles reached 90% VAF at 5 synergies. The included muscle excitations reached higher VAF values and had superior reconstruction quality in general, matching the shape and magnitude of the original muscle excitations at 5 synergies. For the excluded muscle subset, some muscle excitations were not approximated as well and failed to match the shape and magnitude of the original muscle excitation closely. Notably, both the iliopsoas and rectus femoris in the excluded muscle subset barely achieved 90% VAF at 5 synergies and improved only slightly over their 3 and 4 synergy reconstructions. Underperforming excluded muscle excitation reconstructions such as these suggest that additional synergies extracted from this included muscle subset may not improve construction quality further. Furthermore, to represent these excluded muscle excitations accurately in a predictive modeling simulation, it may be necessary to include these muscles in the experimental EMG recordings.
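For readers who want to experiment with the general idea, the following is a rough sketch of synergy extrapolation, not the implementation used in this study. It assumes non-negative matrix factorization with multiplicative updates for the synergy analysis and a non-negative fit of synergy vectors for the excluded muscles with the synergy excitations held fixed; both algorithm choices, all function names, and the synthetic data are assumptions made for illustration. The sketch also includes the normalization of synergy excitations to between 0 and 1 described above, with synergy vectors rescaled so that the constructed excitations are unchanged.

```python
import numpy as np

def nmf(E, n_syn, n_iter=500, seed=0):
    """Factor E (time x muscles) as C @ W using multiplicative updates.
    C holds synergy excitations (time x synergies) and
    W holds synergy vectors (synergies x muscles)."""
    rng = np.random.default_rng(seed)
    C = rng.random((E.shape[0], n_syn)) + 0.1
    W = rng.random((n_syn, E.shape[1])) + 0.1
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        W *= (C.T @ E) / (C.T @ C @ W + eps)
        C *= (E @ W.T) / (C @ W @ W.T + eps)
    return C, W

def normalize_excitations(C, W):
    """Scale each synergy excitation to peak at 1 and rescale the matching
    synergy vector so that the product C @ W is unchanged."""
    peaks = C.max(axis=0)
    return C / peaks, W * peaks[:, None]

def fit_vectors(C, E, n_iter=500):
    """With synergy excitations C fixed, find non-negative synergy
    vectors W such that E is approximated by C @ W."""
    W = np.full((C.shape[1], E.shape[1]), 0.5)
    eps = 1e-12
    for _ in range(n_iter):
        W *= (C.T @ E) / (C.T @ C @ W + eps)
    return W

# Synthetic stand-in data: 2 underlying synergies driving 6 "muscles".
t = np.linspace(0.0, 1.0, 101)
C_true = np.column_stack([np.sin(np.pi * t) ** 2, np.cos(np.pi * t) ** 2])
W_true = np.random.default_rng(1).random((2, 6))
E = C_true @ W_true
E_included, E_excluded = E[:, :4], E[:, 4:]

# Synergy analysis on included muscles, then normalization.
C_raw, W_raw = nmf(E_included, n_syn=2)
C, W_included = normalize_excitations(C_raw, W_raw)

# Extrapolation: construct excluded muscles from the same excitations.
W_excluded = fit_vectors(C, E_excluded)
E_excluded_hat = C @ W_excluded
print(E_excluded_hat.shape)
```

Comparing `E_excluded_hat` against `E_excluded` muscle by muscle, rather than with a single pooled number, mirrors the point made in this paragraph that one percentage value may hide poorly constructed individual muscles.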

One unexpected observation was that for a given number of synergies, the nonparetic leg of the stroke subject had a slightly higher VAF than did the paretic leg (Table 5). This observation is contrary to results reported by Clark et al. [20]. However, when we performed synergy analysis using the same eight muscles used in that study, the VAF for the two legs became nearly equal, though the paretic leg still had higher variability. Furthermore, methodological differences exist between how we performed our synergy analyses and how Clark et al. performed theirs. The fact that our stroke subject was high functioning may have also affected the comparison between nonparetic and paretic legs. Importantly, as other studies have suggested, the number and choice of muscles had a significant influence on the synergy solutions in the present study [39]. Given the known heterogeneity in this clinical population, analysis of EMG data from a single stroke subject is a significant limitation.

In conclusion, this study has demonstrated that for both a healthy subject and a stroke subject, synergy excitations extracted from eight muscle excitations derived from commonly measured surface EMG signals during walking can be used to construct all remaining muscle excitations from the same data set with good accuracy. Even though the extrapolation process was more challenging for the healthy subject data set (16 excluded muscle excitations to be constructed by synergy extrapolation versus only eight for each leg of the stroke subject), extrapolation VAF was similar for both subjects, suggesting that proper selection of included muscles, and use of a sufficient number of included muscles, are the most critical issues. These findings support the use of synergy excitations derived from experimental EMG signals to construct muscle excitations for missing EMG signals as part of a muscle force optimization. However, construction of the missing EMG signals is most likely to work well if the collected EMG signals follow our proposed muscle selection heuristics.

## Funding Data

Directorate for Engineering NSF REU fellowship (Grant No. CBET 1430584).

Directorate for Engineering NSF (Grant No. CBET 1159735).

## Nomenclature

- AddLong = adductor longus
- AddMag = adductor magnus
- BiFemLong = biceps femoris long head
- BiFemShort = biceps femoris short head
- EMG = electromyography
- GasLat = lateral gastrocnemius
- GasMed = medial gastrocnemius
- GlutMax = gluteus maximus
- GlutMed = gluteus medius
- Grac = gracilis
- PerLong = peroneus longus
- RF = rectus femoris
- RF2 = rectus femoris (second recording location)
- SemiMemb = semimembranosus
- SemiTend = semitendinosus
- Sol = soleus
- TFL = tensor fasciae latae
- TibAnt = tibialis anterior
- VAF = variance accounted for
- VasLat = vastus lateralis
- VasMed = vastus medialis