Abstract

Sophisticated modeling and simulation, based on rigid and flexible multibody dynamics, are nowadays a standard procedure in the design and analysis of vehicle systems and are widely adopted for on-road driving. Off-road driving, for terrestrial wheeled and tracked vehicles as well as for wheeled and legged robots and rovers for extraterrestrial exploration, poses additional modeling and simulation challenges, a primary one being that of the vehicle–terrain interaction, modeling of deformable terrain, and terramechanics in general. Techniques for modeling deformable terrain span an entire range varying in complexity, representation accuracy, and ensuing computational effort. While formulations such as fully resolved granular dynamics, continuum representation of granular material, or finite element can provide a high level of accuracy, they do so at a significant cost, even when the implementation leverages parallel computing and/or hardware accelerators. Real-time or faster than real-time terramechanics is a highly desired capability (in applications such as training of autonomous vehicles and robotic systems) or a critical capability (in applications such as human-in-the-loop or hardware-in-the-loop). We present a real-time capable deformable soil implementation, extended from the soil contact model (SCM) developed at the German Aerospace Center, which in turn can be viewed as a generalization of the Bekker-Wong and Janosi-Hanamoto semi-empirical models for soil interaction with arbitrary three-dimensional shapes and arbitrary contact patches. This SCM implementation is available, alongside more computationally intensive deformable soil representations, in the open-source multiphysics package Chrono. We describe the overall implementation and the features of the Chrono SCM model, the efficient underlying data structures, the current multicore parallelization aspects, and its scalability properties for concurrent simulation of multiple vehicles on deformable terrain.

1 Introduction

Accurate and efficient simulation of off-road environments is a growing need, with military, civilian and exploratory applications. Military vehicle design is a main driver, as these vehicles operate away from a road grid more often than not. Such vehicles are often large and costly to operate, driving the desire to perform tests and research in simulation rather than in the real world. Research oriented toward civilian use has the same desire to bring down testing costs through use of simulation, although civilian cars are of course less frequently subject to off-road conditions. While the particular soil model discussed in the paper does not apply to tilling or earth-moving, agriculture and construction tasks take place in off-road environments and the mobility studies targeting these areas can benefit from simulation. Extraterrestrial rover missions are another area where soil deformation and resulting vehicle mobility are of key interest, and the need here is augmented by the fact that extraterrestrial environments are by definition challenging to test in.

While a faster simulation is always desirable and beneficial in all of these areas, there is increasing interest in simulations that can run at or faster than real-time. For example, human-in-the-loop and hardware-in-the-loop simulations have a hard real-time requirement. When a human controls a vehicle in simulation, the simulation must run at least as fast as real-time in order for the world the human sees to be displayed to them in real-time. When coupling simulation with a physical actuator or sensor (e.g., in the development of embedded systems), hardware requirements impose restrictions that require real-time response from the plant simulation. In both cases, the simulation must always be faster than real-time; it is not sufficient for it to be faster than real-time only on average.

In addition, data-driven modeling and simulation, including the use of metamodels and machine learning approaches, represent areas with a continuously increasing need for large amounts of simulated data, and they can therefore benefit immensely from ever faster simulation capabilities. The quality of simulation-based surrogate models and simulation-based machine learning training is largely affected by the amount of available data, and therefore the time required to construct such metamodels is proportional to the cost of simulation.

A relevant metric for simulation speed is the so-called real time factor (RTF), defined as the ratio between simulation and simulated time. A real-time capable simulation software must therefore achieve values of RTF below 1.
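As a minimal sketch of this metric, the snippet below computes the RTF and contrasts it with the hard real-time requirement described above. The function names and the per-step timings are hypothetical and serve only as illustration; the 28 h and 19 s figures are those of the DEM run reported later in this section.

```python
def real_time_factor(wall_clock_s: float, simulated_s: float) -> float:
    """RTF = time spent computing / time span that was simulated."""
    if simulated_s <= 0.0:
        raise ValueError("simulated time must be positive")
    return wall_clock_s / simulated_s


def meets_hard_real_time(step_wall_times_s, step_size_s: float) -> bool:
    """Hard real-time: every single step must finish within its own step size."""
    return all(t <= step_size_s for t in step_wall_times_s)


# DEM run reported in the Introduction: a 19-s simulation that took about 28 h.
dem_rtf = real_time_factor(28 * 3600.0, 19.0)

# Hypothetical per-step timings for a 2 ms step: the average RTF is below 1,
# but one slow step still violates the hard real-time requirement.
timings = [0.001, 0.001, 0.005, 0.0005]
average_rtf = real_time_factor(sum(timings), 2e-3 * len(timings))
hard_ok = meets_hard_real_time(timings, 2e-3)
```

The second check is the one that matters for human-in-the-loop and hardware-in-the-loop use, since an average RTF below 1 can hide individual steps that overrun.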

This work is part of the larger Chrono framework [1] and leverages many of its existing components. In particular, the Chrono::Vehicle library [2] provides for efficient, physics-based simulation of both wheeled and tracked vehicles. Chrono also contains libraries dedicated to fluid–solid interaction and granular dynamics, which, as will be discussed in more detail shortly, are alternative means of modeling deformable soil. Having multiple methods of modeling deformable soil in one library allows easy comparison of these methods while using the same Chrono::Vehicle formulation for the vehicle that moves on the deformable soil.

To put the formulation discussed here in context, consider the continuum of modeling techniques that capture soil deformation and vehicle–terrain interaction. These techniques vary in fidelity, accuracy and corresponding computational complexity, and we illustrate some of these in Fig. 1.

Fig. 1
Different approaches to simulating deformable terrain. All simulations were conducted with Chrono. (a) Curiosity rover over DEM granular terrain. (b) Curiosity rover over SPH continuum representation of granular terrain. (c) Curiosity rover over SCM deformable terrain.

Arguably the highest fidelity is offered by fully resolved granular dynamics in which individual soil particles (potentially in the millions) interact through contact and friction, using the discrete element method (DEM). Figure 1(a) is a snapshot from a hill climbing simulation of the Mars Curiosity rover performed with the Chrono::Gpu module [3]. This DEM problem involved around 1.07 × 10^6 particles and, using a step size of 0.02 ms, completed a 19-s simulation in about 28 h on an NVIDIA RTX 2080 Ti GPU. This produces an RTF in the thousands, which is typical for DEM.

A different approach is to use homogenization techniques in a continuum representation (CR) of granular material; see Fig. 1(b). For more details on this method in Chrono, which relies on a smoothed particle hydrodynamics (SPH) approach, we direct the reader to [4]. This SPH problem included 6.5 × 10^6 particles and, with a step size of 0.25 ms, completed a 25-s simulation in about 10 h on an NVIDIA A100 GPU. This approach can result in RTF values in the hundreds, although an ongoing refactoring of the Chrono::FSI module is expected to improve upon this number.

There are other higher-fidelity formulations for deformable terrain, such as Finite Element or hybrid FEA-granular methods (such as the very interesting hierarchical approach described in [5]). But the third possible formulation for which we show a similar Curiosity rover simulation in Fig. 1(c) is based on the semi-empirical theory of Bekker [6], Wong [7], and Janosi and Hanamoto [8] and its generalization to arbitrary collision objects given by the soil contact model (SCM) [9,10], which is the topic of this paper. With SCM, one can achieve real-time or close-to-real-time on commodity processors using multithreaded parallelization with OpenMP.

The rest of this paper focuses exclusively on the SCM formulation. In Sec. 2, we provide details on the vehicle model used in assessing performance of the proposed SCM implementation. In Sec. 3, we provide a brief overview of the SCM approach and then describe the data and algorithmic changes done in the Chrono implementation to achieve real-time performance. The computational cost is further discussed in Sec. 4 where we show scaling analysis results both intra- and internode (for distributed simulation of multiple vehicles on deformable terrain). Conclusions and directions of future work are given in Sec. 5.

2 Vehicle Model

The primary goal of this work was to accelerate the simulation of one wheeled vehicle (modeled at the highest fidelity possible in Chrono::Vehicle) on deformable SCM terrain to the point where real-time performance can be achieved for one vehicle per process. This enables human-in-the-loop scenarios, for a single vehicle or, using distributed simulation, even when considering other vehicles sharing the same virtual environment with the ego vehicle.

Chrono::Vehicle [2] adopts a so-called template-based approach which provides a library of vehicle system and subsystem models that are fully parameterized (and as such, only define the topology and interface of any given subsystem). A concrete vehicle is defined by specifying actual parameters (geometry, inertia properties, force elements) in a set of templates that correspond to the particular vehicle being modeled. As such, the resulting vehicle model is a complex multibody system that accounts for all principal vehicle moving parts and includes full models of the engine, transmission, and driveline.

In this work, we consider an all wheel drive, off-road vehicle with two axles and four wheels. This model includes front and rear double wishbone suspension, a Pitman arm steering mechanism, and a full multibody driveline modeled with 1-D shaft elements which account for the rotational inertia and torque conversion properties of its various components (internal combustion engine, torque converter, transmission box, differentials, conic gears, and final axles). The resulting multibody model consists of 23 bodies, 39 kinematic joints, 17 1-D shaft elements, and 8 nonlinear spring-damper force elements. This translates into a mathematical model taking the form of a differential-algebraic system with 155 states and 146 constraints.

In the case of the SCM deformable terrain formulation, the vehicle–terrain interaction is achieved through forces generated by the terrain subsystem (full spatial forces consisting of a force and a moment, reduced to the wheel center) and applied to the vehicle tires, assumed rigid here, as external forces. From this point of view, the tire bodies are not treated in any special way; indeed, interaction of any other body that carries collision geometry with the SCM terrain is done in exactly the same way.

A typical simulation of a vehicle model as described above operating over a flat rigid terrain can be performed with Chrono with an RTF of about 0.3 on commodity hardware, such as the AMD Ryzen or Xeon processors used in the numerical experiments below, when simulated with a step-size of 2 ms. In off-road scenarios, where soil deformation must be considered both in calculating the vehicle-terrain interaction forces and to maintain and manage persistent soil deformation, the bottleneck is always terramechanics. Simulations with complex and higher-fidelity deformable soil models (such as fully resolved granular dynamics, finite element, or continuum representation of granular dynamics) cannot be conducted in real-time. However, with a proper implementation, such as the one described herein, the semi-empirical formulation embedded in SCM has the ability to complete all required calculations in about 1 ms per step and thus provide real-time simulation capabilities. More details on performance of the proposed implementation are provided in Sec. 4.

3 Soil Contact Model for Real-Time Performance

The soil contact model in Chrono is based on the one developed at the German Aerospace Center (DLR) [9,10]. SCM is a generalization of the Bekker formula which relates the normal pressure p to the sinkage z for a wheel of width b using a semi-empirical, experiment-based curve fitting with parameters Kc, Kϕ, and n [6]
$$p = \left(\frac{K_c}{b} + K_\phi\right) z^n \tag{1}$$
The SCM generalizes this to arbitrary collision shapes and terrains with arbitrary topology. Interacting shapes are found by casting rays from multiple terrain nodes and doing ray intersection tests with all collision shapes present in the simulation. With this, disjoint contact patches can be identified and Eq. (1) applied at each point, using an approximation for b based on the contact patch area (Apatch) and perimeter (Ppatch)
$$b \approx \frac{1}{2}\,\frac{P_{\text{patch}}}{A_{\text{patch}}} \tag{2}$$
The Bekker-Wong formula is augmented with calculation of shear stress using the Janosi-Hanamoto approach [8]
$$\tau = \tau_{\max}\left(1 - e^{-j/k}\right) \tag{3}$$
$$\tau_{\max} = c + p \tan(\phi) \tag{4}$$

where j is the accumulated shear, c is the cohesion, ϕ is the internal friction angle, and k is the so-called Janosi parameter. Together, Eqs. (1) and (4) can be used to apply normal and tangential contact forces on the impactor object, all while keeping track of the soil deformation (which is assumed to be only along the SCM normal direction). For more details on the core principles of SCM, the reader is directed to [9–11].
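The pressure-sinkage and shear relations above can be sketched in a few lines. This is an illustration of Eqs. (1), (3), and (4) only, not the Chrono implementation; the function names and the numerical soil parameters are assumptions chosen for the example, and the patch width b is taken as a given input.

```python
import math


def bekker_pressure(z, b, K_c, K_phi, n):
    """Normal pressure from Eq. (1): p = (K_c / b + K_phi) * z**n."""
    if z <= 0.0:
        return 0.0  # no sinkage, no pressure
    return (K_c / b + K_phi) * z ** n


def janosi_hanamoto_shear(p, j, c, phi, k):
    """Shear stress from Eqs. (3)-(4): tau = (c + p tan(phi)) * (1 - exp(-j / k))."""
    tau_max = c + p * math.tan(phi)
    return tau_max * (1.0 - math.exp(-j / k))


# Illustrative values only (SI units, angle in radians); not taken from the paper.
p = bekker_pressure(z=0.02, b=0.25, K_c=1.0e3, K_phi=2.0e5, n=1.1)
tau = janosi_hanamoto_shear(p, j=0.01, c=500.0, phi=math.radians(30.0), k=0.01)
```

Note that the shear stress starts at zero for j = 0 and saturates at τmax as the accumulated shear grows large relative to k.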

The soil contact model is by no means the most accurate representation of deformable soil. It is nevertheless suitable for vehicle mobility studies, as it is able to provide a full spatial vector (force and moment) representing the interaction between the vehicle running gear and the terrain and to do so at a relatively low computational cost. Among the main shortcomings of SCM (stemming from the underlying semi-empirical Bekker formulas) we mention: (i) its inability to model soil flow under the running gear and (ii) its reliance on experimentally measured data, which is difficult to obtain, requires specialized experimental gear, and has high variability.

3.1 Previous Soil Contact Model Implementation.

While fully functional and robust for most use cases, the previous implementation of SCM in Chrono [11] could not approach real-time simulation. To motivate the recent improvements, we provide here a brief overview of the previous implementation. The main data structure in the approach described in [11] was a triangular mesh whose vertices represent the SCM nodes. With this, the core data processing proceeded as follows:

  1. Loop over all vertices of the SCM triangular mesh, perform ray casting into the global collision system, and store information at all ray hits.

  2. Use a flood-filling algorithm on the hit vertices to generate contact patches.

  3. From the list of contact patches, perform physics updates on the soil.

  4. If enabled, loop through all vertices and perform refinement, adding additional vertices in areas with contact.

The key slow point in the old implementation was step 1. In most use cases, vehicles will drive over only a fraction of the total environment. Looping over every vertex is a step proportional to the total size of the terrain, rather than to the area that vehicles actually interact with. While introduction of moving patches could limit expensive ray casting operations to vertices “near” an object of interest (e.g., a vehicle), the method's efficiency was still limited by tracking all vertices, even those that have not interacted with the vehicle, thus having space complexity proportional to the overall terrain size.

Step 4 attempted to alleviate this slowness by allowing the terrain mesh to be refined only in areas that a vehicle interacts with. While the specifics of the refinement algorithm are no longer relevant, the idea was to add vertices to the mesh in areas where the mesh had collided with an object. While theoretically sound, the mesh refinement itself tends to add unnecessary computational effort owing to the need to maintain and update connectivity information of an increasingly large triangular mesh.

Steps 2 and 3 are essential to any vertex-based algorithm and are maintained in the new implementation. The flood filling step takes a list of vertices that saw a ray casting collision, and turns them into a set of contact patches (for example, one contact patch for each wheel of a car). The creation of contact patches is critical for the SCM algorithm, as a necessary step in approximating the parameter b using Eq. (2).

To put things in perspective and provide a baseline for the necessary code efficiency improvements, we start by mentioning the RTF obtained with the previous SCM implementation for the type of vehicle of interest. Details on the wheeled vehicle used in this and all subsequent performance tests are provided in Sec. 2. We target a scenario where the vehicle operates on a flat deformable terrain patch with an SCM grid spacing of 0.05 m. As indicated above, the previous SCM implementation was based on a data structure represented by a mesh with an adaptive refinement option. Because of this underlying representation, the cost of terramechanics was proportional to the size of the SCM terrain patch, which was set here at 50 m × 50 m. The processor used for this test was an AMD Ryzen 7 3800X with 32 GB memory. With an SCM grid spacing of 0.05 m and no mesh refinement, the code ran with RTF = 4.03. With adaptive refinement enabled, using an initial grid spacing of 0.1 m and a maximum grid cell size of 0.05 m, we obtained RTF = 12.72 (which indicates that the cost of mesh refinement was in fact offsetting the benefits of a reduced number of necessary ray intersection tests).

3.2 Redesigned Soil Contact Model Implementation.

The decision to redesign the SCM implementation in Chrono was driven primarily by the desire to improve computational efficiency. Better performance is achieved by using efficient data structures and fast algorithms. The underlying data structure is a virtual regular grid whose nodes are populated only as they are contacted by some external object (be it a wheel, track shoe, or any other object with collision shapes). The deformed nodes are managed in a hash map (with hash based on the integral grid coordinates). This allows SCM terrain patches of virtually any dimension with computational performance slowly degrading only as more and more nodes are deformed. The adopted data structures also allow efficient O(1) extraction of nodes last deformed (i) to update the deformed terrain mesh for a visualization system (if one is attached to the simulation) and (ii) for effective communication in a distributed simulation context.
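The idea of a virtual grid backed by a hash map can be sketched as follows. This is a simplified illustration under stated assumptions, not the Chrono data structure: the class name `SparseSoilGrid`, its methods, and the use of a flat undeformed terrain are all hypothetical. Nodes are keyed by their integer grid coordinates and stored only once they are deformed, and a separate list tracks the nodes touched in the current step so that they can be handed off in constant time.

```python
class SparseSoilGrid:
    """Virtual regular grid; a node is stored only once it has been deformed."""

    def __init__(self, spacing, initial_height=0.0):
        self.spacing = spacing
        self.initial_height = initial_height  # flat undeformed terrain (assumption)
        self.nodes = {}      # (i, j) -> current height; a dict is a hash map
        self.modified = []   # nodes touched during the current step

    def index_of(self, x, y):
        """Integer grid coordinates of the node nearest to (x, y)."""
        return (round(x / self.spacing), round(y / self.spacing))

    def height(self, ij):
        """Stored height if the node was deformed, else the undeformed height."""
        return self.nodes.get(ij, self.initial_height)

    def set_height(self, ij, h):
        """Record a deformation; memory grows only with the deformed area."""
        self.nodes[ij] = h
        self.modified.append(ij)

    def pop_modified(self):
        """Hand over the nodes changed this step by swapping lists (no copy)."""
        out, self.modified = self.modified, []
        return out
```

With this layout, the memory footprint depends on the number of deformed nodes rather than on the extent of the terrain patch, which is the property the redesign relies on.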

The adoption of the virtual grid as the base data structure pushed the computational bottleneck to the cost of ray casting. This is addressed with a two-pronged approach. First, we minimize the number of required ray casts by limiting them to nodes that are likely to interact with collision shapes (vehicle tires, track shoes, or any other impacting bodies). The user can specify an arbitrary number of oriented bounding boxes (OBB) attached to vehicle parts that can interact with the deformable SCM terrain. If no such OBBs are provided by the caller, one is defined at each time-step as the axis-aligned bounding box of all collision shapes present in the simulation. The resulting set of OBBs (updated at each time-step to reflect motion of the associated bodies) is then projected onto the SCM reference plane and rays are cast only from nodes falling in the shadow of one such OBB. This implements a so-called “moving patch” approach and can significantly improve overall performance. Second, ray casting is conducted in parallel using a MapReduce algorithm, provided the underlying collision system supports parallelization of the ray casting operations. The effects of these two approaches on the computational performance are further discussed in Sec. 4.
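To illustrate the moving patch idea, the sketch below projects the eight corners of an OBB onto the reference plane and lists the grid nodes from which rays would be cast. It makes two simplifying assumptions that are not taken from the paper: all quantities are expressed in the SCM reference frame with the plane at z = 0, and the shadow is approximated conservatively by its bounding rectangle in grid coordinates. The function name and argument layout are hypothetical.

```python
import itertools
import math


def obb_shadow_nodes(center, axes, half_extents, spacing):
    """Grid nodes inside the bounding rectangle of an OBB's shadow on the plane z = 0.

    center: (x, y, z); axes: three unit vectors; half_extents: (hx, hy, hz).
    All quantities are assumed to be expressed in the SCM reference frame.
    """
    xs, ys = [], []
    for signs in itertools.product((-1.0, 1.0), repeat=3):
        corner = list(center)
        for s, axis, h in zip(signs, axes, half_extents):
            for d in range(3):
                corner[d] += s * h * axis[d]
        xs.append(corner[0])  # projecting along the SCM normal drops z
        ys.append(corner[1])
    i_min, i_max = math.ceil(min(xs) / spacing), math.floor(max(xs) / spacing)
    j_min, j_max = math.ceil(min(ys) / spacing), math.floor(max(ys) / spacing)
    return [(i, j) for i in range(i_min, i_max + 1) for j in range(j_min, j_max + 1)]


# A box 1.0 x 0.5 x 2.0 hovering above the plane, aligned with the grid axes.
identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
nodes = obb_shadow_nodes((0.0, 0.0, 1.0), identity, (0.5, 0.25, 1.0), spacing=0.25)
```

The number of rays now scales with the footprint of the box and the grid spacing, independently of the overall terrain size.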

In its final form, the core SCM computation in Chrono consists of 4 loops (see Algorithm 1):

  1. The first loop, over all moving patches (or single patch if none defined by the user) is a MapReduce algorithm for ray casting and gathering of all nodes hit at the current step;

  2. Next, neighboring hit nodes, from the global hit record generated in stage 1, are combined into contact patches using a flood-filling algorithm;

  3. The SCM generalization to the classical Bekker formula is effectively embedded in the 3rd stage in which we calculate the perimeter and area of each contact patch and approximate the parameter b in the pressure-sinkage relationship, and the accumulated shear based on Janosi-Hanamoto model;

  4. Finally, the soil is updated at each hit node while collecting and applying normal and tangential forces on the colliding object.

For the force calculation in the last step of this algorithm, each hit node of the underlying SCM grid, that is each node for which ray-casting flags a hit at a height lower than the current height of that node, is associated with the solid body that carries the intersected collision geometry. Each such node contributes normal and tangential forces calculated with the stresses given by Eqs. (1) and (4), respectively, using a grid cell area based on the specified SCM grid spacing and assuming an application point at the current node location.
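The reduction of nodal contributions to a single force and moment on the contacting body can be sketched as follows. This is a generic illustration rather than the Chrono code: the dictionary layout of a hit node (keys `p`, `tau`, `n`, `t`, `pos`) and the function names are hypothetical, and the normal and tangential directions are assumed to be supplied by the caller.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples or lists."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def accumulate_wrench(hit_nodes, spacing, ref_point):
    """Sum nodal forces and their moments about ref_point (e.g., a wheel center).

    Each hit node is a dict with normal stress "p", shear stress "tau",
    unit directions "n" and "t", and application point "pos".
    """
    cell_area = spacing * spacing  # each node represents one grid cell
    force = [0.0, 0.0, 0.0]
    moment = [0.0, 0.0, 0.0]
    for node in hit_nodes:
        # stress times area: pressure acts along n, shear stress along t
        f = [(node["p"] * node["n"][d] + node["tau"] * node["t"][d]) * cell_area
             for d in range(3)]
        arm = [node["pos"][d] - ref_point[d] for d in range(3)]
        m = cross(arm, f)
        for d in range(3):
            force[d] += f[d]
            moment[d] += m[d]
    return tuple(force), tuple(moment)
```

Because the same accumulation applies to any body that carries collision geometry, nothing in this step is specific to tires.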

Algorithm 1

Chrono SCM algorithm

    /* MapReduce ray casting */
 1  foreach moving patch do
        /* Map (parallel) */
 2      foreach node in patch shadow do
 3          Generate ray in SCM normal direction
 4          Ray intersection with collision system
        /* Reduce (sequential) */
 5      Combine hits in global list
    /* Queue-based flood filling patches */
 6  foreach hit node do
 7      if not assigned to a contact patch then
 8          Create new contact patch
 9          Add hit node to contact patch and queue
10          foreach neighbor hit node do
11              Add to current contact patch
12              Enqueue neighbor
    /* Process contact patches */
13  foreach contact patch do
14      Calculate 2-D convex hull (ch) of the contact patch
15      Approximate b ≈ 0.5 P_ch / A_ch
    /* Compute deformable soil forces */
16  foreach hit node do
17      Calculate penetration (sinkage) and relative velocity
18      Calculate pressure (B-W)
19      Accumulate shear (J-H)
20      Add normal and tangential force to contact object
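The queue-based flood filling stage (lines 6–12 of Algorithm 1) can be sketched as a breadth-first traversal over the set of hit nodes. The function name and the choice of a 4-connected neighborhood are assumptions made for this illustration; the source does not specify which neighborhood the Chrono implementation uses.

```python
from collections import deque


def flood_fill_patches(hit_nodes):
    """Group hit nodes, given as (i, j) grid indices, into connected contact patches."""
    hits = set(hit_nodes)
    patch_of = {}   # (i, j) -> index of the contact patch it belongs to
    patches = []
    for seed in hit_nodes:
        if seed in patch_of:
            continue  # already reached from an earlier seed
        patch_id = len(patches)
        patches.append([seed])
        patch_of[seed] = patch_id
        queue = deque([seed])
        while queue:
            i, j = queue.popleft()
            # 4-connected neighborhood: an assumption of this sketch
            for nb in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                if nb in hits and nb not in patch_of:
                    patch_of[nb] = patch_id
                    patches[patch_id].append(nb)
                    queue.append(nb)
    return patches
```

For a four-wheeled vehicle on flat terrain, one would expect this to return four patches, one under each tire, each of which is then processed separately to estimate its own value of b.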

The redesign of the Chrono SCM implementation maintains the important features available in the previous version. This includes the option for an arbitrary orientation of the SCM reference frame with respect to the global reference frame, as well as the ability of initializing the SCM terrain as a rectangular patch of given dimensions, from a height-map image whose gray-levels are translated to height information, or else from resampling a given triangular mesh provided as a Wavefront OBJ file. Furthermore, like in the original implementation, the current SCM code includes an optional postprocessing step for heuristic build-up of material at the boundary of contact patches, as described in Sec. 3.3.

In addition, we included new features and capabilities, including the ability to combine soil-soil and soil-tread parameters (if such data are available, the two parameter sets are included as a weighted sum based on a user-prescribed fraction of “tread-to-tire” area) and the ability to define spatially-dependent soil parameters (through a user-supplied callback mechanism).

3.3 Bulldozing Algorithm.

The SCM implementation in Chrono also provides an optional bulldozing algorithm (which has remained essentially the same from the previous version, except for corresponding efficiency gains stemming from the new data structures). Figure 2 provides a visual illustration of this algorithm. A simulation of a cylinder rolling over an SCM patch without bulldozing effects enabled (that is, using only the algorithm described above) is shown in Fig. 2(a). The purpose of including bulldozing effects is essentially to conserve volume. The heuristics are simply a mechanism for redistributing the displaced soil to nodes adjacent to the contact patch. To this end, the displaced mass is accumulated and assigned to nodes in the instantaneous boundary of the contact patch, as shown in Fig. 2(b). Next, the erosion front is extended (by a user-specified amount) in directions outside the contact patch (Fig. 2(c)). Finally, a topological smoothing operator is applied iteratively. This is essentially a diffusion process and the only stage of the bulldozing algorithm that could potentially be parallelized (with standard techniques used in the parallelization of the Laplace equation). However, since only a handful of iterations are typically used, this parallelization is currently not implemented in the Chrono SCM code, although this may be done in the future. The final outcome is illustrated in Fig. 2(d).
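The final smoothing stage can be illustrated with a generic diffusion step on the grid. The sketch below is not the smoothing operator used in Chrono, whose details are not given here; it is a standard Jacobi-style exchange between neighboring nodes, restricted to a set of nodes standing in for the erosion front, and written so that the sum of heights (and hence the volume, for a uniform cell area) is conserved. The function name, the parameter `alpha`, and the 4-connected neighborhood are assumptions.

```python
def smooth_heights(heights, domain, iterations=3, alpha=0.2):
    """Diffuse heights among the nodes in `domain` while conserving their sum.

    heights: dict (i, j) -> height, containing every node in `domain`;
    domain: set of (i, j) nodes allowed to change;
    alpha: exchange rate, kept at or below 0.25 for stability with 4 neighbors.
    """
    h = dict(heights)
    for _ in range(iterations):
        updated = {}
        for (i, j) in domain:
            flux = 0.0
            for nb in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                if nb in domain:
                    flux += h[nb] - h[(i, j)]  # exchange only inside the domain
            updated[(i, j)] = h[(i, j)] + alpha * flux
        h.update(updated)  # all nodes are updated from the previous iterate
    return h


# A single raised node between two neighbors spreads out but keeps its volume.
pile = {(0, 0): 0.0, (1, 0): 0.3, (2, 0): 0.0}
smoothed = smooth_heights(pile, set(pile), iterations=1)
```

Each update reads only values from the previous iterate, which is what makes this stage amenable to the standard parallelization techniques mentioned above.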

Fig. 2
Stages in algorithm for including bulldozing effects. (a) Step 0: SCM simulation without bulldozing effects. (b) Step 1: Accumulate displaced mass at contact patch boundary. (c) Step 2: Extend erosion front. (d) Step 3: Diffuse displaced material throughout erosion front.

4 Performance Results

Having described the overall algorithm and its implementation, we turn our attention to measuring performance of the resulting SCM code.

While all simulations used in the following performance experiments were conducted on a flat SCM terrain patch, we note that this does not affect any of our conclusions. Indeed, the terrain geometry is relevant only at the initialization stage (when the height at each SCM grid node is set, for example, from a given triangular mesh or else from a height field). All subsequent run-time calculations are independent of the undeformed height at any given grid node.

4.1 Strong Scaling With Multicore Parallel Computing.

We begin by discussing the effects of multicore parallelization on the simulation of a single vehicle on SCM terrain. After the first implementation, code profiling revealed that the single most expensive operation in the previous algorithm is the ray casting (recall that one ray must be intersected with the overall collision system, technically from every single node of the underlying SCM grid). This costly operation is first mitigated by using moving patches so that rays are cast only from the 2-D shadow projections of these moving patches.

Additional performance gains can be obtained by performing the ray casting in parallel, for example, within an OpenMP parallel for loop (the Map stage in lines 2–4 of Algorithm 1). This requires that the collision detection system implements a thread-safe ray casting algorithm, which is the case in Chrono.

The results of a first implementation (which performed ray casting and hash map loading of the hit nodes simultaneously) are shown in Fig. 3(a), using up to 8 OpenMP threads.

Fig. 3
Strong scaling performance of SCM ray casting. Results represent the RTF for simulating a single vehicle over deformable SCM terrain and the corresponding processing time, using different numbers of OpenMP threads for the parallel ray casting loop. These experiments were conducted on an AMD Ryzen 7 3700X 8-core processor. (a) Simultaneous ray casting and hash map loading (requiring a critical section). (b) Map-reduce algorithm with independent parallel ray casting.

The first thing to emphasize is that the seemingly modest improvements are a clear expression of Amdahl's law [12] which caps the performance gains due to parallelization by the fraction of code that remains sequential. The reported RTF includes the overall simulation execution time (including all vehicle and soil dynamics) and ray casting is a relatively small fraction of that. Furthermore, shared memory parallelization is affected by other memory issues (such as false sharing). Second, we can clearly see the effect of the necessary critical section embedded in the parallel region (required for synchronizing access to the global shared data), especially as we move to larger number of threads.
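As a worked illustration of Amdahl's law, the snippet below evaluates the bound on speedup for a given parallel fraction. The 40% figure is purely hypothetical and is not a measurement from these experiments; it only shows why parallelizing a modest fraction of the step yields modest gains.

```python
def amdahl_speedup(parallel_fraction, threads):
    """Amdahl's law: speedup = 1 / ((1 - f) + f / N) for parallel fraction f."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / threads)


# Hypothetical case: ray casting accounts for 40% of the time per step.
speedup_8 = amdahl_speedup(0.4, 8)    # about 1.54 with 8 threads
upper_bound = 1.0 / (1.0 - 0.4)       # about 1.67, even with unlimited threads
```

Under this assumption, going from 1 to 8 threads would reduce the RTF by roughly a third at best, regardless of how well the ray casting loop itself scales.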

To address the latter issue, we switched to the MapReduce [13] implementation shown in Algorithm 1. While this eliminates the costs associated with the use of a critical section, the other performance limiting issues (especially those related to Amdahl's law) remain. Nonetheless, this shows the real-time capabilities of the current implementation on a commodity processor (Fig. 3(b)).
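The structure of the map-reduce variant can be sketched as follows. This is an illustration of the pattern only: it uses Python threads instead of OpenMP, so it shows the absence of shared writes in the parallel region rather than any actual speedup, and the function name, the chunking scheme, and the `cast_ray` callback are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def cast_rays_map_reduce(nodes, cast_ray, num_threads=4):
    """Map: cast one ray per node in parallel. Reduce: merge hits sequentially."""
    nodes = list(nodes)
    if not nodes:
        return {}
    chunks = [nodes[t::num_threads] for t in range(num_threads)]

    def map_chunk(chunk):
        # Each worker fills a private list, so no lock or critical section is needed.
        local_hits = []
        for node in chunk:
            hit = cast_ray(node)  # must be thread-safe; returns None on a miss
            if hit is not None:
                local_hits.append((node, hit))
        return local_hits

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        partial_results = list(pool.map(map_chunk, chunks))

    hits = {}
    for local_hits in partial_results:  # sequential reduce into the shared hash map
        hits.update(local_hits)
    return hits
```

The design choice is to trade a short sequential merge for the removal of synchronization inside the parallel loop, where contention grows with the number of threads.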

4.2 Weak Scaling in Distributed Parallel Simulations.

Some applications require simultaneous simulation of multiple vehicles, for example, during assessment of autonomous vehicle formation holding. Conventional simulation of such scenarios can provide at best linear scaling. The solution is then distributing the simulation over multiple hardware assets while maintaining a coherent state of the shared environment.

In Chrono this is provided by the Synchrono module which provides an infrastructure based on message passing interface (MPI) for distributed, multi-agent simulation [14]. While direct physical interaction between different vehicles is currently not supported, Synchrono maintains a coherent virtual world in which all vehicles can sense each other and, when using SCM deformable terrain, in which the soil deformation is synchronized. As shown in Fig. 4, such an infrastructure can provide O(1) scaling for simulations involving many vehicles.

Fig. 4
Synchrono scaling (rigid terrain)

In this section, we present weak scaling performance of Synchrono when simulating off-road vehicles on SCM deformable terrain. By weak scaling, we mean an analysis where the global problem size is increased while maintaining the same local problem size (per MPI rank in this case). For weak scaling analysis of SCM, we use variants of the simulation shown in Fig. 5, by assigning one vehicle per Synchrono MPI rank and varying the number of nodes and vehicles/node. In other words, we perform simultaneous vehicle simulations like the single-vehicle simulation discussed previously.

Fig. 5
Concurrent simulation of multiple vehicles with Synchrono. While somewhat difficult to see in this image, the global state of the SCM deformable terrain is maintained consistent on all ranks so that any vehicle drives on soil potentially deformed by any other vehicle.

For several reasons related to the current state of our cluster, we were able to use at most 9 nodes. For simulations lasting 10 s each, we record the effective RTF (that is, the RTF of the slowest rank) for the distributed simulation when assigning 1 vehicle (that is, 1 MPI rank) per node, then 2 vehicles per node, and finally 3 vehicles per node.
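The bookkeeping behind this experiment can be sketched as follows. The sketch assumes the usual definition of the real-time factor as wall-clock time divided by simulated time (so that RTF < 1 means faster than real time) and assumes that node counts from 1 to 9 are swept; the function names and the sample timings are hypothetical, not measured data.

```python
def real_time_factor(wall_time, sim_time):
    """RTF = wall-clock time / simulated time (assumed definition)."""
    return wall_time / sim_time


def effective_rtf(rank_wall_times, sim_time):
    """Effective RTF of a distributed run: the RTF of the slowest rank."""
    return max(real_time_factor(t, sim_time) for t in rank_wall_times)


def weak_scaling_configs(max_nodes=9, vehicles_per_node=(1, 2, 3)):
    """Enumerate (nodes, vehicles per node, total MPI ranks), one vehicle per rank."""
    return [(n, v, n * v)
            for v in vehicles_per_node
            for n in range(1, max_nodes + 1)]


# Hypothetical wall-clock times (s) of three ranks for a 10 s simulation.
rtf = effective_rtf([8.0, 9.5, 9.0], sim_time=10.0)
configs = weak_scaling_configs()
largest = max(configs, key=lambda c: c[2])
```

Here `largest` is the configuration with 9 nodes and 3 vehicles per node, that is, 27 MPI ranks in total.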

The plot in Fig. 6(a) shows the scaling performance when each individual simulation uses a single OpenMP thread for parallelization of the local SCM ray casting. We notice a relatively minor effect of Gustafson's law (the weak scaling equivalent of Amdahl's law) [15]: the effective RTF increases slightly, mainly because the SCM grid hash tables grow as they incorporate deformations from all other ranks and, to a lesser extent, because of the overhead of MPI broadcast communication.
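For reference, the two laws can be stated in their textbook form. The symbols are not taken from the original: $N$ is the number of processors and $s$ is the serial fraction of the run time (measured on one processor for Amdahl's law and on the $N$-processor run for Gustafson's law).

```latex
% Amdahl's law: fixed global problem size (strong scaling)
S_{\mathrm{Amdahl}}(N) = \frac{1}{s + \dfrac{1 - s}{N}}

% Gustafson's law: fixed problem size per processor (weak scaling)
S_{\mathrm{Gustafson}}(N) = s + (1 - s)\,N = N - s\,(N - 1)
```

Under ideal weak scaling the run time, and hence the effective RTF, would stay constant as ranks are added; any growth of the per-rank work with $N$ shows up as an increase in RTF.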

Fig. 6
Weak scaling performance of distributed simulation of multiple vehicles. (a) Using a single OpenMP thread per rank for SCM ray-casting. (b) Using two OpenMP threads per rank for SCM ray-casting. (c) Using three OpenMP threads per rank for SCM ray-casting.

Overall performance improves significantly (that is, the RTF decreases) when allocating two OpenMP threads per vehicle (rank) for SCM ray casting (see Fig. 6(b)). Finally, with three OpenMP threads per vehicle (Fig. 6(c)), even the most challenging configuration (involving 27 vehicles) runs in real time (that is, RTF < 1).

We note that the processors available on the cluster partition used for this analysis are relatively old Xeons. Furthermore, the most challenging configuration used here (three vehicles per node, each using three OpenMP threads) reaches the limit beyond which threads would be assigned to virtual cores, where hyper-threading negatively affects OpenMP performance. This was another reason to limit this particular analysis to only 9 nodes.

5 Conclusions

We have presented the latest implementation of the Chrono SCM deformable soil model, which can offer real-time vehicle-terrain simulation capabilities for either a single vehicle or (using the Synchrono infrastructure) multiple vehicles sharing the same virtual environment.

The improved SCM implementation leverages smarter data structures which eliminate almost entirely the cost of managing and querying the SCM virtual grid, while also allowing terrains of arbitrary dimension without efficiency penalties. Further improvements are obtained by minimizing the required number of ray intersection tests using a moving patch approach and by multicore parallelization of the ray-casting operations.

The ability to simulate SCM deformable terrain at or faster than real time enables several key use cases and greatly facilitates others. Recent work [16] focuses on interfacing the Synchrono module in Chrono with the National Advanced Driving Simulator (NADS) to conduct geographically distributed simulations for analysis of mixed traffic involving conventional and autonomous vehicles, with a human in the loop driving the NADS vehicle. Because of the real-time requirements, only on-road scenarios are considered, but the work presented here opens the door to moving such virtual experiments off-road.

Efficient off-road mobility on deformable terrain is also critical in facilitating machine learning (ML) training through simulation, such as the use of reinforcement learning for end-to-end autonomous vehicle control policies. Such ML-derived policies have been shown to be successful in controlling autonomous vehicles in convoy scenarios [17] or in “A to B” navigation with obstacle avoidance on rough deformable terrain [18]. The accuracy and quality of such policies derived through reinforcement learning depend on the availability of extensive training data. When such data is the result of simulation, efficiency becomes a crucial feasibility requirement.

Yet another use case which can benefit from faster than real-time simulation is the generation of metamodels (such as surrogate models) used in stochastic simulations for uncertainty quantification [19]. In this type of analysis, physics-based simulations are used to provide the data for constructing surrogate models (e.g., through dynamic kriging) which are then used in Monte Carlo uncertainty quantification simulations. As in the case of ML, efficient physics simulation capabilities allow for more effective and accurate construction of metamodels.

Ongoing and future work focuses on two different aspects: (i) leveraging this capability for various applications, such as human-in-the-loop scenarios (e.g., interaction between conventional and autonomous off-road vehicles), which require real-time capabilities, and ML training, which requires “as fast as possible” simulation capabilities; and (ii) continuing to improve the performance of the simulation tools for deformable terrain, including the more accurate but computationally intensive formulations mentioned in Sec. 1, as well as further improving the SCM performance (for example, by using GPU acceleration).

Funding Data

  • U.S. Department of Defense (Award No. W56HZV-17-C-0095; Funder ID: 10.13039/100000005).

Data Availability Statement

The data and information that support the findings of this article are freely available online.2

References

1. Tasora, A., Serban, R., Mazhar, H., Pazouki, A., Melanz, D., Fleischmann, J., Taylor, M., et al., 2016, “Chrono: An Open Source Multi-Physics Dynamics Engine,” High Performance Computing in Science and Engineering – Lecture Notes in Computer Science, Springer International Publishing, T. Kozubek, ed., Springer, Cham, Switzerland, pp. 19–49.
2. Serban, R., Taylor, M., Negrut, D., and Tasora, A., 2019, “Chrono::Vehicle Template-Based Ground Vehicle Modeling and Simulation,” Int. J. Veh. Performance, 5(1), pp. 18–39. doi:10.1504/IJVP.2019.097096
3. Fang, L., Zhang, R., Vanden Heuvel, C., Serban, R., and Negrut, D., 2021, “Chrono::GPU: An Open-Source Simulation Package for Granular Dynamics Using the Discrete Element Method,” Processes, 9(10), p. 1813. doi:10.3390/pr9101813
4. Hu, W., Zhou, Z., Chandler, S., Apostolopoulos, D., Kamrin, K., Serban, R., and Negrut, D., 2022, “Traction Control Design for Off-Road Mobility Using an SPH-DAE co-Simulation Framework,” Multibody Syst. Dyn., 55(1–2), pp. 165–188. doi:10.1007/s11044-022-09815-2
5. Yamashita, H., Chen, G., Ruan, Y., Jayakumar, P., and Sugiyama, H., 2019, “Hierarchical Multiscale Modeling of Tire–Soil Interaction for Off-Road Mobility Simulation,” ASME J. Comput. Nonlinear Dyn., 14(6), p. 061007. doi:10.1115/1.4042510
6. Bekker, M. G., 1956, Theory of Land Locomotion; the Mechanics of Vehicle Mobility, University of Michigan Press, Ann Arbor, MI.
7. Wong, J. Y., 2001, Theory of Ground Vehicles, Wiley, New York.
8. Janosi, Z., and Hanamoto, B., 1961, “The Analytical Determination of Drawbar Pull as a Function of Slip for Tracked Vehicles in Deformable Soils,” Proceedings of the 1st International Conference on Mechanical Soil–Vehicle Systems, Turin, Italy.
9. Krenn, R., and Hirzinger, G., 2009, “SCM – A Soil Contact Model for Multi-Body System Simulations,” 11th European Regional Conference of the International Society for Terrain-Vehicle Systems (ISTVS), Bremen, Germany, Oct. 5–8. https://core.ac.uk/download/pdf/11138706.pdf
10. Krenn, R., and Gibbesch, A., 2011, “Soft Soil Contact Modeling Technique for Multi-Body System Simulation,” Trends in Computational Contact Mechanics, Springer, Berlin, pp. 135–155.
11. Tasora, A., Mangoni, D., Negrut, D., Serban, R., and Jayakumar, P., 2019, “Deformable Soil With Adaptive Level of Detail for Tracked and Wheeled Vehicles,” Int. J. Veh. Performance, 5(1), pp. 60–76. doi:10.1504/IJVP.2019.097098
12. Amdahl, G., 1967, “Validity of the Single Processor Approach to Achieving Large-Scale Computing Capabilities,” AFIPS Conf. Proc., 30, pp. 483–485. https://dl.acm.org/doi/pdf/10.1145/1465482.1465560
13. Dean, J., and Ghemawat, S., 2008, “MapReduce: Simplified Data Processing on Large Clusters,” Commun. ACM, 51(1), pp. 107–113. doi:10.1145/1327452.1327492
14. Taves, J., Elmquist, A., Young, A., Serban, R., and Negrut, D., 2020, “Synchrono: A Scalable, Physics-Based Simulation Platform for Testing Groups of Autonomous Vehicles and/or Robots,” Proceedings of 2020 International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, Oct. 25–29, pp. 2251–2256. doi:10.1109/IROS45743.2020.9341585
15. Gustafson, J. L., 1988, “Reevaluating Amdahl's Law,” Commun. ACM, 31(5), pp. 532–533. doi:10.1145/42411.42415
16. Benatti, S., Schwarz, C., Young, A., Elmquist, A., Serban, R., and Negrut, D., 2022, “A Geographically Distributed Simulation Framework for the Analysis of Mixed Traffic Scenarios Involving Conventional and Autonomous Vehicles,” SAE Paper No. 2022-01-0839. doi:10.4271/2022-01-0839
17. Young, A., Taves, J., Elmquist, A., Benatti, S., Tasora, A., Serban, R., and Negrut, D., 2022, “Enabling Artificial Intelligence Studies in Off-Road Mobility Through Physics-Based Simulation of Multi-Agent Scenarios,” ASME J. Comput. Nonlinear Dyn., 17(5), p. 051001. doi:10.1115/1.4053321
18. Benatti, S., Young, A., Elmquist, A., Taves, J., Tasora, A., Serban, R., and Negrut, D., 2022, “End-to-End Learning for Off-Road Terrain Navigation Using the Chrono Open-Source Simulation Platform,” Multibody Syst. Dyn., 54(4), pp. 399–414. doi:10.1007/s11044-022-09816-1
19. Choi, K., Jayakumar, P., Funk, M., Gaul, N., and Wasfy, T. M., 2019, “Framework of Reliability-Based Stochastic Mobility Map for Next Generation NATO Reference Mobility Model,” ASME J. Comput. Nonlinear Dyn., 14(2), p. 021012. doi:10.1115/1.4041350