Implementation of parallel processing in computational fluid dynamics (CFD) is shortening the time required to design products and systems, and is bringing once-elusive problems under a new measure of control. Parallel processing, which is making ever larger models practical, is based on an idea called domain decomposition. The software divides the flow domain into segments of roughly equal computational work. The technology is influencing automotive and aerospace testing by getting more and more work out of the wind tunnel and onto a computer, where it costs less to perform. DaimlerChrysler in Stuttgart, Germany, explored the aerodynamics of its Mercedes E-class sedan with a 10 million cell model. Besides saving time and money by parallel processing, CFD simulation permits review of arbitrary locations in a flow field, including particle tracking, and can rapidly highlight areas of concern about heat buildup. There are several factors that can inhibit a high degree of parallel scaling. Domain decomposition algorithms affect the load balancing between processors. Imbalances slow the system because the solution isn't in until the last processor is finished.
Simulations of problems ranging from engine cooling to environmental air currents are reaching a speed and scale that only a few years ago would have been out of reach for anything less than the most advanced, and expensive, supercomputer.
Readily available hardware and software are supporting flow models involving millions of cells that have explored the vortices behind the wings of jets and breathed the atmosphere permeating an airport in Singapore.
The simulations rely on parallel processing, that is, on breaking up complex, large-scale models and dividing the parts among several processors to run at the same time.
Parallel processing has taken large-scale CFD modeling out of the realm of the exotic and made it commercially practical. It is shortening the time required to design products and systems, and is bringing once-elusive problems under a new measure of control.
The technology is influencing automotive and aerospace testing by getting more and more work out of the wind tunnel and onto a computer, where it costs less to perform. CFD simulation is used to guide smarter tests, for example, eliminating preliminary models at the start of the design process and using the expensive wind tunnel for detailed models that are more representative of the final design.
A recent case at DaimlerChrysler in Auburn Hills, Mich., a study of the thermal effects of airflow into the engine compartment, would have been impractical using only real-world models. It also would have taken 27 hours to simulate on one processor, but divided among 64, the job required just 27 minutes.
As recently as three years ago, high resolution in computational fluid dynamics was prohibitively costly, because of supercomputer prices and the time needed to solve equations. But much has changed. Technological breakthroughs on several fronts have substantially reduced meshing and solution turnaround times. Software and hardware developments allow simulations to be apportioned among dozens of processors.
NASA has a setup that has run problems through as many as 256 processors at one time.
The system consists of two merged Silicon Graphics Inc. Origin 2000 computers, each with 128 processors. The linked system, designed in a joint research program between SGI and NASA Ames, has 64 gigabytes of memory and nearly a terabyte of disk storage. Its maiden run was a 35 million point CFD calculation using Overflow-MLP code to study the airflow over a new aircraft design for a commercial customer. Overflow-MLP is a highly parallel NASA Ames version of the production Overflow code originally developed at Ames for Cray vector machines.
According to Jim Taft, a physicist at NASA Ames in Mountain View, Calif., where the research was done, the aircraft simulation is the largest NASA has handled so far on the system, and the agency aims "to run a 200 million point simulation" before the year is out.
"Large-scale and high-fidelity simulations are intended to test designs before they're put into our wind tunnels," Taft said.
As large as it was, the aircraft simulation only approximated the complex events created by airflow over an aircraft's wing. Even 200 million points of reference won't yield all the answers, but greater resolution provides more confidence in the solutions. This translates into a shortening of the time it takes to work out a final design.
High-fidelity simulation "dramatically reduces the quantity of wind tunnel testing," Taft said.
Aerospace engineers can wait years for a turn to run a design through a series of tests in a wind tunnel. A simulation, Taft said, "is a winnowing tool to get you close to a design that might work." The result, he added, is shorter time to market.
Singapore's Institute of High Performance Computing created a 1.8 million cell model of a plan for a new parking area at Changi International Airport, to study the effectiveness of the proposed air circulation system.
Parallel processing, which is making ever larger models practical, is based on an idea called domain decomposition. The software divides the flow domain into segments of roughly equal computational work. Each part of the model is apportioned among processors and solved simultaneously. Messages pass among the processors to keep the solution coherent, and the solution completes in the time it takes the longest-working processor to finish its piece of the task.
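The idea can be illustrated with a minimal Python sketch. It is not how any commercial solver partitions a mesh: real partitioners work on the mesh connectivity and also try to limit the traffic between neighboring segments. The function names `decompose` and `wall_time` are hypothetical, and the sketch assumes every cell costs the same amount of work.

```python
def decompose(n_cells, n_procs):
    """Split n_cells into n_procs segments of near-equal size (cell counts only)."""
    base, extra = divmod(n_cells, n_procs)
    # The first `extra` segments each take one additional cell.
    return [base + (1 if rank < extra else 0) for rank in range(n_procs)]


def wall_time(segment_sizes, seconds_per_cell):
    """The run ends only when the most heavily loaded processor finishes."""
    return max(size * seconds_per_cell for size in segment_sizes)


# Toy case: 10 cells on 4 processors, assuming 2 seconds of work per cell.
segments = decompose(10, 4)
elapsed = wall_time(segments, 2.0)
print(segments, elapsed)
```

With 10 cells on four processors, two segments hold three cells and two hold two, so the elapsed time is set by the three-cell segments.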
DaimlerChrysler worked with Silicon Graphics and Fluent Inc. on a simulation of underhood airflow. The model included a coarse treatment of external aerodynamics, and contained more than one million cells. The flow conditions were incompressible, isothermal, and steady. A K-epsilon mathematical model represented turbulence since direct simulations are impractical.
Boundary conditions included a uniform inlet velocity and a constant pressure outlet. The simulation used Fluent software and a Silicon Graphics Origin 2000 computer system with 64 processors. The simulation, which would have required an estimated 27 hours to run on a single processor, finished up in as many minutes.
Space under the hood is always at a premium, and excessive heat can build up in areas where stagnant air cannot be adequately swept out by incoming cooler air. What's more, consideration of underhood cooling often comes late in the design of a vehicle, when much of the package detail is fixed.
One solution to heat buildup is to enlarge the opening in front of the radiator to increase airflow, but the design compromises aerodynamic efficiency. As the size of the opening increases, so does aerodynamic drag.
The number of conditions and configurations that would potentially require testing to understand underhood cooling requirements would be impractical with the exclusive use of wind tunnel experiments. The detail of each prototype configuration would require an enormous amount of preparation time.
Besides saving time and money by virtue of parallel processing, CFD simulation permits review of arbitrary locations in a flow field, including particle tracking, and can rapidly highlight areas of concern about heat buildup.
A mechanical engineering consulting firm on Long Island, Analysis and Design Application Co., uses parallel processing with Star-HPC software for a variety of automotive applications on Origin 2000 platforms. Adapco, which is based in Melville, N.Y., has run CFD simulations to answer questions about external aerodynamics, underhood thermal management, engine cooling, and passenger climate systems. Model sizes range from 500,000 to 10 million cells.
"Our view of what CFD can accomplish is changing unbelievably fast," said Greg Failla, senior CFD analyst with Adapco. "Multiple design iterations are easily performed on problems that were formidable giants only a year ago."
Among the most computationally involved problems that Adapco handles are in-cylinder transient analyses. In addition to presenting complex motion characteristics of the piston, these problems also require the accurate modeling of the spray and combustion phenomena. While model sizes are generally on the order of 500,000 elements, the long cycle times for the simulations make these problems ideal candidates for parallel processing.
With a single processor, the intake-compression analysis cycle takes roughly four weeks. Adapco typically uses five processors for transient in-cylinder analyses: four for the solver and one for the geometry engine, the software that computes a new mesh at each time step in the cycle. This reduces the analysis time to about a week.
Adapco has analyzed a complete seven million element model of a torque converter on eight processors. Since the stator and rotor have different numbers of blades, it is often necessary to analyze the full 360-degree model to accurately capture blade passing effects. The analysis of a single revolution required about two weeks.
The automotive industry has been the largest user of commercial CFD solutions, and this trend is likely to continue. The characteristics of automotive CFD demand that large and geometrically complex models be constructed to accurately capture flow field detail to a level that would influence early design decisions. Aerodynamic characteristics in particular are important during the early stages of the design process before a vehicle's shape becomes fixed. To replace clay models in wind tunnels, aerodynamic CFD simulation would require high resolution and the ability of software to accurately predict such flow detail as locations of shear layer separation and reattachment, and accurately capture unsteady wake behavior.
Other fluid flow complexities arise from the effects of a moving ground plane and rotating wheels. Consequently, detailed automotive aerodynamic CFD simulations are usually not attempted for quantitative information, owing to the limited design value of detailed simulations that generally have long turnaround times and uncertain accuracy. Current automotive aerodynamic simulations are typically restricted to relatively moderate resolution models that produce mostly qualitative design information. The value that scalable CFD performance can deliver in this process is the ability to increase model sizes for high-fidelity resolution, while maintaining adequate solution turnaround times that fit within design cycle times.
A Virtual Wind Tunnel
But some encouraging results are beginning to change automotive aerodynamics.
DaimlerChrysler in Stuttgart, Germany, rendered a 10 million cell model of a Mercedes E-class sedan for aerodynamic simulation using Star-HPC software from Computational Dynamics Ltd. of London. The results of this simulation have convinced the company's engineers that the virtual wind tunnel is within close range. "It was a real breakthrough for us to see that the achievable accuracy is indeed of the same quality as wind tunnel experiments," said one Mercedes-Benz aerodynamicist.
Based on talks with auto industry executives, the authors estimate that a typical full-scale wind tunnel costs approximately $20 million to build and an additional $5 million to $10 million a year to maintain. As more alternative underhood airflow simulations are conducted, the savings on wind tunnel expense can be applied to complex studies such as drag prediction and aeroacoustics, which are not currently as suited to CFD simulation as underhood modeling.
An application of parallel processing that included a wide range of scale in its mesh studied the effects on air quality of a planned renovation at Singapore's Changi International Airport.
The Land Transport Authority of Singapore needed to evaluate the ventilation system proposed for a redesigned car parking area at Changi, so the agency turned to the Institute of High Performance Computing, a government-supported organization that promotes advanced computer simulation.
The agency wanted to make sure that the ventilation system would protect workers and travelers from high levels of pollutants and, at the same time, determine that the system was not overdesigned so as to waste energy.
The pollutants come from idling taxicabs on the first floor of the parking area, where they wait to pick up passengers from the terminal. There can be as many as 213 cabs with their engines running at one time.
The ventilation system removes as much of the exhaust fumes as possible and dilutes whatever remains by supplying fresh air.
According to Kurichi R. Kumar, manager of the computing institute's CFD division, the flow model was built based on the drawings provided by the LTA. The outlets of the taxis' exhaust pipes were included in the model in the form of circular openings or inlet sources. The model included all six levels of the parking garage and gross features of terminal buildings.
The model consisted of more than 1.8 million cells of an unstructured mesh, ranging in scale from a few millimeters for the exhaust pipe openings to several meters for various structural forms.
According to Kumar, the IHPC ran the steady and unsteady state simulations on Fluent software.
AutoCAD from Autodesk of San Rafael, Calif., was used for CAD geometry importing and some geometry cleaning. Geometry building and surface meshing were mainly done using HyperMesh from Altair Computing Inc. of Troy, Mich. Post-processing was conducted with EnSight from CEI of Morrisville, N.C.
The steady state calculations took about 34 hours with four CPUs dedicated to the run.
The unsteady calculations were carried out on an Origin 2000 at Silicon Graphics' corporate headquarters in California. The simulation was run in parallel, using 32 dedicated CPUs, and took approximately 95 hours to simulate 210 seconds (using a time step of one second) of transient time history.
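The figures for the unsteady run imply a cost per time step, which a short Python sketch makes explicit. The inputs are the approximate numbers reported above. The variable names are illustrative, and the even split across time steps is an assumption, since individual steps may well have taken different amounts of time.

```python
# Reported figures for the unsteady Changi run (approximate).
wall_hours = 95          # elapsed time of the run
n_cpus = 32              # dedicated processors
simulated_seconds = 210  # transient history covered
time_step_seconds = 1    # size of each time step

n_steps = simulated_seconds // time_step_seconds
# Assumes the elapsed time is spread evenly over the steps.
minutes_per_step = wall_hours * 60 / n_steps
# Total processor time consumed by the run.
cpu_hours = wall_hours * n_cpus

print(f"{n_steps} steps, about {minutes_per_step:.0f} minutes each, "
      f"{cpu_hours} CPU-hours in total")
```

On these numbers, each one-second step cost roughly 27 minutes of elapsed time, and the run consumed about 3,040 CPU-hours.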
The simulation revealed that there are certain locations in the car parking area where the levels of CO could be higher than desired. This was mainly due to the structural restrictions of the building. Recommendations to remove a part of the wall and to supply more fresh air have been made, and will be implemented in the operation.
There are several factors that can inhibit a high degree of parallel scaling. Domain decomposition algorithms affect the load balancing between processors. Imbalances slow the system because the solution isn't in until the last processor is finished.
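One common way to put a number on load imbalance is to compare the average work per processor with the work of the busiest one. The Python sketch below uses made-up work figures and a hypothetical function name, `load_balance_efficiency`. It ignores communication time entirely.

```python
def load_balance_efficiency(work_per_proc):
    """Ratio of mean work to maximum work: 1.0 means perfectly balanced."""
    mean_work = sum(work_per_proc) / len(work_per_proc)
    # The slowest processor sets the finish time, so it is the denominator.
    return mean_work / max(work_per_proc)


# Hypothetical work units on four processors.
balanced = [100, 100, 100, 100]
skewed = [100, 100, 100, 160]   # one processor carries 60 percent more work

print(load_balance_efficiency(balanced))
print(load_balance_efficiency(skewed))
```

In the skewed case, three processors sit idle for part of the run while the fourth finishes, and the efficiency falls to about 72 percent.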
Software also has to determine the size of partition boundaries and consequent message-passing requirements. Careful planning of what data to pass between processors, and when to do so, is important. System architecture, communications subsystems, and message-passing software will affect parallel scaling.
SGI's studies suggest that a CFD engineer can expect well-designed parallel CFD software to yield a linear increase in computing speed to a limit of 16 processors. Up to that point, doubling the number of processors will cut computing time in half.
More complex problems can see some tapering in the curve of increase. But a nearly linear increase is not impossible. In DaimlerChrysler's underhood examination, for instance, 64 processors worked 60 times faster than a single processor could have done.
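The DaimlerChrysler figures can be checked with the usual definitions of speedup (single-processor time divided by parallel time) and parallel efficiency (speedup divided by the number of processors). The Python sketch below uses the times quoted in this article. The 27-hour figure was itself an estimate, so the result is approximate, and the function names are illustrative.

```python
def speedup(serial_time, parallel_time):
    """How many times faster the parallel run is than the serial one."""
    return serial_time / parallel_time


def efficiency(serial_time, parallel_time, n_procs):
    """Fraction of ideal (linear) speedup achieved on n_procs processors."""
    return speedup(serial_time, parallel_time) / n_procs


serial_minutes = 27 * 60   # estimated single-processor run: 27 hours
parallel_minutes = 27      # reported run on 64 processors

s = speedup(serial_minutes, parallel_minutes)
e = efficiency(serial_minutes, parallel_minutes, 64)
print(f"speedup {s:.0f}x on 64 processors, efficiency {e:.1%}")
```

A speedup of 60 on 64 processors works out to an efficiency of about 94 percent, which is why the result counts as nearly linear.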