Computational Fluid Dynamics (CFD) plays a fundamental role in the aerodynamic design of multistage compressors and turbines, as it allows the complex behaviour of turbomachinery components to be characterized with high fidelity.
Driven by the availability of increasingly powerful computing resources, current trends pursue the adoption of such high-fidelity tools and state-of-the-art technology even in the preliminary design phases. Within this framework, Graphics Processing Units (GPUs) offer further growth potential, allowing a significant reduction of CFD process turn-around times at relatively low cost.
The aim of the present work is to illustrate the design and implementation of an explicit density-based RANS coupled solver for the efficient and accurate numerical simulation of multi-dimensional, time-dependent compressible fluid flows on polyhedral unstructured meshes. The solver has been developed within the object-oriented OpenFOAM framework, using OpenCL bindings to interface the CPU with the GPU and MPI to coordinate multiple GPUs.
The overall structure of the code, the numerical strategies adopted and the algorithms implemented are specifically designed to best exploit the peak computational power offered by modern GPUs, by minimizing both memory transfers between CPU and GPU and the occurrence of branch divergence. This has a significant impact on the achievable speedup factor and is especially challenging within a polyhedral unstructured mesh framework. Specific tools for turbomachinery applications, such as the Arbitrary Mesh Interface (AMI) and the mixing-plane (MP), are implemented within the GPU context.
The credibility of the proposed CFD solver is assessed by tackling a number of benchmark test problems, including the Rotor 67 axial compressor, the C3X stator blade with conjugate heat transfer and the Aachen multi-stage turbine. An average GPU speedup factor of approximately S ≃ 50 with respect to the CPU is achieved (single precision, both GPU and CPU in the 100 USD price range). Preliminary parallel scalability tests run on multiple GPUs show a parallel efficiency factor of approximately E ≃ 75%.