A parallel linear equation solver that can use 1000+ processors effectively has become the bottleneck of large-scale implicit engineering simulations. In this paper, we present a new hierarchical parallel master-slave-structural iterative algorithm for solving super large-scale sparse linear equations on a distributed-memory computer cluster. By alternately performing global equilibrium computation and local relaxation, the specified accuracy requirement can be met in a few iterations. Moreover, each set/slave-processor communicates mainly with its nearest neighbors, and the data transferred between sets/slave-processors and the master-processor is always far less than the data exchanged between neighboring sets/slave-processors. The corresponding algorithm for implicit finite element analysis has been implemented with the MPI library, and a super large two-dimensional square triangular-lattice truss structure under randomly distributed loads is simulated with over 1 × 10^9 degrees of freedom (DOF) on up to 2001 processors of the “Exploration 100” cluster at Tsinghua University. The numerical experiments demonstrate that this algorithm has excellent parallel efficiency and high scalability, and it may find broad application in other implicit simulations.
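As a rough illustration of alternating a global step with local relaxation, the serial sketch below applies a generic two-level iteration to a small 1D model problem. It is not the algorithm of the paper: the aggregation-based coarse correction, the block Gauss-Seidel relaxation, the model matrix, and all names (`two_level_solve`, `block`, `sets`) are assumptions chosen only to make the idea concrete, and no MPI communication is modeled.

```python
import numpy as np

def two_level_solve(A, f, block, tol=1e-8, max_iter=5000):
    """Alternate a coarse 'global' correction with per-set local solves.

    Hypothetical sketch: a generic two-level scheme, not the paper's method.
    """
    n = A.shape[0]
    # Partition the unknowns into contiguous sets (one per notional worker).
    sets = [np.arange(s, min(s + block, n)) for s in range(0, n, block)]
    # Aggregation operator: one coarse unknown per set (piecewise constant).
    P = np.zeros((n, len(sets)))
    for j, idx in enumerate(sets):
        P[idx, j] = 1.0
    A_c = P.T @ A @ P  # small coarse matrix, solved on the notional master
    u = np.zeros(n)
    f_norm = np.linalg.norm(f)
    for it in range(1, max_iter + 1):
        # Global step: correct the set-averaged part of the residual.
        r = f - A @ u
        u += P @ np.linalg.solve(A_c, P.T @ r)
        # Local relaxation: each set solves its own block exactly,
        # using the latest values of its neighbors (block Gauss-Seidel).
        for idx in sets:
            r_loc = f[idx] - A[idx] @ u
            u[idx] += np.linalg.solve(A[np.ix_(idx, idx)], r_loc)
        if np.linalg.norm(f - A @ u) <= tol * f_norm:
            return u, it
    return u, max_iter

# Model problem (an assumption): 1D Laplacian with a random right-hand side.
n = 32
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
rng = np.random.default_rng(0)
f = rng.standard_normal(n)
u, iters = two_level_solve(A, f, block=8)
print("iterations:", iters, "residual:", np.linalg.norm(f - A @ u))
```

In this sketch only the small vector `P.T @ r` and the coarse solution would pass through a master process, while each local solve needs values only from adjacent sets, which loosely mirrors the communication pattern described in the abstract.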
