Execution performance of NAS NPB 2.3
Results obtained in March 2002.
Execution performance has always been a very important factor, largely determining the success and adoption of programming languages intended for developing computational programs.
For DVM programs, execution performance is a key question, because at startup the programs must be configured dynamically (without recompilation) for the number and performance of the processors selected to run them. To estimate the performance of DVM programs, corresponding versions of the NAS tests (NPB 2.3) were developed.
These tests reflect well the nature of computational tasks of various classes, except for tasks with irregular grids. Table 1 gives brief characteristics of the tests and their sizes in lines of code for three versions of each program: the sequential version, the MPI version and the DVM version.
Table 1. Brief characteristics of the NAS NPB 2.3 tests
Test | Test characteristics | SEQ | MPI | DVM | MPI/SEQ | DVM/SEQ |
---|---|---|---|---|---|---|
BT | 3D Navier-Stokes Alternating Direction Implicit (ADI) approximate factorization | 3929 | 5744 | 3991 | 1.46 | 1.02 |
CG | Estimation of the largest eigenvalue of a symmetric positive definite sparse matrix | 1108 | 1793 | 1118 | 1.62 | 1.01 |
EP | Generation of pairs of Gaussian random deviates | 641 | 670 | 649 | 1.04 | 1.01 |
FT | FFT-based 3D spectral method | 1500 | 2352 | 1605 | 1.57 | 1.07 |
IS | Parallel sorting | 925 | 1218 | 1067 | 1.32 | 1.17 |
LU | Navier-Stokes 3D Symmetric Successive Over-Relaxation (SSOR) method | 4189 | 5497 | 4269 | 1.31 | 1.02 |
MG | 3D scalar Poisson equation Multigrid method | 1898 | 2857 | 2131 | 1.50 | 1.12 |
SP | Navier-Stokes 3D Beam-Warming approximate factorization | 3361 | 5020 | 3630 | 1.49 | 1.08 |
Total | | 17551 | 25151 | 18460 | 1.43 | 1.05 |

Version | Description |
---|---|
SEQ | serial code |
MPI | parallel code in Fortran 77 or C (IS) + MPI |
DVM | parallel code in Fortran-DVM or C-DVM (IS) |
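
The size ratios in Table 1 are simple quotients of line counts, so they can be rechecked with a few lines of Python. The sketch below is illustrative only: the names `LINE_COUNTS`, `size_ratios` and `totals` are hypothetical, the numbers are copied from Table 1, and the SEQ count for SP is taken as 3361, the value consistent with the column total of 17551. Recomputed ratios can differ slightly from the printed ones in the last digit.

```python
# Line counts per test as (SEQ, MPI, DVM), copied from Table 1.
# Assumption: the SEQ count for SP is 3361, which matches the column total 17551.
LINE_COUNTS = {
    "BT": (3929, 5744, 3991),
    "CG": (1108, 1793, 1118),
    "EP": (641, 670, 649),
    "FT": (1500, 2352, 1605),
    "IS": (925, 1218, 1067),
    "LU": (4189, 5497, 4269),
    "MG": (1898, 2857, 2131),
    "SP": (3361, 5020, 3630),
}


def size_ratios(counts):
    """Return {test: (MPI/SEQ, DVM/SEQ)} computed from the line counts."""
    return {
        name: (mpi / seq, dvm / seq)
        for name, (seq, mpi, dvm) in counts.items()
    }


def totals(counts):
    """Return the column sums (SEQ, MPI, DVM) over all tests."""
    seq = sum(c[0] for c in counts.values())
    mpi = sum(c[1] for c in counts.values())
    dvm = sum(c[2] for c in counts.values())
    return seq, mpi, dvm


if __name__ == "__main__":
    for name, (r_mpi, r_dvm) in size_ratios(LINE_COUNTS).items():
        print(f"{name}: MPI/SEQ = {r_mpi:.2f}, DVM/SEQ = {r_dvm:.2f}")
    seq, mpi, dvm = totals(LINE_COUNTS)
    print(f"Total: MPI/SEQ = {mpi / seq:.2f}, DVM/SEQ = {dvm / seq:.2f}")
```

A ratio close to 1 means the parallel version is barely longer than the sequential source; over all eight tests the DVM versions add about 5% to the sequential line count, while the MPI versions add about 43%.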
Ratio of the execution time of the MPI version to the execution time of the DVM version for each test on MVS-1000m
Note. The table contains no comparison results for the IS and MG tests on 512 processors, because the MPI versions of these tests could not be run.
Of course, a comparison on the NPB 2.3 tests is not entirely fair: they are written at a very high professional level and are the object of close attention from many experts. When real parallel programs are developed, achieving high performance usually requires modifying the program many times in search of the best parallelization scheme. The success of such a search depends on how easy the program is to modify. Moreover, it is difficult for an application programmer to implement many commonly used parallelization methods as efficiently as programming systems implement them. Therefore, on real programs the MPI approach very often loses to the DVM approach in efficiency.