Execution performance of NAS NPB 2.3

Results obtained in March 2002.

Execution performance has always been a very important factor, largely determining the success and spread of programming languages intended for developing computational programs.

For DVM programs, execution performance is a principal question, because at startup the programs must be customized dynamically (without recompilation) to the number and performance of the processors selected for their execution. To estimate the performance of DVM programs, appropriate versions of the NAS tests (NPB 2.3) were developed.

These tests reflect well the nature of computational tasks of various classes, except for tasks with irregular grids. Table 1 below gives brief characteristics of the tests and their sizes in lines for three versions of each program: the sequential version, the MPI version, and the DVM version.

Table 1. The brief characteristics of NAS NPB 2.3
Test | Test characteristics | SEQ | MPI | DVM | MPI/SEQ | DVM/SEQ
BT | 3D Navier-Stokes Alternating Direction Implicit (ADI) approximate factorization | 3929 | 5744 | 3991 | 1.46 | 1.02
CG | Estimation of the largest eigenvalue of a symmetric positive definite sparse matrix | 1108 | 1793 | 1118 | 1.62 | 1.01
EP | Generation of pairs of Gaussian random deviates | 641 | 670 | 649 | 1.04 | 1.01
FT | FFT-based 3D spectral method | 1500 | 2352 | 1605 | 1.57 | 1.07
IS | Parallel sorting | 925 | 1218 | 1067 | 1.32 | 1.17
LU | Navier-Stokes 3D Symmetric Successive Over-Relaxation (SSOR) method | 4189 | 5497 | 4269 | 1.31 | 1.02
MG | 3D scalar Poisson equation Multigrid method | 1898 | 2857 | 2131 | 1.50 | 1.12
SP | Navier-Stokes 3D Beam-Warming approximate factorization | 3361 | 5020 | 3630 | 1.49 | 1.08
Total | | 17551 | 25151 | 18460 | 1.43 | 1.05
SEQ: serial code
MPI: parallel code in Fortran 77 or C (IS) + MPI
DVM: parallel code in FORTRAN-DVM or C-DVM (IS)
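
The size ratios in Table 1 can be rechecked with a few lines of code. The following is a minimal sketch, not part of the original measurements: the names SIZES, size_ratios and totals are hypothetical, and the sequential size of SP is assumed to be 3361, the value implied by the total row (17551). Because of rounding, a few recomputed ratios may differ slightly from the printed ones.

```python
# Line counts from Table 1 as (SEQ, MPI, DVM) for each test.
# Assumption: SP's sequential size is taken as 3361, the value implied
# by the total row of the table (17551).
SIZES = {
    "BT": (3929, 5744, 3991),
    "CG": (1108, 1793, 1118),
    "EP": (641, 670, 649),
    "FT": (1500, 2352, 1605),
    "IS": (925, 1218, 1067),
    "LU": (4189, 5497, 4269),
    "MG": (1898, 2857, 2131),
    "SP": (3361, 5020, 3630),
}


def size_ratios(seq, mpi, dvm):
    """Return (MPI/SEQ, DVM/SEQ), each rounded to two decimals."""
    return round(mpi / seq, 2), round(dvm / seq, 2)


def totals(sizes):
    """Sum the SEQ, MPI and DVM columns over all tests."""
    seq = sum(row[0] for row in sizes.values())
    mpi = sum(row[1] for row in sizes.values())
    dvm = sum(row[2] for row in sizes.values())
    return seq, mpi, dvm


if __name__ == "__main__":
    for name, row in SIZES.items():
        mpi_ratio, dvm_ratio = size_ratios(*row)
        print(f"{name}: MPI/SEQ = {mpi_ratio:.2f}, DVM/SEQ = {dvm_ratio:.2f}")
    total_row = totals(SIZES)
    print("Total:", total_row, size_ratios(*total_row))
```

The total row shows the overall picture: under the stated assumption, the MPI versions are about 43% longer than the sequential code, while the DVM versions are about 5% longer.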
The ratio of the execution time of the MPI version to the execution time of the DVM version for each test on MVS-1000m

Fig. 1. The ratio of the execution time of the MPI version to the execution time of the DVM version for each test (class A) on MVS-1000m
Fig. 2. The ratio of the execution time of the MPI version to the execution time of the DVM version for each test (class C) on MVS-1000m

Note. No comparison results are given for the IS and MG tests on 512 processors, because the MPI versions of these tests could not be run.

Admittedly, a comparison on the NPB 2.3 tests is not entirely fair: they are written at a very high professional level and are the object of close attention from many experts. In the development of real parallel programs, achieving high performance usually requires changing the program many times in search of the best parallelization scheme. The success of such a search is determined by how easily the program can be modified. Moreover, it is difficult for an application programmer to implement many frequently used parallelization methods as efficiently as programming systems implement them. Therefore, on real programs the MPI approach very often loses in efficiency to the DVM approach.