Execution performance of NAS NPB 2.3

Results obtained in July 2001.

The tables below give the source sizes and execution times of the MPI programs and DVM programs for the NAS tests.

Compared with the sequential program, the DVM program is on average 5% larger, whereas the MPI program is on average 40% larger. The DVM program grows only because special comments are inserted, and these comments do not depend on array sizes or on the number of processors. The additional code in the MPI program is a complicated system of routines that manage message passing, and it does depend on array sizes and on the number of processors.

The performance of the DVM programs and the MPI programs is comparable. In some cases, however, the DVM program is slower by 50-60%. There are two reasons for this. First, the DVM system does not use MPI collective operations, which some parallel systems execute more efficiently than an equivalent implementation built on point-to-point communications. Second, the MPI versions of some tests are parallelized along two dimensions of the processor grid, whereas the DVM versions of all tests currently run only on a line of processors. Work to eliminate both causes is in progress.
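The second cause can be illustrated with a simplified boundary-exchange model. This is only a sketch under stated assumptions, not a description of how the MPI or DVM versions of the tests actually communicate: it assumes a cubic n x n x n grid, a one-cell-wide boundary exchange, an interior processor with neighbours on every cut side, and grid sizes divisible by the processor counts. The function names and the sample size n = 64 are illustrative choices.

```python
def halo_cells_line(n):
    """Boundary cells exchanged by one interior processor when an
    n*n*n grid is cut into slabs along a single dimension.
    Each slab has two cut faces of n*n cells, whatever the slab count."""
    return 2 * n * n


def halo_cells_grid(n, px, py):
    """Boundary cells exchanged by one interior processor when the
    grid is cut along two dimensions into px * py blocks.
    Each block is (n/px) x (n/py) x n cells: the two faces created by
    the x cuts have (n/py)*n cells each, and the two faces created by
    the y cuts have (n/px)*n cells each."""
    return 2 * (n // py) * n + 2 * (n // px) * n


if __name__ == "__main__":
    n = 64  # illustrative grid size (assumption)
    print("16 processors in a line:", halo_cells_line(n))
    print("4 x 4 processor grid:   ", halo_cells_grid(n, 4, 4))
```

Under these assumptions, each of 16 processors in a line exchanges 8192 boundary cells, while each processor of a 4 x 4 grid exchanges 4096, which is one way a two-dimensional decomposition can reduce communication.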

Table 1. Sizes of NAS NPB 2.3 sources (in lines)

Test   SEQ     MPI     DVM     MPI/SEQ   DVM/SEQ
BT     4059    5744    4146    1.41      1.02
CG     1108    1793    1118    1.62      1.01
EP     641     670     649     1.04      1.01
FT     1500    2352    1605    1.57      1.07
IS     925     1218    1085    1.32      1.17
LU     4189    5497    4269    1.31      1.02
MG     1898    2857    1992    1.50      1.05
SP     3361    5020    3580    1.49      1.06
Σ      17681   25151   18444   1.42      1.04

SEQ   serial code
MPI   parallel code in Fortran 77 or C (IS) + MPI
DVM   parallel code in FORTRAN-DVM or C-DVM (IS)
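The average size increases quoted in the text (about 5% for DVM and about 40% for MPI) can be checked against the line counts in Table 1. The sketch below recomputes them in two ways: as the mean of the per-test ratios, and as the ratio of the column totals. The names `SIZES`, `mean_ratio` and `total_ratio` are illustrative, not part of any benchmark tooling.

```python
# Line counts from Table 1: test -> (SEQ, MPI, DVM)
SIZES = {
    "BT": (4059, 5744, 4146),
    "CG": (1108, 1793, 1118),
    "EP": (641, 670, 649),
    "FT": (1500, 2352, 1605),
    "IS": (925, 1218, 1085),
    "LU": (4189, 5497, 4269),
    "MG": (1898, 2857, 1992),
    "SP": (3361, 5020, 3580),
}


def mean_ratio(column):
    """Mean over all tests of parallel size / sequential size.
    column = 1 selects MPI, column = 2 selects DVM."""
    ratios = [row[column] / row[0] for row in SIZES.values()]
    return sum(ratios) / len(ratios)


def total_ratio(column):
    """Ratio of the column total to the SEQ total (the last row of Table 1)."""
    total = sum(row[column] for row in SIZES.values())
    seq_total = sum(row[0] for row in SIZES.values())
    return total / seq_total


if __name__ == "__main__":
    for name, column in (("MPI", 1), ("DVM", 2)):
        print(f"{name}: mean of ratios {mean_ratio(column):.2f}, "
              f"ratio of totals {total_ratio(column):.2f}")
```

The mean of the per-test ratios comes out at about 1.41 for MPI and 1.05 for DVM, and the ratio of totals at about 1.42 and 1.04, consistent with the figures in the text and in the last row of the table.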

Performance of MPI-programs and DVM-programs for NAS NPB 2.3

NCI-cluster   Pentium III/500 + Myrinet, Windows NT, MPI-FM, Visual C++ 6.0, Digital Fortran 5.0
RCC-cluster   Pentium III/500 + SCI, Red Hat Linux release 6.1 (Cartman), ScaMPI, Portland Group C compiler, Portland Group F77 compiler
MVS-1000/16   Pentium III/800 + Fast Ethernet, Red Hat Linux release 7.0 (Guinness), Router, LAM-MPI, GNU C compiler version 2.96, GNU Fortran compiler version 2.96
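
Table 2. BT test execution times in seconds (class A)

NP   NCI-cluster (Peking)         RCC-cluster (MSU)            MVS-1000/16 (KIAM)
     MPI      DVM      DVM/MPI    MPI      DVM      DVM/MPI    MPI      DVM      DVM/MPI
1    [a]
2    -        -        -          -        -        -          -        -        -
4    656.9    716.7    1.09       606.1    712.3    1.17       568.2    571.1    1.00
8    446.3    390.4    0.87       284.7    380.6    1.34       314.8    303.5    0.96
16   271.4    270.8    1.00       220.8    231.2    1.04       [b]

[a] The source row for NP = 1 holds a single value, 2548.5; its cluster and column are not indicated.
[b] The source gives one MVS-1000/16 time for NP = 16 (208.9) without indicating whether it is the MPI or the DVM time.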

Table 3. CG test execution times in seconds (class A)

NP   NCI-cluster (Peking)         RCC-cluster (MSU)            MVS-1000/16 (KIAM)
     MPI      DVM      DVM/MPI    MPI      DVM      DVM/MPI    MPI      DVM      DVM/MPI
1    43.7     45.4     1.04       41.4     42.9     1.04       30.6     30.9     1.01
2    22.0     24.9     1.13       28.3     22.8     0.81       16.7     19.4     1.16
4    12.0     14.0     1.17       11.7     13.6     1.16       12.0     13.1     1.09
8    6.4      9.0      1.41       6.3      9.1      1.44       7.3      9.9      1.36
16   5.0      8.9      1.78       5.0      7.0      1.40       [a]

[a] The source gives one MVS-1000/16 time for NP = 16 (8.6) without indicating whether it is the MPI or the DVM time.
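The execution times can also be read as speedup and parallel efficiency. The source itself reports only times and DVM/MPI ratios, so the two metrics below are the standard definitions (speedup = T(1) / T(NP), efficiency = speedup / NP) applied here as an illustration. The data are the NCI-cluster columns of the CG table; the names `CG_NCI`, `speedup` and `efficiency` are illustrative.

```python
# CG test, class A, NCI-cluster: execution times in seconds (Table 3)
NP = [1, 2, 4, 8, 16]
CG_NCI = {
    "MPI": [43.7, 22.0, 12.0, 6.4, 5.0],
    "DVM": [45.4, 24.9, 14.0, 9.0, 8.9],
}


def speedup(times):
    """Speedup relative to the first entry (the one-processor run)."""
    return [times[0] / t for t in times]


def efficiency(times, procs):
    """Parallel efficiency: speedup divided by the processor count."""
    return [s / p for s, p in zip(speedup(times), procs)]


if __name__ == "__main__":
    for version, times in CG_NCI.items():
        for p, s, e in zip(NP, speedup(times), efficiency(times, NP)):
            print(f"{version} NP={p:2d} speedup={s:5.2f} efficiency={e:4.2f}")
```

With these numbers, the MPI version reaches a speedup of about 8.7 on 16 processors and the DVM version about 5.1, which matches the growth of the DVM/MPI ratio from 1.04 to 1.78 in the NCI-cluster column.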

Table 4. EP test execution times in seconds (class A)

NP   NCI-cluster (Peking)         RCC-cluster (MSU)            MVS-1000/16 (KIAM)
     MPI      DVM      DVM/MPI    MPI      DVM      DVM/MPI    MPI      DVM      DVM/MPI
1    434.3    414.4    0.95       389.3    393.1    1.01       306.7    305.7    0.99
2    217.1    207.3    0.95       179.7    196.7    1.09       153.2    153.0    1.00
4    108.6    103.7    0.95       97.7     98.4     1.01       77.4     77.3     1.00
8    54.3     51.9     0.95       48.9     49.3     1.01       38.7     38.9     1.01
16   28.0     26.9     0.96       24.5     75.0     1.02       [a]

[a] The source gives one MVS-1000/16 time for NP = 16 (21.1) without indicating whether it is the MPI or the DVM time.
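
Table 5. FT test execution times in seconds (class A)

NP   NCI-cluster (Peking)         RCC-cluster (MSU)            MVS-1000/16 (KIAM)
     MPI      DVM      DVM/MPI    MPI      DVM      DVM/MPI    MPI      DVM      DVM/MPI
1    [a]
2    [b]
4    47.5     45.9     0.97       42.5     42.6     1.00       33.7     32.9     0.98
8    27.1     24.7     0.91       21.2     26.0     1.23       19.8     19.8     1.00
16   21.2     14.8     0.70       13.3     14.5     1.09       [c]

[a] The source row for NP = 1 holds one MPI, DVM, DVM/MPI triple (130.2, 136.1, 1.04); its cluster is not indicated.
[b] The source row for NP = 2 holds one triple (88.2, 75.8, 0.88) followed by a single value (58.1); their clusters and the column of the single value are not indicated.
[c] The source gives one MVS-1000/16 time for NP = 16 (13.5) without indicating whether it is the MPI or the DVM time.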

Table 6. IS test execution times in seconds (class A)

NP   NCI-cluster (Peking)         RCC-cluster (MSU)            MVS-1000/16 (KIAM)
     MPI      DVM      DVM/MPI    MPI      DVM      DVM/MPI    MPI      DVM      DVM/MPI
1    18.3     19.6     1.07       15.7     19.5     1.24       10.1     13.2     1.31
2    11.7     14.9     1.27       10.7     13.5     1.26       11.9     14.8     1.24
4    7.7      8.6      1.12       5.2      7.2      1.38       8.3      9.0      1.09
8    5.0      4.6      0.92       2.9      3.9      1.34       5.4      5.0      0.92
16   3.8      3.2      0.84       2.3      3.4      1.48       [a]

[a] The source gives one MVS-1000/16 time for NP = 16 (3.3) without indicating whether it is the MPI or the DVM time.

Table 7. LU test execution times in seconds (class A)

NP   NCI-cluster (Peking)         RCC-cluster (MSU)            MVS-1000/16 (KIAM)
     MPI      DVM      DVM/MPI    MPI      DVM      DVM/MPI    MPI      DVM      DVM/MPI
1    1796.6   1739.7   0.97       1581.5   1886.0   1.19       [a]
2    911.9    820.5    0.90       989.5    974.4    0.98       617.5    624.9    1.01
4    452.8    448.9    0.99       361.5    512.3    1.41       323.4    349.6    1.08
8    202.4    248.5    1.23       265.9    265.9    1.60       172.9    198.6    1.15
16   111.3    172.2    1.55       143.2    143.2    1.69       [a]

[a] The source gives one MVS-1000/16 time for this row (NP = 1: 1186.2; NP = 16: 141.4) without indicating whether it is the MPI or the DVM time.
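
Table 8. MG test execution times in seconds (class A)

NP   NCI-cluster (Peking)         RCC-cluster (MSU)            MVS-1000/16 (KIAM)
     MPI      DVM      DVM/MPI    MPI      DVM      DVM/MPI    MPI      DVM      DVM/MPI
1    [a]
2    [b]
4    20.7     22.2     1.07       22.2     18.8     0.85       18.2     16.1     0.88
8    9.3      13.5     1.45       9.7      10.5     1.08       9.5      9.1      0.96
16   5.9      9.7      1.64       7.0      6.7      0.96       [c]

[a] The source row for NP = 1 holds one MPI, DVM, DVM/MPI triple (77.7, 71.5, 0.92); its cluster is not indicated.
[b] The source row for NP = 2 holds two triples (47.9, 36.5, 0.76 and 33.0, 30.5, 0.92); their clusters are not indicated.
[c] The source gives one MVS-1000/16 time for NP = 16 (6.5) without indicating whether it is the MPI or the DVM time.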

Table 9. SP test execution times in seconds (class A)

NP   NCI-cluster (Peking)         RCC-cluster (MSU)            MVS-1000/16 (KIAM)
     MPI      DVM      DVM/MPI    MPI      DVM      DVM/MPI    MPI      DVM      DVM/MPI
1    1681.0   2040.0   1.21       1670.7   2132.2   1.28       1616.5   1534.4   0.95
2    -        -        -          -        -        -          -        -        -
4    435.4    562.4    1.29       435.2    616.6    1.42       472.4    450.3    0.95
8    271.9    309.5    1.14       207.7    311.7    1.50       274.9    258.2    0.94
16   150.2    222.7    1.48       201.6    201.6    1.38       [a]

[a] The source gives one MVS-1000/16 time for NP = 16 (196.8) without indicating whether it is the MPI or the DVM time.