RSD 09/2015

Russian Supercomputing Days 2015 was held on September 28-29, 2015 at the Holiday Inn Moscow - Sokolniki hotel in Moscow, Russia.

N.A. Kataev presented the following paper there:

Automated parallelization of sequential C-programs on the example of two applications from the field of laser material processing

Understanding the information structure of a program is important for parallelizing it. It helps to determine which transformations may be necessary and which parts of the source code can be executed in parallel. Systems for automated analysis and transformation of programs can help to explore program structure and to make parallelization more efficient within a reasonable period of time. The paper proposes an approach to developing such systems. The process of program transformation is split into a set of basic operations, which are performed automatically in an order determined by the user. The proposed approach has been successfully applied to parallelize two applications from the field of laser material processing.
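The idea of splitting a transformation into basic operations that run in a user-chosen order can be illustrated with a minimal sketch. It is not the system described in the paper: the Loop record, the two operations (privatize_scalars, mark_parallel) and the run_pipeline helper are hypothetical names invented for this illustration, and a real tool would work on an actual program representation rather than on sets of variable names.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Loop:
    """Toy stand-in for a loop in an analyzed program (hypothetical)."""
    name: str
    # Variables that currently cause loop-carried dependences.
    carried_deps: Set[str] = field(default_factory=set)
    # Scalars that could safely be made private to each iteration.
    private_candidates: Set[str] = field(default_factory=set)
    parallel: bool = False


# A basic operation takes the loops, updates them in place, returns a log.
Operation = Callable[[List[Loop]], List[str]]


def privatize_scalars(loops: List[Loop]) -> List[str]:
    """Remove dependences caused by scalars that can be privatized."""
    log = []
    for loop in loops:
        removed = loop.carried_deps & loop.private_candidates
        if removed:
            loop.carried_deps -= removed
            log.append(f"{loop.name}: privatized {sorted(removed)}")
    return log


def mark_parallel(loops: List[Loop]) -> List[str]:
    """Mark loops that have no remaining loop-carried dependences."""
    log = []
    for loop in loops:
        if not loop.carried_deps and not loop.parallel:
            loop.parallel = True
            log.append(f"{loop.name}: marked parallel")
    return log


def run_pipeline(loops: List[Loop], operations: List[Operation]) -> List[str]:
    """Apply the basic operations in exactly the order the user gave."""
    log = []
    for operation in operations:
        log.extend(operation(loops))
    return log


if __name__ == "__main__":
    loops = [Loop("loop1", carried_deps={"tmp"}, private_candidates={"tmp"})]
    for line in run_pipeline(loops, [privatize_scalars, mark_parallel]):
        print(line)
```

In this toy model the order matters: running mark_parallel before privatize_scalars leaves loop1 sequential, because the dependence on tmp is still present when the check is made. That is the kind of decision the sketch leaves to the user.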

The paper was written by M.S. Baranov, D.I. Ivanov, N.A. Kataev, and A.A. Smirnov.

V.A. Bakhtin presented the following paper there:

Methods of dynamic tuning of DVMH programs on clusters with accelerators

The DVM system is intended for developing parallel programs for scientific and engineering computations in the C-DVMH and Fortran-DVMH languages. These languages share a single parallel programming model (the DVMH model) and extend standard C and Fortran with parallelism specifications written as compiler directives. The DVMH model makes it possible to create efficient parallel programs for heterogeneous computing clusters with accelerators. With the DVMH model, the programmer does not write explicit operations to copy data located in the memory of the central processor (CPU) or of the accelerators. Instead, for program fragments (regions) that can be executed on accelerators, the programmer specifies the input and output data, as well as the data that are updated or used outside regions. This makes it possible to select dynamically the devices on which a region will be executed, to distribute work among devices according to their performance, and to execute regions repeatedly in order to select the optimal configuration. The paper shows how these methods affect the execution efficiency of several tests from the NAS NPB benchmarks and of real applications.
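Two of the ideas above, distributing work among devices in proportion to their performance and choosing a configuration by trying several, can be sketched in a few lines. This is only an illustration under stated assumptions, not the DVMH runtime: split_iterations, pick_fastest and the simple cost model in the usage part are hypothetical, and the performance weights are made-up numbers.

```python
from typing import Callable, Dict, List, Tuple


def split_iterations(total: int, performance: List[float]) -> List[int]:
    """Split `total` loop iterations among devices in proportion to
    their relative performance (largest-remainder rounding)."""
    if total < 0 or not performance or any(p <= 0 for p in performance):
        raise ValueError("need total >= 0 and positive performance weights")
    weight_sum = sum(performance)
    shares = [total * p / weight_sum for p in performance]
    counts = [int(share) for share in shares]  # round each share down
    leftover = total - sum(counts)
    # Hand the remaining iterations to the largest fractional parts.
    by_fraction = sorted(range(len(shares)),
                         key=lambda i: shares[i] - counts[i],
                         reverse=True)
    for i in by_fraction[:leftover]:
        counts[i] += 1
    return counts


def pick_fastest(configs: Dict[str, List[float]],
                 measure: Callable[[List[float]], float]
                 ) -> Tuple[str, Dict[str, float]]:
    """Run `measure` once per configuration and return the name of the
    fastest one together with all measured times."""
    timings = {name: measure(cfg) for name, cfg in configs.items()}
    best = min(timings, key=timings.get)
    return best, timings


if __name__ == "__main__":
    iterations = 1000

    # Hypothetical cost model standing in for a real timed run:
    # the region finishes when the slowest device finishes its share.
    def estimated_time(performance: List[float]) -> float:
        counts = split_iterations(iterations, performance)
        return max(n / p for n, p in zip(counts, performance))

    configs = {"cpu_only": [1.0], "cpu_plus_gpu": [1.0, 3.0]}
    print(split_iterations(iterations, configs["cpu_plus_gpu"]))
    print(pick_fastest(configs, estimated_time))
```

In a real setting the measure callable would time an actual execution of the region, and the weights would come from measurements rather than constants.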

The paper was written by V.A. Bakhtin, A.S. Kolganov, V.A. Krukov, N.V. Podderugina, and M.N. Pritula.