Latest news:

Stéfano got his Master's degree!

ERAD 2010 in Passo Fundo.

Mission in Berlin.

João Lima has been accepted into a PhD program.




Dynamicity in Parallel MPI Programs.

(You can also have a look at the slides of my talk at the Technische Universität Berlin, on Oct. 21st 2008, for a more visual overview of this work.)

Support for adaptability and dynamicity turns out to be critical in parallel programs.

However, when it comes to parallel APIs, it is hard to find a candidate that really provides dynamicity while also being widely used in the HPC community. Projects such as Cilk (MIT) or Kaapi (INRIA) are good options, but they are mainly used by the research community.

MPI is the de-facto standard for parallel programming on distributed architectures, at least in HPC. It enables highly efficient codes on static, homogeneous architectures. MPI-2 adds dynamic process management (creation, insertion into a communicator, communication with the newly created processes, etc.), Remote Memory Access, and parallel I/O, and is now widely available.
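To make the dynamic-process-management idea concrete, here is a small sketch in plain Python (an assumption of ours for illustration, not actual MPI code): a parent process dynamically creates a worker and exchanges data with it over a pipe, which is the pattern that MPI-2's MPI_Comm_spawn and its resulting inter-communicator provide inside an MPI program.

```python
import subprocess
import sys

# Conceptual analogue of MPI-2's MPI_Comm_spawn: the parent process
# dynamically creates a worker and exchanges data with it.
# (Real MPI code would call MPI_Comm_spawn and communicate over the
# resulting inter-communicator instead of stdin/stdout.)

CHILD_CODE = r"""
import sys
n = int(sys.stdin.readline())   # receive work from the parent
print(n * n)                    # send a result back
"""

def spawn_and_compute(n: int) -> int:
    """Spawn one dynamic worker, send it `n`, return its answer."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=f"{n}\n", capture_output=True, text=True, check=True,
    )
    return int(proc.stdout)

if __name__ == "__main__":
    print(spawn_and_compute(7))  # → 49
```

The key point the sketch shares with MPI-2 is that the worker does not exist when the program starts: it is created on demand, and the parent keeps a dedicated channel to it.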

Supporting and improving dynamicity in MPI programs is the leading track of our research.

One classical way of adapting an MPI program's behavior to the architecture is to parameterize the distribution of the data, typically in blocks whose size can be fine-tuned at compile time. This is, for instance, what Claudio Schepke has done in his master's thesis; you can read his paper published at SBAC-PAD'07 to see how effective this approach is at speeding up a fluid-dynamics application. However, nothing is done at runtime. Claudio is currently working on a dynamic version of his Lattice-Boltzmann program.
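For readers unfamiliar with block distribution, a minimal sketch of the underlying arithmetic (illustrative only, not Claudio's actual code):

```python
def block_ranges(n, p):
    """Contiguous block distribution of n items over p processes:
    every process gets n // p items, and the first n % p processes
    get one extra, so the load is as balanced as possible."""
    base, extra = divmod(n, p)
    ranges = []
    start = 0
    for rank in range(p):
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges

# block_ranges(10, 3) → [(0, 4), (4, 7), (7, 10)]
```

In the static approach described above, the block size (or the number of blocks per process) is the compile-time knob; the point of the dynamic version is to adjust it while the program runs.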

If you want a dynamic program with MPI, then you probably have to adapt your programming style. Of course, if the program is designed to run exactly one process per CPU from start to end, there is no room for dynamic adaptation. The solution is to program with more tasks than physical processors (parallel slackness), and possibly with dynamic creation of tasks. To this end, the best-known technique is recursive creation of tasks: the recursion allows fine and efficient control over how much parallelism is exposed. This is the approach of Cilk, for instance. Divide & Conquer is usually used for such programs, and it fits most highly parallel algorithms well. Note that a loop-based iterative program can always be transformed into a D&C program if the dependencies are not too strong (and if they are, you probably will not want to parallelize your algorithm anyway…)
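As a concrete illustration of this loop-to-D&C transformation (a sketch of ours, in sequential Python; the cutoff value is an arbitrary assumption):

```python
def dc_sum(data, lo, hi, cutoff=4):
    """Divide-and-conquer rewrite of an iterative summation loop.
    Below `cutoff` items a task computes sequentially; above it, the
    two halves are independent subtasks that a parallel runtime could
    spawn as separate workers (here they are plain recursive calls).
    The cutoff is what gives fine control over the amount of
    parallelism: a smaller cutoff means more, finer-grained tasks."""
    if hi - lo <= cutoff:
        return sum(data[lo:hi])            # leaf task: sequential work
    mid = (lo + hi) // 2
    left = dc_sum(data, lo, mid, cutoff)   # independent subtask
    right = dc_sum(data, mid, hi, cutoff)  # independent subtask
    return left + right                    # conquer step

# dc_sum(list(range(100)), 0, 100) → 4950
```

In the MPI-2 setting discussed below, each recursive call above the cutoff would become a dynamically spawned MPI process rather than a function call.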

So how do you program a D&C algorithm with MPI, with dynamic creation of tasks? This has been the subject of Guilherme Pezzi's master's thesis, and has resulted in a paper published at the SBAC'07 conference. If you speak Portuguese, you can also have a look at his paper published at the Brazilian national conference WSCAD'06.

Using D&C with a dynamic MPI program, where a task is a regular MPI process, raises two main issues that we have tackled:

  1. Routing the communication between parent processes and the child processes that are dynamically spawned with MPI-2. The inter-communicators defined in MPI only enable parent-to-children communication. In order to enable arbitrary point-to-point communication between computing tasks, we use an overlay of "routing" processes.
  2. Scheduling the dynamic processes on the CPUs. This is part of Márcia Cera's PhD work. We have explored two kinds of algorithms: a centralized scheduling daemon integrated into a LAM-MPI distribution, with results presented at the Euro-PVM/MPI'06 conference, and a distributed scheduler based on Work Stealing, whose results are in the SBAC'07 paper.
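To give an intuition of the distributed, Work-Stealing style of scheduler mentioned in point 2, here is a toy sequential simulation (our own sketch; the placement and victim-selection details are illustrative assumptions, not the published algorithm):

```python
import collections
import random

def work_stealing_run(tasks, n_workers, seed=0):
    """Toy simulation of work stealing: each worker owns a deque of
    tasks and works on its own newest task; an idle worker steals the
    *oldest* task from a randomly chosen victim's deque, the classic
    work-stealing discipline."""
    rng = random.Random(seed)
    deques = [collections.deque() for _ in range(n_workers)]
    for i, t in enumerate(tasks):              # initial round-robin placement
        deques[i % n_workers].append(t)
    done = []
    while any(deques):
        for w in range(n_workers):
            if deques[w]:
                done.append(deques[w].pop())   # execute own newest task
            else:
                victim = rng.randrange(n_workers)
                if deques[victim]:
                    # steal the victim's oldest task
                    deques[w].append(deques[victim].popleft())
    return done
```

Stealing from the opposite end of the victim's deque is the design choice that matters: the owner keeps the recently created (cache-warm, fine-grained) tasks, while thieves take the older, typically larger ones.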

These results assume that a task is an MPI process. The next natural question is the dynamic control of the granularity of these spawned tasks, implementing them either as heavyweight processes or as lightweight threads. This is the main idea of João Lima's master's thesis. He has published his work (in Portuguese) at WSCAD'08 (to be held in November 2008). As far as we know, this is the first work that mixes threads and processes in such a dynamic way with MPI.
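The granularity decision itself can be sketched as follows (a simplified illustration of ours, in Python; the threshold and the inline-vs-thread rule are assumptions, not taken from João's design):

```python
import threading

def execute(tasks, grain=8):
    """Granularity-control sketch: tasks smaller than `grain` run
    inline, standing in for lightweight threads; larger tasks get a
    dedicated worker thread, standing in for a heavyweight spawned
    MPI process. `tasks` is a list of (name, size) pairs."""
    results = {}
    threads = []

    def work(name, size):
        results[name] = size * 2          # placeholder computation

    for name, size in tasks:
        if size < grain:
            work(name, size)              # fine-grained: run in place
        else:
            t = threading.Thread(target=work, args=(name, size))
            t.start()                     # coarse-grained: own worker
            threads.append(t)
    for t in threads:
        t.join()
    return results
```

The point is that the runtime, not the programmer, picks the cheaper execution vehicle per task, based on how much work the task carries.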

This control of the granularity raises new questions, the most urgent being the on-line scheduling of these two kinds of tasks; this is the subject of Stéfano Mór's master's thesis. He plans to use Cilk's two-level scheduling, but in a distributed environment.

Another very natural question is the dynamic exclusion of processes. This must involve the batch scheduler, which is in charge of allocating CPUs to the program and of making sure that something sensible is done when CPUs are withdrawn. Márcia Cera is currently in Grenoble, France, integrating her dynamic MPI schedulers with the batch scheduler OAR.

Finally, two related approaches that are treated in the group are:

  1. Using a dynamic MPI on dynamic environments such as Computational Grids. This is the subject of Elton Mathias's master's thesis, which he is doing partly at the French institute INRIA, in Nice, with Françoise Baude. This is also what Fernando Afonso is looking at, but on top of the .NET framework. The basic .NET services should help in virtualizing a homogeneous environment, whatever the underlying architecture and OS are.
  2. Using process migration with MPI in order to dynamically adjust the load. Currently, this can only be done with non-spawned processes. Marcelo Veiga's master's thesis is dedicated to this subject, and Rodrigo Righi's PhD is akin to it. (You can see Rodrigo's publication at SBAC'08.)

If you are interested in these subjects, do not hesitate to send us an email (the address can be found on the main page). You can also have a look at this page about Masters and PhDs at UFRGS under my supervision.

Fri Apr 02 16:48:12 -0300 2010