Version du 12/11/2007

AIAC vs. SIAC

Our goal is to solve scientific problems on an architecture composed of heterogeneous machines, widely distributed and linked by heterogeneous bandwidth. It is well known that direct methods are, most of the time, totally inefficient in such a context, since the total execution time is globally conditioned by the slowest machine and link. Furthermore, direct methods often use a lot of global communications (broadcast, gather-scatter, all-to-all, ...), which are a bottleneck on an heterogeneous network.


In fact, classical iterative methods suffer from the same problems, sometimes in worst when the convergence is particularly slow. Fortunately, there exists a class of iterative methods which is perfectly adapted to our execution context because it greatly reduces the need of synchronizations.


The two following sections illustrate these two classes of iterative algorithms: SIAC and AIAC. They show the first iterations of an abstract application as a Gantt chart, with 3 computing tasks A, B and C. In each case, we suppose that communications are asynchronous, that is non-blocking. It means that emission and reception of data can be done by the system while the application computes. Nevertheless, the code is based on the message passing paradigm which implies explicit routine calls to send and receive data from/to an application buffer.

SIAC

SIAC stands for Synchronous Iterations, Asynchronous Communications. It is the classical way to implement parallel iterative algorithms.



Plain boxes are computing time and white boxes idle times. Plain arrows represent the communications done at the end of each iteration to update the data of the neighbors. Dotted arrows represent the gather-scatter needed to check if the global convergence has been reached at the end of the current iteration. Obviously this test must be done after the data update, and at the end of each iteration. It leads to a lot of global communications and idle times.


back to top.
AIAC

AIAC stands for Asynchronous Iterations, Asynchronous Communications.



Plain boxes are computing time. Plain or hashed arrows represent the communications done at the end of each iteration to update the data of the neighbors. Dotted arrows represent communications concerning the local convergence state of a task. As we can see, the execution scheme is totally different from a SIAC since there are no idle times and global communications. Here is the sketch in three points:


Notes:


back to top.
Conclusion

AIAC algorithms is a new way to address the efficiency problem on a Grid. Linking big clusters with very high speed networks is indeed a way but, from a pure computation point of view, it is just the same than using old parallel machines. The real challenge is to use hundreds of simple machines (even user machines) and to implement more efficient codes than those existing.
To summarize, it is time to leave the old manners for this credo: "adapt the algorithm to the Grid and not the opposite".


back to top.