Publications for Arnaud Giersch

(back to my home page…)

[Without abstracts] [With abstracts] [BibTeX entries]

Journals

[1] Raphaël Couturier, Arnaud Giersch, and Mourad Hakem. Best effort strategy and virtual load for asynchronous iterative load balancing. Journal of Computational Science, 26:118--127, May 2018. [ BibTeX | DOI | PDF ]
Most of the time, asynchronous load balancing algorithms are extensively studied from a theoretical point of view. The Bertsekas and Tsitsiklis' algorithm is undeniably the best known algorithm for which the asymptotic convergence proof is given. From a practical point of view, when a node needs to balance a part of its load to some of its neighbors, the algorithm's description is unfortunately too succinct, and no details are given on what is really sent and how the load balancing decisions are made. In this paper, we propose a new strategy called best effort which aims at balancing the load of a node to all its less loaded neighbors while ensuring that all involved nodes by the load balancing phase have the same amount of load. Moreover, since asynchronous iterative algorithms are less sensitive to communication delays and their variations, both load transfer and load information messages are dissociated. To speedup the convergence time of the load balancing process, we propose a clairvoyant virtual load heuristic. This heuristic allows a node receiving a load information message to integrate the future virtual load (if any) in its load's list, even if the load has not been received yet. This leads to have predictive snapshots of nodes' loads at each iteration of the load balancing process. Consequently, the notified node sends a real part of its load to some of its neighbors, taking into account the virtual load it will receive in the subsequent time-steps. Based on the SimGrid simulator, some series of test-bed scenarios are considered and several QoS metrics are evaluated to show the usefulness of the proposed algorithm.
[2] Ahmed Fanfakh, Jean-Claude Charr, Raphaël Couturier, and Arnaud Giersch. Energy consumption reduction for asynchronous message-passing applications. The Journal of Supercomputing, 73(6):2369--2401, June 2017. [ BibTeX | DOI | PDF ]
It is widely accepted that the asynchronous parallel methods are more suitable than the synchronous ones on a grid architecture. Indeed, they outperform the synchronous methods, because they overlap the communications of the synchronous methods with computations. However, they also usually execute more iterations than the synchronous ones and thus consume more energy. To reduce the energy consumption of the CPUs executing such methods, the Dynamic voltage and frequency scaling technique can be used. It lowers the frequency of a CPU to reduce its energy consumption, but it also decreases its computing power. Therefore, the frequency that gives the best trade-off between energy consumption and performance must be selected. This paper presents a new online frequency selecting algorithm for parallel iterative asynchronous methods running over grids. It selects a vector of frequencies that gives the best trade-off between energy consumption and performance. New energy and performance models were used in this algorithm to predict the execution time and the energy consumption of synchronous, asynchronous, or hybrid iterative applications running over grids. The proposed algorithm was evaluated on the SimGrid simulator. The experiments showed that synchronously applying the proposed algorithm to the asynchronous version of the application reduces on average its energy consumption by 22% and speeds it up by 5.72%. Finally, the proposed algorithm was also compared to a method that uses the well-known energy and delay product and the comparison results showed that it outperforms this method in terms of energy consumption and performance.
[3] Ahmed Fanfakh, Jean-Claude Charr, Raphaël Couturier, and Arnaud Giersch. Optimizing the energy consumption of message passing applications with iterations executed over grids. Journal of Computational Science, 17(3):562--575, November 2016. [ BibTeX | DOI | PDF ]
In recent years, green computing has become an important topic in the supercomputing research domain. However, the computing platforms are still consuming more and more energy due to the increasing number of nodes composing them. To minimize the operating costs of these platforms many techniques have been used. Dynamic voltage and frequency scaling (DVFS) is one of them. It can be used to reduce the power consumption of the CPU while computing, by lowering its frequency. However, lowering the frequency of a CPU may increase the execution time of an application running on that processor. Therefore, the frequency that gives the best trade-off between the energy consumption and the performance of an application must be selected. In this paper, a new online frequency selecting algorithm for grids, composed of heterogeneous clusters, is presented. It selects the frequencies and tries to give the best trade-off between energy saving and performance degradation, for each node computing the message passing application with iterations. The algorithm has a small overhead and works without training or profiling. It uses a new energy model for message passing applications with iterations running on a grid. The proposed algorithm is evaluated on a real grid, the Grid'5000 platform, while running the NAS parallel benchmarks. The experiments on 16 nodes, distributed on three clusters, show that it reduces on average the energy consumption by 30% while the performance is on average only degraded by 3.2%. Finally, the algorithm is compared to an existing method. The comparison results show that it outperforms the latter in terms of energy consumption reduction and performance.
[4] Henri Casanova, Arnaud Giersch, Arnaud Legrand, Martin Quinson, and Frédéric Suter. Versatile, scalable, and accurate simulation of distributed applications and platforms. Journal of Parallel and Distributed Computing, 74(10):2899--2917, October 2014. [ BibTeX | DOI | PDF ]
The study of parallel and distributed applications and platforms, whether in the cluster, grid, peer-to-peer, volunteer, or cloud computing domain, often mandates empirical evaluation of proposed algorithmic and system solutions via simulation. Unlike direct experimentation via an application deployment on a real-world testbed, simulation enables fully repeatable and configurable experiments for arbitrary hypothetical scenarios. Two key concerns are accuracy (so that simulation results are scientifically sound) and scalability (so that simulation experiments can be fast and memory-efficient). While the scalability of a simulator is easily measured, the accuracy of many state-of-the-art simulators is largely unknown because they have not been sufficiently validated. In this work we describe recent accuracy and scalability advances made in the context of the SimGrid simulation framework. A design goal of SimGrid is that it should be versatile, i.e., applicable across all aforementioned domains. We present quantitative results that show that SimGrid compares favorably to state-of-the-art domain-specific simulators in terms of scalability, accuracy, or the trade-off between the two. An important implication is that, contrary to popular wisdom, striving for versatility in a simulator is not an impediment but instead is conducive to improving both accuracy and scalability.
[5] Lilia Ziane Khodja, Raphaël Couturier, Arnaud Giersch, and Jacques M. Bahi. Parallel sparse linear solver with GMRES method using minimization techniques of communications for GPU clusters. The Journal of Supercomputing, 69(1):200--224, July 2014. [ BibTeX | DOI ]
In this paper, we aim at exploiting the power computing of a graphics processing unit (GPU) cluster for solving large sparse linear systems. We implement the parallel algorithm of the generalized minimal residual iterative method using the Compute Unified Device Architecture programming language and the MPI parallel environment. The experiments show that a GPU cluster is more efficient than a CPU cluster. In order to optimize the performances, we use a compressed storage format for the sparse vectors and the hypergraph partitioning. These solutions improve the spatial and temporal localization of the shared data between the computing nodes of the GPU cluster.
[6] Arnaud Giersch, Yves Robert, and Frédéric Vivien. Scheduling tasks sharing files on heterogeneous master-slave platforms. Journal of Systems Architecture, special issue on Parallel, Distributed and Network-based Processing: selected papers from the 12th Euromicro Conference, 52(2):88--104, February 2006. [ BibTeX | DOI | PostScript | PDF | Corresponding RR ]
This paper is devoted to scheduling a large collection of independent tasks onto heterogeneous clusters. The tasks depend upon (input) files which initially reside on a master processor. A given file may well be shared by several tasks. The role of the master is to distribute the files to the processors, so that they can execute the tasks. The objective for the master is to select which file to send to which slave, and in which order, so as to minimize the total execution time. The contribution of this paper is twofold. On the theoretical side, we establish complexity results that assess the difficulty of the problem. On the practical side, we design several new heuristics, which are shown to perform as efficiently as the best heuristics from Casanova et al. although their cost is an order of magnitude lower.
[7] Stéphane Genaud, Arnaud Giersch, and Frédéric Vivien. Load-balancing scatter operations for grid computing. Parallel Computing, 30(8):923--946, August 2004. [ BibTeX | DOI | PostScript | PDF | Corresponding RR ]
We present solutions to statically load-balance scatter operations in parallel codes run on grids. Our load-balancing strategy is based on the modification of the data distributions used in scatter operations. We study the replacement of scatter operations with parameterized scatters, allowing custom distributions of data. The paper presents: 1) a general algorithm which finds an optimal distribution of data across processors; 2) a quicker guaranteed heuristic relying on hypotheses on communications and computations; 3) a policy on the ordering of the processors. Experimental results with an MPI scientific code illustrate the benefits obtained from our load-balancing.

Conferences

[1] Jean-Claude Charr, Raphaël Couturier, Ahmed Fanfakh, and Arnaud Giersch. Energy consumption reduction with DVFS for message passing iterative applications on heterogeneous architectures. In PDSEC 2015: 16th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing, pages 922--931, Hyderabad, India, May 2015. IEEE Computer Society Press. Workshop held in conjunction with IPDPS. [ BibTeX | DOI | PDF ]
Computing platforms are consuming more and more energy due to the increasing number of nodes composing them. To minimize the operating costs of these platforms many techniques have been used. Dynamic voltage and frequency scaling (DVFS) is one of them. It reduces the frequency of a CPU to lower its energy consumption. However, lowering the frequency of a CPU may increase the execution time of an application running on that processor. Therefore, the frequency that gives the best trade-off between the energy consumption and the performance of an application must be selected. In this paper, a new online frequency selecting algorithm for heterogeneous platforms (heterogeneous CPUs) is presented. It selects the frequencies and tries to give the best trade-off between energy saving and performance degradation, for each node computing the message passing iterative application. The algorithm has a small overhead and works without training or profiling. It uses a new energy model for message passing iterative applications running on a heterogeneous platform. The proposed algorithm is evaluated on the SimGrid simulator while running the NAS parallel benchmarks. The experiments show that it reduces the energy consumption by up to 34% while limiting the performance degradation as much as possible. Finally, the algorithm is compared to an existing method, the comparison results show that it outperforms the latter, on average it saves 4% more energy while keeping the same performance.
[2] Charles Emile Ramamonjisoa, Lilia Ziane Khodja, David Laiymani, Arnaud Giersch, and Raphaël Couturier. Simulation of asynchronous iterative algorithms using SimGrid. In HPCC 2014: 16th IEEE International Conference on High Performance Computing and Communications, pages 890--895, Paris, France, August 2014. IEEE Computer Society Press. [ BibTeX | DOI | PDF ]
Synchronous iterative algorithms are often less scalable than asynchronous iterative ones. Performing large scale experiments with different kind of network parameters is not easy because with supercomputers such parameters are fixed. So, one solution consists in using simulations first in order to analyze what parameters could influence or not the behavior of an algorithm. In this paper, we show that it is interesting to use SimGrid to simulate the behavior of asynchronous iterative algorithms. For that, we compare the behavior of a synchronous GMRES algorithm with an asynchronous multisplitting one with simulations which let us easily choose some parameters. Both codes are real MPI codes and simulations allow us to see when the asynchronous multisplitting algorithm can be more efficient than the GMRES one to solve a 3D Poisson problem.
[3] Jean-Claude Charr, Raphaël Couturier, Ahmed Fanfakh, and Arnaud Giersch. Dynamic frequency scaling for energy consumption reduction in synchronous distributed applications. In ISPA 2014: 12th IEEE International Symposium on Parallel and Distributed Processing with Applications, pages 225--230, Milan, Italy, August 2014. IEEE Computer Society Press. [ BibTeX | DOI | PDF ]
Dynamic Voltage Frequency Scaling (DVFS) can be applied to modern CPUs. This technique is usually used to reduce the energy consumed by a CPU while computing. Thus, decreasing the frequency reduces the power consumed by the CPU. However, it can also significantly affect the performance of the executed program if it is compute bound and if a low CPU frequency is selected. Therefore, the chosen scaling factor must give the best possible trade-off between energy reduction and performance. In this paper we present an algorithm that predicts the energy consumed with each frequency gear and selects the one that gives the best ratio between energy consumption reduction and performance. This algorithm works online without training or profiling and has a very small overhead. It also takes into account synchronous communications between the nodes that are executing the distributed algorithm. The algorithm has been evaluated over the SimGrid simulator while being applied to the NAS parallel benchmark programs. The results of the experiments show that it outperforms other existing scaling factor selection algorithms.
[4] Henri Casanova, Arnaud Giersch, Arnaud Legrand, Martin Quinson, and Frédéric Suter. SimGrid: a sustained effort for the versatile simulation of large scale distributed systems. In WSSSPE: First Workshop on Sustainable Software for Science: Practice and Experiences, Denver, CO, USA, November 2013. Workshop held in conjunction with SC13. [ BibTeX | PDF | Link ]
In this paper we present Simgrid, a toolkit for the versatile simulation of large scale distributed systems, whose development effort has been sustained for the last fifteen years. Over this time period SimGrid has evolved from a one-laboratory project in the U.S. into a scientific instrument developed by an international collaboration. The keys to making this evolution possible have been securing of funding, improving the quality of the software, and increasing the user base. In this paper we describe how we have been able to make advances on all three fronts, on which we plan to intensify our efforts over the upcoming years.
[5] Jacques M. Bahi, Sylvain Contassot-Vivier, and Arnaud Giersch. Load balancing in dynamic networks by bounded delays asynchronous diffusion. In José Palma, Michel Daydé, Osni Marques, and João Lopes, editors, VECPAR 2010: 9th International Meeting on High Performance Computing for Computational Science: revised selected papers, volume 6449 of Lecture Notes in Computer Science, pages 352--365. Springer-Verlag, 2011. [ BibTeX | DOI ]
Load balancing is a well known problem, which has been extensively addressed in parallel algorithmic. However, there subsist some contexts in which the existing algorithms cannot be used. One of these contexts is the case of dynamic networks where the links between the different elements are intermittent. We propose in this paper an efficient algorithm, based on asynchronous diffusion, to perform load balancing in such a context. A convergence theorem is proposed and proved. Finally, experimental results performed in the SimGrid environment confirm the efficiency of our algorithm.
[6] Jacques M. Bahi, Sylvain Contassot-Vivier, and Arnaud Giersch. Load balancing in dynamic networks by bounded delays asynchronous diffusion. In VECPAR 2010: 9th International Meeting on High Performance Computing for Computational Science, Berkeley, CA, USA, June 2010. [ BibTeX | PDF | Link ]
Load balancing is a well known problem, which has been extensively addressed in parallel algorithmic. However, there subsist some contexts in which the existing algorithms cannot be used. One of these contexts is the case of dynamic networks where the links between the different elements are intermittent. We propose in this paper an efficient algorithm, based on asynchronous diffusion, to perform load balancing in such a context. A convergence theorem is proposed and proved. Finally, experimental results performed in the SimGrid environment confirm the efficiency of our algorithm.
[7] Jacques M. Bahi, Arnaud Giersch, and Abdallah Makhoul. A scalable fault tolerant diffusion scheme for data fusion in sensor networks. In Infoscale 2008, The Third International ICST Conference on Scalable Information Systems, page 10 (5 pages), Vico Equense, Italy, June 2008. ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering). [ BibTeX | PDF | Link ]
In sensor networks, sensor nodes are usually deployed randomly over an area to collect the information of interest. Data fusion is the phase of processing the collected information by sensor nodes before they are sent to the end user. This paper introduces a distributed consensus algorithm that allows the nodes of a sensor network to track the average of n sensor measurements. The study of the above mentioned algorithm showed that it is robust to asynchronism and dynamic topology changes. It is also put in evidence that the algorithm is fully distributed and does not require any global coordination. Moreover, the proposed method doesn't involve explicit point-to-point message or routing, it diffuses information across the network. Accordingly, simulation results are provided illustrating the effectiveness of the studied algorithm.
[8] Arnaud Giersch. Ordonnancement sur plates-formes hétérogènes de tâches partageant des données. In 16ème Rencontres Francophones du Parallélisme (RenPar 2005), pages 159--170, Le Croisic, France, April 2005. French text. [ BibTeX | PostScript | PDF ]
Cet article est consacré à l'ordonnancement d'un grand ensemble de tâches indépendantes sur des plates-formes hétérogènes distribuées. Les tâches dépendent de données (en entrée) qui sont initialement réparties sur les différents nœuds de la plate-forme. Une certaine donnée peut être partagée par plusieurs tâches. Pour chaque tâche, notre problème est de décider sur quel nœud de la plate-forme l'exécuter, et de transférer les données nécessaires (celles dont dépend la tâche) vers ce nœud. L'objectif est de trouver une allocation des tâches, et un ordonnancement des communications induites, qui minimisent le temps total d'exécution. Sur le plan théorique, nous exposons différents résultats qui caractérisent la difficulté du problème. Sur le plan pratique, nous proposons plusieurs nouvelles heuristiques, que nous comparons à des heuristiques classiques comme min-min ou sufferage grâce à des simulations.
[9] Arnaud Giersch, Yves Robert, and Frédéric Vivien. Scheduling tasks sharing files from distributed repositories. In Marco Danelutto, Marco Vanneschi, and Domenico Laforenza, editors, Euro-Par 2004: Parallel Processing: 10th International Euro-Par Conference, volume 3149 of Lecture Notes in Computer Science, pages 246--253, Pisa, Italy, August/September 2004. Springer-Verlag. [ BibTeX | DOI | PostScript | PDF | Corresponding RR ]
This paper is devoted to scheduling a large collection of independent tasks onto a distributed heterogeneous platform, which is composed of a set of servers. Each server is a processor cluster equipped with a file repository. The tasks to be scheduled depend upon (input) files which initially reside on the server repositories. A given file may well be shared by several tasks. For each task, the problem is to decide which server will execute it, and to transfer the required files to that server repository. The objective is to find a task allocation, and to schedule the induced communications, so as to minimize the total execution time. The contribution of this paper is twofold. On the theoretical side, we establish a complexity result that assesses the difficulty of the problem. On the practical side, we design several new heuristics, including an extension of the min-min heuristic to such a decentralized framework, and several lower cost heuristics, which we compare through extensive simulations.
[10] Arnaud Giersch, Yves Robert, and Frédéric Vivien. Scheduling tasks sharing files on heterogeneous master-slave platforms. In 12th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2004), pages 364--371, A Coruña, Spain, February 2004. IEEE Computer Society Press. [ BibTeX | DOI | PostScript | PDF | Corresponding RR ]
This paper is devoted to scheduling a large collection of independent tasks onto heterogeneous clusters. The tasks depend upon (input) files which initially reside on a master processor. A given file may well be shared by several tasks. The role of the master is to distribute the files to the processors, so that they can execute the tasks. The objective for the master is to select which file to send to which slave, and in which order, so as to minimize the total execution time. The contribution of this paper is twofold. On the theoretical side, we establish complexity results that assess the difficulty of the problem. On the practical side, we design several new heuristics, which are shown to perform as efficiently as the best heuristics from Casanova et al. although their cost is an order of magnitude lower.
[11] Arnaud Giersch, Yves Robert, and Frédéric Vivien. Scheduling tasks sharing files on heterogeneous clusters. In Jack Dongarra, Domenico Laforenza, and Salvatore Orlando, editors, 10th European PVM/MPI Users' Group Conference (EuroPVM/MPI 2003), volume 2840 of Lecture Notes in Computer Science, pages 657--660, Venice, Italy, September 2003. Springer-Verlag. [ BibTeX | DOI | PostScript | PDF | Corresponding RR ]
This paper is devoted to scheduling a large collection of independent tasks onto heterogeneous clusters. The tasks depend upon (input) files which initially reside on a master processor. A given file may well be shared by several tasks. The role of the master is to distribute the files to the processors, so that they can execute the tasks. The objective for the master is to select which file to send to which slave, and in which order, so as to minimize the total execution time. The contribution of this paper is twofold. On the theoretical side, we establish complexity results that assess the difficulty of the problem. On the practical side, we design several new heuristics, which are shown to perform as efficiently as the best heuristics in Casanova et al. although their cost is an order of magnitude lower.
[12] Stéphane Genaud, Arnaud Giersch, and Frédéric Vivien. Load-balancing scatter operations for grid computing. In 12th Heterogeneous Computing Workshop (HCW 2003), page 101a (10 pages), Nice, France, April 2003. IEEE Computer Society Press. Workshop held in conjunction with IPDPS. [ BibTeX | DOI | PostScript | PDF | Corresponding RR ]
We present solutions to statically load-balance scatter operations in parallel codes run on Grids. Our load-balancing strategy is based on the modification of the data distributions used in scatter operations. We need to modify the user source code, but we want to keep the code as close as possible to the original. Hence, we study the replacement of scatter operations with a parameterized scatter, allowing a custom distribution of data. The paper presents: 1) a general algorithm which finds an optimal distribution of data across processors; 2) a quicker guaranteed heuristic relying on hypotheses on communications and computations; 3) a policy on the ordering of the processors. Experimental results with an MPI scientific code of seismic tomography illustrate the benefits obtained from our load-balancing.
[13] Romaric David, Stéphane Genaud, Arnaud Giersch, Benjamin Schwarz, and Éric Violard. Source code transformations strategies to load-balance grid applications. In Manish Parashar, editor, 3rd International Workshop on Grid Computing (GRID 2002), volume 2536 of Lecture Notes in Computer Science, pages 82--87, Baltimore, MD, USA, November 2002. Springer-Verlag. [ BibTeX | DOI | PostScript | PDF | Corresponding RR ]
We present load-balancing strategies to improve the performances of parallel MPI applications running in a Grid environment. We analyze the data distribution constraints found in two scientific codes and propose adapted code transformations to load-balance computations. Experimental results confirm that such source code transformations can improve Grid application performances.

Research Reports

[1] Arnaud Giersch, Yves Robert, and Frédéric Vivien. Scheduling tasks sharing files from distributed repositories (revised version). Research Report no. 5124, INRIA, France, February 2004. This text is also available as Research Report no. 2004-04 of the LIP (http://www.ens-lyon.fr/LIP/). [ BibTeX | PostScript | PDF | PostScript (LIP version) | PDF (LIP version) | Link ]
This paper is devoted to scheduling a large collection of independent tasks onto a large distributed heterogeneous platform, which is composed of a set of servers. Each server is a processor cluster equipped with a file repository. The tasks to be scheduled depend upon (input) files which initially reside on the server repositories. A given file may well be shared by several tasks. For each task, the problem is to decide which server will execute it, and to transfer the required files (those which the task depends upon) to that server repository. The objective is to find a task allocation, and to schedule the induced communications, so as to minimize the total execution time. The contribution of this paper is twofold. On the theoretical side, we establish complexity results that assess the difficulty of the problem. On the practical side, we design several new heuristics, including an extension of the min-min heuristic to the decentralized framework, and several lower cost heuristics, which we compare through extensive simulations.
This report is a revised version of the LIP research report no. 2003-49 / INRIA research report no. 4976, which it replaces.
[2] Arnaud Giersch, Yves Robert, and Frédéric Vivien. Scheduling tasks sharing files from distributed repositories. Research Report no. 4976, INRIA, France, October 2003. This text is also available as Research Report no. 2003-49 of the LIP (http://www.ens-lyon.fr/LIP/). [ BibTeX | PostScript | PDF | PostScript (LIP version) | PDF (LIP version) | Link ]
This paper is devoted to scheduling a large collection of independent tasks onto a large distributed heterogeneous platform, which is composed of a set of servers. Each server is a processor cluster equipped with a file repository. The tasks to be scheduled depend upon (input) files which initially reside on the server repositories. A given file may well be shared by several tasks. For each task, the problem is to decide on which server to execute it, and to transfer the required files (those which the task depends upon) to that server repository. The objective is to find a task allocation, and to schedule the induced communications, so as to minimize the total execution time. The contribution of this paper is twofold. On the theoretical side, we establish complexity results that assess the difficulty of the problem. On the practical side, we design several new heuristics, including an extension of the min-min heuristic to the decentralized framework, and several lower cost heuristics, which we compare through extensive simulations.
[3] Arnaud Giersch, Yves Robert, and Frédéric Vivien. Scheduling tasks sharing files on heterogeneous clusters. Research Report no. 4819, INRIA, France, May 2003. This text is also available as Research Report no. 2003-28 of the LIP (http://www.ens-lyon.fr/LIP/). [ BibTeX | PostScript | PDF | PostScript (LIP version) | PDF (LIP version) | Link ]
This paper is devoted to scheduling a large collection of independent tasks onto heterogeneous clusters. The tasks depend upon (input) files which initially reside on a master processor. A given file may well be shared by several tasks. The role of the master is to distribute the files to the processors, so that they can execute the tasks. The objective for the master is to select which file to send to which slave, and in which order, so as to minimize the total execution time. The contribution of this paper is twofold. On the theoretical side, we establish complexity results that assess the difficulty of the problem. On the practical side, we design several new heuristics, which are shown to perform as efficiently as the best heuristics of Casanova et al. although their cost is an order of magnitude lower.
[4] Stéphane Genaud, Arnaud Giersch, and Frédéric Vivien. Load-balancing scatter operations for grid computing. Research Report no. 4770, INRIA, France, March 2003. This text is also available as Research Report no. 2003-17 of the LIP (http://www.ens-lyon.fr/LIP/). [ BibTeX | PostScript | PDF | PostScript (LIP version) | PDF (LIP version) | Link ]
We present solutions to statically load-balance scatter operations in parallel codes run on grids. Our load-balancing strategy is based on the modification of the data distributions used in scatter operations. We study the replacement of scatter operations with parameterized scatters, allowing custom distributions of data. The paper presents: 1) a general algorithm which finds an optimal distribution of data across processors; 2) a quicker guaranteed heuristic relying on hypotheses on communications and computations; 3) a policy on the ordering of the processors. Experimental results with an MPI scientific code illustrate the benefits obtained from our load-balancing.
[5] Romaric David, Stéphane Genaud, Arnaud Giersch, Benjamin Schwarz, and Éric Violard. Source code transformations strategies to load-balance grid applications. Research Report 02-09, ICPS-LSIIT, Université Louis Pasteur, Strasbourg, France, August 2002. [ BibTeX | PostScript | PDF ]
We present load-balancing strategies to improve performances of parallel MPI applications running in a Grid environment. We analyze the data distribution constraints found in two scientific codes and propose adapted code transformations to load-balance computations. We present the general framework for our load-balancing techniques. We then describe these techniques as well as their application to our examples. Experimental results confirm that adapted source code transformations can improve Grid application performances.

PhD Thesis

[1] Arnaud Giersch. Ordonnancement sur plates-formes hétérogènes de tâches partageant des données. Thèse de doctorat, Université Louis Pasteur, Strasbourg, France, December 2004. French text. [ BibTeX | PostScript | PDF | Link ]

Arnaud Giersch
This page has been generated using bibtex2html 1.98 and Hlins 0.39.
Last modification: Wed Apr 25 10:42:33 CEST 2018

Valid HTML 4.01!