[1]
|
Raphaël Couturier, Arnaud Giersch, and Mourad Hakem.
Best effort strategy and virtual load for asynchronous iterative load
balancing.
Journal of Computational Science, 26:118--127, May 2018.
[ BibTeX |
DOI |
PDF ]
Most of the time, asynchronous load balancing algorithms are
extensively studied from a theoretical point of view. The
Bertsekas and Tsitsiklis' algorithm is undeniably the best
known algorithm for which the asymptotic convergence proof is
given. From a practical point of view, when a node needs to
balance a part of its load to some of its neighbors, the
algorithm's description is unfortunately too succinct, and no
details are given on what is really sent and how the load
balancing decisions are made. In this paper, we propose a new
strategy called best effort which aims at balancing the load
of a node to all its less loaded neighbors while ensuring that
all involved nodes by the load balancing phase have the same
amount of load. Moreover, since asynchronous iterative
algorithms are less sensitive to communication delays and
their variations, both load transfer and load information
messages are dissociated. To speedup the convergence time of
the load balancing process, we propose a clairvoyant virtual
load heuristic. This heuristic allows a node receiving a load
information message to integrate the future virtual load (if
any) in its load's list, even if the load has not been
received yet. This leads to have predictive snapshots of
nodes' loads at each iteration of the load balancing
process. Consequently, the notified node sends a real part of
its load to some of its neighbors, taking into account the
virtual load it will receive in the subsequent
time-steps. Based on the SimGrid simulator, some series of
test-bed scenarios are considered and several QoS metrics are
evaluated to show the usefulness of the proposed algorithm.
|
[2]
|
Ahmed Fanfakh, Jean-Claude Charr, Raphaël Couturier, and Arnaud Giersch.
Energy consumption reduction for asynchronous message-passing
applications.
The Journal of Supercomputing, 73(6):2369--2401, June 2017.
[ BibTeX |
DOI |
PDF ]
It is widely accepted that the asynchronous parallel methods
are more suitable than the synchronous ones on a grid
architecture. Indeed, they outperform the synchronous methods,
because they overlap the communications of the synchronous
methods with computations. However, they also usually execute
more iterations than the synchronous ones and thus consume
more energy. To reduce the energy consumption of the CPUs
executing such methods, the Dynamic voltage and frequency
scaling technique can be used. It lowers the frequency of a
CPU to reduce its energy consumption, but it also decreases
its computing power. Therefore, the frequency that gives the
best trade-off between energy consumption and performance must
be selected. This paper presents a new online frequency
selecting algorithm for parallel iterative asynchronous
methods running over grids. It selects a vector of frequencies
that gives the best trade-off between energy consumption and
performance. New energy and performance models were used in
this algorithm to predict the execution time and the energy
consumption of synchronous, asynchronous, or hybrid iterative
applications running over grids. The proposed algorithm was
evaluated on the SimGrid simulator. The experiments showed
that synchronously applying the proposed algorithm to the
asynchronous version of the application reduces on average its
energy consumption by 22% and speeds it up by
5.72%. Finally, the proposed algorithm was also compared to
a method that uses the well-known energy and delay product and
the comparison results showed that it outperforms this method
in terms of energy consumption and performance.
|
[3]
|
Ahmed Fanfakh, Jean-Claude Charr, Raphaël Couturier, and Arnaud Giersch.
Optimizing the energy consumption of message passing applications
with iterations executed over grids.
Journal of Computational Science, 17(3):562--575, November
2016.
[ BibTeX |
DOI |
PDF ]
In recent years, green computing has become an important topic
in the supercomputing research domain. However, the computing
platforms are still consuming more and more energy due to the
increasing number of nodes composing them. To minimize the
operating costs of these platforms many techniques have been
used. Dynamic voltage and frequency scaling (DVFS) is one of
them. It can be used to reduce the power consumption of the
CPU while computing, by lowering its frequency. However,
lowering the frequency of a CPU may increase the execution
time of an application running on that processor. Therefore,
the frequency that gives the best trade-off between the energy
consumption and the performance of an application must be
selected. In this paper, a new online frequency selecting
algorithm for grids, composed of heterogeneous clusters, is
presented. It selects the frequencies and tries to give the
best trade-off between energy saving and performance
degradation, for each node computing the message passing
application with iterations. The algorithm has a small
overhead and works without training or profiling. It uses a
new energy model for message passing applications with
iterations running on a grid. The proposed algorithm is
evaluated on a real grid, the Grid'5000 platform, while
running the NAS parallel benchmarks. The experiments on 16
nodes, distributed on three clusters, show that it reduces on
average the energy consumption by 30% while the performance
is on average only degraded by 3.2%. Finally, the algorithm
is compared to an existing method. The comparison results show
that it outperforms the latter in terms of energy consumption
reduction and performance.
|
[4]
|
Henri Casanova, Arnaud Giersch, Arnaud Legrand, Martin Quinson, and
Frédéric Suter.
Versatile, scalable, and accurate simulation of distributed
applications and platforms.
Journal of Parallel and Distributed Computing,
74(10):2899--2917, October 2014.
[ BibTeX |
DOI |
PDF ]
The study of parallel and distributed applications
and platforms, whether in the cluster, grid,
peer-to-peer, volunteer, or cloud computing domain,
often mandates empirical evaluation of proposed
algorithmic and system solutions via
simulation. Unlike direct experimentation via an
application deployment on a real-world testbed,
simulation enables fully repeatable and configurable
experiments for arbitrary hypothetical
scenarios. Two key concerns are accuracy (so that
simulation results are scientifically sound) and
scalability (so that simulation experiments can be
fast and memory-efficient). While the scalability of
a simulator is easily measured, the accuracy of many
state-of-the-art simulators is largely unknown
because they have not been sufficiently
validated. In this work we describe recent accuracy
and scalability advances made in the context of the
SimGrid simulation framework. A design goal of
SimGrid is that it should be versatile, i.e.,
applicable across all aforementioned domains. We
present quantitative results that show that SimGrid
compares favorably to state-of-the-art
domain-specific simulators in terms of scalability,
accuracy, or the trade-off between the two. An
important implication is that, contrary to popular
wisdom, striving for versatility in a simulator is
not an impediment but instead is conducive to
improving both accuracy and scalability.
|
[5]
|
Lilia Ziane Khodja, Raphaël Couturier, Arnaud Giersch, and Jacques M. Bahi.
Parallel sparse linear solver with GMRES method using minimization
techniques of communications for GPU clusters.
The Journal of Supercomputing, 69(1):200--224, July 2014.
[ BibTeX |
DOI ]
In this paper, we aim at exploiting the power
computing of a graphics processing unit (GPU)
cluster for solving large sparse linear systems. We
implement the parallel algorithm of the generalized
minimal residual iterative method using the Compute
Unified Device Architecture programming language and
the MPI parallel environment. The experiments show
that a GPU cluster is more efficient than a CPU
cluster. In order to optimize the performances, we
use a compressed storage format for the sparse
vectors and the hypergraph partitioning. These
solutions improve the spatial and temporal
localization of the shared data between the
computing nodes of the GPU cluster.
|
[6]
|
Arnaud Giersch, Yves Robert, and Frédéric Vivien.
Scheduling tasks sharing files on heterogeneous master-slave
platforms.
Journal of Systems Architecture, special issue on Parallel,
Distributed and Network-based Processing: selected papers from the 12th
Euromicro Conference, 52(2):88--104, February 2006.
[ BibTeX |
DOI |
PostScript |
PDF |
Corresponding RR ]
This paper is devoted to scheduling a large
collection of independent tasks onto heterogeneous
clusters. The tasks depend upon (input) files which
initially reside on a master processor. A given file
may well be shared by several tasks. The role of the
master is to distribute the files to the processors,
so that they can execute the tasks. The objective
for the master is to select which file to send to
which slave, and in which order, so as to minimize
the total execution time. The contribution of this
paper is twofold. On the theoretical side, we
establish complexity results that assess the
difficulty of the problem. On the practical side, we
design several new heuristics, which are shown to
perform as efficiently as the best heuristics from
Casanova et al. although their cost is an order of
magnitude lower.
|
[7]
|
Stéphane Genaud, Arnaud Giersch, and Frédéric Vivien.
Load-balancing scatter operations for grid computing.
Parallel Computing, 30(8):923--946, August 2004.
[ BibTeX |
DOI |
PostScript |
PDF |
Corresponding RR ]
We present solutions to statically load-balance
scatter operations in parallel codes run on
grids. Our load-balancing strategy is based on the
modification of the data distributions used in
scatter operations. We study the replacement of
scatter operations with parameterized scatters,
allowing custom distributions of data. The paper
presents: 1) a general algorithm which finds an
optimal distribution of data across processors; 2) a
quicker guaranteed heuristic relying on hypotheses
on communications and computations; 3) a policy on
the ordering of the processors. Experimental results
with an MPI scientific code illustrate the benefits
obtained from our load-balancing.
|
[1]
|
Jean-Claude Charr, Raphaël Couturier, Ahmed Fanfakh, and Arnaud Giersch.
Energy consumption reduction with DVFS for message passing
iterative applications on heterogeneous architectures.
In PDSEC 2015: 16th IEEE International Workshop on Parallel
and Distributed Scientific and Engineering Computing, pages 922--931,
Hyderabad, India, May 2015. IEEE Computer Society Press.
Workshop held in conjunction with IPDPS.
[ BibTeX |
DOI |
PDF ]
Computing platforms are consuming more and more energy due to
the increasing number of nodes composing them. To minimize the
operating costs of these platforms many techniques have been
used. Dynamic voltage and frequency scaling (DVFS) is one of
them. It reduces the frequency of a CPU to lower its energy
consumption. However, lowering the frequency of a CPU may
increase the execution time of an application running on that
processor. Therefore, the frequency that gives the best
trade-off between the energy consumption and the performance
of an application must be selected. In this paper, a new
online frequency selecting algorithm for heterogeneous
platforms (heterogeneous CPUs) is presented. It selects the
frequencies and tries to give the best trade-off between
energy saving and performance degradation, for each node
computing the message passing iterative application. The
algorithm has a small overhead and works without training or
profiling. It uses a new energy model for message passing
iterative applications running on a heterogeneous
platform. The proposed algorithm is evaluated on the SimGrid
simulator while running the NAS parallel benchmarks. The
experiments show that it reduces the energy consumption by up
to 34% while limiting the performance degradation as much
as possible. Finally, the algorithm is compared to an existing
method, the comparison results show that it outperforms the
latter, on average it saves 4% more energy while keeping
the same performance.
|
[2]
|
Charles Emile Ramamonjisoa, Lilia Ziane Khodja, David Laiymani, Arnaud Giersch,
and Raphaël Couturier.
Simulation of asynchronous iterative algorithms using SimGrid.
In HPCC 2014: 16th IEEE International Conference on High
Performance Computing and Communications, pages 890--895, Paris, France,
August 2014. IEEE Computer Society Press.
[ BibTeX |
DOI |
PDF ]
Synchronous iterative algorithms are often less
scalable than asynchronous iterative ones.
Performing large scale experiments with different
kind of network parameters is not easy because with
supercomputers such parameters are fixed. So, one
solution consists in using simulations first in
order to analyze what parameters could influence or
not the behavior of an algorithm. In this paper, we
show that it is interesting to use SimGrid to
simulate the behavior of asynchronous iterative
algorithms. For that, we compare the behavior of a
synchronous GMRES algorithm with an asynchronous
multisplitting one with simulations which let us
easily choose some parameters. Both codes are real
MPI codes and simulations allow us to see when the
asynchronous multisplitting algorithm can be more
efficient than the GMRES one to solve a 3D Poisson
problem.
|
[3]
|
Jean-Claude Charr, Raphaël Couturier, Ahmed Fanfakh, and Arnaud Giersch.
Dynamic frequency scaling for energy consumption reduction in
synchronous distributed applications.
In ISPA 2014: 12th IEEE International Symposium on Parallel
and Distributed Processing with Applications, pages 225--230, Milan, Italy,
August 2014. IEEE Computer Society Press.
[ BibTeX |
DOI |
PDF ]
Dynamic Voltage Frequency Scaling (DVFS) can be
applied to modern CPUs. This technique is usually
used to reduce the energy consumed by a CPU while
computing. Thus, decreasing the frequency reduces
the power consumed by the CPU. However, it can also
significantly affect the performance of the executed
program if it is compute bound and if a low CPU
frequency is selected. Therefore, the chosen
scaling factor must give the best possible trade-off
between energy reduction and performance. In this
paper we present an algorithm that predicts the
energy consumed with each frequency gear and selects
the one that gives the best ratio between energy
consumption reduction and performance. This
algorithm works online without training or profiling
and has a very small overhead. It also takes into
account synchronous communications between the nodes
that are executing the distributed algorithm. The
algorithm has been evaluated over the SimGrid
simulator while being applied to the NAS parallel
benchmark programs. The results of the experiments
show that it outperforms other existing scaling
factor selection algorithms.
|
[4]
|
Henri Casanova, Arnaud Giersch, Arnaud Legrand, Martin Quinson, and
Frédéric Suter.
SimGrid: a sustained effort for the versatile simulation of large
scale distributed systems.
In WSSSPE: First Workshop on Sustainable Software for Science:
Practice and Experiences, Denver, CO, USA, November 2013.
Workshop held in conjunction with SC13.
[ BibTeX |
PDF |
Link ]
In this paper we present Simgrid, a toolkit for the
versatile simulation of large scale distributed
systems, whose development effort has been sustained
for the last fifteen years. Over this time period
SimGrid has evolved from a one-laboratory project in
the U.S. into a scientific instrument developed by
an international collaboration. The keys to making
this evolution possible have been securing of
funding, improving the quality of the software, and
increasing the user base. In this paper we describe
how we have been able to make advances on all three
fronts, on which we plan to intensify our efforts
over the upcoming years.
|
[5]
|
Jacques M. Bahi, Sylvain Contassot-Vivier, and Arnaud Giersch.
Load balancing in dynamic networks by bounded delays asynchronous
diffusion.
In José Palma, Michel Daydé, Osni Marques, and João
Lopes, editors, VECPAR 2010: 9th International Meeting on High
Performance Computing for Computational Science: revised selected papers,
volume 6449 of Lecture Notes in Computer Science, pages 352--365.
Springer-Verlag, 2011.
[ BibTeX |
DOI ]
Load balancing is a well known problem, which has
been extensively addressed in parallel
algorithmic. However, there subsist some contexts in
which the existing algorithms cannot be used. One of
these contexts is the case of dynamic networks where
the links between the different elements are
intermittent. We propose in this paper an efficient
algorithm, based on asynchronous diffusion, to
perform load balancing in such a context. A
convergence theorem is proposed and proved. Finally,
experimental results performed in the SimGrid
environment confirm the efficiency of our algorithm.
|
[6]
|
Jacques M. Bahi, Sylvain Contassot-Vivier, and Arnaud Giersch.
Load balancing in dynamic networks by bounded delays asynchronous
diffusion.
In VECPAR 2010: 9th International Meeting on High Performance
Computing for Computational Science, Berkeley, CA, USA, June 2010.
[ BibTeX |
PDF |
Link ]
Load balancing is a well known problem, which has
been extensively addressed in parallel
algorithmic. However, there subsist some contexts in
which the existing algorithms cannot be used. One of
these contexts is the case of dynamic networks where
the links between the different elements are
intermittent. We propose in this paper an efficient
algorithm, based on asynchronous diffusion, to
perform load balancing in such a context. A
convergence theorem is proposed and proved. Finally,
experimental results performed in the SimGrid
environment confirm the efficiency of our
algorithm.
|
[7]
|
Jacques M. Bahi, Arnaud Giersch, and Abdallah Makhoul.
A scalable fault tolerant diffusion scheme for data fusion in sensor
networks.
In Infoscale 2008, The Third International ICST Conference on
Scalable Information Systems, page 10 (5 pages), Vico Equense, Italy, June
2008. ICST (Institute for Computer Sciences, Social-Informatics and
Telecommunications Engineering).
[ BibTeX |
PDF |
Link ]
In sensor networks, sensor nodes are usually
deployed randomly over an area to collect the
information of interest. Data fusion is the phase of
processing the collected information by sensor nodes
before they are sent to the end user. This paper
introduces a distributed consensus algorithm that
allows the nodes of a sensor network to track the
average of n sensor measurements. The study of the
above mentioned algorithm showed that it is robust
to asynchronism and dynamic topology changes. It is
also put in evidence that the algorithm is fully
distributed and does not require any global
coordination. Moreover, the proposed method doesn't
involve explicit point-to-point message or routing,
it diffuses information across the
network. Accordingly, simulation results are
provided illustrating the effectiveness of the
studied algorithm.
|
[8]
|
Arnaud Giersch.
Ordonnancement sur plates-formes hétérogènes de
tâches partageant des données.
In 16ème Rencontres Francophones du Parallélisme (RenPar
2005), pages 159--170, Le Croisic, France, April 2005.
French text.
[ BibTeX |
PostScript |
PDF ]
Cet article est consacré à l'ordonnancement d'un
grand ensemble de tâches indépendantes sur des
plates-formes hétérogènes distribuées. Les
tâches dépendent de données (en entrée) qui
sont initialement réparties sur les différents
nœuds de la plate-forme. Une certaine donnée
peut être partagée par plusieurs tâches. Pour
chaque tâche, notre problème est de décider sur
quel nœud de la plate-forme l'exécuter, et de
transférer les données nécessaires (celles
dont dépend la tâche) vers ce nœud.
L'objectif est de trouver une allocation des
tâches, et un ordonnancement des communications
induites, qui minimisent le temps total
d'exécution. Sur le plan théorique, nous
exposons différents résultats qui
caractérisent la difficulté du problème. Sur le
plan pratique, nous proposons plusieurs nouvelles
heuristiques, que nous comparons à des
heuristiques classiques comme min-min ou sufferage
grâce à des simulations.
|
[9]
|
Arnaud Giersch, Yves Robert, and Frédéric Vivien.
Scheduling tasks sharing files from distributed repositories.
In Marco Danelutto, Marco Vanneschi, and Domenico Laforenza, editors,
Euro-Par 2004: Parallel Processing: 10th International Euro-Par
Conference, volume 3149 of Lecture Notes in Computer Science, pages
246--253, Pisa, Italy, August/September 2004. Springer-Verlag.
[ BibTeX |
DOI |
PostScript |
PDF |
Corresponding RR ]
This paper is devoted to scheduling a large
collection of independent tasks onto a distributed
heterogeneous platform, which is composed of a set
of servers. Each server is a processor cluster
equipped with a file repository. The tasks to be
scheduled depend upon (input) files which initially
reside on the server repositories. A given file may
well be shared by several tasks. For each task, the
problem is to decide which server will execute it,
and to transfer the required files to that server
repository. The objective is to find a task
allocation, and to schedule the induced
communications, so as to minimize the total
execution time. The contribution of this paper is
twofold. On the theoretical side, we establish a
complexity result that assesses the difficulty of
the problem. On the practical side, we design
several new heuristics, including an extension of
the min-min heuristic to such a
decentralized framework, and several lower cost
heuristics, which we compare through extensive
simulations.
|
[10]
|
Arnaud Giersch, Yves Robert, and Frédéric Vivien.
Scheduling tasks sharing files on heterogeneous master-slave
platforms.
In 12th Euromicro Conference on Parallel, Distributed and
Network-Based Processing (PDP 2004), pages 364--371, A Coruña, Spain,
February 2004. IEEE Computer Society Press.
[ BibTeX |
DOI |
PostScript |
PDF |
Corresponding RR ]
This paper is devoted to scheduling a large
collection of independent tasks onto heterogeneous
clusters. The tasks depend upon (input) files which
initially reside on a master processor. A given file
may well be shared by several tasks. The role of the
master is to distribute the files to the processors,
so that they can execute the tasks. The objective
for the master is to select which file to send to
which slave, and in which order, so as to minimize
the total execution time. The contribution of this
paper is twofold. On the theoretical side, we
establish complexity results that assess the
difficulty of the problem. On the practical side, we
design several new heuristics, which are shown to
perform as efficiently as the best heuristics from
Casanova et al. although their cost is an order of
magnitude lower.
|
[11]
|
Arnaud Giersch, Yves Robert, and Frédéric Vivien.
Scheduling tasks sharing files on heterogeneous clusters.
In Jack Dongarra, Domenico Laforenza, and Salvatore Orlando, editors,
10th European PVM/MPI Users' Group Conference (EuroPVM/MPI 2003),
volume 2840 of Lecture Notes in Computer Science, pages 657--660,
Venice, Italy, September 2003. Springer-Verlag.
[ BibTeX |
DOI |
PostScript |
PDF |
Corresponding RR ]
This paper is devoted to scheduling a large
collection of independent tasks onto heterogeneous
clusters. The tasks depend upon (input) files which
initially reside on a master processor. A given file
may well be shared by several tasks. The role of the
master is to distribute the files to the processors,
so that they can execute the tasks. The objective
for the master is to select which file to send to
which slave, and in which order, so as to minimize
the total execution time. The contribution of this
paper is twofold. On the theoretical side, we
establish complexity results that assess the
difficulty of the problem. On the practical side, we
design several new heuristics, which are shown to
perform as efficiently as the best heuristics in
Casanova et al. although their cost is an order of
magnitude lower.
|
[12]
|
Stéphane Genaud, Arnaud Giersch, and Frédéric Vivien.
Load-balancing scatter operations for grid computing.
In 12th Heterogeneous Computing Workshop (HCW 2003), page
101a (10 pages), Nice, France, April 2003. IEEE Computer Society Press.
Workshop held in conjunction with IPDPS.
[ BibTeX |
DOI |
PostScript |
PDF |
Corresponding RR ]
We present solutions to statically load-balance
scatter operations in parallel codes run on
Grids. Our load-balancing strategy is based on the
modification of the data distributions used in
scatter operations. We need to modify the user
source code, but we want to keep the code as close
as possible to the original. Hence, we study the
replacement of scatter operations with a
parameterized scatter, allowing a custom
distribution of data. The paper presents: 1) a
general algorithm which finds an optimal
distribution of data across processors; 2) a quicker
guaranteed heuristic relying on hypotheses on
communications and computations; 3) a policy on the
ordering of the processors. Experimental results
with an MPI scientific code of seismic tomography
illustrate the benefits obtained from our
load-balancing.
|
[13]
|
Romaric David, Stéphane Genaud, Arnaud Giersch, Benjamin Schwarz, and
Éric Violard.
Source code transformations strategies to load-balance grid
applications.
In Manish Parashar, editor, 3rd International Workshop on Grid
Computing (GRID 2002), volume 2536 of Lecture Notes in Computer
Science, pages 82--87, Baltimore, MD, USA, November 2002.
Springer-Verlag.
[ BibTeX |
DOI |
PostScript |
PDF |
Corresponding RR ]
We present load-balancing strategies to improve the
performances of parallel MPI applications running in
a Grid environment. We analyze the data distribution
constraints found in two scientific codes and
propose adapted code transformations to load-balance
computations. Experimental results confirm that such
source code transformations can improve Grid
application performances.
|
[1]
|
Arnaud Giersch, Yves Robert, and Frédéric Vivien.
Scheduling tasks sharing files from distributed repositories (revised
version).
Research Report no. 5124, INRIA, France, February 2004.
This text is also available as Research Report no. 2004-04 of the LIP
(http://www.ens-lyon.fr/LIP/).
[ BibTeX |
PostScript |
PDF |
PostScript (LIP version) |
PDF (LIP version) |
Link ]
This paper is devoted to scheduling a large
collection of independent tasks onto a large
distributed heterogeneous platform, which is
composed of a set of servers. Each server is a
processor cluster equipped with a file
repository. The tasks to be scheduled depend upon
(input) files which initially reside on the server
repositories. A given file may well be shared by
several tasks. For each task, the problem is to
decide which server will execute it, and to transfer
the required files (those which the task depends
upon) to that server repository. The objective is to
find a task allocation, and to schedule the induced
communications, so as to minimize the total
execution time. The contribution of this paper is
twofold. On the theoretical side, we establish
complexity results that assess the difficulty of the
problem. On the practical side, we design several
new heuristics, including an extension of the
min-min heuristic to the decentralized
framework, and several lower cost heuristics, which
we compare through extensive simulations. This
report is a revised version of the LIP research
report no. 2003-49 / INRIA research report no. 4976,
which it replaces.
|
[2]
|
Arnaud Giersch, Yves Robert, and Frédéric Vivien.
Scheduling tasks sharing files from distributed repositories.
Research Report no. 4976, INRIA, France, October 2003.
This text is also available as Research Report no. 2003-49 of the LIP
(http://www.ens-lyon.fr/LIP/).
[ BibTeX |
PostScript |
PDF |
PostScript (LIP version) |
PDF (LIP version) |
Link ]
This paper is devoted to scheduling a large
collection of independent tasks onto a large
distributed heterogeneous platform, which is
composed of a set of servers. Each server is a
processor cluster equipped with a file
repository. The tasks to be scheduled depend upon
(input) files which initially reside on the server
repositories. A given file may well be shared by
several tasks. For each task, the problem is to
decide on which server to execute it, and to
transfer the required files (those which the task
depends upon) to that server repository. The
objective is to find a task allocation, and to
schedule the induced communications, so as to
minimize the total execution time. The contribution
of this paper is twofold. On the theoretical side,
we establish complexity results that assess the
difficulty of the problem. On the practical side, we
design several new heuristics, including an
extension of the min-min heuristic to the
decentralized framework, and several lower cost
heuristics, which we compare through extensive
simulations.
|
[3]
|
Arnaud Giersch, Yves Robert, and Frédéric Vivien.
Scheduling tasks sharing files on heterogeneous clusters.
Research Report no. 4819, INRIA, France, May 2003.
This text is also available as Research Report no. 2003-28 of the LIP
(http://www.ens-lyon.fr/LIP/).
[ BibTeX |
PostScript |
PDF |
PostScript (LIP version) |
PDF (LIP version) |
Link ]
This paper is devoted to scheduling a large
collection of independent tasks onto heterogeneous
clusters. The tasks depend upon (input) files which
initially reside on a master processor. A given file
may well be shared by several tasks. The role of the
master is to distribute the files to the processors,
so that they can execute the tasks. The objective
for the master is to select which file to send to
which slave, and in which order, so as to minimize
the total execution time. The contribution of this
paper is twofold. On the theoretical side, we
establish complexity results that assess the
difficulty of the problem. On the practical side, we
design several new heuristics, which are shown to
perform as efficiently as the best heuristics of
Casanova et al. although their cost is an order of
magnitude lower.
|
[4]
|
Stéphane Genaud, Arnaud Giersch, and Frédéric Vivien.
Load-balancing scatter operations for grid computing.
Research Report no. 4770, INRIA, France, March 2003.
This text is also available as Research Report no. 2003-17 of the LIP
(http://www.ens-lyon.fr/LIP/).
[ BibTeX |
PostScript |
PDF |
PostScript (LIP version) |
PDF (LIP version) |
Link ]
We present solutions to statically load-balance
scatter operations in parallel codes run on
grids. Our load-balancing strategy is based on the
modification of the data distributions used in
scatter operations. We study the replacement of
scatter operations with parameterized scatters,
allowing custom distributions of data. The paper
presents: 1) a general algorithm which finds an
optimal distribution of data across processors; 2) a
quicker guaranteed heuristic relying on hypotheses
on communications and computations; 3) a policy on
the ordering of the processors. Experimental results
with an MPI scientific code illustrate the benefits
obtained from our load-balancing.
|
[5]
|
Romaric David, Stéphane Genaud, Arnaud Giersch, Benjamin Schwarz, and
Éric Violard.
Source code transformations strategies to load-balance grid
applications.
Research Report 02-09, ICPS-LSIIT, Université Louis Pasteur,
Strasbourg, France, August 2002.
[ BibTeX |
PostScript |
PDF ]
We present load-balancing strategies to improve
performances of parallel MPI applications running in
a Grid environment. We analyze the data distribution
constraints found in two scientific codes and
propose adapted code transformations to load-balance
computations. We present the general framework for
our load-balancing techniques. We then describe
these techniques as well as their application to our
examples. Experimental results confirm that adapted
source code transformations can improve Grid
application performances.
|