/*! \page publis Publications

When citing SimGrid, the preferred reference paper is <i>Scheduling
Distributed Applications: the SimGrid Simulation Framework</i>, even though
it is a bit old now. We are actively working on improving this.

\subsection pub_simulation About simulation
\li <b>Scheduling Distributed Applications: the
SimGrid Simulation Framework</b>\n
by <em>Henri Casanova and Arnaud Legrand and Loris Marchal</em>\n
Proceedings of the third IEEE International Symposium
on Cluster Computing and the Grid (CCGrid'03)\n
Since the advent of distributed computer systems an active field
of research has been the investigation of scheduling strategies
for parallel applications. The common approach is to employ
scheduling heuristics that approximate an optimal
schedule. Unfortunately, it is often impossible to obtain
analytical results to compare the efficacy of these heuristics.
One possibility is to conduct large numbers of back-to-back
experiments on real platforms. While this is possible on
tightly-coupled platforms, it is infeasible on modern distributed
platforms (i.e., Grids) as it is labor-intensive and does not
enable repeatable results. The solution is to resort to
simulations. Simulations not only enable repeatable results but
also make it possible to explore wide ranges of platform and
application scenarios.\n
In this paper we present the SimGrid framework which enables the
simulation of distributed applications in distributed computing
environments for the specific purpose of developing and evaluating
scheduling algorithms. This paper focuses on SimGrid v2, which
greatly improves on the first version of the software with more
realistic network models and topologies. SimGrid v2 also enables
the simulation of distributed scheduling agents, which has become
critical for current scheduling research in large-scale platforms.
After describing and validating these features, we present a case
study by which we demonstrate the usefulness of SimGrid for
conducting scheduling research.
\li <b>A Network Model for Simulation of Grid Application</b>\n
by <em>Henri Casanova and Loris Marchal</em>\n
In this work we investigate network models that can be
potentially employed in the simulation of scheduling algorithms for
distributed computing applications. We seek to develop a model of TCP
communication which is both high-level and realistic. Previous research
works show that accurate and global modeling of wide-area networks, such
as the Internet, faces a number of challenging issues. However, some
global models of fairness and bandwidth-sharing exist, and can be linked
with the behavior of TCP. Using both previous results and simulation (with
NS), we attempt to understand the macroscopic behavior of
TCP communications. We then propose a global model of the network for the
Grid platform. We perform partial validation of this model in
simulation. The model leads to an algorithm for computing
bandwidth-sharing. This algorithm can then be implemented as part of Grid
application simulations. We provide such an implementation for the
SimGrid simulation toolkit.\n
ftp://ftp.ens-lyon.fr/pub/LIP/Rapports/RR/RR2002/RR2002-40.ps.gz
\li <b>MetaSimGrid : Towards realistic scheduling simulation of
distributed applications</b>\n
by <em>Arnaud Legrand and Julien Lerouge</em>\n
Most scheduling problems are already hard on homogeneous
platforms; they become quite intractable in a heterogeneous
framework such as a metacomputing grid. In the best cases, a
guaranteed heuristic can be found, but most of the time, it is
not possible. Real experiments or simulations are often
involved to test or to compare heuristics. However, on a
distributed heterogeneous platform, such experiments are
technically difficult to drive, because of the genuine
instability of the platform. It is almost impossible to
guarantee that a platform which is not dedicated to the
experiment will remain exactly the same between two tests,
thereby forbidding any meaningful comparison. Simulations are
then used to replace real experiments, so as to ensure the
reproducibility of measured data. A key issue is the
possibility to run the simulations against a realistic
environment. The main idea of trace-based simulation is to
record the platform parameters today, and to simulate the
algorithms tomorrow, against the recorded data: even though it
is not the current load of the platform, it is realistic,
because it represents a fair summary of what happened
previously. A good example of a trace-based simulation tool is
SimGrid, a toolkit providing a set of core abstractions and
functionalities that can be used to easily build simulators for
specific application domains and/or computing environment
topologies. Nevertheless, SimGrid lacks a number of convenient
features to craft simulations of a distributed application
where scheduling decisions are not taken by a single
process. Furthermore, modeling a complex platform by hand is
fastidious for a few hosts and is almost impossible for a real
grid. This report is a survey on simulation for scheduling
evaluation purposes and presents MetaSimGrid, a simulator built
ftp://ftp.ens-lyon.fr/pub/LIP/Rapports/RR/RR2002/RR2002-28.ps.gz
\li <b>SimGrid: A Toolkit for the Simulation of Application
Scheduling</b>\n
by <em>Henri Casanova</em>\n
Advances in hardware and software technologies have made it
possible to deploy parallel applications over increasingly large
sets of distributed resources. Consequently, the study of
scheduling algorithms for such applications has been an active area
of research. Given the nature of most scheduling problems one must
resort to simulation to effectively evaluate and compare their
efficacy over a wide range of scenarios. It has thus become
necessary to simulate those algorithms for increasingly complex
distributed, dynamic, heterogeneous environments. In this paper we
present SimGrid, a simulation toolkit for the study of scheduling
algorithms for distributed applications. This paper gives the main
concepts and models behind SimGrid, describes its API and
highlights current implementation issues. We also give some
experimental results and describe work that builds on SimGrid's
http://grail.sdsc.edu/papers/simgrid_ccgrid01.ps.gz
\subsection pub_research Papers using SimGrid results
\li <b>A study of meta-scheduling architectures for high throughput
computing: Pull vs. Push</b>\n
by <em>Vincent Garonne, Andrei Tsaregorodtsev, and Eddy Caron</em>\n
Proceedings of 4th International Symposium on Parallel and
Distributed Computing Job Scheduling Strategies for Parallel
Processing (ISPDC'05), July 2005.\n
Preliminary version in http://marwww.in2p3.fr/~garonne/garonne-meta.pdf
\li <b>Exploiting Replication and Data Reuse to Efficiently Schedule
Data-intensive Applications on Grids</b>\n
by <em>E. Santos-Neto, W. Cirne, F. Brasileiro, A. Lima.</em>\n
Proceedings of 10th Job Scheduling Strategies for Parallel Processing, June 2004.\n
http://www.lsd.ufcg.edu.br/~elizeu/articles/jsspp.v6.pdf
\li <b>Optimal algorithms for scheduling divisible workloads on
heterogeneous systems</b>\n
by <em>Olivier Beaumont and Arnaud Legrand and Yves Robert</em>\n
in Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS'03).\n
Preliminary version on ftp://ftp.ens-lyon.fr/pub/LIP/Rapports/RR/RR2002/RR2002-36.ps.gz
\li <b>On-line Parallel Tomography</b>\n
by <em>Shava Smallen</em>\n
Master's Thesis, UCSD, May 2001
\li <b>Applying Scheduling and Tuning to On-line Parallel Tomography</b>\n
by <em>Shava Smallen, Henri Casanova, Francine Berman</em>\n
in Proceedings of Supercomputing 2001
\li <b>Heuristics for Scheduling Parameter Sweep applications in
Grid environments</b>\n
by <em>Henri Casanova, Arnaud Legrand, Dmitrii Zagorodnov and
Francine Berman</em>\n
in Proceedings of the 9th Heterogeneous Computing Workshop
(HCW'2000), pp. 349-363.