1 /*! \page publis Publications
3 \section pub_reference Reference publication about SimGrid
5 When citing SimGrid, the preferred reference paper is <i>Scheduling
6 Distributed Applications: the SimGrid Simulation Framework</i>, even if it's
7 a bit old now. We are actively working on improving this.
9 \li <b>Scheduling Distributed Applications: the
10 SimGrid Simulation Framework</b>\n
11 by <em>Henri Casanova and Arnaud Legrand and Loris Marchal</em>\n
12 Proceedings of the third IEEE International Symposium
13 on Cluster Computing and the Grid (CCGrid'03)\n
14 Since the advent of distributed computer systems an active field
15 of research has been the investigation of scheduling strategies
16 for parallel applications. The common approach is to employ
17 scheduling heuristics that approximate an optimal
18 schedule. Unfortunately, it is often impossible to obtain
19 analytical results to compare the efficacy of these heuristics.
20 One possibility is to conduct large numbers of back-to-back
21 experiments on real platforms. While this is possible on
22 tightly-coupled platforms, it is infeasible on modern distributed
23 platforms (i.e. Grids) as it is labor-intensive and does not
24 enable repeatable results. The solution is to resort to
25 simulations. Simulations not only enable repeatable results but
26 also make it possible to explore wide ranges of platform and
27 application scenarios.\n
28 In this paper we present the SimGrid framework which enables the
29 simulation of distributed applications in distributed computing
30 environments for the specific purpose of developing and evaluating
31 scheduling algorithms. This paper focuses on SimGrid v2, which
32 greatly improves on the first version of the software with more
33 realistic network models and topologies. SimGrid v2 also enables
34 the simulation of distributed scheduling agents, which has become
35 critical for current scheduling research in large-scale platforms.
36 After describing and validating these features, we present a case
37 study by which we demonstrate the usefulness of SimGrid for
38 conducting scheduling research.
40 \section pub_simulation Other publications about simulation
42 \li <b>A Network Model for Simulation of Grid Application</b>\n
43 by <em>Henri Casanova and Loris Marchal</em>\n
45 In this work we investigate network models that can be
46 potentially employed in the simulation of scheduling algorithms for
47 distributed computing applications. We seek to develop a model of TCP
48 communication which is both high-level and realistic. Previous research
49 works show that accurate and global modeling of wide-area networks, such
50 as the Internet, faces a number of challenging issues. However, some
51 global models of fairness and bandwidth-sharing exist, and can be linked
52 with the behavior of TCP. Using both previous results and simulation (with
53 NS), we attempt to understand the macroscopic behavior of
54 TCP communications. We then propose a global model of the network for the
55 Grid platform. We perform partial validation of this model in
56 simulation. The model leads to an algorithm for computing
57 bandwidth-sharing. This algorithm can then be implemented as part of Grid
58 application simulations. We provide such an implementation for the
59 SimGrid simulation toolkit.\n
60 ftp://ftp.ens-lyon.fr/pub/LIP/Rapports/RR/RR2002/RR2002-40.ps.gz
63 \li <b>MetaSimGrid : Towards realistic scheduling simulation of
64 distributed applications</b>\n
65 by <em>Arnaud Legrand and Julien Lerouge</em>\n
66 Most scheduling problems are already hard on homogeneous
67 platforms; they become quite intractable in a heterogeneous
68 framework such as a metacomputing grid. In the best cases, a
69 guaranteed heuristic can be found, but most of the time, it is
70 not possible. Real experiments or simulations are often
71 involved to test or to compare heuristics. However, on a
72 distributed heterogeneous platform, such experiments are
73 technically difficult to drive, because of the genuine
74 instability of the platform. It is almost impossible to
75 guarantee that a platform which is not dedicated to the
76 experiment will remain exactly the same between two tests,
77 thereby forbidding any meaningful comparison. Simulations are
78 then used to replace real experiments, so as to ensure the
79 reproducibility of measured data. A key issue is the
80 possibility to run the simulations against a realistic
81 environment. The main idea of trace-based simulation is to
82 record the platform parameters today, and to simulate the
83 algorithms tomorrow, against the recorded data: even though it
84 is not the current load of the platform, it is realistic,
85 because it represents a fair summary of what happened
86 previously. A good example of a trace-based simulation tool is
87 SimGrid, a toolkit providing a set of core abstractions and
88 functionalities that can be used to easily build simulators for
89 specific application domains and/or computing environment
90 topologies. Nevertheless, SimGrid lacks a number of convenient
91 features to craft simulations of a distributed application
92 where scheduling decisions are not taken by a single
93 process. Furthermore, modeling a complex platform by hand is
94 tedious for a few hosts and is almost impossible for a real
95 grid. This report is a survey on simulation for scheduling
96 evaluation purposes and presents MetaSimGrid, a simulator built
98 ftp://ftp.ens-lyon.fr/pub/LIP/Rapports/RR/RR2002/RR2002-28.ps.gz
100 \li <b>SimGrid: A Toolkit for the Simulation of Application Scheduling</b>\n
102 by <em>Henri Casanova</em>\n
103 Advances in hardware and software technologies have made it
104 possible to deploy parallel applications over increasingly large
105 sets of distributed resources. Consequently, the study of
106 scheduling algorithms for such applications has been an active area
107 of research. Given the nature of most scheduling problems one must
108 resort to simulation to effectively evaluate and compare their
109 efficacy over a wide range of scenarios. It has thus become
110 necessary to simulate those algorithms for increasingly complex
111 distributed, dynamic, heterogeneous environments. In this paper we
112 present SimGrid, a simulation toolkit for the study of scheduling
113 algorithms for distributed applications. This paper gives the main
114 concepts and models behind SimGrid, describes its API and
115 highlights current implementation issues. We also give some
116 experimental results and describe work that builds on SimGrid's
118 http://grail.sdsc.edu/papers/simgrid_ccgrid01.ps.gz
120 \section pub_ext Papers that use SimGrid-generated results
122 This list is a selection of articles. We list only papers written by people
123 external to the development group, but we also use our tool ourselves (see
\ref pub_self).
128 \li <b>Hierarchical Scheduling of Independent Tasks with Shared Files</b>\n
129 by <em>H. Senger, F. Silva, W. Nascimento</em>\n
130 Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid Workshop (CCGRIDW'06), 2006.\n
131 http://www.unisantos.br/mestrado/informatica/hermes/File/senger-HierarchicalScheduling-Workshop-TB120.pdf
135 \li <b>On Dynamic Resource Management Mechanism using Control
136 Theoretic Approach for Wide-Area Grid Computing</b>\n
137 by <em>Hiroyuki Ohsaki, Soushi Watanabe, and Makoto Imase</em>\n
138 in Proceedings of IEEE Conference on Control Applications (CCA 2005), Aug. 2005.\n
139 http://www.ispl.jp/~oosaki/papers/Ohsaki05_CCA.pdf
141 \li <b>Evaluation of Meta-scheduler Architectures and Task Assignment Policies for
142 High Throughput Computing</b>\n
143 by <em>Eddy Caron, Vincent Garonne and Andrei Tsaregorodtsev</em>\n
144 Proceedings of 4th International Symposium on Parallel and
145 Distributed Computing Job Scheduling Strategies for Parallel
146 Processing (ISPDC'05), July 2005.\n
147 http://www.ens-lyon.fr/LIP/Pub/Rapports/RR/RR2005/RR2005-27.pdf
151 \li <b>Deadline Scheduling with Priority for Client-Server Systems on the Grid</b>\n
152 by <em>E. Caron, P. K. Chouhan, F. Desprez</em>\n
153 in IEEE International Conference On Grid Computing. Super Computing 2004, Oct. 2004.
155 \li <b>Efficient Scheduling Heuristics for GridRPC Systems</b>\n
156 by <em>Y. Caniou and E. Jeannot</em>\n
157 in IEEE QoS and Dynamic System workshop (QDS) of International Conference
158 on Parallel and Distributed Systems (ICPADS), Newport Beach, California, USA,
159 pages 621-630, July 2004.\n
160 http://graal.ens-lyon.fr/~ycaniou/QDS04.ps
162 \li <b>Exploiting Replication and Data Reuse to Efficiently Schedule
163 Data-intensive Applications on Grids</b>\n
164 by <em>E. Santos-Neto, W. Cirne, F. Brasileiro, A. Lima</em>\n
165 Proceedings of 10th Job Scheduling Strategies for Parallel Processing, June 2004.\n
166 http://www.lsd.ufcg.edu.br/~elizeu/articles/jsspp.v6.pdf
170 \li <b>Link-Contention-Aware Genetic Scheduling Using Task Duplication in Grid Environments</b>\n
171 by <em>Wensheng Yao, Xiao Xie and Jinyuan You</em>\n
172 in Grid and Cooperative Computing: Second International Workshop, GCC 2003, Shanghai, China, December 7-10, 2003 (LNCS)\n
173 http://www.chinagrid.edu.cn/chinagrid/download/GCC2003/pdf/266.pdf
175 \li <b>New Dynamic Heuristics in the Client-Agent-Server Model</b>\n
176 by <em>Y. Caniou and E. Jeannot</em>\n
177 in IEEE 13th Heterogeneous Computing Workshop - HCW'03, Nice, France, April 2003.\n
178 http://graal.ens-lyon.fr/~ycaniou/HCW03.ps
180 \li <b>A Hierarchical Resource Reservation Algorithm for Network Enabled Servers</b>\n
181 by <em>E. Caron, F. Desprez, F. Petit, V. Villain</em>\n
182 in the 17th International Parallel and Distributed Processing Symposium -- IPDPS'03, Nice - France, April 2003.
184 \section pub_self Our own papers that use SimGrid-generated results
186 This list is a selection of the articles we have written that used results
187 generated by SimGrid.
191 \li <b>Interference-Aware Scheduling</b>\n
192 by <em>B. Kreaseck, L. Carter, H. Casanova, J. Ferrante, S. Nandy</em>\n
193 International Journal of High Performance Computing Applications (IJHPCA), to appear.\n
194 http://www2.hawaii.edu/~henric/homepage/papers/kreaseck_ijhpca_2005.pdf
198 \li <b>From Heterogeneous Task Scheduling to Heterogeneous Mixed Data and Task Parallel Scheduling</b>\n
199 by <em>F. Suter, V. Boudet, F. Desprez, H. Casanova</em>\n
200 Proceedings of Europar, 230--237, (LNCS volume 3149), Pisa, Italy, August 2004.
203 \li <b>On the Interference of Communication on Computation</b>\n
204 by <em>B. Kreaseck, L. Carter, H. Casanova, J. Ferrante</em>\n
205 Proceedings of the workshop on Performance Modeling, Evaluation, and Optimization of Parallel and Distributed Systems, Santa Fe, April 2004.\n
206 http://www2.hawaii.edu/~henric/homepage/papers/k_pmeo2004.pdf
211 \li <b>RUMR: Robust Scheduling for Divisible Workloads</b>\n
212 by <em>Y. Yang, H. Casanova</em>\n
213 Proceedings of the 12th IEEE Symposium on High Performance and Distributed Computing (HPDC-12), Seattle, June 2003.\n
214 http://www2.hawaii.edu/~henric/homepage/papers/yang_hpdc2003.pdf
218 \li <b>Resource Allocation Strategies for Guided Parameter Space Searches</b>\n
219 by <em>M. Faerman, A. Birnbaum, F. Berman, H. Casanova</em>\n
220 International Journal of High Performance Computing Applications (IJHPCA), 17(4), 383--402, 2003.\n
221 http://grail.sdsc.edu/papers/faerman_ijhpca04.pdf
225 \li <b>Resource Allocation for Steerable Parallel Parameter Searches</b>\n
226 by <em>M. Faerman, A. Birnbaum, H. Casanova, F. Berman</em>\n
227 Proceedings of the Grid Computing Workshop, Baltimore, 157--169, November 2002.\n
228 http://grail.sdsc.edu/projects/vi_itr/grid02.pdf
233 \li <b>Applying Scheduling and Tuning to On-line Parallel Tomography</b>\n
234 by <em>Shava Smallen, Henri Casanova, Francine Berman</em>\n
235 in Proceedings of Supercomputing 2001\n
236 http://grail.sdsc.edu/papers/tomo_journal.ps.gz
241 \li <b>Heuristics for Scheduling Parameter Sweep applications in Grid environments</b>\n
242 by <em>Henri Casanova, Arnaud Legrand, Dmitrii Zagorodnov and Francine Berman</em>\n
243 in Proceedings of the 9th Heterogeneous Computing workshop (HCW'2000), pp. 349-363.\n
244 http://www2.hawaii.edu/~henric/homepage/papers/hcw00_pst.pdf
249 \li <b>Optimal algorithms for scheduling divisible workloads on
250 heterogeneous systems</b>\n
251 by <em>Olivier Beaumont and Arnaud Legrand and Yves Robert</em>\n
252 in Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS'03).\n
253 Preliminary version on ftp://ftp.ens-lyon.fr/pub/LIP/Rapports/RR/RR2002/RR2002-36.ps.gz
256 \li <b>On-line Parallel Tomography</b>\n
257 by <em>Shava Smallen</em>\n
258 Master's Thesis, UCSD, May 2001