
   1 /*! \page platform Platform description
   2
   3 \htmlinclude .platform.doc.toc
   4
In order to run any simulation, SimGrid needs three things: something to run
(that is, your code), a description of the platform on which you want to run your
application, and a description of what to deploy where.

For the last two items, you basically have two ways to provide the input:
\li You can program it, either using the Lua console or, if you are using MSG, some
of its platform and deployment functions. If you want to go this way, please refer
to the corresponding documentation.
\li You can use two XML files: a platform description file and a deployment
description file.

As the second one (the deployment description) merely states which
process runs where and which arguments it takes as input, the easiest way to
understand how to write it is to look at an example:
  19
  20 \verbatim
  21 <?xml version='1.0'?>
  22 <!DOCTYPE platform SYSTEM "http://simgrid.gforge.inria.fr/simgrid.dtd">
  23 <platform version="3">
  24   <!-- The master process (with some arguments) -->
  25   <process host="Tremblay" function="master">
  26      <argument value="20"/>       <!-- Number of tasks -->
  27      <argument value="50000000"/>  <!-- Computation size of tasks -->
  28      <argument value="1000000"/>   <!-- Communication size of tasks -->
  29      <argument value="Jupiter"/>  <!-- First slave -->
  30      <argument value="Fafard"/>   <!-- Second slave -->
  31      <argument value="Ginette"/>  <!-- Third slave -->
  32      <argument value="Bourassa"/> <!-- Last slave -->
  33      <argument value="Tremblay"/> <!-- Me! I can work too! -->
  34   </process>
  35   <!-- The slave processes (with no argument) -->
  36   <process host="Tremblay" function="slave"/>
  37   <process host="Jupiter" function="slave"/>
  38   <process host="Fafard" function="slave"/>
  39   <process host="Ginette" function="slave"/>
  40   <process host="Bourassa" function="slave"/>
  41 </platform>
  42 \endverbatim
  43
The platform description is slightly more complicated. This documentation is all about how to write this file: the basic concepts it relies on, the possibilities it offers, and some hints and tips on how to write a good platform description.
  45
  46 \section pf_overview Some words about XML and DTD
  47
We chose XML because of the tooling it brings: with a decent XML editor,
or simply with any XML plug-in for Eclipse, you get auto-completion, validation
and checking, so most syntax errors can be avoided this way.

XML checking is based on the DTD, which is currently available online at
<a href="http://simgrid.gforge.inria.fr/simgrid.dtd">http://simgrid.gforge.inria.fr/simgrid.dtd</a>.
You might be tempted to read it, but it will not help you that much.

If you do read it, you should notice a few important things:
\li The platform tag has a version attribute. At the time of writing,
the current version is 3.
\li The DTD contains the definitions for both files used by SimGrid (platform
description and deployment).
\li There are a lot of possibilities! Let's see what they are.
  63
  64
  65 \section pf_basics Basic concepts
  66
Nowadays, the Internet is composed of a set of independently managed networks. Each of those networks has entry and exit points (most of the time, the same point serves both purposes) that allow traffic to leave the current network and reach other networks. At the upper level, these networks are known as <b>Autonomous Systems (AS)</b>, while at the lower level they are called sub-networks, or LANs. They are indeed autonomous: routing is defined within the limits of each network by its administrator, so a network can continue to operate without the other networks. Some rules govern how to leave a network through its entry points (or gateways). Those gateways allow you to go from one network to another. Inside each autonomous system, there is a set of equipment (cables, routers, switches, computers) that belongs to the autonomous system owner.
  68
A SimGrid platform description file relies on exactly the same concepts as a real-life platform. Every resource (computers, network equipment, and so on) belongs to an AS. Within this AS, you can define the routing you want between its elements (this is done with the routing model attribute and possibly with some \<route\> tags). You define an AS by using ... well ... the \<AS\> tag. An AS can also contain other AS: this is how you define the hierarchy of your platform.
  70
Within each AS, you basically have the following types of resources:
\li <b>host</b>: a host, with cores in it, and so on.
\li <b>router</b>: a router or a gateway.
\li <b>link</b>: a link, which defines a connection between two (or more) resources and has a bandwidth and a latency.
\li <b>cluster</b>: like a real cluster, contains many hosts interconnected by some dedicated network.

Between those elements, routing has to be defined. As the AS is supposed to be autonomous, this has to be done at the AS level. As an AS handles two different types of entities (<b>host/router</b> and <b>AS</b>), you will have to define routes between those elements. A routing model has to be provided for each AS, but depending on that model, or because you want to bypass the default behaviour, you may also need to define routes manually. There are 3 tags to use:
\li <b>ASroute</b>: to define routes between two <b>AS</b>
\li <b>route</b>: to define routes between two <b>host/router</b>
\li <b>bypassRoute</b>: to define routes between two <b>AS</b> that will bypass default routing.
  81
  82 Here is an illustration of the overall concepts:
  83
  84 \htmlonly
  85 <a href="AS_hierarchy.png" border=0><img src="AS_hierarchy.png" width="30%" border=0 align="center"></a>
  86 <br/>
  87 \endhtmlonly
  88  Circles represent processing units and squares represent network routers. Bold
  89     lines represent communication links. AS2 models the core of a national
  90     network interconnecting a small flat cluster (AS4) and a larger
  91     hierarchical cluster (AS5), a subset of a LAN (AS6), and a set of peers
  92     scattered around the world (AS7).
  93
  94
That is all for the concepts! To make a long story short, a SimGrid platform is made of a hierarchy of AS, each of them containing resources, and routing is defined at the AS level. Let's have a deeper look at the tags.
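
As a rough sketch of these concepts (not taken from the SimGrid examples: every id and value below is hypothetical, and the fragment has not been validated against the DTD), a two-level hierarchy could be written as follows. A top-level AS contains two inner AS, each with its own resources, and an ASroute connects them through a link declared at the top level.

\verbatim
<AS id="AS_top" routing="Full">
  <!-- First inner AS: two hosts, one link, one manually entered route -->
  <AS id="AS_left" routing="Full">
    <host id="left1" power="1000000000"/>
    <host id="left2" power="1000000000"/>
    <link id="left_link" bandwidth="125000000" latency="0.000100"/>
    <route src="left1" dst="left2"><link_ctn id="left_link"/></route>
  </AS>

  <!-- Second inner AS: a single host, so no internal route is declared -->
  <AS id="AS_right" routing="Full">
    <host id="right1" power="1000000000"/>
  </AS>

  <!-- Link owned by the enclosing AS, used to interconnect the inner AS -->
  <link id="top_link" bandwidth="125000000" latency="0.000500"/>

  <!-- Route between the two inner AS, entering each one through a gateway -->
  <ASroute src="AS_left" dst="AS_right" gw_src="left1" gw_dst="right1">
    <link_ctn id="top_link"/>
  </ASroute>
</AS>
\endverbatim

Hosts are used as gateways here only to keep the sketch short; a dedicated router in each inner AS would work the same way.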
  96
  97
  98
  99 \section pf_pftags Describing resources and their organization
 100
 101 \subsection  pf_As Platform organization tag : AS
 102
AS (or Autonomous System) is an organizational unit that contains resources and defines routing between them, and possibly some other AS. It thus allows you to define a hierarchy in your platform. <b>*ANY*</b> resource <b>*MUST*</b> belong to an AS. There are a few attributes.

<b>AS</b> attributes :
\li <b>id (mandatory)</b>: the identifier of the AS to be used when referring to it.
\li <b>routing (mandatory)</b>: the routing model used inside it. By model we mean the internal way the simulator manages routing. It also has a big impact on how much information you will have to provide to help the simulator route between the AS elements. Possible values for <b>routing</b> are <b>Full, Floyd, Dijkstra, DijkstraCache, none, RuleBased, Vivaldi, Cluster</b>. For more explanation about what to choose, take a look at the section devoted to it below.

Elements inside an AS are basically resources (computers, network equipment) and some routing information if necessary (see below for more explanation).
 110
 111 <b>AS example</b>
 112 \verbatim
 113 <AS  id="AS0"  routing="Full">
 114    <host id="host1" power="1000000000"/>
 115    <host id="host2" power="1000000000"/>
 116    <link id="link1" bandwidth="125000000" latency="0.000100"/>
 117    <route src="host1" dst="host2"><link_ctn id="link1"/></route>
 118  </AS>
 119 \endverbatim
 120
 121
 122 In this example, AS0 contains two hosts (host1 and host2). The route between the hosts goes through link1.
 123 \subsection pf_Cr Computing resources: hosts, clusters and peers.
 124
 125 \subsubsection pf_host host
A <b>host</b> represents a computer, where you will be able to execute code and from which you can send and receive information. A host can contain more than one core. Here are the attributes of a host:

<b>host</b> attributes :
\li <b>id (mandatory)</b>: the identifier of the host to be used when referring to it.
\li <b>power (mandatory)</b>: the peak number of floating-point operations per second the CPU can manage. Expressed in flop/s.
\li <b>core</b>: the number of cores of this host. If set, power gives the power of one core.
\li <b>availability</b>: specifies the percentage of power available.
\li <b>availability_file</b>: allows you to use a file as input. This file contains availability traces for this computer. The syntax of this file is defined below. Possible values: an absolute or relative path, with the syntax in use on your system.
\li <b>state</b>: the computer state, that is, whether the computer is ON or OFF. Possible values: "ON" or "OFF".
\li <b>state_file</b>: same mechanism as availability_file, with a similar syntax for the value.
\li <b>coordinates</b>: required if you choose the Vivaldi coordinate-based routing model for the AS the host belongs to. More details about it in the P2P coordinate-based section.

A host can also contain the <b>prop</b> tag. The prop tag allows you to define additional information on this host following the attribute/value scheme. You may want to use it to give information to the tool you use for rendering your simulation, for example.
 140
 141 <b>host example</b>
 142 \verbatim
 143    <host id="host1" power="1000000000"/>
 144    <host id="host2" power="1000000000">
 145         <prop id="color" value="blue"/>
 146         <prop id="rendershape" value="square"/>
 147    </host>
 148 \endverbatim
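
The optional attributes combine in the same way. The following sketch (host ids and values are hypothetical) declares a four-core host whose power is given per core, and a host that starts with only half of its power available. It assumes that availability is written as a fraction between 0 and 1, as the values in the availability trace files are.

\verbatim
   <!-- 4 cores, each delivering 1 Gflop/s at peak -->
   <host id="quad" power="1000000000" core="4"/>
   <!-- Half of the peak power is available (value assumed to be a fraction) -->
   <host id="busy" power="1000000000" availability="0.5"/>
\endverbatim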
 149
 150
 151 <b>Expressing dynamicity.</b>
 152 It is also possible to seamlessly declare a host whose
 153 availability changes over time using the availability_file
 154 attribute and a separate text file whose syntax is exemplified below.
 155
 156 <b>Adding a trace file</b>
 157 \verbatim
 158     <platform version="1">
 159       <host id="bob" power="500000000"
 160             availability_file="bob.trace" />
 161     </platform>
 162 \endverbatim
 163 <b>Example of "bob.trace" file</b>
 164 \verbatim
 165 PERIODICITY 1.0
 166   0.0 1.0
 167   11.0 0.5
 168   20.0 0.8
 169 \endverbatim
 170
At time 0, our host will deliver 500 Mflop/s. At time 11.0, it will
deliver half of that, i.e. 250 Mflop/s, until time 20.0, when it
will start delivering 80\% of its power, i.e. 400 Mflop/s. Finally, at
time 21.0 (20.0 plus the periodicity 1.0), we loop back to the
beginning and the host delivers 500 Mflop/s again.
 176
 177 <b>Changing initial state</b>
 178
 179 It is also possible to specify whether the host
 180 is up or down by setting the <b>state</b> attribute to either <b>ON</b>
 181 (default value) or <b>OFF</b>.
 182
<b>Making the default value "ON" explicit</b>
 184 \verbatim
 185   <platform version="1">
 186      <host id="bob"
 187            power="500000000"
 188           state="ON" />
 189   </platform>
 190 \endverbatim
 191 <b>Host switched off</b>
 192 \verbatim
 193   <platform version="1">
 194      <host id="bob"
 195            power="500000000"
 196            state="OFF" />
 197   </platform>
 198 \endverbatim
 199 <b>Expressing churn</b>
 200 To express the fact that a host can change state over time (as in P2P
 201 systems, for instance), it is possible to use a file describing the time
 202 at which the host is turned on or off. An example of the content
 203 of such a file is presented below.
 204 <b>Adding a state file</b>
 205   \verbatim
 206     <platform version="1">
 207       <host id="bob" power="500000000"
 208            state_file="bob.fail" />
 209     </platform>
 210   \endverbatim
 211 <b>Example of "bob.fail" file</b>
 212 \verbatim
 213   PERIODICITY 10.0
 214   1.0 -1.0
 215   2.0 1.0
 216 \endverbatim
 217
A negative value means <b>down</b> while a positive one means <b>up and
running</b>. From time 0.0 to time 1.0, the host is on. At time 1.0, it is
turned off, and at time 2.0 it is turned on again until time 12.0 (2.0 plus the
periodicity 10.0), where the trace loops back to its beginning. The host is
thus turned off again at time 13.0, and so on.
 223
 224
 225
 226 \subsubsection pf_cluster cluster
A <b>cluster</b> represents a cluster. It is mostly used when you want to define a bunch of machines quickly. Note that cluster is a meta-tag: <b>from the inner SimGrid point of view, a cluster is an AS where some optimized routing is defined</b>. The default inner organization of the cluster is as follows:
 228 \verbatim
 229                  _________
 230                 |          |
 231                 |  router  |
 232     ____________|__________|_____________ backbone
 233       |   |   |              |     |   |
 234     l0| l1| l2|           l97| l96 |   | l99
 235       |   |   |   ........   |     |   |
 236       |                                |
 237     c-0.me                             c-99.me
 238 \endverbatim
 239
You have a set of <b>hosts</b> defined. Each of them has a <b>link</b> to a central backbone (the backbone is a <b>link</b> itself, as a link can be used to represent a switch; see the <b>link</b> section below for more details). A <b>router</b> connects the <b>cluster</b> to the outside world. Internally, a cluster is thus an AS containing all hosts, and the router is the default gateway for the cluster.

There is an alternative organization, which is as follows:
 243 \verbatim
 244                  _________
 245                 |          |
 246                 |  router  |
 247                 |__________|
 248                     / | \
 249                    /  |  \
 250                l0 / l1|   \l2
 251                  /    |    \
 252                 /     |     \
 253             host0   host1   host2
 254 \endverbatim
 255
The principle is the same, except that there is no backbone. To obtain it, simply leave the bb_* attributes unset.
 257
 258
 259
<b>cluster</b> attributes :
\li <b>id (mandatory)</b>: the identifier of the cluster to be used when referring to it.
\li <b>prefix (mandatory)</b>: each node of the cluster has to have a name. This is the prefix of that name.
\li <b>suffix (mandatory)</b>: the suffix of the node names.
\li <b>radical (mandatory)</b>: expression used to generate the names of the cluster nodes. The syntax is quite common: "10-20" gives you 11 machines numbered from 10 to 20, and "10-20;2" gives you 12 machines, one with the number 2 and the others numbered as before. The produced number is inserted between the prefix and the suffix to form the machine names.
\li <b>power (mandatory)</b>: same as <b>host</b> power.
\li <b>core</b>: same as <b>host</b> core.
\li <b>bw (mandatory)</b>: bandwidth for the links between nodes and backbone (if any). See <b>link</b> section for syntax/details.
\li <b>lat (mandatory)</b>: latency for the links between nodes and backbone (if any). See <b>link</b> section for syntax/details.
\li <b>sharing_policy</b>: sharing policy for the links between nodes and backbone (if any). See <b>link</b> section for syntax/details.
\li <b>bb_bw</b>: bandwidth for the backbone (if any). See <b>link</b> section for syntax/details. If both bb_* attributes are omitted, no backbone is created (alternative cluster architecture described before).
\li <b>bb_lat</b>: latency for the backbone (if any). See <b>link</b> section for syntax/details. If both bb_* attributes are omitted, no backbone is created (alternative cluster architecture described before).
\li <b>bb_sharing_policy</b>: sharing policy for the backbone (if any). See <b>link</b> section for syntax/details.
\li <b>availability_file</b>: allows you to use a file as input for availability. Similar to the <b>host</b> attribute.
\li <b>state_file</b>: allows you to use a file as input for states. Similar to the <b>host</b> attribute.
 275
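The router name is the string produced by the following Java-like expression: router_name = prefix + id + "_router" + suffix; this is the form used by the gateways in the ASroute examples of this document (for instance c-my_cluster_1_router.me).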
 277
 278
 279 <b>cluster example</b>
 280 \verbatim
 281 <cluster id="my_cluster_1" prefix="" suffix=""
 282                 radical="0-262144"      power="1000000000"    bw="125000000"     lat="5E-5"/>
 283 <cluster id="my_cluster_1" prefix="c-" suffix=".me"
 284                 radical="0-99"  power="1000000000"    bw="125000000"     lat="5E-5"
 285         bb_bw="2250000000" bb_lat="5E-4"/>
 286 \endverbatim
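
To make the naming scheme concrete, here is a small hypothetical cluster together with the names it is expected to generate according to the rules above. The router name assumes the pattern visible in the gateways of the ASroute examples of this document (prefix, then cluster id, then "_router", then suffix).

\verbatim
<!-- Generates hosts node-0.lab, node-1.lab, node-2.lab and node-3.lab.
     Expected router name: node-small_cluster_router.lab
     No bb_* attribute is given, so no backbone is created. -->
<cluster id="small_cluster" prefix="node-" suffix=".lab"
         radical="0-3" power="1000000000" bw="125000000" lat="5E-5"/>
\endverbatim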
 287
 288 \subsubsection pf_peer peer
A <b>peer</b> represents a peer, as in Peer-to-Peer (P2P). Basically, like a cluster, <b>A PEER IS INTERNALLY INTERPRETED AS AN \<AS\></b>. It is just a shortcut that does the following:
\li It creates a host that has coordinates.
\li It creates two links: one for download and one for upload. This is convenient to simulate platforms following the last-mile model (such as ADSL peers).

<b>peer</b> attributes :
\li <b>id (mandatory)</b>: the identifier of the peer to be used when referring to it.
\li <b>power (mandatory)</b>: the power of the peer. Same as <b>host</b> power.
\li <b>bw_in (mandatory)</b>: bandwidth of the download link.
\li <b>bw_out (mandatory)</b>: bandwidth of the upload link.
\li <b>lat (mandatory)</b>: latency of the links.
\li <b>coordinates</b>: coordinates of the peer. Same as <b>host</b> coordinates.
\li <b>sharing_policy</b>: sharing policy for the links. Can be SHARED or FULLDUPLEX; FULLDUPLEX is the default. See <b>link</b> description for details.
\li <b>availability_file</b>: availability file for the peer. Same as host availability file. See <b>host</b> description for details.
\li <b>state_file</b>: state file for the peer. Same as host state file. See <b>host</b> description for details.
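
As an illustration, an ADSL-like peer could be sketched as follows. The ids and values are hypothetical. The sketch also assumes that bandwidths and latency use the same units as for links (bytes per second and seconds), and that coordinates are written as three space-separated numbers; neither point is stated in this section.

\verbatim
<AS id="p2p" routing="Vivaldi">
  <!-- 1 MB/s download link, 128 kB/s upload link, 5 ms latency -->
  <peer id="peer1" power="1000000000"
        bw_in="1000000" bw_out="128000" lat="5E-3"
        coordinates="25.5 9.4 1.4"/>
</AS>
\endverbatim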
 303
\subsection pf_ne Network equipment: links and routers

You basically have two entities available to represent network equipment:
\li <b>link</b>: represents something that has a limited bandwidth and a latency, and that is shared the way TCP shares bandwidth. <b>LINKS ARE NOT EDGES BUT HYPEREDGES</b>: more than 2 pieces of equipment can be connected to a single link.
\li <b>router</b>: represents something that a message can be routed to, but that does not run any code and has no influence on performance (no bandwidth, no latency, nothing). <b>ROUTERS ARE ENTITIES (ALMOST) IGNORED BY THE SIMULATOR ONCE THE SIMULATION HAS BEGUN</b>. If you want to represent something like a switch, you must use a <b>link</b> (see the section below). Routers are used to run routing algorithms and determine routes (see the routing section for details).

Let's take a deeper look at these entities.
 311
 312 \subsubsection pf_router router
As said before, a <b>router</b> is only used to give some information to the routing algorithms. So it does not have any attributes except:

<b>router</b> attributes :
\li <b>id (mandatory)</b>: the identifier of the router to be used when referring to it.
\li <b>coordinates</b>: required if you choose the Vivaldi coordinate-based routing model for the AS the router belongs to. More details about it in the P2P coordinate-based section.
 318
 319
 320 <b>router example</b>
 321 \verbatim
 322  <router id="gw_dc1_horizdist"/>
 323 \endverbatim
 324
 325 \subsubsection pf_link link
Network links can represent one-hop network connections. They are characterized by their id and their bandwidth.
The latency is optional, with a default value of 0.0. For instance, we can declare a network link named LINK1
with a bandwidth of 1 Gb/s (125,000,000 B/s) and a latency of 50 µs.
Example link:
 330 \verbatim
 331  <link id="LINK1" bandwidth="125000000" latency="5E-5"/>
 332 \endverbatim
 333 <b>Expressing sharing policy</b>
 334
By default, a network link is SHARED: if more than one flow goes through
a link, each gets a share of the available bandwidth, similar to the share TCP connections would get.

Conversely, if a link is defined as a FATPIPE, each flow going through this link gets all the available bandwidth, whatever the number of flows. The FATPIPE
behavior makes it possible to describe big backbones that do not affect performance (except for latency). Finally, a link can be declared FULLDUPLEX, in which case the two directions of the link do not share bandwidth.
 340
 341 \verbatim
 342  <link id="SWITCH" bandwidth="125000000" latency="5E-5" sharing_policy="FATPIPE" />
 343 \endverbatim
 344
 345 <b>Expressing dynamicity and failures</b>
 346
As for hosts, it is possible to declare links whose state, bandwidth or latency changes over time. In this case, the bandwidth and latency attributes give the initial values, and the bandwidth_file and latency_file attributes point to the text files describing how these values evolve.
 348
 349 \verbatim
 350  <link id="LINK1" state_file="link1.fail" bandwidth="80000000" latency=".0001" bandwidth_file="link1.bw" latency_file="link1.lat" />
 351 \endverbatim
 352
Note that even if the syntax is the same, the semantics of bandwidth and latency trace files
differ from those of host availability files. Those files do not express availability as a fraction of the available
capacity but directly in bytes per second for the bandwidth and in seconds for the latency. This is because
most tools that capture traces on real platforms (such as NWS) express their results this way.
 357
 358 <b>Example of "link1.bw" file</b>
 359 \verbatim
PERIODICITY 12.0
4.0 40000000
8.0 60000000
 364 \endverbatim
 365 <b>Example of "link1.lat" file</b>
 366 \verbatim
PERIODICITY 5.0
1.0 0.001
2.0 0.01
3.0 0.001
 371 \endverbatim
In this example, the bandwidth varies with a period of 12 seconds while the latency varies with a period of
5 seconds. At the beginning of the simulation, the link's bandwidth is 80,000,000 B/s (i.e., 80 MB/s). After four
seconds, it drops to 40 MB/s, and climbs back to 60 MB/s after eight seconds. It stays that way until second
12 (i.e., until the end of the period), at which point it loops (seconds 12-16 experience 80 MB/s,
16-20 40 MB/s, and so on). In the meantime, the latency values are 100 µs (initial value) on the [0, 1[ time
interval, 1 ms on [1, 2[, 10 ms on [2, 3[ and 1 ms on [3, 5[ (i.e., until the end of the period). It then loops back, starting
at 100 µs for one second.
 379
<b>link</b> attributes :
\li <b>id (mandatory)</b>: the identifier of the link to be used when referring to it.
\li <b>bandwidth (mandatory)</b>: bandwidth for the link.
\li <b>latency</b>: latency for the link. Default is 0.0.
\li <b>sharing_policy</b>: sharing policy for the link.
\li <b>state</b>: allows you to set the link as ON or OFF. Default is ON.
\li <b>bandwidth_file</b>: allows you to use a file as input for bandwidth.
\li <b>latency_file</b>: allows you to use a file as input for latency.
\li <b>state_file</b>: allows you to use a file as input for states.

Like a host, a <b>link</b> tag can also contain the <b>prop</b> tag.
 391
 392 <b>link example</b>
 393 \verbatim
 394    <link id="link1" bandwidth="125000000" latency="0.000100"/>
 395 \endverbatim
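
A link carrying a property can be sketched by mirroring the host example with <b>prop</b> given earlier; the link id and the property below are hypothetical.

\verbatim
   <link id="link2" bandwidth="125000000" latency="0.000100">
        <prop id="color" value="red"/>
   </link>
\endverbatim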
 396
 397
 398 \subsection pf_storage Storage
 399
At the time of writing, a storage prototype has been implemented. As it is not stable yet, there is no documentation for it, sorry.
 401
 402 \section pf_routing Routing
 403
In order to run fast, SimGrid uses static routing. By static, we mean that routes are computed once and do not change during execution. We chose this because it is rare for a resource to really fail; most of the time, a communication fails because the links are overloaded, so your connection stops before the timeout, or because the computer at the other end is not answering.

We also chose to use shortest-path algorithms to emulate routing. This is consistent with reality: RIP, OSPF and BGP all compute shortest paths. They have some convergence time, but in the end, once the platform is stable (and this should be the moment you want to simulate something with SimGrid), your packets follow the shortest paths.
 407
 408 \subsection pf_rm Routing models
 409
Within each AS, you have to define a routing model to use. There are basically 3 main kinds of routing models:
\li Shortest-path based models: you let SimGrid compute the shortest paths and manage them. This behaves more or less like most real-life routing.
\li Manually-entered route models: you have to define all routes by yourself in the platform description file. This is consistent with some manually managed real-life routing.
\li Simple/fast models: those models offer fast, low-memory routing algorithms. You should consider using them if you can make some assumptions about your AS. Routing in this case is more or less ignored.
 414
 415 \subsubsection pf_raf The router affair
 416
Expressing routers becomes mandatory when using shortest-path based models, or when using ns-3 or the bindings to the GTNetS packet-level simulator instead of the native analytical network model implemented in SimGrid.

For graph-based shortest-path algorithms, routers are mandatory because these algorithms need a graph, and so we need a source and a destination for each edge.

Routers are naturally an important concept in GTNetS or ns-3, since the way they run the packet routing algorithms is actually simulated. SimGrid's
analytical models instead aggregate the routing time with the transfer time.
Rebuilding a graph representation from the route information alone turns out to be a very difficult task, because
of the missing information about how routes intersect. That is why we introduced a \<router\> tag, which is
simply used to express these intersection points. The only mandatory attribute of this tag is an id.
It is important to understand that the \<router\> tag is only used to provide topological information.

To express this topological information, some <b>route</b> tags have to be defined, saying which link lies between which routers. The description of the route syntax is given below, as well as examples for the different models.
 429
 430 \subsubsection pf_rm_sh Shortest-path based models
 431
Here is the complete list of such models, which compute routes using classic shortest-path algorithms. How to choose the best-suited algorithm is discussed later in the section devoted to it.
 433 \li <b>Floyd</b>: Floyd routing data
 434 \li <b>Dijkstra</b>: Dijkstra routing data
 435 \li <b>DijkstraCache</b>: Dijkstra routing data
 436
 437 Floyd example :
 438 \verbatim
 439 <AS  id="AS0"  routing="Floyd">
 440
 441   <cluster id="my_cluster_1" prefix="c-" suffix=""
 442                 radical="0-1"   power="1000000000"    bw="125000000"     lat="5E-5"
 443         router_id="router1"/>
 444
 445  <AS id="AS1" routing="none">
 446     <host id="host1" power="1000000000"/>
 447  </AS>
 448
 449   <link id="link1" bandwidth="100000" latency="0.01"/>
 450
 451   <ASroute src="my_cluster_1" dst="AS1"
 452     gw_src="router1"
 453     gw_dst="host1">
 454     <link_ctn id="link1"/>
 455   </ASroute>
 456
 457 </AS>
 458 \endverbatim
The ASroute given at the end provides topological information: link1 lies between router1 and host1.
 460
 461
Dijkstra example :
 463 \verbatim
 464 XXX?
 465 \endverbatim
 466
DijkstraCache example :
 468 \verbatim
 469 XXX?
 470 \endverbatim
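
The following is a minimal sketch of what a Dijkstra-based AS could look like, built only from the tags described in this document. The ids and values are hypothetical and the fragment has not been validated against the DTD. Each route contains a single link and therefore describes one edge of the graph; the multi-hop path between the two hosts is left for SimGrid to compute. Assuming that DijkstraCache takes the same input, only the value of the routing attribute would change.

\verbatim
<AS id="AS0" routing="Dijkstra">
  <host id="host1" power="1000000000"/>
  <host id="host2" power="1000000000"/>
  <router id="router1"/>

  <link id="link1" bandwidth="125000000" latency="0.000100"/>
  <link id="link2" bandwidth="125000000" latency="0.000100"/>

  <!-- One edge of the graph: link1 lies between host1 and router1 -->
  <route src="host1" dst="router1"><link_ctn id="link1"/></route>
  <!-- Another edge: link2 lies between router1 and host2 -->
  <route src="router1" dst="host2"><link_ctn id="link2"/></route>
  <!-- No host1 to host2 route is declared: it is computed from the edges -->
</AS>
\endverbatim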
 471
\subsubsection pf_rm_me Manually-entered route models
 473
\li <b>Full</b>: you have to enter all necessary routes manually.
\li <b>RuleBased</b>: rule-based routing data; same as Full, except that you can use regular expressions to express routes. As SimGrid has to evaluate the regular expressions, it is slower than Full, but it requires less memory. The regexp syntax is that of <a href="http://www.pcre.org">pcre</a>, as this is the library SimGrid uses to evaluate them.
 476
 477 Full example :
 478 \verbatim
 479 <AS  id="AS0"  routing="Full">
 480    <host id="host1" power="1000000000"/>
 481    <host id="host2" power="1000000000"/>
 482    <link id="link1" bandwidth="125000000" latency="0.000100"/>
 483    <route src="host1" dst="host2"><link_ctn id="link1"/></route>
 484  </AS>
 485 \endverbatim
 486
 487 RuleBased example :
 488 \verbatim
 489 <AS id="AS_orsay" routing="RuleBased" >
 490                         <cluster id="AS_gdx" prefix="gdx-" suffix=".orsay.grid5000.fr"
 491                                 radical="1-310" power="4.7153E9" bw="1.25E8" lat="1.0E-4"
 492                                 bb_bw="1.25E9" bb_lat="1.0E-4"></cluster>
 493                         <link   id="link_gdx" bandwidth="1.25E9" latency="1.0E-4"/>
 494
 495                         <cluster id="AS_netgdx" prefix="netgdx-" suffix=".orsay.grid5000.fr"
 496                                 radical="1-30" power="4.7144E9" bw="1.25E8" lat="1.0E-4"
 497                                 bb_bw="1.25E9" bb_lat="1.0E-4"></cluster>
 498                         <link   id="link_netgdx" bandwidth="1.25E9" latency="1.0E-4"/>
 499
 500                         <AS id="gw_AS_orsay" routing="Full">
 501                                 <router id="gw_orsay"/>
 502                         </AS>
 503                         <link   id="link_gw_orsay" bandwidth="1.25E9" latency="1.0E-4"/>
 504
 505                         <ASroute src="^AS_(.*)$" dst="^AS_(.*)$"
 506                                 gw_src="$1src-AS_$1src_router.orsay.grid5000.fr"
 507                                 gw_dst="$1dst-AS_$1dst_router.orsay.grid5000.fr"
 508                                 symmetrical="YES">
 509                                         <link_ctn id="link_$1src"/>
 510                                         <link_ctn id="link_$1dst"/>
 511                         </ASroute>
 512
 513                         <ASroute src="^AS_(.*)$" dst="^gw_AS_(.*)$"
 514                                 gw_src="$1src-AS_$1src_router.orsay.grid5000.fr"
 515                                 gw_dst="gw_$1dst"
 516                                 symmetrical="NO">
 517                                         <link_ctn id="link_$1src"/>
 518                         </ASroute>
 519
 520                         <ASroute src="^gw_AS_(.*)$" dst="^AS_(.*)$"
 521                                 gw_src="gw_$1src"
 522                                 gw_dst="$1dst-AS_$1dst_router.orsay.grid5000.fr"
 523                                 symmetrical="NO">
 524                                         <link_ctn id="link_$1dst"/>
 525                         </ASroute>
 526
 527                 </AS>
 528 \endverbatim
 529
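The example above contains $1 references. They are evaluated as follows: $1src is replaced by the text captured by the first parenthesized group of the src regexp, and $1dst by the text captured by the first parenthesized group of the dst regexp.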
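
As a worked example, and assuming that $1src and $1dst stand for the first captured group of the src and dst regexps respectively, the first ASroute of the RuleBased example would expand as follows when the source is AS_gdx and the destination is AS_netgdx. The expansion is written by hand for illustration; it is not output produced by SimGrid.

\verbatim
<!-- "AS_gdx"    matches ^AS_(.*)$ and gives $1src = "gdx"
     "AS_netgdx" matches ^AS_(.*)$ and gives $1dst = "netgdx" -->
<ASroute src="AS_gdx" dst="AS_netgdx"
         gw_src="gdx-AS_gdx_router.orsay.grid5000.fr"
         gw_dst="netgdx-AS_netgdx_router.orsay.grid5000.fr"
         symmetrical="YES">
    <link_ctn id="link_gdx"/>
    <link_ctn id="link_netgdx"/>
</ASroute>
\endverbatim

The two gateway names are the router names of the gdx and netgdx clusters declared in that example.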
 531
\subsubsection pf_rm_sf Simple/fast models
 533
 534 \li <b>none</b>: No routing (usable with Constant network only)
 535 None Example :
 536 \verbatim
 537 XXX?
 538 \endverbatim
 539
\li <b>Vivaldi</b>: Vivaldi routing, to be used when you want to use coordinates. See the corresponding P2P section below for details.
\li <b>Cluster</b>: Cluster routing, specific to the cluster tag; it should not be used except internally.
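
For the <b>none</b> model, a minimal sketch could be an AS that declares only its hosts, with no link and no route. The ids are hypothetical, and as stated in the list above this is only meant to be used with the Constant network model.

\verbatim
<AS id="AS0" routing="none">
   <host id="host1" power="1000000000"/>
   <host id="host2" power="1000000000"/>
</AS>
\endverbatim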
 542
 543 \subsection pf_asro ASroute
 544
 545 The purpose of the ASroute tag is to let you write the routes between AS by hand. It is useful when you use the Full or RuleBased model.
 546
 547 <b>ASroute</b> attributes :
 548 \li <b>src (mandatory)</b>: the source AS id.
 549 \li <b>dst (mandatory)</b>: the destination AS id.
 550 \li <b>gw_src (mandatory)</b>: the gateway to be used within the source AS. Can be any <b>host</b> or \b router defined in the \b src AS or in one of the AS it includes.
 551 \li <b>gw_dst (mandatory)</b>: the gateway to be used within the destination AS. Can be any <b>host</b> or \b router defined in the \b dst AS or in one of the AS it includes.
 552 \li <b>symmetrical</b>: if the route is symmetric, the reverse route will be the opposite of the one defined. Can be either YES or NO; the default is YES.
 553
 554 <b>Example of ASroute with RuleBased</b>
 555 \verbatim
 556 <ASroute src="^gw_AS_(.*)$" dst="^AS_(.*)$"
 557          gw_src="gw_$1src"
 558          gw_dst="$1dst-AS_$1dst_router.orsay.grid5000.fr"
 559          symmetrical="NO">
 560   <link_ctn id="link_$1dst"/>
 561 </ASroute>
 562 \endverbatim
 563 <b>Example of ASroute with Full</b>
 564 \verbatim
 565 <AS id="AS0" routing="Full">
 566   <cluster id="my_cluster_1" prefix="c-" suffix=".me"
 567            radical="0-149" power="1000000000" bw="125000000" lat="5E-5"
 568            bb_bw="2250000000" bb_lat="5E-4"/>
 569
 570   <cluster id="my_cluster_2" prefix="c-" suffix=".me"
 571            radical="150-299" power="1000000000" bw="125000000" lat="5E-5"
 572            bb_bw="2250000000" bb_lat="5E-4"/>
 573
 574   <link id="backbone" bandwidth="1250000000" latency="5E-4"/>
 575
 576   <ASroute src="my_cluster_1" dst="my_cluster_2"
 577            gw_src="c-my_cluster_1_router.me"
 578            gw_dst="c-my_cluster_2_router.me">
 579     <link_ctn id="backbone"/>
 580   </ASroute>
 581   <ASroute src="my_cluster_2" dst="my_cluster_1"
 582            gw_src="c-my_cluster_2_router.me"
 583            gw_dst="c-my_cluster_1_router.me">
 584     <link_ctn id="backbone"/>
 585   </ASroute>
 586 </AS>
 587 \endverbatim
 588
 589 \subsection pf_ro route
 590 The principle is the same as for ASroute: <b>route</b> contains the list of links that are on the path between src and dst, except that src and dst can each be either a <b>host</b> or a \b router. It is useful for Full and RuleBased, as well as for the shortest-path based models, where you have to give topological information.
 591
 592
 593 <b>route</b> attributes :
 594 \li <b>src (mandatory)</b>: the source id.
 595 \li <b>dst (mandatory)</b>: the destination id.
 596 \li <b>symmetrical</b>: if the route is symmetric, the reverse route will be the opposite of the one defined. Can be either YES or NO; the default is YES.
 597
 598 <b>route example in Full</b>
 599 \verbatim
 600 <route src="Tremblay" dst="Bourassa">
 601   <link_ctn id="4"/><link_ctn id="3"/><link_ctn id="2"/><link_ctn id="0"/><link_ctn id="1"/><link_ctn id="6"/><link_ctn id="7"/>
 602 </route>
 603 \endverbatim
 604
 605 <b>route example in a shortest-path model</b>
 606 \verbatim
 607 <route src="Tremblay" dst="Bourassa">
 608   <link_ctn id="3"/>
 609 </route>
 610 \endverbatim
 611 Note that when using route to give topological information, each route must contain exactly one link, as SimGrid needs to know which hosts are at the ends of the link.
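
To illustrate the symmetrical attribute described above, here is a minimal sketch for the Full model. The link ids are hypothetical, and the equivalence shown is simply our reading of "the reverse route will be the opposite of the one defined", that is, the same links taken in reverse order:

\verbatim
<!-- With symmetrical="YES" (the default), this single declaration... -->
<route src="Tremblay" dst="Bourassa">
  <link_ctn id="4"/><link_ctn id="3"/><link_ctn id="2"/>
</route>

<!-- ...is expected to be equivalent to these two one-way declarations. -->
<route src="Tremblay" dst="Bourassa" symmetrical="NO">
  <link_ctn id="4"/><link_ctn id="3"/><link_ctn id="2"/>
</route>
<route src="Bourassa" dst="Tremblay" symmetrical="NO">
  <link_ctn id="2"/><link_ctn id="3"/><link_ctn id="4"/>
</route>
\endverbatim

Use one form or the other, not both in the same platform file.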
 612
 613 \subsection pf_byro bypassRoute
 614
 615 As said before, once you choose a model, it computes the routes for you (if it is able to). But maybe you want to define some specific routes yourself. You may also want to bypass, at an upper level, some routes defined in a lower-level AS: <b>bypassRoute</b> is the tag you're looking for. It allows you to bypass routes that are already defined between AS (if you want to bypass route for a specific host, you should just XXX?). The principle is the same as for ASroute: <b>bypassRoute</b> contains the list of links that are on the path between src and dst.
 616
 617 <b>bypassRoute</b> attributes :
 618 \li <b>src (mandatory)</b>: the source AS id.
 619 \li <b>dst (mandatory)</b>: the destination AS id.
 620 \li <b>gw_src (mandatory)</b>: the gateway to be used within the source AS. Can be any <b>host</b> or \b router defined in the \b src AS or in one of the AS it includes.
 621 \li <b>gw_dst (mandatory)</b>: the gateway to be used within the destination AS. Can be any <b>host</b> or \b router defined in the \b dst AS or in one of the AS it includes.
 622 \li <b>symmetrical</b>: if the route is symmetric, the reverse route will be the opposite of the one defined. Can be either YES or NO; the default is YES.
 623
 624 <b>bypassRoute Example</b>
 625 \verbatim
 626
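<!-- Illustrative sketch only, built from the attribute list above.
     The cluster ids and gateway names reuse the Full example of the
     ASroute section; "link_tmp" is a hypothetical link assumed to be
     declared in the enclosing AS. -->
<bypassRoute src="my_cluster_1" dst="my_cluster_2"
             gw_src="c-my_cluster_1_router.me"
             gw_dst="c-my_cluster_2_router.me">
  <link_ctn id="link_tmp"/>
</bypassRoute>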
 627 \endverbatim
 628
 629 \subsection pb_baroex Basic Routing Example
 630
 631 Let's say you have an AS named AS_Big that contains two other AS, AS_1 and AS_2. If you want a host (h1) from AS_1 to communicate with another one (h2) from AS_2, and you have not chosen a routing model that computes routes automatically, then you have to proceed as follows:
 632 \li First, you have to ensure that a route is defined from h1 to the exit gateway of AS_1 and from h2 to the exit gateway of AS_2.
 633 \li Then, you have to define a route from AS_1 to AS_2. As both AS are resources belonging to AS_Big, this has to be done at the AS_Big level. To define such a route, you have to give the source AS (AS_1), the destination AS (AS_2), and their respective gateways (as the route is effectively defined between those two entry/exit points). The elements of this route can only be elements belonging to AS_Big, so the links and routers of this route should be defined inside AS_Big.
 634
 635 As said before, there are mainly 2 tags for routing:
 636 \li <b>ASroute</b>: to define routes between two <b>AS</b>
 637 \li <b>route</b>: to define routes between two <b>host/router</b>
 638
 639 As we are dealing with routes between AS, some definitions have to be made at the AS_Big level. Let's consider that AS_1 and AS_2 each contain one host, one link and one router.
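
The steps above can be sketched as follows. This is only a minimal illustration: the ids h1, h2, l1, l2, r1, r2 and backbone, as well as all power, bandwidth and latency values, are placeholders chosen for the example, and Full routing is assumed at every level.

\verbatim
<AS id="AS_Big" routing="Full">
  <AS id="AS_1" routing="Full">
    <host id="h1" power="1000000000"/>
    <link id="l1" bandwidth="125000000" latency="5E-5"/>
    <router id="r1"/>
    <!-- Route from h1 to the exit gateway of AS_1 -->
    <route src="h1" dst="r1"><link_ctn id="l1"/></route>
  </AS>

  <AS id="AS_2" routing="Full">
    <host id="h2" power="1000000000"/>
    <link id="l2" bandwidth="125000000" latency="5E-5"/>
    <router id="r2"/>
    <!-- Route from h2 to the exit gateway of AS_2 -->
    <route src="h2" dst="r2"><link_ctn id="l2"/></route>
  </AS>

  <!-- The link used between the two AS belongs to AS_Big -->
  <link id="backbone" bandwidth="1250000000" latency="5E-4"/>

  <!-- Route between the two AS, defined at the AS_Big level -->
  <ASroute src="AS_1" dst="AS_2" gw_src="r1" gw_dst="r2">
    <link_ctn id="backbone"/>
  </ASroute>
</AS>
\endverbatim

With such a description, a communication from h1 to h2 would be expected to go through l1, backbone and l2.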
 640
 641 \section pf_other_tags Tags not (directly) describing the platform
 642
 643 There are 3 tags that you can use inside a \<platform\> tag and that do not describe the platform:
 644 \li random: it allows you to define random generators you want to use for your simulation.
 645 \li config: it allows you to pass some configuration stuff like, for example, the network model and so on. It follows the
 646 \li include: simply allows you to include another file into the current one.
 647
 648 \subsection pf_conf config
 649 <b>config</b> attributes :
 650 \li <b>id (mandatory)</b>: the identifier of the config to be used when referring to it.
 651
 652
 653 The only purpose of the <b>config</b> tag is to contain <b>prop</b> tags. The valid ids for these <b>prop</b> tags are basically the same as the parameters you can pass on the command line, except that "/" is used for namespace definition.
 654
 655
 656 <b>config example</b>
 657 \verbatim
 658 <?xml version='1.0'?>
 659 <!DOCTYPE platform SYSTEM "http://simgrid.gforge.inria.fr/simgrid.dtd">
 660 <platform version="3">
 661 <config id="General">
 662         <prop id="maxmin/precision" value="0.000010"></prop>
 663         <prop id="cpu/optim" value="TI"></prop>
 664         <prop id="workstation/model" value="compound"></prop>
 665         <prop id="network/model" value="SMPI"></prop>
 666         <prop id="path" value="~/"></prop>
 667         <prop id="smpi/bw_factor" value="65472:0.940694;15424:0.697866;9376:0.58729"></prop>
 668 </config>
 669
 670 <AS  id="AS0"  routing="Full">
 671 ...
 672 \endverbatim
 673
 674
 675 \subsection pf_rand random
 676 Not yet in use XXX?
 677
 678 \subsection pf_incl include
 679 Not yet in use XXX?
 680
 681 \section pf_hints Hints and tips, or how to write a platform efficiently
 682
 683 Now you should know at least the syntax and be able to create a platform. However, having written some platforms ourselves, we found some best practices you should pay attention to in order to produce good platforms, and some choices you can make in order to get faster simulations. Here are some hints and tips.
 684
 685 \subsection pf_as_h AS Hierarchy
 686 The AS design allows SimGrid to go fast, because route computation is done only for the set of resources defined in a given AS. If you use only one big AS containing all resources, with no AS inside it, and you use the Full model, then you lose all the benefit of it. On the other hand, if you design a binary tree of AS with only one host at the lowest level, you also lose all the good the AS hierarchy can give you. Remember that you should always be "reasonable" in your platform definition when choosing the hierarchy. A good choice, if you try to describe a real-life platform, is to follow the AS that exist in reality, since this kind of trade-off works well for real-life platforms.
 687
 688 \subsection pf_exit_as Exit AS: why and how
 689 Users who have looked at some of our platforms may have noticed a non-intuitive schema, something like this:
 690
 691
 692 \verbatim
 693 <AS id="AS_4"  routing="Full">
 694 <AS id="exitAS_4"  routing="Full">
 695         <router id="router_4"/>
 696 </AS>
 697 <cluster id="cl_4_1" prefix="c_4_1-" suffix="" radical="1-20" power="1000000000" bw="125000000" lat="5E-5" bb_bw="2250000000" bb_lat="5E-4"/>
 698 <cluster id="cl_4_2" prefix="c_4_2-" suffix="" radical="1-20" power="1000000000" bw="125000000" lat="5E-5" bb_bw="2250000000" bb_lat="5E-4"/>
 699 <link id="4_1" bandwidth="2250000000" latency="5E-5"/>
 700 <link id="4_2" bandwidth="2250000000" latency="5E-5"/>
 701 <link id="bb_4" bandwidth="2250000000" latency="5E-4"/>
 702 <ASroute src="cl_4_1"
 703         dst="cl_4_2"
 704         gw_src="c_4_1-cl_4_1_router"
 705         gw_dst="c_4_2-cl_4_2_router"
 706         symmetrical="YES">
 707                 <link_ctn id="4_1"/>
 708                 <link_ctn id="bb_4"/>
 709                 <link_ctn id="4_2"/>
 710 </ASroute>
 711 <ASroute src="cl_4_1"
 712         dst="exitAS_4"
 713         gw_src="c_4_1-cl_4_1_router"
 714         gw_dst="router_4"
 715         symmetrical="YES">
 716                 <link_ctn id="4_1"/>
 717                 <link_ctn id="bb_4"/>
 718 </ASroute>
 719 <ASroute src="cl_4_2"
 720         dst="exitAS_4"
 721         gw_src="c_4_2-cl_4_2_router"
 722         gw_dst="router_4"
 723         symmetrical="YES">
 724                 <link_ctn id="4_2"/>
 725                 <link_ctn id="bb_4"/>
 726 </ASroute>
 727 </AS>
 728 \endverbatim
 729
 730 In AS_4, an exitAS_4 is defined, containing only one router, and routes are defined to that AS from all other AS (a cluster is only a shortcut for an AS, see the cluster description for details). If there were an upper AS, it would define routes to and from AS_4 with router_4 as the gateway. The reason is that we do not allow (for performance reasons) routes from an AS to a single host/router: when your AS includes other AS, you have to enclose your gateway within an AS in order to define routes to it.
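
As a sketch of what such an upper AS could look like, assume a second AS named AS_5 built on the same pattern as AS_4, with its own exitAS_5 containing router_5. The names AS_Big, AS_5, router_5 and backbone, as well as the link values, are hypothetical and only serve the illustration:

\verbatim
<AS id="AS_Big" routing="Full">
  <!-- AS_4 as defined above, and a hypothetical AS_5 following the
       same pattern (exitAS_5 containing router_5), go here -->
  ...
  <link id="backbone" bandwidth="1250000000" latency="5E-4"/>

  <!-- The upper AS reaches AS_4 and AS_5 through their exit routers -->
  <ASroute src="AS_4" dst="AS_5" gw_src="router_4" gw_dst="router_5">
    <link_ctn id="backbone"/>
  </ASroute>
</AS>
\endverbatim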
 731
 732
 733 \subsection pf_P2P_tags P2P or how to use coordinates
 734 SimGrid allows you to use a coordinate-based system, like Vivaldi, to describe a platform. The main concept is that you have some peers that are located somewhere: this is the role of the <b>coordinates</b> attribute of the \<peer\> or \<host\> tag. There's nothing complicated in using it; here is an example:
 735
 736 \verbatim
 737 <?xml version='1.0'?>
 738 <!DOCTYPE platform SYSTEM "http://simgrid.gforge.inria.fr/simgrid.dtd">
 739 <platform version="3">
 740
 741 <config id="General">
 742         <prop id="network/coordinates" value="yes"></prop>
 743 </config>
 744  <AS  id="AS0"  routing="Vivaldi">
 745         <host id="100030591" coordinates="25.5 9.4 1.4" power="1500000000.0" />
 746         <host id="100036570" coordinates="-12.7 -9.9 2.1" power="730000000.0" />
 747         ...
 748         <host id="100429957" coordinates="17.5 6.7 18.8" power="830000000.0" />
 749         </AS>
 750 </platform>
 751 \endverbatim
 752
 753
 754 \subsection pf_wisely Choosing wisely the routing model to use
 755
 756
 757 Choosing the routing model wisely can significantly speed up your simulation, save you time when writing the platform, and save tremendous disk space. Here is the list of available models and their characteristics (lookup: time to resolve a route):
 758
 759 \li <b>Full</b>: Full routing data (fast, large memory requirements, fully expressive)
 760 \li <b>Floyd</b>: Floyd routing data (slow initialization, fast lookup, lesser memory requirements, shortest path routing only)
 761 \li <b>Dijkstra</b>: Dijkstra routing data (fast initialization, slow lookup, small memory requirements, shortest path routing only)
 762 \li <b>DijkstraCache</b>: Dijkstra routing data (fast initialization, fast lookup, small memory requirements, shortest path routing only)
 763 \li <b>none</b>: No routing (usable with Constant network only)
 764 \li <b>RuleBased</b>: Rule-Based routing data (...)
 765 \li <b>Vivaldi</b>: Vivaldi routing, for when you want to use coordinates
 766 \li <b>Cluster</b>: Cluster routing, specific to cluster tag, should not be used.
 767
 768
 769
 770 \subsection pf_switch Hey, I want to describe a switch but there is no switch tag !
 771
 772 Indeed, we did not include a switch tag. But when you simulate a switch with a fluid model (and SimGrid uses a fluid model unless you activate GTNetS or ns-3 mode), its only major impact is the upper limit of the switch motherboard speed, which will eventually be reached if you use your switch intensively. So the impact of a switch is similar to that of a link. That's why we usually describe a switch using a link tag (as a link is not an edge but a hyperedge, you can connect more than 2 other links to it).
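
Here is a minimal sketch of this idea, with three hosts plugged into one switch. All ids and values are hypothetical: each host gets a private link standing for its cable, and a shared link named "switch" stands for the switch itself, so that its bandwidth acts as the upper limit shared by all the routes crossing it.

\verbatim
<AS id="AS0" routing="Full">
  <host id="h1" power="1000000000"/>
  <host id="h2" power="1000000000"/>
  <host id="h3" power="1000000000"/>

  <!-- One private link per host (the cable to the switch) -->
  <link id="l1" bandwidth="125000000" latency="5E-5"/>
  <link id="l2" bandwidth="125000000" latency="5E-5"/>
  <link id="l3" bandwidth="125000000" latency="5E-5"/>

  <!-- The switch, described as a link shared by every route -->
  <link id="switch" bandwidth="2250000000" latency="5E-4"/>

  <route src="h1" dst="h2">
    <link_ctn id="l1"/><link_ctn id="switch"/><link_ctn id="l2"/>
  </route>
  <route src="h1" dst="h3">
    <link_ctn id="l1"/><link_ctn id="switch"/><link_ctn id="l3"/>
  </route>
  <route src="h2" dst="h3">
    <link_ctn id="l2"/><link_ctn id="switch"/><link_ctn id="l3"/>
  </route>
</AS>
\endverbatim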
 773
 774 */