docs/source/Design_goals.rst

   1 .. _design:
   2
   3 SimGrid Design Goals
   4 ######################
   5
   6 A SimGrid simulation boils down to a set of **actors** that use
   7 **resources** by performing **activities**. SimGrid provides several kinds of
   8 resources (link, CPU, disk, and synchronization objects such as mutexes
   9 and semaphores) along with the corresponding kinds of activities
  10 (communication, execution, I/O, synchronization). SimGrid users provide a
  11 platform instantiation (sets of interconnected resources) and an
  12 application (the code executed by actors) along with the actors'
  13 placement on the platform.
  14
  15 The actors (i.e., the user code) can only interact with the platform
  16 through activities, which are somewhat similar to synchronizations.  Some
  17 are very natural (locking a mutex is a synchronization with the other
  18 actors that use the same mutex). Others activities implement more atypical
  19 kinds of synchronization: execution, communication, and I/O have a
  20 quantitative component that depends on the resources. Regardless, in
  21 SimGrid writing some data to disk, for instance, is seen as a
  22 synchronization with the other actors that use the same disk. When you lock
  23 a mutex, you can proceed only when that mutex gets unlocked by someone
  24 else.  Similarly, when you do an I/O, you can proceed only once the disk
  25 has delivered enough performance to fulfill your demand (while competing
  26 with concurrent demands for that same disk made by other actors).
  27 Communication activities have both a qualitative component (the actual
  28 communication starts only when both the sender and receiver are ready to
  29 proceed) and a quantitative component (consuming the communication power of
  30 the link resources).
  31
  32 The design of SimGrid is driven by several design goals:
  33
  34  - **reproducibility**: re-executing the same simulation must lead to
  35    the exact same outcome, even if it runs on another computer or
  36    operating system. When possible, this should also be true across different
  37    versions of SimGrid.
  38  - **sweet spot between simulation accuracy and simulation speed**: running a given simulation should be as fast as possible but predict
  39    correct performance trends (or even provide accurate predictions when correctly calibrated).
  40  - **versatility**: ability to simulate many kinds of distributed system and application, while remaining parsimonious so as to not hinder usability. SimGrid tries to provide sane
  41    default settings along with the possibility to augment and modify
  42    the provided models and their settings for each relevant use case.
  43  - **scalability**: ability to deal with very large simulations in terms of the
  44    number of actors, the size of the platform, the number of
  45    events, or all all the above.
  46
  47 Actors and activities
  48 *********************
  49
  50 At the core of the SimGrid design lies the interactions between actors and
  51 activities. For the sake of reproducibility, the actors cannot interact
  52 directly with their environment: every interaction is serialized through a
  53 central component that processes them in a reproducible order. For the sake
  54 of speed, SimGrid is designed as an operating system: the actors issue
  55 **simcalls** (akin to system calls) to a simulation kernel (akin to an OS
  56 kernel) called the **maestro** that decides which actors can proceed and
  57 which ones must wait.
  58
  59 A SimGrid simulation proceeds as a sequence of **scheduling
  60 rounds**. At each round, all actors that are not currently blocked on a
  61 simcall get executed. This is done by the maestro, which passes the control flow to the
  62 code of each actor, written by the user in either C++, C, Fortran or Python.
  63 The control flow is returned to the maestro when the actor
  64 blocks on its next simcall. If the time it takes to
  65 execute the actor's code is of import to the simulator, it has to be the object
  66 of execution activities.  SMPI programs are automatically benchmarked, but
  67 these execution activities must be manually created if using S4U. The simulated
  68 time is discrete in SimGrid and only progresses between scheduling
  69 rounds. So all events occurring during a given scheduling round occur
  70 at the exact same simulated time, even if the actors are usually
  71 executed sequentially on a real platform.
  72
  73 .. image:: img/design-scheduling-simulatedtime.svg
  74    :scale: 80%
  75    :align: center
  76
  77 To modify their environment actors issue either **immediate
  78 simcalls** that take no time in the simulation (e.g., spawning another
  79 actor), or **blocking simcalls** that must wait for future events (e.g.,
  80 mutex locks require the mutex to be unlocked by its owner;
  81 communications wait for the network to provide have provided enough communication
  82 bandwidth to transfer the data). A given scheduling round is
  83 usually composed of several sub-scheduling rounds during which
  84 immediate simcalls are resolved. This ends when all actors are either
  85 terminated or within a blocking simcall. The simulation models are
  86 then used to compute the time at which the first pending simcall
  87 terminates. Time is advanced to that point, and a new scheduling
  88 round begins with all actors that became unblocked at that timestamp.
  89
  90 Context switching between the actors and maestro is highly optimized
  91 for the sake of simulation performance. SimGrid provides several
  92 implementations of this mechanism, called **context factories**. These
  93 implementations fall into two categories: Preemptive contexts are
  94 based on standard system threads from the libstdc library.
  95 They are usually better supported by external
  96 debuggers and profiling tools, but less efficient. The most efficient
  97 factories use non-preemptive mechanisms, such as SysV's ucontexts,
  98 boost's context, or our own hand-tuned implementation, that is written
  99 in assembly language. This is possible because a given actor is never
 100 interrupted between consecutive simcalls in SimGrid.
 101
 102 .. image:: img/design-scheduling-wallclock.svg
 103    :scale: 80%
 104    :align: center
 105
 106 For the sake of performance, actors can be executed in parallel using several system threads which execute all actors in
 107 turn. In our experience, however, this rarely leads to significant performance improvement because most applications simulated with
 108 SimGrid are fine-grained: it's often not worth simulating actors in parallel because the amount of work performed by each actor is too
 109 small. This is because the users tend to abstract away any large computations to simulate the control flow of their
 110 application efficiently. In addition, parallel simulation puts restrictions on the user's code.
 111 For example, the existing SMPI implementation cannot be used in parallel yet.
 112
 113 .. image:: img/design-scheduling-parallel.svg
 114    :scale: 80%
 115    :align: center
 116
 117 Parsimonious model versatility
 118 ******************************
 119
 120 Another main design goal of SimGrid is parsimonious versatility in
 121 modeling, that is, aiming to simulate very different things using the same
 122 simulation abstractions. To achieve this goal, the SimGrid implementation
 123 tends to unify all resource kinds and activity kinds. For instance, the
 124 classical notion of **computational power** is mirrored for
 125 **communication power** and **I/O power**. Asynchronous executions are less
 126 common than the asynchronous communications that proliferate in MPI but
 127 they are still provided for sake of symmetry: they even prove useful to
 128 simulate thread pools efficiently. SimGrid also provides asynchronous mutex
 129 locks for symmetry. The notion of **pstate** was introduced to model the
 130 step-wise variation of computational speed depending on the DVFS, and was
 131 reused to model the bootup and shutdown phases of a CPU: the computational
 132 speed is 0 at these specific pstates. This pstate notion was extended to
 133 represent the fact that the bandwidth provided by a wifi link to a given
 134 station depends on its signal-noise ratio (SNR). In summary, simulation
 135 abstractions are re-used and/or generalized as much as possible to serve a
 136 wide range of purposes.
 137
 138 Furthermore, all provided resource models are very similar internally. They
 139 :ref:`rely on linear inequation systems <models-lmm>`, stating for example
 140 that the sum of the computational power received by all computation
 141 activities located on a given CPU cannot exceed the computational power
 142 provided by this CPU. This extends nicely to multi-resources activities
 143 such as communications that use several links, and also to parallel tasks
 144 (abstract activities representing a parallel execution kernel that consumes
 145 both the communication and computational power of a set of machines) or
 146 fluid I/O streams (abstract activities representing a data stream from disk
 147 to disk through the network). Specific coefficients are added to the linear
 148 system to mimic how the resources behavior in the real world. The resulting
 149 system is then solved using a max-min objective function that maximizes the
 150 minimum of all shares allocated to activities. Our experience shows that
 151 this approach can successfully be used for fast yet accurate simulations of
 152 complex phenomena, provided that the model's coefficients and constants are
 153 carefully :ref:`calibrated <models_calibration>`, i.e. tailored and
 154 instantiated to that phenomenon.
 155
 156 Model-checking
 157 **************
 158
 159 Even if it was not in its original goals, SimGrid now
 160 integrates a full-featured model-checker (dubbed MC or Mc SimGrid)
 161 that can exhaustively explore all execution paths that the application
 162 could experience. Conceptually, Mc SimGrid is built upon the ideas
 163 presented previously. Instead of using the resource models to compute
 164 the order of simcall terminations, it explores every order that is
 165 causally possible. In a simulation entailing only three concurrent
 166 events (i.e., simcalls) A, B, and C, it will first explore the
 167 scenario where the activities order is ABC, then ACB, then
 168 BAC, then BCA, then CAB and finally CBA. Of course, the number of
 169 scenarios to explore grows exponentially with the number of simcalls
 170 in the simulation. Mc SimGrid leverages reduction techniques to avoid
 171 re-exploring redundant traces.
 172
 173 In practice, Mc SimGrid can be used to verify classical `safety and
 174 liveness properties
 175 <https://en.wikipedia.org/wiki/Linear_time_property>`_, but also
 176 `communication determinism
 177 <https://hal.inria.fr/hal-01953167/document>`_, a property that allows
 178 more efficient solutions toward fault-tolerance. It can alleviate the
 179 state space explosion problem through `Dynamic Partial Ordering
 180 Reduction (DPOR)
 181 <https://en.wikipedia.org/wiki/Partial_order_reduction>`_ and `state
 182 equality <https://hal.inria.fr/hal-01900120/document>`_.
 183
 184 Mc SimGrid is more experimental than other parts of SimGrid, such as SMPI that can now be used to run many full-featured
 185 MPI codes out of the box, but it's constantly improving.