docs/source/Design_goals.rst

   1 SimGrid's Design Goals
   2 ######################
   3
   4 Any SimGrid simulation comes down to a set of **actors** using some
   5 **resources** through **activities**. SimGrid provides several kinds of
   6 resources (link, CPU, disk, and synchronization objects such as mutex
   7 or semaphores) along with the corresponding activity kinds
   8 (communication, execution, I/O, lock). SimGrid users provide a
   9 platform instantiation (list of interconnected resources) and an
  10 application (the code executed by actors) along with the actors'
  11 placement on the platform.
  12
  13 The actors (ie, the user code) can only interact with the platform
  14 through activities, that are somewhat similar to synchronizations.
  15 Some are very natural (locking a mutex is a synchronization with the
  16 other actors using the same mutex) while others activities constitute
  17 more original synchronization: execution, communication, and I/O have a
  18 quantitative component that depends on the resources. But still,
  19 writing some data to disk is seen as a synchronization with the other
  20 actors using the same disk. When you lock a mutex, you can proceed
  21 only when that mutex gets unlocked by someone else. Similarly, when you
  22 do an I/O, you can proceed once the disk delivered enough performance
  23 to fulfill your demand (along with the concurrent demands of the other
  24 actors occurring at the same time). Communication activities have both a
  25 qualitative component (the actual communication starts only when both
  26 the sender and receiver are ready to proceed) and a quantitative
  27 component (consuming the communication power of the link resources).
  28
  29 The design of SimGrid is shaped by several design goals:
  30
  31  - **reproducibility**: re-executing the same simulation must lead to
  32    the exact same outcome, even if it runs on another computer or
  33    operating system. When possible, this should also be true when you
  34    use another version of SimGrid.
  35  - **speed**: running a given simulation should be as fast as possible
  36  - **versatility**: ability to simulate many kinds of distributed systems
  37    and resource models. But the simulation should be parsimonious too,
  38    to not hinder the tool's usability. SimGrid tries to provide sane
  39    default settings along with the possibility to augment and modify
  40    the provided models and their default settings.
  41  - **scalability**: ability to deal with very large simulations. In the
  42    number of actors, in the size of the platform, in the number of
  43    events, or all together.
  44
  45 Actors and activities
  46 *********************
  47
  48 The first crux of the SimGrid design lays in the interaction between
  49 each actor and the activities. For the sake of reproducibility, the
  50 actors cannot interact directly with their environment: every
  51 modification request is serialized through a central component that
  52 processes them in a reproducible order. For the sake of speed, SimGrid
  53 is designed as an operating system: the actors issue **simcalls** to a
  54 simulation kernel that is called **maestro** (because it decides which
  55 actors can proceed and which ones must wait).
  56
  57 In practice, a SimGrid simulation is a suite of so-called **scheduling
  58 rounds**, during which all actors that are not currently blocked on a
  59 simcall get executed. For that, maestro passes the control flow to the
  60 code of each actor, that are written in either C++, C, Fortran, Python,
  61 or Java. The control flow then returns to the maestro when the actor
  62 blocks on its next blocking simcall. Note that the time it takes to
  63 execute the actor code has to be reported to the simulator using
  64 execution activities. SMPI programs are automatically benchmarked
  65 while these executions must be manually reported in S4U. The simulated
  66 time is discrete in SimGrid and only progresses between scheduling
  67 rounds, so all events occurring during a given scheduling round occur
  68 at the exact same simulated timestamp, even if the actors are usually
  69 executed sequentially on the real platform.
  70
  71 To modify their environment, the actors issue either **immediate
  72 simcalls** that take no time in the simulation (e.g.: spawning another
  73 actor), or **blocking simcalls** that must wait for future events (e.g.:
  74 mutex locks require the mutex to be unlocked by its owner;
  75 communications wait for the network to provide enough communication
  76 performance to fulfill the demand). A given scheduling round is
  77 usually composed of several sub-scheduling rounds during which
  78 immediate simcalls are resolved. This ends when all actors are either
  79 terminated or within a blocking simcall. The simulation models are
  80 then used to compute the time at which the first next simcall
  81 terminates. The time is advanced to that point, and a new scheduling
  82 round begins with all actors that got unblocked at that timestamp.
  83
  84 Context switching between the actors and maestro is highly optimized
  85 for the sake of simulation performance. SimGrid provides several
  86 implementations of this mechanism, called **context factories**. These
  87 implementations fall into two categories: Preemptive contexts are
  88 based on full-fledged system threads such as pthread on Linux or Java
  89 threads in the JVM. They are usually better supported by external
  90 debuggers and profiling tools, but less efficient. The most efficient
  91 factories use non-preemptive mechanisms, such as SysV's ucontexts,
  92 boost's context, or our own hand-tuned implementation, that is written
  93 in assembly language. This is possible because a given actor is never
  94 interrupted between consecutive simcalls in SimGrid.
  95
  96 For the sake of performance, actors can be executed in parallel using
  97 several system threads for non-preemptive contexts. But in our
  98 experience, this rarely leads to any performance improvement because
  99 most applications simulated on top of SimGrid are fine-grained: when
 100 the simulation performance really matters, the users tend to abstract
 101 away any large computations to efficiently simulate the control flow
 102 of their application. In addition, parallel simulation puts unpleasant
 103 restrictions on the user code, that must be correctly isolated. To be
 104 honest, most of the existing SMPI implementation cannot be used in
 105 parallel yet.
 106
 107 Parsimonious model versatility
 108 ******************************
 109
 110 Another orthogonal crux of the SimGrid design is the parsimonious
 111 versatility in modeling. For that, we tend to unify all resource and
 112 activity kinds. As you have seen, we parallel the classical notion of
 113 **computational power** with the more original **communication power** and
 114 **I/O power**. Asynchronous executions are less common than the
 115 asynchronous communications that proliferate in MPI but they are still
 116 provided for sake of symmetry: they even prove useful to efficiently
 117 simulate thread pools. Note that asynchronous mutex locking is still to be
 118 added to SimGrid atm. The notion of **pstate** was introduced to model
 119 the stepwise variation of computational speed depending on the DVFS,
 120 and was reused to model the bootup and shutdown phases of a CPU: the
 121 computational speed is 0 at these specific pstates. This pstate notion
 122 was extended to represent the fact that the bandwidth provided by a
 123 wifi link to a given station depends on its signal-noise ratio (SNR).
 124
 125 Further on this line, all provided resource models are very comparable
 126 internally. They rely on linear inequation systems, stating for
 127 example that the sum of the computational power received by all
 128 computation activities located on a given CPU cannot overpass the
 129 computational power provided by this resource. This extends nicely to
 130 multi-resources activities such as communications using several links,
 131 and also to the so-called parallel tasks (abstract activities
 132 representing a parallel execution kernel consuming both the
 133 communication and computational power of a set of machines). Specific
 134 coefficients are added to the linear system to reflect how the real
 135 resources are shared between concurrent usages. The resulting system
 136 is then solved using a max-min objective function that maximizes the
 137 minimum of all shares allocated to activities. Our experience shows
 138 that this approach can successfully be used for fast yet accurate
 139 simulations of complex phenomena, provided that the model's
 140 coefficients and constants are carefully tailored and instantiated to
 141 that phenomenon.
 142
 143 Model-checking
 144 **************
 145
 146 Even if it was not in the original goals of SimGrid, the framework now
 147 integrates a full-featured model-checker (dubbed MC or Mc SimGrid)
 148 that can exhaustively explore all execution paths that the application
 149 could experience. Conceptually, Mc SimGrid is built upon the ideas
 150 presented previously. Instead of using the resource models to compute
 151 the order simcall terminations, it explores every order that is
 152 causally possible. In a simulation entailing only three concurrent
 153 events (i.e., simcalls) A, B, and C, it will first explore the
 154 scenario where the activities order is ABC, then the ACB order, then
 155 BAC, then BCA, then CAB and finally CBA. Of course, the number of
 156 scenarios to explore grows exponentially with the number of simcalls
 157 in the simulation. Mc SimGrid leverages reduction techniques to avoid
 158 re-exploring equivalent traces.
 159
 160 In practice, Mc SimGrid can be used to verify classical `safety and
 161 liveness properties
 162 <https://en.wikipedia.org/wiki/Linear_time_property>`_, but also
 163 `communication determinism
 164 <https://hal.inria.fr/hal-01953167/document>`_, a property that allows
 165 more efficient solutions toward fault-tolerance. It can alleviate the
 166 state space explosion problem through `Dynamic Partial Ordering
 167 Reduction (DPOR)
 168 <https://en.wikipedia.org/wiki/Partial_order_reduction>`_ and `state
 169 equality <https://hal.inria.fr/hal-01900120/document>`_.
 170
 171 Mc SimGrid is far more experimental than other parts of the framework,
 172 such as SMPI that can now be used to run many full-featured MPI codes
 173 out of the box.