doc/doxygen/uhood.doc

   1 /*! @page uhood Under the Hood
   2
   3 \tableofcontents
   4
   5 TBD
   6
   7  - Simulation Loop, LMM, sharing -> papers
   8  - Context Switching, privatization -> papers
   9  - @subpage inside
  10
  11 \section simgrid_uhood_s4u S4U
  12
  13 S4U classes are designed to be user process interfaces to Maestro resources.
  14 We provide a uniform interface to them:
  15
  16 * automatic reference count with intrusive smart pointers `simgrid::s4u::FooPtr`
  17   (also called `simgrid::s4u::Foo::Ptr`);
  18
  19 * manual reference count with `intrusive_ptr_add_ref(p)`,
  20   `intrusive_ptr_release(p)`;
  21
  22 * delegation of the operations to an opaque `pimpl` (which is the Maestro object);
  23
  24 * the Maestro object and the corresponding S4U object have the same lifetime
  25   (and share the same reference count).
  26
  27 The ability to manipulate the objects through pointers and to manage
  28 reference counts explicitly is useful for creating C wrappers
  29 to the S4U and should play nicely with other language bindings (such as
  30 SWIG-based ones).
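
As an illustration of the C wrapper use case, here is a minimal sketch. The
`Foo` class and the `foo_*` functions are hypothetical names invented for this
example, not part of the S4U API. The C side only sees an opaque pointer and
manages the reference count explicitly:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical reference-counted class standing in for an S4U class.
class Foo {
  std::atomic<int> refcount_{1};
  int value_;

public:
  explicit Foo(int value) : value_(value) {}
  int value() const { return value_; }
  int use_count() const { return refcount_.load(); }

  // Manual reference counting, found by argument-dependent lookup:
  friend void intrusive_ptr_add_ref(Foo* foo)
  {
    assert(foo != nullptr);
    ++foo->refcount_;
  }
  friend void intrusive_ptr_release(Foo* foo)
  {
    assert(foo != nullptr);
    if (--foo->refcount_ == 0)
      delete foo;
  }
};

// Opaque handle exposed to C code (hypothetical name).
using foo_t = Foo*;

extern "C" {
// The object is created with a reference count of 1, owned by the caller.
foo_t foo_create(int value) { return new Foo(value); }
int foo_value(foo_t foo) { return foo->value(); }
int foo_use_count(foo_t foo) { return foo->use_count(); }
// Explicit reference count management for C callers:
void foo_ref(foo_t foo) { intrusive_ptr_add_ref(foo); }
void foo_unref(foo_t foo) { intrusive_ptr_release(foo); }
}
```

The last call to `foo_unref()` destroys the object, so the handle must not be
used afterwards.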
  31
  32 Some objects currently live for the whole duration of the simulation and do
  33 not have reference counts. We still provide dummy `intrusive_ptr_add_ref(p)`,
  34 `intrusive_ptr_release(p)` and `FooPtr` for consistency.
  35
  36 In many cases, we try to have an API which is consistent with the API of the
  37 corresponding C++ standard classes. For example, the methods of
  38 `simgrid::s4u::Mutex` follow those of `std::mutex`. This has several benefits:
  39
  40  * we use a proven interface with well-defined and documented semantics;
  41
  42  * the interface is easy to understand and remember for people used to the C++
  43    standard interface;
  44
  45  * we can use some standard C++ algorithms and helper classes with our types
  46    (`simgrid::s4u::Mutex` can be used with `std::lock`, `std::unique_lock`,
  47    etc.).
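
The last point can be sketched with a toy class. `SketchMutex` below is a
hypothetical, single-threaded stand-in (it is not `simgrid::s4u::Mutex` and
does not suspend anything). It only shows that providing `lock()`,
`try_lock()` and `unlock()` is enough to work with the standard helpers:

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a simulated mutex. It is not thread-safe and only
// tracks whether it is held.
class SketchMutex {
  bool locked_ = false;

public:
  void lock()
  {
    // A real simulated mutex would suspend the calling actor instead.
    assert(not locked_);
    locked_ = true;
  }
  bool try_lock()
  {
    if (locked_)
      return false;
    locked_ = true;
    return true;
  }
  void unlock()
  {
    assert(locked_);
    locked_ = false;
  }
  bool is_locked() const { return locked_; }
};

// Acquire two mutexes with std::lock and release them with RAII guards.
// Returns whether both were held inside the critical section.
bool lock_both(SketchMutex& a, SketchMutex& b)
{
  std::lock(a, b);
  std::lock_guard<SketchMutex> guard_a(a, std::adopt_lock);
  std::lock_guard<SketchMutex> guard_b(b, std::adopt_lock);
  return a.is_locked() && b.is_locked();
}
```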
  48
  49 Example of `simgrid::s4u::Actor`:
  50
  51 ~~~
  52 class Actor {
  53   // This is the corresponding maestro object:
  54   friend simgrid::simix::Process;
  55   simgrid::simix::Process* pimpl_ = nullptr;
  56 public:
  57
  58   Actor(simgrid::simix::Process* pimpl) : pimpl_(pimpl) {}
  59   Actor(Actor const&) = delete;
  60   Actor& operator=(Actor const&) = delete;
  61
  62   // Reference count is delegated to the Maestro object:
  63   friend void intrusive_ptr_add_ref(Actor* actor)
  64   {
  65     xbt_assert(actor != nullptr);
  66     SIMIX_process_ref(actor->pimpl_);
  67   }
  68   friend void intrusive_ptr_release(Actor* actor)
  69   {
  70     xbt_assert(actor != nullptr);
  71     SIMIX_process_unref(actor->pimpl_);
  72   }
  73   using Ptr = boost::intrusive_ptr<Actor>;
  74
  75   // Create processes:
  76   static Ptr createActor(const char* name, s4u::Host *host, double killTime, std::function<void()> code);
  77
  78   // [...]
  79 };
  80
  81 using ActorPtr = Actor::Ptr;
  82 ~~~
  83
  84 It uses the `simgrid::simix::Process` as an opaque pimpl:
  85
  86 ~~~
  87 class Process {
  88 private:
  89   std::atomic_int_fast32_t refcount_ { 1 };
  90   // The lifetime of the s4u::Actor is bound to the lifetime of the Process:
  91   simgrid::s4u::Actor actor_;
  92 public:
  93   Process() : actor_(this) {}
  94
  95   // Reference count:
  96   friend void intrusive_ptr_add_ref(Process* process)
  97   {
  98     // Atomic operation! Do not split in two instructions!
  99     auto previous = (process->refcount_)++;
 100     xbt_assert(previous != 0);
 101     (void) previous;
 102   }
 103   friend void intrusive_ptr_release(Process* process)
 104   {
 105     // Atomic operation! Do not split in two instructions!
 106     auto count = --(process->refcount_);
 107     if (count == 0)
 108       delete process;
 109   }
 110
 111   // [...]
 112 };
 113
 114 smx_process_t SIMIX_process_ref(smx_process_t process)
 115 {
 116   if (process != nullptr)
 117     intrusive_ptr_add_ref(process);
 118   return process;
 119 }
 120
 121 /** Decrease the refcount for this process */
 122 void SIMIX_process_unref(smx_process_t process)
 123 {
 124   if (process != nullptr)
 125     intrusive_ptr_release(process);
 126 }
 127 ~~~
 128
 129 \section simgrid_uhood_async Asynchronous operations
 130
 131 \subsection simgrid_uhood_futures Futures
 132
 133 The `simgrid::kernel::Future` class has been added to SimGrid as an abstraction
 134 to represent asynchronous operations in the SimGrid maestro. Its API is based
 135 on `std::experimental::future` from the [C++ Extensions for Concurrency Technical
 136 Specification](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0159r0.html):
 137
 138  - `simgrid::kernel::Future<T>` represents the result of an asynchronous operation
 139     in the simulation inside the SimGrid maestro/kernel;
 140
 141  - `simgrid::kernel::Promise<T>` can be used to set the value of an associated
 142    `simgrid::kernel::Future<T>`.
 143
 144 The expected way to work with `simgrid::kernel::Future<T>` is to add a
 145 completion handler/continuation:
 146
 147 ~~~
 148 // This code is executed in the maestro context, we cannot block for the result
 149 // to be ready:
 150 simgrid::kernel::Future<std::vector<char>> result = simgrid::kernel::readFile(file);
 151
 152 // Add a completion handler:
 153 result.then([file](simgrid::kernel::Future<std::vector<char>> result) {
 154   // At this point, the operation is complete and we can safely call .get():
 155   xbt_assert(result.is_ready());
 156   try {
 157     std::vector<char> data = result.get();
 158     XBT_DEBUG("Finished reading file %s: length %zu", file.c_str(), data.size());
 159   }
 160   // If the operation failed, .get() throws an exception:
 161   catch (std::runtime_error& e) {
 162     XBT_ERROR("Could not read file %s", file.c_str());
 163   }
 164 });
 165 ~~~
 166
 167 The SimGrid kernel cannot block, so calling `.get()` or `.wait()` on a
 168 `simgrid::kernel::Future<T>` which is not ready will deadlock. In practice, the
 169 simulator detects this and aborts after reporting an error.
 170
 171 In order to generate your own future, you might need to use a
 172 `simgrid::kernel::Promise<T>`. The promise is a one-way channel which can be
 173 used to set the result of an associated `simgrid::kernel::Future<T>`
 174 (with either `.set_value()` or `.set_exception()`):
 175
 176 ~~~
 177 simgrid::kernel::Future<void> kernel_wait_until(double date)
 178 {
 179   auto promise = std::make_shared<simgrid::kernel::Promise<void>>();
 180   auto future = promise->get_future();
 181   SIMIX_timer_set(date, [promise] {
 182     promise->set_value();
 183   });
 184   return future;
 185 }
 186 ~~~
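
To make the promise/future relationship more concrete, here is a deliberately
simplified sketch of a non-blocking future. The `ToyFuture`, `ToyPromise` and
`ToyState` names are hypothetical and this is not the SimGrid implementation.
It has no `void` specialization, no exception support and no chaining. It only
shows a shared state, a continuation which runs when the value is set, and a
`.get()` which fails instead of blocking:

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <stdexcept>
#include <utility>

template <class T> struct ToyState;

// Read side: can only observe the shared state, never block on it.
template <class T> class ToyFuture {
  std::shared_ptr<ToyState<T>> state_;

public:
  explicit ToyFuture(std::shared_ptr<ToyState<T>> state) : state_(std::move(state)) {}
  bool is_ready() const { return state_->ready; }
  T get() const
  {
    // There is nothing to wait on in this toy, so we report an error instead:
    if (not state_->ready)
      throw std::logic_error("future is not ready and we cannot block");
    return state_->value;
  }
  // Run the callback now if the value is available, otherwise store it:
  void then(std::function<void(ToyFuture<T>)> callback)
  {
    if (state_->ready)
      callback(*this);
    else
      state_->continuation = std::move(callback);
  }
};

// State shared between the promise and its future.
template <class T> struct ToyState {
  bool ready = false;
  T value{};
  std::function<void(ToyFuture<T>)> continuation;
};

// Write side: sets the value once and triggers the continuation.
template <class T> class ToyPromise {
  std::shared_ptr<ToyState<T>> state_ = std::make_shared<ToyState<T>>();

public:
  ToyFuture<T> get_future() { return ToyFuture<T>(state_); }
  void set_value(T value)
  {
    assert(not state_->ready);
    state_->value = std::move(value);
    state_->ready = true;
    if (state_->continuation) {
      auto continuation = std::move(state_->continuation);
      state_->continuation = nullptr;
      continuation(ToyFuture<T>(state_));
    }
  }
};
```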
 187
 188 Like the experimental futures, we support chaining `.then()` methods with
 189 automatic future unwrapping.
 190 You might want to look at some [C++ tutorial on futures](https://www.youtube.com/watch?v=mPxIegd9J3w&list=PLHTh1InhhwT75gykhs7pqcR_uSiG601oh&index=43)
 191 for more details and examples. However, some operations of the proposed
 192 experimental futures are currently not implemented in our futures, such as
 193 `.wait_for()`, `.wait_until()`, `shared_future` and `when_any()`.
 194
 195 \subsection simgrid_uhood_timer Timers
 196
 197 \section simgrid_uhood_mc Model Checker
 198
 199 The current implementation of the model-checker uses two distinct processes:
 200
 201  - the SimGrid model-checker (`simgrid-mc`) itself lives in the parent process;
 202
 203  - it spawns a child process for the SimGrid simulator/maestro and the simulated
 204    processes.
 205
 206 They communicate using an `AF_UNIX` `SOCK_DGRAM` socket and exchange messages
 207 defined in `mc_protocol.h`. The `SIMGRID_MC_SOCKET_FD` environment variable is
 208 set to the file descriptor of this socket in the child process.
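
The following sketch illustrates this kind of channel on a POSIX system. The
`SketchMessage` layout and the `roundtrip()` helper are hypothetical (the real
messages are the ones defined in `mc_protocol.h`), and both ends live in the
same process here, whereas the model-checker would `fork()` and `exec()` the
child:

```cpp
#include <cassert>
#include <stdlib.h>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical message layout, for illustration only.
struct SketchMessage {
  int type;
  long payload;
};

// Create a datagram socket pair, publish one end through the environment,
// then send one message and receive it on the other end.
// Returns true if the whole message made the round trip.
bool roundtrip(SketchMessage const& sent, SketchMessage& received)
{
  int fds[2];
  if (socketpair(AF_UNIX, SOCK_DGRAM, 0, fds) != 0)
    return false;

  // Parent side: export the file descriptor meant for the child.
  setenv("SIMGRID_MC_SOCKET_FD", std::to_string(fds[1]).c_str(), 1);

  // Child side: recover the file descriptor from the environment.
  int child_fd = atoi(getenv("SIMGRID_MC_SOCKET_FD"));
  assert(child_fd == fds[1]);

  // Each datagram carries exactly one message.
  bool ok = send(fds[0], &sent, sizeof(sent), 0) == (ssize_t)sizeof(sent) &&
            recv(child_fd, &received, sizeof(received), 0) == (ssize_t)sizeof(received);

  close(fds[0]);
  close(fds[1]);
  return ok;
}
```

A datagram socket preserves message boundaries, so the receiver does not need
to reassemble messages from a byte stream.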
 209
 210 The model-checker analyzes, saves and restores the state of the model-checked
 211 process using the following techniques:
 212
 213 * the model-checker reads and writes in the model-checked address space;
 214
 215 * the model-checker `ptrace()`s the model-checked process and is thus able to
 216   know the state of the model-checked process if it crashes;
 217
 218 * DWARF debug information is used to unwind the stack and identify local
 219   variables;
 220
 221 * a custom heap is enabled in the model-checked process which allows the model
 222   checker to know which chunks are allocated and which are freed.
 223
 224 \subsection simgrid_uhood_mc_address_space Address space
 225
 226 The `AddressSpace` is a base class used for both the model-checked process
 227 and its snapshots and has methods to read in the corresponding address space:
 228
 229  - the `Process` class is a subclass representing the model-checked process;
 230
 231  - the `Snapshot` class is a subclass representing a snapshot of the process.
 232
 233 Additional helper classes include:
 234
 235  - `Remote<T>` is the result of reading a `T` in a remote AddressSpace. For
 236     trivial types (`int`, etc.), it is convertible to `T`.
 237
 238  - `RemotePtr<T>` represents the address of an object of type `T` in some
 239     remote `AddressSpace` (it could be an alias to `Remote<T*>`).
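
The relationship between these classes can be sketched as follows. All the
names below (`SketchAddressSpace`, `SketchRemote`, `SketchRemotePtr`,
`BufferAddressSpace`) are hypothetical and the signatures are assumptions made
for this example. The remote memory is faked with a local byte buffer instead
of another process or a snapshot:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <utility>
#include <vector>

// Address of a T in some other address space: never dereferenced locally.
template <class T> class SketchRemotePtr {
  std::uint64_t address_;

public:
  explicit SketchRemotePtr(std::uint64_t address) : address_(address) {}
  std::uint64_t address() const { return address_; }
};

// Local copy of the bytes of a T which was read from another address space.
template <class T> class SketchRemote {
  static_assert(std::is_trivially_copyable<T>::value, "only trivial types in this sketch");
  alignas(T) unsigned char buffer_[sizeof(T)];

public:
  void* data() { return buffer_; }
  // For trivial types, the copied bytes can be reinterpreted as a T:
  operator T() const
  {
    T value;
    std::memcpy(&value, buffer_, sizeof(T));
    return value;
  }
};

// Base class: subclasses only say how to copy raw bytes out of the space.
class SketchAddressSpace {
public:
  virtual ~SketchAddressSpace() = default;
  virtual void read_bytes(void* target, std::size_t size, std::uint64_t address) const = 0;

  template <class T> SketchRemote<T> read(SketchRemotePtr<T> ptr) const
  {
    SketchRemote<T> result;
    read_bytes(result.data(), sizeof(T), ptr.address());
    return result;
  }
};

// Fake address space backed by a byte vector mapped at a given base address.
class BufferAddressSpace : public SketchAddressSpace {
  std::uint64_t base_;
  std::vector<unsigned char> bytes_;

public:
  BufferAddressSpace(std::uint64_t base, std::vector<unsigned char> bytes)
      : base_(base), bytes_(std::move(bytes)) {}

  void read_bytes(void* target, std::size_t size, std::uint64_t address) const override
  {
    assert(target != nullptr);
    if (address < base_ || address - base_ > bytes_.size() ||
        size > bytes_.size() - (address - base_))
      throw std::out_of_range("address is not mapped in this address space");
    std::memcpy(target, bytes_.data() + (address - base_), size);
  }
};
```

Keeping the remote address in a dedicated type rather than in a plain `T*`
makes it harder to dereference by mistake an address which is only valid in
another address space.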
 240
 241 \subsection simgrid_uhood_mc_address_elf_dwarf ELF and DWARF
 242
 243 ELF is a standard file format for executables and dynamic libraries.
 244 DWARF is a standard for debug information. Both are used on GNU/Linux systems
 245 and exploited by the model-checker to understand the model-checked process:
 246
 247  - `ObjectInformation` represents the information about a given ELF module
 248    (executable or shared-object);
 249
 250  - `Frame` represents a subprogram scope (either a subprogram or a scope within
 251     the subprogram);
 252
 253  - `Type` represents a type (`char*`, `int`, `std::string`) and is referenced
 254     by variables (global variables, local variables, parameters), functions (return type),
 255     and other types (type of a `struct` field, etc.);
 256
 257  - `LocationList` and `DwarfExpression` are used to describe the location of
 258     variables.
 259
 260 */