X-Git-Url: http://info.iut-bm.univ-fcomte.fr/pub/gitweb/simgrid.git/blobdiff_plain/ce6c3f5dd63f79859c7c243f0fb714b49b6b60a8..92661d62eaee255e5f677a667c0f05a4f5917c24:/doc/doxygen/uhood_switch.doc

diff --git a/doc/doxygen/uhood_switch.doc b/doc/doxygen/uhood_switch.doc
index 814bf1ca4b..3515d8c23c 100644
--- a/doc/doxygen/uhood_switch.doc
+++ b/doc/doxygen/uhood_switch.doc
@@ -1,5 +1,7 @@
 /*! @page uhood_switch Process Synchronizations and Context Switching
 
+@tableofcontents
+
 @section uhood_switch_DES SimGrid as an Operating System
 
 SimGrid is a discrete event simulator of distributed systems: it does
@@ -96,7 +98,7 @@ producer calls `promise.set_value(42)` or `promise.set_exception(e)`
 in order to set the result which will be made available to the consumer
 by `future.get()`.
 
-### Which future do we need?
+@subsection uhood_switch_futures_needs Which future do we need?
 
 The blocking API provided by the standard C++11 futures does not suit
 our needs since the simulation kernel cannot block, and since
@@ -129,7 +131,7 @@ API, with a few differences:
  - Some features of the standard (such as shared futures) are not
    needed in our context, and thus not considered here.
 
-### Implementing `Future` and `Promise`
+@subsection uhood_switch_futures_implem Implementing `Future` and `Promise`
 
 The `simgrid::kernel::Future` and `simgrid::kernel::Promise` use a
 shared state defined as follows:
@@ -272,27 +274,44 @@ T simgrid::kernel::FutureState::get()
 }
 @endcode
 
-## Generic simcalls
+@section uhood_switch_simcalls Implementing the simcalls
+
+So a simcall is a way for the actor to push a request to the
+simulation kernel and yield the control until the request is
+fulfilled. The performance requirements are very high because
+the actors usually do an inordinate amount of simcalls during the
+simulation.
+
+As for real syscalls, the basic idea is to write the wanted call and
+its arguments in a memory area that is specific to the actor, and
+yield the control to the simulation kernel. Once in kernel mode, the
+simcalls of each demanding actor are evaluated sequentially in a
+strictly reproducible order. This makes the whole simulation
+reproducible.
 
-### Motivation
 
-Simcalls are not so easy to understand and adding a new one is not so easy
-either. In order to add one simcall, one has to first
-add it to the [list of simcalls](https://github.com/simgrid/simgrid/blob/4ae2fd01d8cc55bf83654e29f294335e3cb1f022/src/simix/simcalls.in)
-which looks like this:
+@subsection uhood_switch_simcalls_v2 The historical way
+
+In the very first implementation, everything was written by hand and
+highly optimized, making our software very hard to maintain and
+evolve. We decided to sacrifice some performance for
+maintainability. In a second try (that is still in use in SimGrid
+v3.13), we had a lot of boiler code generated from a python script,
+taking the [list of simcalls](https://github.com/simgrid/simgrid/blob/4ae2fd01d8cc55bf83654e29f294335e3cb1f022/src/simix/simcalls.in)
+as input.
+It looks like this:
 
 @code{cpp}
 # This looks like C++ but it is a basic IDL-like language
 # (one definition per line) parsed by a python script:
 
-void process_kill(smx_process_t process);
+void process_kill(smx_actor_t process);
 void process_killall(int reset_pid);
-void process_cleanup(smx_process_t process) [[nohandler]];
-void process_suspend(smx_process_t process) [[block]];
-void process_resume(smx_process_t process);
-void process_set_host(smx_process_t process, sg_host_t dest);
-int process_is_suspended(smx_process_t process) [[nohandler]];
-int process_join(smx_process_t process, double timeout) [[block]];
+void process_cleanup(smx_actor_t process) [[nohandler]];
+void process_suspend(smx_actor_t process) [[block]];
+void process_resume(smx_actor_t process);
+void process_set_host(smx_actor_t process, sg_host_t dest);
+int process_is_suspended(smx_actor_t process) [[nohandler]];
+int process_join(smx_actor_t process, double timeout) [[block]];
 int process_sleep(double duration) [[block]];
 
 smx_mutex_t mutex_init();
@@ -311,7 +330,7 @@ struct s_smx_simcall {
   // Simcall number:
   e_smx_simcall_t call;
   // Issuing actor:
-  smx_process_t issuer;
+  smx_actor_t issuer;
   // Arguments of the simcall:
   union u_smx_scalar args[11];
   // Result of the simcall:
@@ -342,9 +361,9 @@ union u_smx_scalar {
 };
 @endcode
 
-Then one has to call (manually:cry:) a
-[Python script](https://github.com/simgrid/simgrid/blob/4ae2fd01d8cc55bf83654e29f294335e3cb1f022/src/simix/simcalls.py)
-which generates a bunch of C++ files:
+When manually calling the relevant [Python
+script](https://github.com/simgrid/simgrid/blob/4ae2fd01d8cc55bf83654e29f294335e3cb1f022/src/simix/simcalls.py),
+this generates a bunch of C++ files:
 
 * an enum of all the [simcall numbers](https://github.com/simgrid/simgrid/blob/4ae2fd01d8cc55bf83654e29f294335e3cb1f022/src/simix/popping_enum.h#L19);
 
@@ -360,7 +379,7 @@ which generates a bunch of C++ files:
 
 Then one has to write the code of the kernel side handler for the
 simcall and the code of the simcall itself (which calls the code-generated
-marshaling/unmarshaling stuff):sob:.
+marshaling/unmarshaling stuff).
 
 In order to simplify this process, we added two generic simcalls which can be
 used to execute a function in the simulation kernel:
@@ -428,7 +447,7 @@ xbt_dict_t Host::properties() {
 }
 @endcode
 
-### Blocking simcall
+### Blocking simcall {#uhood_switch_v2_blocking}
 
 The second generic simcall (`simcall_run_blocking()`) executes a function in
 the SimGrid simulation kernel immediately but does not wake up the calling actor
@@ -462,7 +481,7 @@ auto kernelSync(F code) -> decltype(code().get())
   if (SIMIX_is_maestro())
     xbt_die("Can't execute blocking call in kernel mode");
 
-  smx_process_t self = SIMIX_process_self();
+  smx_actor_t self = SIMIX_process_self();
   simgrid::xbt::Result result;
 
   simcall_run_blocking([&result, self, &code]{
@@ -500,12 +519,12 @@ int res = simgrid::simix::kernelSync([&] {
 });
 @endcode
 
-### Asynchronous operations
+### Asynchronous operations {#uhood_switch_v2_async}
 
 We can write the related `kernelAsync()` which wakes up the actor immediately
 and returns a future to the actor. As this future is used in the actor context,
 it is a different future
-(`simgrid::simix::Future` instead of `simgrid::kernel::Furuere`)
+(`simgrid::simix::Future` instead of `simgrid::kernel::Future`)
 which implements a C++11 `std::future` wait-based API:
 
 @code{cpp}
@@ -532,7 +551,7 @@ T simgrid::simix::Future::get()
 {
   if (!valid())
     throw std::future_error(std::future_errc::no_state);
-  smx_process_t self = SIMIX_process_self();
+  smx_actor_t self = SIMIX_process_self();
   simgrid::xbt::Result result;
   simcall_run_blocking([this, &result, self]{
     try {
@@ -600,172 +619,8 @@ The semantic is equivalent but this form would require two simcalls
 instead of one to do the same job (one in `kernelAsync()` and one in
 `.get()`).
 
-## Representing the simulated time
-
-SimGrid uses `double` for representing the simulated time:
-
-* durations are expressed in seconds;
-
-* timepoints are expressed as seconds from the beginning of the simulation.
-
-In contrast, all the C++ APIs use `std::chrono::duration` and
-`std::chrono::time_point`. They are used in:
-
-* `std::this_thread::wait_for()` and `std::this_thread::wait_until()`;
-
-* `future.wait_for()` and `future.wait_until()`;
-
-* `condvar.wait_for()` and `condvar.wait_until()`.
-
-We can define `future.wait_for(duration)` and `future.wait_until(timepoint)`
-for our futures but for better compatibility with standard C++ code, we might
-want to define versions expecting `std::chrono::duration` and
-`std::chrono::time_point`.
-
-For time points, we need to define a clock (which meets the
-[TrivialClock](http://en.cppreference.com/w/cpp/concept/TrivialClock)
-requirements, see
-[`[time.clock.req]`](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf#page=642)
-working in the simulated time in the C++14 standard):
-
-@code{cpp}
-struct SimulationClock {
-  using rep = double;
-  using period = std::ratio<1>;
-  using duration = std::chrono::duration;
-  using time_point = std::chrono::time_point;
-  static constexpr bool is_steady = true;
-  static time_point now()
-  {
-    return time_point(duration(SIMIX_get_clock()));
-  }
-};
-@endcode
-
-A time point in the simulation is a time point using this clock:
-
-@code{cpp}
-template
-using SimulationTimePoint =
-  std::chrono::time_point;
-@endcode
-
-This is used for example in `simgrid::s4u::this_actor::sleep_for()` and
-`simgrid::s4u::this_actor::sleep_until()`:
-
-@code{cpp}
-void sleep_for(double duration)
-{
-  if (duration > 0)
-    simcall_process_sleep(duration);
-}
-
-void sleep_until(double timeout)
-{
-  double now = SIMIX_get_clock();
-  if (timeout > now)
-    simcall_process_sleep(timeout - now);
-}
-
-template
-void sleep_for(std::chrono::duration duration)
-{
-  auto seconds =
-    std::chrono::duration_cast(duration);
-  this_actor::sleep_for(seconds.count());
-}
-
-template
-void sleep_until(const SimulationTimePoint& timeout_time)
-{
-  auto timeout_native =
-    std::chrono::time_point_cast(timeout_time);
-  this_actor::sleep_until(timeout_native.time_since_epoch().count());
-}
-@endcode
-
-Which means it is possible to use (since C++14):
-
-@code{cpp}
-using namespace std::chrono_literals;
-simgrid::s4u::actor::sleep_for(42s);
-@endcode
-
 ## Mutexes and condition variables
 
-## Mutexes
-
-SimGrid has had a C-based API for mutexes and condition variables for
-some time. These mutexes are different from the standard
-system-level mutex (`std::mutex`, `pthread_mutex_t`, etc.) because
-they work at simulation-level. Locking on a simulation mutex does
-not block the thread directly but makes a simcall
-(`simcall_mutex_lock()`) which asks the simulation kernel to wake the calling
-actor when it can get ownership of the mutex. Blocking directly at the
-OS level would deadlock the simulation.
-
-Reusing the C++ standard API for our simulation mutexes has many
-benefits:
-
- * it makes it easier for people familiar with the `std::mutex` to
-   understand and use SimGrid mutexes;
-
- * we can benefit from a proven API;
-
- * we can reuse from generic library code in SimGrid.
-
-We defined a reference-counted `Mutex` class for this (which supports
-the [`Lockable`](http://en.cppreference.com/w/cpp/concept/Lockable)
-requirements, see
-[`[thread.req.lockable.req]`](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf#page=1175)
-in the C++14 standard):
-
-@code{cpp}
-class Mutex {
-  friend ConditionVariable;
-private:
-  friend simgrid::simix::Mutex;
-  simgrid::simix::Mutex* mutex_;
-  Mutex(simgrid::simix::Mutex* mutex) : mutex_(mutex) {}
-public:
-
-  friend void intrusive_ptr_add_ref(Mutex* mutex);
-  friend void intrusive_ptr_release(Mutex* mutex);
-  using Ptr = boost::intrusive_ptr;
-
-  // No copy:
-  Mutex(Mutex const&) = delete;
-  Mutex& operator=(Mutex const&) = delete;
-
-  static Ptr createMutex();
-
-public:
-  void lock();
-  void unlock();
-  bool try_lock();
-};
-@endcode
-
-The methods are simply wrappers around existing simcalls:
-
-@code{cpp}
-void Mutex::lock()
-{
-  simcall_mutex_lock(mutex_);
-}
-@endcode
-
-Using the same API as `std::mutex` (`Lockable`) means we can use existing
-C++-standard code such as `std::unique_lock` or
-`std::lock_guard` for exception-safe mutex handling[^lock]:
-
-@code{cpp}
-{
-  std::lock_guard lock(*mutex);
-  sum += 1;
-}
-@endcode
-
 ### Condition Variables
 
 Similarly SimGrid already had simulation-level condition variables