X-Git-Url: http://info.iut-bm.univ-fcomte.fr/pub/gitweb/simgrid.git/blobdiff_plain/ce6c3f5dd63f79859c7c243f0fb714b49b6b60a8..08f744b9a55745ac1b1dcf0ed2ea735471cd7f89:/doc/doxygen/uhood_switch.doc
diff --git a/doc/doxygen/uhood_switch.doc b/doc/doxygen/uhood_switch.doc
index 814bf1ca4b..72a86050ea 100644
--- a/doc/doxygen/uhood_switch.doc
+++ b/doc/doxygen/uhood_switch.doc
@@ -1,5 +1,7 @@
 /*! @page uhood_switch Process Synchronizations and Context Switching

+@tableofcontents
+
 @section uhood_switch_DES SimGrid as an Operating System

 SimGrid is a discrete event simulator of distributed systems: it does
@@ -25,7 +27,7 @@ Mimicking the OS behavior may seem over-engineered here, but this is
 mandatory to the model-checker. The simcalls, representing actors'
 actions, are the transitions of the formal system. Verifying the
 system requires to manipulate these transitions explicitly. This also
-allows to run safely the actors in parallel, even if this is less
+allows one to run the actors safely in parallel, even if this is less
 commonly used by our users.

 So, the key ideas here are:
@@ -96,7 +98,7 @@ producer calls `promise.set_value(42)` or `promise.set_exception(e)` in
 order to set the result which will be made available to the consumer by
 `future.get()`.

-### Which future do we need?
+@subsection uhood_switch_futures_needs Which future do we need?

 The blocking API provided by the standard C++11 futures does not suit
 our needs since the simulation kernel cannot block, and since
@@ -129,7 +131,7 @@ API, with a few differences:
 - Some features of the standard (such as shared futures) are not
   needed in our context, and thus not considered here.
-### Implementing `Future` and `Promise`
+@subsection uhood_switch_futures_implem Implementing `Future` and `Promise`

 The `simgrid::kernel::Future` and `simgrid::kernel::Promise` use a
 shared state defined as follows:
@@ -198,7 +200,7 @@ The crux of `future.then()` is:
 @code{cpp}
 template
 template
-auto simgrid::kernel::Future::thenNoUnwrap(F continuation)
+auto simgrid::kernel::Future::then_no_unwrap(F continuation)
 -> Future
 {
   typedef decltype(continuation(std::move(*this))) R;
@@ -272,27 +274,44 @@ T simgrid::kernel::FutureState::get()
 }
 @endcode

-## Generic simcalls
+@section uhood_switch_simcalls Implementing the simcalls
+
+So a simcall is a way for the actor to push a request to the
+simulation kernel and yield the control until the request is
+fulfilled. The performance requirements are very high because
+the actors usually do an inordinate amount of simcalls during the
+simulation.
+
+As for real syscalls, the basic idea is to write the wanted call and
+its arguments in a memory area that is specific to the actor, and
+yield the control to the simulation kernel. Once in kernel mode, the
+simcalls of each demanding actor are evaluated sequentially in a
+strictly reproducible order. This makes the whole simulation
+reproducible.

-### Motivation

-Simcalls are not so easy to understand and adding a new one is not so easy
-either. In order to add one simcall, one has to first
-add it to the [list of simcalls](https://github.com/simgrid/simgrid/blob/4ae2fd01d8cc55bf83654e29f294335e3cb1f022/src/simix/simcalls.in)
-which looks like this:
+@subsection uhood_switch_simcalls_v2 The historical way
+
+In the very first implementation, everything was written by hand and
+highly optimized, making our software very hard to maintain and
+evolve. We decided to sacrifice some performance for
+maintainability. In a second try (that is still in use in SimGrid
+v3.13), we had a lot of boiler code generated from a python script,
+taking the [list of simcalls](https://github.com/simgrid/simgrid/blob/4ae2fd01d8cc55bf83654e29f294335e3cb1f022/src/simix/simcalls.in)
+as input. It looks like this:

 @code{cpp}
 # This looks like C++ but it is a basic IDL-like language
 # (one definition per line) parsed by a python script:

-void process_kill(smx_process_t process);
+void process_kill(smx_actor_t process);
 void process_killall(int reset_pid);
-void process_cleanup(smx_process_t process) [[nohandler]];
-void process_suspend(smx_process_t process) [[block]];
-void process_resume(smx_process_t process);
-void process_set_host(smx_process_t process, sg_host_t dest);
-int process_is_suspended(smx_process_t process) [[nohandler]];
-int process_join(smx_process_t process, double timeout) [[block]];
+void process_cleanup(smx_actor_t process) [[nohandler]];
+void process_suspend(smx_actor_t process) [[block]];
+void process_resume(smx_actor_t process);
+void process_set_host(smx_actor_t process, sg_host_t dest);
+int process_is_suspended(smx_actor_t process) [[nohandler]];
+int process_join(smx_actor_t process, double timeout) [[block]];
 int process_sleep(double duration) [[block]];

 smx_mutex_t mutex_init();
@@ -311,7 +330,7 @@ struct s_smx_simcall {
   // Simcall number:
   e_smx_simcall_t call;
   // Issuing actor:
-  smx_process_t issuer;
+  smx_actor_t issuer;
   // Arguments of the simcall:
   union u_smx_scalar args[11];
   // Result of the simcall:
@@ -342,9 +361,9 @@ union u_smx_scalar {
 };
 @endcode

-Then one has to call (manually:cry:) a
-[Python script](https://github.com/simgrid/simgrid/blob/4ae2fd01d8cc55bf83654e29f294335e3cb1f022/src/simix/simcalls.py)
-which generates a bunch of C++ files:
+When manually calling the relevant [Python
+script](https://github.com/simgrid/simgrid/blob/4ae2fd01d8cc55bf83654e29f294335e3cb1f022/src/simix/simcalls.py),
+this generates a bunch of C++ files:

 * an enum of all
   the [simcall numbers](https://github.com/simgrid/simgrid/blob/4ae2fd01d8cc55bf83654e29f294335e3cb1f022/src/simix/popping_enum.h#L19);
@@ -352,7 +371,7 @@ which generates a bunch of C++ files:
   responsible for wrapping the parameters in the `struct s_smx_simcall`;
   and wrapping out the result;

-* [accessors](https://github.com/simgrid/simgrid/blob/4ae2fd01d8cc55bf83654e29f294335e3cb1f022/src/simix/popping_accessors.h)
+* [accessors](https://github.com/simgrid/simgrid/blob/4ae2fd01d8cc55bf83654e29f294335e3cb1f022/src/simix/popping_accessors.hpp)
   to get/set values of of `struct s_smx_simcall`;

 * a simulation-kernel-side [big switch](https://github.com/simgrid/simgrid/blob/4ae2fd01d8cc55bf83654e29f294335e3cb1f022/src/simix/popping_generated.cpp#L106)
@@ -360,7 +379,7 @@ which generates a bunch of C++ files:

 Then one has to write the code of the kernel side handler for the simcall
 and the code of the simcall itself (which calls the code-generated
-marshaling/unmarshaling stuff):sob:.
+marshaling/unmarshaling stuff).

 In order to simplify this process, we added two generic simcalls which can
 be used to execute a function in the simulation kernel:
@@ -428,7 +447,7 @@ xbt_dict_t Host::properties() {
 }
 @endcode

-### Blocking simcall
+### Blocking simcall {#uhood_switch_v2_blocking}

 The second generic simcall (`simcall_run_blocking()`) executes a function in
 the SimGrid simulation kernel immediately but does not wake up the calling actor
@@ -449,20 +468,20 @@ kernel which will wake up the actor (with `simgrid::simix::unblock(actor)`)
 when the operation is completed.

 This is wrapped in a higher-level primitive as well. The
-`kernelSync()` function expects a function-object which is executed
+`kernel_sync()` function expects a function-object which is executed
 immediately in the simulation kernel and returns a `Future`.
 The simulator blocks the actor and resumes it when the `Future` becomes ready
 with its result:

 @code{cpp}
 template
-auto kernelSync(F code) -> decltype(code().get())
+auto kernel_sync(F code) -> decltype(code().get())
 {
   typedef decltype(code().get()) T;
   if (SIMIX_is_maestro())
     xbt_die("Can't execute blocking call in kernel mode");

-  smx_process_t self = SIMIX_process_self();
+  smx_actor_t self = SIMIX_process_self();
   simgrid::xbt::Result result;

   simcall_run_blocking([&result, self, &code]{
@@ -491,7 +510,7 @@ auto kernelSync(F code) -> decltype(code().get())
 A contrived example of this would be:

 @code{cpp}
-int res = simgrid::simix::kernelSync([&] {
+int res = simgrid::simix::kernel_sync([&] {
   return kernel_wait_until(30).then(
     [](simgrid::kernel::Future future) {
       return 42;
@@ -500,12 +519,12 @@ int res = simgrid::simix::kernelSync([&] {
 });
 @endcode

-### Asynchronous operations
+### Asynchronous operations {#uhood_switch_v2_async}

-We can write the related `kernelAsync()` which wakes up the actor immediately
+We can write the related `kernel_async()` which wakes up the actor immediately
 and returns a future to the actor.
 As this future is used in the actor context, it is a different future
-(`simgrid::simix::Future` instead of `simgrid::kernel::Furuere`)
+(`simgrid::simix::Future` instead of `simgrid::kernel::Future`)
 which implements a C++11 `std::future` wait-based API:

 @code{cpp}
@@ -532,7 +551,7 @@ T simgrid::simix::Future::get()
 {
   if (!valid())
     throw std::future_error(std::future_errc::no_state);
-  smx_process_t self = SIMIX_process_self();
+  smx_actor_t self = SIMIX_process_self();
   simgrid::xbt::Result result;
   simcall_run_blocking([this, &result, self]{
     try {
@@ -553,12 +572,12 @@ T simgrid::simix::Future::get()
 }
 @endcode

-`kernelAsync()` simply :wink: calls `kernelImmediate()` and wraps the
+`kernel_async()` simply :wink: calls `kernelImmediate()` and wraps the
 `simgrid::kernel::Future` into a `simgrid::simix::Future`:

 @code{cpp}
 template
-auto kernelAsync(F code)
+auto kernel_async(F code)
 -> Future
 {
   typedef decltype(code().get()) T;
@@ -575,7 +594,7 @@ auto kernelAsync(F code)
 A contrived example of this would be:

 @code{cpp}
-simgrid::simix::Future future = simgrid::simix::kernelSync([&] {
+simgrid::simix::Future future = simgrid::simix::kernel_sync([&] {
   return kernel_wait_until(30).then(
     [](simgrid::kernel::Future future) {
       return 42;
@@ -586,186 +605,22 @@ do_some_stuff();
 int res = future.get();
 @endcode

-`kernelSync()` could be rewritten as:
+`kernel_sync()` could be rewritten as:

 @code{cpp}
 template
-auto kernelSync(F code) -> decltype(code().get())
+auto kernel_sync(F code) -> decltype(code().get())
 {
-  return kernelAsync(std::move(code)).get();
+  return kernel_async(std::move(code)).get();
 }
 @endcode

 The semantic is equivalent but this form would require two simcalls
-instead of one to do the same job (one in `kernelAsync()` and one in
+instead of one to do the same job (one in `kernel_async()` and one in
 `.get()`).
-## Representing the simulated time
-
-SimGrid uses `double` for representing the simulated time:
-
-* durations are expressed in seconds;
-
-* timepoints are expressed as seconds from the beginning of the simulation.
-
-In contrast, all the C++ APIs use `std::chrono::duration` and
-`std::chrono::time_point`. They are used in:
-
-* `std::this_thread::wait_for()` and `std::this_thread::wait_until()`;
-
-* `future.wait_for()` and `future.wait_until()`;
-
-* `condvar.wait_for()` and `condvar.wait_until()`.
-
-We can define `future.wait_for(duration)` and `future.wait_until(timepoint)`
-for our futures but for better compatibility with standard C++ code, we might
-want to define versions expecting `std::chrono::duration` and
-`std::chrono::time_point`.
-
-For time points, we need to define a clock (which meets the
-[TrivialClock](http://en.cppreference.com/w/cpp/concept/TrivialClock)
-requirements, see
-[`[time.clock.req]`](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf#page=642)
-working in the simulated time in the C++14 standard):
-
-@code{cpp}
-struct SimulationClock {
-  using rep = double;
-  using period = std::ratio<1>;
-  using duration = std::chrono::duration;
-  using time_point = std::chrono::time_point;
-  static constexpr bool is_steady = true;
-  static time_point now()
-  {
-    return time_point(duration(SIMIX_get_clock()));
-  }
-};
-@endcode
-
-A time point in the simulation is a time point using this clock:
-
-@code{cpp}
-template
-using SimulationTimePoint =
-  std::chrono::time_point;
-@endcode
-
-This is used for example in `simgrid::s4u::this_actor::sleep_for()` and
-`simgrid::s4u::this_actor::sleep_until()`:
-
-@code{cpp}
-void sleep_for(double duration)
-{
-  if (duration > 0)
-    simcall_process_sleep(duration);
-}
-
-void sleep_until(double timeout)
-{
-  double now = SIMIX_get_clock();
-  if (timeout > now)
-    simcall_process_sleep(timeout - now);
-}
-
-template
-void sleep_for(std::chrono::duration duration)
-{
-  auto seconds = std::chrono::duration_cast(duration);
-  this_actor::sleep_for(seconds.count());
-}
-
-template
-void sleep_until(const SimulationTimePoint& timeout_time)
-{
-  auto timeout_native =
-    std::chrono::time_point_cast(timeout_time);
-  this_actor::sleep_until(timeout_native.time_since_epoch().count());
-}
-@endcode
-
-Which means it is possible to use (since C++14):
-
-@code{cpp}
-using namespace std::chrono_literals;
-simgrid::s4u::actor::sleep_for(42s);
-@endcode
-
 ## Mutexes and condition variables

-## Mutexes
-
-SimGrid has had a C-based API for mutexes and condition variables for
-some time. These mutexes are different from the standard
-system-level mutex (`std::mutex`, `pthread_mutex_t`, etc.) because
-they work at simulation-level. Locking on a simulation mutex does
-not block the thread directly but makes a simcall
-(`simcall_mutex_lock()`) which asks the simulation kernel to wake the calling
-actor when it can get ownership of the mutex. Blocking directly at the
-OS level would deadlock the simulation.
-
-Reusing the C++ standard API for our simulation mutexes has many
-benefits:
-
- * it makes it easier for people familiar with the `std::mutex` to
-   understand and use SimGrid mutexes;
-
- * we can benefit from a proven API;
-
- * we can reuse from generic library code in SimGrid.
-We defined a reference-counted `Mutex` class for this (which supports
-the [`Lockable`](http://en.cppreference.com/w/cpp/concept/Lockable)
-requirements, see
-[`[thread.req.lockable.req]`](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf#page=1175)
-in the C++14 standard):
-
-@code{cpp}
-class Mutex {
-  friend ConditionVariable;
-private:
-  friend simgrid::simix::Mutex;
-  simgrid::simix::Mutex* mutex_;
-  Mutex(simgrid::simix::Mutex* mutex) : mutex_(mutex) {}
-public:
-
-  friend void intrusive_ptr_add_ref(Mutex* mutex);
-  friend void intrusive_ptr_release(Mutex* mutex);
-  using Ptr = boost::intrusive_ptr;
-
-  // No copy:
-  Mutex(Mutex const&) = delete;
-  Mutex& operator=(Mutex const&) = delete;
-
-  static Ptr createMutex();
-
-public:
-  void lock();
-  void unlock();
-  bool try_lock();
-};
-@endcode
-
-The methods are simply wrappers around existing simcalls:
-
-@code{cpp}
-void Mutex::lock()
-{
-  simcall_mutex_lock(mutex_);
-}
-@endcode
-
-Using the same API as `std::mutex` (`Lockable`) means we can use existing
-C++-standard code such as `std::unique_lock` or
-`std::lock_guard` for exception-safe mutex handling[^lock]:
-
-@code{cpp}
-{
-  std::lock_guard lock(*mutex);
-  sum += 1;
-}
-@endcode
-
 ### Condition Variables

 Similarly SimGrid already had simulation-level condition variables
@@ -774,7 +629,7 @@ which can be exposed using the same API as `std::condition_variable`:
 @code{cpp}
 class ConditionVariable {
 private:
-  friend s_smx_cond;
+  friend s_smx_cond_t;
   smx_cond_t cond_;
   ConditionVariable(smx_cond_t cond) : cond_(cond) {}
 public:
@@ -853,20 +708,15 @@ std::cv_status ConditionVariable::wait_for(
     simcall_cond_wait_timeout(cond_, lock.mutex()->mutex_, timeout);
     return std::cv_status::no_timeout;
   }
-  catch (xbt_ex& e) {
-
+  catch (const simgrid::TimeoutException& e) {
     // If the exception was a timeout, we have to take the lock again:
-    if (e.category == timeout_error) {
-      try {
-        lock.mutex()->lock();
-        return std::cv_status::timeout;
-      }
-      catch (...) {
-        std::terminate();
-      }
+    try {
+      lock.mutex()->lock();
+      return std::cv_status::timeout;
+    }
+    catch (...) {
+      std::terminate();
     }
-
-    std::terminate();
   }
   catch (...) {
     std::terminate();
@@ -914,7 +764,7 @@ We wrote two future implementations based on the `std::future` API:
 * the second one is a wait-based (`future.get()`) future used in the actors
   which waits using a simcall.

-These futures are used to implement `kernelSync()` and `kernelAsync()` which
+These futures are used to implement `kernel_sync()` and `kernel_async()` which
 expose asynchronous operations in the simulation kernel to the actors.

 In addition, we wrote variations of some other C++ standard library
@@ -960,30 +810,14 @@ single-object without shared-state and synchronisation:
 @code{cpp}
 template
 class Result {
-  enum class ResultStatus {
-    invalid,
-    value,
-    exception,
-  };
 public:
-  Result();
-  ~Result();
-  Result(Result const& that);
-  Result& operator=(Result const& that);
-  Result(Result&& that);
-  Result& operator=(Result&& that);
   bool is_valid() const;
-  void reset();
   void set_exception(std::exception_ptr e);
   void set_value(T&& value);
   void set_value(T const& value);
   T get();
 private:
-  ResultStatus status_ = ResultStatus::invalid;
-  union {
-    T value_;
-    std::exception_ptr exception_;
-  };
+  boost::variant value_;
 };
 @endcode~

@@ -1119,4 +953,4 @@ auto makeTask(F code, Args... args)
 in the simulation which we would like to avoid.
 `std::try_lock()` should be safe to use though.

-*/
\ No newline at end of file
+*/