.. _S4U_doc:

The S4U Interface
#################

The S4U interface (SimGrid for you) mixes the full power of SimGrid
with the full power of C++. This is the preferred interface to describe
abstract algorithms in the domains of Cloud, P2P, HPC, IoT, and similar
settings.

Since v3.20 (June 2018), S4U is the recommended interface for long-term
projects. It is feature complete and can already be used to do everything
that can be done in SimGrid, but it may still evolve slightly, so you may
have to adapt your code in future releases. When this happens, compiling
your code will produce deprecation warnings for 4 releases (one year)
before the old symbols are removed. If you want an API that will never
evolve, you should use the deprecated MSG API instead.

Main Concepts
*************

A typical SimGrid simulation is composed of several |API_s4u_Actors|_ that
execute user-provided functions. The actors have to explicitly use the
S4U interface to express their :ref:`computation `,
:ref:`communication `, :ref:`disk usage `,
and other |API_s4u_Activities|_, so that they get reflected within the
simulator. These activities take place on resources such as
|API_s4u_Hosts|_, |API_s4u_Links|_ and |API_s4u_Storages|_. SimGrid
predicts the time taken by each activity and orchestrates the actors
accordingly, waiting for the completion of these activities.

When **communicating**, data is not directly sent to other actors but
posted onto a |API_s4u_Mailbox|_ that serves as a rendez-vous point between
communicating actors. This means that you don't need to know who you
are talking to: you just post your `Put` request to a mailbox, and it
will be matched with a complementary `Get` request. Alternatively,
actors can interact through **classical synchronization mechanisms**
such as |API_s4u_Barrier|_, |API_s4u_Semaphore|_, |API_s4u_Mutex|_ and
|API_s4u_ConditionVariable|_.

Each actor is located on a simulated |API_s4u_Host|_.
Each host is itself located in a |API_s4u_NetZone|_, which knows the
networking path from one resource to another. Each NetZone is included
in another one, forming a tree of NetZones whose root zone contains the
whole platform. The actors can also be located on a
|API_s4u_VirtualMachine|_ that may restrict the activities it contains
to a limited number of cores. Virtual machines can also be migrated
between hosts.

The :ref:`simgrid::s4u::this_actor ` namespace
provides many helper functions to simplify the code of actors.

- **Global Classes**

  - :ref:`class s4u::Actor `:
    Active entities executing your application.
  - :ref:`class s4u::Engine `:
    Simulation engine (singleton).
  - :ref:`class s4u::Mailbox `:
    Communication rendez-vous.

- **Platform Elements**

  - :ref:`class s4u::Host `:
    Actor location, providing computational power.
  - :ref:`class s4u::Link `:
    Interconnecting hosts.
  - :ref:`class s4u::NetZone `:
    Sub-region of the platform, containing resources (Hosts, Links, etc).
  - :ref:`class s4u::Storage `:
    Resource on which actors can write and read data.
  - :ref:`class s4u::VirtualMachine `:
    Execution containers that can be moved between Hosts.

- **Activities** (:ref:`class s4u::Activity `):
  The things that actors can do on resources

  - :ref:`class s4u::Comm `:
    Communication activity, started on Mailboxes and consuming links.
  - :ref:`class s4u::Exec `:
    Computation activity, started on Host and consuming CPU resources.
  - :ref:`class s4u::Io `:
    I/O activity, started on and consuming Storages.

- **Synchronization Mechanisms**: Classical IPC that actors can use

  - :ref:`class s4u::Barrier `
  - :ref:`class s4u::ConditionVariable `
  - :ref:`class s4u::Mutex `
  - :ref:`class s4u::Semaphore `


.. |API_s4u_Actors| replace:: **Actors**
.. _API_s4u_Actors: #s4u-actor

.. |API_s4u_Activities| replace:: **Activities**
.. _API_s4u_Activities: #s4u-activity

.. |API_s4u_Hosts| replace:: **Hosts**
.. _API_s4u_Hosts: #s4u-host

.. |API_s4u_Links| replace:: **Links**
.. _API_s4u_Links: #s4u-link

.. |API_s4u_Storages| replace:: **Storages**
.. _API_s4u_Storages: #s4u-storage

.. |API_s4u_VirtualMachine| replace:: **VirtualMachine**

.. |API_s4u_Host| replace:: **Host**

.. |API_s4u_Mailbox| replace:: **Mailbox**

.. |API_s4u_Mailboxes| replace:: **Mailboxes**
.. _API_s4u_Mailboxes: #s4u-mailbox

.. |API_s4u_NetZone| replace:: **NetZone**

.. |API_s4u_Barrier| replace:: **Barrier**

.. |API_s4u_Semaphore| replace:: **Semaphore**

.. |API_s4u_ConditionVariable| replace:: **ConditionVariable**

.. |API_s4u_Mutex| replace:: **Mutex**

.. THE EXAMPLES

.. include:: ../../examples/s4u/README.rst

Activities
**********

Activities represent the actions that consume a resource, such as a
:ref:`s4u::Comm ` that consumes the *transmitting power* of
:ref:`s4u::Link ` resources.

=======================
Asynchronous Activities
=======================

Every activity can be either **blocking** or **asynchronous**. For
example, :cpp:func:`s4u::Mailbox::put() `
and :cpp:func:`s4u::Mailbox::get() `
create blocking communications: the actor is blocked until the
completion of that communication. Asynchronous communications do not
block the actor during their execution but progress on their own.

Once your asynchronous activity is started, you can test for its
completion using :cpp:func:`s4u::Activity::test() `.
This function returns ``true`` if the activity has already completed.
You can also use :cpp:func:`s4u::Activity::wait() `
to block until the completion of the activity. To wait for at most a given amount of time,
use :cpp:func:`s4u::Activity::wait_for() `.
Finally, to wait at most until a specified time limit, use
:cpp:func:`s4u::Activity::wait_until() `.

.. todo::

   wait_for and wait_until are currently not implemented for Exec and Io activities.

Every kind of activity can be asynchronous:

- :ref:`s4u::CommPtr ` are created with
  :cpp:func:`s4u::Mailbox::put_async() ` and
  :cpp:func:`s4u::Mailbox::get_async() `.
- :ref:`s4u::IoPtr ` are created with
  :cpp:func:`s4u::Storage::read_async() ` and
  :cpp:func:`s4u::Storage::write_async() `.
- :ref:`s4u::ExecPtr ` are created with
  :cpp:func:`s4u::Host::exec_async() `.
- In the future, it will become possible to have asynchronous IPC
  such as asynchronous mutex lock requests.

The following example shows how to have several concurrent
communications ongoing. First, declare a vector in which to store the
ongoing communications. It is also useful to have a vector of
mailboxes.

.. literalinclude:: ../../examples/s4u/async-waitall/s4u-async-waitall.cpp
   :language: c++
   :start-after: init-begin
   :end-before: init-end
   :dedent: 4

Then, start all the communications that should occur concurrently with
:cpp:func:`s4u::Mailbox::put_async() `.
Finally, the actor waits for the completion of all of them at once
with
:cpp:func:`s4u::Comm::wait_all() `.

.. literalinclude:: ../../examples/s4u/async-waitall/s4u-async-waitall.cpp
   :language: c++
   :start-after: put-begin
   :end-before: put-end
   :dedent: 4

=====================
Activities Life cycle
=====================

Sometimes, you want to change the setting of an activity before it even starts.

.. todo:: write this section

.. _s4u_mailbox:

Mailboxes
*********

Please also refer to the :ref:`API reference for s4u::Mailbox `.

===================
What are Mailboxes?
===================

|API_s4u_Mailboxes|_ are rendez-vous points for network communications,
similar to URLs on which you could post and retrieve data. Actually,
the mailboxes are not involved in the communication once it starts;
they only serve to find the contact with which you want to communicate.

They are similar to many common things: The phone number, which allows
the caller to find the receiver. The twitter hashtag, which helps
senders and receivers find each other. In TCP, the pair
``{host name, host port}`` to which you can connect to find your peer.
In HTTP, URLs through which the clients can connect to the servers.
In ZeroMQ, the queues are used to match senders and receivers.

One big difference with most of these systems is that no actor is the
exclusive owner of a mailbox, neither in sending nor in receiving.
Many actors can send into and/or receive from the same mailbox. TCP
socket ports, for example, are shared on the sender side but exclusive on
the receiver side (only one process can receive from a given socket at
a given point in time).

A big difference with TCP sockets or MPI communications is that
communications do not start right away after a
:cpp:func:`Mailbox::put() `, but wait
for the corresponding :cpp:func:`Mailbox::get() `.
You can change this by :ref:`declaring a receiving actor `.

A big difference with twitter hashtags is that SimGrid does not offer
easy support to broadcast a given message to many
receivers. So that would be like a twitter tag where each message is
consumed by the first receiver.

A big difference with the ZeroMQ queues is that you cannot filter
on the data you want to get from the mailbox. To model such settings
in SimGrid, you'd have one mailbox per potential topic, and subscribe
to each topic individually with a
:cpp:func:`get_async() ` on each mailbox.
Then, use :cpp:func:`Comm::wait_any() `
to get the first message on any of the mailboxes you are subscribed to.

The mailboxes are not located on the network, and you can access
them without any latency. The network delays are only related to the
location of the sender and receiver once the match between them is
done on the mailbox. This is just like the phone number that you can
use locally, and the geographical distance only comes into play once
you start the communication by dialing this number.

=====================
How to use Mailboxes?
=====================

You can retrieve any existing mailbox from its name (which is a
unique string, just like a twitter tag). This results in a
versatile mechanism that can be used to model many different
situations.
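
To make the rendez-vous idea more concrete, here is a small standalone toy
model written in plain C++17. It is **not** SimGrid code and does not use the
S4U API: ``ToyMailbox``, ``ToyRegistry`` and ``toy_demo()`` are hypothetical
names invented for this sketch, and the model assumes that requests are paired
in their order of arrival. It only illustrates three properties described
above: a mailbox is found by its name, nobody owns it exclusively, and each
message is consumed by exactly one receiver.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a mailbox. This is NOT simgrid::s4u::Mailbox.
class ToyMailbox {
  std::deque<std::string> pending_puts_; // payloads waiting for a receiver
  std::deque<std::string> pending_gets_; // receiver names waiting for a payload

public:
  // A match pairs a receiver name (first) with the payload it obtains (second).
  using Match = std::pair<std::string, std::string>;

  // Post a send request; returns a match only if a receiver was already waiting.
  std::optional<Match> put(const std::string& payload) {
    // Assumed invariant of this toy: only one kind of request is ever queued.
    assert(pending_puts_.empty() || pending_gets_.empty());
    if (pending_gets_.empty()) {
      pending_puts_.push_back(payload);
      return std::nullopt;
    }
    Match m{pending_gets_.front(), payload};
    pending_gets_.pop_front();
    return m;
  }

  // Post a receive request; returns a match only if a payload was already waiting.
  std::optional<Match> get(const std::string& receiver) {
    assert(pending_puts_.empty() || pending_gets_.empty());
    if (pending_puts_.empty()) {
      pending_gets_.push_back(receiver);
      return std::nullopt;
    }
    Match m{receiver, pending_puts_.front()};
    pending_puts_.pop_front();
    return m;
  }
};

// Toy registry: the same name always designates the same mailbox.
class ToyRegistry {
  std::map<std::string, ToyMailbox> boxes_;

public:
  ToyMailbox& by_name(const std::string& name) { return boxes_[name]; }
};

// Replays a small scenario and returns one "receiver <- payload" line per match.
std::vector<std::string> toy_demo() {
  ToyRegistry registry;
  std::vector<std::string> log;
  auto record = [&log](const std::optional<ToyMailbox::Match>& m) {
    if (m)
      log.push_back(m->first + " <- " + m->second);
  };

  // Two sends are posted before any receiver shows up: nothing is matched yet.
  record(registry.by_name("worker").put("task-1"));
  record(registry.by_name("worker").put("task-2"));
  // Any actor may receive from the mailbox; each payload goes to one receiver.
  record(registry.by_name("worker").get("alice"));
  record(registry.by_name("worker").get("bob"));
  // A receiver that arrives first waits until a send is posted.
  record(registry.by_name("worker").get("carol"));
  record(registry.by_name("worker").put("task-3"));
  return log;
}
```

Calling ``toy_demo()`` from a ``main()`` of your own yields three matches, one
per payload. Real S4U mailboxes additionally take simulated time and the
location of both peers into account, which this sketch deliberately ignores.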

To model classical socket communications, use "hostname:port" as
mailbox names, and make sure that only one actor reads from a given
mailbox. This does not make it easy to build a perfectly realistic
model of the TCP sockets, but in most cases, this system is too
cumbersome for your simulations anyway. You probably want something
simpler, which turns out to be easy to build with the mailboxes.

Many SimGrid examples use a sort of yellow page system where the
mailbox names are the name of the service (such as "worker",
"master" or "reducer"). That way, you don't have to know where your
peer is located to contact it. You don't even need its name. Its
function is enough for that. This also gives you some sort of load
balancing for free if more than one actor pulls from the mailbox:
the first actor that can deal with the request will handle it.

=========================================
How put() and get() Requests are Matched?
=========================================

The matching algorithm is simple: first come, first served. When a new
send arrives, it matches the oldest enqueued receive. If no receive is
currently enqueued, then the incoming send is enqueued. As you can
see, the mailbox cannot contain both send and receive requests: all
enqueued requests must be of the same sort.

.. _s4u_receiving_actor:

===========================
Declaring a Receiving Actor
===========================

The last twist is that by default in the simulator, the data starts
to be exchanged only when both the sender and the receiver are
declared (it waits until both :cpp:func:`put() `
and :cpp:func:`get() ` are posted).
In TCP, since you establish connections beforehand, the data starts to
flow as soon as the sender posts it, even if the receiver did not post
its :cpp:func:`recv() ` yet.

To model this in SimGrid, you can declare a specific receiver to a
given mailbox (with the function
:cpp:func:`set_receiver() `).
That way, any :cpp:func:`put() `
posted to that mailbox will start as soon as possible, and the data
will already be there on the receiver host when the receiver actor
posts its :cpp:func:`get() `.

Note that being the permanent receiver of a mailbox prevents an actor
from being garbage-collected. If your simulation creates many short-lived
actors that are marked as permanent receivers, you should call
``mailbox->set_receiver(nullptr)`` by the end of the actors so that their
memory gets properly reclaimed. This call should be at the end of the
actor's function, not in an on_exit callback.

Memory Management
*****************

For the sake of simplicity, we use
`RAII `_
everywhere in S4U. This is an idiom where resources are automatically
managed through the context. Provided that you never manipulate
objects of type Foo directly but always FooPtr references (which are
defined as `boost::intrusive_ptr `_
), you will never have to explicitly release the resources that you use
nor free the memory of unused objects.

Here is a little example:

.. code-block:: cpp

   #include <simgrid/s4u.hpp>

   void myFunc()
   {
     simgrid::s4u::MutexPtr mutex = simgrid::s4u::Mutex::create(); // Too bad we cannot use `new`

     mutex->lock();   // use the mutex as a simple reference
     //  bla bla
     mutex->unlock();

   } // The mutex gets automatically freed because the only existing reference gets out of scope

API Reference
*************

.. _API_s4u_this_actor:

=========================
namespace s4u::this_actor
=========================

.. doxygennamespace:: simgrid::s4u::this_actor

.. _API_s4u_Activity:

=============
s4u::Activity
=============

.. doxygenclass:: simgrid::s4u::Activity
   :members:
   :protected-members:
   :undoc-members:

.. _API_s4u_Actor:

==========
s4u::Actor
==========

.. doxygentypedef:: ActorPtr

.. doxygentypedef:: aid_t

.. doxygenclass:: simgrid::s4u::Actor
   :members:
   :protected-members:
   :undoc-members:

.. _API_s4u_Barrier:

============
s4u::Barrier
============

.. doxygentypedef:: BarrierPtr

.. doxygenclass:: simgrid::s4u::Barrier
   :members:
   :protected-members:
   :undoc-members:

.. _API_s4u_Comm:

=========
s4u::Comm
=========

.. doxygentypedef:: CommPtr

.. doxygenclass:: simgrid::s4u::Comm
   :members:
   :protected-members:
   :undoc-members:

.. _API_s4u_ConditionVariable:

======================
s4u::ConditionVariable
======================

.. doxygentypedef:: ConditionVariablePtr

.. doxygenclass:: simgrid::s4u::ConditionVariable
   :members:
   :protected-members:
   :undoc-members:

.. _API_s4u_Engine:

===========
s4u::Engine
===========

.. doxygenclass:: simgrid::s4u::Engine
   :members:
   :protected-members:
   :undoc-members:

.. _API_s4u_Exec:

=========
s4u::Exec
=========

.. doxygentypedef:: ExecPtr

.. doxygenclass:: simgrid::s4u::Exec
   :members:
   :protected-members:
   :undoc-members:

.. _API_s4u_Host:

=========
s4u::Host
=========

.. doxygenclass:: simgrid::s4u::Host
   :members:
   :protected-members:
   :undoc-members:

.. _API_s4u_Io:

=======
s4u::Io
=======

.. doxygentypedef:: IoPtr

.. doxygenclass:: simgrid::s4u::Io
   :members:
   :protected-members:
   :undoc-members:

.. _API_s4u_Link:

=========
s4u::Link
=========

.. doxygenclass:: simgrid::s4u::Link
   :members:
   :protected-members:
   :undoc-members:

.. _API_s4u_Mailbox:

============
s4u::Mailbox
============

Please also refer to the :ref:`full doc on s4u::Mailbox `.

.. doxygentypedef:: MailboxPtr

.. doxygenclass:: simgrid::s4u::Mailbox
   :members:
   :protected-members:
   :undoc-members:

.. _API_s4u_Mutex:

==========
s4u::Mutex
==========

.. doxygentypedef:: MutexPtr

.. doxygenclass:: simgrid::s4u::Mutex
   :members:
   :protected-members:
   :undoc-members:

.. _API_s4u_NetZone:

============
s4u::NetZone
============

.. doxygenclass:: simgrid::s4u::NetZone
   :members:
   :protected-members:
   :undoc-members:

.. _API_s4u_Semaphore:

==============
s4u::Semaphore
==============

.. doxygentypedef:: SemaphorePtr

.. doxygenclass:: simgrid::s4u::Semaphore
   :members:
   :protected-members:
   :undoc-members:

.. _API_s4u_Storage:

============
s4u::Storage
============

.. doxygenclass:: simgrid::s4u::Storage
   :members:
   :protected-members:
   :undoc-members:

.. _API_s4u_VirtualMachine:

===================
s4u::VirtualMachine
===================

.. doxygenclass:: simgrid::s4u::VirtualMachine
   :members:
   :protected-members:
   :undoc-members: