The different models rely on a linear inequalities solver to share
the underlying resources. SimGrid allows you to change the solver, but
be cautious, **don't change it unless you are 100% sure**.
-
+
- items ``cpu/solver``, ``network/solver``, ``disk/solver`` and ``host/solver``
allow you to change the solver for each model:
Configuring SMPI
----------------
-The SMPI interface provides several specific configuration items.
+The SMPI interface provides several specific configuration items.
These are not easy to see with ``--help-cfg``, since SMPI binaries are usually launched through the ``smpirun`` script.
.. _cfg=smpi/host-speed:
Alternative path into which Eigen3 should be searched for.
SIMGRID_PYTHON_LIBDIR (auto-detected)
- Where to install the Python module library. By default, it is set to the cmake Python3_SITEARCH variable if installing to /usr,
- and a modified version of that variable if installing to another path. Just force another value if the auto-detected default
+ Where to install the Python module library. By default, it is set to the cmake Python3_SITEARCH variable if installing to /usr,
+ and a modified version of that variable if installing to another path. Just force another value if the auto-detected default
does not fit your setup.
SMPI_C_FLAGS, SMPI_CXX_FLAGS, SMPI_Fortran_FLAGS (string)
- The **simulated platform**. This is a description of a given
distributed system (machines, links, disks, clusters, etc). Most of
- the platform files are written in XML but a new C++ programmatic
- interface has recently been introduced. SimGrid makes it easy to
- augment the Simulated Platform with a Dynamic Scenario where for
- example the links are slowed down (because of external usage) or the
- machines fail. You even have support to specify the applicative
+ the platform files are written in XML but a new C++ programmatic
+ interface has recently been introduced. SimGrid makes it easy to
+ augment the Simulated Platform with a Dynamic Scenario where for
+ example the links are slowed down (because of external usage) or the
+ machines fail. You even have support to specify the applicative
workload that you want to feed to your application
:ref:`(more info) <platform>`.
ways to model it, SimGrid does not make any modeling choice for you but
the most obvious ones.
-Any actor running on a host that is shut down will be killed and all
-its activities will be automatically canceled. If the actor killed was
-marked as auto-restartable (with :cpp:func:`simgrid::s4u::Actor::set_auto_restart`),
+Any actor running on a host that is shut down will be killed and all
+its activities will be automatically canceled. If the actor killed was
+marked as auto-restartable (with :cpp:func:`simgrid::s4u::Actor::set_auto_restart`),
it will start anew with the same parameters when the host boots back up.
By default, shutdowns and boots are instantaneous. If you want to
FMI-based models
****************
-`FMI <https://fmi-standard.org/>`_ is a standard to exchange models between simulators. If you want to plug such a model
-into SimGrid, you need the `SimGrid-FMI external plugin <https://framagit.org/simgrid/simgrid-FMI>`_.
+`FMI <https://fmi-standard.org/>`_ is a standard to exchange models between simulators. If you want to plug such a model
+into SimGrid, you need the `SimGrid-FMI external plugin <https://framagit.org/simgrid/simgrid-FMI>`_.
There is a specific `documentation <https://simgrid.frama.io/simgrid-FMI/index.html>`_ available for the plugin.
-This was used to accurately study a *Smart grid* through co-simulation: `PandaPower <http://www.pandapower.org/>`_ was
-used to simulate the power grid, `ns-3 <https://nsnam.org/>`_ was used to simulate the communication network while SimGrid was
-used to simulate the IT infrastructure. Please also refer to the `relevant publication <https://hal.science/hal-03217562>`_
+This was used to accurately study a *Smart grid* through co-simulation: `PandaPower <http://www.pandapower.org/>`_ was
+used to simulate the power grid, `ns-3 <https://nsnam.org/>`_ was used to simulate the communication network while SimGrid was
+used to simulate the IT infrastructure. Please also refer to the `relevant publication <https://hal.science/hal-03217562>`_
for more details.
.. _models_other:
ns-3
====
-When using :ref:`models_ns3`, SimGrid configures the ns-3 simulator according to the configured platform.
+When using :ref:`models_ns3`, SimGrid configures the ns-3 simulator according to the configured platform.
Since ns-3 uses a shortest path algorithm on its side, all routes must be of length 1.
.. _pf_routes:
**On the maintenance front,** we removed the ancient MSG interface whose end-of-life was scheduled for 2020, the Java
bindings that were MSG-only, and support for native builds on Windows (WSL is now required). Keeping SimGrid alive while
-adding new features requires removing old, unused stuff. The very rare users impacted by these removals are urged to
+adding new features requires removing old, unused stuff. The very rare users impacted by these removals are urged to
move to the new API and systems.
**On the model front,** we realized an idea that has been on the back of our minds for quite some time. The question
application. We will detail each part of the code and the necessary
configuration to make it work. After this tour, several exercises
are proposed to let you discover some of the SimGrid features, hands
-on the keyboard. This practical session will be given in C++ or Python,
+on the keyboard. This practical session will be given in C++ or Python,
which you are supposed to know beforehand.
.. image:: /tuto_s4u/img/intro.svg
:align: center
-The provided code dispatches these tasks in `round-robin scheduling <https://en.wikipedia.org/wiki/Round-robin_scheduling>`_,
+The provided code dispatches these tasks in `round-robin scheduling <https://en.wikipedia.org/wiki/Round-robin_scheduling>`_,
i.e. in circular order: tasks are dispatched to each worker one after the other, until all tasks are dispatched.
You will improve this scheme later in this tutorial.
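As a minimal sketch of the round-robin idea, independent of SimGrid, the mapping from tasks to workers can be computed with a simple modulo. The function name ``round_robin`` is hypothetical and only serves this illustration.

```python
def round_robin(tasks_count, worker_count):
    """Return, for each task index, the index of the worker that receives it."""
    # Task i goes to worker i modulo the number of workers (circular order).
    return [task % worker_count for task in range(tasks_count)]


if __name__ == "__main__":
    # Seven tasks dispatched over three workers.
    print(round_robin(7, 3))
```

Every worker receives roughly the same number of tasks, regardless of how fast it processes them.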
.. group-tab:: Python
- That being said, an algorithm alone is not enough to define a simulation:
+ That being said, an algorithm alone is not enough to define a simulation:
you need a main block to set up the simulation and its components as follows.
This code creates a SimGrid simulation engine (on line 4), registers the actor
functions to the engine (on lines 7 and 8), loads the simulated platform
Using Docker
............
-The easiest way to take the tutorial is to use the dedicated Docker image.
+The easiest way to take the tutorial is to use the dedicated Docker image.
Once you have `installed Docker itself <https://docs.docker.com/install/>`_, simply do the following:
.. code-block:: console
.. group-tab:: Python
To take the tutorial on your machine, you first need to :ref:`install
- a recent version of SimGrid <install>` and ``pajeng`` to visualize the
+ a recent version of SimGrid <install>` and ``pajeng`` to visualize the
traces. You may want to install `Vite <http://vite.gforge.inria.fr/>`_ to get a first glance at the traces.
On Debian and Ubuntu for example, you can get them as follows:
$ sudo apt install simgrid pajeng vite
- An initial version of the source code is provided on framagit.
+ An initial version of the source code is provided on framagit.
If SimGrid is correctly installed, you should be able to clone the `repository
<https://framagit.org/simgrid/simgrid-template-s4u>`_ and execute it as follows:
.. warning::
- If you use the stable version of Debian 11, Ubuntu 21.04 or Ubuntu 21.10, then you need the right version of this tutorial
+ If you use the stable version of Debian 11, Ubuntu 21.04 or Ubuntu 21.10, then you need the right version of this tutorial
(add ``--branch simgrid-v3.25`` as below). These distributions only contain SimGrid v3.25 while the latest version of this
tutorial needs at least SimGrid v3.27.
$ make master-workers
$ ./master-workers small_platform.xml master-workers_d.xml
- If you get an error message complaining that ``simgrid::s4u::Mailbox::get()`` does not exist,
+ If you get an error message complaining that ``simgrid::s4u::Mailbox::get()`` does not exist,
then your version of SimGrid is too old for the version of the tutorial that you got. Check the previous section again.
.. group-tab:: Python
$ python master-workers.py small_platform.xml master-workers_d.xml
- If you get an error stating that the simgrid module does not exist, you need to get a newer version of SimGrid.
+ If you get an error stating that the simgrid module does not exist, you need to get a newer version of SimGrid.
You may want to take the tutorial from the Docker image to get the newest version.
For a classical Gantt-Chart visualization, you can use `Vite
.. rst-class:: compact-list
- **Learning goals:**
+ **Learning goals:**
* Get your hands on the code and change the communication pattern
* Discover the Mailbox mechanism
.. code-block:: python
- for i in range(tasks_count):
+ for i in range(tasks_count):
mailbox = Mailbox.by_name(str(i % worker_count))
mailbox.put(...)
initiators' location and then the real communication occurs between
the involved parties.
-Please refer to the full `Mailboxes' documentation <app_s4u.html#s4u-mailbox>`_
+Please refer to the full `Mailboxes' documentation <app_s4u.html#s4u-mailbox>`_
for more details.
.. rst-class:: compact-list
- **Learning goals:**
+ **Learning goals:**
* Interact with the platform (get the list of all hosts)
* Create actors directly from your program instead of the deployment file
.. group-tab:: Python
For that, the master needs to retrieve the list of hosts declared in
- the platform with :py:func:`simgrid.Engine.get_all_hosts`. Since this method is not static,
+ the platform with :py:func:`simgrid.Engine.get_all_hosts`. Since this method is not static,
you may want to call it on the Engine instance, as in ``Engine.instance().get_all_hosts()``.
Then, the master should start the worker actors with :py:func:`simgrid.Actor.create`.
.. rst-class:: compact-list
- **Learning goals:**
+ **Learning goals:**
* Forcefully kill actors, and stop the simulation at a given point of time
* Control the logging verbosity
Of course, usual time functions like ``gettimeofday`` will give you the
time on your real machine, which is pretty useless in the
simulation. Instead, retrieve the time in the simulated world with
-:cpp:func:`simgrid::s4u::Engine::get_clock` (C++) or
+:cpp:func:`simgrid::s4u::Engine::get_clock` (C++) or
:py:func:`simgrid.Engine.get_clock()` (Python).
You can still stop your workers with a specific task as previously,
.................................
Not all messages are equally informative, so you probably want to
-change some of the *info* messages (C: :c:macro:`XBT_INFO`; Python: :py:func:`simgrid.this_actor.info`)
+change some of the *info* messages (C: :c:macro:`XBT_INFO`; Python: :py:func:`simgrid.this_actor.info`)
into *debug* messages (C: :c:macro:`XBT_DEBUG`; Python: :py:func:`simgrid.this_actor.debug`) so that they are
hidden by default. For example, you may want to use an *info* message once
every 100 tasks and *debug* when sending all the other tasks. Or
.. rst-class:: compact-list
- **Learning goals:**
+ **Learning goals:**
* Change the platform characteristics during the simulation.
* Explore other communication patterns.
...................
Attach a profile to your hosts, so that their computational speed automatically varies over time, modeling an external load on these machines.
-This can be done with :cpp:func:`simgrid::s4u::Host::set_speed_profile` (C++) or :py:func:`simgrid.Host.set_speed_profile` (Python).
+This can be done with :cpp:func:`simgrid::s4u::Host::set_speed_profile` (C++) or :py:func:`simgrid.Host.set_speed_profile` (Python).
Make it so that one of the hosts gets really slow, and observe how your whole application performance decreases.
-This is because one slow host slows down the whole process. Instead of a round-robin dispatch push,
+This is because one slow host slows down the whole process. Instead of a round-robin dispatch push,
you should completely reorganize your application in a First-Come First-Served manner (FCFS).
Actors should pull a task whenever they are ready, so that fast actors can overtake slow ones in the queue.
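To see why pulling helps, here is a rough sketch in plain Python (no SimGrid involved) that compares the total completion time of both strategies. It assumes identical tasks, free communications, and fixed worker speeds; the function names and the numbers are hypothetical and chosen only for illustration.

```python
import heapq


def makespan_round_robin(task_cost, tasks_count, speeds):
    """Push model: task i always goes to worker i % len(speeds)."""
    busy_until = [0.0] * len(speeds)
    for task in range(tasks_count):
        worker = task % len(speeds)
        busy_until[worker] += task_cost / speeds[worker]
    return max(busy_until)


def makespan_fcfs(task_cost, tasks_count, speeds):
    """Pull model: the next task goes to whichever worker becomes idle first."""
    # Heap of (time at which the worker becomes idle, worker index).
    idle = [(0.0, worker) for worker in range(len(speeds))]
    heapq.heapify(idle)
    finish = 0.0
    for _ in range(tasks_count):
        ready_at, worker = heapq.heappop(idle)
        done_at = ready_at + task_cost / speeds[worker]
        finish = max(finish, done_at)
        heapq.heappush(idle, (done_at, worker))
    return finish


# Illustrative setup: the last worker is ten times slower than the others.
speeds = [10.0, 10.0, 1.0]
rr = makespan_round_robin(10.0, 30, speeds)
fcfs = makespan_fcfs(10.0, 30, speeds)

if __name__ == "__main__":
    print(f"round-robin: {rr}, FCFS: {fcfs}")
```

In this toy setup the slow worker receives a third of the tasks under round-robin and dominates the completion time, while under FCFS it only takes the tasks it has time to handle.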
to what would happen with communications based on BSD sockets while the second is closer to message queues. You could also decide to
model your socket application in the second manner if you want to neglect these details and keep your simulator simple. It's your decision.
-Changing the communication schema can be a bit hairy, but once it works, you will see that such a simple FCFS schema allows one to greatly
+Changing the communication schema can be a bit hairy, but once it works, you will see that such a simple FCFS schema allows one to greatly
increase the number of tasks handled over time here. Things may be different with another platform file.
Communication speed
You can even have the bandwidth automatically vary over time with :cpp:func:`simgrid::s4u::Link::set_bandwidth_profile` (C++) or :py:func:`simgrid.Link.set_bandwidth_profile` (Python).
Once implemented, you will notice that slow communications may still result in situations
-where only one worker works at a given point in time. To overcome that, your master needs
-to send data to several workers in parallel, using
+where only one worker works at a given point in time. To overcome that, your master needs
+to send data to several workers in parallel, using
:cpp:func:`simgrid::s4u::Mailbox::put_async` (C++) or :py:func:`simgrid.Mailbox.put_async` (Python)
to start several communications in parallel, and
-:cpp:func:`simgrid::s4u::Comm::wait_any` (C++) or :py:func:`simgrid.Comm.wait_any` (Python)
-to react to the completion of one of these communications. Actually, since this code is somewhat tricky
-to write, it's provided as :ref:`an example <s4u_ex_communication>` in the distribution (search for
-``wait_any`` in that page).
+:cpp:func:`simgrid::s4u::Comm::wait_any` (C++) or :py:func:`simgrid.Comm.wait_any` (Python)
+to react to the completion of one of these communications. Actually, since this code is somewhat tricky
+to write, it's provided as :ref:`an example <s4u_ex_communication>` in the distribution (search for
+``wait_any`` in that page).
Dealing with failures
.....................
-Turn a given link off with :cpp:func:`simgrid::s4u::Link::turn_off` (C++) or :py:func:`simgrid.Link.turn_off` (Python).
-You can even implement churn where a link automatically turns off and on again over time with :cpp:func:`simgrid::s4u::Link::set_state_profile` (C++) or :py:func:`simgrid.Link.set_state_profile` (Python).
+Turn a given link off with :cpp:func:`simgrid::s4u::Link::turn_off` (C++) or :py:func:`simgrid.Link.turn_off` (Python).
+You can even implement churn where a link automatically turns off and on again over time with :cpp:func:`simgrid::s4u::Link::set_state_profile` (C++) or :py:func:`simgrid.Link.set_state_profile` (Python).
-If a link fails while you try to use it, ``wait()`` will raise a ``NetworkFailureException`` that you need to catch.
+If a link fails while you try to use it, ``wait()`` will raise a ``NetworkFailureException`` that you need to catch.
Again, there is a nice example demoing this feature, :ref:`under platform-failures <s4u_ex_communication>`.
Lab 5: Competing Applications
.. rst-class:: compact-list
- **Learning goals:**
+ **Learning goals:**
* Advanced visualization through tracing categories
Instead of starting the execution in one function call only with
``this_actor::execute(cost)``, you need to
-create the execution activity, set its tracing category, start it
+create the execution activity, set its tracing category, start it
and wait for its completion, as follows.
.. tabs::
.. toggle-header::
:header: Code of ``ndet-receive-s4u.cpp``: click here to open
-
+
You can also `view it online <https://framagit.org/simgrid/tutorial-model-checking/-/blob/main/ndet-receive-s4u.cpp>`_
.. literalinclude:: tuto_mc/ndet-receive-s4u.cpp
:``gw_src``: Netpoint (within src zone) from which this route starts. Must be an existing host/router.
:``gw_dst``: Netpoint (within dst zone) to which this route leads. Must be an existing host/router.
:``symmetrical``: Whether this route is symmetrical, i.e., whether we are defining the route ``dst -> src`` at the same
- time. Valid values: ``yes``, ``no``, ``YES``, ``NO``.
+ time. Valid values: ``yes``, ``no``, ``YES``, ``NO``.
-------------------------------------------------------------------------------
:``gw_src``: Netpoint (within src zone) from which this route starts. Must be an existing host/router.
:``gw_dst``: Netpoint (within dst zone) to which this route leads. Must be an existing host/router.
:``symmetrical``: Whether this route is symmetrical, i.e., whether we are defining the route ``dst -> src`` at the same
- time. Valid values: ``yes``, ``no``, ``YES``, ``NO``.
+ time. Valid values: ``yes``, ``no``, ``YES``, ``NO``.
-------------------------------------------------------------------------------
.. |br| raw:: html
-
+
<br />
.. group-tab:: Python
-
+
.. autoclass:: simgrid.Actor
Basic management
.. doxygenclass:: simgrid::s4u::Engine
.. group-tab:: Python
-
+
.. autoclass:: simgrid.Engine
Engine initialization
.. doxygenclass:: simgrid::s4u::Mailbox
.. group-tab:: Python
-
+
.. autoclass:: simgrid.Mailbox
Please also refer to the :ref:`full doc on s4u::Mailbox <s4u_mailbox>`.
.. doxygenclass:: simgrid::s4u::Disk
.. group-tab:: Python
-
+
.. autoclass:: simgrid.Disk
.. group-tab:: C
.. doxygenclass:: simgrid::s4u::Host
.. group-tab:: Python
-
+
.. autoclass:: simgrid.Host
Basic management
.. autoattribute:: simgrid.Host.netpoint
.. automethod:: simgrid.Host.create_disk
-
+
.. automethod:: simgrid.Host.route_to
.. group-tab:: C
.. group-tab:: Python
-
+
.. autoclass:: simgrid.Link
Basic management
.. doxygenclass:: simgrid::s4u::NetZone
.. group-tab:: Python
-
+
.. autoclass:: simgrid.NetZone
Basic management
.. doxygenclass:: simgrid::s4u::Comm
.. group-tab:: Python
-
+
.. autoclass:: simgrid.Comm
Basic management
.. doxygenclass:: simgrid::s4u::Exec
.. group-tab:: Python
-
+
.. autoclass:: simgrid.Exec
Basic management
.. doxygenclass:: simgrid::s4u::Io
.. group-tab:: Python
-
+
.. autoclass:: simgrid.Io
Basic management
Basic asynchronous communications
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-Illustrates how to have non-blocking communications, that is, communications running in the background, leaving the process
+Illustrates how to have non-blocking communications, that is, communications running in the background, leaving the process
free to do something else until they complete.
.. tabs::
Dealing with network failures
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-This example shows how to survive network exceptions that occur when a link is turned off, or when the actor with whom
+This example shows how to survive network exceptions that occur when a link is turned off, or when the actor with whom
you communicate fails because its host is turned off. In this case, any blocking operation such as ``put``, ``get`` or
``wait`` will raise an exception that you can catch and react to. See also :ref:`howto_churn`,
-:ref:`this example <s4u_ex_platform_state_profile>` on how to attach a state profile to hosts and
+:ref:`this example <s4u_ex_platform_state_profile>` on how to attach a state profile to hosts and
:ref:`that example <s4u_ex_exec_failure>` on how to react to host failures.
.. tabs::
Dealing with host failures
^^^^^^^^^^^^^^^^^^^^^^^^^^
-This example shows how to survive host failure exceptions that occur when a host is turned off. Actors do not get notified when the host
+This example shows how to survive host failure exceptions that occur when a host is turned off. Actors do not get notified when the host
on which they run is turned off: they are just terminated in this case, and their ``on_exit()`` callback gets executed. For remote executions on
-failing hosts however, any blocking operation such as ``exec`` or ``wait`` will raise an exception that you can catch and react to. See also
+failing hosts however, any blocking operation such as ``exec`` or ``wait`` will raise an exception that you can catch and react to. See also
:ref:`howto_churn`,
:ref:`this example <s4u_ex_platform_state_profile>` on how to attach a state profile to hosts, and
:ref:`that example <s4u_ex_comm_failure>` on how to react to network failures.
^^^^^^^^^^^^^^^^^^^^^^^^^
Shows how to specify when the resources must be turned off and on again, and how to react to such
-failures in your code. See also :ref:`howto_churn`,
-:ref:`this example <s4u_ex_comm_failure>` on how to react to communication failures, and
+failures in your code. See also :ref:`howto_churn`,
+:ref:`this example <s4u_ex_comm_failure>` on how to react to communication failures, and
:ref:`that example <s4u_ex_exec_failure>` on how to react to host failures.
.. tabs::
Modifying the platform
----------------------
-Serializing communications
+Serializing communications
^^^^^^^^^^^^^^^^^^^^^^^^^^
This example shows how to limit the number of communications going through a given link.
It is very similar to the other asynchronous communication examples, but messages get serialized by the platform.
Without this call to ``Link::set_concurrency_limit(2)``, all messages would be received at the exact same timestamp since
-they are initiated at the same instant and are of the same size. But with this extra configuration to the link, at most 2
+they are initiated at the same instant and are of the same size. But with this extra configuration to the link, at most 2
messages can travel through the link at the same time.
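The effect of such a limit can be sketched with a toy model in plain Python. This is not SimGrid's actual network model: it assumes no latency, identical messages all started at time 0, and a link bandwidth split evenly between the messages in flight. The function name and parameters are hypothetical.

```python
def completion_times(message_count, size, bandwidth, limit=None):
    """Completion time of each message in a toy equal-sharing model.

    Assumptions (not SimGrid's real model): no latency, all messages have the
    same size and start at time 0, and the bandwidth is shared evenly between
    the messages currently in flight.
    """
    if limit is not None and limit <= 0:
        raise ValueError("limit must be positive")
    if limit is None or limit > message_count:
        limit = message_count  # no effective limit
    times = []
    now = 0.0
    remaining = message_count
    while remaining > 0:
        batch = min(limit, remaining)
        # 'batch' messages share the link, so each one gets bandwidth / batch.
        now += size * batch / bandwidth
        times.extend([now] * batch)
        remaining -= batch
    return times


if __name__ == "__main__":
    print(completion_times(4, 1e6, 1e6))           # no limit
    print(completion_times(4, 1e6, 1e6, limit=2))  # at most 2 in flight
```

Without a limit, all messages complete at the same instant; with a limit of 2, they complete in successive pairs, although the last pair finishes at the same time as in the unlimited case under these assumptions.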
.. tabs::