environment. In fact, almost all proxy apps provided by the `ExaScale
Project <https://proxyapps.exascaleproject.org/>`_ only require minor
modifications to `run on top of SMPI
<https://framagit.org/simgrid/SMPI-proxy-apps>`_.
This setting lets you debug your MPI applications in a perfectly
reproducible setup, with no Heisenbugs. Enjoy the full clairvoyance
provided by the simulator while running what-if analyses on platforms
that are still to be built! Several `production-grade MPI applications
<https://framagit.org/simgrid/SMPI-proxy-apps#full-scale-applications>`_
use SimGrid for their integration and performance testing.
In SMPI, communications are simulated while computations are
emulated. This means that while computations occur as they would in
the real system, communication calls are intercepted and handled by
the simulator.
To start using SMPI, you just need to compile your application with
``smpicc`` instead of ``mpicc`` (or ``smpicxx`` instead of ``mpicxx``),
and to run the result with ``smpirun``. Under the hood, communication
calls are implemented using SimGrid: data is exchanged through memory
copy, while the simulator's performance models are used to predict the
time taken by each communication. Any computations
occurring between two MPI calls are benchmarked, and the corresponding
time is reported into the simulator.
.. image:: /tuto_smpi/img/big-picture.svg
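Concretely, a classical MPI program requires no source change to run on
top of SMPI; only the compilation and execution commands differ. Here is
a minimal sketch (the file name, platform file, and host file are
placeholders):

.. code-block:: cpp

   // hello.cpp -- compile with:  smpicxx -O3 hello.cpp -o hello
   // run with:  smpirun -np 4 -platform my_platform.xml -hostfile my_hostfile ./hello
   #include <mpi.h>
   #include <cstdio>

   int main(int argc, char* argv[])
   {
     MPI_Init(&argc, &argv); // every MPI call below is intercepted by SMPI
     int rank, size;
     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
     MPI_Comm_size(MPI_COMM_WORLD, &size);
     // The printf itself is plain computation: it executes natively
     printf("Hello from rank %d out of %d\n", rank, size);
     MPI_Finalize();
     return 0;
   }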
Simple Example with 3 hosts
...........................
Imagine you want to describe a little platform with three hosts,
interconnected as follows:
.. image:: /tuto_smpi/3hosts.png
:align: center
This can be done with the following platform file, which describes the
simulated platform as a graph of hosts and network links.

.. literalinclude:: /tuto_smpi/3hosts.xml
   :language: xml
The basic elements (:ref:`pf_tag_host` and :ref:`pf_tag_link`) are
described first, and then the routes between any pair of hosts are
explicitly given with :ref:`pf_tag_route`.

Any host must be given a computational speed in flops, while links must
be given a latency and a bandwidth. You can write 1Gf for
1,000,000,000 flops (the full list of units is in the reference guide of
:ref:`pf_tag_host` and :ref:`pf_tag_link`).

Routes defined with :ref:`pf_tag_route` are symmetrical by default,
meaning that the list of traversed links from A to B is the same as
from B to A. Explicitly define non-symmetrical routes if you prefer.
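For the record, such a platform file looks roughly as follows. This is a
sketch rather than the exact contents of ``3hosts.xml``: the host and
link names, speeds, bandwidths, and latencies are illustrative.

.. code-block:: xml

   <?xml version='1.0'?>
   <!DOCTYPE platform SYSTEM "https://simgrid.org/simgrid.dtd">
   <platform version="4.1">
     <zone id="world" routing="Full">
       <host id="host0" speed="1Gf"/>
       <host id="host1" speed="2Gf"/>
       <host id="host2" speed="40Gf"/>

       <link id="link0" bandwidth="125MBps" latency="100us"/>
       <link id="link1" bandwidth="50MBps"  latency="150us"/>
       <link id="link2" bandwidth="250MBps" latency="50us"/>

       <!-- Symmetrical by default: this route also serves host1 -> host0 -->
       <route src="host0" dst="host1">
         <link_ctn id="link0"/><link_ctn id="link1"/>
       </route>
       <route src="host0" dst="host2">
         <link_ctn id="link0"/><link_ctn id="link2"/>
       </route>
       <route src="host1" dst="host2">
         <link_ctn id="link1"/><link_ctn id="link2"/>
       </route>
     </zone>
   </platform>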
Cluster with a Crossbar
.......................
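Declaring each host and link of a real cluster by hand would be tedious,
so SimGrid provides the :ref:`pf_tag_cluster` tag, which creates a set of
homogeneous hosts interconnected through a crossbar by default. Below is
a sketch with illustrative names and values (the labs later in this
tutorial use such a file under the name ``cluster_crossbar.xml``):

.. code-block:: xml

   <?xml version='1.0'?>
   <!DOCTYPE platform SYSTEM "https://simgrid.org/simgrid.dtd">
   <platform version="4.1">
     <!-- 256 hosts named node-0.acme.org ... node-255.acme.org, each with
          its own private link to a non-blocking crossbar interconnect -->
     <cluster id="acme" prefix="node-" suffix=".acme.org" radical="0-255"
              speed="1Gf" bw="125MBps" lat="50us"/>
   </platform>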
.. code-block:: shell

   sudo apt install simgrid pajeng make gcc g++ gfortran python3 vite
For R analysis of the produced traces, you may want to install R
and the `pajengr <https://github.com/schnorr/pajengr#installation/>`_ package.
.. code-block:: shell
the documentation is up-to-date.
Lab 3: Execution Sampling on Matrix Multiplication example
----------------------------------------------------------
The second method to speed up simulations is to sample the computation
parts in the code. This means that the person doing the simulation
needs to identify which computation parts can be sampled without hurting
the simulation accuracy. Furthermore, there should not be any MPI calls
inside such parts of the code.
For this part, use the `gemm_mpi.cpp
<https://gitlab.com/PRACE-4IP/CodeVault/raw/master/hpc_kernel_samples/dense_linear_algebra/gemm/mpi/src/gemm_mpi.cpp>`_
example, which is provided by the `PRACE Codevault repository
<http://www.prace-ri.eu/prace-codevault/>`_.
The computing part of this example is the matrix multiplication routine:

.. literalinclude:: /tuto_smpi/gemm_mpi.cpp
   :language: cpp
   :lines: 4-19

.. code-block:: shell

   $ smpicxx -O3 gemm_mpi.cpp -o gemm
   $ time smpirun -np 16 -platform cluster_crossbar.xml -hostfile cluster_hostfile --cfg=smpi/display-timing:yes --cfg=smpi/running-power:1000000000 ./gemm

This should end quite quickly, as the size of each matrix is only 1000x1000.
But what happens if we want to simulate larger runs?
Replace the matrix size with 2000 or 3000 and try again.
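The simulated time grows, but so does the wall-clock time, since every
multiplication is still executed for real. This is where sampling helps:
annotated loops are benchmarked for a few iterations only, after which the
remaining iterations are skipped and replaced by a time injection in the
simulator. The sketch below shows the idea on a ``multiply``-like kernel.
Beware that it is an illustration, not the official solution: the helper's
shape is made up for this example, and the exact signature of the
``SMPI_SAMPLE_GLOBAL`` macro has changed across SimGrid releases, so check
the ``smpi/smpi.h`` of your installation.

.. code-block:: cpp

   #include <smpi/smpi.h> // brings in the SMPI_SAMPLE_* macros

   static void multiply_sampled(float* a, float* b, float* c,
                                int istart, int iend, int size)
   {
     // Replaces `for (int i = istart; i <= iend; ++i)`: roughly, a few
     // outer iterations really execute and get benchmarked (here about 10,
     // or until the estimate stabilizes within 0.5%); the remaining ones
     // are skipped and their duration is injected into the simulated
     // clock. The GLOBAL variant shares the estimate across all ranks.
     SMPI_SAMPLE_GLOBAL(int i = istart, i <= iend, ++i, 10, 0.005)
     {
       for (int j = 0; j < size; ++j)
         for (int k = 0; k < size; ++k)
           c[i * size + j] += a[i * size + k] * b[k * size + j];
     }
   }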
Lab 4: Memory folding on large allocations
------------------------------------------
Another issue that can be encountered when simulating with SMPI is a lack
of memory. Indeed, we are executing all MPI processes on a single node,
which can lead to crashes as soon as the ranks' combined allocations
exceed that node's memory.
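SMPI's answer is to fold large allocations: a buffer allocated through
``SMPI_SHARED_MALLOC`` is backed by the same physical pages for every
rank, so only one copy exists no matter the number of ranks. The computed
values become meaningless, so this only suits buffers whose contents do
not influence the application's control flow, such as the matrices of
this example. A minimal sketch (the size and variable names are
illustrative):

.. code-block:: cpp

   #include <mpi.h>
   #include <smpi/smpi.h> // brings in SMPI_SHARED_MALLOC / SMPI_SHARED_FREE

   int main(int argc, char* argv[])
   {
     MPI_Init(&argc, &argv);

     const size_t n = 20000; // ~3.2 GB per rank as a plain malloc
     // All ranks receive pointers backed by the same physical pages:
     // the contents are bogus, but the memory is allocated only once.
     double* matrix = (double*)SMPI_SHARED_MALLOC(n * n * sizeof(double));

     /* ... the usual computations and communications on matrix ... */

     SMPI_SHARED_FREE(matrix);
     MPI_Finalize();
     return 0;
   }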
You may also be interested in the `SMPI reference article
<https://hal.inria.fr/hal-01415484>`_ or these `introductory slides
<http://simgrid.org/tutorials/simgrid-smpi-101.pdf>`_. The :ref:`SMPI
reference documentation <SMPI_doc>` covers much more content than
this short tutorial.
Finally, we regularly use SimGrid when teaching MPI. This way,