Discover SMPI
-------------
SimGrid can not only :ref:`simulate algorithms <Tutorial_Algorithms>`, but
it can also be used to execute real MPI applications on top of
virtual, simulated platforms with the SMPI module. Even complex
C/C++/F77/F90 applications should run out of the box in this
environment. In fact, almost all proxy apps provided by the `ExaScale
Project <https://proxyapps.exascaleproject.org/>`_ only require minor
modifications to `run on top of SMPI
<https://framagit.org/simgrid/SMPI-proxy-apps>`_.
This setting permits you to debug your MPI applications in a perfectly
reproducible setup, with no Heisenbugs. Enjoy the full clairvoyance!

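In practice, you compile your application with SMPI's compiler wrappers
and start it with ``smpirun``. Here is a minimal sketch (the application,
platform, and hostfile names are placeholders; these files are introduced
in the next sections):

.. code-block:: shell

   smpicc -O3 my_mpi_app.c -o my_mpi_app    # smpicxx/smpiff/smpif90 for C++/Fortran
   smpirun -np 4 -platform platform.xml -hostfile hostfile ./my_mpi_app
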
Simple Example with 3 hosts
...........................
Imagine you want to describe a little platform with three hosts,
interconnected as follows:

.. image:: /tuto_smpi/3hosts.png
:align: center
This can be done with the following platform file, which describes the
simulated platform as a graph of hosts and network links.

.. literalinclude:: /tuto_smpi/3hosts.xml
:language: xml
The basic elements (:ref:`pf_tag_host` and :ref:`pf_tag_link`) are
described first, and then the routes between any pair of hosts are
explicitly given with :ref:`pf_tag_route`. Every host must be given a
computational speed (in flops) while links must be given a latency (in
seconds) and a bandwidth (in bytes per second). Note that you can
write 1Gflops instead of 1000000000flops, and similar. Last point:
:ref:`pf_tag_route` elements are symmetrical by default (but this can
be changed).

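To give you an idea of the syntax, here is a minimal sketch of such a
platform file (the host, link, and zone names are made up for this
illustration; refer to the file included above for the actual tutorial
platform):

.. code-block:: xml

   <?xml version='1.0'?>
   <!DOCTYPE platform SYSTEM "https://simgrid.org/simgrid.dtd">
   <platform version="4.1">
     <zone id="world" routing="Full">
       <host id="host0" speed="1Gf"/>
       <host id="host1" speed="2Gf"/>
       <link id="link0" latency="10us" bandwidth="100MBps"/>
       <!-- Symmetrical by default: also used from host1 to host0 -->
       <route src="host0" dst="host1"><link_ctn id="link0"/></route>
     </zone>
   </platform>
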
Cluster with a Crossbar
.......................
when you log out of the container, so don't edit the other files!
All needed dependencies are already installed in this container
(SimGrid, the C/C++/Fortran compilers, make, pajeng, R and pajengr). Vite
is only optional in this tutorial, so it is not installed to reduce the
image size.

.. code-block:: shell

   sudo apt install simgrid pajeng make gcc g++ gfortran vite

For R analysis of the produced traces, you may want to install R
and the `pajengr <https://github.com/schnorr/pajengr#installation/>`_ package.

.. code-block:: shell

   sudo apt install r-base r-cran-devtools cmake flex bison
   Rscript -e "library(devtools); install_github('schnorr/pajengr');"

To take this tutorial, you will also need the platform files from the
previous section as well as the source code of the NAS Parallel
Benchmarks. Just clone `this repository

.. code-block:: shell

   smpirun -np 4 -platform ../cluster_backbone.xml -trace --cfg=tracing/filename:lu.S.4.trace bin/lu.S.4

You can then produce a Gantt Chart with the following R chunk. You can
either copy/paste it in an R session, or `turn it into an Rscript executable
.. code-block:: R

   library(pajengr)
   library(ggplot2)

   # Read the trace produced by smpirun
   df_state = pajeng_read("lu.S.4.trace")
   names(df_state) = c("Type", "Rank", "Container", "Start", "End", "Duration", "Level", "State")
   df_state = df_state[!(names(df_state) %in% c("Type","Container","Level"))]
   df_state$Rank = as.numeric(gsub("rank-","",df_state$Rank))

   # Draw the Gantt chart: one rectangle per rank and state
   gc = ggplot(df_state) + geom_rect(aes(xmin=Start, xmax=End, ymin=Rank, ymax=Rank+1, fill=State))
   plot(gc)
the documentation is up-to-date.
Lab 3: Execution Sampling on Matrix Multiplication example
----------------------------------------------------------
The second method to speed up simulations is to sample the computation
parts in the code. This means that the person doing the simulation
.. literalinclude:: /tuto_smpi/gemm_mpi.cpp
:language: c
:lines: 4-19

.. code-block:: shell

   $ smpicc -O3 gemm_mpi.cpp -o gemm
   $ time smpirun -np 16 -platform cluster_crossbar.xml -hostfile cluster_hostfile --cfg=smpi/display-timing:yes --cfg=smpi/running-power:1000000000 ./gemm

This should end quite quickly, as the size of each matrix is only 1000x1000.
But what happens if we want to simulate larger runs?
Replace the size by 2000, 3000, and try again.
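
The idea is to really execute and benchmark only a few iterations of
the compute kernel, letting SMPI inject a simulated delay of the
measured average duration for all other iterations. Here is a sketch of
how the kernel could be annotated with SMPI's sampling macros (the loop
bounds and variable names are assumptions about ``gemm_mpi.cpp``, and
the two macro arguments, an iteration budget and a precision threshold,
are illustrative; see the SMPI reference documentation for the exact
semantics):

.. code-block:: c

   for (int i = istart; i <= iend; i++) {
     /* Only a few iterations of this block are really executed and
      * timed; the others are replaced by a simulated delay of the
      * measured average duration. */
     SMPI_SAMPLE_GLOBAL(10, 0.05) {
       for (int j = 0; j < size; j++)
         for (int k = 0; k < size; k++)
           c[i][j] += a[i][k] * b[k][j];
     }
   }
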
Lab 4: Memory folding on large allocations
------------------------------------------
Another issue that can be encountered when simulating with SMPI is
lack of memory. Indeed, we are executing all MPI processes on a single
node, which can lead to crashes.
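
SMPI can fold such allocations: large data buffers whose content does
not influence the control flow can be shared between all simulated
ranks, so that a single copy exists in memory. Here is a minimal sketch
using SMPI's shared-allocation macros (``size`` is a placeholder for
the matrix order):

.. code-block:: c

   #include <smpi/smpi.h>

   /* All ranks transparently share the same physical pages for this
    * buffer, keeping the total footprint bounded. This is only valid
    * because the buffer content does not influence the control flow. */
   double *a = (double*)SMPI_SHARED_MALLOC(size * size * sizeof(double));

   /* ... computation using 'a' ... */

   SMPI_SHARED_FREE(a);
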
You may also be interested in the `SMPI reference article
<https://hal.inria.fr/hal-01415484>`_ or these `introductory slides
<http://simgrid.org/tutorials/simgrid-smpi-101.pdf>`_. The :ref:`SMPI
reference documentation <app_smpi>` covers much more content than
this short tutorial.
Finally, we regularly use SimGrid in our teaching of MPI. This way,