image. Once you `installed Docker itself
<https://docs.docker.com/install/>`_, simply do the following:

.. code-block:: console

   $ docker pull simgrid/tuto-smpi
   $ docker run -it --rm --name simgrid --volume ~/smpi-tutorial:/source/tutorial simgrid/tuto-smpi bash

This will start a new container with all you need to take this
tutorial, and create a ``smpi-tutorial`` directory in your home on
your host machine that is visible as ``/source/tutorial`` within the
container. The tutorial material is located in
``/source/simgrid-template-smpi`` in the image. You should copy it to
your working directory when you first log in:

.. code-block:: console

   $ cp -r /source/simgrid-template-smpi/* /source/tutorial
   $ cd /source/tutorial


Using your Computer Natively
............................

traces. The provided code template requires ``make`` to compile. On
Debian and Ubuntu, you can get them as follows:

.. code-block:: console

   $ sudo apt install simgrid pajeng make gcc g++ gfortran python3 vite

For R analysis of the produced traces, you may want to install R
and the `pajengr <https://github.com/schnorr/pajengr#installation/>`_ package.

.. code-block:: console

   # install R and necessary packages
   $ sudo apt install r-base r-cran-devtools r-cran-tidyverse
   # install pajengr dependencies
   $ sudo apt install git cmake flex bison
   # install the pajengr R package
   $ Rscript -e "library(devtools); install_github('schnorr/pajengr');"

To take this tutorial, you will also need the platform files from the
previous section as well as the source code of the NAS Parallel
Benchmarks. Just clone `this repository
<https://framagit.org/simgrid/simgrid-template-smpi>`_ to get them all:

.. code-block:: console

   $ git clone https://framagit.org/simgrid/simgrid-template-smpi.git
   $ cd simgrid-template-smpi/

If you struggle with the compilation, double-check your
:ref:`SimGrid installation <install>`. Compile the provided
``roundtrip.c`` example as follows (and refer to that page if you get
an error message):

.. code-block:: console

   $ smpicc -O3 roundtrip.c -o roundtrip

Once compiled, you can simulate the execution of this program on 16
nodes from the ``cluster_crossbar.xml`` platform as follows:

.. code-block:: console

   $ smpirun -np 16 -platform cluster_crossbar.xml -hostfile cluster_hostfile ./roundtrip

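
The ``-hostfile`` argument takes a plain text file with one host name per
line; ``smpirun`` assigns MPI ranks to these hosts in order. As a sketch,
assuming the hosts of ``cluster_crossbar.xml`` are named
``node-0.simgrid.org``, ``node-1.simgrid.org``, and so on (the naming used
in the SimGrid examples), the first lines of such a hostfile would read:

.. code-block:: text

   node-0.simgrid.org
   node-1.simgrid.org
   node-2.simgrid.org
   node-3.simgrid.org
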
<https://www.nas.nasa.gov/publications/npb_problem_sizes.html>`_) with
4 nodes.

.. code-block:: console

   $ make lu NPROCS=4 CLASS=S

(compilation logs)
visualization tracing, and convert the produced trace for later
use:

.. code-block:: console

   $ smpirun -np 4 -platform ../cluster_backbone.xml -trace --cfg=tracing/filename:lu.S.4.trace bin/lu.S.4

You can then produce a Gantt chart with the following R chunk. You can
either copy/paste it in an R session, or `turn it into an Rscript executable
Now compile and execute the LU benchmark, class A, with 32 nodes.

.. code-block:: console

   $ make lu NPROCS=32 CLASS=A

You can even generate the trace during the live simulation as follows:

.. code-block:: console

   $ smpirun -trace-ti --cfg=tracing/filename:LU.A.32 -np 32 -platform ../cluster_backbone.xml bin/lu.A.32

``LU.A.32_files``. You can replay this trace with SMPI thanks to ``smpirun``.
For example, the following command replays the trace on a different platform:

.. code-block:: console

   $ smpirun -np 32 -platform ../cluster_crossbar.xml -hostfile ../cluster_hostfile -replay LU.A.32

:language: cpp
:lines: 9-24

.. code-block:: console

   $ smpicxx -O3 gemm_mpi.cpp -o gemm
   $ time smpirun -np 16 -platform cluster_crossbar.xml -hostfile cluster_hostfile --cfg=smpi/display-timing:yes --cfg=smpi/running-power:1000000000 ./gemm

processors we are simulating (1Gf), so that 1 second of computation is injected
as 1 second in the simulation.

.. code-block:: console

   [5.568556] [smpi_kernel/INFO] Simulated time: 5.56856 seconds.

Now run the code again with various sizes and parameters and check the time taken for the
simulation, as well as the resulting simulated time.

.. code-block:: console

   [5.575691] [smpi_kernel/INFO] Simulated time: 5.57569 seconds.
   The simulation took 1.23698 seconds (after parsing and platform setup)

Once done, you can now run

.. code-block:: console

   $ make dt NPROCS=85 CLASS=C

(compilation logs)
targets for sharing, or setting the threshold for automatic ones.
For DT, the process would be to run a smaller class of problems,

.. code-block:: console

   $ make dt NPROCS=21 CLASS=A
   $ smpirun --cfg=smpi/display-allocs:yes -np 21 -platform ../cluster_backbone.xml bin/dt.A.x BH

This should output:

.. code-block:: console

   [smpi_utils/INFO] Memory Usage: Simulated application allocated 198533192 bytes during its lifetime through malloc/calloc calls.
   Largest allocation at once from a single process was 3553184 bytes, at dt.c:388. It was called 3 times during the whole simulation.
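
Once such a hot spot is identified, the corresponding allocation can be
marked explicitly for sharing. The sketch below assumes SimGrid's
``SMPI_SHARED_MALLOC``/``SMPI_SHARED_FREE`` macros from ``<smpi/smpi.h>``
(check the API reference of your SimGrid version for the exact names);
the variable names are illustrative:

.. code-block:: c

   #include <smpi/smpi.h>

   /* A single physical copy of this buffer is shared by all simulated
      ranks, so memory usage no longer grows with the number of processes.
      This is only valid when the computation's control flow does not
      depend on the actual data stored in the buffer. */
   double* buffer = SMPI_SHARED_MALLOC(count * sizeof(double));

   /* ... run the kernels on the shared buffer ... */

   SMPI_SHARED_FREE(buffer);

This trades data accuracy for memory, which is acceptable here since we
only study the timing behavior of the benchmark, not its numerical output.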