image. Once you have `installed Docker itself
<https://docs.docker.com/install/>`_, simply do the following:
-.. code-block:: shell
+.. code-block:: console
- docker pull simgrid/tuto-smpi
- docker run -it --rm --name simgrid --volume ~/smpi-tutorial:/source/tutorial simgrid/tuto-smpi bash
+ $ docker pull simgrid/tuto-smpi
+ $ docker run -it --rm --name simgrid --volume ~/smpi-tutorial:/source/tutorial simgrid/tuto-smpi bash
This will start a new container with all you need to take this
tutorial, and create a ``smpi-tutorial`` directory in your home on
``/source/simgrid-template-smpi`` in the image. You should copy it to
your working directory when you first log in:
-.. code-block:: shell
+.. code-block:: console
- cp -r /source/simgrid-template-smpi/* /source/tutorial
- cd /source/tutorial
+ $ cp -r /source/simgrid-template-smpi/* /source/tutorial
+ $ cd /source/tutorial
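If you restart the container later, repeating the copy would overwrite the files you already edited in the shared directory. A minimal sketch of a safer copy, assuming a hypothetical helper named ``copy_template_once`` (it is not part of the image):

```shell
# Hypothetical helper: copy the template only if the target is still empty.
copy_template_once() {
    src=$1
    dst=$2
    mkdir -p "$dst"
    # A non-empty target probably already holds your work: leave it alone.
    if [ -n "$(ls -A "$dst")" ]; then
        echo "skipped: $dst is not empty"
        return 0
    fi
    # "src/." also copies hidden files, which "src/*" would skip.
    cp -r "$src"/. "$dst"
    echo "copied template to $dst"
}
```

In the container, this would be called as ``copy_template_once /source/simgrid-template-smpi /source/tutorial``.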
Using your Computer Natively
............................
traces. The provided code template requires ``make`` to compile. On
Debian and Ubuntu, you can get them as follows:
-.. code-block:: shell
+.. code-block:: console
- sudo apt install simgrid pajeng make gcc g++ gfortran python3 vite
+ $ sudo apt install simgrid pajeng make gcc g++ gfortran python3 vite
For R analysis of the produced traces, you may want to install R
and the `pajengr <https://github.com/schnorr/pajengr#installation/>`_ package.
-.. code-block:: shell
+.. code-block:: console
- sudo apt install r-base r-cran-devtools cmake flex bison
- Rscript -e "library(devtools); install_github('schnorr/pajengr');"
+ # install R and necessary packages
+ $ sudo apt install r-base r-cran-devtools r-cran-tidyverse
+ # install pajengr dependencies
+ $ sudo apt install git cmake flex bison
+ # install the pajengr R package
+ $ Rscript -e "library(devtools); install_github('schnorr/pajengr');"
To take this tutorial, you will also need the platform files from the
previous section as well as the source code of the NAS Parallel
Benchmarks. Just clone `this repository
<https://framagit.org/simgrid/simgrid-template-smpi>`_ to get them all:
-.. code-block:: shell
+.. code-block:: console
- git clone https://framagit.org/simgrid/simgrid-template-smpi.git
- cd simgrid-template-smpi/
+ $ git clone https://framagit.org/simgrid/simgrid-template-smpi.git
+ $ cd simgrid-template-smpi/
If you struggle with the compilation, then you should double-check
your :ref:`SimGrid installation <install>`. If needed, please refer to
:ref:`SimGrid installation <install>` if you get an error message):
-.. code-block:: shell
+.. code-block:: console
$ smpicc -O3 roundtrip.c -o roundtrip
Once compiled, you can simulate the execution of this program on 16
nodes from the ``cluster_crossbar.xml`` platform as follows:
-.. code-block:: shell
+.. code-block:: console
$ smpirun -np 16 -platform cluster_crossbar.xml -hostfile cluster_hostfile ./roundtrip
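The file given to ``-hostfile`` lists the hosts on which the ranks are deployed, one host name per line. If you want to experiment with your own deployment, the following sketch generates such a file. The ``node-<i>.simgrid.org`` naming pattern and the ``make_hostfile`` helper are assumptions made for illustration; use the host names that your platform file actually declares.

```shell
# Sketch: print a hostfile with one host name per line.
# The node-<i>.simgrid.org pattern is an assumption; adapt it to the
# host names declared in your own platform file.
make_hostfile() {
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "node-$i.simgrid.org"
        i=$((i + 1))
    done
}
```

For example, ``make_hostfile 16 > my_hostfile`` writes 16 host names to a (hypothetical) ``my_hostfile`` that can then be passed to ``-hostfile``.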
<https://www.nas.nasa.gov/publications/npb_problem_sizes.html>`_) with
4 nodes.
-.. code-block:: shell
+.. code-block:: console
$ make lu NPROCS=4 CLASS=S
(compilation logs)
visualization tracing, and convert the produced trace for later
use:
-.. code-block:: shell
+.. code-block:: console
- smpirun -np 4 -platform ../cluster_backbone.xml -trace --cfg=tracing/filename:lu.S.4.trace bin/lu.S.4
+ $ smpirun -np 4 -platform ../cluster_backbone.xml -trace --cfg=tracing/filename:lu.S.4.trace bin/lu.S.4
You can then produce a Gantt Chart with the following R chunk. You can
either copy/paste it into an R session, or `turn it into an Rscript executable
.. code-block:: R
- library(pajengr)
- library(ggplot2)
-
# Read the data
- df_state = pajeng_read("lu.S.4.trace")
- names(df_state) = c("Type", "Rank", "Container", "Start", "End", "Duration", "Level", "State");
- df_state = df_state[!(names(df_state) %in% c("Type","Container","Level"))]
- df_state$Rank = as.numeric(gsub("rank-","",df_state$Rank))
+ library(tidyverse)
+ library(pajengr)
+ dta <- pajeng_read("lu.S.4.trace")
+
+ # Manipulate the data
+ dta$state %>%
+ # Remove some unnecessary columns for this example
+ select(-Type, -Imbrication) %>%
+ # Create the nice MPI rank and operations identifiers
+ mutate(Container = as.integer(gsub("rank-", "", Container)),
+ Value = gsub("^PMPI_", "MPI_", Value)) %>%
+ # Rename some columns so it can better fit MPI terminology
+ rename(Rank = Container,
+ Operation = Value) -> df.states
# Draw the Gantt Chart
- gc = ggplot(data=df_state) + geom_rect(aes(xmin=Start, xmax=End, ymin=Rank, ymax=Rank+1,fill=State))
-
- # Produce the output
- plot(gc)
- dev.off()
-
-This produces a file called ``Rplots.pdf`` with the following
+ df.states %>%
+ ggplot() +
+ # Each MPI operation becomes a rectangle
+ geom_rect(aes(xmin=Start, xmax=End,
+ ymin=Rank, ymax=Rank + 1,
+ fill=Operation)) +
+ # Cosmetics
+ xlab("Time [seconds]") +
+ ylab("Rank [count]") +
+ theme_bw(base_size=14) +
+ theme(
+ plot.margin = unit(c(0,0,0,0), "cm"),
+ legend.margin = margin(t = 0, unit='cm'),
+ panel.grid = element_blank(),
+ legend.position = "top",
+ legend.justification = "left",
+ legend.box.spacing = unit(0, "pt"),
+ legend.box.margin = margin(0,0,0,0),
+ legend.title = element_text(size=10)) -> plot
+
+ # Save the plot in a PNG file (dimensions in inches)
+ ggsave("smpi.png",
+ plot,
+ width = 10,
+ height = 3)
+
+This produces a file called ``smpi.png`` with the following
content. You can find more visualization examples `online
<https://simgrid.org/contrib/R_visualization.html>`_.
Now compile and execute the LU benchmark, class A, with 32 nodes.
-.. code-block:: shell
+.. code-block:: console
$ make lu NPROCS=32 CLASS=A
You can even generate the trace during the live simulation as follows:
-.. code-block:: shell
+.. code-block:: console
$ smpirun -trace-ti --cfg=tracing/filename:LU.A.32 -np 32 -platform ../cluster_backbone.xml bin/lu.A.32
``LU.A.32_files``. You can replay this trace with SMPI thanks to ``smpirun``.
For example, the following command replays the trace on a different platform:
-.. code-block:: shell
+.. code-block:: console
$ smpirun -np 32 -platform ../cluster_crossbar.xml -hostfile ../cluster_hostfile -replay LU.A.32
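Since the replay does not re-execute the application, it is convenient to replay the same trace on several platforms in a row. The sketch below only prints the commands (a dry run), so that you can check them first. The ``replay_commands`` helper is hypothetical, and it assumes that ``../cluster_hostfile`` is valid for every platform you list.

```shell
# Sketch: print (not run) one replay command per platform file.
# Remove the "echo" to actually launch the replays.
replay_commands() {
    trace=$1
    nprocs=$2
    shift 2
    for platform in "$@"; do
        echo smpirun -np "$nprocs" -platform "$platform" \
            -hostfile ../cluster_hostfile -replay "$trace"
    done
}
```

For instance, ``replay_commands LU.A.32 32 ../cluster_crossbar.xml ../cluster_backbone.xml`` prints two commands, one per platform.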
.. literalinclude:: /tuto_smpi/gemm_mpi.cpp
:language: cpp
- :lines: 4-19
+ :lines: 9-24
-.. code-block:: shell
+.. code-block:: console
$ smpicxx -O3 gemm_mpi.cpp -o gemm
$ time smpirun -np 16 -platform cluster_crossbar.xml -hostfile cluster_hostfile --cfg=smpi/display-timing:yes --cfg=smpi/running-power:1000000000 ./gemm
processors we are simulating (1Gf), so that 1 second of computation is injected
as 1 second in the simulation.
-.. code-block:: shell
+.. code-block:: console
[5.568556] [smpi_kernel/INFO] Simulated time: 5.56856 seconds.
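To see why matching speeds give a one-to-one mapping, recall that SMPI converts each measured computation into an amount of flops using the configured speed of the real machine, and then lets the simulated processor execute these flops. The following sketch works through this arithmetic; the ``injected_seconds`` helper and its argument names are illustrative assumptions, not SimGrid options.

```shell
# Sketch: convert a measured compute time into simulated time.
#   $1: seconds spent computing on the real machine
#   $2: speed declared for the real machine (flop/s), i.e. the value
#       given to smpi/running-power
#   $3: speed of the simulated processor (flop/s)
injected_seconds() {
    awk -v t="$1" -v host="$2" -v sim="$3" \
        'BEGIN { printf "%.3f\n", t * host / sim }'
}
```

With both speeds at 1Gf, ``injected_seconds 1 1000000000 1000000000`` yields one simulated second. Declaring the real machine as half as fast as the simulated one would halve the injected time.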
Now run the code again with various sizes and parameters and check the time taken for the
simulation, as well as the resulting simulated time.
-.. code-block:: shell
+.. code-block:: console
[5.575691] [smpi_kernel/INFO] Simulated time: 5.57569 seconds.
The simulation took 1.23698 seconds (after parsing and platform setup)
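When comparing many runs, it helps to extract both timings automatically. The sketch below reads the output of ``smpirun`` on its standard input and reports the simulated time, the wall-clock time of the simulation, and their ratio. The ``timing_summary`` helper is hypothetical and assumes that the two lines look like the ones shown above.

```shell
# Sketch: extract both timings from smpirun's output and compare them.
timing_summary() {
    awk '
        /Simulated time:/ {
            for (i = 1; i < NF; i++) if ($i == "time:") sim = $(i + 1)
        }
        /The simulation took/ {
            for (i = 1; i < NF; i++) if ($i == "took") wall = $(i + 1)
        }
        END {
            # Fail if either line is missing or the wall time is zero.
            if (sim == "" || wall == "" || wall + 0 == 0) exit 1
            printf "simulated=%s wall=%s ratio=%.2f\n", sim, wall, sim / wall
        }'
}
```

SimGrid logs usually go to the standard error, so a typical use would be ``smpirun ... 2>&1 | timing_summary``. A ratio above 1 means that the simulation ran faster than the simulated execution.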
Once done, you can now run
-.. code-block:: shell
+.. code-block:: console
$ make dt NPROCS=85 CLASS=C
(compilation logs)
targets for sharing, or setting the threshold for automatic ones.
For DT, the process would be to run a smaller class of problems,
-.. code-block:: shell
+.. code-block:: console
$ make dt NPROCS=21 CLASS=A
$ smpirun --cfg=smpi/display-allocs:yes -np 21 -platform ../cluster_backbone.xml bin/dt.A.x BH
This should output:
-.. code-block:: shell
+.. code-block:: console
[smpi_utils/INFO] Memory Usage: Simulated application allocated 198533192 bytes during its lifetime through malloc/calloc calls.
Largest allocation at once from a single process was 3553184 bytes, at dt.c:388. It was called 3 times during the whole simulation.
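The byte counts in this report are easier to compare across problem classes once converted to a larger unit. The following sketch condenses the report into one line. The ``alloc_summary`` helper is hypothetical and assumes the exact wording of the two lines shown above.

```shell
# Sketch: summarize the report printed by --cfg=smpi/display-allocs:yes.
alloc_summary() {
    awk '
        /Simulated application allocated/ {
            for (i = 1; i < NF; i++) if ($i == "allocated") total = $(i + 1)
        }
        /Largest allocation at once/ {
            for (i = 1; i < NF; i++) {
                # Keep the first "was" only: "It was called" comes later.
                if ($i == "was" && largest == "") largest = $(i + 1)
                # Skip "at once"; the other "at" precedes the source location.
                if ($i == "at" && $(i + 1) != "once") site = $(i + 1)
            }
        }
        END {
            if (total == "") exit 1
            sub(/\.$/, "", site)   # drop the period that ends the sentence
            printf "total=%.1f MiB largest=%s bytes at %s\n", total / 1048576, largest, site
        }'
}
```

As for other SimGrid logs, the report probably needs ``2>&1`` to reach the pipe, as in ``smpirun --cfg=smpi/display-allocs:yes ... 2>&1 | alloc_summary``. The reported source location is a natural first candidate for memory sharing.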