time is reported into the simulator.

.. image:: /tuto_smpi/img/big-picture.svg
   :align: center

Describing Your Platform
------------------------

cluster. The route from ``node-0.simgrid.org`` to ``node-1.simgrid.org``
counts 3 links: the private link of ``node-0.simgrid.org``, the backbone
and the private link of ``node-1.simgrid.org``.
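
Such a crossbar-plus-backbone cluster can be declared with a single ``<cluster>`` tag. The following platform file is a minimal sketch; the ``id``, ``radical``, and the bandwidth/latency values are illustrative assumptions, not taken from this tutorial:

.. code-block:: xml

   <?xml version='1.0'?>
   <!DOCTYPE platform SYSTEM "https://simgrid.org/simgrid.dtd">
   <platform version="4.1">
     <!-- bw/lat describe each node's private link; bb_bw/bb_lat describe
          the shared backbone that every node-to-node route traverses. -->
     <cluster id="my_cluster" prefix="node-" suffix=".simgrid.org"
              radical="0-255" speed="1Gf"
              bw="125MBps" lat="50us"
              bb_bw="2.25GBps" bb_lat="500us"/>
   </platform>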
.. todo::

   Add the picture.

This topology was introduced to further reduce the number of links
while maintaining a high bandwidth for local communications. To model
this in SimGrid, pass a ``topology="DRAGONFLY"`` attribute to your
cluster. It is based on the implementation of the topology used on
Cray XC systems, described in the paper
`Cray Cascade: A scalable HPC system based on a Dragonfly network <https://dl.acm.org/citation.cfm?id=2389136>`_.

For example, ``3,4 ; 3,2 ; 3,1 ; 2``:

- ``3,4``: There are 3 groups with 4 links between each (blue level).
  In our implementation, links to the nth group are attached to the
  nth router of the group.
- ``3,2``: In each group, there are 3 chassis with 2 links between each,
  connecting the nth router of each chassis (black level).
- ``3,1``: In each chassis, 3 routers are connected together with a
  single link (green level).
- ``2``: Each router has two nodes attached (single link).


.. image:: ../../examples/platforms/cluster_dragonfly.svg
   :align: center

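The ``3,4 ; 3,2 ; 3,1 ; 2`` example above can be written as a cluster tag such as in the following sketch (3 groups × 3 chassis × 3 routers × 2 nodes = 54 nodes; the ``id``, speed, and link characteristics are illustrative assumptions):

.. code-block:: xml

   <?xml version='1.0'?>
   <!DOCTYPE platform SYSTEM "https://simgrid.org/simgrid.dtd">
   <platform version="4.1">
     <!-- topo_parameters follows the "groups ; chassis ; routers ; nodes"
          description above: 3 groups, 3 chassis, 3 routers, 2 nodes each. -->
     <cluster id="dragonfly_cluster" prefix="node-" suffix=".simgrid.org"
              radical="0-53" speed="1Gf" bw="125MBps" lat="50us"
              topology="DRAGONFLY" topo_parameters="3,4 ; 3,2 ; 3,1 ; 2"/>
   </platform>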
All needed dependencies are already installed in this container
(SimGrid, the C/C++/Fortran compilers, make, pajeng, and R). Since Vite
is only optional in this tutorial, it is not installed, to reduce the
image size.

The container also includes the example platform files from the
previous section, as well as the source code of the NAS Parallel
Benchmarks.

Compiling and Executing
.......................
Compiling the program is straightforward (double check your
:ref:`SimGrid installation <install>` if you get an error message):

.. code-block:: shell

$ smpicc -O3 roundtrip.c -o roundtrip
$ smpirun -np 16 -platform cluster_crossbar.xml -hostfile cluster_hostfile ./roundtrip

- The ``-np 16`` option, just like in regular MPI, specifies the
  number of MPI processes to use.
- The ``-hostfile cluster_hostfile`` option, just like in regular
MPI, specifies the host file. If you omit this option, ``smpirun``
will deploy the application on the first machines of your platform.
- The ``-platform cluster_crossbar.xml`` option, **which doesn't exist
in regular MPI**, specifies the platform configuration to be
  simulated.
- At the end of the line, one finds the executable name and
command-line arguments (if any -- roundtrip does not expect any arguments).
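
A host file for ``smpirun`` is simply a list of host names, one per line. As a sketch, you could generate one matching the ``node-<n>.simgrid.org`` naming used above with a short shell loop (16 hosts here; adjust the range to your platform):

.. code-block:: shell

   # A host file is one host name per line; smpirun maps MPI ranks onto
   # these hosts in order, wrapping around if there are more ranks.
   for i in $(seq 0 15); do
     echo "node-$i.simgrid.org"
   done > cluster_hostfile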
replay is much faster than live simulation, as the computations are
skipped (the application must be network-dependent for this to work).

You can even generate the trace during the live simulation, as follows:

.. code-block:: shell

   $ smpirun -trace-ti --cfg=tracing/filename:LU.A.32 -np 32 -platform ../cluster_backbone.xml bin/lu.A.32

The produced trace is composed of a file ``LU.A.32`` and a folder
``LU.A.32_files``. You can replay this trace with SMPI thanks to ``smpirun``.
For example, the following command replays the trace on a different platform:

.. code-block:: shell

   $ smpirun -np 32 -platform ../cluster_crossbar.xml -hostfile ../cluster_hostfile -replay LU.A.32

All the outputs are gone, as the application is not really simulated
here: its trace is simply replayed. The gain is small on this
example, but this becomes very interesting when your application
is computationally hungry.

.. todo::

   The commands should be separated and executed by some CI to make sure
   the documentation is up-to-date.

Lab 3: Execution Sampling on EP
-------------------------------