Minor fixes in doc.

diff --git a/docs/source/app_smpi.rst b/docs/source/app_smpi.rst

index 5f5d529..513bfa1 100644
--- a/docs/source/app_smpi.rst
+++ b/docs/source/app_smpi.rst
@@ -6,7 +6,7 @@ SMPI: Simulate MPI Applications
  
  .. raw:: html
  
-   <object id="TOC" data="graphical-toc.svg" width="100%" type="image/svg+xml"></object>
+   <object id="TOC" data="graphical-toc.svg" type="image/svg+xml"></object>
     <script>
     window.onload=function() { // Wait for the SVG to be loaded before changing it
       var elem=document.querySelector("#TOC").contentDocument.getElementById("SMPIBox")
@@ -102,6 +102,11 @@ on which host each rank gets mapped of ``-trace`` to activate the
  tracing during the simulation. You can get the full list by running
  ``smpirun -help``
  
+Finally, you can pass :ref:`any valid SimGrid parameter <options>` to your
+program. In particular, you can pass ``--cfg=network/model:ns-3`` to
+use :ref:`model_ns3`. These parameters should be placed after
+the name of your binary on the command line.
+
  ...............................
  Debugging your Code within SMPI
  ...............................
@@ -120,6 +125,30 @@ usual.
     smpirun -wrapper valgrind ...other args...
     smpirun -wrapper "gdb --args" --cfg=contexts/factory:thread ...other args...
  
+Some shortcuts are available:
+
+- ``-gdb`` is equivalent to ``-wrapper "gdb --args" -keep-temps``, to run within the gdb debugger
+- ``-lldb`` is equivalent to ``-wrapper "lldb --" -keep-temps``, to run within the lldb debugger
+- ``-vgdb`` is equivalent to ``-wrapper "valgrind --vgdb=yes --vgdb-error=0" -keep-temps``,
+  to run within valgrind and allow a debugger to be attached
+
+To help locate bottlenecks and the largest allocations in the simulated application,
+pass the ``-analyze`` flag to smpirun. It activates the
+:ref:`smpi/display-timing<cfg=smpi/display-timing>` and
+:ref:`smpi/display-allocs<cfg=smpi/display-allocs>` options and provides hints
+at the end of execution.
+
+SMPI also reports leaks of MPI handles (Comm, Request, Op, Datatype...)
+at the end of execution. This can help identify memory leaks that can trigger
+crashes and slowdowns.
+By default, it only displays the number of leaked items detected.
+Option :ref:`smpi/list-leaks:n<cfg=smpi/list-leaks>` can be used to display the
+first n leaks encountered and their type. For more information, running smpirun
+with ``-wrapper "valgrind --leak-check=full --track-origins=yes"`` should show
+the exact origin of leaked handles.
+Known issue: ``MPI_Cancel`` may trigger internal leaks within SMPI.
+
+
  .. _SMPI_use_colls:
  
  ................................
@@ -186,7 +215,7 @@ Most of these are best described in `STAR-MPI's white paper <https://doi.org/10.
   - mvapich2: use mvapich2 selector for the alltoall operations
   - impi: use intel mpi selector for the alltoall operations
   - automatic (experimental): use an automatic self-benchmarking algorithm
- - bruck: Described by Bruck et.al. in <a href="http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=642949">this paper</a>
+ - bruck: Described by Bruck et al. in `this paper <http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=642949>`_
   - 2dmesh: organizes the nodes as a two dimensional mesh, and perform allgather
     along the dimensions
   - 3dmesh: adds a third dimension to the previous algorithm
@@ -313,9 +342,9 @@ MPI_Allreduce
   - impi: use intel mpi selector for the allreduce operations
   - automatic (experimental): use an automatic self-benchmarking algorithm
   - lr: logical ring reduce-scatter then logical ring allgather
- - rab1: variations of the  <a href="https://fs.hlrs.de/projects/par/mpi//myreduce.html">Rabenseifner</a> algorithm: reduce_scatter then allgather
- - rab2: variations of the  <a href="https://fs.hlrs.de/projects/par/mpi//myreduce.html">Rabenseifner</a> algorithm: alltoall then allgather
- - rab_rsag: variation of the  <a href="https://fs.hlrs.de/projects/par/mpi//myreduce.html">Rabenseifner</a> algorithm: recursive doubling
+ - rab1: variation of the `Rabenseifner <https://fs.hlrs.de/projects/par/mpi//myreduce.html>`_ algorithm: reduce_scatter then allgather
+ - rab2: variation of the `Rabenseifner <https://fs.hlrs.de/projects/par/mpi//myreduce.html>`_ algorithm: alltoall then allgather
+ - rab_rsag: variation of the `Rabenseifner <https://fs.hlrs.de/projects/par/mpi//myreduce.html>`_ algorithm: recursive doubling
     reduce_scatter then recursive doubling allgather
   - rdb: recursive doubling
   - smp_binomial: binomial tree with smp: binomial intra
@@ -514,7 +543,7 @@ file `include/smpi/smpi.h
  <https://framagit.org/simgrid/simgrid/tree/master/include/smpi/smpi.h>`_
  in your version of SimGrid, between two lines containing the ``FIXME``
  marker. If you really miss a feature, please get in touch with us: we
-can guide you though the SimGrid code to help you implementing it, and
+can guide you through the SimGrid code to help you implement it, and
  we'd be glad to integrate your contribution to the main project.
  
  .. _SMPI_what_globals:
@@ -658,7 +687,13 @@ their duration, and this duration will be used for the subsequent
  iterations. These samples are done per processor with
  SMPI_SAMPLE_LOCAL, and shared between all processors with
  SMPI_SAMPLE_GLOBAL. Of course, none of this will work if the execution
-time of your loop iteration are not stable.
+time of your loop iterations is not stable. If some parameters affect
+the timing of a kernel, and if they are reused often
+(the same kernel launched with a few different sizes during the run, for example),
+SMPI_SAMPLE_LOCAL_TAG and SMPI_SAMPLE_GLOBAL_TAG can be used, with a tag
+as last parameter, to differentiate between calls. The tag is a character
+string crafted by the user, with a maximum size of 128, and should include
+what is necessary to group calls of a given size together.
  
  This feature is demoed by the example file
  `examples/smpi/NAS/ep.c <https://framagit.org/simgrid/simgrid/tree/master/examples/smpi/NAS/ep.c>`_
@@ -705,6 +740,36 @@ Finally, you may want to check `this article
  <https://hal.inria.fr/hal-00907887>`_ on the classical pitfalls in
  modeling distributed systems.
  
+----------------------
+Examples of SMPI Usage
+----------------------
+
+A small number of examples can be found directly in the SimGrid
+archive, under `examples/smpi <https://framagit.org/simgrid/simgrid/-/tree/master/examples/smpi>`_.
+Some show how to simply run MPI code in SimGrid, how to use the
+tracing/replay mechanism, or how to use plugins written in S4U to
+extend the simulator's abilities.
+
+Another source of examples lies in the SimGrid archive, under
+`teshsuite/smpi <https://framagit.org/simgrid/simgrid/-/tree/master/teshsuite/smpi>`_.
+They are not in the ``examples`` directory because they probably don't
+constitute pedagogical examples. Instead, they are intended to stress
+our implementation during the tests. Some of you may be interested
+anyway.
+
+But the best source of SMPI examples is certainly the `proxy app
+<https://framagit.org/simgrid/SMPI-proxy-apps>`_ external project.
+Proxy apps are scale models of real, massive HPC applications: each of
+them exhibits the same communication and computation patterns as the
+massive application that it stands for, but they span only a few
+thousand lines instead of millions of lines. These proxy apps
+are usually provided for educational purposes, and also to ensure that
+the represented large HPC applications will work correctly with the
+next generation of runtimes and hardware. `This project
+<https://framagit.org/simgrid/SMPI-proxy-apps>`_ gathers proxy apps
+from different sources, along with the patches needed (if any) to run
+them on top of SMPI.
+
  -------------------------
  Troubleshooting with SMPI
  -------------------------
@@ -755,7 +820,7 @@ lower case) or similar. Just check the logs.
  error: unknown type name 'useconds_t'
  .....................................
  
-Try to add ``-D_GNU_SOURCE`` to your compilation line to get ride
+Try to add ``-D_GNU_SOURCE`` to your compilation line to get rid
  of that error.
  
  The reason is that SMPI provides its own version of ``usleep(3)``