[doc] various doc small improvements

diff --git a/docs/source/Tutorial_MPI_Applications.rst b/docs/source/Tutorial_MPI_Applications.rst

index 4c641cc..eae9d38 100644
--- a/docs/source/Tutorial_MPI_Applications.rst
+++ b/docs/source/Tutorial_MPI_Applications.rst
@@ -13,7 +13,7 @@ C/C++/F77/F90 applications should run out of the box in this
  environment. In fact, almost all proxy apps provided by the `ExaScale
  Project <https://proxyapps.exascaleproject.org/>`_ only require minor
  modifications to `run on top of SMPI
-<https://github.com/simgrid/SMPI-proxy-apps/>`_.
+<https://framagit.org/simgrid/SMPI-proxy-apps>`_.
  
  This setting permits to debug your MPI applications in a perfectly
  reproducible setup, with no Heisenbugs. Enjoy the full Clairevoyance
@@ -81,21 +81,30 @@ examples.
  Simple Example with 3 hosts
  ...........................
  
-At the most basic level, you can describe your simulated platform as a
-graph of hosts and network links. For instance:
+Imagine you want to describe a little platform with three hosts,
+interconnected as follows:
  
  .. image:: /tuto_smpi/3hosts.png
     :align: center
  
+This can be done with the following platform file, that considers the
+simulated platform as a graph of hosts and network links.
+
  .. literalinclude:: /tuto_smpi/3hosts.xml
     :language: xml
  
-Note the way in which hosts, links, and routes are defined in
-this XML. All hosts are defined with a speed (in Gflops), and links
-with a latency (in us) and bandwidth (in MBytes per second). Other
-units are possible and written as expected. Routes specify the list of
-links encountered from one route to another. Routes are symmetrical by
-default.
+The basic elements (with :ref:`pf_tag_host` and
+:ref:`pf_tag_link`) are described first, and then the routes between
+any pair of hosts are explicitly given with :ref:`pf_tag_route`.
+
+Any host must be given a computational speed in flops while links must
+be given a latency and a bandwidth. You can write 1Gf for
+1,000,000,000 flops (full list of units in the reference guide of 
+:ref:`pf_tag_host` and :ref:`pf_tag_link`). 
+
+Routes defined with :ref:`pf_tag_route` are symmetrical by default,
+meaning that the list of traversed links from A to B is the same as
+from B to A. Explicitly define non-symmetrical routes if you prefer.
  
  Cluster with a Crossbar
  .......................
@@ -274,7 +283,7 @@ container to enjoy the provided dependencies.
     when you log out of the container, so don't edit the other files!
  
  All needed dependencies are already installed in this container
-(SimGrid, the C/C++/Fortran compilers, make, pajeng and R). Vite being
+(SimGrid, the C/C++/Fortran compilers, make, pajeng, R and pajengr). Vite being
  only optional in this tutorial, it is not installed to reduce the
  image size.
  
@@ -303,6 +312,14 @@ Debian and Ubuntu for example, you can get them as follows:
  
     sudo apt install simgrid pajeng make gcc g++ gfortran vite
  
+For R analysis of the produced traces, you may want to install R,
+and the `pajengr <https://github.com/schnorr/pajengr#installation/>`_ package.
+
+.. code-block:: shell
+
+   sudo apt install r-base r-cran-devtools cmake flex bison
+   Rscript -e "library(devtools); install_github('schnorr/pajengr');"
+
  To take this tutorial, you will also need the platform files from the
  previous section as well as the source code of the NAS Parallel
  Benchmarks. Just  clone `this repository
@@ -397,7 +414,6 @@ use:
  .. code-block:: shell
  
     smpirun -np 4 -platform ../cluster_backbone.xml -trace --cfg=tracing/filename:lu.S.4.trace bin/lu.S.4
-   pj_dump --ignore-incomplete-links lu.S.4.trace | grep State > lu.S.4.state.csv
  
  You can then produce a Gantt Chart with the following R chunk. You can
  either copy/paste it in a R session, or `turn it into a Rscript executable
@@ -406,10 +422,11 @@ run it again and again.
  
  .. code-block:: R
  
+   library(pajengr)
     library(ggplot2)
  
     # Read the data
-   df_state = read.csv("lu.S.4.state.csv", header=F, strip.white=T)
+   df_state = pajeng_read("lu.S.4.trace")
     names(df_state) = c("Type", "Rank", "Container", "Start", "End", "Duration", "Level", "State");
     df_state = df_state[!(names(df_state) %in% c("Type","Container","Level"))]
     df_state$Rank = as.numeric(gsub("rank-","",df_state$Rank))
@@ -473,7 +490,7 @@ is computationally hungry.
      the documentation is up-to-date.
  
  Lab 3: Execution Sampling on Matrix Multiplication example
--------------------------------
+----------------------------------------------------------
  
  The second method to speed up simulations is to sample the computation
  parts in the code.  This means that the person doing the simulation
@@ -492,13 +509,12 @@ The computing part of this example is the matrix multiplication routine
  .. literalinclude:: /tuto_smpi/gemm_mpi.cpp
     :language: c
     :lines: 4-19
-   
  
  .. code-block:: shell
  
    $ smpicc -O3 gemm_mpi.cpp -o gemm
    $ time smpirun -np 16 -platform cluster_crossbar.xml -hostfile cluster_hostfile --cfg=smpi/display-timing:yes --cfg=smpi/running-power:1000000000 ./gemm
-  
+
  This should end quite quickly, as the size of each matrix is only 1000x1000. 
  But what happens if we want to simulate larger runs ?
  Replace the size by 2000, 3000, and try again.
@@ -572,7 +588,7 @@ so these macros cannot be used when results are critical for the application beh
  
  
  Lab 4: Memory folding on large allocations
--------------------------------
+------------------------------------------
  
  Another issue that can be encountered when simulation with SMPI is lack of memory.
  Indeed we are executing all MPI processes on a single node, which can lead to crashes.
@@ -610,8 +626,8 @@ Further Readings
  
  You may also be interested in the `SMPI reference article
  <https://hal.inria.fr/hal-01415484>`_ or these `introductory slides
-<http://simgrid.org/tutorials/simgrid-smpi-101.pdf>`_. The `SMPI
-reference documentation <SMPI_doc>`_ covers much more content than
+<http://simgrid.org/tutorials/simgrid-smpi-101.pdf>`_. The :ref:`SMPI
+reference documentation <SMPI_doc>` covers much more content than
  this short tutorial.
  
  Finally, we regularly use SimGrid in our teachings on MPI. This way,