environment. In fact, almost all proxy apps provided by the `ExaScale
Project <https://proxyapps.exascaleproject.org/>`_ only require minor
modifications to `run on top of SMPI
<https://framagit.org/simgrid/SMPI-proxy-apps>`_.
This setting lets you debug your MPI applications in a perfectly
reproducible setup, with no Heisenbugs. Enjoy the full clairvoyance
provided by the simulator while running what-if analyses on platforms
that are still to be built! Several `production-grade MPI applications
<https://framagit.org/simgrid/SMPI-proxy-apps#full-scale-applications>`_
use SimGrid for their integration and performance testing.
In SMPI, communications are simulated while computations are
emulated. This means that while computations occur as they would in
the real system, communication calls are intercepted and handled by
the simulator.
To start using SMPI, you just need to compile your application with
``smpicc`` instead of ``mpicc`` (or ``smpicxx`` instead of ``mpicxx``),
and to run the result with ``smpirun``. Under the hood, communication
calls are implemented using SimGrid: data is exchanged through memory
copy, while the simulator's performance models are used to predict the
time taken by each communication. Any computations
occurring between two MPI calls are benchmarked, and the corresponding
time is reported into the simulator.
.. image:: /tuto_smpi/img/big-picture.svg
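Concretely, a classical MPI program requires no source change to run on
top of SMPI; only the compilation and execution commands differ. Here is
a minimal sketch (the file name, platform file, and host file are
placeholders):

.. code-block:: cpp

   // hello.cpp -- compile with:  smpicxx -O3 hello.cpp -o hello
   // run with:  smpirun -np 4 -platform my_platform.xml -hostfile my_hostfile ./hello
   #include <mpi.h>
   #include <cstdio>

   int main(int argc, char* argv[])
   {
     MPI_Init(&argc, &argv); // every MPI call below is intercepted by SMPI
     int rank, size;
     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
     MPI_Comm_size(MPI_COMM_WORLD, &size);
     // The printf itself is plain computation: it executes natively
     printf("Hello from rank %d out of %d\n", rank, size);
     MPI_Finalize();
     return 0;
   }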
Simple Example with 3 hosts
...........................
Imagine you want to describe a little platform with three hosts,
interconnected as follows:
.. image:: /tuto_smpi/3hosts.png
:align: center
This can be done with the following platform file, which describes the
simulated platform as a graph of hosts and network links.

.. literalinclude:: /tuto_smpi/3hosts.xml
   :language: xml
The basic elements (:ref:`pf_tag_host` and :ref:`pf_tag_link`) are
described first, and then the routes between any pair of hosts are
explicitly given with :ref:`pf_tag_route`.

Any host must be given a computational speed in flops, while links must
be given a latency and a bandwidth. You can write 1Gf for
1,000,000,000 flops (the full list of units is in the reference guide of
:ref:`pf_tag_host` and :ref:`pf_tag_link`).

Routes defined with :ref:`pf_tag_route` are symmetrical by default,
meaning that the list of traversed links from A to B is the same as
from B to A. Explicitly define non-symmetrical routes if you prefer.
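For the record, such a platform file looks roughly as follows. This is a
sketch rather than the exact contents of ``3hosts.xml``: the host and
link names, speeds, bandwidths, and latencies are illustrative.

.. code-block:: xml

   <?xml version='1.0'?>
   <!DOCTYPE platform SYSTEM "https://simgrid.org/simgrid.dtd">
   <platform version="4.1">
     <zone id="world" routing="Full">
       <host id="host0" speed="1Gf"/>
       <host id="host1" speed="2Gf"/>
       <host id="host2" speed="40Gf"/>

       <link id="link0" bandwidth="125MBps" latency="100us"/>
       <link id="link1" bandwidth="50MBps"  latency="150us"/>
       <link id="link2" bandwidth="250MBps" latency="50us"/>

       <!-- Symmetrical by default: this route also serves host1 -> host0 -->
       <route src="host0" dst="host1">
         <link_ctn id="link0"/><link_ctn id="link1"/>
       </route>
       <route src="host0" dst="host2">
         <link_ctn id="link0"/><link_ctn id="link2"/>
       </route>
       <route src="host1" dst="host2">
         <link_ctn id="link1"/><link_ctn id="link2"/>
       </route>
     </zone>
   </platform>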
Cluster with a Crossbar
.......................
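Declaring each host and link of a real cluster by hand would be tedious,
so SimGrid provides the :ref:`pf_tag_cluster` tag, which creates a set of
homogeneous hosts interconnected through a crossbar by default. Below is
a sketch with illustrative names and values (the labs later in this
tutorial use such a file under the name ``cluster_crossbar.xml``):

.. code-block:: xml

   <?xml version='1.0'?>
   <!DOCTYPE platform SYSTEM "https://simgrid.org/simgrid.dtd">
   <platform version="4.1">
     <!-- 256 hosts named node-0.acme.org ... node-255.acme.org, each with
          its own private link to a non-blocking crossbar interconnect -->
     <cluster id="acme" prefix="node-" suffix=".acme.org" radical="0-255"
              speed="1Gf" bw="125MBps" lat="50us"/>
   </platform>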
.. code-block:: shell

   sudo apt install simgrid pajeng make gcc g++ gfortran python3 vite
For R analysis of the produced traces, you may want to install R
and the `pajengr <https://github.com/schnorr/pajengr#installation/>`_ package.
.. code-block:: shell
the documentation is up-to-date.
Lab 3: Execution Sampling on Matrix Multiplication example
----------------------------------------------------------
The second method to speed up simulations is to sample the computation
parts in the code. This means that the person doing the simulation
needs to identify which computation parts can be sampled without hurting
the simulation accuracy. Furthermore, there should not be any MPI calls
inside such parts of the code.
For this part, use the `gemm_mpi.cpp
<https://gitlab.com/PRACE-4IP/CodeVault/raw/master/hpc_kernel_samples/dense_linear_algebra/gemm/mpi/src/gemm_mpi.cpp>`_
example, which is provided by the `PRACE Codevault repository
<http://www.prace-ri.eu/prace-codevault/>`_.
The computing part of this example is the matrix multiplication routine:

.. literalinclude:: /tuto_smpi/gemm_mpi.cpp
   :language: cpp
   :lines: 4-19

.. code-block:: shell

   $ smpicxx -O3 gemm_mpi.cpp -o gemm
   $ time smpirun -np 16 -platform cluster_crossbar.xml -hostfile cluster_hostfile --cfg=smpi/display-timing:yes --cfg=smpi/running-power:1000000000 ./gemm

This should end quite quickly, as the size of each matrix is only 1000x1000.
But what happens if we want to simulate larger runs?
Replace the matrix size with 2000 or 3000 and try again.
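The simulated time grows, but so does the wall-clock time, since every
multiplication is still executed for real. This is where sampling helps:
annotated loops are benchmarked for a few iterations only, after which the
remaining iterations are skipped and replaced by a time injection in the
simulator. The sketch below shows the idea on a ``multiply``-like kernel.
Beware that it is an illustration, not the official solution: the helper's
shape is made up for this example, and the exact signature of the
``SMPI_SAMPLE_GLOBAL`` macro has changed across SimGrid releases, so check
the ``smpi/smpi.h`` of your installation.

.. code-block:: cpp

   #include <smpi/smpi.h> // brings in the SMPI_SAMPLE_* macros

   static void multiply_sampled(float* a, float* b, float* c,
                                int istart, int iend, int size)
   {
     // Replaces `for (int i = istart; i <= iend; ++i)`: roughly, a few
     // outer iterations really execute and get benchmarked (here about 10,
     // or until the estimate stabilizes within 0.5%); the remaining ones
     // are skipped and their duration is injected into the simulated
     // clock. The GLOBAL variant shares the estimate across all ranks.
     SMPI_SAMPLE_GLOBAL(int i = istart, i <= iend, ++i, 10, 0.005)
     {
       for (int j = 0; j < size; ++j)
         for (int k = 0; k < size; ++k)
           c[i * size + j] += a[i * size + k] * b[k * size + j];
     }
   }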
Lab 4: Memory folding on large allocations
------------------------------------------
Another issue that can be encountered when simulating with SMPI is a lack
of memory. Indeed, we are executing all MPI processes on a single node,
which can lead to crashes as soon as the ranks' combined allocations
exceed that node's memory.
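SMPI's answer is to fold large allocations: a buffer allocated through
``SMPI_SHARED_MALLOC`` is backed by the same physical pages for every
rank, so only one copy exists no matter the number of ranks. The computed
values become meaningless, so this only suits buffers whose contents do
not influence the application's control flow, such as the matrices of
this example. A minimal sketch (the size and variable names are
illustrative):

.. code-block:: cpp

   #include <mpi.h>
   #include <smpi/smpi.h> // brings in SMPI_SHARED_MALLOC / SMPI_SHARED_FREE

   int main(int argc, char* argv[])
   {
     MPI_Init(&argc, &argv);

     const size_t n = 20000; // ~3.2 GB per rank as a plain malloc
     // All ranks receive pointers backed by the same physical pages:
     // the contents are bogus, but the memory is allocated only once.
     double* matrix = (double*)SMPI_SHARED_MALLOC(n * n * sizeof(double));

     /* ... the usual computations and communications on matrix ... */

     SMPI_SHARED_FREE(matrix);
     MPI_Finalize();
     return 0;
   }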
You may also be interested in the `SMPI reference article
<https://hal.inria.fr/hal-01415484>`_ or these `introductory slides
<http://simgrid.org/tutorials/simgrid-smpi-101.pdf>`_. The :ref:`SMPI
reference documentation <SMPI_doc>` covers much more content than
this short tutorial.
Finally, we regularly use SimGrid when teaching MPI. This way,