For that, simply use <tt>smpicc</tt> as a compiler just
like you use mpicc with other MPI implementations. This script
-replaces your usual compiler (gcc, clang, whatever) and adds the right
+still calls your default compiler (gcc, clang, ...) and adds the right
compilation flags along the way.
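As a sketch, a compilation could look as follows (the source file name and optimization flag are placeholders for illustration; smpicc forwards everything to your real compiler):

```shell
# Compile an MPI application with SMPI's compiler wrapper instead of mpicc.
# smpicc passes the flags through to the underlying compiler (gcc, clang, ...)
# and adds the include and link options needed by the simulator.
smpicc -O2 -o roundtrip roundtrip.c
```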
-This may badly interact with some building infrastructure. Your
+Alas, some build infrastructures cannot cope with this, and your
<tt>./configure</tt> may fail, reporting that the compiler is not
functional. If this happens, define the <tt>SMPI_PRETEND_CC</tt>
environment variable before running the configuration. Do not define
make
@endverbatim
+\warning
+  Again, make sure that SMPI_PRETEND_CC is not set when you actually
+  compile your application. It is only a work-around for some configure
+  scripts: it replaces some internals with "return 0;", so your
+  simulation will not work if this variable is set!
+
@subsection SMPI_use_exec Executing your code on the simulator
Use the <tt>smpirun</tt> script as follows for that:
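As a sketch (the platform and hostfile names are placeholders), a typical invocation resembles:

```shell
# Run the simulated application with 4 MPI ranks.
# platform.xml describes the simulated machines and network;
# hostfile lists the simulated hosts onto which the ranks are mapped.
smpirun -np 4 -platform platform.xml -hostfile hostfile ./roundtrip
```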
- mvapich2: use mvapich2 selector for the alltoall operations
- impi: use intel mpi selector for the alltoall operations
- automatic (experimental): use an automatic self-benchmarking algorithm
+ - bruck: described by Bruck et al. in <a href="http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=642949">this paper</a>
 - 2dmesh: organizes the nodes as a two-dimensional mesh and performs the
   allgather along both dimensions
- 3dmesh: adds a third dimension to the previous algorithm
discussed in this article: "Automatic Handling of Global Variables for
Multi-threaded MPI Programs",
available at http://charm.cs.illinois.edu/newPapers/11-23/paper.pdf
-(note that this article does not deal with SMPI but with a concurrent
+(note that this article does not deal with SMPI but with a competing
solution called AMPI that suffers from the same issue).
SimGrid can duplicate and dynamically switch the .data and .bss
segments of the ELF process when switching the MPI ranks, allowing
each rank to have its own copy of the global variables. This feature
is expected to work correctly on Linux and BSD, so smpirun activates
-it by default. %As no copy is involved, performance should not be
+it by default. As no copy is involved, performance should not be
altered (but memory occupation will be higher).
If you want to turn it off, pass \c -no-privatize to smpirun. This may
SMPI_SHARED_MALLOC macro in this case, sorry.
This feature is demoed by the example file
-<tt>examples/smpi/NAS/DT-folding/dt.c</tt>
+<tt>examples/smpi/NAS/dt.c</tt>
@subsection SMPI_adapting_speed Toward faster simulations
times of your loop iterations are not stable.
This feature is demoed by the example file
-<tt>examples/smpi/NAS/EP-sampling/ep.c</tt>
+<tt>examples/smpi/NAS/ep.c</tt>
@section SMPI_accuracy Ensuring accurate simulations
-TBD
+Out of the box, SimGrid may give you fairly accurate results, but
+there are plenty of factors that could go wrong and make your results
+inaccurate or even plainly wrong. Actually, you can only get accurate
+results out of a carefully built model, covering both the system
+hardware and your application. Such models are hard to carry over and
+reuse in other settings, because aspects that are irrelevant to one
+application (say, the latency of point-to-point communications,
+collective operation implementation details or CPU-network
+interaction) may be crucial to another one. The dream of the perfect
+model, encompassing every aspect, is only a chimera, as the only
+perfect model of reality is reality itself. If you go for simulation,
+then you have to ignore some irrelevant aspects of reality, but which
+aspects are irrelevant is actually application-dependent...
+
+The only way to assess whether your settings provide accurate results
+is to double-check these results. If possible, you should first run
+the same experiment in simulation and in real life, gathering as much
+information as you can. Try to understand the discrepancies that you
+observe between the two settings (visualization can be invaluable
+here). Then, modify your model (of the platform, of the collective
+operations) to reduce the most prominent differences.
+
+If the discrepancies come from the computing time, try adapting the
+\c smpi/host-speed option: reduce it if your simulation runs faster
+than the real application. If the errors come from the communications,
+then you need to fiddle with your platform file.
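For reference, a platform file is an XML description of the hosts and links; a minimal sketch could look like the following (all identifiers and numeric values are placeholders that you would calibrate against your real machines):

```xml
<?xml version='1.0'?>
<!DOCTYPE platform SYSTEM "http://simgrid.gforge.inria.fr/simgrid/simgrid.dtd">
<platform version="4">
  <AS id="AS0" routing="Full">
    <!-- Host speed and link bandwidth/latency are the knobs to tune
         when simulated times drift away from the real ones. -->
    <host id="host0" speed="100Gf"/>
    <host id="host1" speed="100Gf"/>
    <link id="link0" bandwidth="10Gbps" latency="50us"/>
    <route src="host0" dst="host1"><link_ctn id="link0"/></route>
  </AS>
</platform>
```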
+
+Be inventive in your modeling. Don't be afraid if the names given by
+SimGrid do not match the real names: we got very good results by
+modeling multicore/GPU machines as sets of separate hosts
+interconnected with very fast networks (but don't trust your model
+just because it has the right names in the right places either).
+
+Finally, you may want to check [this
+article](https://hal.inria.fr/hal-00907887) on the classical pitfalls
+in modeling distributed systems.
*/
-/** @example include/smpi/smpi.h */
\ No newline at end of file
+/** @example include/smpi/smpi.h */