[Doc] Added tentative note on collective algorithms to doc

diff --git a/doc/doxygen/options.doc b/doc/doxygen/options.doc

index eae314b..f0318bb 100644 (file)
--- a/doc/doxygen/options.doc
+++ b/doc/doxygen/options.doc
@@ -64,9 +64,11 @@ int main(int argc, char *argv[]) {
  
  \section options_model Configuring the platform models
  
+\anchor options_storage_model
+\anchor options_vm_workstation_model
  \subsection options_model_select Selecting the platform models
  
-SimGrid comes with several network and CPU models built in, and you
+SimGrid comes with several network, CPU and storage models built in, and you
  can change the used model at runtime by changing the passed
  configuration. The three main configuration items are given below.
  For each of these items, passing the special \c help value gives
@@ -75,6 +77,8 @@ should provide information about all models for all existing resources.
     - \b network/model: specify the used network model
     - \b cpu/model: specify the used CPU model
     - \b workstation/model: specify the used workstation model
+   - \b storage/model: specify the used storage model (there is currently only one such model - this option is hence only useful for future releases)
+   - \b vm_workstation/model: specify the workstation model for virtual machines (there is currently only one such model - this option is hence only useful for future releases)
  
  %As of writing, the following network models are accepted. Over
  the time new models can be added, and some experimental models can be
@@ -306,6 +310,7 @@ be executed with the command line argument
  \verbatim
  --cfg=model-check:1
  \endverbatim
+
  Safety properties are expressed as assertions using the function
  \verbatim
  void MC_assert(int prop);
@@ -370,20 +375,158 @@ Of course, specifying a reduction technique enables the model-checking
  so that you don't have to give <tt>--cfg=model-check:1</tt> in
  addition.
  
+\subsection options_modelchecking_visited model-check/visited, Cycle detection
+
+In order to detect cycles, the model-checker needs to check if a newly explored
+state is in fact the same state as a previous one. In order to do this,
+the model-checker can take a snapshot of each visited state: this snapshot is
+then compared with subsequent states in the exploration graph.
+
+The \b model-check/visited item sets the maximum number of states which are
+stored in memory. If the maximum number of snapshotted states is reached, some
+states will be removed from memory and some cycles might be missed.
+
+By default, no state is snapshotted and cycles cannot be detected.
+
+\subsection options_modelchecking_termination model-check/termination, Non-termination detection
+
+The \b model-check/termination configuration item can be used to report if a
+non-terminating execution path has been found. This is a path with a cycle,
+which means that the program might never terminate.
+
+This only works in safety mode.
+
+This option is disabled by default.
+
+\subsection options_modelchecking_dot_output model-check/dot_output, Dot output
+
+If set, the \b model-check/dot_output configuration item is the name of a file
+in which to write a dot file of the path leading to the property violation found
+(safety or liveness), as well as the cycle for liveness properties. This dot
+file can then be fed to the graphviz dot tool to generate a corresponding
+graphical representation.
+
+\subsection options_modelchecking_max_depth model-check/max_depth, Depth limit
+
+The \b model-check/max_depth item sets the maximum depth of the exploration
+graph of the model-checker. If this limit is reached, a logging message is
+sent and the results might not be exact.
+
+By default, there is no depth limit.
+
+\subsection options_modelchecking_timeout Handling of timeout
+
+By default, the model-checker does not handle timeout conditions: the `wait`
+operations never time out. With the \b model-check/timeout configuration item
+set to \b yes, the model-checker will explore timeouts of `wait` operations.
+
+\subsection options_modelchecking_comm_determinism Communication determinism
+
+The \b model-check/communications_determinism and
+\b model-check/send_determinism items can be used to select the communication
+determinism mode of the model-checker which checks determinism properties of
+the communications of an application.
+
+\subsection options_modelchecking_sparse_checkpoint Per page checkpoints
+
+When the model-checker is configured to take a snapshot of each explored state
+(with the \b model-check/visited item), the memory consumption can rapidly
+reach GiB or TiB of memory. However, for many workloads, the memory does not
+change much between different snapshots and taking a complete copy of each
+snapshot is a waste of memory.
+
+The \b model-check/sparse-checkpoint item can be set to \b yes in order
+to avoid making a complete copy of each snapshot: instead, each snapshot will be
+decomposed into blocks which will be stored separately.
+If multiple snapshots share the same block (or if the same block
+occurs several times in one snapshot), a single copy of the block will be
+shared, reducing the memory footprint.
+
+For many applications, this option considerably reduces the memory consumption.
+In some cases, the model-checker might be slightly slower because of the time
+taken to manage the metadata about the blocks. In other cases however, this
+snapshotting strategy will be much faster by reducing the cache consumption.
+When the memory consumption is high, this option might be much faster than
+the basic snapshotting strategy because it avoids hitting the swap or reduces
+the swap usage.
+
+This option is currently disabled by default.
+
  \subsection options_mc_perf Performance considerations for the model checker
  
  The size of the stacks can have a huge impact on the memory
-consumption when using model-checking. Currently each snapshot, will
-save a copy of the whole stack and not only of the part which is
+consumption when using model-checking. By default, each snapshot will
+save a copy of each whole stack and not only of the part which is
  really meaningful: you should expect the contribution of the memory
  consumption of the snapshots to be \f$ \mbox{number of processes}
  \times \mbox{stack size} \times \mbox{number of states} \f$.
  
-However, when compiled against the model checker, the stacks are not
+The \b model-check/sparse-checkpoint item can be used to reduce the memory
+consumption by trying to share memory between the different snapshots.
+
+When compiled against the model checker, the stacks are not
  protected with guards: if the stack size is too small for your
  application, the stack will silently overflow on other parts of the
  memory.
  
+\subsection options_modelchecking_hash Hashing of the state (experimental)
+
+Usually most of the time of the model-checker is spent comparing states. This
+process is complicated and consumes a lot of bandwidth and cache.
+In order to speed up the state comparison, the experimental \b model-check/hash
+configuration item enables the computation of a hash summarizing as much
+information of the state as possible into a single value. This hash can be used
+to avoid most of the comparisons: the costly comparison is then only used when
+the hashes are identical.
+
+Currently most of the state is not included in the hash because the
+implementation was found to be buggy, so this option is not as useful as
+it could be. For this reason, it is currently disabled by default.
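The hash-then-compare idea described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the `state_view` type, the `make_state` helper and the use of FNV-1a are all hypothetical stand-ins, not the data structures or the hash function used by the model-checker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flattened view of a state: raw bytes plus a cached hash. */
typedef struct {
    const unsigned char *bytes;
    size_t size;
    uint64_t hash;
} state_view;

/* 64-bit FNV-1a: a stand-in hash chosen for brevity (assumption). */
uint64_t fnv1a64(const unsigned char *bytes, size_t size)
{
    uint64_t h = 14695981039346656037ULL; /* FNV offset basis */
    for (size_t i = 0; i < size; i++) {
        h ^= bytes[i];
        h *= 1099511628211ULL;            /* FNV prime */
    }
    return h;
}

/* Build a state view and compute its hash once, up front. */
state_view make_state(const void *bytes, size_t size)
{
    assert(bytes != NULL || size == 0);
    state_view s = { bytes, size, fnv1a64(bytes, size) };
    return s;
}

/* Cheap hash check first; the byte-wise comparison runs only on a match. */
bool states_equal(const state_view *a, const state_view *b)
{
    if (a->hash != b->hash)
        return false; /* different hashes imply different states */
    return a->size == b->size &&
           (a->size == 0 || memcmp(a->bytes, b->bytes, a->size) == 0);
}
```

Identical hashes do not prove equality, which is why the full comparison is still needed in that case; the gain comes from skipping it whenever the hashes differ.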
+
+\subsection options_modelchecking_recordreplay Record/replay (experimental)
+
+As the model-checker keeps jumping to different places in the execution graph,
+it is difficult to understand what happens when trying to debug an application
+under the model-checker. Even the output of the program is difficult to
+interpret. Moreover, the model-checker does not behave nicely with advanced
+debugging tools such as valgrind. For those reasons, it is useful to identify
+a trajectory in the execution graph with the model-checker and then replay
+this trajectory without the model-checker black magic, using more standard
+tools (such as a debugger, valgrind, etc.). To this end, SimGrid implements an
+experimental record/replay functionality in order to record a trajectory with
+the model-checker and replay it without the model-checker.
+
+When the model-checker finds an interesting path in the application execution
+graph (where a safety or liveness property is violated), it can generate an
+identifier for this path. In order to enable this behaviour, the
+\b model-check/record item must be set to \b yes. By default, this behaviour
+is not enabled.
+
+This is an example of output:
+
+<pre>
+[  0.000000] (0:@) Check a safety property
+[  0.000000] (0:@) **************************
+[  0.000000] (0:@) *** PROPERTY NOT VALID ***
+[  0.000000] (0:@) **************************
+[  0.000000] (0:@) Counter-example execution trace:
+[  0.000000] (0:@) Path = 1/3;1/4
+[  0.000000] (0:@) [(1)Tremblay (app)] MC_RANDOM(3)
+[  0.000000] (0:@) [(1)Tremblay (app)] MC_RANDOM(4)
+[  0.000000] (0:@) Expanded states = 27
+[  0.000000] (0:@) Visited states = 68
+[  0.000000] (0:@) Executed transitions = 46
+</pre>
+
+This path can then be replayed outside of the model-checker (and even in a
+non-MC build of SimGrid) by setting the \b model-check/replay item to the given
+path. The other options should be the same (but the model-checker should
+be disabled).
+
+The format and meaning of the path may change between different releases, so
+the same release of SimGrid should be used for the record phase and the replay
+phase.
+
  \section options_virt Configuring the User Process Virtualization
  
  \subsection options_virt_factory Selecting the virtualization factory
@@ -539,6 +682,21 @@ Please, use these two parameters (for comments) to make reproducible
  simulations. For additional details about this and all tracing
  options, check See the \ref tracing_tracing_options.
  
+\section options_msg Configuring MSG
+
+\subsection options_msg_debug_multiple_use Debugging MSG
+
+Sometimes your application may try to send a task that is still being
+executed somewhere else, making it impossible to send this task. However,
+for debugging purposes, one may want to know what the other host is/was
+doing. This option shows a backtrace of the other process.
+
+Enable this option by adding
+
+\verbatim
+--cfg=msg/debug_multiple_use:on
+\endverbatim
+
  \section options_smpi Configuring SMPI
  
  The SMPI interface provides several specific configuration items.
@@ -562,10 +720,17 @@ to update it to get accurate simulation results.
  When the code is constituted of numerous consecutive MPI calls, the
  previous mechanism feeds the simulation kernel with numerous tiny
  computations. The \b smpi/cpu_threshold item becomes handy when this
-impacts badly the simulation performance. It specify a threshold (in
-second) under which the execution chunks are not reported to the
-simulation kernel (default value: 1e-6). Please note that in some
-circonstances, this optimization can hinder the simulation accuracy.
+badly impacts the simulation performance. It specifies a threshold (in
+seconds) below which the execution chunks are not reported to the
+simulation kernel (default value: 1e-6).
+
+
+\note
+    The option smpi/cpu_threshold ignores any computation whose duration is
+    below this threshold. SMPI does not keep track of the \a amount of
+    computation ignored this way; there is no offset to compensate for it.
+    Hence, by using a value that is too high, you may end up with unreliable
+    simulation results.
  
   In some cases, however, one may wish to disable simulation of
  application computation. This is the case when SMPI is used not to
@@ -598,8 +763,16 @@ So, messages with size 65472 and more will get a total of MAX_BANDWIDTH*0.940694
  messages of size 15424 to 65471 will get MAX_BANDWIDTH*0.697866 and so on.
  Here, MAX_BANDWIDTH denotes the bandwidth of the link.
  
+\note
+    The SimGrid-Team has developed a script to help you determine these
+    values. You can find more information and the download here:
+    1. http://simgrid.gforge.inria.fr/contrib/smpi-calibration-doc.html
+    2. http://simgrid.gforge.inria.fr/contrib/smpi-saturation-doc.html
+
  \subsection options_smpi_timing smpi/display_timing: Reporting simulation time
  
+\b Default: 0 (false)
+
  Most of the time, you run MPI code through SMPI to compute the time it
  would take to run it on a platform that you don't have. But since the
  code is run through the \c smpirun script, you don't have any control
@@ -609,7 +782,28 @@ to 1, \c smpirun will display this information when the simulation ends. \verbat
  Simulation time: 1e3 seconds.
  \endverbatim
  
-\subsection options_smpi_global Automatic privatization of global variables
+\subsection options_model_smpi_lat_factor smpi/lat_factor: Latency factors
+
+The motivation and syntax for this option are identical to those
+of smpi/bw_factor, see \ref options_model_smpi_bw_factor for details.
+
+There is an important difference, though: While smpi/bw_factor \a reduces the
+actual bandwidth (i.e., values between 0 and 1 are valid), latency factors
+increase the latency, i.e., values larger than or equal to 1 are valid here.
+
+This is the default value:
+
+\verbatim
+65472:11.6436;15424:3.48845;9376:2.59299;5776:2.18796;3484:1.88101;1426:1.61075;732:1.9503;257:1.95341;0:2.01467
+\endverbatim
+
+\note
+    The SimGrid-Team has developed a script to help you determine these
+    values. You can find more information and the download here:
+    1. http://simgrid.gforge.inria.fr/contrib/smpi-calibration-doc.html
+    2. http://simgrid.gforge.inria.fr/contrib/smpi-saturation-doc.html
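To make the `size:factor;size:factor;...` syntax concrete, here is a minimal C sketch that looks up the factor applying to a given message size. It assumes, as in the smpi/bw_factor description, that an entry applies to messages whose size is at least its threshold and that the entry with the largest such threshold wins. The function name `factor_for_size` and its `fallback` parameter are hypothetical; this is not SMPI's own parser.

```c
#include <assert.h>
#include <stdlib.h>

/* Return the factor of the entry with the largest threshold that is
 * <= msg_size, or `fallback` when no entry applies. Entries may appear
 * in any order. Parsing stops at the first malformed entry. */
double factor_for_size(const char *spec, double msg_size, double fallback)
{
    assert(spec != NULL);
    double best_threshold = -1.0;
    double best_factor = fallback;
    const char *p = spec;
    while (*p != '\0') {
        char *end;
        double threshold = strtod(p, &end);
        if (end == p || *end != ':')
            break;                  /* expected "size:" */
        p = end + 1;
        double factor = strtod(p, &end);
        if (end == p)
            break;                  /* expected a factor after ':' */
        if (threshold <= msg_size && threshold > best_threshold) {
            best_threshold = threshold;
            best_factor = factor;
        }
        if (*end != ';')
            break;                  /* last entry reached */
        p = end + 1;
    }
    return best_factor;
}
```

With the default value shown above, a message of 20000 bytes falls in the `15424:3.48845` entry, so its latency would be multiplied by 3.48845 under these assumptions.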
+
+\subsection options_smpi_global smpi/privatize_global_variables: Automatic privatization of global variables
  
  MPI executables are meant to be executed in separated processes, but SMPI is
  executed in only one process. Global variables from executables will be placed
@@ -656,6 +850,127 @@ simulate the behavior of most of the existing MPI libraries. The \b smpi/coll_se
  uses naive version of collective operations). Each collective operation can be manually selected with a
  \b smpi/collective_name:algo_name. Available algorithms are listed in \ref SMPI_collective_algorithms .
  
+\subsection options_model_smpi_iprobe smpi/iprobe: Inject constant times for calls to MPI_Iprobe
+
+\b Default value: 0.0001
+
+The behavior and motivation for this configuration option are identical to those of \a smpi/test, see
+Section \ref options_model_smpi_test for details.
+
+\subsection options_model_smpi_ois smpi/ois: Inject constant times for asynchronous send operations
+
+This configuration option works exactly as \a smpi/os, see Section \ref options_model_smpi_os.
+Of course, \a smpi/ois is used to account for MPI_Isend instead of MPI_Send.
+
+\subsection options_model_smpi_os smpi/os: Inject constant times for send operations
+
+In several network models such as LogP, send (MPI_Send, MPI_Isend) and receive (MPI_Recv)
+operations incur costs (i.e., they consume CPU time). SMPI can factor these costs in as well, but the
+user has to configure SMPI accordingly as these values may vary by machine.
+This can be done by using \a smpi/os for MPI_Send operations; for MPI_Isend and
+MPI_Recv, use \a smpi/ois and \a smpi/or, respectively. These work exactly as
+\a smpi/os.
+
+\a smpi/os can consist of multiple sections; each section takes three values, for example:
+
+\verbatim
+    1:3:2;10:5:1
+\endverbatim
+
+Here, the sections are divided by ";" (that is, this example contains two sections).
+Furthermore, each section consists of three values.
+
+1. The first value denotes the minimum size for this section to take effect;
+   read it as "if the message size is greater than this value (and no other section
+   has a larger first value that is also smaller than the message size), use this".
+   In the first section above, this value is "1".
+
+2. The second value is the startup time; this is a constant value that will always
+   be charged, no matter what the size of the message. In the first section above,
+   this value is "3".
+
+3. The third value is the \a per-byte cost. That is, it is charged for every
+   byte of the message (incurring cost messageSize*cost_per_byte)
+   and hence accounts also for larger messages. In the first
+   section of the example above, this value is "2".
+
+Now, SMPI always checks which section it should take for a given message; that is,
+if a message of size 11 is sent with the configuration of the example above, only
+the second section will be used, not the first, as the first value of the second
+section is closer to the message size. Hence, a message of size 11 incurs the
+following cost inside MPI_Send:
+
+\verbatim
+    5+11*1
+\endverbatim
+
+%As 5 is the startup cost and 1 is the cost per byte.
+
+\note
+    The order of sections can be arbitrary; they will be ordered internally.
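The section selection and cost formula described above can be sketched as follows. This is a minimal illustration, not SMPI's implementation: the `os_section` type and `os_cost` function are hypothetical names, and treating a message whose size exactly equals the first value as matching that section is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* One section of an smpi/os-style value, e.g. "10:5:1" (hypothetical type). */
typedef struct {
    double min_size;      /* section applies from this message size on */
    double startup;       /* constant cost, always charged */
    double cost_per_byte; /* charged once per byte of the message */
} os_section;

/* Pick the section with the largest min_size not exceeding msg_size and
 * return startup + msg_size * cost_per_byte. Sections may be given in any
 * order. Returns 0 when no section applies. */
double os_cost(const os_section *sections, size_t count, double msg_size)
{
    assert(sections != NULL || count == 0);
    const os_section *best = NULL;
    for (size_t i = 0; i < count; i++) {
        if (sections[i].min_size <= msg_size &&
            (best == NULL || sections[i].min_size > best->min_size))
            best = &sections[i];
    }
    if (best == NULL)
        return 0.0;
    return best->startup + msg_size * best->cost_per_byte;
}
```

For the example value `1:3:2;10:5:1`, a message of size 11 selects the second section and costs 5+11*1 = 16, while a message of size 5 selects the first section and costs 3+5*2 = 13.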
+
+\subsection options_model_smpi_or smpi/or: Inject constant times for receive operations
+
+This configuration option works exactly as \a smpi/os, see Section \ref options_model_smpi_os.
+Of course, \a smpi/or is used to account for MPI_Recv instead of MPI_Send.
+
+\subsection options_model_smpi_test smpi/test: Inject constant times for calls to MPI_Test
+
+\b Default value: 0.0001
+
+By setting this option, you can control the amount of time a process sleeps
+when MPI_Test() is called; this is important, because SimGrid normally only
+advances the time while communication is happening and thus,
+MPI_Test will not add to the time, resulting in a deadlock if used as a
+break-condition.
+
+Here is an example:
+
+\code{.unparsed}
+    while(!flag) {
+        MPI_Test(&request, &flag, &status);
+        ...
+    }
+\endcode
+
+\note
+    Internally, in order to speed up execution, we use a counter to keep track
+    of how often we already checked if the handle is now valid or not. Hence, we
+    actually use counter*SLEEP_TIME, that is, the time MPI_Test() causes the process
+    to sleep increases linearly with the number of previously failed tests.
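The linear growth mentioned in the note can be illustrated with a small C sketch. It assumes that the counter starts at 1 and is incremented after every failed test; the note does not state the starting value, so treat this as an assumption. Both function names are hypothetical.

```c
#include <assert.h>

/* Sleep injected by one MPI_Test call for a given counter value
 * (assumption: the counter is 1 on the first call). */
double test_sleep(double sleep_time, unsigned counter)
{
    assert(sleep_time >= 0.0);
    return counter * sleep_time;
}

/* Total simulated time injected by `failed_calls` consecutive failed calls:
 * sleep_time * (1 + 2 + ... + failed_calls). */
double total_test_sleep(double sleep_time, unsigned failed_calls)
{
    double total = 0.0;
    for (unsigned counter = 1; counter <= failed_calls; counter++)
        total += test_sleep(sleep_time, counter);
    return total;
}
```

Under these assumptions and with the default value of 0.0001, three failed calls would inject 0.0001 + 0.0002 + 0.0003 = 0.0006 seconds in total, so a long polling loop advances the simulated clock quadratically in the number of failed tests.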
+
+
+\subsection options_model_smpi_use_shared_malloc smpi/use_shared_malloc: Use shared memory
+
+\b Default: 1
+
+SMPI can use shared memory by calling shm_* functions; this might speed up the simulation.
+This opens or creates a new POSIX shared memory object, kept in RAM, in /dev/shm.
+
+If you want to disable this behavior, set the value to 0.
+
+\subsection options_model_smpi_wtime smpi/wtime: Inject constant times for calls to MPI_Wtime
+
+\b Default value: 0
+
+By setting this option, you can control the amount of time a process sleeps
+when MPI_Wtime() is called; this is important, because SimGrid normally only
+advances the time while communication is happening and thus,
+MPI_Wtime will not add to the time, resulting in a deadlock if used as a
+break-condition.
+
+Here is an example:
+
+\code{.unparsed}
+    while(MPI_Wtime() < some_time_bound) {
+        ...
+    }
+\endcode
+
+If the time is never advanced, this loop will clearly never end as MPI_Wtime()
+always returns the same value. Hence, pass a (small) value to the smpi/wtime
+option to force a call to MPI_Wtime to advance the time as well.
+
  
  \section options_generic Configuring other aspects of SimGrid
  
@@ -726,7 +1041,7 @@ silently overflow on other parts of the memory.
    for the moment (May 2015).
  
  \note
-  \b Please \b note: You can also pass the command-line option "\b--help" and
+  \b Please \b note: You can also pass the command-line option "--help" and
       "--help-cfg" to an executable that uses simgrid.
  
  - \c clean_atexit: \ref options_generic_clean_atexit
@@ -754,15 +1069,15 @@ silently overflow on other parts of the memory.
  - \c model-check: \ref options_modelchecking
  - \c model-check/checkpoint: \ref options_modelchecking_steps
  - \c model-check/communications_determinism: \ref options_modelchecking_comm_determinism
+- \c model-check/send_determinism: \ref options_modelchecking_comm_determinism
  - \c model-check/dot_output: \ref options_modelchecking_dot_output
  - \c model-check/hash: \ref options_modelchecking_hash
  - \c model-check/property: \ref options_modelchecking_liveness
  - \c model-check/max_depth: \ref options_modelchecking_max_depth
-- \c model-check/record: \ref options_modelchecking_record
+- \c model-check/record: \ref options_modelchecking_recordreplay
  - \c model-check/reduction: \ref options_modelchecking_reduction
-- \c model-check/replay: \ref options_modelchecking_replay
+- \c model-check/replay: \ref options_modelchecking_recordreplay
  - \c model-check/send_determinism: \ref options_modelchecking_sparse_checkpoint
-- \c model-check/snapshot_fds: \ref options_modelchecking_snapshot_fds
  - \c model-check/sparse-checkpoint: \ref options_modelchecking_sparse_checkpoint
  - \c model-check/termination: \ref options_modelchecking_termination
  - \c model-check/timeout: \ref options_modelchecking_timeout
@@ -815,16 +1130,7 @@ silently overflow on other parts of the memory.
  - \c workstation/model: \ref options_model_select
  
  \subsection options_index_smpi_coll Index of SMPI collective algorithms options
-- \c smpi/allgather: \ref options_model_smpi_coll_allgather
-- \c smpi/allgatherv: \ref options_model_smpi_coll_allgatherv
-- \c smpi/allreduce: \ref options_model_smpi_coll_allreduce
-- \c smpi/alltoall: \ref options_model_smpi_coll_alltoall
-- \c smpi/alltoallv: \ref options_model_smpi_coll_alltoallv
-- \c smpi/barrier: \ref options_model_smpi_coll_barrier
-- \c smpi/bcast: \ref options_model_smpi_coll_bcast
-- \c smpi/gather: \ref options_model_smpi_coll_gather
-- \c smpi/reduce: \ref options_model_smpi_coll_reduce
-- \c smpi/reduce_scatter: \ref options_model_smpi_coll_reduce_scatter
-- \c smpi/scatter: \ref options_model_smpi_coll_scatter
+
+TODO: All available collective algorithms will be made available via the ``smpirun --help-coll`` command.
  
  */