+\section options_virt Configuring the User Process Virtualization
+
+\subsection options_virt_factory Selecting the virtualization factory
+
+In SimGrid, the user code is virtualized in a specific mecanism
+allowing the simulation kernel to control its execution: when a user
+process requires a blocking action (such as sending a message), it is
+interrupted, and only gets released when the simulated clock reaches
+the point where the blocking operation is done.
+
+In SimGrid, the containers in which user processes are virtualized are
+called contexts. Several context factory are provided, and you can
+select the one you want to use with the \b contexts/factory
+configuration item. Some of the following may not exist on your
+machine because of portability issues. In any case, the default one
+should be the most effcient one (please report bugs if the
+auto-detection fails for you). They are sorted here from the slowest
+to the most effient:
+ - \b thread: very slow factory using full featured threads (either
+ ptheads or windows native threads)
+ - \b ucontext: fast factory using System V contexts (or a portability
+ layer of our own on top of Windows fibers)
+ - \b raw: amazingly fast factory using a context switching mecanism
+ of our own, directly implemented in assembly (only available for x86
+ and amd64 platforms for now)
+
+The only reason to change this setting is when the debuging tools get
+fooled by the optimized context factories. Threads are the most
+debugging-friendly contextes.
+
+\subsection options_virt_stacksize Adapting the used stack size
+
+(this only works if you use ucontexts or raw context factories)
+
+Each virtualized used process is executed using a specific system
+stack. The size of this stack has a huge impact on the simulation
+scalability, but its default value is rather large. This is because
+the error messages that you get when the stack size is too small are
+rather disturbing: this leads to stack overflow (overwriting other
+stacks), leading to segfaults with corrupted stack traces.
+
+If you want to push the scalability limits of your code, you really
+want to reduce the \b contexts/stack_size item. Its default value
+is 128 (in Kib), while our Chord simulation works with stacks as small
+as 16 Kib, for example.
+
+\subsection options_virt_parallel Running user code in parallel
+
+Parallel execution of the user code is only considered stable in
+SimGrid v3.7 and higher. It is described in
+<a href="http://hal.inria.fr/inria-00602216/">INRIA RR-7653</a>.
+
+If you are using the \c ucontext or \c raw context factories, you can
+request to execute the user code in parallel. Several threads are
+launched, each of them handling as much user contexts at each run. To
+actiave this, set the \b contexts/nthreads item to the amount of
+cores that you have in your computer (or -1 to have the amount of cores
+auto-detected).
+
+Even if you asked several worker threads using the previous option,
+you can request to start the parallel execution (and pay the
+associated synchronization costs) only if the potential parallelism is
+large enough. For that, set the \b contexts/parallel_threshold
+item to the minimal amount of user contexts needed to start the
+parallel execution. In any given simulation round, if that amount is
+not reached, the contexts will be run sequentially directly by the
+main thread (thus saving the synchronization costs). Note that this
+option is mainly useful when the grain of the user code is very fine,
+because our synchronization is now very efficient.
+
+When parallel execution is activated, you can choose the
+synchronization schema used with the \b contexts/synchro item,
+which value is either:
+ - \b futex: ultra optimized synchronisation schema, based on futexes
+ (fast user-mode mutexes), and thus only available on Linux systems.
+ This is the default mode when available.
+ - \b posix: slow but portable synchronisation using only POSIX
+ primitives.
+ - \b busy_wait: not really a synchronisation: the worker threads
+ constantly request new contexts to execute. It should be the most
+ efficient synchronisation schema, but it loads all the cores of your
+ machine for no good reason. You probably prefer the other less
+ eager schemas.
+
+\section options_tracing Configuring the tracing subsystem
+
+The \ref tracing "tracing subsystem" can be configured in several
+different ways depending on the nature of the simulator (MSG, SimDag,
+SMPI) and the kind of traces that need to be obtained. See the \ref
+tracing_tracing_options "Tracing Configuration Options subsection" to
+get a detailed description of each configuration option.
+
+We detail here a simple way to get the traces working for you, even if
+you never used the tracing API.
+
+
+- Any SimGrid-based simulator (MSG, SimDag, SMPI, ...) and raw traces:
+\verbatim
+--cfg=tracing:1 --cfg=tracing/uncategorized:1 --cfg=triva/uncategorized:uncat.plist
+\endverbatim
+ The first parameter activates the tracing subsystem, the second
+ tells it to trace host and link utilization (without any
+ categorization) and the third creates a graph configuration file
+ to configure Triva when analysing the resulting trace file.
+
+- MSG or SimDag-based simulator and categorized traces (you need to declare categories and classify your tasks according to them)
+\verbatim
+--cfg=tracing:1 --cfg=tracing/categorized:1 --cfg=triva/categorized:cat.plist
+\endverbatim
+ The first parameter activates the tracing subsystem, the second
+ tells it to trace host and link categorized utilization and the
+ third creates a graph configuration file to configure Triva when
+ analysing the resulting trace file.
+
+- SMPI simulator and traces for a space/time view:
+\verbatim
+smpirun -trace ...
+\endverbatim
+ The <i>-trace</i> parameter for the smpirun script runs the
+simulation with --cfg=tracing:1 and --cfg=tracing/smpi:1. Check the
+smpirun's <i>-help</i> parameter for additional tracing options.
+
+\section options_smpi Configuring SMPI
+
+The SMPI interface provides several specific configuration items.
+These are uneasy to see since the code is usually launched through the
+\c smiprun script directly.
+
+\subsection options_smpi_bench Automatic benchmarking of SMPI code
+
+In SMPI, the sequential code is automatically benchmarked, and these
+computations are automatically reported to the simulator. That is to
+say that if you have a large computation between a \c MPI_Recv() and a
+\c MPI_Send(), SMPI will automatically benchmark the duration of this
+code, and create an execution task within the simulator to take this
+into account. For that, the actual duration is measured on the host
+machine and then scaled to the power of the corresponding simulated
+machine. The variable \b smpi/running_power allows to specify the
+computational power of the host machine (in flop/s) to use when
+scaling the execution times. It defaults to 20000, but you really want
+to update it to get accurate simulation results.
+
+When the code is constituted of numerous consecutive MPI calls, the
+previous mechanism feeds the simulation kernel with numerous tiny
+computations. The \b smpi/cpu_threshold item becomes handy when this
+impacts badly the simulation performance. It specify a threshold (in
+second) under which the execution chunks are not reported to the
+simulation kernel (default value: 1e-6). Please note that in some
+circonstances, this optimization can hinder the simulation accuracy.
+
+\subsection options_smpi_timing Reporting simulation time
+
+Most of the time, you run MPI code through SMPI to compute the time it
+would take to run it on a platform that you don't have. But since the
+code is run through the \c smpirun script, you don't have any control
+on the launcher code, making difficult to report the simulated time
+when the simulation ends. If you set the \b smpi/display_timing item
+to 1, \c smpirun will display this information when the simulation ends. \verbatim
+Simulation time: 1e3 seconds.
+\endverbatim
+
+\section options_generic Configuring other aspects of SimGrid
+
+\subsection options_generic_path XML file inclusion path
+
+It is possible to specify a list of directories to search into for the
+\<include\> tag in XML files by using the \b path configuration
+item. To add several directory to the path, set the configuration
+item several times, as in \verbatim
+--cfg=path:toto --cfg=path:tutu
+\endverbatim
+
+\subsection options_generic_exit Behavior on Ctrl-C
+
+By default, when Ctrl-C is pressed, the status of all existing
+simulated processes is displayed. This is very useful to debug your
+code, but it can reveal troublesome in some cases (such as when the
+amount of processes becomes really big). This behavior is disabled
+when \b verbose-exit is set to 0 (it is to 1 by default).
+
+\section options_index Index of all existing configuration items
+
+- \c contexts/factory: \ref options_virt_factory
+- \c contexts/nthreads: \ref options_virt_parallel
+- \c contexts/parallel_threshold: \ref options_virt_parallel
+- \c contexts/stack_size: \ref options_virt_stacksize
+- \c contexts/synchro: \ref options_virt_parallel
+
+- \c cpu/maxmin_selective_update: \ref options_model_optim
+- \c cpu/model: \ref options_model_select
+- \c cpu/optim: \ref options_model_optim
+
+- \c gtnets/jitter: \ref options_pls
+- \c gtnets/jitter_seed: \ref options_pls
+
+- \c maxmin/precision: \ref options_model_precision
+
+- \c network/bandwidth_factor: \ref options_model_network_coefs
+- \c network/coordinates: \ref options_model_network_coord
+- \c network/crosstraffic: \ref options_model_network_crosstraffic
+- \c network/latency_factor: \ref options_model_network_coefs
+- \c network/maxmin_selective_update: \ref options_model_optim
+- \c network/model: \ref options_model_select
+- \c network/optim: \ref options_model_optim
+- \c network/sender_gap: \ref options_model_network_sendergap
+- \c network/TCP_gamma: \ref options_model_network_gamma
+- \c network/weight_S: \ref options_model_network_coefs
+
+- \c ns3/TcpModel: \ref options_pls
+
+- \c surf/nthreads: \ref options_model_nthreads
+
+- \c smpi/running_power: \ref options_smpi_bench
+- \c smpi/display_timing: \ref options_smpi_timing
+- \c smpi/cpu_threshold: \ref options_smpi_bench
+
+- \c path: \ref options_generic_path
+- \c verbose-exit: \ref options_generic_exit
+
+- \c workstation/model: \ref options_model_select
+