TODO for cache friendly

[simgrid.git] / doc / options.doc
diff --git a/doc/options.doc b/doc/options.doc

index 8badcdc..85519c5 100644 (file)
--- a/doc/options.doc
+++ b/doc/options.doc
@@ -2,71 +2,263 @@
  
  \htmlinclude .options.doc.toc
  
  
  \htmlinclude .options.doc.toc
  
-\section options_simgrid_configuration Changing SimGrid's behavior
-
  A number of options can be given at runtime to change the default
  A number of options can be given at runtime to change the default
-SimGrid behavior. In particular, you can change the default cpu and
-network models...
+SimGrid behavior. For a complete list of all configuration options
+accepted by the SimGrid version used in your simulator, simply pass
+the --help configuration flag to your program. If some of the options
+are not documented on this page, this is a bug that you should please
+report so that we can fix it.
  
  
-\subsection options_simgrid_configuration_fullduplex Using Fullduplex
+\section options_using Passing configuration options to the simulators
  
  
-Experimental fullduplex support is now available on the svn branch. In order to fullduple to work your platform must have two links for each pair
-of interconnected hosts, see an example here:
-\verbatim
-       simgrid_svn_sources/exemples/msg/gtnets/fullduplex-p.xml
+There is several way to pass configuration options to the simulators.
+The most common way is to use the \c --cfg command line argument. For
+example, to set the item \c Item to the value \c Value, simply
+type the following: \verbatim
+my_simulator --cfg=Item:Value (other arguments)
  \endverbatim
  
  \endverbatim
  
-Using fullduplex support ongoing and incoming communication flows are
-treated independently for most models. The exception is the LV08 model which 
-adds 0.05 of usage on the opposite direction for each new created flow. This 
-can be useful to simulate some important TCP phenomena such as ack compression. 
+Several \c --cfg command line arguments can naturally be used. If you
+need to include spaces in the argument, don't forget to quote the
+argument. You can even escape the included quotes (write \' for ' if
+you have your argument between ').
  
  
-Running a fullduplex example:
-\verbatim
-       cd simgrid_svn_sources/exemples/msg/gtnets
-       ./gtnets fullduplex-p.xml fullduplex-d.xml --cfg=fullduplex:1
+Another solution is to use the \c \<config\> tag in the platform file. The
+only restriction is that this tag must occure before the first
+platform element (be it \c \<AS\>, \c \<cluster\>, \c \<peer\> or whatever).
+The \c \<config\> tag takes an \c id attribute, but it is currently
+ignored so you don't really need to pass it. The important par is that
+within that tag, you can pass one or several \c \<prop\> tags to specify
+the configuration to use. For example, setting \c Item to \c Value
+can be done by adding the following to the beginning of your platform
+file: \verbatim
+<config>
+  <prop id="Item" value="Value"/>
+</config>
  \endverbatim
  
  \endverbatim
  
-\subsection options_simgrid_configuration_alternate_network Using alternative flow models
+A last solution is to pass your configuration directly using the C
+interface. Unfortunately, this path is not really easy to use right
+now, and you mess directly with surf internal variables as follows. Check the
+\ref XBT_config "relevant page" for details on all the functions you
+can use in this context, \c _surf_cfg_set being the only configuration set
+currently used in SimGrid. \code
+#include <xbt/config.h>
  
  
-The default simgrid network model uses a max-min based approach as
-explained in the research report
-<a href="ftp://ftp.ens-lyon.fr/pub/LIP/Rapports/RR/RR2002/RR2002-40.ps.gz">A Network Model for Simulation of Grid Application</a>.
-Other models have been proposed and implemented since then (see for example 
-<a href="http://mescal.imag.fr/membres/arnaud.legrand/articles/simutools09.pdf">Accuracy Study and Improvement of Network Simulation in the SimGrid Framework</a>)
-and can be activated at runtime. For example:
-\verbatim
-./mycode platform.xml deployment.xml --cfg=workstation/model:compound --cfg=network/model:LV08 -cfg=cpu/model:Cas01
-\endverbatim
+extern xbt_cfg_t _surf_cfg_set;
  
  
-Possible models for the network are currently "Constant", "CM02",
-"LegrandVelho", "GTNets", Reno", "Reno2", "Vegas". Others will
-probably be added in the future and many of the previous ones are
-experimental and are likely to disappear without notice... To know the
-list of the currently  implemented models, you should use the
---help-models command line option.
+int main(int argc, char *argv[]) {
+     MSG_global_init(&argc, argv);
+     
+     xbt_cfg_set_parse(_surf_cfg_set,"Item:Value");
+     
+     // Rest of your code
+}
+\endcode
  
  
-\verbatim
-./masterslave_forwarder ../small_platform.xml deployment_masterslave.xml  --help-models
-Long description of the workstation models accepted by this simulator:
-  CLM03: Default workstation model, using LV08 and CM02 as network and CPU
-  compound: Workstation model allowing you to use other network and CPU models
-  ptask_L07: Workstation model with better parallel task modeling
-Long description of the CPU models accepted by this simulator:
-  Cas01_fullupdate: CPU classical model time=size/power
-  Cas01: Variation of Cas01_fullupdate with partial invalidation optimization of lmm system. Should produce the same values, only faster
-  CpuTI: Variation of Cas01 with also trace integration. Should produce the same values, only faster if you use availability traces
-Long description of the network models accepted by this simulator:
-  Constant: Simplistic network model where all communication take a constant time (one second)
-  CM02: Realistic network model with lmm_solve and no correction factors
-  LV08: Realistic network model with lmm_solve and these correction factors: latency*=10.4, bandwidth*=.92, S=8775
-  Reno: Model using lagrange_solve instead of lmm_solve (experts only)
-  Reno2: Model using lagrange_solve instead of lmm_solve (experts only)
-  Vegas: Model using lagrange_solve instead of lmm_solve (experts only)
+\section options_model Configuring the platform models
+
+\subsection options_model_select Selecting the platform models
+
+SimGrid comes with several network and CPU models built in, and you
+can change the used model at runtime by changing the passed
+configuration. The three main configuration items are given below.
+For each of these items, passing the special \c help value gives
+you a short description of all possible values. Also, \c --help-models
+should provide information about all models for all existing resources. 
+   - \b network/model: specify the used network model
+   - \b cpu/model: specify the used CPU model
+   - \b workstation/model: specify the used workstation model
+
+As of writting, the accepted network models are the following. Over
+the time new models can be added, and some experimental models can be
+removed; check the values on your simulators for an uptodate
+information. Note that the CM02 model is described in the research report
+<a href="ftp://ftp.ens-lyon.fr/pub/LIP/Rapports/RR/RR2002/RR2002-40.ps.gz">A
+Network Model for Simulation of Grid Application</a> while LV08 is
+described in 
+<a href="http://mescal.imag.fr/membres/arnaud.legrand/articles/simutools09.pdf">Accuracy Study and Improvement of Network Simulation in the SimGrid Framework</a>.
+
+  - \b LV08 (default one): Realistic network analytic model
+    (slow-start modeled by multiplying latency by 10.4, bandwidth by
+    .92; bottleneck sharing uses a payload of S=8775 for evaluating RTT)
+  - \b Constant: Simplistic network model where all communication
+    take a constant time (one second). This model provides the lowest
+    realism, but is (marginally) faster.
+  - \b SMPI: Realistic network model specifically tailored for HPC
+    settings (accurate modeling of slow start with correction factors on
+    three intervals: < 1KiB, < 64 KiB, >= 64 KiB). See also \ref
+    options_model_network_coefs "this section" for more info.
+  - \b CM02: Legacy network analytic model (Very similar to LV08, but
+    without corrective factors. The timings of small messages are thus
+    poorly modeled)
+  - \b Reno: Model from Steven H. Low using lagrange_solve instead of
+    lmm_solve (experts only; check the code for more info).
+  - \b Reno2: Model from Steven H. Low using lagrange_solve instead of
+    lmm_solve (experts only; check the code for more info).
+  - \b Vegas: Model from Steven H. Low using lagrange_solve instead of
+    lmm_solve (experts only; check the code for more info).
+
+If you compiled SimGrid accordingly, you can use packet-level network
+simulators as network models (see \ref pls). In that case, you have
+two extra models, described below, and some \ref options_pls "specific
+additional configuration flags".
+  - \b GTNets: Network pseudo-model using the GTNets simulator instead
+    of an analytic model 
+  - \b NS3: Network pseudo-model using the NS3 tcp model instead of an
+    analytic model     
+
+Concerning the CPU, we have only one model for now:
+  - \b Cas01: Simplistic CPU model (time=size/power)
+  
+The workstation concept is the aggregation of a CPU with a network
+card. Three models exists, but actually, only 2 of them are
+interesting. The "compound" one is simply due to the way our internal
+code is organized, and can easily be ignored. So at the end, you have
+two workstation models: The default one allows to aggregate an
+existing CPU model with an existing network model, but does not allow
+parallel tasks because these beasts need some collaboration between
+the network and CPU model. That is why, ptask_07 is used by default
+when using SimDag.
+  - \b default: Default workstation model. Currently, CPU:Cas01 and 
+    network:LV08 (with cross traffic enabled)
+  - \b compound: Workstation model that is automatically chosen if
+    you change the network and CPU models
+  - \b ptask_L07: Workstation model somehow similar to Cas01+CM02 but
+    allowing parallel tasks
+  
+\subsection options_model_optim Optimization level of the platform models
+
+The network and CPU models that are based on lmm_solve (that
+is, all our analytical models) accept specific optimization
+configurations.
+  - items \b network/optim and \b CPU/optim (both default to 'Lazy'):
+    - \b Lazy: Lazy action management (partial invalidation in lmm +
+      heap in action remaining).
+    - \b TI: Trace integration. Highly optimized mode when using
+      availability traces (only available for the Cas01 CPU model for
+      now). 
+    - \b Full: Full update of remaining and variables. Slow but may be
+      useful when debugging.
+  - items \b network/maxmin_selective_update and
+    \b cpu/maxmin_selective_update: configure whether the underlying
+    should be lazily updated or not. It should have no impact on the
+    computed timings, but should speed up the computation. 
+    
+It is still possible to disable the \c maxmin_selective_update feature
+because it can reveal counter-productive in very specific scenarios
+where the interaction level is high. In particular, if all your
+communication share a given backbone link, you should disable it:
+without \c maxmin_selective_update, every communications are updated
+at each step through a simple loop over them. With that feature
+enabled, every communications will still get updated in this case
+(because of the dependency induced by the backbone), but through a
+complicated pattern aiming at following the actual dependencies.
+
+\subsection options_model_precision Numerical precision of the platform models
+
+The analytical models handle a lot of floating point values. It is
+possible to change the epsilon used to update and compare them through
+the \b maxmin/precision item (default value: 1e-9). Changing it
+may speedup the simulation by discarding very small actions, at the
+price of a reduced numerical precision.
+
+\subsection options_model_nthreads Parallel threads for model updates
+
+By default, Surf computes the analytical models sequentially to share their
+resources and update their actions. It is possible to run them in parallel,
+using the \b surf/nthreads item (default value: 1). If you use a
+negative value, the amount of available cores is automatically
+detected  and used instead.
+
+Depending on the workload of the models and their complexity, you may get a
+speedup or a slowdown because of the synchronization costs of threads.
+
+\subsection options_model_network Configuring the Network model
+
+\subsubsection options_model_network_gamma Maximal TCP window size
+
+The analytical models need to know the maximal TCP window size to take
+the TCP congestion mechanism into account. This is set to 20000 by
+default, but can be changed using the \b network/TCP_gamma item.
+
+On linux, this value can be retrieved using the following
+commands. Both give a set of values, and you should use the last one,
+which is the maximal size.\verbatim
+cat /proc/sys/net/ipv4/tcp_rmem # gives the sender window
+cat /proc/sys/net/ipv4/tcp_wmem # gives the receiver window
  \endverbatim
  
  \endverbatim
  
-\section options_modelchecking Model-Checking
-\subsection options_modelchecking_howto How to use it
+\subsubsection options_model_network_coefs Corrective simulation factors 
+
+These factors allow to betterly take the slow start into account.
+The corresponding values were computed through data fitting one the
+timings of packet-level simulators. You should not change these values
+unless you are really certain of what you are doing. See 
+<a href="http://mescal.imag.fr/membres/arnaud.legrand/articles/simutools09.pdf">Accuracy Study and Improvement of Network Simulation in the SimGrid Framework</a>
+for more informations about these coeficients.
+
+If you are using the SMPI model, these correction coeficients are
+themselves corrected by constant values depending on the size of the
+exchange. Again, only hardcore experts should bother about this fact.
+
+\subsubsection options_model_network_crosstraffic Simulating cross-traffic
+
+As of SimGrid v3.7, cross-traffic effects can be taken into account in
+analytical simulations. It means that ongoing and incoming
+communication flows are treated independently. In addition, the LV08
+model adds 0.05 of usage on the opposite direction for each new
+created flow. This can be useful to simulate some important TCP
+phenomena such as ack compression.
+
+For that to work, your platform must have two links for each
+pair of interconnected hosts. An example of usable platform is
+available in <tt>examples/msg/gtnets/crosstraffic-p.xml</tt>.
+
+This is activated through the \b network/crosstraffic item, that
+can be set to 0 (disable this feature) or 1 (enable it). 
+
+Note that with the default workstation model this option is activated by default.
+
+\subsubsection options_model_network_coord Coordinated-based network models
+
+When you want to use network coordinates, as it happens when you use
+an \<AS\> in your platform file with \c Vivaldi as a routing, you must
+set the \b network/coordinates to \c yes so that all mandatory
+initialization are done in the simulator.
+
+\subsubsection options_model_network_sendergap Simulating sender gap
+
+(this configuration item is experimental and may change or disapear)
+
+It is possible to specify a timing gap between consecutive emission on
+the same network card through the \b network/sender_gap item. This
+is still under investigation as of writting, and the default value is
+to wait 0 seconds between emissions (no gap applied).
+
+\subsubsection options_pls Configuring packet-level pseudo-models
+
+When using the packet-level pseudo-models, several specific
+configuration flags are provided to configure the associated tools.
+There is by far not enough such SimGrid flags to cover every aspects
+of the associated tools, since we only added the items that we
+needed ourselves. Feel free to request more items (or even better:
+provide patches adding more items).
+
+When using NS3, the only existing item is \b ns3/TcpModel,
+corresponding to the ns3::TcpL4Protocol::SocketType configuration item
+in NS3. The only valid values (enforced on the SimGrid side) are
+'NewReno' or 'Reno' or 'Tahoe'.
+
+When using GTNeTS, two items exist: 
+ - \b gtnets/jitter, that is a double value to oscillate
+   the link latency, uniformly in random interval
+   [-latency*gtnets_jitter,latency*gtnets_jitter). It defaults to 0.
+ - \b gtnets/jitter_seed, the positive seed used to reproduce jitted
+   results. Its value must be in [1,1e8] and defaults to 10.
+
+\section options_modelchecking Configuring the Model-Checking
+
  To enable the experimental SimGrid model-checking support the program should
  be executed with the command line argument 
  \verbatim
  To enable the experimental SimGrid model-checking support the program should
  be executed with the command line argument 
  \verbatim
@@ -77,4 +269,225 @@ Properties are expressed as assertions using the function
  void MC_assert(int prop);
  \endverbatim
  
  void MC_assert(int prop);
  \endverbatim
  
+\section options_virt Configuring the User Process Virtualization
+
+\subsection options_virt_factory Selecting the virtualization factory
+
+In SimGrid, the user code is virtualized in a specific mecanism
+allowing the simulation kernel to control its execution: when a user
+process requires a blocking action (such as sending a message), it is
+interrupted, and only gets released when the simulated clock reaches
+the point where the blocking operation is done.
+
+In SimGrid, the containers in which user processes are virtualized are
+called contexts. Several context factory are provided, and you can
+select the one you want to use with the \b contexts/factory
+configuration item. Some of the following may not exist on your
+machine because of portability issues. In any case, the default one
+should be the most effcient one (please report bugs if the
+auto-detection fails for you). They are sorted here from the slowest
+to the most effient:
+ - \b thread: very slow factory using full featured threads (either
+   ptheads or windows native threads) 
+ - \b ucontext: fast factory using System V contexts (or a portability
+   layer of our own on top of Windows fibers)
+ - \b raw: amazingly fast factory using a context switching mecanism
+   of our own, directly implemented in assembly (only available for x86 
+   and amd64 platforms for now)
+
+The only reason to change this setting is when the debuging tools get
+fooled by the optimized context factories. Threads are the most
+debugging-friendly contextes.
+
+\subsection options_virt_stacksize Adapting the used stack size
+
+(this only works if you use ucontexts or raw context factories)
+
+Each virtualized used process is executed using a specific system
+stack. The size of this stack has a huge impact on the simulation
+scalability, but its default value is rather large. This is because
+the error messages that you get when the stack size is too small are
+rather disturbing: this leads to stack overflow (overwriting other
+stacks), leading to segfaults with corrupted stack traces.
+
+If you want to push the scalability limits of your code, you really
+want to reduce the \b contexts/stack_size item. Its default value
+is 128 (in Kib), while our Chord simulation works with stacks as small
+as 16 Kib, for example.
+
+\subsection options_virt_parallel Running user code in parallel
+
+Parallel execution of the user code is only considered stable in
+SimGrid v3.7 and higher. It is described in 
+<a href="http://hal.inria.fr/inria-00602216/">INRIA RR-7653</a>.
+
+If you are using the \c ucontext or \c raw context factories, you can
+request to execute the user code in parallel. Several threads are
+launched, each of them handling as much user contexts at each run. To
+actiave this, set the \b contexts/nthreads item to the amount of
+cores that you have in your computer (or -1 to have the amount of cores
+auto-detected).
+
+Even if you asked several worker threads using the previous option,
+you can request to start the parallel execution (and pay the
+associated synchronization costs) only if the potential parallelism is
+large enough. For that, set the \b contexts/parallel_threshold
+item to the minimal amount of user contexts needed to start the
+parallel execution. In any given simulation round, if that amount is
+not reached, the contexts will be run sequentially directly by the
+main thread (thus saving the synchronization costs). Note that this
+option is mainly useful when the grain of the user code is very fine,
+because our synchronization is now very efficient.
+
+When parallel execution is activated, you can choose the
+synchronization schema used with the \b contexts/synchro item,
+which value is either:
+ - \b futex: ultra optimized synchronisation schema, based on futexes
+   (fast user-mode mutexes), and thus only available on Linux systems.
+   This is the default mode when available.
+ - \b posix: slow but portable synchronisation using only POSIX
+   primitives.
+ - \b busy_wait: not really a synchronisation: the worker threads
+   constantly request new contexts to execute. It should be the most
+   efficient synchronisation schema, but it loads all the cores of your 
+   machine for no good reason. You probably prefer the other less
+   eager schemas.
+
+\section options_tracing Configuring the tracing subsystem
+
+The \ref tracing "tracing subsystem" can be configured in several
+different ways depending on the nature of the simulator (MSG, SimDag,
+SMPI) and the kind of traces that need to be obtained. See the \ref
+tracing_tracing_options "Tracing Configuration Options subsection" to
+get a detailed description of each configuration option.
+
+We detail here a simple way to get the traces working for you, even if
+you never used the tracing API.
+
+
+- Any SimGrid-based simulator (MSG, SimDag, SMPI, ...) and raw traces:
+\verbatim
+--cfg=tracing:1 --cfg=tracing/uncategorized:1 --cfg=triva/uncategorized:uncat.plist
+\endverbatim
+    The first parameter activates the tracing subsystem, the second
+    tells it to trace host and link utilization (without any
+    categorization) and the third creates a graph configuration file
+    to configure Triva when analysing the resulting trace file.
+
+- MSG or SimDag-based simulator and categorized traces (you need to declare categories and classify your tasks according to them)
+\verbatim
+--cfg=tracing:1 --cfg=tracing/categorized:1 --cfg=triva/categorized:cat.plist
+\endverbatim
+    The first parameter activates the tracing subsystem, the second
+    tells it to trace host and link categorized utilization and the
+    third creates a graph configuration file to configure Triva when
+    analysing the resulting trace file.
+
+- SMPI simulator and traces for a space/time view:
+\verbatim
+smpirun -trace ...
+\endverbatim
+    The <i>-trace</i> parameter for the smpirun script runs the
+simulation with --cfg=tracing:1 and --cfg=tracing/smpi:1. Check the
+smpirun's <i>-help</i> parameter for additional tracing options.
+
+\section options_smpi Configuring SMPI
+
+The SMPI interface provides several specific configuration items.
+These are uneasy to see since the code is usually launched through the
+\c smiprun script directly.
+
+\subsection options_smpi_bench Automatic benchmarking of SMPI code
+
+In SMPI, the sequential code is automatically benchmarked, and these
+computations are automatically reported to the simulator. That is to
+say that if you have a large computation between a \c MPI_Recv() and a
+\c MPI_Send(), SMPI will automatically benchmark the duration of this
+code, and create an execution task within the simulator to take this
+into account. For that, the actual duration is measured on the host
+machine and then scaled to the power of the corresponding simulated
+machine. The variable \b smpi/running_power allows to specify the
+computational power of the host machine (in flop/s) to use when
+scaling the execution times. It defaults to 20000, but you really want
+to update it to get accurate simulation results.
+
+When the code is constituted of numerous consecutive MPI calls, the
+previous mechanism feeds the simulation kernel with numerous tiny
+computations. The \b smpi/cpu_threshold item becomes handy when this
+impacts badly the simulation performance. It specify a threshold (in
+second) under which the execution chunks are not reported to the
+simulation kernel (default value: 1e-6). Please note that in some
+circonstances, this optimization can hinder the simulation accuracy. 
+
+\subsection options_smpi_timing Reporting simulation time
+
+Most of the time, you run MPI code through SMPI to compute the time it
+would take to run it on a platform that you don't have. But since the
+code is run through the \c smpirun script, you don't have any control
+on the launcher code, making difficult to report the simulated time
+when the simulation ends. If you set the \b smpi/display_timing item
+to 1, \c smpirun will display this information when the simulation ends. \verbatim
+Simulation time: 1e3 seconds.
+\endverbatim
+
+\section options_generic Configuring other aspects of SimGrid
+
+\subsection options_generic_path XML file inclusion path
+
+It is possible to specify a list of directories to search into for the
+\<include\> tag in XML files by using the \b path configuration
+item. To add several directory to the path, set the configuration
+item several times, as in \verbatim
+--cfg=path:toto --cfg=path:tutu
+\endverbatim
+
+\subsection options_generic_exit Behavior on Ctrl-C
+
+By default, when Ctrl-C is pressed, the status of all existing
+simulated processes is displayed. This is very useful to debug your
+code, but it can reveal troublesome in some cases (such as when the 
+amount of processes becomes really big). This behavior is disabled
+when \b verbose-exit is set to 0 (it is to 1 by default).
+
+\section options_index Index of all existing configuration items
+
+- \c contexts/factory: \ref options_virt_factory
+- \c contexts/nthreads: \ref options_virt_parallel
+- \c contexts/parallel_threshold: \ref options_virt_parallel
+- \c contexts/stack_size: \ref options_virt_stacksize
+- \c contexts/synchro: \ref options_virt_parallel
+
+- \c cpu/maxmin_selective_update: \ref options_model_optim
+- \c cpu/model: \ref options_model_select
+- \c cpu/optim: \ref options_model_optim
+
+- \c gtnets/jitter: \ref options_pls
+- \c gtnets/jitter_seed: \ref options_pls
+
+- \c maxmin/precision: \ref options_model_precision
+
+- \c network/bandwidth_factor: \ref options_model_network_coefs
+- \c network/coordinates: \ref options_model_network_coord
+- \c network/crosstraffic: \ref options_model_network_crosstraffic 
+- \c network/latency_factor: \ref options_model_network_coefs
+- \c network/maxmin_selective_update: \ref options_model_optim
+- \c network/model: \ref options_model_select
+- \c network/optim: \ref options_model_optim
+- \c network/sender_gap: \ref options_model_network_sendergap
+- \c network/TCP_gamma: \ref options_model_network_gamma
+- \c network/weight_S: \ref options_model_network_coefs
+
+- \c ns3/TcpModel: \ref options_pls
+
+- \c surf/nthreads: \ref options_model_nthreads
+
+- \c smpi/running_power: \ref options_smpi_bench
+- \c smpi/display_timing: \ref options_smpi_timing
+- \c smpi/cpu_threshold: \ref options_smpi_bench
+
+- \c path: \ref options_generic_path
+- \c verbose-exit: \ref options_generic_exit
+
+- \c workstation/model: \ref options_model_select
+
  */
 \ No newline at end of file
  */
 \ No newline at end of file