From 5937b88aaa18de687b2de15a97ee3ae0dc480f64 Mon Sep 17 00:00:00 2001 From: Martin Quinson Date: Mon, 24 Sep 2018 01:34:11 +0200 Subject: [PATCH] convert options to sphinx Gosh that was a painful week-end. --- doc/doxygen/options.doc | 1342 -------------------------- doc/doxygen/platform.doc | 2 +- docs/source/conf.py | 3 - docs/source/intro_install.rst | 8 +- docs/source/scenar_config.rst | 1532 ++++++++++++++++++++++++++++++ docs/source/scenario.rst | 5 +- tools/cmake/DefinePackages.cmake | 1 - 7 files changed, 1542 insertions(+), 1351 deletions(-) delete mode 100644 doc/doxygen/options.doc diff --git a/doc/doxygen/options.doc b/doc/doxygen/options.doc deleted file mode 100644 index 34bec891a3..0000000000 --- a/doc/doxygen/options.doc +++ /dev/null @@ -1,1342 +0,0 @@ -/*! @page options Configure SimGrid - -@htmlonly -
-@endhtmlonly -@htmlinclude graphical-toc.svg -@htmlonly -
- -@endhtmlonly - -A number of options can be given at runtime to change the default -SimGrid behavior. For a complete list of all configuration options -accepted by the SimGrid version used in your simulator, simply pass -the --help configuration flag to your program. If some of the options -are not documented on this page, this is a bug that you should please -report so that we can fix it. Note that some of the options presented -here may not be available in your simulators, depending on the -@ref install_src_config "compile-time options" that you used. - -@tableofcontents - -@section options_using Passing configuration options to the simulators - -There are several ways to pass configuration options to the simulators. -The most common way is to use the @c --cfg command line argument. For -example, to set the item @c Item to the value @c Value, simply -type the following: @verbatim -my_simulator --cfg=Item:Value (other arguments) -@endverbatim - -Several @c `--cfg` command line arguments can naturally be used. If you -need to include spaces in the argument, don't forget to quote the -argument. You can even escape the included quotes (write @' for ' if -you have your argument between '). - -Another solution is to use the @c <config> tag in the platform file. The -only restriction is that this tag must occur before the first -platform element (be it @c <zone>, @c <cluster>, @c <peer> or whatever). -The @c <config> tag takes an @c id attribute, but it is currently -ignored so you don't really need to pass it. The important part is that -within that tag, you can pass one or several @c <prop> tags to specify -the configuration to use. For example, setting @c Item to @c Value -can be done by adding the following to the beginning of your platform -file: -@verbatim -<config> -  <prop id="Item" value="Value"/> -</config> -@endverbatim - -A last solution is to pass your configuration directly using the C -interface. If you happen to use the MSG interface, this is very easy -with the simgrid::s4u::Engine::setConfig() or MSG_config() functions. 
If you do not use MSG, that's a bit -more complex, as you have to mess with the internal configuration set -directly as follows. Check the @ref XBT_config "relevant page" for -details on all the functions you can use in this context, @c -_sg_cfg_set being the only configuration set currently used in -SimGrid. - -@code -#include <simgrid/simdag.h> -#include <xbt/config.h> - -int main(int argc, char *argv[]) { - SD_init(&argc, argv); - - /* Prefer MSG_config() if you use MSG!! */ - xbt_cfg_set_parse("Item:Value"); - - // Rest of your code -} -@endcode - -@section options_index Index of all existing configuration options - -@note - The full list can be retrieved by passing "--help" and - "--help-cfg" to an executable that uses SimGrid. - -- @c clean-atexit: @ref options_generic_clean_atexit - -- @c contexts/factory: @ref options_virt_factory -- @c contexts/guard-size: @ref options_virt_guard_size -- @c contexts/nthreads: @ref options_virt_parallel -- @c contexts/parallel-threshold: @ref options_virt_parallel -- @c contexts/stack-size: @ref options_virt_stacksize -- @c contexts/synchro: @ref options_virt_parallel - -- @c cpu/maxmin-selective-update: @ref options_model_optim -- @c cpu/model: @ref options_model_select -- @c cpu/optim: @ref options_model_optim - -- @c exception/cutpath: @ref options_exception_cutpath - -- @c host/model: @ref options_model_select - -- @c maxmin/precision: @ref options_model_precision -- @c maxmin/concurrency-limit: @ref options_concurrency_limit - -- @c msg/debug-multiple-use: @ref options_msg_debug_multiple_use - -- @c model-check: @ref options_modelchecking -- @c model-check/checkpoint: @ref options_modelchecking_steps -- @c model-check/communications-determinism: @ref options_modelchecking_comm_determinism -- @c model-check/dot-output: @ref options_modelchecking_dot_output -- @c model-check/hash: @ref options_modelchecking_hash -- @c model-check/property: @ref options_modelchecking_liveness -- @c model-check/max-depth: @ref options_modelchecking_max_depth -- @c 
model-check/record: @ref options_modelchecking_recordreplay -- @c model-check/reduction: @ref options_modelchecking_reduction -- @c model-check/replay: @ref options_modelchecking_recordreplay -- @c model-check/send-determinism: @ref options_modelchecking_comm_determinism -- @c model-check/sparse-checkpoint: @ref options_modelchecking_sparse_checkpoint -- @c model-check/termination: @ref options_modelchecking_termination -- @c model-check/timeout: @ref options_modelchecking_timeout -- @c model-check/visited: @ref options_modelchecking_visited - -- @c network/bandwidth-factor: @ref options_model_network_coefs -- @c network/crosstraffic: @ref options_model_network_crosstraffic -- @c network/latency-factor: @ref options_model_network_coefs -- @c network/maxmin-selective-update: @ref options_model_optim -- @c network/model: @ref options_model_select -- @c network/optim: @ref options_model_optim -- @c network/TCP-gamma: @ref options_model_network_gamma -- @c network/weight-S: @ref options_model_network_coefs - -- @c ns3/TcpModel: @ref options_pls -- @c path: @ref options_generic_path -- @c plugin: @ref options_generic_plugin - -- @c simix/breakpoint: @ref options_generic_breakpoint - -- @c storage/max_file_descriptors: @ref option_model_storage_maxfd - -- @c surf/precision: @ref options_model_precision - -- For collective operations of SMPI, please refer to Section @ref options_index_smpi_coll -- @c smpi/async-small-thresh: @ref options_model_network_asyncsend -- @c smpi/bw-factor: @ref options_model_smpi_bw_factor -- @c smpi/coll-selector: @ref options_model_smpi_collectives -- @c smpi/comp-adjustment-file: @ref options_model_smpi_adj_file -- @c smpi/cpu-threshold: @ref options_smpi_bench -- @c smpi/display-timing: @ref options_smpi_timing -- @c smpi/grow-injected-times: @ref options_model_smpi_test -- @c smpi/host-speed: @ref options_smpi_bench -- @c smpi/IB-penalty-factors: @ref options_model_network_coefs -- @c smpi/iprobe: @ref options_model_smpi_iprobe -- @c 
smpi/iprobe-cpu-usage: @ref options_model_smpi_iprobe_cpu_usage -- @c smpi/init: @ref options_model_smpi_init -- @c smpi/keep-temps: @ref options_smpi_temps -- @c smpi/lat-factor: @ref options_model_smpi_lat_factor -- @c smpi/ois: @ref options_model_smpi_ois -- @c smpi/or: @ref options_model_smpi_or -- @c smpi/os: @ref options_model_smpi_os -- @c smpi/papi-events: @ref options_smpi_papi_events -- @c smpi/privatization: @ref options_smpi_privatization -- @c smpi/privatize-libs: @ref options_smpi_privatize_libs -- @c smpi/send-is-detached-thresh: @ref options_model_smpi_detached -- @c smpi/shared-malloc: @ref options_model_smpi_shared_malloc -- @c smpi/shared-malloc-hugepage: @ref options_model_smpi_shared_malloc -- @c smpi/simulate-computation: @ref options_smpi_bench -- @c smpi/test: @ref options_model_smpi_test -- @c smpi/wtime: @ref options_model_smpi_wtime - -- Tracing configuration options can be found in Section @ref tracing_tracing_options. - -- @c storage/model: @ref options_storage_model -- @c verbose-exit: @ref options_generic_exit - -- @c vm/model: @ref options_vm_model - -@subsection options_index_smpi_coll Index of SMPI collective algorithms options - -TODO: All available collective algorithms will be made available via the ``smpirun --help-coll`` command. - -@section options_model Configuring the platform models - -@anchor options_storage_model -@anchor options_vm_model -@subsection options_model_select Selecting the platform models - -SimGrid comes with several network, CPU and storage models built in, and you -can change the used model at runtime by changing the passed -configuration. The three main configuration items are given below. -For each of these items, passing the special @c help value gives -you a short description of all possible values. Also, @c --help-models -should provide information about all models for all existing resources. 
- - @b network/model: specify the used network model - - @b cpu/model: specify the used CPU model - - @b host/model: specify the used host model - - @b storage/model: specify the used storage model (there is currently only one such model - this option is hence only useful for future releases) - - @b vm/model: specify the model for virtual machines (there is currently only one such model - this option is hence only useful for future releases) - -As of this writing, the following network models are accepted. Over -time, new models can be added, and some experimental models can be -removed; check the values on your simulators for up-to-date -information. Note that the CM02 model is described in the research report -A -Network Model for Simulation of Grid Application while LV08 is -described in -Accuracy Study and Improvement of Network Simulation in the SimGrid Framework. - - - @b LV08 (default one): Realistic network analytic model - (slow-start modeled by multiplying latency by 13.01, bandwidth by - .97; bottleneck sharing uses a payload of S=20537 for evaluating RTT) - - @anchor options_model_select_network_constant @b Constant: Simplistic network model where all communications - take a constant time (one second). This model provides the lowest - realism, but is (marginally) faster. - - @b SMPI: Realistic network model specifically tailored for HPC - settings (accurate modeling of slow start with correction factors on - three intervals: < 1KiB, < 64 KiB, >= 64 KiB). See also @ref - options_model_network_coefs "this section" for more info. - - @b IB: Realistic network model specifically tailored for HPC - settings with InfiniBand networks (accurate modeling of contention - behavior, based on the model explained in - http://mescal.imag.fr/membres/jean-marc.vincent/index.html/PhD/Vienne.pdf). - See also @ref options_model_network_coefs "this section" for more info. - - @b CM02: Legacy network analytic model (Very similar to LV08, but - without corrective factors. 
The timings of small messages are thus - poorly modeled) - - @b Reno: Model from Steven H. Low using lagrange_solve instead of - lmm_solve (experts only; check the code for more info). - - @b Reno2: Model from Steven H. Low using lagrange_solve instead of - lmm_solve (experts only; check the code for more info). - - @b Vegas: Model from Steven H. Low using lagrange_solve instead of - lmm_solve (experts only; check the code for more info). - -If you compiled SimGrid accordingly, you can use packet-level network -simulators as network models (see @ref pls_ns3). In that case, you have -an extra model, described below, and some -@ref options_pls "specific additional configuration flags". - - @b NS3: Network pseudo-model using the NS3 tcp model - -Concerning the CPU, we have only one model for now: - - @b Cas01: Simplistic CPU model (time=size/power) - -The host concept is the aggregation of a CPU with a network -card. Three models exist, but actually, only two of them are -interesting. The "compound" one is simply due to the way our internal -code is organized, and can easily be ignored. So at the end, you have -two host models: The default one allows you to aggregate an -existing CPU model with an existing network model, but does not allow -parallel tasks because these beasts need some collaboration between -the network and CPU model. That is why ptask_L07 is used by default -when using SimDag. - - @b default: Default host model. Currently, CPU:Cas01 and - network:LV08 (with cross traffic enabled) - - @b compound: Host model that is automatically chosen if - you change the network and CPU models - - @b ptask_L07: Host model somehow similar to Cas01+CM02 but - allowing "parallel tasks", that are intended to model the moldable - tasks of the grid scheduling literature. - -@subsection options_generic_plugin Plugins - -SimGrid plugins allow you to extend the framework without changing its -source code directly. 
Read the source code of the existing plugins to -learn how to do so (in ``src/plugins``), and ask your questions to the -usual channels (Stack Overflow, Mailing list, IRC). The basic idea is -that plugins usually register callbacks to some signals of interest. -If they need to store some information about a given object (Link, CPU -or Actor), they do so through the use of a dedicated object extension. - -Some of the existing plugins can be activated from the command line, -without any modification to your simulation code. For example, you can activate -the host energy plugin by adding the following to your command line: - -@verbatim - --cfg=plugin:host_energy -@endverbatim - -Here is the full list of plugins that can be activated this way: - - - @b host_energy: keeps track of the energy dissipated by - computations. More details in @ref plugin_energy. - - @b link_energy: keeps track of the energy dissipated by - communications. More details in @ref SURF_plugin_energy. - - @b host_load: keeps track of the computational load. - More details in @ref plugin_load. - -@subsection options_model_optim Optimization level of the platform models - -The network and CPU models that are based on lmm_solve (that -is, all our analytical models) accept specific optimization -configurations. - - items @b network/optim and @b cpu/optim (both default to 'Lazy'): - - @b Lazy: Lazy action management (partial invalidation in lmm + - heap in action remaining). - - @b TI: Trace integration. Highly optimized mode when using - availability traces (only available for the Cas01 CPU model for - now). - - @b Full: Full update of remaining and variables. Slow but may be - useful when debugging. - - items @b network/maxmin-selective-update and - @b cpu/maxmin-selective-update: configure whether the underlying - model should be lazily updated or not. It should have no impact on the - computed timings, but should speed up the computation. 
- -It is still possible to disable the @c maxmin-selective-update feature -because it can prove counter-productive in very specific scenarios -where the interaction level is high. In particular, if all your -communications share a given backbone link, you should disable it: -without @c maxmin-selective-update, every communication is updated -at each step through a simple loop over them. With that feature -enabled, every communication will still get updated in this case -(because of the dependency induced by the backbone), but through a -complicated pattern aiming at following the actual dependencies. - -@subsection options_model_precision Numerical precision of the platform models - -The analytical models handle a lot of floating point values. It is -possible to change the epsilon used to update and compare them through -the @b maxmin/precision item (default value: 0.00001). Changing it -may speed up the simulation by discarding very small actions, at the -price of a reduced numerical precision. - -@subsection options_concurrency_limit Concurrency limit - -The maximum number of variables per resource can be tuned through -the @b maxmin/concurrency-limit item. The default value is -1, meaning that -there is no such limitation. You can have as many simultaneous actions per -resource as you want. If your simulation presents a very high level of -concurrency, it may help to use e.g. 100 as a value here. It means that at -most 100 actions can consume a resource at a given time. The extraneous actions -are queued and wait until the amount of concurrency of the considered resource -drops below the given boundary. - -Such limitations help both the simulation speed and the simulation accuracy -on highly constrained scenarios, but the simulation speed suffers from this -setting on regular (less constrained) scenarios, so it is off by default. 
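-
-For example, limiting each resource to 100 concurrent actions (an
-illustrative value, to be tuned to your scenario) reads:
-@verbatim
-my_simulator --cfg=maxmin/concurrency-limit:100 (other arguments)
-@endverbatim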
- -@subsection options_model_network Configuring the Network model - -@subsubsection options_model_network_gamma Maximal TCP window size - -The analytical models need to know the maximal TCP window size to take -the TCP congestion mechanism into account. This is set to 4194304 by -default, but can be changed using the @b network/TCP-gamma item. - -On Linux, this value can be retrieved using the following -commands. Both give a set of values, and you should use the last one, -which is the maximal size.@verbatim -cat /proc/sys/net/ipv4/tcp_rmem # gives the receiver window -cat /proc/sys/net/ipv4/tcp_wmem # gives the sender window -@endverbatim - -@subsubsection options_model_network_coefs Correcting important network parameters - -SimGrid can take into account network irregularities such as a slow -startup or a behavior that changes with the message size. -You should not change these values unless you really know what you're doing. - -The corresponding values were computed through data fitting on the -timings of packet-level simulators. - -See -Accuracy Study and Improvement of Network Simulation in the SimGrid Framework -for more information about these parameters. - -If you are using the SMPI model, these correction coefficients are -themselves corrected by constant values depending on the size of the -exchange. Again, only hardcore experts should bother about this fact. - -InfiniBand network behavior can be modeled through three parameters, as explained in -this PhD thesis. -These factors can be changed through the following option: - -@verbatim -smpi/IB-penalty-factors:"βe;βs;γs" -@endverbatim - -By default SMPI uses factors computed on the Stampede Supercomputer at TACC, with optimal -deployment of processes on nodes. - -@subsubsection options_model_network_crosstraffic Simulating cross-traffic - -As of SimGrid v3.7, cross-traffic effects can be taken into account in -analytical simulations. 
It means that outgoing and incoming -communication flows are treated independently. In addition, the LV08 -model adds 0.05 of usage on the opposite direction for each newly -created flow. This can be useful to simulate some important TCP -phenomena such as ack compression. - -For that to work, your platform must have two links for each -pair of interconnected hosts. An example of usable platform is -available in examples/platforms/crosstraffic.xml. - -This is activated through the @b network/crosstraffic item, that -can be set to 0 (disable this feature) or 1 (enable it). - -Note that with the default host model this option is activated by default. - -@subsubsection options_model_network_asyncsend Simulating asynchronous send - -(this configuration item is experimental and may change or disappear) - -It is possible to specify that messages below a certain size will be sent -as soon as the call to MPI_Send is issued, without waiting for the -corresponding receive. This threshold can be configured through the -@b smpi/async-small-thresh item. The default value is 0. This behavior can also be -manually set for MSG mailboxes, by setting the receiving mode of the mailbox -with a call to @ref MSG_mailbox_set_async . For MSG, all messages sent to this -mailbox will have this behavior, so consider using two mailboxes if needed. - -This value needs to be smaller than or equal to the threshold set at -@ref options_model_smpi_detached , because asynchronous messages are -meant to be detached as well. - -@subsubsection options_pls Configuring packet-level pseudo-models - -When using the packet-level pseudo-models, several specific -configuration flags are provided to configure the associated tools. -There are by far not enough such SimGrid flags to cover every aspect -of the associated tools, since we only added the items that we -needed ourselves. Feel free to request more items (or even better: -provide patches adding more items). 
- -When using NS3, the only existing item is @b ns3/TcpModel, -corresponding to the ns3::TcpL4Protocol::SocketType configuration item -in NS3. The only valid values (enforced on the SimGrid side) are -'NewReno' or 'Reno' or 'Tahoe'. - -@subsection options_model_storage Configuring the Storage model - -@subsubsection option_model_storage_maxfd Maximum amount of file descriptors per host - -Each host maintains a fixed-size array of its file descriptors. You -can change its size (1024 by default) through the @b -storage/max_file_descriptors item to either enlarge it if your -application requires it or to reduce it to save memory space. - -@section options_modelchecking Configuring the Model-Checking - -To enable the SimGrid model-checking support, the program should -be executed using the simgrid-mc wrapper: -@verbatim -simgrid-mc ./my_program -@endverbatim - -Safety properties are expressed as assertions using the function -@verbatim -void MC_assert(int prop); -@endverbatim - -@subsection options_modelchecking_liveness Specifying a liveness property - -If you want to specify liveness properties (beware, that's -experimental), you have to pass them on the command line, specifying -the name of the file containing the property, as formatted by the -ltl2ba program. - -@verbatim ---cfg=model-check/property:<filename> -@endverbatim - -@subsection options_modelchecking_steps Going for stateful verification - -By default, the system is backtracked to its initial state to explore -another path instead of backtracking to the exact step before the fork -that we want to explore (this is called stateless verification). This -is done this way because saving intermediate states can rapidly -exhaust the available memory. If you want, you can change the value of -the model-check/checkpoint variable. For example, the -following configuration will ask to take a checkpoint every step. -Beware, this will certainly explode your memory. 
Larger values are -probably better; make sure to experiment a bit to find the right -setting for your specific system. - -@verbatim ---cfg=model-check/checkpoint:1 -@endverbatim - -@subsection options_modelchecking_reduction Specifying the kind of reduction - -The main issue when using the model-checking is the state space -explosion. To counter that problem, several exploration reduction -techniques can be used. There is unfortunately no silver bullet here, -and the most efficient reduction techniques cannot be applied to all -properties. In particular, the DPOR method cannot be applied on -liveness properties since it may break some cycles in the exploration -that are important to the property validity. - -@verbatim ---cfg=model-check/reduction:<technique> -@endverbatim - -For now, this configuration variable can take two values: - * none: Do not apply any kind of reduction (mandatory for now for - liveness properties) - * dpor: Apply Dynamic Partial Ordering Reduction. Only valid if you - verify local safety properties (default value for safety checks). - -@subsection options_modelchecking_visited model-check/visited, Cycle detection - -In order to detect cycles, the model-checker needs to check if a newly explored -state is in fact the same state as a previous one. For that, -the model-checker can take a snapshot of each visited state: this snapshot is -then used to compare it with subsequent states in the exploration graph. - -The @b model-check/visited option is the maximum number of states which are stored in -memory. If the maximum number of snapshotted states is reached, some states will -be removed from the memory and some cycles might be missed. Small -values can lead to incorrect verifications, but large values can -exhaust your memory, so choose carefully. - -By default, no state is snapshotted and cycles cannot be detected. 
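-
-For example, allowing the model-checker to keep up to 1000 snapshotted
-states in memory (an illustrative value, to be adapted to your
-available memory) reads:
-@verbatim
-simgrid-mc ./my_program --cfg=model-check/visited:1000
-@endverbatim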
- -@subsection options_modelchecking_termination model-check/termination, Non termination detection - -The @b model-check/termination configuration item can be used to report if a -non-termination execution path has been found. This is a path with a cycle -which means that the program might never terminate. - -This only works in safety mode. - -This option is disabled by default. - -@subsection options_modelchecking_dot_output model-check/dot-output, Dot output - -If set, the @b model-check/dot-output configuration item is the name of a file -in which to write a dot file of the path leading to the found property (safety or -liveness violation) as well as the cycle for liveness properties. This dot file -can then be fed to the graphviz dot tool to generate a corresponding graphical -representation. - -@subsection options_modelchecking_max_depth model-check/max-depth, Depth limit - -The @b model-check/max-depth item sets the maximum depth of the exploration -graph of the model-checker. If this limit is reached, a logging message is -sent and the results might not be exact. - -By default, there is no depth limit. - -@subsection options_modelchecking_timeout Handling of timeout - -By default, the model-checker does not handle timeout conditions: the `wait` -operations never time out. With the @b model-check/timeout configuration item -set to @b yes, the model-checker will explore timeouts of `wait` operations. - -@subsection options_modelchecking_comm_determinism Communication determinism - -The @b model-check/communications-determinism and -@b model-check/send-determinism items can be used to select the communication -determinism mode of the model-checker which checks determinism properties of -the communications of an application. 
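-
-For example, checking the communication determinism of an application
-could read as follows (a sketch; my_program stands for your own binary):
-@verbatim
-simgrid-mc ./my_program --cfg=model-check/communications-determinism:yes
-@endverbatim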
- -@subsection options_modelchecking_sparse_checkpoint Per page checkpoints - -When the model-checker is configured to take a snapshot of each explored state -(with the @b model-check/visited item), the memory consumption can rapidly -reach GiB or TiB of memory. However, for many workloads, the memory does not -change much between different snapshots and taking a complete copy of each -snapshot is a waste of memory. - -The @b model-check/sparse-checkpoint option can be set to @b yes in order -to avoid making a complete copy of each snapshot: instead, each snapshot will be -decomposed into blocks which will be stored separately. -If multiple snapshots share the same block (or if the same block -is used in the same snapshot), the same copy of the block will be shared leading -to a reduction of the memory footprint. - -For many applications, this option considerably reduces the memory consumption. -In some cases, the model-checker might be slightly slower because of the time -taken to manage the metadata about the blocks. In other cases however, this -snapshotting strategy will be much faster by reducing the cache consumption. -When the memory consumption is high, this option might be much faster than the -basic snapshotting strategy, by avoiding to hit the swap or by reducing the -swap usage. - -This option is currently disabled by default. - -@subsection options_mc_perf Performance considerations for the model checker - -The size of the stacks can have a huge impact on the memory -consumption when using model-checking. By default, each snapshot will -save a copy of the whole stacks and not only of the part which is -really meaningful: you should expect the contribution of the memory -consumption of the snapshots to be @f$ \mbox{number of processes} -\times \mbox{stack size} \times \mbox{number of states} @f$. - -The @b model-check/sparse-checkpoint option can be used to reduce the memory -consumption by trying to share memory between the different snapshots. 
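-
-As a sketch, both levers can be combined to lower the memory footprint
-of stateful verification (the stack size of 128 KiB is only an
-illustrative value; make sure it is large enough for your application):
-@verbatim
-simgrid-mc ./my_program --cfg=contexts/stack-size:128 --cfg=model-check/sparse-checkpoint:yes
-@endverbatim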
- -When compiled against the model checker, the stacks are not -protected with guards: if the stack size is too small for your -application, the stack will silently overflow on other parts of the -memory (see @ref options_virt_guard_size). - -@subsection options_modelchecking_hash Hashing of the state (experimental) - -Usually most of the time of the model-checker is spent comparing states. This -process is complicated and consumes a lot of bandwidth and cache. -In order to speed up the state comparison, the experimental @b model-check/hash -configuration item enables the computation of a hash summarizing as much -information of the state as possible into a single value. This hash can be used -to avoid most of the comparisons: the costly comparison is then only used when -the hashes are identical. - -Currently most of the state is not included in the hash because the -implementation was found to be buggy and this option is not as useful as -it could be. For this reason, it is currently disabled by default. - -@subsection options_modelchecking_recordreplay Record/replay (experimental) - -As the model-checker keeps jumping at different places in the execution graph, -it is difficult to understand what happens when trying to debug an application -under the model-checker. Even the output of the program is difficult to -interpret. Moreover, the model-checker does not behave nicely with advanced -debugging tools such as valgrind. For those reasons, it is useful to identify -a trajectory in the execution graph with the model-checker, and to replay this -trajectory without the model-checker black-magic but with more standard tools -(such as a debugger, valgrind, etc.). For this reason, SimGrid implements an -experimental record/replay functionality in order to record a trajectory with -the model-checker and replay it without the model-checker. 
- -When the model-checker finds an interesting path in the application execution -graph (where a safety or liveness property is violated), it can generate an -identifier for this path. In order to enable this behaviour, the -@b model-check/record item must be set to @b yes. By default, this behaviour is not -enabled. - -This is an example of output: - 
-[  0.000000] (0:@) Check a safety property
-[  0.000000] (0:@) **************************
-[  0.000000] (0:@) *** PROPERTY NOT VALID ***
-[  0.000000] (0:@) **************************
-[  0.000000] (0:@) Counter-example execution trace:
-[  0.000000] (0:@) Path = 1/3;1/4
-[  0.000000] (0:@) [(1)Tremblay (app)] MC_RANDOM(3)
-[  0.000000] (0:@) [(1)Tremblay (app)] MC_RANDOM(4)
-[  0.000000] (0:@) Expanded states = 27
-[  0.000000] (0:@) Visited states = 68
-[  0.000000] (0:@) Executed transitions = 46
-
- -This path can then be replayed outside of the model-checker (and even in -a non-MC build of SimGrid) by setting the @b model-check/replay item to the given -path. The other options should be the same (but the model-checker should -be disabled). - -The format and meaning of the path may change between different releases, so -the same release of SimGrid should be used for the record phase and the replay -phase. - -@section options_virt Configuring the User Process Virtualization - -@subsection options_virt_factory Selecting the virtualization factory - -In SimGrid, the user code is virtualized in a specific mechanism -that allows the simulation kernel to control its execution: when a user -process requires a blocking action (such as sending a message), it is -interrupted, and only gets released when the simulated clock reaches -the point where the blocking operation is done. This is explained -graphically in the [relevant tutorial, available online](http://simgrid.gforge.inria.fr/tutorials/simgrid-simix-101.pdf). - -In SimGrid, the containers in which user processes are virtualized are -called contexts. Several context factories are provided, and you can -select the one you want to use with the @b contexts/factory -configuration item. Some of the following may not exist on your -machine because of portability issues. In any case, the default one -should be the most efficient one (please report bugs if the -auto-detection fails for you). They are approximately sorted here from -the slowest to the most efficient: - - - @b thread: very slow factory using full-featured threads (either - pthreads or Windows native threads). They are slow but very - standard. Some debuggers or profilers only work with this factory. 
- - @b java: Java applications are virtualized onto Java threads (that
-   are regular pthreads registered to the JVM)
- - @b ucontext: fast factory using System V contexts (Linux and FreeBSD only)
- - @b boost: This uses the [context implementation](http://www.boost.org/doc/libs/1_59_0/libs/context/doc/html/index.html)
-   of the boost library for a performance that is comparable to our
-   raw implementation.@n Install the relevant library (e.g. with the
-   libboost-contexts-dev package on Debian/Ubuntu) and recompile
-   SimGrid. Note that our implementation is not compatible with recent
-   implementations of the library, and it will be hard to fix this since
-   the library's author decided to hide an API that we were using.
- - @b raw: amazingly fast factory using a context switching mechanism
-   of our own, directly implemented in assembly (only available for x86
-   and amd64 platforms for now) and without any unneeded system call.
-
-The main reason to change this setting is when the debugging tools get
-fooled by the optimized context factories. Threads are the most
-debugging-friendly contexts, as they allow you to set breakpoints
-anywhere with gdb and visualize backtraces for all processes, in order
-to debug concurrency issues. Valgrind is also more comfortable with
-threads, but it should be usable with all factories (except the callgrind
-tool, which really doesn't like the raw and ucontext factories).
-
-@subsection options_virt_stacksize Adapting the used stack size
-
-Each virtualized user process is executed using a specific system
-stack. The size of this stack has a huge impact on the simulation
-scalability, but its default value is rather large. This is because
-the error messages that you get when the stack size is too small are
-rather disturbing: this leads to stack overflow (overwriting other
-stacks), leading to segfaults with corrupted stack traces.
-
-If you want to push the scalability limits of your code, you might
-want to reduce the @b contexts/stack-size item. Its default value
-is 8192 (in KiB), while our Chord simulation works with stacks as small
-as 16 KiB, for example. For the thread factory, the default value
-is that of the system, but you can still change it with this parameter.
-
-The operating system should only allocate memory for the pages of the
-stack which are actually used, so you might not need to use this in
-most cases. However, this setting is very important when using the
-model checker (see @ref options_mc_perf).
-
-@subsection options_virt_guard_size Disabling stack guard pages
-
-A stack guard page is usually used to prevent the stack of a given
-actor from overflowing onto another stack. But the performance impact
-may become prohibitive when the number of actors increases. The
-option @b contexts/guard-size is the number of stack guard pages used.
-By setting it to 0, no guard pages will be used: in this case, you
-should avoid using small stacks (@b stack-size) as the stack will
-silently overflow on other parts of the memory.
-
-When no stack guard page is created, stacks may then silently overflow
-on other parts of the memory if their size is too small for the
-application. This happens:
-
-- on Windows systems;
-- when the model checker is enabled;
-- and of course when guard pages are explicitly disabled (with @b contexts/guard-size:0).
-
-@subsection options_virt_parallel Running user code in parallel
-
-Parallel execution of the user code is only considered stable in
-SimGrid v3.7 and higher, and mostly for MSG simulations. SMPI
-simulations may well fail in parallel mode. It is described in
-INRIA RR-7653.
-
-If you are using the @c ucontext or @c raw context factories, you can
-request to execute the user code in parallel. Several threads are
-launched, each of them handling as many user contexts as possible at
-each run. To activate this, set the @b contexts/nthreads item to the
-number of cores that you have in your computer (or to a value lower
-than 1 to have the number of cores auto-detected).
-
-Even if you asked for several worker threads using the previous option,
-you can request to start the parallel execution (and pay the
-associated synchronization costs) only if the potential parallelism is
-large enough. For that, set the @b contexts/parallel-threshold
-item to the minimal number of user contexts needed to start the
-parallel execution. In any given simulation round, if that number is
-not reached, the contexts will be run sequentially directly by the
-main thread (thus saving the synchronization costs). Note that this
-option is mainly useful when the grain of the user code is very fine,
-because our synchronization is now very efficient.
-
-When parallel execution is activated, you can choose the
-synchronization schema used with the @b contexts/synchro item,
-whose value is one of:
- - @b futex: ultra-optimized synchronization schema, based on futexes
-   (fast user-mode mutexes), and thus only available on Linux systems.
-   This is the default mode when available.
- - @b posix: slow but portable synchronization using only POSIX
-   primitives.
- - @b busy_wait: not really a synchronization: the worker threads
-   constantly request new contexts to execute. It should be the most
-   efficient synchronization schema, but it loads all the cores of your
-   machine for no good reason. You probably prefer the other, less
-   eager schemas.
-
-@section options_tracing Configuring the tracing subsystem
-
-The @ref outcomes_vizu "tracing subsystem" can be configured in several
-different ways depending on the nature of the simulator (MSG, SimDag,
-SMPI) and the kind of traces that need to be obtained. See the @ref
-tracing_tracing_options "Tracing Configuration Options subsection" to
-get a detailed description of each configuration option.
- -We detail here a simple way to get the traces working for you, even if -you never used the tracing API. - - -- Any SimGrid-based simulator (MSG, SimDag, SMPI, ...) and raw traces: -@verbatim ---cfg=tracing:yes --cfg=tracing/uncategorized:yes --cfg=triva/uncategorized:uncat.plist -@endverbatim - The first parameter activates the tracing subsystem, the second - tells it to trace host and link utilization (without any - categorization) and the third creates a graph configuration file - to configure Triva when analysing the resulting trace file. - -- MSG or SimDag-based simulator and categorized traces (you need to declare categories and classify your tasks according to them) -@verbatim ---cfg=tracing:yes --cfg=tracing/categorized:yes --cfg=triva/categorized:cat.plist -@endverbatim - The first parameter activates the tracing subsystem, the second - tells it to trace host and link categorized utilization and the - third creates a graph configuration file to configure Triva when - analysing the resulting trace file. - -- SMPI simulator and traces for a space/time view: -@verbatim -smpirun -trace ... -@endverbatim - The -trace parameter for the smpirun script runs the -simulation with --cfg=tracing:yes and --cfg=tracing/smpi:yes. Check the -smpirun's -help parameter for additional tracing options. - -Sometimes you might want to put additional information on the trace to -correctly identify them later, or to provide data that can be used to -reproduce an experiment. You have two ways to do that: - -- Add a string on top of the trace file as comment: -@verbatim ---cfg=tracing/comment:my_simulation_identifier -@endverbatim - -- Add the contents of a textual file on top of the trace file as comment: -@verbatim ---cfg=tracing/comment-file:my_file_with_additional_information.txt -@endverbatim - -Please, use these two parameters (for comments) to make reproducible -simulations. 
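For instance, a comment can be combined with the tracing flags in a single invocation; all names and values below are made up for the example:

```shell
# Activate uncategorized tracing and stamp the trace with a comment
# identifying the experiment, for reproducibility.
my_simulator --cfg=tracing:yes --cfg=tracing/uncategorized:yes \
             --cfg=tracing/comment:experiment_42 (other arguments)
```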
For additional details about this and all tracing
-options, see @ref tracing_tracing_options.
-
-@section options_msg Configuring MSG
-
-@subsection options_msg_debug_multiple_use Debugging MSG
-
-Sometimes your application may try to send a task that is still being
-executed somewhere else, making it impossible to send this task. However,
-for debugging purposes, one may want to know what the other host is/was
-doing. This option shows a backtrace of the other process.
-
-Enable this option by adding
-
-@verbatim
---cfg=msg/debug-multiple-use:on
-@endverbatim
-
-@section options_smpi Configuring SMPI
-
-The SMPI interface provides several specific configuration items.
-These are easy to overlook since the code is usually launched through the
-@c smpirun script directly.
-
-@subsection options_smpi_bench smpi/bench: Automatic benchmarking of SMPI code
-
-In SMPI, the sequential code is automatically benchmarked, and these
-computations are automatically reported to the simulator. That is to
-say that if you have a large computation between a @c MPI_Recv() and a
-@c MPI_Send(), SMPI will automatically benchmark the duration of this
-code, and create an execution task within the simulator to take this
-into account. For that, the actual duration is measured on the host
-machine and then scaled to the power of the corresponding simulated
-machine. The variable @b smpi/host-speed allows you to specify the
-computational speed of the host machine (in flop/s) to use when
-scaling the execution times. It defaults to 20000, but you really want
-to update it to get accurate simulation results.
-
-When the code is constituted of numerous consecutive MPI calls, the
-previous mechanism feeds the simulation kernel with numerous tiny
-computations. The @b smpi/cpu-threshold item becomes handy when this
-badly impacts the simulation performance.
It specifies a threshold (in -seconds) below which the execution chunks are not reported to the -simulation kernel (default value: 1e-6). - -@note - The option smpi/cpu-threshold ignores any computation time spent - below this threshold. SMPI does not consider the @a amount of these - computations; there is no offset for this. Hence, by using a - value that is too low, you may end up with unreliable simulation - results. - -In some cases, however, one may wish to disable simulation of -application computation. This is the case when SMPI is used not to -simulate an MPI applications, but instead an MPI code that performs -"live replay" of another MPI app (e.g., ScalaTrace's replay tool, -various on-line simulators that run an app at scale). In this case the -computation of the replay/simulation logic should not be simulated by -SMPI. Instead, the replay tool or on-line simulator will issue -"computation events", which correspond to the actual MPI simulation -being replayed/simulated. At the moment, these computation events can -be simulated using SMPI by calling internal smpi_execute*() functions. - -To disable the benchmarking/simulation of computation in the simulated -application, the variable @b smpi/simulate-computation should be set to no. - -@note - This option just ignores the timings in your simulation; it still executes - the computations itself. If you want to stop SMPI from doing that, - you should check the SMPI_SAMPLE macros, documented in the section - @ref SMPI_adapting_speed. - -Solution | Computations actually executed? | Computations simulated ? 
----------------------------------- | ------------------------------- | ------------------------ ---cfg=smpi/simulate-computation:no | Yes | No, never ---cfg=smpi/cpu-threshold:42 | Yes, in all cases | Only if it lasts more than 42 seconds -SMPI_SAMPLE() macro | Only once per loop nest (see @ref SMPI_adapting_speed "documentation") | Always - -@subsection options_model_smpi_adj_file smpi/comp-adjustment-file: Slow-down or speed-up parts of your code. - -This option allows you to pass a file that contains two columns: The first column -defines the section that will be subject to a speedup; the second column is the speedup. - -For instance: - -@verbatim -"start:stop","ratio" -"exchange_1.f:30:exchange_1.f:130",1.18244559422142 -@endverbatim - -The first line is the header - you must include it. -The following line means that the code between two consecutive MPI calls on -line 30 in exchange_1.f and line 130 in exchange_1.f should receive a speedup -of 1.18244559422142. The value for the second column is therefore a speedup, if it is -larger than 1 and a slow-down if it is smaller than 1. Nothing will be changed if it is -equal to 1. - -Of course, you can set any arbitrary filenames you want (so the start and end don't have to be -in the same file), but be aware that this mechanism only supports @em consecutive calls! - -@note - Please note that you must pass the @b -trace-call-location flag to smpicc - or smpiff, respectively! This flag activates some macro definitions in our - mpi.h / mpi.f files that help with obtaining the call location. - -@subsection options_model_smpi_bw_factor smpi/bw-factor: Bandwidth factors - -The possible throughput of network links is often dependent on the -message sizes, as protocols may adapt to different message sizes. With -this option, a series of message sizes and factors are given, helping -the simulation to be more realistic. 
For instance, the current -default value is - -@verbatim -65472:0.940694;15424:0.697866;9376:0.58729;5776:1.08739;3484:0.77493;1426:0.608902;732:0.341987;257:0.338112;0:0.812084 -@endverbatim - -So, messages with size 65472 and more will get a total of MAX_BANDWIDTH*0.940694, -messages of size 15424 to 65471 will get MAX_BANDWIDTH*0.697866 and so on. -Here, MAX_BANDWIDTH denotes the bandwidth of the link. - -@note - The SimGrid-Team has developed a script to help you determine these - values. You can find more information and the download here: - 1. http://simgrid.gforge.inria.fr/contrib/smpi-calibration-doc.html - 2. http://simgrid.gforge.inria.fr/contrib/smpi-saturation-doc.html - -@subsection options_smpi_timing smpi/display-timing: Reporting simulation time - -@b Default: 0 (false) - -Most of the time, you run MPI code with SMPI to compute the time it -would take to run it on a platform. But since the -code is run through the @c smpirun script, you don't have any control -on the launcher code, making it difficult to report the simulated time -when the simulation ends. If you set the @b smpi/display-timing item -to 1, @c smpirun will display this information when the simulation ends. @verbatim -Simulation time: 1e3 seconds. -@endverbatim - -@subsection options_smpi_temps smpi/keep-temps: not cleaning up after simulation - -@b Default: 0 (false) - -Under some conditions, SMPI generates a lot of temporary files. They -usually get cleaned, but you may use this option to not erase these -files. This is for example useful when debugging or profiling -executions using the dlopen privatization schema, as missing binary -files tend to fool the debuggers. - -@subsection options_model_smpi_lat_factor smpi/lat-factor: Latency factors - -The motivation and syntax for this option is identical to the motivation/syntax -of smpi/bw-factor, see @ref options_model_smpi_bw_factor for details. 
- -There is an important difference, though: While smpi/bw-factor @a reduces the -actual bandwidth (i.e., values between 0 and 1 are valid), latency factors -increase the latency, i.e., values larger than or equal to 1 are valid here. - -This is the default value: - -@verbatim -65472:11.6436;15424:3.48845;9376:2.59299;5776:2.18796;3484:1.88101;1426:1.61075;732:1.9503;257:1.95341;0:2.01467 -@endverbatim - -@note - The SimGrid-Team has developed a script to help you determine these - values. You can find more information and the download here: - 1. http://simgrid.gforge.inria.fr/contrib/smpi-calibration-doc.html - 2. http://simgrid.gforge.inria.fr/contrib/smpi-saturation-doc.html - -@subsection options_smpi_papi_events smpi/papi-events: Trace hardware counters with PAPI - -@warning - This option is experimental and will be subject to change. - This feature currently requires superuser privileges, as registers are queried. - Only use this feature with code you trust! Call smpirun for instance via - smpirun -wrapper "sudo " - or run sudo sh -c "echo 0 > /proc/sys/kernel/perf_event_paranoid" - In the later case, sudo will not be required. - -@note - This option is only available when SimGrid was compiled with PAPI support. - -This option takes the names of PAPI counters and adds their respective values -to the trace files. (See Section @ref tracing_tracing_options.) - -It is planned to make this feature available on a per-process (or per-thread?) basis. -The first draft, however, just implements a "global" (i.e., for all processes) set -of counters, the "default" set. - -@verbatim ---cfg=smpi/papi-events:"default:PAPI_L3_LDM:PAPI_L2_LDM" -@endverbatim - -@subsection options_smpi_privatization smpi/privatization: Automatic privatization of global variables - -MPI executables are usually meant to be executed in separated -processes, but SMPI is executed in only one process. 
Global variables -from executables will be placed in the same memory zone and shared -between processes, causing intricate bugs. Several options are -possible to avoid this, as described in the main -SMPI publication and in -the @ref SMPI_what_globals "SMPI documentation". SimGrid provides two -ways of automatically privatizing the globals, and this option allows -to choose between them. - - - no (default when not using smpirun): Do not automatically privatize variables. - Pass @c -no-privatize to smpirun to disable this feature. - - dlopen or yes (default when using smpirun): Link multiple times against the binary. - - mmap (slower, but maybe somewhat more stable): - Runtime automatic switching of the data segments. - -@warning - This configuration option cannot be set in your platform file. You can only - pass it as an argument to smpirun. - -@subsection options_smpi_privatize_libs smpi/privatize-libs: Automatic privatization of - global variables inside external libraries - -Linux/BSD only: When using dlopen (default) privatization, privatize specific -shared libraries with internal global variables, if they can't be linked statically. -For example libgfortran is usually used for Fortran I/O and indexes in files -can be mixed up. - -@warning - This configuration option can only use either full paths to libraries, or full names. - Check with ldd the name of the library you want to use. - Example: - ldd allpairf90 - libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 (0x00007fbb4d91b000) - Then you can use --cfg=smpi/privatize-libs:"libgfortran.so.3" or --cfg=smpi/privatize-libs:"/usr/lib/x86_64-linux-gnu/libgfortran.so.3", but not "libgfortran" or "libgfortran.so". - Multiple libraries can be given, semicolon separated. - - -@subsection options_model_smpi_detached Simulating MPI detached send - -This threshold specifies the size in bytes under which the send will return -immediately. 
This is different from the threshold detailed in @ref options_model_network_asyncsend
-because the message is not effectively sent when the send is posted. SMPI still waits for the
-corresponding receive to be posted to perform the communication operation. This threshold can be set
-by changing the @b smpi/send-is-detached-thresh item. The default value is 65536.
-
-@subsection options_model_smpi_collectives Simulating MPI collective algorithms
-
-SMPI implements more than 100 different algorithms for MPI collective communication, to accurately
-simulate the behavior of most of the existing MPI libraries. The @b smpi/coll-selector item can be used
-to select the decision logic of either the OpenMPI or the MPICH library (values: ompi or mpich; by default SMPI
-uses naive versions of the collective operations). Each collective operation can also be manually selected with
-@b smpi/collective_name:algo_name. Available algorithms are listed in @ref SMPI_use_colls.
-
-@subsection options_model_smpi_iprobe smpi/iprobe: Inject constant times for calls to MPI_Iprobe
-
-@b Default value: 0.0001
-
-The behavior and motivation for this configuration option are identical to those of @a smpi/test, see
-Section @ref options_model_smpi_test for details.
-
-@subsection options_model_smpi_iprobe_cpu_usage smpi/iprobe-cpu-usage: Reduce speed for iprobe calls
-
-@b Default value: 1 (no change from default behavior)
-
-MPI_Iprobe calls can be heavily used in applications. To account correctly for the energy that
-cores spend probing, it is necessary to reduce the load that these calls cause inside
-SimGrid.
-
-For instance, we measured a max power consumption of 220 W for a particular application, but
-only 180 W while this application was probing. Hence, the correct factor that should
-be passed to this option would be 180/220 = 0.81.
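Following the 180/220 measurement above, such a factor would be passed on the command line; the process count and application name here are made up, and we assume the usual behavior of ``smpirun`` forwarding ``--cfg`` flags to the simulation:

```shell
# Scale the load caused by MPI_Iprobe down to 81% of a regular
# computation, following the 180/220 W measurement discussed above.
smpirun -np 16 --cfg=smpi/iprobe-cpu-usage:0.81 ./my_mpi_app
```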
- -@subsection options_model_smpi_init smpi/init: Inject constant times for calls to MPI_Init - -@b Default value: 0 - -The behavior for this configuration option is identical with @a smpi/test, see -Section @ref options_model_smpi_test for details. - -@subsection options_model_smpi_ois smpi/ois: Inject constant times for asynchronous send operations - -This configuration option works exactly as @a smpi/os, see Section @ref options_model_smpi_os. -Of course, @a smpi/ois is used to account for MPI_Isend instead of MPI_Send. - -@subsection options_model_smpi_os smpi/os: Inject constant times for send operations - -In several network models such as LogP, send (MPI_Send, MPI_Isend) and receive (MPI_Recv) -operations incur costs (i.e., they consume CPU time). SMPI can factor these costs in as well, but the -user has to configure SMPI accordingly as these values may vary by machine. -This can be done by using smpi/os for MPI_Send operations; for MPI_Isend and -MPI_Recv, use @a smpi/ois and @a smpi/or, respectively. These work exactly as -@a smpi/ois. - -@a smpi/os can consist of multiple sections; each section takes three values, for example: - -@verbatim - 1:3:2;10:5:1 -@endverbatim - -Here, the sections are divided by ";" (that is, this example contains two sections). -Furthermore, each section consists of three values. - -1. The first value denotes the minimum size for this section to take effect; - read it as "if message size is greater than this value (and other section has a larger - first value that is also smaller than the message size), use this". - In the first section above, this value is "1". - -2. The second value is the startup time; this is a constant value that will always - be charged, no matter what the size of the message. In the first section above, - this value is "3". - -3. The third value is the @a per-byte cost. 
That is, it is charged for every - byte of the message (incurring cost messageSize*cost_per_byte) - and hence accounts also for larger messages. In the first - section of the example above, this value is "2". - -Now, SMPI always checks which section it should take for a given message; that is, -if a message of size 11 is sent with the configuration of the example above, only -the second section will be used, not the first, as the first value of the second -section is closer to the message size. Hence, a message of size 11 incurs the -following cost inside MPI_Send: - -@verbatim - 5+11*1 -@endverbatim - -As 5 is the startup cost and 1 is the cost per byte. - -@note - The order of sections can be arbitrary; they will be ordered internally. - -@subsection options_model_smpi_or smpi/or: Inject constant times for receive operations - -This configuration option works exactly as @a smpi/os, see Section @ref options_model_smpi_os. -Of course, @a smpi/or is used to account for MPI_Recv instead of MPI_Send. - -@subsection options_model_smpi_test smpi/test: Inject constant times for calls to MPI_Test - -@b Default value: 0.0001 - -By setting this option, you can control the amount of time a process sleeps -when MPI_Test() is called; this is important, because SimGrid normally only -advances the time while communication is happening and thus, -MPI_Test will not add to the time, resulting in a deadlock if used as a -break-condition. - -Here is an example: - -@code{.unparsed} - while(!flag) { - MPI_Test(request, flag, status); - ... - } -@endcode - -@note - Internally, in order to speed up execution, we use a counter to keep track - on how often we already checked if the handle is now valid or not. Hence, we - actually use counter*SLEEP_TIME, that is, the time MPI_Test() causes the process - to sleep increases linearly with the number of previously failed tests. This - behavior can be disabled by setting smpi/grow-injected-times to no. 
This will - also disable this behavior for MPI_Iprobe. - - -@subsection options_model_smpi_shared_malloc smpi/shared-malloc: Factorize malloc()s - -@b Default: global - -If your simulation consumes too much memory, you may want to modify -your code so that the working areas are shared by all MPI ranks. For -example, in a bloc-cyclic matrix multiplication, you will only -allocate one set of blocs, and every processes will share them. -Naturally, this will lead to very wrong results, but this will save a -lot of memory so this is still desirable for some studies. For more on -the motivation for that feature, please refer to the -relevant -section of the SMPI CourseWare (see Activity #2.2 of the pointed -assignment). In practice, change the call to malloc() and free() into -SMPI_SHARED_MALLOC() and SMPI_SHARED_FREE(). - -SMPI provides 2 algorithms for this feature. The first one, called @c -local, allocates one bloc per call to SMPI_SHARED_MALLOC() in your -code (each call location gets its own bloc) and this bloc is shared -amongst all MPI ranks. This is implemented with the shm_* functions -to create a new POSIX shared memory object (kept in RAM, in /dev/shm) -for each shared bloc. - -With the @c global algorithm, each call to SMPI_SHARED_MALLOC() -returns a new adress, but it only points to a shadow bloc: its memory -area is mapped on a 1MiB file on disk. If the returned bloc is of size -N MiB, then the same file is mapped N times to cover the whole bloc. -At the end, no matter how many SMPI_SHARED_MALLOC you do, this will -only consume 1 MiB in memory. - -You can disable this behavior and come back to regular mallocs (for -example for debugging purposes) using @c "no" as a value. - -If you want to keep private some parts of the buffer, for instance if these -parts are used by the application logic and should not be corrupted, you -can use SMPI_PARTIAL_SHARED_MALLOC(size, offsets, offsets_count). 
- -As an example, - -@code{.C} - mem = SMPI_PARTIAL_SHARED_MALLOC(500, {27,42 , 100,200}, 2); -@endcode - -will allocate 500 bytes to mem, such that mem[27..41] and mem[100..199] -are shared and other area remain private. - -Then, it can be deallocated by calling SMPI_SHARED_FREE(mem). - -When smpi/shared-malloc:global is used, the memory consumption problem -is solved, but it may induce too much load on the kernel's pages table. -In this case, you should use huge pages so that we create only one -entry per Mb of malloced data instead of one entry per 4k. -To activate this, you must mount a hugetlbfs on your system and allocate -at least one huge page: - -@code{.sh} - mkdir /home/huge - sudo mount none /home/huge -t hugetlbfs -o rw,mode=0777 - sudo sh -c 'echo 1 > /proc/sys/vm/nr_hugepages' # echo more if you need more -@endcode - -Then, you can pass the option --cfg=smpi/shared-malloc-hugepage:/home/huge -to smpirun to actually activate the huge page support in shared mallocs. - -@subsection options_model_smpi_wtime smpi/wtime: Inject constant times for calls to MPI_Wtime, gettimeofday and clock_gettime - -@b Default value: 10 ns - -This option controls the amount of (simulated) time spent in calls to -MPI_Wtime(), gettimeofday() and clock_gettime(). If you set this value -to 0, the simulated clock is not advanced in these calls, which leads -to issue if your application contains such a loop: - -@code{.unparsed} - while(MPI_Wtime() < some_time_bound) { - /* some tests, with no communication nor computation */ - } -@endcode - -When the option smpi/wtime is set to 0, the time advances only on -communications and computations, so the previous code results in an -infinite loop: the current [simulated] time will never reach @c -some_time_bound. This infinite loop is avoided when that option is -set to a small amount, as it is by default since SimGrid v3.21. 
- -Note that if your application does not contain any loop depending on -the current time only, then setting this option to a non-zero value -will slow down your simulations by a tiny bit: the simulation loop has -to be broken and reset each time your code ask for the current time. -If the simulation speed really matters to you, you can avoid this -extra delay by setting smpi/wtime to 0. - -@section options_generic Configuring other aspects of SimGrid - -@subsection options_generic_clean_atexit Cleanup before termination - -The C / C++ standard contains a function called @b [atexit](http://www.cplusplus.com/reference/cstdlib/atexit/). -atexit registers callbacks, which are called just before the program terminates. - -By setting the configuration option clean-atexit to 1 (true), a callback -is registered and will clean up some variables and terminate/cleanup the tracing. - -TODO: Add when this should be used. - -@subsection options_generic_path Profile files' search path - -It is possible to specify a list of directories to search into for the -trace files (see @ref pf_trace) by using the @b path configuration -item. To add several directory to the path, set the configuration -item several times, as in @verbatim ---cfg=path:toto --cfg=path:tutu -@endverbatim - -@subsection options_generic_breakpoint Set a breakpoint - -@verbatim ---cfg=simix/breakpoint:3.1416 -@endverbatim - -This configuration option sets a breakpoint: when the simulated clock reaches -the given time, a SIGTRAP is raised. This can be used to stop the execution and -get a backtrace with a debugger. - -It is also possible to set the breakpoint from inside the debugger, by writing -in global variable simgrid::simix::breakpoint. 
For example, with gdb:
-
-@verbatim
-set variable simgrid::simix::breakpoint = 3.1416
-@endverbatim
-
-@subsection options_generic_exit Behavior on Ctrl-C
-
-By default, when Ctrl-C is pressed, the status of all existing
-simulated processes is displayed before exiting the simulation. This is very useful to debug your
-code, but it can prove troublesome in some cases (such as when the
-number of processes becomes really big). This behavior is disabled
-when @b verbose-exit is set to 0 (it is set to 1 by default).
-
-@subsection options_exception_cutpath Truncate local path from exception backtrace
-
-@verbatim
---cfg=exception/cutpath:1
-@endverbatim
-
-This configuration option is used to remove the path from the
-backtrace shown when an exception is thrown. This is mainly useful for
-the tests: the full file path makes the tests not reproducible, and
-thus fail, as we are currently comparing outputs. Clearly, the paths
-used on different machines are almost guaranteed to be different and
-hence, the outputs would mismatch, causing the test to fail.
-
-@section options_log Logging Configuration
-
-Logging is handled by XBT. See @ref XBT_log for more details.
-
-*/
diff --git a/doc/doxygen/platform.doc b/doc/doxygen/platform.doc
index c4e0ea49c4..318db98295 100644
--- a/doc/doxygen/platform.doc
+++ b/doc/doxygen/platform.doc
@@ -427,7 +427,7 @@ to latency.
 Attribute name  | Mandatory | Values | Description
 --------------- | --------- | ------ | -----------
 id              | yes       | string | The identifier of the link to be used when referring to it.
-bandwidth       | yes       | int    | Maximum bandwidth for this link, given in bytes/s
+bandwidth       | yes       | string | Maximum bandwidth for this link, along with its unit.
 latency         | no        | double (default: 0.0) | Latency for this link.
 sharing_policy  | no        | @ref sharing_policy_shared "SHARED"@|@ref pf_sharing_policy_fatpipe "FATPIPE"@|@ref pf_sharing_policy_splitduplex "SPLITDUPLEX" (default: SHARED) | Sharing policy for the link.
bandwidth_file | no | string | Allows you to use a file as input for bandwidth. diff --git a/docs/source/conf.py b/docs/source/conf.py index 9f3d260b78..dc1f498326 100644 --- a/docs/source/conf.py +++ b/docs/source/conf.py @@ -41,9 +41,6 @@ release = u'3.21' # ones. extensions = [ 'sphinx.ext.todo', -# 'sphinx.ext.coverage', -# 'sphinx.ext.mathjax', -# 'sphinx.ext.ifconfig', 'breathe', 'exhale', 'hidden_code_block', diff --git a/docs/source/intro_install.rst b/docs/source/intro_install.rst index fd7b4c9cd9..ea1761a860 100644 --- a/docs/source/intro_install.rst +++ b/docs/source/intro_install.rst @@ -122,13 +122,15 @@ dependencies. make make install +.. _install_src_config: + Build Configuration ^^^^^^^^^^^^^^^^^^^ This section is about **compile-time options**, that are very -different from @ref options "run-time options". Compile-time options -fall into two categories. *SimGrid-specific options* define which part -of the framework to compile while *Generic options* are provided by +different from :ref:`run-time options `. Compile-time options +fall into two categories. **SimGrid-specific options** define which part +of the framework to compile while **Generic options** are provided by cmake itself. Generic build-time options diff --git a/docs/source/scenar_config.rst b/docs/source/scenar_config.rst index 671a6b0dc2..69ff7dadfe 100644 --- a/docs/source/scenar_config.rst +++ b/docs/source/scenar_config.rst @@ -14,3 +14,1535 @@ Configuring SimGrid

+
+A number of options can be given at runtime to change the default
+SimGrid behavior. For a complete list of all configuration options
+accepted by the SimGrid version used in your simulator, simply pass
+the --help configuration flag to your program. If some of the options
+are not documented on this page, this is a bug that you should please
+report so that we can fix it. Note that some of the options presented
+here may not be available in your simulators, depending on the
+:ref:`compile-time options <install_src_config>` that you used.
+
+Setting Configuration Items
+---------------------------
+
+There are several ways to pass configuration options to the simulators.
+The most common way is to use the ``--cfg`` command line argument. For
+example, to set the item ``Item`` to the value ``Value``, simply
+type the following on the command-line:
+
+.. code-block:: shell
+
+   my_simulator --cfg=Item:Value (other arguments)
+
+Several ``--cfg`` command line arguments can naturally be used. If you
+need to include spaces in the argument, don't forget to quote the
+argument. You can even escape the included quotes (write ``\'`` for ``'`` if
+you have your argument between ``'``).
+
+Another solution is to use the ``<config>`` tag in the platform file. The
+only restriction is that this tag must occur before the first
+platform element (be it ``<zone>``, ``<cluster>``, ``<peer>`` or whatever).
+The ``<config>`` tag takes an ``id`` attribute, but it is currently
+ignored so you don't really need to pass it. The important part is that
+within that tag, you can pass one or several ``<prop>`` tags to specify
+the configuration to use. For example, setting ``Item`` to ``Value``
+can be done by adding the following to the beginning of your platform
+file:
+
+.. code-block:: xml
+
+   <config>
+     <prop id="Item" value="Value"/>
+   </config>
+
+A last solution is to pass your configuration directly in your program
+with :cpp:func:`simgrid::s4u::Engine::set_config` or :cpp:func:`MSG_config`.
+
+.. 
code-block:: cpp
+
+   #include <simgrid/s4u.hpp>
+
+   int main(int argc, char *argv[]) {
+     simgrid::s4u::Engine e(&argc, argv);
+
+     e.set_config("Item:Value");
+
+     // Rest of your code
+   }
+
+Existing Configuration Items
+----------------------------
+
+.. note::
+   The full list can be retrieved by passing ``--help`` and
+   ``--help-cfg`` to an executable that uses SimGrid.
+
+- **clean-atexit:** :ref:`cfg=clean-atexit`
+
+- **contexts/factory:** :ref:`cfg=contexts/factory`
+- **contexts/guard-size:** :ref:`cfg=contexts/guard-size`
+- **contexts/nthreads:** :ref:`cfg=contexts/nthreads`
+- **contexts/parallel-threshold:** :ref:`cfg=contexts/parallel-threshold`
+- **contexts/stack-size:** :ref:`cfg=contexts/stack-size`
+- **contexts/synchro:** :ref:`cfg=contexts/synchro`
+
+- **cpu/maxmin-selective-update:** :ref:`Cpu Optimization Level <options_model_optim>`
+- **cpu/model:** :ref:`options_model_select`
+- **cpu/optim:** :ref:`Cpu Optimization Level <options_model_optim>`
+
+- **exception/cutpath:** :ref:`cfg=exception/cutpath`
+
+- **host/model:** :ref:`options_model_select`
+
+- **maxmin/precision:** :ref:`cfg=maxmin/precision`
+- **maxmin/concurrency-limit:** :ref:`cfg=maxmin/concurrency-limit`
+
+- **msg/debug-multiple-use:** :ref:`cfg=msg/debug-multiple-use`
+
+- **model-check:** :ref:`options_modelchecking`
+- **model-check/checkpoint:** :ref:`cfg=model-check/checkpoint`
+- **model-check/communications-determinism:** :ref:`cfg=model-check/communications-determinism`
+- **model-check/dot-output:** :ref:`cfg=model-check/dot-output`
+- **model-check/hash:** :ref:`cfg=model-checker/hash`
+- **model-check/max-depth:** :ref:`cfg=model-check/max-depth`
+- **model-check/property:** :ref:`cfg=model-check/property`
+- **model-check/record:** :ref:`cfg=model-check/record`
+- **model-check/reduction:** :ref:`cfg=model-check/reduction`
+- **model-check/replay:** :ref:`cfg=model-check/replay`
+- **model-check/send-determinism:** :ref:`cfg=model-check/send-determinism`
+- **model-check/sparse-checkpoint:** 
:ref:`cfg=model-check/sparse-checkpoint`
+- **model-check/termination:** :ref:`cfg=model-check/termination`
+- **model-check/timeout:** :ref:`cfg=model-check/timeout`
+- **model-check/visited:** :ref:`cfg=model-check/visited`
+
+- **network/bandwidth-factor:** :ref:`cfg=network/bandwidth-factor`
+- **network/crosstraffic:** :ref:`opt_network/crosstraffic`
+- **network/latency-factor:** :ref:`cfg=network/latency-factor`
+- **network/maxmin-selective-update:** :ref:`Network Optimization Level <options_model_optim>`
+- **network/model:** :ref:`options_model_select`
+- **network/optim:** :ref:`Network Optimization Level <options_model_optim>`
+- **network/TCP-gamma:** :ref:`cfg=network/TCP-gamma`
+- **network/weight-S:** :ref:`cfg=network/weight-S`
+
+- **ns3/TcpModel:** :ref:`options_pls`
+- **path:** :ref:`cfg=path`
+- **plugin:** :ref:`cfg=plugin`
+
+- **simix/breakpoint:** :ref:`cfg=simix/breakpoint`
+
+- **storage/max_file_descriptors:** :ref:`cfg=storage/max_file_descriptors`
+
+- **surf/precision:** :ref:`cfg=surf/precision`
+
+- **For collective operations of SMPI,** please refer to Section :ref:`options_index_smpi_coll`
+- **smpi/async-small-thresh:** :ref:`cfg=smpi/async-small-thresh`
+- **smpi/bw-factor:** :ref:`cfg=smpi/bw-factor`
+- **smpi/coll-selector:** :ref:`cfg=smpi/coll-selector`
+- **smpi/comp-adjustment-file:** :ref:`cfg=smpi/comp-adjustment-file`
+- **smpi/cpu-threshold:** :ref:`cfg=smpi/cpu-threshold`
+- **smpi/display-timing:** :ref:`cfg=smpi/display-timing`
+- **smpi/grow-injected-times:** :ref:`cfg=smpi/grow-injected-times`
+- **smpi/host-speed:** :ref:`cfg=smpi/host-speed`
+- **smpi/IB-penalty-factors:** :ref:`cfg=smpi/IB-penalty-factors`
+- **smpi/iprobe:** :ref:`cfg=smpi/iprobe`
+- **smpi/iprobe-cpu-usage:** :ref:`cfg=smpi/iprobe-cpu-usage`
+- **smpi/init:** :ref:`cfg=smpi/init`
+- **smpi/keep-temps:** :ref:`cfg=smpi/keep-temps`
+- **smpi/lat-factor:** :ref:`cfg=smpi/lat-factor`
+- **smpi/ois:** :ref:`cfg=smpi/ois`
+- **smpi/or:** :ref:`cfg=smpi/or`
+- **smpi/os:** 
:ref:`cfg=smpi/os`
+- **smpi/papi-events:** :ref:`cfg=smpi/papi-events`
+- **smpi/privatization:** :ref:`cfg=smpi/privatization`
+- **smpi/privatize-libs:** :ref:`cfg=smpi/privatize-libs`
+- **smpi/send-is-detached-thresh:** :ref:`cfg=smpi/send-is-detached-thresh`
+- **smpi/shared-malloc:** :ref:`cfg=smpi/shared-malloc`
+- **smpi/shared-malloc-hugepage:** :ref:`cfg=smpi/shared-malloc-hugepage`
+- **smpi/simulate-computation:** :ref:`cfg=smpi/simulate-computation`
+- **smpi/test:** :ref:`cfg=smpi/test`
+- **smpi/wtime:** :ref:`cfg=smpi/wtime`
+
+- **Tracing configuration options** can be found in Section :ref:`tracing_tracing_options`
+
+- **storage/model:** :ref:`options_model_select`
+- **verbose-exit:** :ref:`cfg=verbose-exit`
+
+- **vm/model:** :ref:`options_model_select`
+
+.. _options_index_smpi_coll:
+
+Index of SMPI collective algorithms options
+...........................................
+
+.. TODO:: All available collective algorithms will be made available
+   via the ``smpirun --help-coll`` command.
+
+.. _options_model:
+
+Configuring the Platform Models
+-------------------------------
+
+.. _options_model_select:
+
+Choosing the Platform Models
+............................
+
+SimGrid comes with several network, CPU and storage models built in,
+and you can change the used model at runtime by changing the passed
+configuration. The three main configuration items are given below.
+For each of these items, passing the special ``help`` value gives you
+a short description of all possible values (for example,
+``--cfg=network/model:help`` will present all provided network
+models). Also, ``--help-models`` should provide information about all
+models for all existing resources.
+
+- ``network/model``: specify the used network model. Possible values:
+
+  - **LV08 (default one):** Realistic network analytic model
+    (slow-start modeled by multiplying latency by 13.01, bandwidth by
+    .97; bottleneck sharing uses a payload of S=20537 for evaluating
+    RTT). 
Described in `Accuracy Study and Improvement of Network
+    Simulation in the SimGrid Framework
+    `_.
+  - **Constant:** Simplistic network model where all communications
+    take a constant time (one second). This model provides the lowest
+    realism, but is (marginally) faster.
+  - **SMPI:** Realistic network model specifically tailored for HPC
+    settings (accurate modeling of slow start with correction factors on
+    three intervals: < 1KiB, < 64 KiB, >= 64 KiB). This model can be
+    :ref:`further configured <options_model_network>`.
+  - **IB:** Realistic network model specifically tailored for HPC
+    settings with InfiniBand networks (accurate modeling of the
+    contention behavior, based on the model explained in `this PhD work
+    `_).
+    This model can be :ref:`further configured <options_model_network>`.
+  - **CM02:** Legacy network analytic model. Very similar to LV08, but
+    without corrective factors. The timings of small messages are thus
+    poorly modeled. This model is described in `A Network Model for
+    Simulation of Grid Application
+    `_.
+  - **Reno/Reno2/Vegas:** Models from Steven H. Low using lagrange_solve instead of
+    lmm_solve (experts only; check the code for more info).
+  - **NS3** (only available if you compiled SimGrid accordingly):
+    Use the packet-level network
+    simulator as the network model (see :ref:`pls_ns3`).
+    This model can be :ref:`further configured <options_pls>`.
+
+- ``cpu/model``: specify the used CPU model. We have only one model
+  for now:
+
+  - **Cas01:** Simplistic CPU model (time=size/power)
+
+- ``host/model``: The host concept is the aggregation of a CPU with a
+  network card. Three models exist, but actually, only two of them are
+  interesting. The "compound" one is simply due to the way our
+  internal code is organized, and can easily be ignored. So in the
+  end, you have two host models: The default one allows you to aggregate
+  an existing CPU model with an existing network model, but does not
+  allow parallel tasks because these beasts need some collaboration
+  between the network and CPU models. 
That is why ptask_L07 is used by
+  default when using SimDag.
+
+  - **default:** Default host model. Currently, CPU:Cas01 and
+    network:LV08 (with cross traffic enabled)
+  - **compound:** Host model that is automatically chosen if
+    you change the network and CPU models
+  - **ptask_L07:** Host model somehow similar to Cas01+CM02 but
+    allowing "parallel tasks" that are intended to model the moldable
+    tasks of the grid scheduling literature.
+
+- ``storage/model``: specify the used storage model. Only one model is
+  provided so far.
+- ``vm/model``: specify the model for virtual machines. Only one model
+  is provided so far.
+
+.. todo:: make 'compound' the default host model.
+
+.. _options_model_optim:
+
+Optimization Level
+..................
+
+The network and CPU models that are based on lmm_solve (that
+is, all our analytical models) accept specific optimization
+configurations.
+
+  - items ``network/optim`` and ``cpu/optim`` (both default to 'Lazy'):
+
+    - **Lazy:** Lazy action management (partial invalidation in lmm +
+      heap in action remaining).
+    - **TI:** Trace integration. Highly optimized mode when using
+      availability traces (only available for the Cas01 CPU model for
+      now).
+    - **Full:** Full update of remaining and variables. Slow but may be
+      useful when debugging.
+
+  - items ``network/maxmin-selective-update`` and
+    ``cpu/maxmin-selective-update``: configure whether the underlying
+    model should be lazily updated or not. It should have no impact on the
+    computed timings, but should speed up the computation. |br| It is
+    still possible to disable this feature because it can prove
+    counter-productive in very specific scenarios where the
+    interaction level is high. In particular, if all your
+    communications share a given backbone link, you should disable it:
+    without it, a simple regular loop is used to update each
+    communication. 
With it, each of them is still updated (because of
+    the dependency induced by the backbone), but through a complicated
+    and slow pattern that follows the actual dependencies.
+
+.. _cfg=maxmin/precision:
+.. _cfg=surf/precision:
+
+Numerical Precision
+...................
+
+**Option** ``maxmin/precision`` **Default:** 0.00001 (in flops or bytes) |br|
+**Option** ``surf/precision`` **Default:** 0.00001 (in seconds)
+
+The analytical models handle a lot of floating point values. It is
+possible to change the epsilon used to update and compare them through
+this configuration item. Changing it may speed up the simulation by
+discarding very small actions, at the price of a reduced numerical
+precision. You can modify separately the precision used to manipulate
+timings (in seconds) and the one used to manipulate amounts of work
+(in flops or bytes).
+
+.. _cfg=maxmin/concurrency-limit:
+
+Concurrency Limit
+.................
+
+**Option** ``maxmin/concurrency-limit`` **Default:** -1 (no limit)
+
+The maximum number of variables per resource can be tuned through this
+option. You can have as many simultaneous actions per resource as you
+want. If your simulation presents a very high level of concurrency, it
+may help to use e.g. 100 as a value here. It means that at most 100
+actions can consume a resource at a given time. The extraneous actions
+are queued and wait until the amount of concurrency of the considered
+resource drops below the given limit.
+
+Such limitations help both the simulation speed and the simulation accuracy
+on highly constrained scenarios, but the simulation speed suffers from this
+setting on regular (less constrained) scenarios so it is off by default.
+
+.. _options_model_network:
+
+Configuring the Network Model
+.............................
+
+.. 
_cfg=network/TCP-gamma:
+
+Maximal TCP Window Size
+^^^^^^^^^^^^^^^^^^^^^^^
+
+**Option** ``network/TCP-gamma`` **Default:** 4194304
+
+The analytical models need to know the maximal TCP window size to take
+the TCP congestion mechanism into account. On Linux, this value can
+be retrieved using the following commands. Both give a set of values,
+and you should use the last one, which is the maximal size.
+
+.. code-block:: shell
+
+   cat /proc/sys/net/ipv4/tcp_rmem # gives the receiver window
+   cat /proc/sys/net/ipv4/tcp_wmem # gives the sender window
+
+.. _cfg=smpi/IB-penalty-factors:
+.. _cfg=network/bandwidth-factor:
+.. _cfg=network/latency-factor:
+.. _cfg=network/weight-S:
+
+Correcting Important Network Parameters
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+SimGrid can take network irregularities, such as a slow startup or a
+behavior that changes with the message size, into account. You
+should not change these values unless you really know what you're
+doing. The corresponding values were computed through data fitting
+on the timings of packet-level simulators, as described in `Accuracy
+Study and Improvement of Network Simulation in the SimGrid Framework
+`_.
+
+If you are using the SMPI model, these correction coefficients are
+themselves corrected by constant values depending on the size of the
+exchange. By default, SMPI uses factors computed on the Stampede
+Supercomputer at TACC, with optimal deployment of processes on
+nodes. Again, only hardcore experts should bother about this fact.
+
+InfiniBand network behavior can be modeled through three parameters
+``smpi/IB-penalty-factors:"βe;βs;γs"``, as explained in `this PhD
+thesis
+`_.
+
+.. todo:: This section should be rewritten, and actually explain the
+   options network/bandwidth-factor, network/latency-factor,
+   network/weight-S.
+
+.. 
_opt_network/crosstraffic:
+
+Simulating Cross-Traffic
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Since SimGrid v3.7, cross-traffic effects can be taken into account in
+analytical simulations. It means that outgoing and incoming
+communication flows are treated independently. In addition, the LV08
+model adds 0.05 of usage on the opposite direction for each newly
+created flow. This can be useful to simulate some important TCP
+phenomena such as ack compression.
+
+For that to work, your platform must have two links for each
+pair of interconnected hosts. An example of such a platform is
+available in ``examples/platforms/crosstraffic.xml``.
+
+This is activated through the ``network/crosstraffic`` item, which
+can be set to 0 (disable this feature) or 1 (enable it).
+
+Note that with the default host model this option is activated by default.
+
+.. _cfg=smpi/async-small-thresh:
+
+Simulating Asynchronous Send
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+(this configuration item is experimental and may change or disappear)
+
+It is possible to specify that messages below a certain size will be
+sent as soon as the call to MPI_Send is issued, without waiting for
+the corresponding receive. This threshold can be configured through
+the ``smpi/async-small-thresh`` item. The default value is 0. This
+behavior can also be manually set for mailboxes, by setting the
+receiving mode of the mailbox with a call to
+:cpp:func:`MSG_mailbox_set_async`. After this, all messages sent to
+this mailbox will have this behavior regardless of the message size.
+
+This value needs to be smaller than or equal to the threshold set at
+:ref:`options_model_smpi_detached`, because asynchronous messages are
+meant to be detached as well.
+
+.. _options_pls:
+
+Configuring NS3
+^^^^^^^^^^^^^^^
+
+**Option** ``ns3/TcpModel`` **Default:** "default" (NS3 default)
+
+When using NS3, there is an extra item ``ns3/TcpModel``, corresponding
+to the ``ns3::TcpL4Protocol::SocketType`` configuration item in
+NS3. 
The only valid values (enforced on the SimGrid side) are
+'default' (no change to the NS3 configuration), 'NewReno', 'Reno' or
+'Tahoe'.
+
+Configuring the Storage Model
+.............................
+
+.. _cfg=storage/max_file_descriptors:
+
+File Descriptor Count per Host
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+**Option** ``storage/max_file_descriptors`` **Default:** 1024
+
+Each host maintains a fixed-size array of its file descriptors. You
+can change its size through this item to either enlarge it if your
+application requires it or to reduce it to save memory space.
+
+.. _cfg=plugin:
+
+Activating Plugins
+------------------
+
+SimGrid plugins allow you to extend the framework without changing its
+source code directly. Read the source code of the existing plugins to
+learn how to do so (in ``src/plugins``), and ask your questions to the
+usual channels (Stack Overflow, Mailing list, IRC). The basic idea is
+that plugins usually register callbacks to some signals of interest.
+If they need to store some information about a given object (Link, CPU
+or Actor), they do so through the use of a dedicated object extension.
+
+Some of the existing plugins can be activated from the command line,
+without any modification to your simulation code. For example, you can
+activate the host energy plugin by adding ``--cfg=plugin:host_energy``
+to your command line.
+
+Here is the full list of plugins that can be activated this way:
+
+ - **host_energy:** keeps track of the energy dissipated by
+   computations. More details in :ref:`plugin_energy`.
+ - **link_energy:** keeps track of the energy dissipated by
+   communications. More details in :ref:`SURF_plugin_energy`.
+ - **host_load:** keeps track of the computational load.
+   More details in :ref:`plugin_load`.
+
+.. 
_options_modelchecking:
+
+Configuring the Model-Checking
+------------------------------
+
+To enable the SimGrid model-checking support, the program should
+be executed using the simgrid-mc wrapper:
+
+.. code-block:: shell
+
+   simgrid-mc ./my_program
+
+Safety properties are expressed as assertions using the function
+:cpp:func:`void MC_assert(int prop)`.
+
+.. _cfg=model-check/property:
+
+Specifying a liveness property
+..............................
+
+**Option** ``model-check/property`` **Default:** unset
+
+If you want to specify liveness properties, you have to pass them on
+the command line, specifying the name of the file containing the
+property, as formatted by the ltl2ba program.
+
+.. code-block:: shell
+
+   simgrid-mc ./my_program --cfg=model-check/property:<filename>
+
+.. _cfg=model-check/checkpoint:
+
+Going for Stateful Verification
+...............................
+
+By default, the system is backtracked to its initial state to explore
+another path instead of backtracking to the exact step before the fork
+that we want to explore (this is called stateless verification). This
+is done this way because saving intermediate states can rapidly
+exhaust the available memory. If you want, you can change the value of
+the ``model-check/checkpoint`` item. For example,
+``--cfg=model-check/checkpoint:1`` asks to take a checkpoint every
+step. Beware, this will certainly explode your memory. Larger values
+are probably better; make sure to experiment a bit to find the right
+setting for your specific system.
+
+.. _cfg=model-check/reduction:
+
+Specifying the kind of reduction
+................................
+
+The main issue when using model-checking is the state space
+explosion. To counter that problem, you can choose an exploration
+reduction technique with
+``--cfg=model-check/reduction:<technique>``. 
For now, this
+configuration variable can take two values:
+
+ - **none:** Do not apply any kind of reduction (mandatory for now for
+   liveness properties)
+ - **dpor:** Apply Dynamic Partial Ordering Reduction. Only valid if
+   you verify local safety properties (default value for safety
+   checks).
+
+There is unfortunately no silver bullet here, and the most efficient
+reduction techniques cannot be applied to all properties. In
+particular, the DPOR method cannot be applied to liveness properties
+since our implementation of DPOR may break some cycles, while cycles
+are very important to the soundness of the exploration for liveness
+properties.
+
+.. _cfg=model-check/visited:
+
+Size of Cycle Detection Set
+...........................
+
+In order to detect cycles, the model-checker needs to check if a newly
+explored state is in fact the same state as a previous one. For
+that, the model-checker can take a snapshot of each visited state:
+this snapshot is then used to compare it with subsequent states in the
+exploration graph.
+
+The ``model-check/visited`` item is the maximum number of states which
+are stored in memory. If the maximum number of snapshotted states is
+reached, some states will be removed from the memory and some cycles
+might be missed. Small values can lead to incorrect verifications, but
+large values can exhaust your memory, so choose carefully.
+
+By default, no state is snapshotted and cycles cannot be detected.
+
+.. _cfg=model-check/termination:
+
+Non-Termination Detection
+.........................
+
+The ``model-check/termination`` configuration item can be used to
+report if a non-termination execution path has been found. This is a
+path with a cycle, which means that the program might never terminate.
+
+This only works in safety mode, not in liveness mode.
+
+This option is disabled by default.
+
+.. _cfg=model-check/dot-output:
+
+Dot Output
+.......... 
+
+If set, the ``model-check/dot-output`` configuration item is the name
+of a file in which to write a dot file of the path leading to the found
+property (safety or liveness violation) as well as the cycle for
+liveness properties. This dot file can then be fed to the graphviz dot
+tool to generate a corresponding graphical representation.
+
+.. _cfg=model-check/max-depth:
+
+Exploration Depth Limit
+.......................
+
+The ``model-check/max-depth`` item sets the maximum depth of the
+exploration graph of the model-checker. If this limit is reached, a
+logging message is sent and the results might not be exact.
+
+By default, there is no depth limit.
+
+.. _cfg=model-check/timeout:
+
+Handling of Timeouts
+....................
+
+By default, the model-checker does not handle timeout conditions: the ``wait``
+operations never time out. With the ``model-check/timeout`` configuration item
+set to **yes**, the model-checker will explore timeouts of ``wait`` operations.
+
+.. _cfg=model-check/communications-determinism:
+.. _cfg=model-check/send-determinism:
+
+Communication Determinism
+.........................
+
+The ``model-check/communications-determinism`` and
+``model-check/send-determinism`` items can be used to select the
+communication determinism mode of the model-checker, which checks
+determinism properties of the communications of an application.
+
+.. _cfg=model-check/sparse-checkpoint:
+
+Incremental Checkpoints
+.......................
+
+When the model-checker is configured to take a snapshot of each
+explored state (with the ``model-check/visited`` item), the memory
+consumption can rapidly reach GiB or TiB of memory. However, for many
+workloads, the memory does not change much between different snapshots
+and taking a complete copy of each snapshot is a waste of memory.
+
+The ``model-check/sparse-checkpoint`` option can be set to
+**yes** to avoid making a complete copy of each snapshot. 
Instead,
+each snapshot will be decomposed into blocks which will be stored
+separately. If multiple snapshots share the same block (or if the
+same block is used in the same snapshot), the same copy of the block
+will be shared, leading to a reduction of the memory footprint.
+
+For many applications, this option considerably reduces the memory
+consumption. In some cases, the model-checker might be slightly
+slower because of the time taken to manage the metadata about the
+blocks. In other cases, however, this snapshotting strategy will be
+much faster by reducing the cache consumption. When the memory
+consumption is high, this option might also be much faster than the
+basic snapshotting strategy, by avoiding to hit the swap or by
+reducing the swap usage.
+
+This option is currently disabled by default.
+
+.. _options_mc_perf:
+
+Verification Performance Considerations
+.......................................
+
+The size of the stacks can have a huge impact on the memory
+consumption when using model-checking. By default, each snapshot will
+save a copy of the whole stacks and not only of the part which is
+really meaningful: you should expect the contribution of the memory
+consumption of the snapshots to be
+:math:`\text{number of processes} \times \text{stack size} \times \text{number of states}`.
+
+The ``model-check/sparse-checkpoint`` option can be used to reduce the memory
+consumption by trying to share memory between the different snapshots.
+
+When compiled against the model checker, the stacks are not
+protected with guards: if the stack size is too small for your
+application, the stack will silently overflow on other parts of the
+memory (see :ref:`cfg=contexts/guard-size`).
+
+.. _cfg=model-checker/hash:
+
+State Hashing
+.............
+
+Usually, most of the time of the model-checker is spent comparing states. This
+process is complicated and consumes a lot of bandwidth and cache. 
+
+In order to speed up the state comparison, the experimental ``model-checker/hash``
+configuration item enables the computation of a hash summarizing as much
+information of the state as possible into a single value. This hash can be used
+to avoid most of the comparisons: the costly comparison is then only used when
+the hashes are identical.
+
+Currently, most of the state is not included in the hash because the
+implementation was found to be buggy and this option is not as useful as
+it could be. For this reason, it is currently disabled by default.
+
+.. _cfg=model-check/record:
+.. _cfg=model-check/replay:
+
+Record/Replay of Verification
+.............................
+
+As the model-checker keeps jumping at different places in the execution graph,
+it is difficult to understand what happens when trying to debug an application
+under the model-checker. Even the output of the program is difficult to
+interpret. Moreover, the model-checker does not behave nicely with advanced
+debugging tools such as valgrind. For those reasons, it is desirable to
+identify a trajectory in the execution graph with the model-checker, and then
+to replay this trajectory without the model-checker black-magic but with more
+standard tools (such as a debugger, valgrind, etc.). To this end, SimGrid
+implements an experimental record/replay functionality that records a
+trajectory with the model-checker and replays it without the model-checker.
+
+When the model-checker finds an interesting path in the application
+execution graph (where a safety or liveness property is violated), it
+can generate an identifier for this path. To enable this behavior, the
+``model-check/record`` item must be set to **yes**, which is not the case
+by default.
+
+Here is an example of output:
+
+.. 
code-block:: shell
+
+   [ 0.000000] (0:@) Check a safety property
+   [ 0.000000] (0:@) **************************
+   [ 0.000000] (0:@) *** PROPERTY NOT VALID ***
+   [ 0.000000] (0:@) **************************
+   [ 0.000000] (0:@) Counter-example execution trace:
+   [ 0.000000] (0:@) Path = 1/3;1/4
+   [ 0.000000] (0:@) [(1)Tremblay (app)] MC_RANDOM(3)
+   [ 0.000000] (0:@) [(1)Tremblay (app)] MC_RANDOM(4)
+   [ 0.000000] (0:@) Expanded states = 27
+   [ 0.000000] (0:@) Visited states = 68
+   [ 0.000000] (0:@) Executed transitions = 46
+
+This path can then be replayed outside of the model-checker (and even
+in a non-MC build of SimGrid) by setting the ``model-check/replay`` item
+to the given path. The other options should be the same (but the
+model-checker should be disabled).
+
+The format and meaning of the path may change between different
+releases, so the same release of SimGrid should be used for the record
+phase and the replay phase.
+
+Configuring the User Code Virtualization
+----------------------------------------
+
+.. _cfg=contexts/factory:
+
+Selecting the Virtualization Factory
+....................................
+
+**Option** ``contexts/factory`` **Default:** "raw"
+
+In SimGrid, the user code is virtualized in a specific mechanism that
+allows the simulation kernel to control its execution: when a user
+process requires a blocking action (such as sending a message), it is
+interrupted, and only gets released when the simulated clock reaches
+the point where the blocking operation is done. This is explained
+graphically in the `relevant tutorial, available online
+`_.
+
+In SimGrid, the containers in which user processes are virtualized are
+called contexts. Several context factories are provided, and you can
+select the one you want to use with the ``contexts/factory``
+configuration item. Some of the following may not exist on your
+machine because of portability issues. 
In any case, the default one
+should be the most efficient one (please report bugs if the
+auto-detection fails for you). They are approximately sorted here from
+the slowest to the most efficient:
+
+ - **thread:** very slow factory using full-featured threads (either
+   pthreads or Windows native threads). They are slow but very
+   standard. Some debuggers or profilers only work with this factory.
+ - **java:** Java applications are virtualized onto Java threads (that
+   are regular pthreads registered to the JVM)
+ - **ucontext:** fast factory using System V contexts (Linux and FreeBSD only)
+ - **boost:** This uses the `context
+   implementation `_
+   of the boost library for a performance that is comparable to our
+   raw implementation.
+   |br| Install the relevant library (e.g. with the
+   libboost-contexts-dev package on Debian/Ubuntu) and recompile
+   SimGrid.
+ - **raw:** amazingly fast factory using a context switching mechanism
+   of our own, directly implemented in assembly (only available for x86
+   and amd64 platforms for now) and without any unneeded system call.
+
+The main reason to change this setting is when the debugging tools get
+fooled by the optimized context factories. Threads are the most
+debugging-friendly contexts, as they allow you to set breakpoints
+anywhere with gdb and visualize backtraces for all processes, in order
+to debug concurrency issues. Valgrind is also more comfortable with
+threads, but it should be usable with all factories (exception: the
+callgrind tool really dislikes raw and ucontext factories).
+
+.. _cfg=contexts/stack-size:
+
+Adapting the Stack Size
+.......................
+
+**Option** ``contexts/stack-size`` **Default:** 8192 KiB
+
+Each virtualized user process is executed using a specific system
+stack. The size of this stack has a huge impact on the simulation
+scalability, but its default value is rather large. 
This is because
+the errors that you get when the stack size is too small are rather
+disturbing: a too-small stack silently overflows (overwriting other
+stacks), leading to segfaults with corrupted stack traces.
+
+If you want to push the scalability limits of your code, you might
+want to reduce the ``contexts/stack-size`` item. Its default value is
+8192 (in KiB), while our Chord simulation works with stacks as small
+as 16 KiB, for example. For the thread factory, the default value is
+the one of the system but you can still change it with this parameter.
+
+The operating system should only allocate memory for the pages of the
+stack which are actually used, and you might not need to use this in
+most cases. However, this setting is very important when using the
+model checker (see :ref:`options_mc_perf`).
+
+.. _cfg=contexts:guard-size:
+
+Disabling Stack Guard Pages
+...........................
+
+**Option** ``contexts:guard-size`` **Default** 1 page in most cases (0 pages on Windows or with MC)
+
+A stack guard page is usually used, which prevents the stack of a given
+actor from overflowing on another stack. But the performance impact
+may become prohibitive when the number of actors increases. The
+option ``contexts:guard-size`` is the number of stack guard pages
+used. By setting it to 0, no guard pages will be used: in this case,
+you should avoid using small stacks (with :ref:`contexts/stack-size
+<cfg=contexts/stack-size>`) as the stack will silently overflow on
+other parts of the memory.
+
+When no stack guard page is created, stacks may then silently overflow
+on other parts of the memory if their size is too small for the
+application.
+
+.. _cfg=contexts/nthreads:
+.. _cfg=contexts/parallel-threshold:
+.. _cfg=contexts/synchro:
+
+Running User Code in Parallel
+.............................
+
+Parallel execution of the user code is only considered stable in
+SimGrid v3.7 and higher, and mostly for MSG simulations. SMPI
+simulations may well fail in parallel mode.
It is described in
+`INRIA RR-7653 `_.
+
+If you are using the **ucontext** or **raw** context factories, you can
+request to execute the user code in parallel. Several threads are
+launched, each of them handling several user contexts at each run. To
+activate this, set the ``contexts/nthreads`` item to the number of
+cores that you have in your computer (or to a value lower than 1 to
+have the number of cores auto-detected).
+
+Even if you asked for several worker threads using the previous option,
+you can request to start the parallel execution (and pay the
+associated synchronization costs) only if the potential parallelism is
+large enough. For that, set the ``contexts/parallel-threshold``
+item to the minimal number of user contexts needed to start the
+parallel execution. In any given simulation round, if that number is
+not reached, the contexts will be run sequentially directly by the
+main thread (thus saving the synchronization costs). Note that this
+option is mainly useful when the grain of the user code is very fine,
+because our synchronization is now very efficient.
+
+When parallel execution is activated, you can choose the
+synchronization scheme used with the ``contexts/synchro`` item,
+whose value is one of:
+
+  - **futex:** ultra optimized synchronization scheme, based on futexes
+    (fast user-mode mutexes), and thus only available on Linux systems.
+    This is the default mode when available.
+  - **posix:** slow but portable synchronization using only POSIX
+    primitives.
+  - **busy_wait:** not really a synchronization: the worker threads
+    constantly request new contexts to execute. It should be the most
+    efficient synchronization scheme, but it loads all the cores of
+    your machine for no good reason. You probably prefer the other,
+    less eager schemes.
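To make these knobs concrete, here is how such a parallel-execution setup could look on the command line. The simulator binary and input file names are placeholders, and the values are only illustrative:

```shell
# Placeholder simulator and input files; flag values are illustrative only.
# Use the raw factory with 4 worker threads, start parallel execution
# only when at least 100 user contexts are runnable, and synchronize
# the workers with futexes.
./my_simulator platform.xml deployment.xml \
    --cfg=contexts/factory:raw \
    --cfg=contexts/nthreads:4 \
    --cfg=contexts/parallel-threshold:100 \
    --cfg=contexts/synchro:futex
```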
+ + +Configuring the Tracing +----------------------- + +The :ref:`tracing subsystem ` can be configured in +several different ways depending on the nature of the simulator (MSG, +SimDag, SMPI) and the kind of traces that need to be obtained. See the +:ref:`Tracing Configuration Options subsection +` to get a detailed description of each +configuration option. + +We detail here a simple way to get the traces working for you, even if +you never used the tracing API. + + +- Any SimGrid-based simulator (MSG, SimDag, SMPI, ...) and raw traces: + + .. code-block:: shell + + --cfg=tracing:yes --cfg=tracing/uncategorized:yes --cfg=triva/uncategorized:uncat.plist + + The first parameter activates the tracing subsystem, the second + tells it to trace host and link utilization (without any + categorization) and the third creates a graph configuration file to + configure Triva when analysing the resulting trace file. + +- MSG or SimDag-based simulator and categorized traces (you need to + declare categories and classify your tasks according to them) + + .. code-block:: shell + + --cfg=tracing:yes --cfg=tracing/categorized:yes --cfg=triva/categorized:cat.plist + + The first parameter activates the tracing subsystem, the second + tells it to trace host and link categorized utilization and the + third creates a graph configuration file to configure Triva when + analysing the resulting trace file. + +- SMPI simulator and traces for a space/time view: + + .. code-block:: shell + + smpirun -trace ... + + The `-trace` parameter for the smpirun script runs the simulation + with ``--cfg=tracing:yes --cfg=tracing/smpi:yes``. Check the + smpirun's `-help` parameter for additional tracing options. + +Sometimes you might want to put additional information on the trace to +correctly identify them later, or to provide data that can be used to +reproduce an experiment. You have two ways to do that: + +- Add a string on top of the trace file as comment: + + .. 
code-block:: shell
+
+      --cfg=tracing/comment:my_simulation_identifier
+
+- Add the contents of a textual file on top of the trace file as comment:
+
+  .. code-block:: shell
+
+      --cfg=tracing/comment-file:my_file_with_additional_information.txt
+
+Please use these two parameters (for comments) to make reproducible
+simulations. For additional details about this and all tracing
+options, see :ref:`tracing_tracing_options`.
+
+Configuring MSG
+---------------
+
+.. _cfg=msg/debug-multiple-use:
+
+Debugging MSG Code
+..................
+
+**Option** ``msg/debug-multiple-use`` **Default:** off
+
+Sometimes your application may try to send a task that is still being
+executed somewhere else, making it impossible to send this task. However,
+for debugging purposes, one may want to know what the other host is/was
+doing. This option shows a backtrace of the other process.
+
+Configuring SMPI
+----------------
+
+The SMPI interface provides several specific configuration items.
+These are easy to miss, since the code is usually launched through the
+``smpirun`` script directly.
+
+.. _cfg=smpi/host-speed:
+.. _cfg=smpi/cpu-threshold:
+.. _cfg=smpi/simulate-computation:
+
+Automatic Benchmarking of SMPI Code
+...................................
+
+In SMPI, the sequential code is automatically benchmarked, and these
+computations are automatically reported to the simulator. That is to
+say that if you have a large computation between a ``MPI_Recv()`` and
+a ``MPI_Send()``, SMPI will automatically benchmark the duration of
+this code, and create an execution task within the simulator to take
+this into account. For that, the actual duration is measured on the
+host machine and then scaled to the power of the corresponding
+simulated machine. The variable ``smpi/host-speed`` allows you to specify
+the computational speed of the host machine (in flop/s) to use when
+scaling the execution times.
It defaults to 20000, but you really want
+to update it to get accurate simulation results.
+
+When the code is constituted of numerous consecutive MPI calls, the
+previous mechanism feeds the simulation kernel with numerous tiny
+computations. The ``smpi/cpu-threshold`` item becomes handy when this
+badly impacts the simulation performance. It specifies a threshold (in
+seconds) below which the execution chunks are not reported to the
+simulation kernel (default value: 1e-6).
+
+.. note:: The option ``smpi/cpu-threshold`` ignores any computation
+   time spent below this threshold. SMPI does not consider the
+   `amount` of these computations; there is no offset for this. Hence,
+   a value that is too small may lead to unreliable simulation
+   results.
+
+In some cases, however, one may wish to disable simulation of
+application computation. This is the case when SMPI is used not to
+simulate an MPI application, but instead an MPI code that performs
+"live replay" of another MPI app (e.g., ScalaTrace's replay tool,
+various on-line simulators that run an app at scale). In this case the
+computation of the replay/simulation logic should not be simulated by
+SMPI. Instead, the replay tool or on-line simulator will issue
+"computation events", which correspond to the actual MPI simulation
+being replayed/simulated. At the moment, these computation events can
+be simulated using SMPI by calling internal smpi_execute*() functions.
+
+To disable the benchmarking/simulation of computation in the simulated
+application, the variable ``smpi/simulate-computation`` should be set
+to no. This option just ignores the timings in your simulation; it
+still executes the computations itself. If you want to stop SMPI from
+doing that, you should check the SMPI_SAMPLE macros, documented in
+Section :ref:`SMPI_adapting_speed`.
+
++------------------------------------+-------------------------+-----------------------------+
+| Solution                           | Computations executed?  | Computations simulated?     |
++====================================+=========================+=============================+
+| --cfg=smpi/simulate-computation:no | Yes                     | Never                       |
++------------------------------------+-------------------------+-----------------------------+
+| --cfg=smpi/cpu-threshold:42        | Yes, in all cases       | If it lasts over 42 seconds |
++------------------------------------+-------------------------+-----------------------------+
+| SMPI_SAMPLE() macro                | Only once per loop nest | Always                      |
++------------------------------------+-------------------------+-----------------------------+
+
+.. _cfg=smpi/comp-adjustment-file:
+
+Slow-down or speed-up parts of your code
+........................................
+
+**Option** ``smpi/comp-adjustment-file:`` **Default:** unset
+
+This option allows you to pass a file that contains two columns: The
+first column defines the section that will be subject to a speedup;
+the second column is the speedup. For instance:
+
+.. code-block:: shell
+
+   "start:stop","ratio"
+   "exchange_1.f:30:exchange_1.f:130",1.18244559422142
+
+The first line is the header - you must include it. The following
+line means that the code between two consecutive MPI calls on line 30
+in exchange_1.f and line 130 in exchange_1.f should receive a speedup
+of 1.18244559422142. The value for the second column is therefore a
+speedup if it is larger than 1, and a slow-down if it is smaller
+than 1. Nothing will be changed if it is equal to 1.
+
+Of course, you can set any arbitrary filenames you want (so the start
+and end don't have to be in the same file), but be aware that this
+mechanism only supports `consecutive calls`!
+
+Please note that you must pass the ``-trace-call-location`` flag to
+smpicc or smpiff. This flag activates some internal macro definitions
+that help with obtaining the call location.
+
+.. _cfg=smpi/bw-factor:
+
+Bandwidth Factors
+.................
+
+**Option** ``smpi/bw-factor``
+|br| **Default:** 65472:0.940694;15424:0.697866;9376:0.58729;5776:1.08739;3484:0.77493;1426:0.608902;732:0.341987;257:0.338112;0:0.812084
+
+The possible throughput of network links is often dependent on the
+message sizes, as protocols may adapt to different message sizes. With
+this option, a series of message sizes and factors is given, helping
+the simulation to be more realistic. For instance, the current default
+value means that messages with size 65472 and more will get a total of
+MAX_BANDWIDTH*0.940694, messages of size 15424 to 65471 will get
+MAX_BANDWIDTH*0.697866 and so on (where MAX_BANDWIDTH denotes the
+bandwidth of the link).
+
+An experimental script to compute these factors is available online. See
+http://simgrid.gforge.inria.fr/contrib/smpi-calibration-doc.html
+http://simgrid.gforge.inria.fr/contrib/smpi-saturation-doc.html
+
+.. _cfg=smpi/display-timing:
+
+Reporting Simulation Time
+.........................
+
+**Option** ``smpi/display-timing`` **Default:** 0 (false)
+
+Most of the time, you run MPI code with SMPI to compute the time it
+would take to run it on a platform. But since the code is run through
+the ``smpirun`` script, you don't have any control over the launcher
+code, making it difficult to report the simulated time when the
+simulation ends. If you enable the ``smpi/display-timing`` item,
+``smpirun`` will display this information when the simulation
+ends.
+
+.. _cfg=smpi/keep-temps:
+
+Keeping temporary files after simulation
+........................................
+
+**Option** ``smpi/keep-temps`` **default:** 0 (false)
+
+SMPI usually generates a lot of temporary files that are cleaned after
+use. This option requests to preserve them, for example to debug or
+profile your code. Indeed, the binary files are removed very early
+under the dlopen privatization scheme, which tends to fool the
+debuggers.
+
+.. _cfg=smpi/lat-factor:
+
+Latency factors
+...............
+
+**Option** ``smpi/lat-factor`` |br|
+**default:** 65472:11.6436;15424:3.48845;9376:2.59299;5776:2.18796;3484:1.88101;1426:1.61075;732:1.9503;257:1.95341;0:2.01467
+
+The motivation and syntax for this option are identical to the
+motivation/syntax of :ref:`cfg=smpi/bw-factor`.
+
+There is an important difference, though: While smpi/bw-factor `reduces` the
+actual bandwidth (i.e., values between 0 and 1 are valid), latency factors
+increase the latency, i.e., values larger than or equal to 1 are valid here.
+
+.. _cfg=smpi/papi-events:
+
+Trace hardware counters with PAPI
+.................................
+
+**Option** ``smpi/papi-events`` **default:** unset
+
+When PAPI support is compiled into SimGrid, this option takes the
+names of PAPI counters and adds their respective values to the trace
+files (see Section :ref:`tracing_tracing_options`).
+
+.. warning::
+
+   This feature currently requires superuser privileges, as registers
+   are queried. Only use this feature with code you trust! Call
+   smpirun for instance via ``smpirun -wrapper "sudo "
+   `` or run ``sudo sh -c "echo 0 >
+   /proc/sys/kernel/perf_event_paranoid"``. In the latter case, sudo
+   will not be required.
+
+It is planned to make this feature available on a per-process (or per-thread?) basis.
+The first draft, however, just implements a "global" (i.e., for all processes) set
+of counters, the "default" set.
+
+.. code-block:: shell
+
+   --cfg=smpi/papi-events:"default:PAPI_L3_LDM:PAPI_L2_LDM"
+
+.. _cfg=smpi/privatization:
+
+Automatic Privatization of Global Variables
+...........................................
+
+**Option** ``smpi/privatization`` **default:** "dlopen" (when using smpirun)
+
+MPI executables are usually meant to be executed in separate
+processes, but SMPI is executed in only one process. Global variables
+from executables will be placed in the same memory zone and shared
+between processes, causing intricate bugs.
Several options are
+possible to avoid this, as described in the main `SMPI publication
+`_ and in the :ref:`SMPI
+documentation `. SimGrid provides two ways of
+automatically privatizing the globals, and this option allows you to
+choose between them.
+
+  - **no** (default when not using smpirun): Do not automatically
+    privatize variables. Pass ``-no-privatize`` to smpirun to disable
+    this feature.
+  - **dlopen** or **yes** (default when using smpirun): Link multiple
+    times against the binary.
+  - **mmap** (slower, but maybe somewhat more stable):
+    Runtime automatic switching of the data segments.
+
+.. warning::
+   This configuration option cannot be set in your platform file. You can only
+   pass it as an argument to smpirun.
+
+.. _cfg=smpi/privatize-libs:
+
+Automatic privatization of global variables inside external libraries
+.....................................................................
+
+**Option** ``smpi/privatize-libs`` **default:** unset
+
+**Linux/BSD only:** When using dlopen (default) privatization,
+privatize specific shared libraries with internal global variables, if
+they can't be linked statically. For example libgfortran is usually
+used for Fortran I/O and indexes in files can be mixed up.
+
+Multiple libraries can be given, semicolon separated.
+
+This configuration option can only use either full paths to libraries,
+or full names. Check the name of the library you want to use with ldd.
+Example:
+
+.. code-block:: shell
+
+   ldd allpairf90
+      ...
+      libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 (0x00007fbb4d91b000)
+      ...
+
+Then you can use ``--cfg=smpi/privatize-libs:libgfortran.so.3``
+or ``--cfg=smpi/privatize-libs:/usr/lib/x86_64-linux-gnu/libgfortran.so.3``,
+but not ``libgfortran`` nor ``libgfortran.so``.
+
+.. _cfg=smpi/send-is-detached-thresh:
+
+Simulating MPI detached send
+............................
+
+**Option** ``smpi/send-is-detached-thresh`` **default:** 65536
+
+This threshold specifies the size in bytes under which the send will
+return immediately. This is different from the threshold detailed in
+:ref:`options_model_network_asyncsend` because the message is not
+effectively sent when the send is posted. SMPI still waits for the
+corresponding receive to be posted to perform the communication
+operation.
+
+.. _cfg=smpi/coll-selector:
+
+Simulating MPI collective algorithms
+....................................
+
+**Option** ``smpi/coll-selector`` **Possible values:** naive (default), ompi, mpich
+
+SMPI implements more than 100 different algorithms for MPI collective
+communication, to accurately simulate the behavior of most of the
+existing MPI libraries. The ``smpi/coll-selector`` item can be used to
+use the decision logic of either OpenMPI or MPICH libraries (by
+default SMPI uses a naive version of the collective operations).
+
+Each collective operation can be manually selected with
+``smpi/collective_name:algo_name``. Available algorithms are listed in
+:ref:`SMPI_use_colls`.
+
+.. _cfg=smpi/iprobe:
+
+Inject constant times for MPI_Iprobe
+....................................
+
+**Option** ``smpi/iprobe`` **default:** 0.0001
+
+The behavior and motivation for this configuration option is identical
+to :ref:`smpi/test <cfg=smpi/test>`, but for the function
+``MPI_Iprobe()``.
+
+.. _cfg=smpi/iprobe-cpu-usage:
+
+Reduce speed for iprobe calls
+.............................
+
+**Option** ``smpi/iprobe-cpu-usage`` **default:** 1 (no change)
+
+MPI_Iprobe calls can be heavily used in applications. To account
+correctly for the energy that cores spend probing, it is necessary to
+reduce the load that these calls cause inside SimGrid.
+
+For instance, we measured a max power consumption of 220 W for a
+particular application but only 180 W while this application was
+probing. Hence, the correct factor that should be passed to this
+option would be 180/220 = 0.81.
+
+.. 
_cfg=smpi/init:
+
+Inject constant times for MPI_Init
+..................................
+
+**Option** ``smpi/init`` **default:** 0
+
+The behavior and motivation for this configuration option is identical
+to :ref:`smpi/test <cfg=smpi/test>`, but for the function ``MPI_Init()``.
+
+.. _cfg=smpi/ois:
+
+Inject constant times for MPI_Isend()
+.....................................
+
+**Option** ``smpi/ois``
+
+The behavior and motivation for this configuration option is identical
+to :ref:`smpi/os <cfg=smpi/os>`, but for the function ``MPI_Isend()``.
+
+.. _cfg=smpi/os:
+
+Inject constant times for MPI_Send()
+....................................
+
+**Option** ``smpi/os``
+
+In several network models such as LogP, send (MPI_Send, MPI_Isend) and
+receive (MPI_Recv) operations incur costs (i.e., they consume CPU
+time). SMPI can factor these costs in as well, but the user has to
+configure SMPI accordingly as these values may vary by machine. This
+can be done by using ``smpi/os`` for MPI_Send operations; for MPI_Isend
+and MPI_Recv, use ``smpi/ois`` and ``smpi/or``, respectively. These work
+exactly as ``smpi/os``.
+
+This item can consist of multiple sections; each section takes three
+values, for example ``1:3:2;10:5:1``. The sections are divided by ";"
+so this example contains two sections. Furthermore, each section
+consists of three values.
+
+1. The first value denotes the minimum size for this section to take effect;
+   read it as "if the message size is greater than this value (and no other
+   section has a larger first value that is still smaller than the message
+   size), use this section". In the first section above, this value is "1".
+
+2. The second value is the startup time; this is a constant value that will always
+   be charged, no matter what the size of the message. In the first section above,
+   this value is "3".
+
+3. The third value is the `per-byte` cost.
That is, it is charged for every
+   byte of the message (incurring cost messageSize*cost_per_byte)
+   and hence accounts also for larger messages. In the first
+   section of the example above, this value is "2".
+
+Now, SMPI always checks which section it should take for a given
+message; that is, if a message of size 11 is sent with the
+configuration of the example above, only the second section will be
+used, not the first, as the first value of the second section is
+closer to the message size. Hence, when ``smpi/os=1:3:2;10:5:1``, a
+message of size 11 incurs the following cost inside MPI_Send:
+``5+11*1`` because 5 is the startup cost and 1 is the cost per byte.
+
+Note that the order of sections can be arbitrary; they will be ordered internally.
+
+.. _cfg=smpi/or:
+
+Inject constant times for MPI_Recv()
+....................................
+
+**Option** ``smpi/or``
+
+The behavior and motivation for this configuration option is identical
+to :ref:`smpi/os <cfg=smpi/os>`, but for the function ``MPI_Recv()``.
+
+.. _cfg=smpi/test:
+.. _cfg=smpi/grow-injected-times:
+
+Inject constant times for MPI_Test
+..................................
+
+**Option** ``smpi/test`` **default:** 0.0001
+
+By setting this option, you can control the amount of time a process
+sleeps when MPI_Test() is called; this is important, because SimGrid
+normally only advances the time while communication is happening and
+thus, MPI_Test will not add to the time, resulting in a deadlock if
+used as a break condition, as in the following example:
+
+.. code-block:: cpp
+
+   while(!flag) {
+      MPI_Test(request, flag, status);
+      ...
+   }
+
+To speed up execution, we use a counter to keep track of how often we
+already checked if the handle is now valid or not. Hence, we actually
+use counter*SLEEP_TIME, that is, the time MPI_Test() causes the
+process to sleep increases linearly with the number of previously
+failed tests. This behavior can be disabled by setting
+``smpi/grow-injected-times`` to **no**.
This will also disable this
+behavior for MPI_Iprobe.
+
+.. _cfg=smpi/shared-malloc:
+.. _cfg=smpi/shared-malloc-hugepage:
+
+Factorize malloc()s
+...................
+
+**Option** ``smpi/shared-malloc`` **Possible values:** global (default), local
+
+If your simulation consumes too much memory, you may want to modify
+your code so that the working areas are shared by all MPI ranks. For
+example, in a block-cyclic matrix multiplication, you will only
+allocate one set of blocks, and every process will share them.
+Naturally, this will lead to very wrong results, but this will save a
+lot of memory so this is still desirable for some studies. For more on
+the motivation for that feature, please refer to the `relevant section
+`_
+of the SMPI CourseWare (see Activity #2.2 of the pointed
+assignment). In practice, change the calls to malloc() and free() into
+SMPI_SHARED_MALLOC() and SMPI_SHARED_FREE().
+
+SMPI provides two algorithms for this feature. The first one, called
+``local``, allocates one block per call to SMPI_SHARED_MALLOC() in your
+code (each call location gets its own block) and this block is shared
+amongst all MPI ranks. This is implemented with the shm_* functions
+to create a new POSIX shared memory object (kept in RAM, in /dev/shm)
+for each shared block.
+
+With the ``global`` algorithm, each call to SMPI_SHARED_MALLOC()
+returns a new address, but it only points to a shadow block: its memory
+area is mapped on a 1 MiB file on disk. If the returned block is of size
+N MiB, then the same file is mapped N times to cover the whole block.
+At the end, no matter how many calls to SMPI_SHARED_MALLOC() you do, this
+will only consume 1 MiB in memory.
+
+You can disable this behavior and come back to regular mallocs (for
+example for debugging purposes) using ``no`` as a value.
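As a sketch of how this setting is typically passed on the command line (the application name and rank count below are made up for illustration, not taken from this document):

```shell
# Hypothetical MPI application and rank count, shown for illustration.
# Fold all SMPI_SHARED_MALLOC'ed areas onto a single 1 MiB file (default):
smpirun -np 16 --cfg=smpi/shared-malloc:global ./my_mpi_app
# Debugging run that falls back to regular, private allocations:
smpirun -np 16 --cfg=smpi/shared-malloc:no ./my_mpi_app
```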
+
+If you want to keep some parts of the buffer private, for instance if these
+parts are used by the application logic and should not be corrupted, you
+can use SMPI_PARTIAL_SHARED_MALLOC(size, offsets, offsets_count). Example:
+
+.. code-block:: cpp
+
+   mem = SMPI_PARTIAL_SHARED_MALLOC(500, {27,42, 100,200}, 2);
+
+This will allocate 500 bytes to mem, such that mem[27..41] and
+mem[100..199] are shared while the other areas remain private.
+
+Then, it can be deallocated by calling SMPI_SHARED_FREE(mem).
+
+When smpi/shared-malloc:global is used, the memory consumption problem
+is solved, but it may induce too much load on the kernel's page table.
+In this case, you should use huge pages so that we create only one
+entry per MiB of malloc'ed data instead of one entry per 4 KiB.
+To activate this, you must mount a hugetlbfs on your system and allocate
+at least one huge page:
+
+.. code-block:: shell
+
+   mkdir /home/huge
+   sudo mount none /home/huge -t hugetlbfs -o rw,mode=0777
+   sudo sh -c 'echo 1 > /proc/sys/vm/nr_hugepages' # echo more if you need more
+
+Then, you can pass the option
+``--cfg=smpi/shared-malloc-hugepage:/home/huge`` to smpirun to
+actually activate the huge page support in shared mallocs.
+
+.. _cfg=smpi/wtime:
+
+Inject constant times for MPI_Wtime, gettimeofday and clock_gettime
+...................................................................
+
+**Option** ``smpi/wtime`` **default:** 10 ns
+
+This option controls the amount of (simulated) time spent in calls to
+MPI_Wtime(), gettimeofday() and clock_gettime(). If you set this value
+to 0, the simulated clock is not advanced in these calls, which leads
+to issues if your application contains such a loop:
+
+.. 
code-block:: cpp
+
+   while(MPI_Wtime() < some_time_bound) {
+        /* some tests, with no communication nor computation */
+   }
+
+When the option smpi/wtime is set to 0, the time advances only on
+communications and computations, so the previous code results in an
+infinite loop: the current [simulated] time will never reach
+``some_time_bound``. This infinite loop is avoided when that option
+is set to a small amount, as it is by default since SimGrid v3.21.
+
+Note that if your application does not contain any loop depending on
+the current time only, then setting this option to a non-zero value
+will slow down your simulations by a tiny bit: the simulation loop has
+to be broken and reset each time your code asks for the current time.
+If the simulation speed really matters to you, you can avoid this
+extra delay by setting smpi/wtime to 0.
+
+Other Configurations
+--------------------
+
+.. _cfg=clean-atexit:
+
+Cleanup at Termination
+......................
+
+**Option** ``clean-atexit`` **default:** on
+
+If your code is segfaulting during its finalization, it may help to
+disable this option to request that SimGrid not attempt any cleanups at
+the end of the simulation. Since the Unix process is ending anyway,
+the operating system will wipe it all.
+
+.. _cfg=path:
+
+Search Path
+...........
+
+**Option** ``path`` **default:** . (current dir)
+
+It is possible to specify a list of directories to search for the
+trace files (see :ref:`pf_trace`) by using this configuration
+item. To add several directories to the path, set the configuration
+item several times, as in ``--cfg=path:toto --cfg=path:tutu``
+
+.. _cfg=simix/breakpoint:
+
+Set a Breakpoint
+................
+
+**Option** ``simix/breakpoint`` **default:** unset
+
+This configuration option sets a breakpoint: when the simulated clock
+reaches the given time, a SIGTRAP is raised. This can be used to stop
+the execution and get a backtrace with a debugger.
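For instance, a debugging session could combine this option with gdb as follows. The simulator binary, input files and time value are placeholders:

```shell
# Placeholder binary and arguments; SIGTRAP fires when the simulated
# clock reaches 3.1416, and gdb then stops with a usable backtrace.
gdb --args ./my_simulator platform.xml deployment.xml \
    --cfg=simix/breakpoint:3.1416
# inside gdb: type "run", then "backtrace" once the trap is hit
```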
+
+It is also possible to set the breakpoint from inside the debugger, by
+writing in global variable simgrid::simix::breakpoint. For example,
+with gdb:
+
+.. code-block:: shell
+
+   set variable simgrid::simix::breakpoint = 3.1416
+
+.. _cfg=verbose-exit:
+
+Behavior on Ctrl-C
+..................
+
+**Option** ``verbose-exit`` **default:** on
+
+By default, when Ctrl-C is pressed, the status of all existing actors
+is displayed before exiting the simulation. This is very useful to
+debug your code, but it can prove troublesome if you have many
+actors. Set this configuration item to **off** to disable this
+feature.
+
+.. _cfg=exception/cutpath:
+
+Truncate local path from exception backtrace
+............................................
+
+**Option** ``exception/cutpath`` **default:** off
+
+This configuration option is used to remove the path from the
+backtrace shown when an exception is thrown. This is mainly useful for
+the tests: the full file path makes the tests not reproducible because
+the path of source files depends on the build settings. That would
+break most of our tests, as we keep comparing outputs.
+
+Logging Configuration
+---------------------
+
+Logging is handled by XBT. Go to :ref:`XBT_log` for more details.
+
+.. |br| raw:: html
+
+   <br/>
diff --git a/docs/source/scenario.rst b/docs/source/scenario.rst index 43d927a970..692f8c1463 100644 --- a/docs/source/scenario.rst +++ b/docs/source/scenario.rst @@ -12,7 +12,10 @@ Describing the Experimental Scenario - Reproducible random number generation - Command line options, in particular on the model switching -.. include:: scenar_config.rst +.. toctree:: + :hidden: + + Configuring SimGrid diff --git a/tools/cmake/DefinePackages.cmake b/tools/cmake/DefinePackages.cmake index c393933509..655a240801 100644 --- a/tools/cmake/DefinePackages.cmake +++ b/tools/cmake/DefinePackages.cmake @@ -894,7 +894,6 @@ set(DOC_SOURCES doc/doxygen/module-xbt.doc doc/doxygen/module-index.doc doc/doxygen/ns3.doc - doc/doxygen/options.doc doc/doxygen/outcomes.doc doc/doxygen/outcomes_logs.doc doc/doxygen/outcomes_MC.doc -- 2.20.1