X-Git-Url: http://info.iut-bm.univ-fcomte.fr/pub/gitweb/simgrid.git/blobdiff_plain/cd44215340094b81407df96650bf5fd17854623b..29d98d1ceb682fbc4c734a92353be4b0bcd5d17b:/doc/doxygen/options.doc?ds=sidebyside diff --git a/doc/doxygen/options.doc b/doc/doxygen/options.doc index 4b16add364..6a94beb730 100644 --- a/doc/doxygen/options.doc +++ b/doc/doxygen/options.doc @@ -850,6 +850,16 @@ to 1, \c smpirun will display this information when the simulation ends. \verbat Simulation time: 1e3 seconds. \endverbatim +\subsection options_smpi_temps smpi/keep-temps: not cleaning up after simulation + +\b Default: 0 (false) + +Under some conditions, SMPI generates a lot of temporary files. They +usually get cleaned, but you may use this option to not erase these +files. This is for example useful when debugging or profiling +executions using the dlopen privatization schema, as missing binary +files tend to fool the debuggers. + \subsection options_model_smpi_lat_factor smpi/lat-factor: Latency factors The motivation and syntax for this option is identical to the motivation/syntax @@ -895,40 +905,56 @@ of counters, the "default" set. --cfg=smpi/papi-events:"default:PAPI_L3_LDM:PAPI_L2_LDM" \endverbatim -\subsection options_smpi_global smpi/privatize-global-variables: Automatic privatization of global variables +\subsection options_smpi_privatization smpi/privatization: Automatic privatization of global variables -MPI executables are meant to be executed in separated processes, but SMPI is +MPI executables are usually meant to be executed in separated processes, but SMPI is executed in only one process. Global variables from executables will be placed -in the same memory zone and shared between processes, causing hard to find bugs. 
-To avoid this, several options are possible : - Manual edition of the code, for example to add __thread keyword before data - declaration, which allows the resulting code to work with SMPI, but only - if the thread factory (see \ref options_virt_factory) is used, as global - variables are then placed in the TLS (thread local storage) segment. - Source-to-source transformation, to add a level of indirection - to the global variables. SMPI does this for F77 codes compiled with smpiff, - and used to provide coccinelle scripts for C codes, which are not functional anymore. - Compilation pass, to have the compiler automatically put the data in - an adapted zone. - Runtime automatic switching of the data segments. SMPI stores a copy of - each global data segment for each process, and at each context switch replaces - the actual data with its copy from the right process. This mechanism uses mmap, - and is for now limited to systems supporting this functionnality (all Linux - and some BSD should be compatible). - Another limitation is that SMPI only accounts for global variables defined in - the executable. If the processes use external global variables from dynamic - libraries, they won't be switched correctly. To avoid this, using static - linking is advised (but not with the simgrid library, to avoid replicating - its own global variables). - - To use this runtime automatic switching, the variable \b smpi/privatize-global-variables - should be set to yes +in the same memory zone and shared between processes, causing intricate bugs. +Several options are possible to avoid this, as described in the main +SMPI publication. +SimGrid provides two ways of automatically privatizing the globals, +and this option lets you choose between them. + + - no (default): Do not automatically privatize variables. 
+ - mmap or yes: Runtime automatic switching of the data segments.\n + SMPI stores a copy of each global data segment for each process, + and at each context switch replaces the actual data with its copy + from the right process. No copy actually occurs as this mechanism + uses mmap for efficiency. As such, it is for now limited to + systems supporting this functionality (all Linux and most BSD).\n + Another limitation is that SMPI only accounts for global variables + defined in the executable. If the processes use external global + variables from dynamic libraries, they won't be switched + correctly. The easiest way to solve this is to statically link + against the library with these globals (but you should never + statically link against the simgrid library itself). + - dlopen: Link multiple times against the binary.\n + SMPI loads several copies of the same binary in memory, resulting in + the natural duplication of global variables. Since the dynamic linker + refuses to link the same file several times, the binary is copied + in a temporary file before being dl-loaded (it is erased right + after loading).\n + Note that this feature is somewhat experimental at time of writing + (v3.16) but seems to work.\n + This approach greatly speeds up the context switching, down to + about 40 CPU cycles with our raw contexts, instead of requiring + several syscalls with the \c mmap approach. Another advantage is + that it permits running the SMPI contexts in parallel, which is + obviously not possible with the \c mmap approach.\n + Further work may be possible to alleviate the memory and disk + overconsumption. It seems that we could + punch holes + in the files before dl-loading them to remove the code and + constants, and mmap these areas onto a unique copy. This requires + understanding the ELF layout of the file, but would + reduce the disk- and memory- usage to the bare minimum. 
In + addition, this would reduce the pressure on the CPU caches (in + particular on the instruction cache). \warning This configuration option cannot be set in your platform file. You can only pass it as an argument to smpirun. - \subsection options_model_smpi_detached Simulating MPI detached send This threshold specifies the size in bytes under which the send will return @@ -952,6 +978,18 @@ uses naive version of collective operations). Each collective operation can be m The behavior and motivation for this configuration option is identical with \a smpi/test, see Section \ref options_model_smpi_test for details. +\subsection options_model_smpi_iprobe_cpu_usage smpi/iprobe-cpu-usage: Reduce speed for iprobe calls + +\b Default value: 1 (no change from default behavior) + +MPI_Iprobe calls can be heavily used in applications. To account correctly for the energy +that cores spend probing, it is necessary to reduce the load that these calls cause inside +SimGrid. + +For instance, we measured a max power consumption of 220 W for a particular application but +only 180 W while this application was probing. Hence, the correct factor that should +be passed to this option would be 180/220 ≈ 0.82. + \subsection options_model_smpi_init smpi/init: Inject constant times for calls to MPI_Init \b Default value: 0 @@ -1072,11 +1110,42 @@ returns a new address, but it only points to a shadow block: its memory area is mapped on a 1MiB file on disk. If the returned block is of size N MiB, then the same file is mapped N times to cover the whole block. At the end, no matter how many SMPI_SHARED_MALLOC you do, this will -only consume 1 MiB in memory. +only consume 1 MiB in memory. You can disable this behavior and come back to regular mallocs (for example for debugging purposes) using \c "no" as a value. 
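The "same file mapped N times" idea described above can be sketched in plain C. This is only an illustration under stated assumptions, not SMPI's actual implementation: the helper name \c shadow_alloc, the \c BLOCK_SIZE constant and the temporary file location are all hypothetical, and the code assumes a Linux-like system providing \c mmap with \c MAP_FIXED.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define BLOCK_SIZE (1UL << 20) /* 1 MiB backing file, as in the text above */

/* Hypothetical helper (not an SMPI function): returns a buffer of
 * n_blocks MiB in which every 1 MiB slice is backed by the same
 * 1 MiB temporary file, so a write to one slice is visible in all. */
void *shadow_alloc(size_t n_blocks)
{
  assert(n_blocks > 0);

  char path[] = "/tmp/shadow-XXXXXX"; /* assumed location for the demo */
  int fd = mkstemp(path);
  if (fd < 0)
    return NULL;
  unlink(path); /* the file lives on until its last mapping is gone */
  if (ftruncate(fd, BLOCK_SIZE) != 0) {
    close(fd);
    return NULL;
  }

  /* Reserve one contiguous range of virtual addresses first. */
  char *base = mmap(NULL, n_blocks * BLOCK_SIZE, PROT_NONE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (base == MAP_FAILED) {
    close(fd);
    return NULL;
  }

  /* Map the same file, at offset 0, over each 1 MiB slice. */
  for (size_t i = 0; i < n_blocks; i++) {
    void *p = mmap(base + i * BLOCK_SIZE, BLOCK_SIZE,
                   PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd, 0);
    if (p == MAP_FAILED) {
      munmap(base, n_blocks * BLOCK_SIZE);
      close(fd);
      return NULL;
    }
  }
  close(fd); /* the mappings stay valid after closing the descriptor */
  return base;
}
```

With this sketch, a byte written at offset 10 of a 4 MiB buffer can be read back at offsets 10 + 1 MiB, 10 + 2 MiB and 10 + 3 MiB, since all four slices share the same pages. This is why the contents of such a buffer are meaningless to the application, and why the resident memory stays around 1 MiB whatever the requested size.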
+If you want to keep some parts of the buffer private, for instance if these +parts are used by the application logic and should not be corrupted, you +can use SMPI_PARTIAL_SHARED_MALLOC(size, offsets, offsets_count). + +As an example, + +\code{.C} + mem = SMPI_PARTIAL_SHARED_MALLOC(500, {27,42 , 100,200}, 2); +\endcode + +will allocate 500 bytes to mem, such that mem[27..41] and mem[100..199] +are shared and the other areas remain private. + +Then, it can be deallocated by calling SMPI_SHARED_FREE(mem). + +When smpi/shared-malloc:global is used, the memory consumption problem +is solved, but it may induce too much load on the kernel's page table. +In this case, you should use huge pages so that we create only one +entry per MB of malloced data instead of one entry per 4 kB. +To activate this, you must mount a hugetlbfs on your system and allocate +at least one huge page: + +\code{.sh} + mkdir /home/huge + sudo mount none /home/huge -t hugetlbfs -o rw,mode=0777 + sudo sh -c 'echo 1 > /proc/sys/vm/nr_hugepages' # echo more if you need more +\endcode + +Then, you can pass the option --cfg=smpi/shared-malloc-hugepage:/home/huge +to smpirun to actually activate the huge page support in shared mallocs. + \subsection options_model_smpi_wtime smpi/wtime: Inject constant times for calls to MPI_Wtime \b Default value: 0 @@ -1149,13 +1218,8 @@ It can be done by using XBT. Go to \ref XBT_log for more details. \section options_index Index of all existing configuration options \note - Almost all options are defined in src/simgrid/sg_config.c. You may - want to check this file, too, but this index should be somewhat complete - for the moment (May 2015). - -\note - \b Please \b note: You can also pass the command-line option "--help" and - "--help-cfg" to an executable that uses simgrid. + The full list can be retrieved by passing "--help" and + "--help-cfg" to an executable that uses SimGrid. 
- \c clean-atexit: \ref options_generic_clean_atexit @@ -1224,15 +1288,18 @@ It can be done by using XBT. Go to \ref XBT_log for more details. - \c smpi/host-speed: \ref options_smpi_bench - \c smpi/IB-penalty-factors: \ref options_model_network_coefs - \c smpi/iprobe: \ref options_model_smpi_iprobe +- \c smpi/iprobe-cpu-usage: \ref options_model_smpi_iprobe_cpu_usage - \c smpi/init: \ref options_model_smpi_init +- \c smpi/keep-temps: \ref options_smpi_temps - \c smpi/lat-factor: \ref options_model_smpi_lat_factor - \c smpi/ois: \ref options_model_smpi_ois - \c smpi/or: \ref options_model_smpi_or - \c smpi/os: \ref options_model_smpi_os - \c smpi/papi-events: \ref options_smpi_papi_events -- \c smpi/privatize-global-variables: \ref options_smpi_global +- \c smpi/privatization: \ref options_smpi_privatization - \c smpi/send-is-detached-thresh: \ref options_model_smpi_detached - \c smpi/shared-malloc: \ref options_model_smpi_shared_malloc +- \c smpi/shared-malloc-hugepage: \ref options_model_smpi_shared_malloc - \c smpi/simulate-computation: \ref options_smpi_bench - \c smpi/test: \ref options_model_smpi_test - \c smpi/wtime: \ref options_model_smpi_wtime