From d391482c2128603c316d908fa745c4daa051ac48 Mon Sep 17 00:00:00 2001
From: Martin Quinson
Date: Mon, 22 May 2017 15:01:07 +0200
Subject: [PATCH] tiny updates to the doc of huge pages

---
 ChangeLog               |  1 +
 doc/doxygen/options.doc | 21 +++++++++++----------
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 6ca3d3e3a8..b388b9dc73 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -27,6 +27,7 @@ SimGrid (3.16) UNRELEASED
  SMPI
  - New algorithm to privatize globals: dlopen, with dynamic loading tricks
  - New option: smpi/keep-temps to not cleanup temp files
+ - Support for sparse privatized malloc with SMPI_PARTIAL_SHARED_MALLOC()
 
  XBT
  - Replay: New function xbt_replay_action_get():
diff --git a/doc/doxygen/options.doc b/doc/doxygen/options.doc
index d5a8f54d65..3fad32cb85 100644
--- a/doc/doxygen/options.doc
+++ b/doc/doxygen/options.doc
@@ -1110,7 +1110,7 @@ returns a new adress, but it only points to a shadow bloc: its memory
 area is mapped on a 1MiB file on disk. If the returned bloc is of size
 N MiB, then the same file is mapped N times to cover the whole bloc.
 At the end, no matter how many SMPI_SHARED_MALLOC you do, this will
-only consume 1 MiB in memory.
+only consume 1 MiB in memory.
 
 You can disable this behavior and come back to regular mallocs (for
 example for debugging purposes) using \c "no" as a value.
@@ -1121,7 +1121,7 @@ can use SMPI_PARTIAL_SHARED_MALLOC(size, offsets, offsets_count).
 
 As an example,
 
-\code{.unparsed}
+\code{.C}
  mem = SMPI_PARTIAL_SHARED_MALLOC(500, {27,42 , 100,200}, 2);
 \endcode
 
@@ -1130,21 +1130,21 @@ are shared and other area remain private.
 
 Then, it can be deallocated by calling SMPI_SHARED_FREE(mem).
 
-When smpi/shared-malloc:global is used, it is possible to optimize even
-further the memory consumption and the simulation time by using huge pages.
-To do so, it is required to mount a hugetlbfs on your system and allocate
+When smpi/shared-malloc:global is used, the memory consumption problem
+is solved, but it may induce too much load on the kernel's pages table.
+In this case, you should use huge pages so that we create only one
+entry per Mb of malloced data instead of one entry per 4k.
+To activate this, you must mount a hugetlbfs on your system and allocate
 at least one huge page:
 
-\code{.unparsed}
+\code{.sh}
  mkdir /home/huge
  sudo mount none /home/huge -t hugetlbfs -o rw,mode=0777
- sudo sh -c 'echo 1 > /proc/sys/vm/nr_hugepages'
+ sudo sh -c 'echo 1 > /proc/sys/vm/nr_hugepages' # echo more if you need more
 \endcode
 
-Note that at least one huge page is required for smpirun, but you can set
-an higher number of huge pages in nr_hugepages (e.g. "echo 27").
 Then, you can pass the option --cfg=smpi/shared-malloc-hugepage:/home/huge
-to smpirun.
+to smpirun to actually activate the huge page support in shared mallocs.
 
 \subsection options_model_smpi_wtime smpi/wtime: Inject constant times for calls to MPI_Wtime
 
@@ -1304,6 +1304,7 @@ It can be done by using XBT. Go to \ref XBT_log for more details.
 - \c smpi/privatization: \ref options_smpi_privatization
 - \c smpi/send-is-detached-thresh: \ref options_model_smpi_detached
 - \c smpi/shared-malloc: \ref options_model_smpi_shared_malloc
+- \c smpi/shared-malloc-hugepage: \ref options_model_smpi_shared_malloc
 - \c smpi/simulate-computation: \ref options_smpi_bench
 - \c smpi/test: \ref options_model_smpi_test
 - \c smpi/wtime: \ref options_model_smpi_wtime
-- 
2.20.1