X-Git-Url: http://info.iut-bm.univ-fcomte.fr/pub/gitweb/simgrid.git/blobdiff_plain/3ddaab8765c7fe9f1aa6cb388df73ff237fc9071..e37c35c004577bfa892819392ab46ef242db6326:/doc/FAQ.doc

diff --git a/doc/FAQ.doc b/doc/FAQ.doc
index 5856a6122f..05f8fb5cdd 100644
--- a/doc/FAQ.doc
+++ b/doc/FAQ.doc
@@ -680,7 +680,7 @@ of processes in your simulations.
      result in *fat* simulation hindering debugging.
    - It was really boring to write 25,000 entries in the deployment
      file, so I wrote a little script
-     examples/gras/tokenS/make_deployment.pl, which you may
+     examples/gras/mutual_exclusion/simple_token/make_deployment.pl, which you may
      want to adapt to your case. You could also think about hijacking
      the SURFXML parser (have look at \ref faq_flexml_bypassing).
    - The deployment file became quite big, so I had to do what is in
@@ -690,10 +690,24 @@ of processes in your simulations.
      user don't get into trouble about this. You want to tune this
      size to increse the number of processes. This is the
      STACK_SIZE define in
-     src/xbt/context_private.h, which is 128kb by default.
+     src/xbt/xbt_context_sysv.c, which is 128kb by default.
      Reduce this as much as you can, but be warned that if this value
      is too low, you'll get a segfault. The token ring example, which
-     is quite simple, runs with 40kb stacks.
+     is quite simple, runs with 40kb stacks.
+   - You may tweak the logs to reduce the stack size further. When
+     logging something, we try to build the string to display in a
+     char array on the stack. The size of this array is constant (and
+     equal to XBT_LOG_BUFF_SIZE, defined in include/xbt/log/h). If the
+     string is too large to fit this buffer, we move to a dynamically
+     sized buffer. In which case, we have to traverse one time the log
+     event arguments to compute the size we need for the buffer,
+     malloc it, and traverse the argument list again to do the actual
+     job.\n
+     The idea here is to move XBT_LOG_BUFF_SIZE to 1, forcing the logs
+     to use a dynamic array each time. This allows us to lower further
+     the stack size at the price of some performance loss...\n
+     This allowed me to run the reduce the stack size to ... 4k. Ie,
+     on my 1Gb laptop, I can run more than 250,000 processes!
 
 \subsubsection faq_MIA_batch_scheduler Is there a native support for batch schedulers in SimGrid?
 
@@ -1117,47 +1131,26 @@ reason:
 
 \subsubsection faq_trouble_errors_big_fat_warning I'm told that my XML files are too old.
 
-We have decided to change the units in SimGrid. Now we use Bytes, Flops and
-seconds instead of MBytes, MFlops and seconds... Units should be updated
-accordingly and the version of platform_description should be set to a
-valuer greater than 1:
+The format of the XML platform description files is sometimes
+improved. For example, we decided to change the units used in SimGrid
+from MBytes, MFlops and seconds to Bytes, Flops and seconds to ease
+people exchanging small messages. We also reworked the route
+descriptions to allow more compact descriptions.
+
+That is why the XML files are versionned using the 'version' attribute
+of the root tag. Currently, it should read:
 \verbatim
-
+
 \endverbatim
-You should try to use the surfxml_update.pl script that can be found
-here.
-\subsection faq_trouble_valgrind Valgrind-related issues
+
+If your files are too old, you can use the simgrid_update_xml.pl
+script which can be found in the tools directory of the archive.
+
+\subsection faq_trouble_valgrind Valgrind-related and other debugger issues
 
 If you don't, you really should use valgrind to debug your code, it's
 almost magic.
 
-\subsubsection faq_trouble_vg_context Stack switching problems and truncated backtraces
-
-With the default version of simgrid, valgrind will probably spit tons
-of warnings about stack switching like the following, and produce
-truncated bactraces where only one call appears instead of the whole
-stack.
-
-\verbatim
-==14908== Warning: client switching stacks? SP change: 0xBEA2A48C --> 0x476F350
-==14908== to suppress, use: --max-stackframe=1171541700 or greater
-==14908== Warning: client switching stacks? SP change: 0x476E1E4 --> 0xBEA2A48C
-==14908== to suppress, use: --max-stackframe=1171537240 or greater
-==14908== Warning: client switching stacks? SP change: 0xBEA2A48C --> 0x4792420
-==14908== to suppress, use: --max-stackframe=1171685268 or greater
-==14908== further instances of this message will not be shown.
-\endverbatim
-
-This is because valgrind don't like too much the UNIX98 contextes we
-use by default in simgrid for efficiency reasons. Simply add the
---with-pthread flag to your configure when debugging your code. You
-may also find --disable-compiler-optimization usefull if valgrind or
-gdb get fooled by the optimization done by the compiler. But you
-should remove these flages when everything works before going in
-production (before launching your 1252135 experiments), or everything
-will run only one third of the true SimGrid potential.
-
 \subsubsection faq_trouble_vg_longjmp longjmp madness in valgrind
 
 This is when valgrind starts complaining about longjmp things, just like:
@@ -1169,27 +1162,6 @@ This is when valgrind starts complaining about longjmp things, just like:
 ==21434== at 0x420DC3A: __longjmp (__longjmp.S:48)
 \endverbatim
 
-or even when it reports scary things like:
-
-\verbatim ==24023== Warning: client switching stacks? SP change: 0xBE3FF618 --> 0xBE7FF710
-x86->IR: unhandled instruction bytes: 0xF4 0xC7 0x83 0xD0
-==24023== to suppress, use: --max-stackframe=4194552 or greater
-==24023== Your program just tried to execute an instruction that Valgrind
-==24023== did not recognise. There are two possible reasons for this.
-==24023== 1. Your program has a bug and erroneously jumped to a non-code
-==24023== location. If you are running Memcheck and you just saw a
-==24023== warning about a bad jump, it's probably your program's fault.
-==24023== 2. The instruction is legitimate but Valgrind doesn't handle it,
-==24023== i.e. it's Valgrind's fault. If you think this is the case or
-==24023== you are not sure, please let us know.
-==24023== Either way, Valgrind will now raise a SIGILL signal which will
-==24023== probably kill your program.
-==24023==
-==24023== Process terminating with default action of signal 4 (SIGILL)
-==24023== Illegal opcode at address 0x420D234
-==24023== at 0x420D234: abort (abort.c:124)
-\endverbatim
-
 This is the sign that you didn't used the exception mecanism well. Most
 probably, you have a return; somewhere within a TRY{} block. This is
 evil, and you must not do this. Did you read the section
@@ -1241,6 +1213,15 @@ more information.
 \verbatim
 export VALGRIND_OPTS="--leak-check=yes --leak-resolution=high --num-callers=40 --tool=memcheck --suppressions=$HOME/.valgrind.supp"
 \endverbatim
+\subsubsection faq_trouble_backtraces Truncated backtraces
+
+When debugging SimGrid, it's easier to pass the
+--disable-compiler-optimization flag to the configure if valgrind or
+gdb get fooled by the optimization done by the compiler. But you
+should remove these flages when everything works before going in
+production (before launching your 1252135 experiments), or everything
+will run only one half of the true SimGrid potential.
+
 \subsection faq_deadlock There is a deadlock in my code!!!
 
 Unfortunately, we cannot debug every code written in SimGrid. We