X-Git-Url: http://info.iut-bm.univ-fcomte.fr/pub/gitweb/simgrid.git/blobdiff_plain/b401ba10cae8b375d11bee6d24b116936a91acb5..719db78a976b1fbbad9d6aeb3001ae40b1e14c26:/doc/FAQ.doc

diff --git a/doc/FAQ.doc b/doc/FAQ.doc
index de38773ac7..f59e1cf7cd 100644
--- a/doc/FAQ.doc
+++ b/doc/FAQ.doc
@@ -580,6 +580,63 @@ int receiver()
 
 \subsection faq_MIA_SimDag SimDag related questions
 
+\subsubsection faq_SG_comm Implementing communication delays between tasks.
+
+A classic question from SimDag newcomers is how to express a
+communication delay between tasks. The thing is that in SimDag, both
+computation and communication are seen as tasks. So, if you want to
+model a data dependency between two DAG tasks t1 and t2, you have to
+create 3 SD_tasks: t1, t2 and c and add dependencies in the following
+way:
+
+\verbatim
+SD_task_dependency_add(NULL, NULL, t1, c);
+SD_task_dependency_add(NULL, NULL, c, t2);
+\endverbatim
+
+This way task t2 cannot start before the termination of communication c
+which in turn cannot start before t1 ends.
+
+When creating task c, you have to associate an amount of data (in bytes)
+corresponding to what has to be sent by t1 to t2.
+
+Finally, to schedule the communication task c, you have to build a list
+comprising the workstations on which t1 and t2 are scheduled (w1 and w2
+for example) and build a communication matrix that should look like
+[0; amount; 0; 0].
+
+\subsubsection faq_SG_DAG How to implement a distributed dynamic scheduler of DAGs.
+
+Distribution is somehow "contagious". If you start making distributed
+decisions, there is no way to handle DAGs directly anymore (unless I
+am missing something). You have to encode your DAGs in terms of
+communicating processes to make the whole scheduling process
+distributed. Here is an example of how you could do that. Assume T1
+has to be done before T2.
+
+\verbatim
+ int your_agent(int argc, char *argv[]) {
+   ...
+   T1 = MSG_task_create(...);
+   T2 = MSG_task_create(...);
+   ...
+   while(1) {
+     ...
+     if(cond) MSG_task_execute(T1);
+     ...
+     if((MSG_task_get_remaining_computation(T1) == 0.0) && (you_re_in_a_good_mood))
+       MSG_task_execute(T2);
+     else {
+       /* do something else */
+     }
+   }
+ }
+\endverbatim
+
+If you decide that the distributed part is not that important and that
+DAG is really the level of abstraction you want to work with, then you should
+give \ref SD_API a try.
+
 \subsubsection faq_SG Where has SG disappeared?
 
 OK, it's time to explain what's happening to the SimGrid project. Let's
@@ -669,15 +726,15 @@ purpose: \ref SD_API. It is built directly on top of SURF and provides
 an API rather close to the old SG:
 
 \verbatim
-______________________
-|    User code       |
-|____________________|
-| | MSG | GRAS | SD  |
-| -------------------|
-| |      SURF        |
-| -------------------|
-|        XBT         |
-----------------------
+_________________________
+|       User code        |
+|________________________|
+| | MSG | GRAS | SimDag  |
+| -----------------------|
+| |      SURF            |
+| -----------------------|
+|        XBT             |
+--------------------------
 \endverbatim
 
 The nice thing is that, as it is writen on top of SURF, it seamlessly
@@ -685,38 +742,6 @@ support DAG of parallel tasks as well as complex communications
 patterns. Some old codes using SG are currently under rewrite using
 \ref SD_API to check that all needful functions are provided.
 
-\subsubsection faq_SG_DAG How to implement a distributed dynamic scheduler of DAGs.
-
-Distributed is somehow "contagious". If you start making distributed
-decisions, there is no way to handle DAGs directly anymore (unless I
-am missing something). You have to encode your DAGs in term of
-communicating process to make the whole scheduling process
-distributed. Here is an example of how you could do that. Assume T1
-has to be done before T2.
-
-\verbatim
- int your_agent(int argc, char *argv[] {
-   ...
-   T1 = MSG_task_create(...);
-   T2 = MSG_task_create(...);
-   ...
-   while(1) {
-     ...
-     if((MSG_task_get_remaining_computation(T1)=0.0) && (you_re_in_a_good_mood))
-       MSG_task_execute(T2)
-     else {
-       /* do something else */
-     }
-   }
- }
-\endverbatim
-
-If you decide that the distributed part is not that much important and that
-DAG is really the level of abstraction you want to work with, then you should
-give a try to \ref SD_API.
-
 \subsection faq_MIA_generic Generic features
 
 \subsubsection faq_more_processes Increasing the amount of simulated processes
@@ -882,6 +907,47 @@ latency_file and state_file. The only difference with CPUs is that
 bandwidth_file and latency_file do not express fraction of available
 power but are expressed directly in bytes per seconds and seconds.
 
+\subsubsection faq_platform_multipath Is it possible to have several paths between two given hosts?
+
+The answer is no, unfortunately. Let's consider the following platform
+file:
+
+\verbatim
+
+
+
+
+
+
+
+
+
+\endverbatim
+
+Although it is perfectly valid, it does not mean that data traveling
+from A to C can either go directly (using link 3) or through B (using
+links 1 and 2). It simply means that the routing on the graph is not
+trivial, and that data do not follow the shortest path in number of
+hops on this graph. Another way to say it is that nothing is implicit
+in these routing descriptions. The system will only use the routes you
+declare (such as <route src="A" dst="C"><route_element
+name="3"/></route>), without trying to build new routes by aggregating
+the provided ones.
+
+You are also free to declare platforms where the routing is not
+symmetric. For example, add the following to the previous file:
+
+\verbatim
+
+
+
+
+\endverbatim
+
+This makes sure that data from C to A go through B, whereas data from A
+to C go directly. Do not worry about the realism of such settings, since
+we have seen far weirder situations on real platforms.
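
The "declared routes only" rule in the added faq_platform_multipath text can be illustrated with a small lookup table. The following is a minimal sketch in plain C, not SURF's actual implementation: the names `struct route`, `declared_routes`, `find_route` and `route_length` are hypothetical, and the table contents are an assumption built from the hosts (A, B, C) and links (1, 2, 3) named in the prose, plus the asymmetric C to A route.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One explicitly declared route: an ordered, NULL-terminated list of link names. */
struct route {
    const char *src;
    const char *dst;
    const char *links[4];
};

/* Hypothetical table mirroring the FAQ prose (assumed contents, not the real file). */
static const struct route declared_routes[] = {
    { "A", "B", { "1", NULL } },
    { "B", "C", { "2", NULL } },
    { "A", "C", { "3", NULL } },       /* direct route, used for A -> C */
    { "C", "A", { "2", "1", NULL } },  /* asymmetric: C -> A goes through B */
};

/* Return the declared route for (src, dst), or NULL if none was declared.
 * No route is ever derived by chaining other routes together. */
const struct route *find_route(const char *src, const char *dst)
{
    size_t n = sizeof declared_routes / sizeof declared_routes[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(declared_routes[i].src, src) == 0 &&
            strcmp(declared_routes[i].dst, dst) == 0)
            return &declared_routes[i];
    }
    return NULL;
}

/* Number of links (hops) on a declared route. */
size_t route_length(const struct route *r)
{
    size_t len = 0;
    assert(r != NULL);
    while (len < 4 && r->links[len] != NULL)
        len++;
    return len;
}
```

With this table, `find_route("A", "C")` yields only link 3, even though A to B and B to C are both declared, and `find_route("C", "B")` yields nothing because that pair was never declared.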
+
 \subsubsection faq_flexml_bypassing Bypassing the XML parser with your own C functions
 
 So you want to bypass the XML files parser, uh? Maybe doin some parameter
@@ -954,9 +1020,9 @@ An example of this trick is distributed in the file examples/msg/msg_test_surfxm
 
 \section faq_troubleshooting Troubleshooting
 
-\subsection faq_trouble_compil Compilation and installation problems
+\subsection faq_trouble_lib_compil SimGrid compilation and installation problems
 
-\subsubsection faq_trouble_config ./configure fails!
+\subsubsection faq_trouble_lib_config ./configure fails!
 
 We now only one reason for the configure to fail:
 
@@ -996,7 +1062,7 @@ machine:
 that we can check it out. Make sure to read \ref faq_bugrepport before you
 do so.
 
-\subsection faq_trouble_errors Understanding error messages
+\subsection faq_trouble_compil User code compilation problems
 
 \subsubsection faq_trouble_err_logcat "gcc: _simgrid_this_log_category_does_not_exist__??? undeclared (first use in this function)"
 
@@ -1005,6 +1071,20 @@ any default category in this file. You should refer to \ref XBT_log for
 all the details, but you simply forgot to call one of
 XBT_LOG_NEW_DEFAULT_CATEGORY() or XBT_LOG_NEW_DEFAULT_SUBCATEGORY().
 
+\subsubsection faq_trouble_pthreadstatic "gcc: undefined reference to pthread_key_create"
+
+This indicates that one of the libraries SimGrid depends on (libpthread
+here) was missing on the linking command line. Dependencies of
+libsimgrid are expressed directly in the dynamic library, so you are
+very unlikely to see this message when doing dynamic linking.
+
+If you compile your code statically (and if you use a pthread version
+of SimGrid -- see \ref faq_more_processes), you must
+specify -lpthread on the linker command line. As usual, this should
+come after -lsimgrid on this command line.
+
+\subsection faq_trouble_errors Runtime error messages
+
 \subsubsection faq_flexml_limit "surf_parse_lex: Assertion `next limit' failed."
 
 This is because your platform file is too big for the parser.
@@ -1109,6 +1189,32 @@ You should try to use the surfxml_update.pl script that can be found
 If you don't, you really should use valgrind to debug your code, it's
 almost magic.
 
+\subsubsection faq_trouble_vg_context Stack switching problems and truncated backtraces
+
+With the default version of simgrid, valgrind will probably spit tons
+of warnings about stack switching like the following, and produce
+truncated backtraces where only one call appears instead of the whole
+stack.
+
+\verbatim
+==14908== Warning: client switching stacks? SP change: 0xBEA2A48C --> 0x476F350
+==14908== to suppress, use: --max-stackframe=1171541700 or greater
+==14908== Warning: client switching stacks? SP change: 0x476E1E4 --> 0xBEA2A48C
+==14908== to suppress, use: --max-stackframe=1171537240 or greater
+==14908== Warning: client switching stacks? SP change: 0xBEA2A48C --> 0x4792420
+==14908== to suppress, use: --max-stackframe=1171685268 or greater
+==14908== further instances of this message will not be shown.
+\endverbatim
+
+This is because valgrind does not much like the UNIX98 contexts we
+use by default in simgrid for efficiency reasons. Simply add the
+--with-pthread flag to your configure when debugging your code. You
+may also find --disable-compiler-optimization useful if valgrind or
+gdb get fooled by the optimization done by the compiler. But you
+should remove these flags when everything works before going into
+production (before launching your 1252135 experiments), or everything
+will run at only one third of the true SimGrid potential.
+
 \subsubsection faq_trouble_vg_longjmp longjmp madness in valgrind
 
 This is when valgrind starts complaining about longjmp things, just like: