X-Git-Url: http://info.iut-bm.univ-fcomte.fr/pub/gitweb/simgrid.git/blobdiff_plain/029f4e3fda0aaddabf8312586a1a6d73643967fc..20f5604e5582663868f57fb5db28a88929ac3d43:/doc/doxygen/FAQ.doc diff --git a/doc/doxygen/FAQ.doc b/doc/doxygen/FAQ.doc index 2bb4121a7b..4f8f178d99 100644 --- a/doc/doxygen/FAQ.doc +++ b/doc/doxygen/FAQ.doc @@ -1,253 +1,6 @@ -/*! \page FAQ Frequently Asked Questions +/*! @page FAQ Frequently Asked Questions -@tableofcontents - -\section faq_simgrid I'm new to SimGrid. I have some questions. Where should I start? - -You are at the right place... Having a look to these -the slides of the HPCS'10 tutorial -(or to these ancient -slides, or to these -"obsolete" slides) -may give you some insights on what SimGrid can help you to do and what -are its limitations. Then you definitely should read the \ref -MSG_examples. - -If you are stuck at any point and if this FAQ cannot help you, please drop us a -mail to the user mailing list: . - -\subsection faq_interfaces What is the difference between MSG and SimDag? Do they serve the same purpose? - -It depend on how you define "purpose", I guess ;) - -They all allow you to build a prototype of application which you can run -within the simulator afterward. They all share the same simulation kernel, -which is the core of the SimGrid project. They differ by the way you express -your application. - -With SimDag, you express your code as a collection of interdependent -parallel tasks. So, in this model, applications can be seen as a DAG of -tasks. This is the interface of choice for people wanting to port old -code designed for SimGrid v1 or v2 to the framework current version. - -With MSG, your application is seen as a set of communicating -processes, exchanging data by the way of messages and performing -computation on their own. - -\subsection faq_visualization Visualizing and analyzing the results - -It is sometime convenient to "see" how the agents are behaving. If you -like colors, you can use tools/MSG_visualization/colorize.pl -as a filter to your MSG outputs. It works directly with INFO. Beware, -INFO() prints on stderr. Do not forget to redirect if you want to -filter (e.g. with bash): -\verbatim -./msg_test small_platform.xml small_deployment.xml 2>&1 | ../../tools/MSG_visualization/colorize.pl -\endverbatim - -We also have a more graphical output. Have a look at section \ref options_tracing. - -\subsection faq_C Argh! Do I really have to code in C? - -Currently bindings on top of MSG are supported for Java, Ruby and Lua. You can find a few -documentation about them on the doc page. Note that bindings are released separately from the main dist -and so have their own version numbers. - -Moreover If you use C++, -you should be able to use the SimGrid library as a standard C library -and everything should work fine (simply link against this -library; recompiling SimGrid with a C++ compiler won't work and it -wouldn't help if you could). - -For now, -we do not feel a real demand for any other language. But if you think there is one, - please speak up! - -\section faq_howto Feature related questions - -\subsection faq_MIA "Could you please add (your favorite feature here) to SimGrid?" - -Here is the deal. The whole SimGrid project (MSG, SURF, ...) is -meant to be kept as simple and generic as possible. We cannot add -functions for everybody's needs when these functions can easily be -built from the ones already in the API. Most of the time, it is -possible and when it was not possible we always have upgraded the API -accordingly. When somebody asks us a question like "How to do that? -Is there a function in the API to simply do this?", we're always glad -to answer and help. However if we don't need this code for our own -need, there is no chance we're going to write it... it's your job! :) -The counterpart to our answers is that once you come up with a neat -implementation of this feature (task duplication, RPC, thread -synchronization, ...), you should send it to us and we will be glad to -add it to the distribution. Thus, other people will take advantage of -it (and we don't have to answer this question again and again ;). - -You'll find in this section a few "Missing In Action" features. Many -people have asked about it and we have given hints on how to simply do -it with MSG. Feel free to contribute... - -\subsection faq_MIA_MSG MSG features - -\subsubsection faq_MIA_examples I want some more complex MSG examples! - -Many people have come to ask me a more complex example and each time, -they have realized afterward that the basics were in the previous three -examples. - -Of course they have often been needing more complex functions like -MSG_process_suspend(), MSG_process_resume() and -MSG_process_isSuspended() (to perform synchronization), or -MSG_task_Iprobe() and MSG_process_sleep() (to avoid blocking -receptions), or even MSG_process_create() (to design asynchronous -communications or computations). But the examples are sufficient to -start. - -We know. We should add some more examples, but not really some more -complex ones... We should add some examples that illustrate some other -functionalists (like how to simply encode asynchronous -communications, RPC, process migrations, thread synchronization, ...) -and we will do it when we will have a little bit more time. We have -tried to document the examples so that they are understandable. Tell -us if something is not clear and once again feel free to participate! -:) - -\subsubsection faq_MIA_taskdup Missing in action: MSG Task duplication/replication - -There is no task duplication in MSG. When you create a task, you can -process it or send it somewhere else. As soon as a process has sent -this task, he doesn't have this task anymore. It's gone. The receiver -process has got the task. However, you could decide upon receiving to -create a "copy" of a task but you have to handle by yourself the -semantic associated to this "duplication". - -As we already told, we prefer keeping the API as simple as -possible. This kind of feature is rather easy to implement by users -and the semantic you associate really depends on people. Having a -*generic* task duplication mechanism is not that trivial (in -particular because of the data field). That is why I would recommend -that you write it by yourself even if I can give you advice on how to -do it. - -You have the following functions to get information about a task: -MSG_task_get_name(), MSG_task_get_compute_duration(), -MSG_task_get_remaining_computation(), MSG_task_get_data_size(), -and MSG_task_get_data(). - -You could use a dictionary (#xbt_dict_t) of dynars (#xbt_dynar_t). If -you still don't see how to do it, please come back to us... - -\subsubsection faq_MIA_asynchronous I want to do asynchronous communications in MSG - -In the past (version <= 3.4), there was no function to perform asynchronous communications. -It could easily be implemented by creating new process when needed though. Since version 3.5, -we have introduced the following functions: - - MSG_task_isend() - - MSG_task_irecv() - - MSG_comm_test() - - MSG_comm_wait() - - MSG_comm_waitall() - - MSG_comm_waitany() - - MSG_comm_destroy() - -We refer you to the description of these functions for more details on their usage as well -as to the example section on \ref MSG_ex_asynchronous_communications. - -\subsubsection faq_MIA_thread_synchronization I need to synchronize my MSG processes - -You obviously cannot use pthread_mutexes of pthread_conds since we handle every -scheduling related decision within SimGrid. - -In the past (version <=3.3.4) you could do it by playing with -MSG_process_suspend() and MSG_process_resume() or with fake communications (using MSG_task_get(), -MSG_task_put() and MSG_task_Iprobe()). - -Since version 3.4, you can use classical synchronization structures. See page \ref XBT_synchro or simply check in -include/xbt/synchro_core.h. - -\subsubsection faq_MIA_host_load Where is the get_host_load function hidden in MSG? - -There is no such thing because its semantic wouldn't be really -clear. Of course, it is something about the amount of host throughput, -but there is as many definition of "host load" as people asking for -this function. First, you have to remember that resource availability -may vary over time, which make any load notion harder to define. - -It may be instantaneous value or an average one. Moreover it may be only the -power of the computer, or may take the background load into account, or may -even take the currently running tasks into account. In some SURF models, -communications have an influence on computational power. Should it be taken -into account too? - -First of all, it's near to impossible to predict the load beforehand in the -simulator since it depends on too much parameters (background load -variation, bandwidth sharing algorithmic complexity) some of them even being -not known beforehand (other task starting at the same time). So, getting -this information is really hard (just like in real life). It's not just that -we want MSG to be as painful as real life. But as it is in some way -realistic, we face some of the same problems as we would face in real life. - -How would you do it for real? The most common option is to use something -like NWS that performs active probes. The best solution is probably to do -the same within MSG, as in next code snippet. It is very close from what you -would have to do out of the simulator, and thus gives you information that -you could also get in real settings to not hinder the realism of your -simulation. - -\verbatim -double get_host_load() { - m_task_t task = MSG_task_create("test", 0.001, 0, NULL); - double date = MSG_get_clock(); - - MSG_task_execute(task); - date = MSG_get_clock() - date; - MSG_task_destroy(task); - return (0.001/date); -} -\endverbatim - -Of course, it may not match your personal definition of "host load". In this -case, please detail what you mean on the mailing list, and we will extend -this FAQ section to fit your taste if possible. - -\subsubsection faq_MIA_communication_time How can I get the *real* communication time? - -Communications are synchronous and thus if you simply get the time -before and after a communication, you'll only get the transmission -time and the time spent to really communicate (it will also take into -account the time spent waiting for the other party to be -ready). However, getting the *real* communication time is not really -hard either. The following solution is a good starting point. - -\verbatim -int sender() -{ - m_task_t task = MSG_task_create("Task", task_comp_size, task_comm_size, - calloc(1,sizeof(double))); - *((double*) task->data) = MSG_get_clock(); - MSG_task_put(task, slaves[i % slaves_count], PORT_22); - XBT_INFO("Send completed"); - return 0; -} -int receiver() -{ - m_task_t task = NULL; - double time1,time2; - - time1 = MSG_get_clock(); - a = MSG_task_get(&(task), PORT_22); - time2 = MSG_get_clock(); - if(time1<*((double *)task->data)) - time1 = *((double *) task->data); - XBT_INFO("Communication time : \"%f\" ", time2-time1); - free(task->data); - MSG_task_destroy(task); - return 0; -} -\endverbatim - -\subsection faq_MIA_SimDag SimDag related questions - -\subsubsection faq_SG_comm Implementing communication delays between tasks. +@subsubsection Implementing communication delays between SimDAG tasks. A classic question of SimDag newcomers is about how to express a communication delay between tasks. The thing is that in SimDag, both @@ -256,10 +9,10 @@ model a data dependency between two DAG tasks t1 and t2, you have to create 3 SD_tasks: t1, t2 and c and add dependencies in the following way: -\verbatim -SD_task_dependency_add(NULL, NULL, t1, c); -SD_task_dependency_add(NULL, NULL, c, t2); -\endverbatim +@code +SD_task_dependency_add(t1, c); +SD_task_dependency_add(c, t2); +@endcode This way task t2 cannot start before the termination of communication c which in turn cannot start before t1 ends. @@ -272,610 +25,4 @@ comprising the workstations on which t1 and t2 are scheduled (w1 and w2 for example) and build a communication matrix that should look like [0;amount ; 0; 0]. -\subsubsection faq_SG_DAG How to implement a distributed dynamic scheduler of DAGs. - -Distributed is somehow "contagious". If you start making distributed -decisions, there is no way to handle DAGs directly anymore (unless I -am missing something). You have to encode your DAGs in term of -communicating process to make the whole scheduling process -distributed. Here is an example of how you could do that. Assume T1 -has to be done before T2. - -\verbatim - int your_agent(int argc, char *argv[] { - ... - T1 = MSG_task_create(...); - T2 = MSG_task_create(...); - ... - while(1) { - ... - if(cond) MSG_task_execute(T1); - ... - if((MSG_task_get_remaining_computation(T1)=0.0) && (you_re_in_a_good_mood)) - MSG_task_execute(T2) - else { - /* do something else */ - } - } - } -\endverbatim - -If you decide that the distributed part is not that much important and that -DAG is really the level of abstraction you want to work with, then you should -give a try to \ref SD_API. - -\subsection faq_MIA_generic Generic features - -\subsubsection faq_MIA_batch_scheduler Is there a native support for batch schedulers in SimGrid? - -No, there is no native support for batch schedulers and none is -planned because this is a very specific need (and doing it in a -generic way is thus very hard). However some people have implemented -their own batch schedulers. Vincent Garonne wrote one during his PhD -and put his code in the contrib directory of our SVN so that other can -keep working on it. You may find inspiring ideas in it. - -\subsubsection faq_MIA_checkpointing I need a checkpointing thing - -Actually, it depends on whether you want to checkpoint the simulation, or to -simulate checkpoints. - -The first one could help if your simulation is a long standing process you -want to keep running even on hardware issues. It could also help to -rewind the simulation by jumping sometimes on an old checkpoint to -cancel recent calculations.\n -Unfortunately, such thing will probably never exist in SG. One would have to -duplicate all data structures because doing a rewind at the simulator level -is very very hard (not talking about the malloc free operations that might -have been done in between). Instead, you may be interested in the Libckpt -library (http://www.cs.utk.edu/~plank/plank/www/libckpt.html). This is the -checkpointing solution used in the condor project, for example. It makes it -easy to create checkpoints (at the OS level, creating something like core -files), and rerunning them on need. - -If you want to simulate checkpoints instead, it means that you want the -state of an executing task (in particular, the progress made towards -completion) to be saved somewhere. So if a host (and the task executing on -it) fails (cf. #MSG_HOST_FAILURE), then the task can be restarted -from the last checkpoint.\n - -Actually, such a thing does not exist in SimGrid either, but it's just -because we don't think it is fundamental and it may be done in the user code -at relatively low cost. You could for example use a watcher that -periodically get the remaining amount of things to do (using -MSG_task_get_remaining_computation()), or fragment the task in smaller -subtasks. - -\subsection faq_platform Platform building and Dynamic resources - -\subsubsection faq_platform_example Where can I find SimGrid platform files? - -There are several little examples in the archive, in the examples/msg -directory. From time to time, we are asked for other files, but we -don't have much at hand right now. - -You should refer to the Platform Description Archive -(http://pda.gforge.inria.fr) project to see the other platform file we -have available, as well as the Simulacrum simulator, meant to generate -SimGrid platforms using all classical generation algorithms. - -\subsubsection faq_platform_alnem How can I automatically map an existing platform? - -We are working on a project called ALNeM (Application-Level Network -Mapper) which goal is to automatically discover the topology of an -existing network. Its output will be a platform description file -following the SimGrid syntax, so everybody will get the ability to map -their own lab network (and contribute them to the catalog project). -This tool is not ready yet, but it move quite fast forward. Just stay -tuned. - -\subsubsection faq_platform_synthetic Generating synthetic but realistic platforms - -The third possibility to get a platform file (after manual or -automatic mapping of real platforms) is to generate synthetic -platforms. Getting a realistic result is not a trivial task, and -moreover, nobody is really able to define what "realistic" means when -speaking of topology files. You can find some more thoughts on this -topic in these -slides. - -If you are looking for an actual tool, there we have a little tool to -annotate Tiers-generated topologies. This perl-script is in -tools/platform_generation/ directory of the SVN. Dinda et Al. -released a very comparable tool, and called it GridG. - - -The specified computing power will be available to up to 6 sequential -tasks without sharing. If more tasks are placed on this host, the -resource will be shared accordingly. For example, if you schedule 12 -tasks on the host, each will get half of the computing power. Please -note that although sound, this model were never scientifically -assessed. Please keep this fact in mind when using it. - - -\subsubsection faq_platform_random Using random variable for the resource power or availability - -The best way to model the resouce power using a random variable is to -use an availability trace that is directed by a probability -distribution. This can be done using the function -tmgr_trace_generator_value() below. The date and value generators is -created with one of tmgr_event_generator_new_uniform(), -tmgr_event_generator_new_exponential() or -tmgr_event_generator_new_weibull() (if you need other generators, -adding them to src/surf/trace_mgr.c should be quite trivial and your -patch will be welcomed). Once your trace is created, you have to -connect it to the resource with the function -sg_platf_new_trace_connect(). - -That the process is very similar if you want to model the -resource availability with a random variable (deciding whether it's -on/off instead of deciding its speed) using the function -tmgr_trace_generator_state() or tmgr_trace_generator_avail_unavail() -instead of tmgr_trace_generator_value(). - -Unfortunately, all this is currently lacking a proper documentation, -and there is even no proper example of use. You'll thus have to check -the header file include/simgrid/platf.h and experiment a bit by -yourself. The following code should be a good starting point, and -contributing a little clean example would be a good way to help the -SimGrid project. - -@code -tmgr_trace_generator_value("mytrace",tmgr_event_generator_new_exponential(.5) - tmgr_event_generator_new_uniform(100000,9999999)); - -sg_platf_trace_connect_cbarg_t myconnect = SG_PLATF_TRACE_CONNECT_INITIALIZER; -myconnect.kind = SURF_TRACE_CONNECT_KIND_BANDWIDTH; -myconnect.trace = "mytrace"; -myconnect.element = "mylink"; - -sg_platf_trace_connect(myconnect); -@endcode - -\section faq_troubleshooting Troubleshooting - -\subsection faq_trouble_changelog The feature X stopped to work after my last update - -I guess that you want to read the ChangeLog file, that always contains -all the information that could be important to the users during the -upgrade. Actually, you may want to read it (alongside with the NEWS -file that highlights the most important changes) even before you -upgrade your copy of SimGrid, too. - -Backward compatibility is very important to us, as we want to provide -a scientific tool allowing to evaluate the code you write in several -years, too. That being said, we sometimes change the interface to make -them more usable to the users. When we do so, we always keep the old -interface as DEPRECATED. If you still want to use them, you want to -define the SIMGRID_DEPRECATED preprocessor symbol before loading the -SimGrid files: - -@verbatim -#define SIMGRID_DEPRECATED -#include -@endverbatim - -\subsection faq_trouble_lib_compil SimGrid compilation and installation problems - -\subsubsection faq_trouble_lib_config cmake fails! - -We know only one reason for the configure to fail: - - - You are using a broken build environment\n - If symptom is that the configury magic complains about gcc not being able to build - executables, you are probably missing the libc6-dev package. Damn Ubuntu. - -If you experience other kind of issue, please get in touch with us. We are -always interested in improving our portability to new systems. - -\subsubsection faq_trouble_distcheck Dude! "ctest" fails on my machine! - -Don't assume we never run this target, because we do. Check -http://cdash.inria.fr/CDash/index.php?project=Simgrid (click on -previous if there is no result for today: results are produced only by -11am, French time) and -https://buildd.debian.org/status/logs.php?pkg=simgrid if you don't believe us. - -If it's failing on your machine in a way not experienced by the -autobuilders above, please drop us a mail on the mailing list so that -we can check it out. Make sure to read \ref faq_bugrepport before you -do so. - -\subsection faq_trouble_compil User code compilation problems - -\subsubsection faq_trouble_err_logcat "gcc: _simgrid_this_log_category_does_not_exist__??? undeclared (first use in this function)" - -This is because you are using the log mecanism, but you didn't created -any default category in this file. You should refer to \ref XBT_log -for all the details, but you simply forgot to call one of -XBT_LOG_NEW_DEFAULT_CATEGORY() or XBT_LOG_NEW_DEFAULT_SUBCATEGORY(). - -\subsubsection faq_trouble_pthreadstatic "gcc: undefined reference to pthread_key_create" - -This indicates that one of the library SimGrid depends on (libpthread -here) was missing on the linking command line. Dependencies of -libsimgrid are expressed directly in the dynamic library, so it's -quite impossible that you see this message when doing dynamic linking. - -If you compile your code statically (and if you use a pthread version -of SimGrid -- see \ref faq_more_processes), you must absolutely -specify -lpthread on the linker command line. As usual, this should -come after -lsimgrid on this command line. - -\subsubsection faq_trouble_lib_msg_deprecated "gcc: undefined reference to MSG_*" - -Since version 3.7 all the m_channel_t mecanism is deprecated. So functions -about this mecanism may get removed in future releases. - -List of functions: - -\li XBT_PUBLIC(int) MSG_get_host_number(void); - -\li XBT_PUBLIC(m_host_t *) MSG_get_host_table(void); - -\li XBT_PUBLIC(MSG_error_t) MSG_get_errno(void); - -\li XBT_PUBLIC(MSG_error_t) MSG_task_get(m_task_t * task, m_channel_t channel); - -\li XBT_PUBLIC(MSG_error_t) MSG_task_get_with_timeout(m_task_t * task, m_channel_t channel, double max_duration); - -\li XBT_PUBLIC(MSG_error_t) MSG_task_get_from_host(m_task_t * task, int channel, m_host_t host); - -\li XBT_PUBLIC(MSG_error_t) MSG_task_get_ext(m_task_t * task, int channel, double max_duration, m_host_t host); - -\li XBT_PUBLIC(MSG_error_t) MSG_task_put(m_task_t task, m_host_t dest, m_channel_t channel); - -\li XBT_PUBLIC(MSG_error_t) MSG_task_put_bounded(m_task_t task, m_host_t dest, m_channel_t channel, double max_rate); - -\li XBT_PUBLIC(MSG_error_t) MSG_task_put_with_timeout(m_task_t task, m_host_t dest, m_channel_t channel, double max_duration); - -\li XBT_PUBLIC(int) MSG_task_Iprobe(m_channel_t channel); - -\li XBT_PUBLIC(int) MSG_task_probe_from(m_channel_t channel); - -\li XBT_PUBLIC(int) MSG_task_probe_from_host(int channel, m_host_t host); - -\li XBT_PUBLIC(MSG_error_t) MSG_set_channel_number(int number); - -\li XBT_PUBLIC(int) MSG_get_channel_number(void); - -If you want them you have to compile Simgrid v3.7 with option "-Denable_msg_deprecated=ON". -Using them should print warning to inform what new function you have to use. - -\subsection faq_trouble_errors Runtime error messages - -\subsubsection faq_flexml_limit "surf_parse_lex: Assertion `next limit' failed." - -This is because your platform file is too big for the parser. - -Actually, the message comes directly from FleXML, the technology on top of -which the parser is built. FleXML has the bad idea of fetching the whole -document in memory before parsing it. And moreover, the memory buffer size -must be determined at compilation time. - -We use a value which seems big enough for our need without bloating the -simulators footprints. But of course your mileage may vary. In this case, -just edit src/surf/surfxml.l modify the definition of -FLEXML_BUFFERSTACKSIZE. E.g. - -\verbatim -#define FLEXML_BUFFERSTACKSIZE 1000000000 -\endverbatim - -Then recompile and everything should be fine, provided that your version of -Flex is recent enough (>= 2.5.31). If not the compilation process should -warn you. - -A while ago, we worked on FleXML to reduce a bit its memory consumption, but -these issues remain. There is two things we should do: - - - use a dynamic buffer instead of a static one so that the only limit - becomes your memory, not a stupid constant fixed at compilation time - (maybe not so difficult). - - change the parser so that it does not need to get the whole file in - memory before parsing - (seems quite difficult, but I'm a complete newbe wrt flex stuff). - -These are changes to FleXML itself, not SimGrid. But since we kinda hijacked -the development of FleXML, I can grant you that any patches would be really -welcome and quickly integrated. - -Update: A new version of FleXML (1.7) was released. Most of the work -was done by William Dowling, who use it in his own work. The good point is -that it now use a dynamic buffer, and that the memory usage was greatly -improved. The downside is that William also changed some things internally, -and it breaks the hack we devised to bypass the parser, as explained in -\ref pf_flexml_bypassing. Indeed, this is not a classical usage of the -parser, and Will didn't imagine that we may have used (and even documented) -such a crude usage of FleXML. So, we now have to repair the bypassing -functionality to use the latest FleXML version and fix the memory usage in -SimGrid. - -\subsubsection faq_trouble_errors_big_fat_warning I'm told that my XML files are too old. - -The format of the XML platform description files is sometimes -improved. For example, we decided to change the units used in SimGrid -from MBytes, MFlops and seconds to Bytes, Flops and seconds to ease -people exchanging small messages. We also reworked the route -descriptions to allow more compact descriptions. - -That is why the XML files are versionned using the 'version' attribute -of the root tag. Currently, it should read: -\verbatim - -\endverbatim - -If your files are too old, you can use the simgrid_update_xml.pl -script which can be found in the tools directory of the archive. - -\subsection faq_trouble_debug Debugging SMPI applications - -In order to debug SMPI programs, you can use the following options: - -- -wrapper 'gdb --args': this option is used to use a wrapper - in order to call the SMPI process. Good candidates for this options - are "gdb --args", "valgrind", "rr record", "strace", etc; - -- -foreground: this options gives the debugger access to the terminal - which is needed in order to use an interactive debugger. - -Both options are needed in order to run the SMPI process under GDB. - -\subsection faq_trouble_valgrind Valgrind-related and other debugger issues - -If you don't, you really should use valgrind to debug your code, it's -almost magic. - -\subsubsection faq_trouble_vg_longjmp longjmp madness in valgrind - -This is when valgrind starts complaining about longjmp things, just like: - -\verbatim ==21434== Conditional jump or move depends on uninitialised value(s) -==21434== at 0x420DBE5: longjmp (longjmp.c:33) -==21434== -==21434== Use of uninitialised value of size 4 -==21434== at 0x420DC3A: __longjmp (__longjmp.S:48) -\endverbatim - -This is the sign that you didn't used the exception mecanism well. Most -probably, you have a return; somewhere within a TRY{} -block. This is evil, and you must not do this. Did you read the section -about \ref XBT_ex?? - -\subsubsection faq_trouble_vg_libc Valgrind spits tons of errors about backtraces! - -It may happen that valgrind, the memory debugger beloved by any decent C -programmer, spits tons of warnings like the following : -\verbatim ==8414== Conditional jump or move depends on uninitialised value(s) -==8414== at 0x400882D: (within /lib/ld-2.3.6.so) -==8414== by 0x414EDE9: (within /lib/tls/i686/cmov/libc-2.3.6.so) -==8414== by 0x400B105: (within /lib/ld-2.3.6.so) -==8414== by 0x414F937: _dl_open (in /lib/tls/i686/cmov/libc-2.3.6.so) -==8414== by 0x4150F4C: (within /lib/tls/i686/cmov/libc-2.3.6.so) -==8414== by 0x400B105: (within /lib/ld-2.3.6.so) -==8414== by 0x415102D: __libc_dlopen_mode (in /lib/tls/i686/cmov/libc-2.3.6.so) -==8414== by 0x412D6B9: backtrace (in /lib/tls/i686/cmov/libc-2.3.6.so) -==8414== by 0x8076446: xbt_dictelm_get_ext (dict_elm.c:714) -==8414== by 0x80764C1: xbt_dictelm_get (dict_elm.c:732) -==8414== by 0x8079010: xbt_cfg_register (config.c:208) -==8414== by 0x806821B: MSG_config (msg_config.c:42) -\endverbatim - -This problem is somewhere in the libc when using the backtraces and there is -very few things we can do ourselves to fix it. Instead, here is how to tell -valgrind to ignore the error. Add the following to your ~/.valgrind.supp (or -create this file on need). Make sure to change the obj line according to -your personnal mileage (change 2.3.6 to the actual version you are using, -which you can retrieve with a simple "ls /lib/ld*.so"). - -\verbatim { - name: Backtrace madness - Memcheck:Cond - obj:/lib/ld-2.3.6.so - fun:dl_open_worker - fun:_dl_open - fun:do_dlopen - fun:dlerror_run - fun:__libc_dlopen_mode -}\endverbatim - -Then, you have to specify valgrind to use this suppression file by passing -the --suppressions=$HOME/.valgrind.supp option on the command line. -You can also add the following to your ~/.bashrc so that it gets passed -automatically. Actually, it passes a bit more options to valgrind, and this -happen to be my personnal settings. Check the valgrind documentation for -more information. - -\verbatim export VALGRIND_OPTS="--leak-check=yes --leak-resolution=high --num-callers=40 --tool=memcheck --suppressions=$HOME/.valgrind.supp" \endverbatim - -\subsubsection faq_trouble_backtraces Truncated backtraces - -When debugging SimGrid, it's easier to pass the ---disable-compiler-optimization flag to the configure if valgrind or -gdb get fooled by the optimization done by the compiler. But you -should remove these flag when everything works before going in -production (before launching your 1252135 experiments), or everything -will run only one half of the true SimGrid potential. - -\subsection faq_deadlock There is a deadlock in my code!!! - -Unfortunately, we cannot debug every code written in SimGrid. We -furthermore believe that the framework provides ways enough -information to debug such informations yourself. If the textual output -is not enough, Make sure to check the \ref faq_visualization FAQ entry to see -how to get a graphical one. - -Now, if you come up with a really simple example that deadlocks and -you're absolutely convinced that it should not, you can ask on the -list. Just be aware that you'll be severely punished if the mistake is -on your side... We have plenty of FAQ entries to redact and new -features to implement for the impenitents! ;) - -Using - -\subsection faq_surf_network_latency I get weird timings when I play with the latencies. - -OK, first of all, remember that units should be Bytes, Flops and -Seconds. If you don't use such units, some SimGrid constants (e.g. the -SG_TCP_CTE_GAMMA constant used in most network models) won't have the -right unit and you'll end up with weird results. - -Here is what happens with a single transfer of size L on a link -(bw,lat) when nothing else happens. - -\verbatim -0-----lat--------------------------------------------------t -|-----|**** real_bw =min(bw,SG_TCP_CTE_GAMMA/(2*lat)) *****| -\endverbatim - -In more complex situations, this min is the solution of a complex -max-min linear system. Have a look -here -and read the two threads "Bug in SURF?" and "Surf bug not -fixed?". You'll have a few other examples of such computations. You -can also read "A Network Model for Simulation of Grid Application" by -Henri Casanova and Loris Marchal to have all the details. The fact -that the real_bw is smaller than bw is easy to understand. The fact -that real_bw is smaller than SG_TCP_CTE_GAMMA/(2*lat) is due to the -window-based congestion mechanism of TCP. With TCP, you can't exploit -your huge network capacity if you don't have a good round-trip-time -because of the acks... - -Anyway, what you get is t=lat + L/min(bw,SG_TCP_CTE_GAMMA/(2*lat)). - - * if I you set (bw,lat)=(100 000 000, 0.00001), you get t = 1.00001 (you fully -use your link) - * if I you set (bw,lat)=(100 000 000, 0.0001), you get t = 1.0001 (you're on the -limit) - * if I you set (bw,lat)=(100 000 000, 0.001), you get t = 10.001 (ouch!) - -This bound on the effective bandwidth of a flow is not the only thing -that may make your result be unexpected. For example, two flows -competing on a saturated link receive an amount of bandwidth inversely -proportional to their round trip time. - -\subsection faq_bugrepport So I've found a bug in SimGrid. How to report it? - -We do our best to make sure to hammer away any bugs of SimGrid, but this is -still an academic project so please be patient if/when you find bugs in it. -If you do, the best solution is to drop an email either on the simgrid-user -or the simgrid-devel mailing list and explain us about the issue. You can -also decide to open a formal bug report using the -relevant -interface. You need to login on the server to get the ability to submit -bugs. - -We will do our best to solve any problem repported, but you need to help us -finding the issue. Just telling "it segfault" isn't enough. Telling "It -segfaults when running the attached simulator" doesn't really help either. -You may find the following article interesting to see how to repport -informative bug repports: -http://www.chiark.greenend.org.uk/~sgtatham/bugs.html (it is not SimGrid -specific at all, but it's full of good advices). - -\author Da SimGrid team - */ - -****************************************************************** -* OLD CRUFT NOT USED ANYMORE * -****************************************************************** - - -subsection faq_crosscompile Cross-compiling a Windows DLL of SimGrid from linux - -At the moment, we do not distribute Windows pre-compiled version of SimGrid -because the support for this platform is still experimental. We know that -some parts of the GRAS environment do not work, and we think that the others -environments (MSG and SD) have good chances to work, but we didn't test -ourselves. This section explains how we generate the SimGrid DLL so that you -can build it for yourself. First of all, you need to have a version more -recent than 3.1 (ie, a SVN version as time of writting). - -In order to cross-compile the package to windows from linux, you need to -install mingw32 (minimalist gnu win32). On Debian, you can do so by -installing the packages mingw32 (compiler), mingw32-binutils (linker and -so), mingw32-runtime. - -You can use the VPATH support of configure to compile at the same time for -linux and windows without dupplicating the source nor cleaning the tree -between each. Just run bootstrap (if you use the SVN) to run the autotools. -Then, create a linux and a win directories. Then, type: -\verbatim cd linux; ../configure --srcdir=.. ; make; cd .. -cd win; ../configure --srcdir=.. --host=i586-mingw32msvc ; make; cd .. -\endverbatim -The trick to VPATH builds is to call configure from another directory, -passing it an extra --srcdir argument to tell it where all the sources are. -It will understand you want to use VPATH. Then, the trick to cross-compile -is simply to add a --host argument specifying the target you want to build -for. The i586-mingw32msvc string is what you have to pass to use the mingw32 -environment as distributed in Debian. - -After that, you can run all make targets from both directories, and test -easily that what you change for one arch does not break the other one. - -It is possible that this VPATH build thing breaks from time to time in the -SVN since it's quite fragile, but it's granted to work in any released -version. If you experience problems, drop us a mail. - -Another possible source of issue is that at the moment, building the -examples request to use the gras_stub_generator tool, which is a compiled -program, not a script. In cross-compilation, you need to cross-execute with -wine for example, which is not really pleasant. We are working on this, but -in the meanwhile, simply don't build the examples in cross-compilation -(cd src before running make). - -Program (cross-)compiled with mingw32 do request an extra DLL at run-time to be -usable. For example, if you want to test your build with wine, you should do -the following to put this library where wine looks for DLLs. -\verbatim -cp /usr/share/doc/mingw32-runtime/mingwm10.dll.gz ~/.wine/c/windows/system/ -gunzip ~/.wine/c/windows/system/mingwm10.dll.gz -\endverbatim - -The DLL is built in src/.libs, and installed in the prefix/bin directory -when you run make install. - -If you want to use it in a native project on windows, you need to use -simgrid.dll and mingwm10.dll. For each DLL, you need to build .def file -under linux (listing the defined symbols), and convert it into a .lib file -under windows (specifying this in a way that windows compilers like). To -generate the def files, run (under linux): -\verbatim echo "LIBRARY libsimgrid-0.dll" > simgrid.def -echo EXPORTS >> simgrid.def -nm libsimgrid-0.dll | grep ' T _' | sed 's/.* T _//' >> simgrid.def -nm libsimgrid-0.dll | grep ' D _' | sed 's/.* D _//' | sed 's/$/ DATA/' >> simgrid.def - -echo "LIBRARY mingwm10.dll" > mingwm10.def -echo EXPORTS >> mingwm10.def -nm mingwm10.dll | grep ' T _' | sed 's/.* T _//' >> mingwm10.def -nm mingwm10.dll | grep ' D _' | sed 's/.* D _//' | sed 's/$/ DATA/' >> mingwm10.def -\endverbatim - -To create the import .lib files, use the lib windows tool (from -MSVC) the following way to produce simgrid.lib and mingwm10.lib -\verbatim lib /def:simgrid.def -lib /def:mingwm10.def -\endverbatim - -If you happen to use Borland C Builder, the right command line is the -following (note that you don't need any file.def to get this working). -\verbatim implib simgrid.lib libsimgrid-0.dll -implib mingwm10.lib mingwm10.dll -\endverbatim - -Then, set the following parameters in Visual C++ 2005: -Linker -> Input -> Additional dependencies = simgrid.lib mingwm10.lib - -Just in case you wonder how to generate a DLL from libtool in another -project, we added -no-undefined to any lib*_la_LDFLAGS variables so that -libtool accepts to generate a dynamic library under windows. Then, to make -it true, we pass any dependencies (such as -lws2 under windows or -lpthread -on need) on the linking line. Passing such deps is a good idea anyway so -that they get noted in the library itself, avoiding the users to know about -our dependencies and put them manually on their compilation line. Then we -added the AC_LIBTOOL_WIN32_DLL macro just before AC_PROG_LIBTOOL in the -configure.ac. It means that we exported any symbols which need to be. -Nowadays, functions get automatically exported, so we don't need to load our -header files with tons of __declspec(dllexport) cruft. We only need to do so -for data, but there is no public data in SimGrid so we are good.