X-Git-Url: http://info.iut-bm.univ-fcomte.fr/pub/gitweb/simgrid.git/blobdiff_plain/1f3b2d6bb4a31767e013b4d6739602fe1db89bf3..534d31e4fc62b248c646ad6e19518e8a7b11e048:/doc/FAQ.doc diff --git a/doc/FAQ.doc b/doc/FAQ.doc index 852ad8843d..381deda119 100644 --- a/doc/FAQ.doc +++ b/doc/FAQ.doc @@ -1,4 +1,4 @@ -/*! \page faq Frequently Asked Questions +/*! \page FAQ Frequently Asked Questions \htmlinclude .FAQ.doc.toc @@ -36,10 +36,10 @@ on their own. The difference between both is that MSG is somehow easier to use, but GRAS is not limited to the simulator. Once you're done writing your GRAS code, -you can run your code both in the simulator or on a real platform. For this, -there is two implementations of the GRAS interface, one for simulation, one -for real execution. So, you just have to relink your code to chose one of -both world. +you can run your code both in the simulator and on a real platform. For this, +there are two implementations of the GRAS interface, one for simulation, and one +for real execution. So, you just have to relink your code to choose one of +the two worlds. \subsection faq_visualization Visualizing and analyzing the results @@ -56,17 +56,19 @@ We also have a more graphical output. Have a look at section \ref options_tracin \subsection faq_C Argh! Do I really have to code in C? -Up until now, there is no binding for other languages. If you use C++, +Currently bindings on top of MSG are supported for Java, Ruby and Lua. You can find some +documentation about them on the doc page. Note that bindings are released separately from the main distribution +and so have their own version numbers. + +Moreover, if you use C++, you should be able to use the SimGrid library as a standard C library and everything should work fine (simply link against this library; recompiling SimGrid with a C++ compiler won't work and it wouldn't help if you could).
-In fact, we are currently working on Java bindings of MSG to allow -all the undergrad students of the world to use this tool. This is a -little more tricky than I would have expected, but the work is moving -fast forward [2006/05/13]. More languages are evaluated, but for now, -we do not feel a real demand for any other language. Please speak up! +For now, +we do not feel a real demand for any other language. But if you think there is one, + please speak up! \section faq_howto Feature related questions @@ -129,11 +131,11 @@ As we already told, we prefer keeping the API as simple as possible. This kind of feature is rather easy to implement by users and the semantic you associate really depends on people. Having a *generic* task duplication mechanism is not that trivial (in -particular because of the data field). That is why I would recommand +particular because of the data field). That is why I would recommend that you write it by yourself even if I can give you advice on how to do it. -You have the following functions to get informations about a task: +You have the following functions to get information about a task: MSG_task_get_name(), MSG_task_get_compute_duration(), MSG_task_get_remaining_computation(), MSG_task_get_data_size(), and MSG_task_get_data(). @@ -155,7 +157,7 @@ we have introduced the following functions: - MSG_comm_destroy() We refer you to the description of these functions for more details on their usage as well -as to the exemple section on \ref MSG_ex_asynchronous_communications. +as to the example section on \ref MSG_ex_asynchronous_communications. \subsubsection faq_MIA_thread_synchronization I need to synchronize my MSG processes @@ -183,10 +185,10 @@ even take the currently running tasks into account. In some SURF models, communications have an influence on computational power. Should it be taken into account too? 
-First of all, it's near to impossible to predict the load beforehands in the +First of all, it's near to impossible to predict the load beforehand in the simulator since it depends on too much parameters (background load variation, bandwidth sharing algorithmic complexity) some of them even being -not known beforehands (other task starting at the same time). So, getting +not known beforehand (other task starting at the same time). So, getting this information is really hard (just like in real life). It's not just that we want MSG to be as painful as real life. But as it is in some way realistic, we face some of the same problems as we would face in real life. @@ -317,7 +319,7 @@ Here are a few tricks you can apply if you want to increase the amount of processes in your simulations. - A few thousands of simulated processes (soft tricks)\n - SimGrid can use either pthreads library or the UNIX98 contextes. On + SimGrid can use either pthreads library or the UNIX98 contexts. On most systems, the number of pthreads is limited and then your simulation may be limited for a stupid reason. This is especially true with the current linux pthreads, and I cannot get more than @@ -325,21 +327,21 @@ of processes in your simulations. contexts allow me to raise the limit to 25,000 simulated processes on my laptop.\n\n The --with-context option of the ./configure - script allows you to choose between UNIX98 contextes + script allows you to choose between UNIX98 contexts (--with-context=ucontext) and the pthread version (--with-context=pthread). The default value is ucontext when the script detect a working UNIX98 context implementation. On Windows boxes, the provided value is discarded and an adapted version is picked up.\n\n - We experienced some issues with contextes on some rare systems + We experienced some issues with contexts on some rare systems (solaris 8 and lower or old alpha linuxes comes to mind). 
The main - problem is that the configure script detect the contextes as being + problem is that the configure script detects the contexts as being functional when it's not true. If you happen to use such a system, switch manually to the pthread version, and provide us with a good patch for the configure script so that it is done automatically ;) - Hundred thousands of simulated processes (hard-core tricks)\n - As explained above, SimGrid can use UNIX98 contextes to represent + As explained above, SimGrid can use UNIX98 contexts to represent and handle the simulated processes. Thanks to this, the main limitation to the number of simulated processes becomes the available memory.\n\n @@ -347,20 +349,20 @@ of processes in your simulations. between 25,000 processes on my laptop (1Gb memory, 1.5Gb swap).\n - First of all, make sure your code runs for a few hundreds processes before trying to push the limit. Make sure it's - valgrind-clean, ie that valgrind does not report neither memory + valgrind-clean, i.e. that valgrind reports neither memory error nor memory leaks. Indeed, numerous simulated processes result in *fat* simulation hindering debugging. - It was really boring to write 25,000 entries in the deployment file, so I wrote a little script examples/gras/mutual_exclusion/simple_token/make_deployment.pl, which you may want to adapt to your case. You could also think about hijacking - the SURFXML parser (have look at \ref faq_flexml_bypassing). + the SURFXML parser (have a look at \ref pf_flexml_bypassing). - The deployment file became quite big, so I had to do what is in the FAQ entry \ref faq_flexml_limit - Each UNIX98 context has its own stack entry. As debugging this is - quite hairly, the default value is a bit overestimated so that - user don't get into trouble about this. You want to tune this - size to increse the number of processes. This is the + quite hairy, the default value is a bit overestimated so that + users don't get into trouble about this.
You want to tune this + size to increase the number of processes. This is the STACK_SIZE define in src/xbt/xbt_context_sysv.c, which is 128kb by default. Reduce this as much as you can, but be warned that if this value @@ -414,7 +416,7 @@ completion) to be saved somewhere. So if a host (and the task executing on it) fails (cf. #MSG_HOST_FAILURE), then the task can be restarted from the last checkpoint.\n -Actually, such a thing does not exists in SimGrid either, but it's just +Actually, such a thing does not exist in SimGrid either, but it's just because we don't think it is fundamental and it may be done in the user code at relatively low cost. You could for example use a watcher that periodically get the remaining amount of things to do (using @@ -425,7 +427,7 @@ subtasks. \subsubsection faq_platform_example Where can I find SimGrid platform files? -There is several little examples in the archive, in the examples/msg +There are several little examples in the archive, in the examples/msg directory. From time to time, we are asked for other files, but we don't have much at hand right now. @@ -459,219 +461,16 @@ annotate Tiers-generated topologies. This perl-script is in tools/platform_generation/ directory of the SVN. Dinda et Al. released a very comparable tool, and called it GridG. -\subsubsection faq_SURF_multicore Modeling multi-core resources - -There is currently no native support for multi-core or SMP machines in -SimGrid. We are currently working on it, but coming up with the right -model is very hard: Cores share caches and bus to access memory and -thus interfere with each others. Memory contention is a crucial -component of multi-core modeling. -In the meanwhile, some user-level tricks can reveal sufficient for -you. For example, you may model each core by a CPU and add some very -high speed links between them. 
This complicates a bit the user code -since you have to remember that when you assign something to a (real) -host, it can be any of the (fake) hosts representing the cores of a -given machine. For that, you can use the prop tag of the XML files as -follows. Your code should then look at the ‘machine’ property -associated with each workstation, and run parallel tasks over all -cores of the machine. +The specified computing power will be available to up to 6 sequential +tasks without sharing. If more tasks are placed on this host, the +resource will be shared accordingly. For example, if you schedule 12 +tasks on the host, each will get half of the computing power. Please +note that although sound, this model was never scientifically +assessed. Please keep this fact in mind when using it. -\verbatim - - - - - - - - -\endverbatim - -\subsubsection faq_SURF_dynamic Modeling dynamic resource availability - -A nice feature of SimGrid is that it enables you to seamlessly have -resources whose availability change over time. When you build a -platform, you generally declare hosts like that: - -\verbatim - -\endverbatim - -If you want the availability of "host A" to change over time, the only -thing you have to do is change this definition like that: - -\verbatim - -\endverbatim - -For hosts, availability files are expressed in fraction of available -power. Let's have a look at what "trace_A.txt" may look like: - -\verbatim -PERIODICITY 1.0 -0.0 1.0 -11.0 0.5 -20.0 0.9 -\endverbatim - -At time 0, our host will deliver 100 flop/s. At time 11.0, it will -deliver only 50 flop/s until time 20.0 where it will will start -delivering 90 flop/s. Last at time 21.0 (20.0 plus the periodicity -1.0), we'll be back to the beginning and it will deliver 100 flop/s. - -Now let's look at the state file: -\verbatim -PERIODICITY 10.0 -1.0 -1.0 -2.0 1.0 -\endverbatim - -A negative value means "off" while a positive one means "on". At time -1.0, the host is on.
At time 1.0, it is turned off and at time 2.0, it -is turned on again until time 12 (2.0 plus the periodicity 10.0). It -will be turned on again at time 13.0 until time 23.0, and so on. - -Now, let's look how the same kind of thing can be done for network -links. A usual declaration looks like: - -\verbatim - -\endverbatim - -You have at your disposal the following options: bandwidth_file, -latency_file and state_file. The only difference with hosts is that -bandwidth_file and latency_file do not express fraction of available -power but are expressed directly in bytes per seconds and seconds. - -\subsubsection faq_platform_multipath How to express multipath routing in platform files? - -It is unfortunately impossible to express the fact that there is more -than one routing path between two given hosts. Let's consider the -following platform file: - -\verbatim - - - - - - - - - -\endverbatim - -Although it is perfectly valid, it does not mean that data traveling -from A to C can either go directly (using link 3) or through B (using -links 1 and 2). It simply means that the routing on the graph is not -trivial, and that data do not following the shortest path in number of -hops on this graph. Another way to say it is that there is no implicit -in these routing descriptions. The system will only use the routes you -declare (such as <route src="A" dst="C"><link:ctn -id="3"/></route>), without trying to build new routes by aggregating -the provided ones. - -You are also free to declare platform where the routing is not -symmetric. For example, add the following to the previous file: - -\verbatim - - - - -\endverbatim - -This makes sure that data from C to A go through B where data from A -to C go directly. Don't worry about realism of such settings since -we've seen ways more weird situation in real settings (in fact, that's -the realism of very regular platforms which is questionable, but -that's another story). 
- -\subsubsection faq_flexml_bypassing Bypassing the XML parser with your own C functions - -So you want to bypass the XML files parser, uh? Maybe doing some parameter -sweep experiments on your simulations or so? This is possible, and -it's not even really difficult (well. Such a brutal idea could be -harder to implement). Here is how it goes. - -For this, you have to first remember that the XML parsing in SimGrid is done -using a tool called FleXML. Given a DTD, this gives a flex-based parser. If -you want to bypass the parser, you need to provide some code mimicking what -it does and replacing it in its interactions with the SURF code. So, let's -have a look at these interactions. - -FleXML parser are close to classical SAX parsers. It means that a -well-formed SimGrid platform XML file might result in the following -"events": - - - start "platform_description" with attribute version="2" - - start "host" with attributes id="host1" power="1.0" - - end "host" - - start "host" with attributes id="host2" power="2.0" - - end "host" - - start "link" with ... - - end "link" - - start "route" with ... - - start "link:ctn" with ... - - end "link:ctn" - - end "route" - - end "platform_description" - -The communication from the parser to the SURF code uses two means: -Attributes get copied into some global variables, and a surf-provided -function gets called by the parser for each event. For example, the event - - start "host" with attributes id="host1" power="1.0" - -let the parser do something roughly equivalent to: -\verbatim - strcpy(A_host_id,"host1"); - A_host_power = 1.0; - STag_host(); -\endverbatim - -In SURF, we attach callbacks to the different events by initializing the -pointer functions to some the right surf functions. Since there can be -more than one callback attached to the same event (if more than one -model is in use, for example), they are stored in a dynar. 
Example in -workstation_ptask_L07.c: -\verbatim - /* Adding callback functions */ - surf_parse_reset_parser(); - surfxml_add_callback(STag_surfxml_host_cb_list, &parse_cpu_init); - surfxml_add_callback(STag_surfxml_prop_cb_list, &parse_properties); - surfxml_add_callback(STag_surfxml_link_cb_list, &parse_link_init); - surfxml_add_callback(STag_surfxml_route_cb_list, &parse_route_set_endpoints); - surfxml_add_callback(ETag_surfxml_link_c_ctn_cb_list, &parse_route_elem); - surfxml_add_callback(ETag_surfxml_route_cb_list, &parse_route_set_route); - - /* Parse the file */ - surf_parse_open(file); - xbt_assert(!surf_parse(), "Parse error in %s", file); - surf_parse_close(); -\endverbatim - -So, to bypass the FleXML parser, you need to write your own version of the -surf_parse function, which should do the following: - - Fill the A__ variables with the wanted values - - Call the corresponding STag__fun function to simulate tag start - - Call the corresponding ETag__fun function to simulate tag end - - (do the same for the next set of values, and loop) - -Then, tell SimGrid that you want to use your own "parser" instead of the stock one: -\verbatim - surf_parse = surf_parse_bypass_environment; - MSG_create_environment(NULL); - surf_parse = surf_parse_bypass_application; - MSG_launch_application(NULL); -\endverbatim - -A set of macros are provided at the end of -include/surf/surfxml_parse.h to ease the writing of the bypass -functions. An example of this trick is distributed in the file -examples/msg/masterslave/masterslave_bypass.c \section faq_troubleshooting Troubleshooting @@ -722,6 +521,46 @@ of SimGrid -- see \ref faq_more_processes), you must absolutely specify -lpthread on the linker command line. As usual, this should come after -lsimgrid on this command line. +\subsubsection faq_trouble_lib_msg_deprecated "gcc: undefined reference to MSG_*" + +Since version 3.7, the whole m_channel_t mechanism is deprecated.
So functions +related to this mechanism may get removed in future releases. + +List of functions: + +\li XBT_PUBLIC(int) MSG_get_host_number(void); + +\li XBT_PUBLIC(m_host_t *) MSG_get_host_table(void); + +\li XBT_PUBLIC(MSG_error_t) MSG_get_errno(void); + +\li XBT_PUBLIC(MSG_error_t) MSG_task_get(m_task_t * task, m_channel_t channel); + +\li XBT_PUBLIC(MSG_error_t) MSG_task_get_with_timeout(m_task_t * task, m_channel_t channel, double max_duration); + +\li XBT_PUBLIC(MSG_error_t) MSG_task_get_from_host(m_task_t * task, int channel, m_host_t host); + +\li XBT_PUBLIC(MSG_error_t) MSG_task_get_ext(m_task_t * task, int channel, double max_duration, m_host_t host); + +\li XBT_PUBLIC(MSG_error_t) MSG_task_put(m_task_t task, m_host_t dest, m_channel_t channel); + +\li XBT_PUBLIC(MSG_error_t) MSG_task_put_bounded(m_task_t task, m_host_t dest, m_channel_t channel, double max_rate); + +\li XBT_PUBLIC(MSG_error_t) MSG_task_put_with_timeout(m_task_t task, m_host_t dest, m_channel_t channel, double max_duration); + +\li XBT_PUBLIC(int) MSG_task_Iprobe(m_channel_t channel); + +\li XBT_PUBLIC(int) MSG_task_probe_from(m_channel_t channel); + +\li XBT_PUBLIC(int) MSG_task_probe_from_host(int channel, m_host_t host); + +\li XBT_PUBLIC(MSG_error_t) MSG_set_channel_number(int number); + +\li XBT_PUBLIC(int) MSG_get_channel_number(void); + +If you still need them, you have to compile SimGrid v3.7 with the option "-Denable_msg_deprecated=ON". +Using them should print a warning telling you which new function to use instead. + \subsection faq_trouble_errors Runtime error messages \subsubsection faq_flexml_limit "surf_parse_lex: Assertion `next limit' failed." @@ -765,7 +604,7 @@ was done by William Dowling, who use it in his own work. The good point is that it now use a dynamic buffer, and that the memory usage was greatly improved. The downside is that William also changed some things internally, and it breaks the hack we devised to bypass the parser, as explained in -\ref faq_flexml_bypassing.
Indeed, this is not a classical usage of the +\ref pf_flexml_bypassing. Indeed, this is not a classical usage of the parser, and Will didn't imagine that we may have used (and even documented) such a crude usage of FleXML. So, we now have to repair the bypassing functionality to use the lastest FleXML version and fix the memory usage in @@ -978,9 +817,7 @@ informative bug repports: http://www.chiark.greenend.org.uk/~sgtatham/bugs.html (it is not SimGrid specific at all, but it's full of good advices). -\author Arnaud Legrand (arnaud.legrand::imag.fr) -\author Martin Quinson (martin.quinson::loria.fr) - +\author Da SimGrid team */