+DAG is really the level of abstraction you want to work with, then you should
+give a try to \ref SD_API.
+
+\subsection faq_MIA_generic Generic features
+
+\subsubsection faq_more_processes Increasing the amount of simulated processes
+
+Here are a few tricks you can apply if you want to increase the amount
+of processes in your simulations.
+
+ - <b>A few thousands of simulated processes</b> (soft tricks)\n
+ SimGrid can use either pthreads library or the UNIX98 contextes. On
+ most systems, the number of pthreads is limited and then your
+ simulation may be limited for a stupid reason. This is especially
+ true with the current linux pthreads, and I cannot get more than
+ 2000 simulated processes with pthreads on my box. The UNIX98
+ contexts allow me to raise the limit to 25,000 simulated processes
+ on my laptop.\n\n
+ The <tt>--with-context</tt> option of the <tt>./configure</tt>
+ script allows you to choose between UNIX98 contextes
+ (<tt>--with-context=ucontext</tt>) and the pthread version
+ (<tt>--with-context=pthread</tt>). The default value is ucontext
+ when the script detect a working UNIX98 context implementation. On
+ Windows boxes, the provided value is discarded and an adapted
+ version is picked up.\n\n
+ We experienced some issues with contextes on some rare systems
+ (solaris 8 and lower or old alpha linuxes comes to mind). The main
+ problem is that the configure script detect the contextes as being
+ functional when it's not true. If you happen to use such a system,
+ switch manually to the pthread version, and provide us with a good
+ patch for the configure script so that it is done automatically ;)
+
+ - <b>Hundred thousands of simulated processes</b> (hard-core tricks)\n
+ As explained above, SimGrid can use UNIX98 contextes to represent
+ and handle the simulated processes. Thanks to this, the main
+ limitation to the number of simulated processes becomes the
+ available memory.\n\n
+ Here are some tricks I had to use in order to run a token ring
+ between 25,000 processes on my laptop (1Gb memory, 1.5Gb swap).\n
+ - First of all, make sure your code runs for a few hundreds
+ processes before trying to push the limit. Make sure it's
+ valgrind-clean, ie that valgrind does not report neither memory
+ error nor memory leaks. Indeed, numerous simulated processes
+ result in *fat* simulation hindering debugging.
+ - It was really boring to write 25,000 entries in the deployment
+ file, so I wrote a little script
+ <tt>examples/gras/mutual_exclusion/simple_token/make_deployment.pl</tt>, which you may
+ want to adapt to your case. You could also think about hijacking
+ the SURFXML parser (have look at \ref faq_flexml_bypassing).
+ - The deployment file became quite big, so I had to do what is in
+ the FAQ entry \ref faq_flexml_limit
+ - Each UNIX98 context has its own stack entry. As debugging this is
+ quite hairly, the default value is a bit overestimated so that
+ user don't get into trouble about this. You want to tune this
+ size to increse the number of processes. This is the
+ <tt>STACK_SIZE</tt> define in
+ <tt>src/xbt/xbt_context_sysv.c</tt>, which is 128kb by default.
+ Reduce this as much as you can, but be warned that if this value
+ is too low, you'll get a segfault. The token ring example, which
+ is quite simple, runs with 40kb stacks.
+ - You may tweak the logs to reduce the stack size further. When
+ logging something, we try to build the string to display in a
+ char array on the stack. The size of this array is constant (and
+ equal to XBT_LOG_BUFF_SIZE, defined in include/xbt/log/h). If the
+ string is too large to fit this buffer, we move to a dynamically
+ sized buffer. In which case, we have to traverse one time the log
+ event arguments to compute the size we need for the buffer,
+ malloc it, and traverse the argument list again to do the actual
+ job.\n
+ The idea here is to move XBT_LOG_BUFF_SIZE to 1, forcing the logs
+ to use a dynamic array each time. This allows us to lower further
+ the stack size at the price of some performance loss...\n
+ This allowed me to run the reduce the stack size to ... 4k. Ie,
+ on my 1Gb laptop, I can run more than 250,000 processes!
+
+\subsubsection faq_MIA_batch_scheduler Is there a native support for batch schedulers in SimGrid?
+
+No, there is no native support for batch schedulers and none is
+planned because this is a very specific need (and doing it in a
+generic way is thus very hard). However some people have implemented
+their own batch schedulers. Vincent Garonne wrote one during his PhD
+and put his code in the contrib directory of our SVN so that other can
+keep working on it. You may find inspiring ideas in it.
+
+\subsubsection faq_MIA_checkpointing I need a checkpointing thing
+
+Actually, it depends on whether you want to checkpoint the simulation, or to
+simulate checkpoints.
+
+The first one could help if your simulation is a long standing process you
+want to keep running even on hardware issues. It could also help to
+<i>rewind</i> the simulation by jumping sometimes on an old checkpoint to
+cancel recent calculations.\n
+Unfortunately, such thing will probably never exist in SG. One would have to
+duplicate all data structures because doing a rewind at the simulator level
+is very very hard (not talking about the malloc free operations that might
+have been done in between). Instead, you may be interested in the Libckpt
+library (http://www.cs.utk.edu/~plank/plank/www/libckpt.html). This is the
+checkpointing solution used in the condor project, for example. It makes it
+easy to create checkpoints (at the OS level, creating something like core
+files), and rerunning them on need.
+
+If you want to simulate checkpoints instead, it means that you want the
+state of an executing task (in particular, the progress made towards
+completion) to be saved somewhere. So if a host (and the task executing on
+it) fails (cf. #MSG_HOST_FAILURE), then the task can be restarted
+from the last checkpoint.\n
+
+Actually, such a thing does not exists in SimGrid either, but it's just
+because we don't think it is fundamental and it may be done in the user code
+at relatively low cost. You could for example use a watcher that
+periodically get the remaining amount of things to do (using
+MSG_task_get_remaining_computation()), or fragment the task in smaller
+subtasks.
+
+\subsection faq_platform Platform building and Dynamic resources
+
+\subsubsection faq_platform_example Where can I find SimGrid platform files?
+
+There is several little examples in the archive, in the examples/msg
+directory. From time to time, we are asked for other files, but we
+don't have much at hand right now.
+
+You should refer to the Platform Description Archive
+(http://pda.gforge.inria.fr) project to see the other platform file we
+have available, as well as the Simulacrum simulator, meant to generate
+SimGrid platforms using all classical generation algorithms.
+
+\subsubsection faq_platform_alnem How can I automatically map an existing platform?
+
+We are working on a project called ALNeM (Application-Level Network
+Mapper) which goal is to automatically discover the topology of an
+existing network. Its output will be a platform description file
+following the SimGrid syntax, so everybody will get the ability to map
+their own lab network (and contribute them to the catalog project).
+This tool is not ready yet, but it move quite fast forward. Just stay
+tuned.
+
+\subsubsection faq_platform_synthetic Generating synthetic but realistic platforms
+
+The third possibility to get a platform file (after manual or
+automatic mapping of real platforms) is to generate synthetic
+platforms. Getting a realistic result is not a trivial task, and
+moreover, nobody is really able to define what "realistic" means when
+speaking of topology files. You can find some more thoughts on this
+topic in these
+<a href="http://graal.ens-lyon.fr/~alegrand/articles/Simgrid-Introduction.pdf">slides</a>.
+
+If you are looking for an actual tool, there we have a little tool to
+annotate Tiers-generated topologies. This perl-script is in
+<tt>tools/platform_generation/</tt> directory of the SVN. Dinda et Al.
+released a very comparable tool, and called it GridG.
+
+\subsubsection faq_SURF_dynamic Expressing dynamic resource availability in platform files
+
+A nice feature of SimGrid is that it enables you to seamlessly have
+resources whose availability change over time. When you build a
+platform, you generally declare hosts like that:
+
+\verbatim
+ <host id="host A" power="100.00"/>
+\endverbatim
+
+If you want the availability of "host A" to change over time, the only
+thing you have to do is change this definition like that:
+
+\verbatim
+ <host id="host A" power="100.00" availability_file="trace_A.txt" state_file="trace_A_failure.txt"/>
+\endverbatim
+
+For hosts, availability files are expressed in fraction of available
+power. Let's have a look at what "trace_A.txt" may look like:
+
+\verbatim
+PERIODICITY 1.0
+0.0 1.0
+11.0 0.5
+20.0 0.9
+\endverbatim
+
+At time 0, our host will deliver 100 flop/s. At time 11.0, it will
+deliver only 50 flop/s until time 20.0 where it will will start
+delivering 90 flop/s. Last at time 21.0 (20.0 plus the periodicity
+1.0), we'll be back to the beginning and it will deliver 100 flop/s.
+
+Now let's look at the state file:
+\verbatim
+PERIODICITY 10.0
+1.0 -1.0
+2.0 1.0
+\endverbatim
+
+A negative value means "off" while a positive one means "on". At time
+1.0, the host is on. At time 1.0, it is turned off and at time 2.0, it
+is turned on again until time 12 (2.0 plus the periodicity 10.0). It
+will be turned on again at time 13.0 until time 23.0, and so on.
+
+Now, let's look how the same kind of thing can be done for network
+links. A usual declaration looks like:
+
+\verbatim
+ <link id="LinkA" bandwidth="10.0" latency="0.2"/>
+\endverbatim
+
+You have at your disposal the following options: bandwidth_file,
+latency_file and state_file. The only difference with hosts is that
+bandwidth_file and latency_file do not express fraction of available
+power but are expressed directly in bytes per seconds and seconds.
+
+\subsubsection faq_platform_multipath How to express multipath routing in platform files?
+
+It is unfortunately impossible to express the fact that there is more
+than one routing path between two given hosts. Let's consider the
+following platform file:
+
+\verbatim
+<route src="A" dst="B">
+ <link:ctn id="1"/>
+</route>
+<route src="B" dst="C">
+ <link:ctn id="2"/>
+</route>
+<route src="A" dst="C">
+ <link:ctn id="3"/>
+</route>
+\endverbatim
+
+Although it is perfectly valid, it does not mean that data traveling
+from A to C can either go directly (using link 3) or through B (using
+links 1 and 2). It simply means that the routing on the graph is not
+trivial, and that data do not following the shortest path in number of
+hops on this graph. Another way to say it is that there is no implicit
+in these routing descriptions. The system will only use the routes you
+declare (such as <route src="A" dst="C"><link:ctn
+id="3"/></route>), without trying to build new routes by aggregating
+the provided ones.
+
+You are also free to declare platform where the routing is not
+symmetric. For example, add the following to the previous file:
+
+\verbatim
+<route src="C" dst="A">
+ <link:ctn id="2"/>
+ <link:ctn id="1"/>
+</route>
+\endverbatim
+
+This makes sure that data from C to A go through B where data from A
+to C go directly. Don't worry about realism of such settings since
+we've seen ways more weird situation in real settings (in fact, that's
+the realism of very regular platforms which is questionable, but
+that's another story).
+
+\subsubsection faq_flexml_bypassing Bypassing the XML parser with your own C functions
+
+So you want to bypass the XML files parser, uh? Maybe doing some parameter
+sweep experiments on your simulations or so? This is possible, and
+it's not even really difficult (well. Such a brutal idea could be
+harder to implement). Here is how it goes.
+
+For this, you have to first remember that the XML parsing in SimGrid is done
+using a tool called FleXML. Given a DTD, this gives a flex-based parser. If
+you want to bypass the parser, you need to provide some code mimicking what
+it does and replacing it in its interactions with the SURF code. So, let's
+have a look at these interactions.
+
+FleXML parser are close to classical SAX parsers. It means that a
+well-formed SimGrid platform XML file might result in the following
+"events":
+
+ - start "platform_description" with attribute version="2"
+ - start "host" with attributes id="host1" power="1.0"
+ - end "host"
+ - start "host" with attributes id="host2" power="2.0"
+ - end "host"
+ - start "link" with ...
+ - end "link"
+ - start "route" with ...
+ - start "link:ctn" with ...
+ - end "link:ctn"
+ - end "route"
+ - end "platform_description"
+
+The communication from the parser to the SURF code uses two means:
+Attributes get copied into some global variables, and a surf-provided
+function gets called by the parser for each event. For example, the event
+ - start "host" with attributes id="host1" power="1.0"
+
+let the parser do something roughly equivalent to:
+\verbatim
+ strcpy(A_host_id,"host1");
+ A_host_power = 1.0;
+ STag_host();
+\endverbatim
+
+In SURF, we attach callbacks to the different events by initializing the
+pointer functions to some the right surf functions. Since there can be
+more than one callback attached to the same event (if more than one
+model is in use, for example), they are stored in a dynar. Example in
+workstation_ptask_L07.c:
+\verbatim
+ /* Adding callback functions */
+ surf_parse_reset_parser();
+ surfxml_add_callback(STag_surfxml_host_cb_list, &parse_cpu_init);
+ surfxml_add_callback(STag_surfxml_prop_cb_list, &parse_properties);
+ surfxml_add_callback(STag_surfxml_link_cb_list, &parse_link_init);
+ surfxml_add_callback(STag_surfxml_route_cb_list, &parse_route_set_endpoints);
+ surfxml_add_callback(ETag_surfxml_link_c_ctn_cb_list, &parse_route_elem);
+ surfxml_add_callback(ETag_surfxml_route_cb_list, &parse_route_set_route);
+
+ /* Parse the file */
+ surf_parse_open(file);
+ xbt_assert1((!surf_parse()), "Parse error in %s", file);
+ surf_parse_close();
+\endverbatim
+
+So, to bypass the FleXML parser, you need to write your own version of the
+surf_parse function, which should do the following:
+ - Fill the A_<tag>_<attribute> variables with the wanted values
+ - Call the corresponding STag_<tag>_fun function to simulate tag start
+ - Call the corresponding ETag_<tag>_fun function to simulate tag end
+ - (do the same for the next set of values, and loop)
+
+Then, tell SimGrid that you want to use your own "parser" instead of the stock one:
+\verbatim
+ surf_parse = surf_parse_bypass_environment;
+ MSG_create_environment(NULL);
+ surf_parse = surf_parse_bypass_application;
+ MSG_launch_application(NULL);
+\endverbatim
+
+A set of macros are provided at the end of
+include/surf/surfxml_parse.h to ease the writing of the bypass
+functions. An example of this trick is distributed in the file
+examples/msg/masterslave/masterslave_bypass.c
+
+\subsection faq_simgrid_configuration Changing SimGrid's behavior
+
+A number of options can be given at runtime to change the default
+SimGrid behavior. In particular, you can change the default cpu and
+network models...
+
+\subsubsection faq_simgrid_configuration_gtnets Using GTNetS
+
+It is possible to use a packet-level network simulator
+instead of the default flow-based simulation. You may want to use such
+an approach if you have doubts about the validity of the default model
+or if you want to perform some validation experiments. At the moment,
+we support the GTNetS simulator (it is still rather experimental
+though, so leave us a message if you play with it).
+
+
+<i>
+To enable GTNetS model inside SimGrid it is needed to patch the GTNetS simulator source code
+and build/install it from scratch
+</i>
+
+ - <b>Download and enter the recent downloaded GTNetS directory</b>
+
+ \verbatim
+ svn checkout svn://scm.gforge.inria.fr/svn/simgrid/contrib/trunk/GTNetS/
+ cd GTNetS
+ \endverbatim
+
+
+ - <b>Use the following commands to unzip and patch GTNetS package to work within SimGrid.</b>
+
+ \verbatim
+ unzip gtnets-current.zip
+ tar zxvf gtnets-current-patch.tgz
+ cd gtnets-current
+ cat ../00*.patch | patch -p1
+ \endverbatim
+
+ - <b>OPTIONALLY</b> you can use a patch for itanium 64bit processor family.
+
+ \verbatim
+ cat ../AMD64-FATAL-Removed-DUL_SIZE_DIFF-Added-fPIC-compillin.patch | patch -p1
+ \endverbatim
+
+ - <b>Compile GTNetS</b>
+
+ Due to portability issues it is possible that GTNetS does not compile in your architecture. The patches furnished in SimGrid SVN repository are intended for use in Linux architecture only. Unfortunately, we do not have the time, the money, neither the manpower to guarantee GTNetS portability. We advice you to use one of GTNetS communication channel to get more help in compiling GTNetS.
+
+
+ \verbatim
+ ln -sf Makefile.linux Makefile
+ make depend
+ make debug
+ \endverbatim
+