+and we will do it when we have a little more time. We have
+tried to document the examples so that they are understandable. Tell
+us if something is not clear and once again feel free to participate!
+:)
+
+\subsection faq_MIA_taskdup Missing in action: Task duplication/replication
+
+There is no task duplication in MSG. When you create a task, you can
+process it or send it somewhere else. As soon as a process has sent
+a task, it no longer holds it: the task is gone and now belongs to the
+receiver process. However, you could decide upon reception to create
+a "copy" of a task, but you have to handle the semantics of this
+"duplication" yourself.
+
+As we already said, we prefer keeping the API as simple as
+possible. This kind of feature is rather easy for users to implement,
+and the semantics you associate with it really depends on your needs.
+Having a *generic* task duplication mechanism is not that trivial (in
+particular because of the data field). That is why I would recommend
+that you write it yourself, even if I can give you advice on how to
+do it.
+
+You have the following functions to get information about a task:
+MSG_task_get_name(), MSG_task_get_compute_duration(),
+MSG_task_get_remaining_computation(), MSG_task_get_data_size(),
+and MSG_task_get_data().
+
+You could use a dictionary (#xbt_dict_t) of dynars (#xbt_dynar_t). If
+you still don't see how to do it, please come back to us...
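+
+To make the "write it yourself" advice more concrete, here is a minimal
+sketch of one possible duplication helper. It is plain C and does not call
+MSG: sketch_task_t, sketch_copy_fn, sketch_task_copy() and
+sketch_copy_double() are hypothetical names, and the structure merely stands
+in for the values returned by the getters listed above. The point is that
+the caller supplies the policy for copying the data field, since only the
+user knows what the payload really is.
+

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the values exposed by the MSG task getters. */
typedef struct {
  char *name;
  double compute_duration;
  double data_size;
  void *data;               /* user payload, opaque to the copy helper */
} sketch_task_t;

/* User-supplied policy deciding what "copying the payload" means. */
typedef void *(*sketch_copy_fn)(const void *data);

static char *sketch_strdup(const char *s)
{
  size_t n = strlen(s) + 1;
  char *copy = malloc(n);
  if (copy != NULL)
    memcpy(copy, s, n);
  return copy;
}

/* Build a fresh task with the same description as src.
 * If copy_data is NULL, the payload pointer is shared, not duplicated.
 * Returns NULL if memory allocation fails. */
sketch_task_t *sketch_task_copy(const sketch_task_t *src, sketch_copy_fn copy_data)
{
  assert(src != NULL && src->name != NULL);
  sketch_task_t *dst = malloc(sizeof *dst);
  if (dst == NULL)
    return NULL;
  dst->name = sketch_strdup(src->name);
  if (dst->name == NULL) {
    free(dst);
    return NULL;
  }
  dst->compute_duration = src->compute_duration;
  dst->data_size = src->data_size;
  dst->data = copy_data ? copy_data(src->data) : src->data;
  return dst;
}

/* Example policy: the payload is a single double (e.g. a timestamp). */
void *sketch_copy_double(const void *data)
{
  double *copy = malloc(sizeof *copy);
  if (copy != NULL)
    *copy = *(const double *)data;
  return copy;
}
```

+
+Passing NULL as the policy shares the payload between both tasks, which is
+yet another possible meaning of "duplication". That is exactly why a single
+generic mechanism is hard to provide.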
+
+\subsection faq_MIA_asynchronous I want to do asynchronous communications in MSG
+
+Up until now, there are no asynchronous communications in MSG. However,
+you can create as many processes as you want, so you should be able to do
+whatever you want... I've written a queue module to help implement
+some asynchronous communications at low cost (creating thousands of
+processes only to handle communications may be problematic in terms of
+performance at some point). I'll add it to the distribution asap.
+
+\subsection faq_MIA_thread_synchronization I need to synchronize my MSG processes
+
+You obviously cannot use pthread_mutexes or pthread_conds. The best
+thing would be to propose similar structures. Unfortunately, we
+haven't found time to do it yet. However, you can try to play with
+MSG_process_suspend() and MSG_process_resume(). You can even do some
+synchronization with fake communications (using MSG_task_get(),
+MSG_task_put() and MSG_task_Iprobe()).
+
+\subsection faq_MIA_host_load Where is the get_host_load function hidden in MSG?
+
+There is no such thing because its semantics wouldn't be really
+clear. Of course, it is something about the amount of host throughput,
+but there are as many definitions of "host load" as people asking for
+this function. First, you have to remember that resource availability
+may vary over time, which makes any notion of load harder to define.
+
+It may be an instantaneous value or an average one. Moreover it may be only the
+power of the computer, or may take the background load into account, or may
+even take the currently running tasks into account. In some SURF models,
+communications have an influence on computational power. Should it be taken
+into account too?
+
+First of all, it's nearly impossible to predict the load beforehand in the
+simulator since it depends on too many parameters (background load
+variation, bandwidth sharing algorithmic complexity), some of them not even
+being known beforehand (other tasks starting at the same time). So, getting
+this information is really hard (just like in real life). It's not that
+we want MSG to be as painful as real life. But as it is in some way
+realistic, we face some of the same problems as we would face in real life.
+
+How would you do it for real? The most common option is to use something
+like NWS that performs active probes. The best solution is probably to do
+the same within MSG, as in the following code snippet. It is very close to
+what you would have to do outside of the simulator, and thus gives you
+information that you could also get in a real setting, so it does not hinder
+the realism of your simulation.
+
+\verbatim
+double get_host_load() {
+ m_task_t task = MSG_task_create("test", 0.001, 0, NULL);
+ double date = MSG_get_clock();
+
+ MSG_task_execute(task);
+ date = MSG_get_clock() - date;
+ MSG_task_destroy(task);
+ return (0.001/date);
+}
+\endverbatim
+
+Of course, it may not match your personal definition of "host load". In this
+case, please detail what you mean on the mailing list, and we will extend
+this FAQ section to fit your taste if possible.
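+
+If you are after an average rather than an instantaneous value, one possible
+approach is to smooth successive probe results yourself. The sketch below is
+plain C and only an illustration: load_estimator_t, load_estimator_init()
+and load_estimator_update() are hypothetical names, the exponential
+weighting is just one choice among many, and the probe values are assumed to
+come from something like the get_host_load() function above.
+

```c
#include <assert.h>

/* Hypothetical exponentially weighted average of probe results. */
typedef struct {
  double alpha;      /* weight of the newest probe, 0 < alpha <= 1 */
  double estimate;   /* current smoothed value */
  int has_sample;    /* 0 until the first probe has been recorded */
} load_estimator_t;

void load_estimator_init(load_estimator_t *e, double alpha)
{
  assert(e != 0);
  assert(alpha > 0.0 && alpha <= 1.0);
  e->alpha = alpha;
  e->estimate = 0.0;
  e->has_sample = 0;
}

/* Feed one probe result and return the updated smoothed estimate. */
double load_estimator_update(load_estimator_t *e, double probe)
{
  if (!e->has_sample) {
    /* The first probe initializes the estimate directly. */
    e->estimate = probe;
    e->has_sample = 1;
  } else {
    e->estimate = e->alpha * probe + (1.0 - e->alpha) * e->estimate;
  }
  return e->estimate;
}
```

+
+With alpha set to 1 the estimator degenerates to the instantaneous value; a
+smaller alpha gives more weight to the history of past probes.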
+
+\subsection faq_MIA_communication_time How can I get the *real* communication time?
+
+Communications are synchronous, so if you simply get the time
+before and after a communication, you will not get only the
+transmission time: the measurement also includes the time spent
+waiting for the other party to be ready. However, getting the
+*real* communication time is not really hard either. The following
+solution is a good starting point.
+
+\verbatim
+int sender()
+{
+ m_task_t task = MSG_task_create("Task", task_comp_size, task_comm_size,
+ calloc(1,sizeof(double)));
+ *((double*) task->data) = MSG_get_clock();
+ MSG_task_put(task, slaves[i % slaves_count], PORT_22);
+ INFO0("Send completed");
+ return 0;
+}
+int receiver()
+{
+ m_task_t task = NULL;
+ double time1,time2;
+
+ time1 = MSG_get_clock();
+ MSG_task_get(&(task), PORT_22);
+ time2 = MSG_get_clock();
+ if(time1<*((double *)task->data))
+ time1 = *((double *) task->data);
+ INFO1("Communication time : \"%f\" ", time2-time1);
+ free(task->data);
+ MSG_task_destroy(task);
+ return 0;
+}
+\endverbatim
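+
+The arithmetic done by the receiver can be isolated in a small helper. This
+is only a sketch in plain C with a hypothetical name (comm_time()): the
+transfer is assumed to start once both parties are ready, that is, at the
+later of the date stored by the sender and the date at which the receiver
+started waiting.
+

```c
#include <assert.h>

/* Time really spent communicating.
 * send_date:       date stored in the task by the sender before sending
 * recv_ready_date: date at which the receiver started waiting
 * recv_done_date:  date at which the receiver got the task */
double comm_time(double send_date, double recv_ready_date, double recv_done_date)
{
  /* The transfer cannot start before both sides are ready. */
  double start = (send_date > recv_ready_date) ? send_date : recv_ready_date;
  assert(recv_done_date >= start);
  return recv_done_date - start;
}
```

+
+For instance, a sender ready at date 2.0 and a receiver ready at date 5.0
+with a reception completed at date 6.5 give a communication time of 1.5: the
+3.0 time units during which the sender waited are not counted.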
+
+\subsection faq_MIA_batch_scheduler Is there a native support for batch schedulers in SimGrid?
+
+No, there is no native support for batch schedulers and none is
+planned because this is a very specific need (and doing it in a
+generic way is thus very hard). However some people have implemented
+their own batch schedulers. Vincent Garonne wrote one during his PhD
+and put his code in the contrib directory of our CVS so that others can
+keep working on it. You may find inspiring ideas in it.
+
+\subsection faq_MIA_checkpointing I need a checkpointing thing
+
+Actually, it depends on whether you want to checkpoint the simulation, or to
+simulate checkpoints.
+
+The first one could help if your simulation is a long-running process you
+want to keep running even through hardware issues. It could also help to
+<i>rewind</i> the simulation by occasionally jumping back to an old
+checkpoint to cancel recent calculations.\n
+Unfortunately, such a thing will probably never exist in SG. One would have to
+duplicate all data structures because doing a rewind at the simulator level
+is very hard (not to mention the malloc/free operations that might
+have been done in between). Instead, you may be interested in the Libckpt
+library (http://www.cs.utk.edu/~plank/plank/www/libckpt.html). This is the
+checkpointing solution used in the Condor project, for example. It makes it
+easy to create checkpoints (at the OS level, creating something like core
+files) and to rerun them when needed.
+
+If you want to simulate checkpoints instead, it means that you want the
+state of an executing task (in particular, the progress made towards
+completion) to be saved somewhere. So if a host (and the task executing on
+it) fails (cf. #MSG_HOST_FAILURE), then the task can be restarted
+from the last checkpoint.\n
+
+Actually, such a thing does not exist in SimGrid either, but it's just
+because we don't think it is fundamental and it may be done in the user code
+at relatively low cost. You could for example use a watcher that
+periodically gets the remaining amount of work to do (using
+MSG_task_get_remaining_computation()), or fragment the task into smaller
+subtasks.
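+
+As an illustration of the bookkeeping involved, here is a minimal sketch in
+plain C. The function name remaining_after_restart() and the idea of a fixed
+checkpoint interval expressed as an amount of computation are assumptions
+made for the example. The remaining amount at failure time is what a watcher
+would have read last with MSG_task_get_remaining_computation().
+

```c
#include <assert.h>

/* Amount of work still to do when a task restarts from its last checkpoint.
 * total:                full amount of computation of the task
 * interval:             amount of computation between two checkpoints
 * remaining_at_failure: amount left to do when the host failed */
double remaining_after_restart(double total, double interval,
                               double remaining_at_failure)
{
  assert(total >= 0.0 && interval > 0.0);
  assert(remaining_at_failure >= 0.0 && remaining_at_failure <= total);
  if (remaining_at_failure == 0.0)
    return 0.0;                                  /* the task had completed */
  double done = total - remaining_at_failure;
  long checkpoints = (long)(done / interval);    /* checkpoints fully reached */
  /* Everything done since the last checkpoint is lost. */
  return total - (double)checkpoints * interval;
}
```

+
+The returned value could then be used as the amount of computation of the
+new task created to resume the work, which amounts to fragmenting the
+original task into subtasks of size interval.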
+
+\section faq_SG Where has SG disappeared?!?
+
+OK, it's time to explain what's happening to the SimGrid project. Let's
+start with a little bit of history.
+
+* Historically, SimGrid was a low-level toolkit for scheduling with
+classical models such as DAGs. That was SimGrid v.1.* aka SG, written
+by Henri Casanova. I (Arnaud) had been using it in its earliest
+versions during an internship at UCSD.
+
+Then we realized that encoding distributed algorithms in SG was a
+real pain.
+
+* So we built MSG on top of SG and released SimGrid v.2.*. MSG
+offered a very basic API to encode a distributed application easily.
+However, encoding MSG on top of SG was not really convenient and did not
+use the DAG part, since the control of the task synchronization was done
+on top of MSG and no longer in SG. We played a little bit with
+MSG and realized that:
+
+ \li 1) the platform modeling was quite flexible and could be "almost"
+ automated (e.g. using random generators and post-annotations);
+
+ \li 2) SG was the bottleneck because of the way we were using
+ it. We needed to simulate concurrent transfers and complex load
+ sharing mechanisms. Many optimizations (e.g. trace integration)
+ were totally inefficient when combined with MSG and made extending SG
+ to implement new sharing policies, parallel task models, or failures
+ (many people were asking for this kind of feature) a real pain;
+
+ \li 3) the application modeling was not really easy. Even though the
+ application modeling depends on people's applications, we thought
+ we could improve things here. One of our targets here was realistic
+ distributed applications ranging from computer sensor networks like
+ the NWS to peer-to-peer applications.
+
+* So we have been planning mainly two things for SimGrid 3:
+
+ \li 1) I proposed to get rid of SG and to re-implement a new kernel
+ that would be faster and more flexible. That is what I did at the
+end of 2004: SURF. SURF is based on a fast max-min linear solver
+using O(1) data-structures. I quickly replaced SG by SURF in
+MSG and the result was that on the MSG example, the new
+version was more than 10 times faster while we had gained a lot of
+flexibility. I think I could still easily make SURF faster but I
+have to work on MSG now (e.g. using some of the O(1)
+data-structures I've been using to build SURF) since it has become
+the bottleneck. Some MSG functions have been removed from the API
+but they were mainly intended to build the platform by hand (they
+had appeared in the earliest versions of MSG) and were therefore
+not useful anymore since we are providing a complete mechanism to
+automatically build the platform and deploy the agents on it.
+
+ \li 2) GRAS is a new project Martin and I have come up with. The idea is
+to have a programming environment that lets you program real
+distributed applications while keeping the ability to run them in
+the simulator without having to change a single line of your
+code. From the simulation point of view, GRAS performs the
+application modeling automatically... Up until now, GRAS works on
+top of MSG for historical reasons, but I'm going to make it work
+directly on top of SURF so that it can use all the flexibility and the
+speed provided by SURF.
+
+Those two things are working, but we want to make everything as clean as
+possible before releasing SimGrid v.3.
+
+So what about those nice DAGs we used to have in SimGrid v.1.*? They are
+no longer in SimGrid v.3. At least not in their original form... Let
+me remind you how SimGrid 3 is organized:
+
+\verbatim
+________________
+| User code |
+|______________|
+| | MSG | GRAS |
+| -------------|
+| | SURF |
+| -------------|
+| XBT |
+----------------
+\endverbatim
+
+XBT is our tool box and now, you should have an idea of what the other
+ones are. As you can see, the primitive SG is not here
+anymore. However we have written a brand new and cleaner API for this
+purpose: \ref SD_API. It is built directly on top of SURF and provides
+an API rather close to the old SG:
+
+\verbatim
+______________________
+| User code |
+|____________________|
+| | MSG | GRAS | SD |
+| -------------------|
+| | SURF |
+| -------------------|
+| XBT |
+----------------------
+\endverbatim
+
+The nice thing is that, as it is written on top of SURF, it seamlessly
+supports DAGs of parallel tasks as well as complex communication
+patterns. Some old codes using SG are currently being rewritten using
+\ref SD_API to check that all the necessary functions are provided.
+
+\subsection faq_SG_DAG How to implement a distributed dynamic scheduler of DAGs.
+
+Distribution is somewhat "contagious". If you start making distributed
+decisions, there is no way to handle DAGs directly anymore (unless I
+am missing something). You have to encode your DAGs in terms of
+communicating processes to make the whole scheduling process
+distributed. Here is an example of how you could do that. Assume T1
+has to be done before T2.
+
+\verbatim
+ int your_agent(int argc, char *argv[]) {
+ ...
+ T1 = MSG_task_create(...);
+ T2 = MSG_task_create(...);
+ ...
+ while(1) {
+ ...
+ if(cond) MSG_task_execute(T1);
+ ...
+ if((MSG_task_get_remaining_computation(T1)==0.0) && (you_re_in_a_good_mood))
+ MSG_task_execute(T2);
+ else {
+ /* do something else */
+ }
+ }
+ }
+\endverbatim
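+
+With more than two tasks, the same test generalizes to "all predecessors
+have nothing left to compute". Here is a minimal sketch of that check in
+plain C. The dag_task_t structure and dag_task_is_ready() are hypothetical
+names, and the remaining field stands in for the value you would read with
+MSG_task_get_remaining_computation().
+

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-agent view of a task and of its predecessors. */
typedef struct dag_task {
  double remaining;               /* amount of computation left to do */
  const struct dag_task **deps;   /* tasks that must complete first */
  size_t dep_count;
} dag_task_t;

/* A task may run once every predecessor is done and it is not done itself. */
int dag_task_is_ready(const dag_task_t *t)
{
  assert(t != NULL);
  for (size_t i = 0; i < t->dep_count; i++) {
    if (t->deps[i]->remaining > 0.0)
      return 0;
  }
  return t->remaining > 0.0;
}
```

+
+Each agent would evaluate such a predicate in its main loop before calling
+MSG_task_execute(). In a truly distributed setting, the completion of a
+predecessor has to be learned through messages rather than read locally.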
+
+If you decide that the distributed part is not that important and that
+DAG is really the level of abstraction you want to work with, then you should
+give \ref SD_API a try.
+
+\section faq_dynamic Dynamic resources and platform building