1 /*! \page outcomes_vizu Visualization and Statistical Analysis
SimGrid comes with extensive support for tracing and recording what
happens during the simulation, so that it can be either visualized or
statistically analyzed after the simulation.

This tracing is widely used to observe and understand the behavior of
parallel applications and distributed algorithms. Usually, this is
done in a two-step fashion: the user instruments the application and
the traces are analyzed after the end of the execution. The analysis
can highlight unexpected behaviors and bottlenecks, and can sometimes be
used to correct distributed algorithms. The SimGrid team has
instrumented the library in order to let users trace their simulations
and analyze them. This part of the user manual explains how the
tracing-related features can be enabled and used during the
development of simulators using the SimGrid library.
18 \section instr_category_functions Tracing categories functions
The SimGrid library is instrumented so users can trace the platform
utilization using the MSG, SimDAG and SMPI interfaces. It records how
much power is used for each host and how much bandwidth is used for
each link of the platform. The idea with this type of tracing is to
first get an overall view of resource utilization, especially to
identify bottlenecks and load-balancing issues among hosts.
Another possibility is to trace resource utilization by
categories. Categorized resource utilization tracing gives SimGrid
users the possibility to classify MSG and SimDAG tasks by category,
tracing resource utilization for each of the categories. The functions
below let the user declare a category and apply it to tasks. <em>Tasks
that are not assigned to a category are not
traced</em>. Even if the user does not specify any category, the
simulations can still be traced in terms of resource utilization by
using a special parameter that is detailed below (see section \ref
tracing_tracing_options).
39 \li \c TRACE_category(const char *category)
40 \li \c TRACE_category_with_color(const char *category, const char *color)
41 \li \c MSG_task_set_category(msg_task_t task, const char *category)
42 \li \c MSG_task_get_category(msg_task_t task)
43 \li \c SD_task_set_category(SD_task_t task, const char *category)
44 \li \c SD_task_get_category(SD_task_t task)
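
The color given to \c TRACE_category_with_color is passed as a string. The instrumentation example on this page uses values such as "1 0 0" and "0.3 1 0.4", which read as three space-separated red, green and blue components. The sketch below builds such a string in C. The helper name \c format_trace_color is hypothetical and not part of SimGrid, and the range check rests on the assumption that each component lies between 0 and 1.

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical helper (not part of SimGrid): writes "r g b" into out.
// Assumption: each color component must lie in [0, 1].
// Returns 0 on success, -1 on an out-of-range component or a too-small buffer.
int format_trace_color(double r, double g, double b, char *out, size_t out_size)
{
  assert(out != NULL);
  if (r < 0.0 || r > 1.0 || g < 0.0 || g > 1.0 || b < 0.0 || b > 1.0)
    return -1;
  // snprintf returns the length it would have written without truncation.
  int written = snprintf(out, out_size, "%g %g %g", r, g, b);
  return (written >= 0 && (size_t)written < out_size) ? 0 : -1;
}
```

The resulting buffer can then be handed to \c TRACE_category_with_color as its second argument.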
46 \section instr_mark_functions Tracing marks functions
47 \li \c TRACE_declare_mark(const char *mark_type)
48 \li \c TRACE_mark(const char *mark_type, const char *mark_value)
50 \section instr_uservariables_functions Tracing user variables functions
54 \li \c TRACE_host_variable_declare(const char *variable)
55 \li \c TRACE_host_variable_declare_with_color(const char *variable, const char *color)
56 \li \c TRACE_host_variable_set(const char *host, const char *variable, double value)
57 \li \c TRACE_host_variable_add(const char *host, const char *variable, double value)
58 \li \c TRACE_host_variable_sub(const char *host, const char *variable, double value)
59 \li \c TRACE_host_variable_set_with_time(double time, const char *host, const char *variable, double value)
60 \li \c TRACE_host_variable_add_with_time(double time, const char *host, const char *variable, double value)
61 \li \c TRACE_host_variable_sub_with_time(double time, const char *host, const char *variable, double value)
65 \li \c TRACE_link_variable_declare(const char *variable)
66 \li \c TRACE_link_variable_declare_with_color(const char *variable, const char *color)
67 \li \c TRACE_link_variable_set(const char *link, const char *variable, double value)
68 \li \c TRACE_link_variable_add(const char *link, const char *variable, double value)
69 \li \c TRACE_link_variable_sub(const char *link, const char *variable, double value)
70 \li \c TRACE_link_variable_set_with_time(double time, const char *link, const char *variable, double value)
71 \li \c TRACE_link_variable_add_with_time(double time, const char *link, const char *variable, double value)
72 \li \c TRACE_link_variable_sub_with_time(double time, const char *link, const char *variable, double value)
The same operations for links, using a source and a destination to look up the route:
76 \li \c TRACE_link_srcdst_variable_set(const char *src, const char *dst, const char *variable, double value)
77 \li \c TRACE_link_srcdst_variable_add(const char *src, const char *dst, const char *variable, double value)
78 \li \c TRACE_link_srcdst_variable_sub(const char *src, const char *dst, const char *variable, double value)
79 \li \c TRACE_link_srcdst_variable_set_with_time(double time, const char *src, const char *dst, const char *variable, double value)
80 \li \c TRACE_link_srcdst_variable_add_with_time(double time, const char *src, const char *dst, const char *variable, double value)
81 \li \c TRACE_link_srcdst_variable_sub_with_time(double time, const char *src, const char *dst, const char *variable, double value)
\section tracing_tracing_options Tracing Configuration Options
To check which tracing options are available for your simulator, you
can run it with the option \verbatim --help-tracing \endverbatim
to get a detailed and up-to-date explanation of each tracing
parameter. These are some of the options accepted by the tracing
system of SimGrid; you can use them by running your simulator with the
95 Safe switch. It activates (or deactivates) the tracing system.
96 No other tracing options take effect if this one is not activated.
104 It activates the categorized resource utilization tracing. It should
105 be enabled if tracing categories are used by this simulator.
107 --cfg=tracing/categorized:yes
111 tracing/uncategorized
It activates the uncategorized resource utilization tracing. Use it if
this simulator does not use tracing categories and resource use has to be
117 --cfg=tracing/uncategorized:yes
123 A file with this name will be created to register the simulation. The file
124 is in the Paje format and can be analyzed using Viva or Paje visualization
125 tools. More information can be found in these webpages:
126 <a href="http://github.com/schnorr/viva/">http://github.com/schnorr/viva/</a>
127 <a href="http://github.com/schnorr/pajeng/">http://github.com/schnorr/pajeng/</a>
129 --cfg=tracing/filename:mytracefile.trace
131 If you do not provide this parameter, the trace file will be named simgrid.trace.
This option only has effect if this simulator is SMPI-based. It traces the MPI
interface and generates a trace that can be analyzed using Gantt-like
visualizations. Every MPI function (implemented by SMPI) is transformed into a
state, and point-to-point communications can be analyzed with arrows.
141 --cfg=tracing/smpi:yes
147 This option only has effect if this simulator is SMPI-based. The processes
148 are grouped by the hosts where they were executed.
150 --cfg=tracing/smpi/group:yes
154 tracing/smpi/computing
This option only has effect if this simulator is SMPI-based. The parts external
to SMPI are also written to the trace. This provides a better way to analyze the data automatically.
159 --cfg=tracing/smpi/computing:yes
163 tracing/smpi/internals
165 This option only has effect if this simulator is SMPI-based. Display internal communications
166 happening during a collective MPI call.
168 --cfg=tracing/smpi/internals:yes
172 tracing/smpi/display-sizes
This option only has effect if this simulator is SMPI-based. It displays the sizes of the messages
exchanged in the trace, both on the links and on the states. For collectives, the size is the
total amount of data sent by the process.
177 --cfg=tracing/smpi/display-sizes:yes
181 tracing/smpi/sleeping
197 tracing/smpi/format/ti-one-file
215 This option only has effect if this simulator is MSG-based. It traces the
216 behavior of all categorized MSG processes, grouping them by hosts. This option
217 can be used to track process location if this simulator has process migration.
219 --cfg=tracing/msg/process:yes
This option puts some events in a time-ordered buffer using the
insertion sort algorithm. Acquiring and releasing the locks that
protect this buffer, together with the cost of the sorting algorithm,
makes this process slow. The simulator performance can be severely
impacted if this option is activated, but you are sure to get a trace
file with sorted events.
232 --cfg=tracing/buffer:yes
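
To illustrate why a time-ordered buffer filled by insertion sort can be costly, here is a minimal sketch of a sorted insert in C. It is not SimGrid's implementation: the type \c trace_event_t and the function \c buffer_insert_sorted are hypothetical names, the buffer has a fixed capacity, and locking is left out. Each insertion may shift every event already stored, so filling a buffer of n events costs on the order of n squared moves in the worst case.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical event record: only the timestamp matters for the ordering.
typedef struct {
  double time;
  int id;
} trace_event_t;

// Inserts ev into buf[0 .. *count - 1], which must already be sorted by time.
// Events with equal timestamps keep their insertion order.
// Returns 0 on success, -1 if the buffer is full.
int buffer_insert_sorted(trace_event_t *buf, size_t *count, size_t capacity, trace_event_t ev)
{
  assert(buf != NULL && count != NULL);
  if (*count >= capacity)
    return -1;
  size_t i = *count;
  // Shift later events one slot to the right (the insertion sort step).
  while (i > 0 && buf[i - 1].time > ev.time) {
    buf[i] = buf[i - 1];
    i--;
  }
  buf[i] = ev;
  (*count)++;
  return 0;
}
```

Events that arrive already in time order are appended without any shift, which is the cheap case; out-of-order events are the ones that pay for the shifting.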
This option changes the way SimGrid registers its platform in the trace
file. Normally, the tracing considers all routes (no matter their
size) in the platform file to re-create the resource topology. If this
option is activated, only the routes with one link are used to
register the topology within an AS. Routes among AS continue to be
245 --cfg=tracing/onelink-only:yes
257 tracing/disable-power
265 tracing/disable-destroy
267 Disable the destruction of containers at the end of simulation. This
268 can be used with simulators that have a different notion of time
269 (different from the simulated time).
271 --cfg=tracing/disable-destroy:yes
Some visualization tools are not able to correctly parse the Paje file format.
Use this option if you are using one of these tools to visualize the simulation
trace. Keep in mind that the trace might be incomplete, without all the
information that would be registered otherwise.
282 --cfg=tracing/basic:yes
288 Use this to add a comment line to the top of the trace file.
290 --cfg=tracing/comment:my_string
296 Use this to add the contents of a file to the top of the trace file as comment.
298 --cfg=tracing/comment-file:textual_file.txt
318 tracing/platform/topology
328 This option generates a graph configuration file for Viva considering
329 categorized resource utilization.
331 --cfg=viva/categorized:graph_categorized.plist
337 This option generates a graph configuration file for Viva considering
338 uncategorized resource utilization.
340 --cfg=viva/uncategorized:graph_uncategorized.plist
343 Please pass \verbatim --help-tracing \endverbatim to your simulator
344 for the updated list of tracing options.
346 \section tracing_tracing_example_parameters Case studies
348 Some scenarios that might help you decide which tracing options
349 you should use to analyze your simulator.
\li I want to trace the resource utilization of all hosts
and links of the platform, and my simulator <b>does not</b> use
the tracing API. For that, you can run an uncategorized trace
with the following parameters (it will work with <b>any</b> SimGrid
359 --cfg=tracing/uncategorized:yes \
360 --cfg=tracing/filename:mytracefile.trace \
361 --cfg=viva/uncategorized:uncat.plist
\li I want to trace only a subset of my MSG (or SimDAG) tasks.
For that, you will need to create tracing categories using the
<b>TRACE_category (...)</b> function (as explained above),
and then assign your tasks to a previously declared category
using <b>MSG_task_set_category (...)</b>
(or <b>SD_task_set_category (...)</b> for SimDAG tasks). After
recompiling, run your simulator with the following parameters:
374 --cfg=tracing/categorized:yes \
375 --cfg=tracing/filename:mytracefile.trace \
376 --cfg=viva/categorized:cat.plist
380 \section tracing_tracing_example Example of Instrumentation
A simplified example using the mandatory tracing functions.
int main (int argc, char **argv)
{
  MSG_init (&argc, &argv);

  //(... after deployment ...)

  //note that category declaration must be called after MSG_create_environment
  TRACE_category_with_color ("request", "1 0 0");
  TRACE_category_with_color ("computation", "0.3 1 0.4");
  TRACE_category ("finalize");

  //four request tasks, all traced under the "request" category
  msg_task_t req1 = MSG_task_create("1st_request_task", 10, 10, NULL);
  msg_task_t req2 = MSG_task_create("2nd_request_task", 10, 10, NULL);
  msg_task_t req3 = MSG_task_create("3rd_request_task", 10, 10, NULL);
  msg_task_t req4 = MSG_task_create("4th_request_task", 10, 10, NULL);
  MSG_task_set_category (req1, "request");
  MSG_task_set_category (req2, "request");
  MSG_task_set_category (req3, "request");
  MSG_task_set_category (req4, "request");

  msg_task_t comp = MSG_task_create ("comp_task", 100, 100, NULL);
  MSG_task_set_category (comp, "computation");

  msg_task_t finalize = MSG_task_create ("finalize", 0, 0, NULL);
  MSG_task_set_category (finalize, "finalize");

  //(... run the simulation ...)

  return 0;
}
418 \section tracing_tracing_analyzing Analyzing SimGrid Simulation Traces
A SimGrid-based simulator, when executed with the correct parameters
(see above), creates a trace file in the Paje file format holding the
simulated behavior of the application or the platform. You have
several options to analyze this trace file:
- Dump its contents to a CSV-like format using `pj_dump` (see <a
href="https://github.com/schnorr/pajeng/wiki/pj_dump">PajeNG's wiki
on pj_dump</a> and more generally the <a
href="https://github.com/schnorr/pajeng/">PajeNG suite</a>) and use
gnuplot to plot resource usage, time spent on blocking/executing
functions, and so on. You can filter the dump with `grep` and a
suitable regular expression to
get only parts of the trace (for instance, only a subset of
resources or processes).
435 - Derive statistics from trace metrics (the ones built-in with any
436 SimGrid simulation, but also those metrics you injected in the trace
437 using the TRACE module) using the <a
438 href="http://www.r-project.org/">R project</a> and all its
439 modules. You can also combine R with <a
440 href="http://ggplot2.org/">ggplot2</a> to get a number of high
441 quality plots from your simulation metrics. You need to `pj_dump`
442 the contents of the SimGrid trace file to use R.
444 - Visualize the behavior of your simulation using classic space/time
445 views (gantt-charts) provided by the <a
446 href="https://github.com/schnorr/pajeng/">PajeNG suite</a> and any
447 other tool that supports the <a
448 href="http://paje.sourceforge.net/download/publication/lang-paje.pdf">Paje
449 file format</a>. Consider this option if you need to understand the
450 causality of your distributed simulation.
- Visualize the behavior of your simulation with treemaps (especially
if your simulation has a platform with several thousand resources),
provided by the <a href="http://github.com/schnorr/viva/">Viva</a>
visualization tool. See <a
href="https://github.com/schnorr/viva/wiki">Viva's wiki</a> for
further details on what a treemap is and how to use it.
459 - Correlate the behavior of your simulator with the platform topology
460 with an interactive, force-directed, and hierarchical graph
461 visualization, provided by <a
462 href="http://github.com/schnorr/viva/">Viva</a>. Check <a
463 href="https://github.com/schnorr/viva/wiki">Viva's wiki</a> for
464 further details. This <a
465 href="http://hal.inria.fr/hal-00738321/">research report</a>,
466 published at ISPASS 2013, has a detailed description of this
467 visualization technique.
469 - You can also check our online <a
470 href="http://simgrid.gforge.inria.fr/tutorials.html"> tutorial
471 section</a> that contains a dedicated tutorial with several
472 suggestions on how to use the tracing infrastructure. Look for the
473 SimGrid User::Visualization 101 tutorial.
- Ask for help on the <a
href="mailto:simgrid-user@lists.gforge.inria.fr">simgrid-user@lists.gforge.inria.fr</a>
mailing list, giving us a detailed explanation of what your
simulator does and what kind of information you want to trace. You
can also check the <a
href="http://lists.gforge.inria.fr/pipermail/simgrid-user/">mailing
list archive</a> for old messages regarding tracing and analysis.
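
As a starting point for custom analysis of a `pj_dump` output, the following C sketch parses one line describing a variable. The column layout used here (the word Variable, then container, type, start, end, duration and value, separated by commas) is an assumption that should be checked against PajeNG's wiki; \c variable_record_t and \c parse_variable_line are hypothetical names introduced for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// One "Variable" record, under the assumed layout:
// Variable, <container>, <type>, <start>, <end>, <duration>, <value>
typedef struct {
  char container[64];
  char type[64];
  double start;
  double end;
  double duration;
  double value;
} variable_record_t;

// Returns 1 if line is a Variable record that was fully parsed, 0 otherwise.
// Note that rec may be partially overwritten even when 0 is returned.
int parse_variable_line(const char *line, variable_record_t *rec)
{
  char kind[16];
  assert(line != NULL && rec != NULL);
  int fields = sscanf(line, " %15[^,], %63[^,], %63[^,], %lf, %lf, %lf, %lf",
                      kind, rec->container, rec->type,
                      &rec->start, &rec->end, &rec->duration, &rec->value);
  return fields == 7 && strcmp(kind, "Variable") == 0;
}
```

Calling this function on every line of the dump and keeping only the records whose type matches a given variable gives the raw samples needed for plotting or statistics.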
483 \subsection tracing_viva_analysis Viva Visualization Tool
This subsection describes some of the concepts regarding the <a
href="http://github.com/schnorr/viva/">Viva Visualization Tool</a> and
its relation with SimGrid traces. You should refer to Viva's website
for further details on all its visualization techniques.
490 \subsubsection tracing_viva_time_slice Time Slice
The analysis of a trace file using the tool always takes into account
the concept of the <em>time-slice</em>. This concept means that what
is being visualized on the screen is always calculated considering a
specific time frame, with its beginning and end timestamps. The
time-slice is configured by the user and can be changed dynamically
through the window called <em>Time Interval</em> that is opened
whenever a trace file is being analyzed. Users can select
the beginning and size of the time slice.
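
One way to picture how a time slice reduces a traced variable to a single number is a time-weighted average over the slice. The C sketch below assumes that the variable is piecewise constant, each sample holding one value on an interval from start (included) to end (excluded). The names \c sample_t and \c time_slice_average are hypothetical, and the aggregation that Viva actually applies may differ.

```c
#include <assert.h>
#include <stddef.h>

// One piecewise-constant sample: the variable holds value on [start, end).
typedef struct {
  double start;
  double end;
  double value;
} sample_t;

// Time-weighted average of the variable over the slice
// [slice_start, slice_start + slice_size). Time not covered by any sample
// contributes zero to the average.
double time_slice_average(const sample_t *samples, size_t n, double slice_start, double slice_size)
{
  assert(samples != NULL || n == 0);
  assert(slice_size > 0.0);
  double slice_end = slice_start + slice_size;
  double integral = 0.0;
  for (size_t i = 0; i < n; i++) {
    // Clip the sample interval to the slice.
    double lo = samples[i].start > slice_start ? samples[i].start : slice_start;
    double hi = samples[i].end < slice_end ? samples[i].end : slice_end;
    if (hi > lo)
      integral += samples[i].value * (hi - lo);
  }
  return integral / slice_size;
}
```

For instance, with a value of 100 on [0, 2) and 50 on [2, 4), the slice starting at 1 with size 2 covers one time unit of each sample and averages to 75.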
501 \subsubsection tracing_viva_graph Hierarchical Graph View
503 %As stated above (see section \ref tracing_tracing_analyzing), one
504 possibility to analyze SimGrid traces is to use Viva's graph view with
505 a graph configuration to customize the graph according to the
506 traces. A valid graph configuration (we are using the non-XML <a
507 href="http://en.wikipedia.org/wiki/Property_list">Property List
508 Format</a> to describe the configuration) can be created for any
509 SimGrid-based simulator using the
510 <em>--cfg=viva/uncategorized:graph_uncategorized.plist</em> or
511 <em>--cfg=viva/categorized:graph_categorized.plist</em> (if the
512 simulator defines resource utilization categories) when executing the
515 \subsubsection basic_conf Basic Graph Configuration
517 The basic description of the configuration is as follows:
520 node = (LINK, HOST, );
521 edge = (HOST-LINK, LINK-HOST, LINK-LINK, );
The nodes of the graph will be created based on the <i>node</i>
parameter, which in this case lists the different <em>"HOST"</em>s and
<em>"LINK"</em>s of the simulated platform. The <i>edge</i>
parameter indicates that the edges of the graph will be created based
on the <em>"HOST-LINK"</em>s, <em>"LINK-HOST"</em>s, and
<em>"LINK-LINK"</em>s of the platform. After the definition of these
two parameters, the configuration must detail how the nodes
(<em>HOST</em>s and <em>LINK</em>s) should be drawn.
533 For that, the configuration must have an entry for each of
534 the types used. For <em>HOST</em>, as basic configuration, we have:
540 values = (power_used);
544 The parameter <em>size</em> indicates which variable from the trace
545 file will be used to define the size of the node HOST in the
546 visualization. If the simulation was executed with availability
547 traces, the size of the nodes will be changed according to these
548 traces. The parameter <em>type</em> indicates which geometrical shape
549 will be used to represent HOST, and the <em>values</em> parameter
550 indicates which values from the trace will be used to fill the shape.
552 For <em>LINK</em> we have:
558 values = (bandwidth_used);
The same configuration parameters are used here: <em>type</em> (with a
rhombus), <em>size</em> (whose value is taken from the trace's bandwidth
variable) and <em>values</em>.
567 \subsubsection custom_graph Customizing the Graph Representation
Viva can handle a customized graph representation based on
the variables present in the trace file. In the case of SimGrid, every
time a category is created for tasks, two variables in the trace file
are defined: one to indicate node utilization (how much power was used
by that task category), and another to indicate link utilization (how
much bandwidth was used by that category). For instance, if the user
declares a category named <i>request</i>, there will be variables
named <b>p</b><i>request</i> and <b>b</b><i>request</i> (<b>p</b>
for power and <b>b</b> for bandwidth). It is important to notice that
the variable <i>prequest</i> in this case is only available for HOST,
and <i>brequest</i> is only available for LINK. <b>Example</b>:
suppose there are two categories for tasks: request and computation. To
create a customized graph representation with a proportional
separation of host and link utilization, use as configuration for HOST
589 values = (prequest, pcomputation);
594 values = (brequest, bcomputation);
598 This configuration enables the analysis of resource utilization by MSG
599 tasks through the identification of load-balancing issues and network
600 bottlenecks, for instance.
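
The naming rule for category variables described above (a <b>p</b> or <b>b</b> prefix followed by the category name) can be sketched in C as follows, for instance to generate the <em>values</em> lists of a graph configuration. The function \c category_variable_name is a hypothetical helper written for this illustration and is not part of SimGrid or Viva.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper: writes "<prefix><category>" into out, where prefix is
// 'p' (power, available for HOST) or 'b' (bandwidth, available for LINK).
// Returns 0 on success, -1 if out is too small.
int category_variable_name(char prefix, const char *category, char *out, size_t out_size)
{
  assert(prefix == 'p' || prefix == 'b');
  assert(category != NULL && out != NULL);
  // One byte for the prefix and one for the terminating NUL.
  if (strlen(category) + 2 > out_size)
    return -1;
  snprintf(out, out_size, "%c%s", prefix, category);
  return 0;
}
```

With the categories request and computation, this yields prequest and pcomputation for HOST, and brequest and bcomputation for LINK.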