X-Git-Url: http://info.iut-bm.univ-fcomte.fr/pub/gitweb/simgrid.git/blobdiff_plain/0c13871d73e933c1847faf8debea7b7745a3ff44..c80ea3788d692dc92728cefdd199e83a93a6fa25:/doc/doxygen/introduction.doc
diff --git a/doc/doxygen/introduction.doc b/doc/doxygen/introduction.doc
index 62d52bdbae..5d7bfe5d6a 100644
--- a/doc/doxygen/introduction.doc
+++ b/doc/doxygen/introduction.doc
@@ -1,12 +1,7 @@
-/*! @page introduction Introduction to SimGrid
+/*! @page introduction Introduction to SimGrid
-This page does not really exist yet. In the meanwhile, please refer
-to the tutorials
-on the project web page, looking for the
-SimGrid
-101 tutorial.
-SimGrid is a toolkit
+[SimGrid](http://simgrid.gforge.inria.fr/) is a toolkit
 that provides core functionalities for the simulation of distributed
 applications in heterogeneous distributed environments.
@@ -15,4 +10,475 @@
 distributed and parallel application scheduling on distributed computing
 platforms ranging from simple network of workstations to Computational
 Grids.
-*/
\ No newline at end of file
+\tableofcontents
+
+\section Scenario
+The goal of this practical session is to illustrate various uses of
+the MSG interface. To this end, we will use the following simple setting:
+
+> Assume we have a (possibly large) bunch of (possibly large) data to
+> process, which originally reside on a server (a.k.a. master). For the
+> sake of simplicity, we assume all input files require the same amount
+> of computation. We assume the server can be helped by a (possibly
+> large) set of worker machines. What is the best way to organize the
+> computations?
+
+Although this looks like a very simple setting, it raises several
+interesting questions:
+
+- Which algorithm should the master use to send workload?
+
+  The most obvious algorithm would be to send tasks to workers in a
+  round-robin fashion. This is the initial code we provide you.
+
+  A less obvious but probably more efficient approach would be to set up
+  a request mechanism where a client first asks for tasks, which allows
+  the server to decide which request to answer and possibly to send
+  the tasks to the fastest machines. Maybe you can think of a
+  smarter mechanism...
+
+- How many tasks should the client ask for?
+
+  Indeed, if we set up a request mechanism so that workers only
+  send a request whenever they have no more tasks to process, they are
+  likely to be poorly utilized since they will have to wait for the
+  master to consider their request and for the input data to be
+  transferred. A client should thus probably request a pool of tasks,
+  but if it requests too many tasks, it is likely to lead to poor
+  load balancing...
+
+- How does the quality of such an algorithm depend on the platform
+  characteristics and on the task characteristics?
+
+  Whenever the input communication time is very small compared to
+  processing time and workers are homogeneous, it is likely that the
+  round-robin algorithm performs very well. Would it still hold true
+  when transfer time is not negligible and the platform is, say,
+  a volunteer computing system?
+
+- The network topology interconnecting the master and the workers
+  may be quite complicated. How does such a topology impact the
+  previous result?
+
+  When data transfers are the bottleneck, it is likely that a good
+  model of the platform becomes essential. In this case, you may
+  want to be able to account for complex platform topologies.
+
+- Do the algorithms depend on a perfect knowledge of this
+  topology?
+
+  Should we still use a flat master worker deployment or should we
+  use a
+
+- How sensitive is such an algorithm to external workload variation?
+
+  What if bandwidth, latency and power can vary with no warning?
+  Shouldn't you study whether your algorithm is sensitive to such
+  load variations?
+
+- Although an algorithm may be more efficient than another, how
+  does it interfere with other applications?
+
+  %As you can see, this very simple setting may need to evolve way
+  beyond what you initially imagined.
+
+  > Premature optimization is the root of all evil. -- D.E.Knuth
+
+  Furthermore, writing your own simulator is much harder than you
+  may imagine. This is why you should rely on an established and flexible
+  one.
+
+The following figure is a screenshot of [triva][fn:1] visualizing a [SimGrid
+simulation][fn:2] of two master worker applications (one in light gray and
+the other in dark gray) running concurrently and showing resource
+usage over a long period of time.
+
+![Test](./sc3-description.png)
+
+\section Prerequisites
+
+Of course, you need to install SimGrid before taking this tutorial.
+Please refer to the relevant section: \ref install.
+
+## Tutorials
+
+A lot of information on how to install and use SimGrid is
+provided by the [online documentation][fn:4] and by several tutorials:
+
+- http://simgrid.gforge.inria.fr/tutorials/simgrid-use-101.pdf
+- http://simgrid.gforge.inria.fr/tutorials/simgrid-tracing-101.pdf
+- http://simgrid.gforge.inria.fr/tutorials/simgrid-platf-101.pdf
+
+\section intro_recommendation Recommended Steps
+
+## Installing Viva
+
+This [software][fn:1] will be useful to make fancy graph or treemap
+visualizations and get a better understanding of simulations. You
+will first need to install pajeng:
+
+~~~~{.sh}
+sudo apt-get install git cmake build-essential libqt4-dev libboost-dev freeglut3-dev
+git clone https://github.com/schnorr/pajeng.git
+cd pajeng && mkdir -p build && cd build && cmake ../ -DCMAKE_INSTALL_PREFIX=$HOME && make -j install
+cd ../../
+~~~~
+
+Then you can install viva.
+
+~~~~{.sh}
+sudo apt-get install libboost-dev libconfig++-dev libconfig8-dev libgtk2.0-dev freeglut3-dev
+git clone https://github.com/schnorr/viva.git
+cd viva && mkdir -p build_graph && cd build_graph && cmake ../ -DTUPI_LIBRARY=ON -DVIVA=ON -DCMAKE_INSTALL_PREFIX=$HOME && make -j install
+cd ../../
+~~~~
+
+## Installing Paje
+
+This [software][fn:5] provides a Gantt-chart visualization.
+
+~~~~{.sh}
+sudo apt-get install paje.app
+~~~~
+
+## Installing Vite
+
+This software provides a [Gantt-chart visualization][fn:6].
+
+~~~~{.sh}
+sudo apt-get install vite
+~~~~
+
+\section intro_start Let's get started
+
+\anchor intro_setup
+## Setting up and Compiling
+
+The corresponding archive with all source files and platform files
+can be obtained [here](http://simgrid.gforge.inria.fr/tutorials/msg-tuto/msg-tuto.tgz).
+
+~~~~{.sh}
+tar zxf msg-tuto.tgz
+cd msg-tuto/src
+make
+~~~~
+
+%As you can see, there is already a nice Makefile that compiles
+everything for you. Now the tiny example has been compiled and it
+can be easily run as follows:
+
+~~~~{.sh}
+./masterworker0 platforms/platform.xml deployment0.xml 2>&1
+~~~~
+
+If you create a single self-contained C file named foo.c, the
+corresponding program can be compiled and linked with
+SimGrid simply by typing:
+
+~~~~{.sh}
+make foo
+~~~~
+
+For a more "fancy" output, you can try:
+
+~~~~{.sh}
+./masterworker0 platforms/platform.xml deployment0.xml 2>&1 | simgrid-colorizer
+~~~~
+
+For a really fancy output, you should use [viva/triva][fn:1]:
+
+~~~~{.sh}
+./masterworker0 platforms/platform.xml deployment0.xml --cfg=tracing:yes \
+    --cfg=tracing/uncategorized:yes --cfg=viva/uncategorized:uncat.plist
+LANG=C ; viva simgrid.trace uncat.plist
+~~~~
+
+For a more classical Gantt-chart visualization, you can produce a
+[Paje][fn:5] trace:
+
+~~~~{.sh}
+./masterworker0 platforms/platform.xml deployment0.xml --cfg=tracing:yes \
+    --cfg=tracing/msg/process:yes
+LANG=C ; Paje simgrid.trace
+~~~~
+
+Alternatively, you can use [vite][fn:6].
+
+~~~~{.sh}
+./masterworker0 platforms/platform.xml deployment0.xml --cfg=tracing:yes \
+    --cfg=tracing/msg/process:yes --cfg=tracing/basic:yes
+vite simgrid.trace
+~~~~
+
+## Getting Rid of Workers in the Deployment File
+
+In the previous example, the deployment file `deployment0.xml`
+is tightly connected to the platform file `platform.xml` and a
+worker process is launched on each host:
+
+~~~~{.xml}
+
+
+