X-Git-Url: http://info.iut-bm.univ-fcomte.fr/pub/gitweb/simgrid.git/blobdiff_plain/b3f448e0ccc0e5e0195bfba380b1ba3a5c5b10b6..153b16c65aabb7e3e103a83e072460cdba64fad1:/doc/gtut-tour-09-simpledata.doc diff --git a/doc/gtut-tour-09-simpledata.doc b/doc/gtut-tour-09-simpledata.doc index d5482a0e31..aea9829e90 100644 --- a/doc/gtut-tour-09-simpledata.doc +++ b/doc/gtut-tour-09-simpledata.doc @@ -1,26 +1,265 @@ /** -@page GRAS_tut_tour_simpledata Lesson 9: Exchanging simple data (TODO) +@page GRAS_tut_tour_simpledata Lesson 9: Exchanging simple data \section GRAS_tut_tour_simpledata_toc Table of Contents - \ref GRAS_tut_tour_simpledata_intro - - \ref GRAS_tut_tour_simpledata_use + - \ref GRAS_tut_tour_simpledata_intro_conv + - \ref GRAS_tut_tour_simpledata_intro_gras + - \ref GRAS_tut_tour_simpledata_use + - \ref GRAS_tut_tour_simpledata_example - \ref GRAS_tut_tour_simpledata_recap
\section GRAS_tut_tour_simpledata_intro Introduction +Until now, we only exchanged "empty" messages, ie messages with no data +attached. Simply receiving the message was a sufficient information for the +receiver to proceed. There is a similarity between them and procedures not +accepting any argument in the sequential setting. For example, our "kill" +message can be seen as a distributed version of the exit() system +call, simply stopping the process receiving this call. -\section GRAS_tut_tour_simpledata_use Actually exchanging simple data in messages +Of course, this is not enough for most applications and it is now time to +see how to attach some arbitrary data to our messages. In the GRAS parlance, +we will add a payload to the messages. Reusing the similarity between +message exchanges and procedure calls, we will now add arguments to our +calls. +Passing arguments in a distributed setting such as GRAS is a bit more +complicated than when performing a local call. The messaging layer must be +aware of the type of data you want to send and be able to actually send them +remotely, serializing them on sender side and deserializing them on the +other side. Of course, GRAS can do so for you. + +\subsection GRAS_tut_tour_simpledata_intro_conv Data conversion issues on heterogeneous platforms + +The platforms targeted by GRAS complicate the data transfers since the +machines may well be heterogeneous. You may want to exchange data between a +regular x86 machine (Intel and assimilated) and amd64 machine or even a ppc +machine (Mac). + +The first problem comes from the fact that C datatypes are not always of the +same size depending on the processor. On 32 bits machines (such as x86 and +some ppc), they are stored on 4 bytes where they are stored on 8 bytes on 64 +bits machines (such as amd64). + +Then, a second problem comes from the fact that datatypes are not +represented the same way on these architectures. amd64 and x86 are called +little-endian architectures (as opposed to big-endian architectures like +ppc) because they store the bytes of a given integer in a right-to-left way. +For example, the number 16909060 is written Ox01020304 in hexadecimal base. +On big endian machines, it will be stored as for bytes ordered that way: +01.02.03.04. On little-endian machines, it will be stored as 04.03.02.01, ie +bytes are in reverse order. + +A third problem comes from the so-called padding bytes. They come from the +fact that it is for example much more efficient for the processor to load a +4-bytes long data (such as an float) if it is aligned on a 4-bytes boundary, +ie if its first byte is placed in a region of the memory which address is a +multiple of 4. If it is not the case, the bus needs 2 cycles to retrieve the +data. That is why the compiler makes sure that any data declared in your +program are aligned in memory. When manipulating structures, it means that +the compiler may introduce some "spaces" between your fields to make sure +that each of them is aligned on the right boundary. Then, the boundaries +vary according to the aligned data. Most of the time, the alignment for a +data type is the data size (2 bytes for shorts which are 2-bytes long and so +on), but not always ;) And this all this was too easy, those values are not +only processor dependent, but also compiler and OS dependent. For example, +doubles (eight bytes) are 8-byte aligned on Windows and 4-byte aligned on +Linux... Let's take an example: + +\verbatim struct MixedData{ + char data_1; + short data_2; + char data_3; + int data_4; +};\endverbatim + +One would say that the size of this structure should be 8 bytes long on x86 +(1+2+1+4), but in fact, it is 12 bytes long. To ensure that data_2 is +2-aligned, one unused byte is added between data_1 and data_2 and 3 bytes +are wasted between data_3 and data_4 to make sure that this integer is +4-bytes aligned. Those bytes added by the compiler are called padding bytes. +Some of them may be added at the end of the structure to make sure that the +total size fulfill some criterions. On ARM machines, any structure size must +be a multiple of 4, leading a structure containing two chars to be 4 bytes +long instead of 2. + +\subsection GRAS_tut_tour_simpledata_intro_gras Dealing with hardware heterogeneity in GRAS + +All this certainly sounds scary and getting the things right can easily turn +into a nightmare if you want to do so yourself. Lukily, GRAS converts your +data seamlessly in heterogeneous exchanges. This is not really a revolution +since most high-level data exchange solution do so. For this, most solutions +convert any data to be exchanged from the sender representation into their +own format on the sender side and convert it to the receiver representation +on the other side. Sun RPC (used in NFS file systems) for example use the +XDR representation for this. When exchanging data between homogeneous +hosts, this is a clear waste of time since no conversion at all is needed, +but it is easier to implement. To deal with N kind of hardware architecture, +you only have to implement 2*N conversion schema (from any arch into the +exchange format, and from the exchange format into any arch). + +In GRAS, we prefered performance over ease of implementation, and data won't +get converted when it's not needed. Instead, data are sent in the sender +representation and it is then the responsability of the receiver process to +convert it on need. To deal with N architectures, there is N^2 conversion +schema (from any arch to any arch). Nevertheless, GRAS known 9 different +architectures, allowing it to run on almost any existing computer: Linux +(x86, ia64, amd64, alpha, sparc, hppa and PPC), Solaris (Sparc and x86), Mac +OSX, IRIX and AIX. The conversion mecanism also work with the Windows +conventions, but other issues are still to be solved on this arch. + +This approach, along with careful optimization, allows GRAS to offer very +competitive performance. It is faster than CORBA, not speaking from web +services which suffer badly from their textual data representation (XML). + +\subsection GRAS_tut_tour_simpledata_use Actually exchanging data in GRAS messages + +As stated above, all this conversion issues are dealed automatically by GRAS +and there is very few thing you should do yourself to get it working. +Simply, when you declare a message type with gras_msgtype_declare(), you +should provide a description of the payload data type as last argument. GRAS +will serialize the data, send it on the socket, convert it on need and +deserialize it for you automatically. + +That means that any given message type can only convey a payload of a +predefined type. You cannot have a message type sometimes conveying an +integer and sometimes conveying a double. But in practice, this limitation +is not very hard to live with. Comparing message exchanges to procedure +calls again, you cannot have the same procedure accepting arbitrary argument +types. What you have in Java, for example, is several functions of the same +name accepting differing argument types, which is a bit different. In C, you +can also trick the limitation by using void* arguments. And +actually, you can do the same kind of tricks in GRAS, but this is really +premature at this point of the tutorial. It is the subject of \ref +GRAS_tut_tour_exchangecb. + +Another limitation is that you can only convey one argument per message in +GRAS. We when that way in GRAS mainly because otherwise, gras_msg_send() and +the like would have to accept a variating number of parameters. It is +possible in C, but this reveals rather cumbersome since the compiler do not +check the number of arguments in any way, and the symptom on error is often +a segfault. Try passing too few parameters to printf with regard to the +format string if you want an example. Moreover, since you can convey +structures, it is easy to overcome this limitation: if you want several +arguments, simply pack them into a structure before doing so. + +There is absolutely no limitation on the type of data you can exchange in +GRAS. If you can build a C representation of your data, you can exchange it +with GRAS. More precisely, you can exchange scalars, structures, +enumerations, arrays (both static and dynamic), pointers, and even things +like chained list of structures. It is even possible to exchange graphs of +structures containing cycles between members. + +Actually, the main difficulty is to describe the data to be exchanged to +GRAS. This will be addressed in subsequent tutorial lessons, and we will +focus on exchanging data that GRAS already knows. Here is a list of such +data: + + - char + - short int + - int + - long int + - long long int + +For all these types, there is three variant: signed, unsigned and the +version where it is not specified. For example, "signed char", "char" and +"unsigned char" are all GRAS predefined datatype. The use of the unqualified +variant ("char") is not encouraged since you may gain some trouble +sometimes. On hppa, chars are unsigned by default where they are signed by +default on most archs. Use unqualified variant at your own risk ;) + + - float + - double + - data and function pointers (on some arch, both types are not of the same + size) + +You also have some more advanced types: + + - string (which are null-terminated char*, as usual in the libc) + - #xbt_ex_t (the exception types in GRAS, which can get automatically exchanged + over the network and are thus predefined) + - #xbt_peer_t (a datatype describing a peer. There is a plenty of situation + in which you want to exchange data of this type, so this is also predefined) + +\section GRAS_tut_tour_simpledata_example Back to our example + +We will now modify our example to add some data to the "hello" and the +"kill" messages. "hello" will convey a string being displayed in the logs +while "kill" will convey an double indicating the number of seconds to wait +before dying. + +The first thing is to modify the message declarations to specify that they +convey a payload. Of course, all nodes have to agree on message definitions, +and it would be very bad if the sender and the receiver would not agree on +the payload data type. GRAS checks for such discrepencies in the simulator +and dies loudly when something goes wrong. But in RL, GRAS do not check for +such things, and you are likely to get a segfault rather painful to debug. +To avoid such mistakes, it is a good habit to declare a function common to +any nodes declaring the message types involved in your application. Most of +the time, callbacks can't get declared in the same function since they +differ from node types to node types (the server attach 2 callbacks where +the client don't attach any). Here is the message declaring function in our +case: + +\dontinclude 09-simpledata.c +\skip message_declaration(void) +\until } + +It is very similar to what we had previously, we simply retrieve the +#gras_datadesc_type_t definitions of double and string and use them as payload +type of our messages. + +The next step is to change our calls to gras_msg_send() to pass the data to +send. The rule is that you should put the data into a variable and then pass +the address of this variable. It makes no difference whether the type +happens to be a pointer (as char*) or a scalar (as double). Just give +gras_msg_send the address of the variable, it will do the things right. + +\skip hello_payload +\until Gave + +Then, we have to retrieve the sent data from the callbacks. The syntax for +this is a bit crude, but at least it is very systematic so you don't have to +think too much about this. The payload argument of callbacks is +declared as void* and you can consider that it is the address of +the variable passed during the send. Ok, it got serialized, exchanged over +the network, converted and deserialized, but really, you can consider that +it's the exact copy of your variable. So, to retrieve the content, you have +to cast the void* pointer to a pointer on your datatype, and then +derefence it. + +So, it you want to retrieve a double, you have to cast the pointer using +(double*), and then dereference the obtained pointer by adding a +star before the cast. This is what we do here: + +\dontinclude 09-simpledata.c +\skip server_kill_cb +\until delay + +Again, it makes no difference whether the type happens to be a pointer or a +scalar. You simply end up with more stars in the cast for pointers: + +\skip server_hello_cb +\until char** + +That's it, you know how to exchange data between nodes. It's really simple +with GRAS, even if it's a nightmare to do so portably without it... \section GRAS_tut_tour_simpledata_recap Recapping everything together The program now reads: \include 09-simpledata.c -Which produces the expected output: +Which produces the following output: \include 09-simpledata.output +Now that you know how to exchange simple data along with messages, you can +proceed to the last lesson of the message exchanging part (\ref +GRAS_tut_tour_rpc) or jump to \ref GRAS_tut_tour_staticstruct to learn more +on data definition and see how to attach more complicated payloads to your +messages. */