doc/gtut-tour-09-simpledata.doc

   1 /**
   2 @page GRAS_tut_tour_simpledata Lesson 9: Exchanging simple data
   3
   4 \section GRAS_tut_tour_simpledata_toc Table of Contents
   5  - \ref GRAS_tut_tour_simpledata_intro
   6     - \ref GRAS_tut_tour_simpledata_intro_conv
   7     - \ref GRAS_tut_tour_simpledata_intro_gras
   8     - \ref GRAS_tut_tour_simpledata_use
   9  - \ref GRAS_tut_tour_simpledata_example
  10  - \ref GRAS_tut_tour_simpledata_recap
  11
  12 <hr>
  13
  14 \section GRAS_tut_tour_simpledata_intro Introduction
  15
  16 Until now, we only exchanged "empty" messages, i.e. messages with no data
  17 attached. Simply receiving the message was sufficient information for the
  18 receiver to proceed. Such messages are similar to procedures that accept no
  19 argument in the sequential setting. For example, our "kill" message can be
  20 seen as a distributed version of the <tt>exit()</tt> call, simply stopping
  21 the process that receives it.
  22
  23 Of course, this is not enough for most applications and it is now time to
  24 see how to attach some arbitrary data to our messages. In GRAS parlance,
  25 we will add a <i>payload</i> to the messages. Reusing the similarity between
  26 message exchanges and procedure calls, we will now add arguments to our
  27 calls.
  28
  29 Passing arguments in a distributed setting such as GRAS is a bit more
  30 complicated than performing a local call. The messaging layer must be
  31 aware of the type of the data you want to send and be able to actually send
  32 it remotely, serializing it on the sender side and deserializing it on the
  33 other side. Of course, GRAS can do so for you.
  34
  35 \subsection GRAS_tut_tour_simpledata_intro_conv Data conversion issues on heterogeneous platforms
  36
  37 The platforms targeted by GRAS complicate data transfers since the
  38 machines may well be heterogeneous. You may want to exchange data between a
  39 regular x86 machine (Intel and compatible), an amd64 machine or even a ppc
  40 machine (Mac).
  41
  42 The first problem is that a given C datatype does not have the same size on
  43 every processor. For example, long integers and pointers are typically stored
  44 on 4 bytes on 32-bit machines (such as x86 and some ppc) whereas they are
  45 stored on 8 bytes on 64-bit machines (such as amd64 running Linux).
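
As a minimal sketch (plain C, not part of GRAS; the function name
<tt>describe_type_sizes</tt> is made up for this illustration), the following
helper reports the sizes that the current compiler uses for a few types.
Running it on different machines should show the differences described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

// Hypothetical helper: writes a one-line summary of a few type sizes
// into buf and returns the number of characters written (like snprintf).
int describe_type_sizes(char *buf, size_t len) {
  int written = snprintf(buf, len, "short=%zu int=%zu long=%zu void*=%zu",
                         sizeof(short), sizeof(int), sizeof(long),
                         sizeof(void *));
  assert(written > 0);  // snprintf only returns a negative value on error
  return written;
}
```

A caller would typically pass a small local buffer (64 bytes is plenty here)
and print it with <tt>puts()</tt>.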
  46
  47 Then, a second problem comes from the fact that datatypes are not
  48 represented the same way on these architectures. amd64 and x86 are called
  49 little-endian architectures (as opposed to big-endian architectures like
  50 ppc) because they store the least significant byte of an integer first.
  51 For example, the number 16909060 is written 0x01020304 in hexadecimal.
  52 On big-endian machines, it is stored as four bytes ordered that way:
  53 01.02.03.04. On little-endian machines, it is stored as 04.03.02.01, i.e.
  54 the bytes are in reverse order.
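
The byte order of the current machine can be observed with a few lines of
plain C. This is only a sketch for illustration (the function names are made
up and nothing here is GRAS API): it copies the in-memory bytes of 0x01020304
into an array, lowest address first.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Copies the in-memory bytes of value into out[0..3], lowest address first.
void bytes_in_memory(uint32_t value, unsigned char out[4]) {
  memcpy(out, &value, sizeof value);
}

// Returns 1 on a little-endian host and 0 on a big-endian one.
int host_is_little_endian(void) {
  unsigned char b[4];
  bytes_in_memory(0x01020304u, b);
  // Mixed-endian hosts are not handled by this sketch.
  assert(b[0] == 0x04 || b[0] == 0x01);
  return b[0] == 0x04;
}
```

On a little-endian host the array holds 04, 03, 02, 01, and on a big-endian
host it holds 01, 02, 03, 04.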
  55
  56 A third problem comes from the so-called padding bytes. They come from the
  57 fact that it is, for example, much more efficient for the processor to load
  58 a 4-byte datum (such as a float) if it is aligned on a 4-byte boundary, i.e.
  59 if its first byte is placed at a memory address that is a multiple of 4. If
  60 this is not the case, the bus needs 2 cycles to retrieve the data. That
  61 is why the compiler makes sure that any data declared in your program is
  62 aligned in memory. When manipulating structures, it means that the compiler
  63 may introduce some "spaces" between your fields to make sure that each of
  64 them is aligned on the right boundary. The boundary depends on the type of
  65 the data being aligned. Most of the time, the alignment of a data type is
  66 its size (2 bytes for shorts, which are 2 bytes long, and so on), but not
  67 always ;) And as if all this was too easy, those values are not only
  68 processor dependent, but also compiler and OS dependent. For example, on
  69 x86, doubles (eight bytes) are 8-byte aligned on Windows and 4-byte aligned
  70 on Linux... Let's take an example:
  71
  72 \verbatim struct MixedData{
  73    char    data_1;
  74    short   data_2;
  75    char    data_3;
  76    int     data_4;
  77 };\endverbatim
  78
  79 One would expect this structure to be 8 bytes long on x86 (1+2+1+4), but
  80 it is in fact 12 bytes long. To ensure that data_2 is 2-byte aligned, one
  81 unused byte is added between data_1 and data_2, and 3 bytes are wasted
  82 between data_3 and data_4 to make sure that this integer is 4-byte
  83 aligned. Those bytes added by the compiler are called padding bytes. Some
  84 of them may also be added at the end of the structure to make sure that the
  85 total size fulfills some criteria. On ARM machines, any structure size must
  86 be a multiple of 4, so a structure containing two chars is 4 bytes long
  87 instead of 2.
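
The layout chosen by your own compiler can be inspected with
<tt>offsetof</tt> and <tt>sizeof</tt>. The sketch below is plain C written for
this illustration (the two function names are made up); it reuses the
structure above and reports where each field starts and how many padding bytes
were added.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct MixedData {
  char  data_1;
  short data_2;
  char  data_3;
  int   data_4;
};

// Number of padding bytes the compiler added to struct MixedData.
size_t mixed_data_padding(void) {
  size_t useful = sizeof(char) + sizeof(short) + sizeof(char) + sizeof(int);
  assert(sizeof(struct MixedData) >= useful);
  return sizeof(struct MixedData) - useful;
}

// Prints the offset of each field, the total size and the padding.
void print_mixed_data_layout(void) {
  printf("data_1 at %zu, data_2 at %zu, data_3 at %zu, data_4 at %zu\n",
         offsetof(struct MixedData, data_1),
         offsetof(struct MixedData, data_2),
         offsetof(struct MixedData, data_3),
         offsetof(struct MixedData, data_4));
  printf("size %zu, padding %zu\n",
         sizeof(struct MixedData), mixed_data_padding());
}
```

On x86, this should report the offsets 0, 2, 4 and 8, a size of 12 and 4
padding bytes, which matches the description above. Other platforms may
report different values.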
  88
  89 \subsection GRAS_tut_tour_simpledata_intro_gras Dealing with hardware heterogeneity in GRAS
  90
  91 All this certainly sounds scary, and getting things right can easily turn
  92 into a nightmare if you want to do so yourself. Luckily, GRAS converts your
  93 data seamlessly in heterogeneous exchanges. This is not really a revolution
  94 since most high-level data exchange solutions do so. For this, most of them
  95 convert any data to be exchanged from the sender representation into their
  96 own format on the sender side and convert it to the receiver representation
  97 on the other side. Sun RPC (used in NFS file systems), for example, uses the
  98 XDR representation for this. When exchanging data between homogeneous
  99 hosts, this is a clear waste of time since no conversion at all is needed,
 100 but it is easier to implement. To deal with N kinds of hardware architecture,
 101 you only have to implement 2*N conversion schemes (from any arch into the
 102 exchange format, and from the exchange format into any arch).
 103
 104 In GRAS, we preferred performance over ease of implementation, and data is
 105 not converted when it is not needed. Instead, data is sent in the sender
 106 representation and it is then the responsibility of the receiver process to
 107 convert it if needed. To deal with N architectures, there are N^2 conversion
 108 schemes (from any arch to any arch). Nevertheless, GRAS knows 9 different
 109 architectures, allowing it to run on almost any existing computer: Linux
 110 (x86, ia64, amd64, alpha, sparc, hppa and PPC), Solaris (Sparc and x86), Mac
 111 OSX, IRIX and AIX. The conversion mechanism also works with the Windows
 112 conventions, but other issues are still to be solved on this arch.
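
The idea of converting on the receiver side only can be sketched in plain C
for a single 32-bit integer. Everything below is an assumption made for this
illustration: the wire layout (one byte tagging the sender byte order,
followed by the raw bytes) and all the names are made up, and this is not the
actual GRAS wire format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Hypothetical tags describing the byte order of the sender.
enum { WIRE_BIG_ENDIAN = 0, WIRE_LITTLE_ENDIAN = 1 };

// Detects the byte order of the host running this code.
static int host_byte_order(void) {
  const uint32_t probe = 1;
  unsigned char first;
  memcpy(&first, &probe, 1);
  return first == 1 ? WIRE_LITTLE_ENDIAN : WIRE_BIG_ENDIAN;
}

// Sender side: no conversion at all, the native bytes are copied as is.
void wire_pack_u32(uint32_t value, unsigned char wire[5]) {
  wire[0] = (unsigned char)host_byte_order();
  memcpy(wire + 1, &value, 4);
}

// Receiver side: the bytes are swapped only if the sender order differs.
uint32_t wire_unpack_u32(const unsigned char wire[5]) {
  unsigned char bytes[4];
  uint32_t value;
  assert(wire[0] == WIRE_BIG_ENDIAN || wire[0] == WIRE_LITTLE_ENDIAN);
  memcpy(bytes, wire + 1, 4);
  if (wire[0] != host_byte_order()) {
    unsigned char tmp;
    tmp = bytes[0]; bytes[0] = bytes[3]; bytes[3] = tmp;
    tmp = bytes[1]; bytes[1] = bytes[2]; bytes[2] = tmp;
  }
  memcpy(&value, bytes, 4);
  return value;
}
```

Between two hosts of the same byte order, the receiver takes the fast path
and performs no swap at all, which is the point of this design.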
 113
 114 This approach, along with careful optimization, allows GRAS to offer very
 115 competitive performance. It is faster than CORBA, not to mention web
 116 services, which suffer badly from their textual data representation (XML).
 117
 118 \subsection GRAS_tut_tour_simpledata_use Actually exchanging data in GRAS messages
 119
 120 As stated above, all these conversion issues are dealt with automatically by
 121 GRAS, and there is very little you have to do yourself to get it working.
 122 Simply, when you declare a message type with gras_msgtype_declare(), you
 123 should provide a description of the payload data type as last argument. GRAS
 124 will serialize the data, send it on the socket, convert it if needed and
 125 deserialize it for you automatically.
 126
 127 That means that any given message type can only convey a payload of a
 128 predefined type. You cannot have a message type sometimes conveying an
 129 integer and sometimes conveying a double. But in practice, this limitation
 130 is not very hard to live with. Comparing message exchanges to procedure
 131 calls again, you cannot have the same procedure accepting arbitrary argument
 132 types. What you have in Java, for example, is several methods of the same
 133 name accepting different argument types, which is a bit different. In C, you
 134 can also work around the limitation by using <tt>void*</tt> arguments. And
 135 actually, you can do the same kind of tricks in GRAS, but this is really
 136 premature at this point of the tutorial. It is the subject of \ref
 137 GRAS_tut_tour_exchangecb.
 138
 139 Another limitation is that you can only convey one argument per message in
 140 GRAS. We went that way mainly because otherwise, gras_msg_send() and the
 141 like would have to accept a variable number of parameters. This is possible
 142 in C, but it proves rather cumbersome since the compiler does not check the
 143 number of arguments in any way, and the symptom of an error is often a
 144 segfault. Try passing too few parameters to printf with regard to the
 145 format string if you want an example. Moreover, since you can convey
 146 structures, it is easy to overcome this limitation: if you want several
 147 arguments, simply pack them into a structure before sending them.
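
Packing several arguments into one payload can be sketched as follows. The
structure and the function are made up for this illustration, and only the
<tt>void*</tt> payload convention mirrors what the tutorial describes.
Describing such a structure to GRAS is the subject of later lessons.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical payload grouping the two "arguments" of one message.
struct move_args {
  int dx;
  int dy;
};

// Stand-in for the receiving side: unpacks both arguments from the single
// payload pointer and returns the sum of their absolute values.
int manhattan_length(void *payload) {
  assert(payload != NULL);
  struct move_args args = *(struct move_args *)payload;
  int ax = args.dx < 0 ? -args.dx : args.dx;
  int ay = args.dy < 0 ? -args.dy : args.dy;
  return ax + ay;
}
```

The sender fills one <tt>struct move_args</tt> variable and passes its
address, so the message still carries a single argument.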
 148
 149 There is absolutely no limitation on the type of data you can exchange in
 150 GRAS. If you can build a C representation of your data, you can exchange it
 151 with GRAS. More precisely, you can exchange scalars, structures,
 152 enumerations, arrays (both static and dynamic), pointers, and even things
 153 like linked lists of structures. It is even possible to exchange graphs of
 154 structures containing cycles between members.
 155
 156 Actually, the main difficulty is to describe the data to be exchanged to
 157 GRAS. This will be addressed in subsequent tutorial lessons; for now, we
 158 focus on exchanging data that GRAS already knows. Here is a list of such
 159 data:
 160
 161  - char
 162  - short int
 163  - int
 164  - long int
 165  - long long int
 166
 167 For all these types, there are three variants: signed, unsigned and the
 168 version where it is not specified. For example, "signed char", "char" and
 169 "unsigned char" are all GRAS predefined datatypes. The use of the unqualified
 170 variant ("char") is not encouraged since it may cause trouble: on hppa,
 171 chars are unsigned by default whereas they are signed by default on most
 172 archs. Use the unqualified variant at your own risk ;)
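
To see which default your own compiler picks, here is a small check in plain
C (the function names are made up for this sketch). It relies on
<tt>CHAR_MIN</tt> from <tt>limits.h</tt>, which is 0 when plain char is
unsigned.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

// Returns 1 if plain char is signed on this platform, 0 if it is unsigned.
int plain_char_is_signed(void) {
  // The C standard gives plain char the range of one of the two variants.
  assert(CHAR_MIN == 0 || CHAR_MIN == SCHAR_MIN);
  return CHAR_MIN < 0;
}

// Reads one raw byte through a plain char and widens the result to int.
int byte_as_plain_char(unsigned char byte) {
  char c;
  memcpy(&c, &byte, 1);
  return c;
}
```

On usual two's complement machines, the byte 0xFF reads back as a negative
value where plain char is signed and as 255 where it is unsigned.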
 173
 174  - float
 175  - double
 176  - data and function pointers (on some archs, these two kinds of pointers
 177    do not have the same size)
 178
 179 You also have some more advanced types:
 180
 181  - string (which are null-terminated char*, as usual in the libc)
 182  - #xbt_ex_t (the exception types in GRAS, which can get automatically exchanged
 183    over the network and are thus predefined)
 184  - #xbt_peer_t (a datatype describing a peer. There are plenty of situations
 185    in which you want to exchange data of this type, so this is also predefined)
 186
 187 \section GRAS_tut_tour_simpledata_example Back to our example
 188
 189 We will now modify our example to add some data to the "hello" and the
 190 "kill" messages. "hello" will convey a string being displayed in the logs
 191 while "kill" will convey a double indicating the number of seconds to wait
 192 before dying.
 193
 194 The first thing is to modify the message declarations to specify that they
 195 convey a payload. Of course, all nodes have to agree on message definitions,
 196 and it would be very bad if the sender and the receiver did not agree on
 197 the payload data type. GRAS checks for such discrepancies in the simulator
 198 and dies loudly when something goes wrong. But in real life, GRAS does not
 199 check for such things, and you are likely to get a segfault that is rather
 200 painful to debug. To avoid such mistakes, it is a good habit to declare the
 201 message types of your application in one function shared by all nodes. Most
 202 of the time, callbacks can't get declared in the same function since they
 203 differ from one node type to another (the server attaches 2 callbacks whereas
 204 the client does not attach any). Here is the message declaring function in
 205 our case:
 206
 207 \dontinclude 09-simpledata.c
 208 \skip message_declaration(void)
 209 \until }
 210
 211 It is very similar to what we had previously: we simply retrieve the
 212 #gras_datadesc_type_t definitions of double and string and use them as payload
 213 type of our messages.
 214
 215 The next step is to change our calls to gras_msg_send() to pass the data to
 216 send. The rule is that you should put the data into a variable and then pass
 217 the address of this variable. It makes no difference whether the type
 218 happens to be a pointer (such as char*) or a scalar (such as double). Just give
 219 gras_msg_send() the address of the variable, and it will do the right thing.
 220
 221 \skip hello_payload
 222 \until Gave
 223
 224 Then, we have to retrieve the sent data from the callbacks. The syntax for
 225 this is a bit crude, but at least it is very systematic so you don't have to
 226 think too much about this. The <tt>payload</tt> argument of callbacks is
 227 declared as <tt>void*</tt> and you can consider that it is the address of
 228 the variable passed during the send. Ok, it got serialized, exchanged over
 229 the network, converted and deserialized, but really, you can consider that
 230 it's the exact copy of your variable. So, to retrieve the content, you have
 231 to cast the <tt>void*</tt> pointer to a pointer to your datatype, and then
 232 dereference it.
 233
 234 So, if you want to retrieve a double, you have to cast the pointer using
 235 <tt>(double*)</tt>, and then dereference the obtained pointer by adding a
 236 star before the cast. This is what we do here:
 237
 238 \dontinclude 09-simpledata.c
 239 \skip server_kill_cb
 240 \until delay
 241
 242 Again, it makes no difference whether the type happens to be a pointer or a
 243 scalar. You simply end up with more stars in the cast for pointers:
 244
 245 \skip server_hello_cb
 246 \until char**
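
The cast pattern itself can be tried outside of GRAS. The sketch below is
plain C: the two function names are made up, and only the <tt>void*</tt>
payload convention comes from the tutorial.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical receiver helper for a payload of type double.
double read_double_payload(void *payload) {
  assert(payload != NULL);
  return *(double *)payload;  // cast to double*, then dereference
}

// Hypothetical receiver helper for a payload of type char*.
char *read_string_payload(void *payload) {
  assert(payload != NULL);
  return *(char **)payload;   // one more star since the data is a char*
}
```

In both cases the caller passes the address of a variable, exactly as on the
sending side: the address of a <tt>double</tt> variable for the first helper
and the address of a <tt>char*</tt> variable for the second.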
 247
 248 That's it, you know how to exchange data between nodes. It's really simple
 249 with GRAS, even if it's a nightmare to do so portably without it...
 250
 251 \section GRAS_tut_tour_simpledata_recap Recapping everything together
 252
 253 The program now reads:
 254 \include 09-simpledata.c
 255
 256 Which produces the following output:
 257 \include 09-simpledata.output
 258
 259 Now that you know how to exchange simple data along with messages, you can
 260 proceed to the last lesson of the message exchanging part (\ref
 261 GRAS_tut_tour_rpc) or jump to \ref GRAS_tut_tour_staticstruct to learn more
 262 about data definition and see how to attach more complicated payloads to your
 263 messages.
 264
 265 */