/**
@page GRAS_tut_tour_simpledata Lesson 9: Exchanging simple data
\section GRAS_tut_tour_simpledata_toc Table of Contents
- \ref GRAS_tut_tour_simpledata_intro
- \ref GRAS_tut_tour_simpledata_intro_conv
- \ref GRAS_tut_tour_simpledata_intro_gras
- \ref GRAS_tut_tour_simpledata_use
- \ref GRAS_tut_tour_simpledata_example
- \ref GRAS_tut_tour_simpledata_recap
\section GRAS_tut_tour_simpledata_intro Introduction
Until now, we only exchanged "empty" messages, ie messages with no data
attached. Simply receiving the message was a sufficient information for the
receiver to proceed. There is a similarity between them and procedures not
accepting any argument in the sequential setting. For example, our "kill"
message can be seen as a distributed version of the exit() system
call, simply stopping the process receiving this call.
Of course, this is not enough for most applications and it is now time to
see how to attach some arbitrary data to our messages. In the GRAS parlance,
we will add a payload to the messages. Reusing the similarity between
message exchanges and procedure calls, we will now add arguments to our
calls.
Passing arguments in a distributed setting such as GRAS is a bit more
complicated than when performing a local call. The messaging layer must be
aware of the type of data you want to send and be able to actually send them
remotely, serializing them on sender side and deserializing them on the
other side. Of course, GRAS can do so for you.
\subsection GRAS_tut_tour_simpledata_intro_conv Data conversion issues on heterogeneous platforms
The platforms targeted by GRAS complicate the data transfers since the
machines may well be heterogeneous. You may want to exchange data between a
regular x86 machine (Intel and assimilated) and amd64 machine or even a ppc
machine (Mac).
The first problem comes from the fact that C datatypes are not always of the
same size depending on the processor. On 32 bits machines (such as x86 and
some ppc), they are stored on 4 bytes where they are stored on 8 bytes on 64
bits machines (such as amd64).
Then, a second problem comes from the fact that datatypes are not
represented the same way on these architectures. amd64 and x86 are called
little-endian architectures (as opposed to big-endian architectures like
ppc) because they store the bytes of a given integer in a right-to-left way.
For example, the number 16909060 is written Ox01020304 in hexadecimal base.
On big endian machines, it will be stored as for bytes ordered that way:
01.02.03.04. On little-endian machines, it will be stored as 04.03.02.01, ie
bytes are in reverse order.
A third problem comes from the so-called padding bytes. They come from the
fact that it is for example much more efficient for the processor to load a
4-bytes long data (such as an float) if it is aligned on a 4-bytes boundary,
ie if its first byte is placed in a region of the memory which address is a
multiple of 4. If it is not the case, the bus needs 2 cycles to retrieve the
data. That is why the compiler makes sure that any data declared in your
program are aligned in memory. When manipulating structures, it means that
the compiler may introduce some "spaces" between your fields to make sure
that each of them is aligned on the right boundary. Then, the boundaries
vary according to the aligned data. Most of the time, the alignment for a
data type is the data size (2 bytes for shorts which are 2-bytes long and so
on), but not always ;) And this all this was too easy, those values are not
only processor dependent, but also compiler and OS dependent. For example,
doubles (eight bytes) are 8-byte aligned on Windows and 4-byte aligned on
Linux... Let's take an example:
\verbatim struct MixedData{
char data_1;
short data_2;
char data_3;
int data_4;
};\endverbatim
One would say that the size of this structure should be 8 bytes long on x86
(1+2+1+4), but in fact, it is 12 bytes long. To ensure that data_2 is
2-aligned, one unused byte is added between data_1 and data_2 and 3 bytes
are wasted between data_3 and data_4 to make sure that this integer is
4-bytes aligned. Those bytes added by the compiler are called padding bytes.
Some of them may be added at the end of the structure to make sure that the
total size fulfill some criterions. On ARM machines, any structure size must
be a multiple of 4, leading a structure containing two chars to be 4 bytes
long instead of 2.
\subsection GRAS_tut_tour_simpledata_intro_gras Dealing with hardware heterogeneity in GRAS
All this certainly sounds scary and getting the things right can easily turn
into a nightmare if you want to do so yourself. Lukily, GRAS converts your
data seamlessly in heterogeneous exchanges. This is not really a revolution
since most high-level data exchange solution do so. For this, most solutions
convert any data to be exchanged from the sender representation into their
own format on the sender side and convert it to the receiver representation
on the other side. Sun RPC (used in NFS file systems) for example use the
XDR representation for this. When exchanging data between homogeneous
hosts, this is a clear waste of time since no conversion at all is needed,
but it is easier to implement. To deal with N kind of hardware architecture,
you only have to implement 2*N conversion schema (from any arch into the
exchange format, and from the exchange format into any arch).
In GRAS, we prefered performance over ease of implementation, and data won't
get converted when it's not needed. Instead, data are sent in the sender
representation and it is then the responsability of the receiver process to
convert it on need. To deal with N architectures, there is N^2 conversion
schema (from any arch to any arch). Nevertheless, GRAS known 9 different
architectures, allowing it to run on almost any existing computer: Linux
(x86, ia64, amd64, alpha, sparc, hppa and PPC), Solaris (Sparc and x86), Mac
OSX, IRIX and AIX. The conversion mecanism also work with the Windows
conventions, but other issues are still to be solved on this arch.
This approach, along with careful optimization, allows GRAS to offer very
competitive performance. It is faster than CORBA, not speaking from web
services which suffer badly from their textual data representation (XML).
\subsection GRAS_tut_tour_simpledata_use Actually exchanging data in GRAS messages
As stated above, all this conversion issues are dealed automatically by GRAS
and there is very few thing you should do yourself to get it working.
Simply, when you declare a message type with gras_msgtype_declare(), you
should provide a description of the payload data type as last argument. GRAS
will serialize the data, send it on the socket, convert it on need and
deserialize it for you automatically.
That means that any given message type can only convey a payload of a
predefined type. You cannot have a message type sometimes conveying an
integer and sometimes conveying a double. But in practice, this limitation
is not very hard to live with. Comparing message exchanges to procedure
calls again, you cannot have the same procedure accepting arbitrary argument
types. What you have in Java, for example, is several functions of the same
name accepting differing argument types, which is a bit different. In C, you
can also trick the limitation by using void* arguments. And
actually, you can do the same kind of tricks in GRAS, but this is really
premature at this point of the tutorial. It is the subject of \ref
GRAS_tut_tour_exchangecb.
Another limitation is that you can only convey one argument per message in
GRAS. We when that way in GRAS mainly because otherwise, gras_msg_send() and
the like would have to accept a variating number of parameters. It is
possible in C, but this reveals rather cumbersome since the compiler do not
check the number of arguments in any way, and the symptom on error is often
a segfault. Try passing too few parameters to printf with regard to the
format string if you want an example. Moreover, since you can convey
structures, it is easy to overcome this limitation: if you want several
arguments, simply pack them into a structure before doing so.
There is absolutely no limitation on the type of data you can exchange in
GRAS. If you can build a C representation of your data, you can exchange it
with GRAS. More precisely, you can exchange scalars, structures,
enumerations, arrays (both static and dynamic), pointers, and even things
like chained list of structures. It is even possible to exchange graphs of
structures containing cycles between members.
Actually, the main difficulty is to describe the data to be exchanged to
GRAS. This will be addressed in subsequent tutorial lessons, and we will
focus on exchanging data that GRAS already knows. Here is a list of such
data:
- char
- short int
- int
- long int
- long long int
For all these types, there is three variant: signed, unsigned and the
version where it is not specified. For example, "signed char", "char" and
"unsigned char" are all GRAS predefined datatype. The use of the unqualified
variant ("char") is not encouraged since you may gain some trouble
sometimes. On hppa, chars are unsigned by default where they are signed by
default on most archs. Use unqualified variant at your own risk ;)
- float
- double
- data and function pointers (on some arch, both types are not of the same
size)
You also have some more advanced types:
- string (which are null-terminated char*, as usual in the libc)
- #xbt_ex_t (the exception types in GRAS, which can get automatically exchanged
over the network and are thus predefined)
- #xbt_peer_t (a datatype describing a peer. There is a plenty of situation
in which you want to exchange data of this type, so this is also predefined)
\section GRAS_tut_tour_simpledata_example Back to our example
We will now modify our example to add some data to the "hello" and the
"kill" messages. "hello" will convey a string being displayed in the logs
while "kill" will convey an double indicating the number of seconds to wait
before dying.
The first thing is to modify the message declarations to specify that they
convey a payload. Of course, all nodes have to agree on message definitions,
and it would be very bad if the sender and the receiver would not agree on
the payload data type. GRAS checks for such discrepencies in the simulator
and dies loudly when something goes wrong. But in RL, GRAS do not check for
such things, and you are likely to get a segfault rather painful to debug.
To avoid such mistakes, it is a good habit to declare a function common to
any nodes declaring the message types involved in your application. Most of
the time, callbacks can't get declared in the same function since they
differ from node types to node types (the server attach 2 callbacks where
the client don't attach any). Here is the message declaring function in our
case:
\dontinclude 09-simpledata.c
\skip message_declaration(void)
\until }
It is very similar to what we had previously, we simply retrieve the
#gras_datadesc_type_t definitions of double and string and use them as payload
type of our messages.
The next step is to change our calls to gras_msg_send() to pass the data to
send. The rule is that you should put the data into a variable and then pass
the address of this variable. It makes no difference whether the type
happens to be a pointer (as char*) or a scalar (as double). Just give
gras_msg_send the address of the variable, it will do the things right.
\skip hello_payload
\until Gave
Then, we have to retrieve the sent data from the callbacks. The syntax for
this is a bit crude, but at least it is very systematic so you don't have to
think too much about this. The payload argument of callbacks is
declared as void* and you can consider that it is the address of
the variable passed during the send. Ok, it got serialized, exchanged over
the network, converted and deserialized, but really, you can consider that
it's the exact copy of your variable. So, to retrieve the content, you have
to cast the void* pointer to a pointer on your datatype, and then
derefence it.
So, it you want to retrieve a double, you have to cast the pointer using
(double*), and then dereference the obtained pointer by adding a
star before the cast. This is what we do here:
\dontinclude 09-simpledata.c
\skip server_kill_cb
\until delay
Again, it makes no difference whether the type happens to be a pointer or a
scalar. You simply end up with more stars in the cast for pointers:
\skip server_hello_cb
\until char**
That's it, you know how to exchange data between nodes. It's really simple
with GRAS, even if it's a nightmare to do so portably without it...
\section GRAS_tut_tour_simpledata_recap Recapping everything together
The program now reads:
\include 09-simpledata.c
Which produces the following output:
\include 09-simpledata.output
Now that you know how to exchange simple data along with messages, you can
proceed to the last lesson of the message exchanging part (\ref
GRAS_tut_tour_rpc) or jump to \ref GRAS_tut_tour_staticstruct to learn more
on data definition and see how to attach more complicated payloads to your
messages.
*/