Using CRAC is not more complicated than MPI since it is based on
message passing. Nevertheless, the biggest mistake that can be done is
to code an application in an "MPI state of mind". Indeed, the
communication semantic is very different and several mechanisms
characteristic of iterative and especially AIAC executions are
included in CRAC but not in MPI (message crushing, convergence
detection, ...). Thus, it is a good idea to carefully read the AIAC vs. SIAC and CRAC's core sections.
After that, proceed with the section Coding with CRAC and Executing a CRAC application.
This section is divided in three parts: the Tag notion, the API description
and Compiling user's tasks.
The two firsts should be read carefully before writing the least
line of code.
At the end of each iteration, a computing task needs to send local
data to several tasks. These data are very often elements of a vector
or a matrix, needed to update local data before going on with the next
iteration. The rank of these elements is determined by the problem to
solve, the number of tasks and the data distribution over the
tasks.
In most cases, a task X always sends the same set of elements to
the task Y at the end of each iteration. Obviously, if the rank of
these element is the same, their value may change from
iteration to iteration.
In AIAC execution, message
crushing accelerates the convergence. CRAC can do this
transparently, but for this, must be able to determine if two messages
coming from the same task contain the same set of elements, that is
will be used to update the same local data. This is done by
explicitly tagging the messages.
This gives the two first rules of the CRAC coder:
All along the code, the developer must use the same tag for all messages
containing the same set of element sent by A and received by B.
If two messages sent by A and received by B do not contain the
same set of elements, they must not use the same tag.
Do not follow these rules and your code won't be really AIAC or
even may crash.
back to top.
Even if CRAC is dedicated to AIAC executions, it also provides
methods particularly useful for any distributed computing code. The
table below summarizes all the methods a task can call. Just click on the method name to access to a detailed description of it.
Type
|
Prototype
|
Informations
|
int CRACGetTaskId()
|
int CRACGetTaskNb()
|
char** CRACGetArgValue(int taskId)
|
int CRACGetArgNb(int taskId)
|
int CRACGetIdInVDM()
|
int CRACGetIdInNode()
|
double CRACGetLocalTime()
|
Communications
|
void CRACSend(int dest, int tag, unsigned char *data, int size)
|
int CRACRecv(int src, int tag, unsigned char *data, int size, int mode)
|
bool CRACConvergence(bool set, int mode, int op)
|
void CRACBroadcast(int src, unsigned char* data, int size)
|
void CRACBarrier()
|
Others
|
void CRACFinalize()
|
void CRACSetEnv(int what, int value)
|
This method returns the current task identifier. This number ranges from
0 to nbTask-1.
N.B.: the task identifier is determined automatically by CRAC,
using the XML file that describes the execution context.
back to table.
This method returns the number of tasks involved in the current execution.
back to table.
char** CRACGetArgValue(int taskId)
|
int CRACGetArgNb(int taskId)
|
This method returns the number of argument passed to the task taskId
back to table.
CAUTION: this method should never be used in "normal"
code. It is only provided for debugging purposes
This method returns the daemon identifier that hosts the
current task. This number ranges from 0 to nbDaemon-1.
N.B.: the daemon identifier is determined automatically by CRAC,
using the XML file that describes the VDM.
back to table.
CAUTION: this method should never be used in "normal"
code. It is only provided for debugging purposes
A node represents the computing entity hosted by the daemon for
an execution. It may contains several tasks. This method returns the
rank of the current task in the node.
N.B.: the rank is determined automatically by CRAC,
using the XML file that describes the execution context.
back to table.
double CRACGetLocalTime()
|
This method returns an absolute time as a double. In order to
measure a time gap, two calls must be done, as shown below.
double gap = CRACGetLocalTime();
...
...
gap = CRACGetLocalTime() - gap;
cout << "elapsed time : " << gap << "s" << endl;
back to table.
void CRACSend(int dest, int tag, unsigned char *data, int size)
|
This method sends a message of size size to task dest,
containing bytes given in data and marked with the tag tag.
The first byte sent is byte 0 of data.
N.B.: this method never blocks and returns nearly immediately. It
simply copies the data in the sending queues used by the
Sender thread.
back to table.
int CRACRecv(int src, int tag, unsigned char *data, int size, int mode)
|
This method tries to receive a message of size size from task
src,
marked with the tag tag, storing bytes in data.
The first byte received is stored at index 0 of data.
mode
indicates if the call is blocking or not. Two values are possible :
- CRAC_NOBLOCK: if no message with
the given source and tag exists in the reception queues, the
method return -1. Else, it fills data and returns the real size of
the message (may be < size)
- CRAC_BLOCK: the method blocks
until a message with the given source and tag exists in the
reception queues. Then, it fills data and returns the real size of
the message (may be < size)
N.B.: obviously, this call must be chosen blocking for SIAC
executions and not blocking for AIAC.
back to table.
bool CRACConvergence(bool set, int mode, int op)
|
This method manages all operations bounded to local/global
convergence. There are two types of operation, depending on the
value of op:
- CRAC_GET_CONV:
the method sends the local convergence state set and
returns the global convergence state. It may return immediately or
not (see below).
- CRAC_RESET_CONV:
the method resets the local/global convergence state to false and
returns true. This
operation is used when an application does several series of
iterations and must detect the global convergence for each.
In case of op
= CRAC_RESET_CONV, mode indicates if the call is
blocking or not. Two values are possible :
- CRAC_NOBLOCK: if no message
saying the global convergence has been reached exists in the
reception queues, the method return false. Else, it returns true.
- CRAC_BLOCK: the method blocks
until a message with the global convergence state exists in the
reception queues. Then, it returns this state which may be true or false,
depending on if the global convergence has been reached.
N.B.: obviously, this call must be chosen blocking for SIAC
executions and not blocking for AIAC.
back to table.
void CRACBroadcast(int src, unsigned char *data, int size)
|
The task src broadcasts to all other tasks a
message of size size, containing bytes given in
data.
The first byte sent is byte 0 of data.
All other tasks store the message in data.
N.B. 1: this method use a binary tree over all tasks to achieve
the broadcast in a minimal communication steps. Thus, a task may
receive the message from a task != src.
N.B. 2: this method is blocking until the broadcasted message is
received. Internally, an acknowledgment is sent to the sender
immediately after the reception.
N.B. 3: This method (like CRACBarrier()) must be called by
all tasks. If it is not the case, some tasks may be blocked
because an acknowledgment is missing.
back to table.
This method allows to insert a synchronization point of all
tasks during the execution.
N.B. 1: this method uses a decentralized algorithm that implies
only one communication with each neighbors. Thus, it is minimal in
communication steps and bandwidth use.
N.B. 2: This method (like CRACBroadcast()) must be called by
all tasks. If it is not the case, some tasks may be blocked
because an internal acknowledgment is missing.
back to table.
This method allows to end correctly an execution. In fact, it is
a special barrier.
back to table.
void CRACSetEnv(int what, int value)
|
This method allows to set some internals parameters of CRAC. It
should never be used, except with what = CRAC_ITER_CONV. The second
parameter is an integer value. It is only used for AIAC
executions. It fixes the number of iterations the supermaster waits
for when the global convergence has been reached. Passed this
number, it warns all tasks. Waiting for few iterations allows to
possibly receive a message saying a task has no more a convergence
state set to true.
back to table.
A user's task is just a class, inheriting from the Task class. Its
run()
method is the execution entry point. Obviously, it is
not mandatory to put all the code in this method.
Compiling a user's task must produce a shared library (.so) that
will be loaded dynamically in the daemon context in order to
instantiate UserTask
objects.
The most simple is to modify the Makefile in the examples/CRAC
directory.
back to top.
Before we can execute any CRAC application, some constraints on ssh and searching path
must be checked in order to launch correctly the daemon and the
tasks. After that, you can create a user's task (as those of the examples directory), a VDM configuration file and an Execution configuration file.
Then, you can use cracboot and
cracrun to
create the VDM and launch your tasks.
Firstly, you must be sure that ssh client and server is installed
on each machine that will be used in the VDM. If it is not the case,
it won't be possible to spawn the daemons
Secondly, you must configure ssh in such a way it never asks you
for a password or passphrase. It is done by generating an asymmetric
key (see ssh-keygen) on
each machine in the VDM and put the public part in the file ~/.ssh/authorized_keys. This
file must contains all the public keys generated and be present on
each machine.
Sometimes, you have an unique NFS account for a set of
machines. Then, you only need to generate a single key for the set
since your ~/.ssh directory
is shared by all the machines.
Searching path configuration
There are four binaries essential for CRAC : cracd (the
daemon), cracboot (the
daemon spawner), cracwipe (the
daemon wiper), and cracrun (the
task spawner).
These binaries must be in the searching path of each
machine. Supposing you build CRAC and the binaries are in /home/bob/CRAC-project/bin,
then you must add in your shell initialization script (for example
~/.bashrc) :
export PATH=$PATH:/home/bob/CRAC-project/bin
This is an XML file that contains all needed informations on the
machines used in the VDM :
- the supermaster VDM name (see id field of
supermaster
tag),
- the sites VDM name (see id field of
site
tags),
- the masters VDM name (see id field of
master
tags),
- the VDM name of each machine (see id fields of
machine
tags),
- address types (see type fields of
address
tags),
- connection port for other machines (see ip tags),
- canonical name or IP addresses (see ip tags),
The examples below are quite simple but the give an overview of
the possibilities to define the VDM. If you want a more complex
example, see archi_3site.xml which
describes the architecture given in the CRAC's core section.
3 machines, single site, no cluster of private IPs.
|
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE architecture SYSTEM "sitevalide.dtd">
14568
mach1.site1.fr
14568
mach10.site1.fr
14568
mach14.site1.fr
Important points :
- the two first lines must never be omitted or changed
unless you also change the name of the .dtd file.
- the file sitevalide.dtd
must exist in the same directory that these XML files. It is
used to parse and check the correctness of the syntax.
- there must be only one supermaster defined.
- there must be at least one site, for which the master
identifier is the supermaster identifier
- there are four machine types, which must be set by the user:
- public:
machine can send/receive packets from outside its site.
- private:
machine can send/receive packets only to/from within its
site.
- noin: like
a private machine but that can send packets outside its
site.
- noout:
like a private machine but that can receive packets from
outside its site.
- the ip tag may
contain a canonical name or an IP.
- if the machine is private and a canonical name is given,
the frontal will do the conversion to an IP. Thus, it is not
mandatory that the name of private machine are on a DNS.
|
5 machines, 2 sites, 1 cluster of 2 machines
|
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE architecture SYSTEM "sitevalide.dtd">
14568
mach1.site1.fr
14568
mach10.site1.fr
16789
mach1.site3.fr
21567
priv1.site3.fr
21567
192.168.0.11
|
back to top.
Execution
configuration file
Launching a CRAC application consists in creating dynamically
instances of user's tasks on chosen machines. For this, CRAC defines
computing entities called node, that associate a user's task
code to a machine in the VDM and a number of instances. You can also
pass arguments to the task. These associations are defined in a XML
file that contains:
- the machine used to define a computing node (see machine field of
node
tag),
- the number of tasks to launch on each computing node (see machine field of
node
tag),
- the path to the task code (see class
tag),
- the arguments passed to the task (see args
tag),
Execution description of ping example (2 machines, 2 nodes, 2 tasks).
|
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE architecture SYSTEM "runvalide.dtd">
/home/bob/CRAC-project/examples/CRAC/pingpong.so
"1000" "10000" "1000"
/home/bob/CRAC-project/examples/CRAC/pingpong.so
"1000" "10000" "1000"
Execution description of broadcast example (4 machines, 5 nodes, 8 tasks).
|
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE architecture SYSTEM "runvalide.dtd">
/home/bob/CRAC-project/examples/CRAC/broad.so
/home/bob/CRAC-project/examples/CRAC/broad.so
/home/bob/CRAC-project/examples/CRAC/broad.so
/home/bob/CRAC-project/examples/CRAC/broad.so
/home/bob/CRAC-project/examples/CRAC/broad.so
Important points :
- the two first lines must never be omitted or changed
unless you also change the name of the .dtd file.
- the file runvalide.dtd
must exist in the same directory that these XML files. It is
used to parse and check the correctness of the syntax.
- the machine tag
must refer to a machine id field in
the VDM configuration file.
- the class paths must be absolute.
- there may be a different paths and arguments for each node
(-> MIMD execution).
- there may be several nodes on the same machine.
- CRAC determines the task identifier from the read order of this
file. For the second example, it gives tasks [0,2,3] on cluster1,
task [1] on cluster2, task [4] on cluster5 and tasks [5,6,7] on
cluster4.
back to top.
Once you have configuration files (supposing archi_mytest.xml
and run_mytest.xml),
just type :
> cracboot arch_mytest.xml
... few lines for initialization ending with "Waiting for running configuration"
> cracrun run_mytest.xml
back to top.