First complete draft. To be proofread, of course. The bibliography is missing.

diff --git a/gpc2011.tex b/gpc2011.tex

index 15928ff..eaf7101 100644 (file)
--- a/gpc2011.tex
+++ b/gpc2011.tex
@@ -45,26 +45,39 @@
  
  \title{Gridification of a Radiotherapy Dose Computation Application with the XtremWeb-CH Environment}
  
-\author{Nabil Abdennhader\inst{1} \and Raphaël Couturier\inst{1} \and David \and
-  Julien  Henriet\inst{2} \and  Laiymani\inst{1}  \and Sébastien  Miquée\inst{1}
-  \and Marc Sauget\inst{2}}
  
-\institute{Laboratoire d'Informatique de l'universit\'{e}
+\author{Nabil Abdennadher\inst{1} \and Mohamed Ben Belgacem\inst{1} \and Raphaël Couturier\inst{2} \and
+  David Laiymani\inst{2} \and Sébastien  Miquée\inst{2} \and Marko Niinimaki\inst{1} \and Marc Sauget\inst{3}}
+
+\institute{
+University of Applied Sciences Western Switzerland, hepia Geneva,
+Switzerland \\
+\email{nabil.abdennadher@hesge.ch, mohamed.benbelgacem@unige.ch, markopekka.niinimaeki@hesge.ch}
+\and
+Laboratoire d'Informatique de l'universit\'{e}
    de Franche-Comt\'{e} \\
    IUT Belfort-Montbéliard, Rue Engel Gros, 90016 Belfort - France \\
  \email{raphael.couturier, david.laiymani, sebastien.miquee@univ-fcomte.fr}
  \and
   FEMTO-ST, ENISYS/IRMA, F-25210 Montb\'{e}liard , FRANCE\\
+\email{marc.sauget@femtost.fr}
  }
-%\email{\texttt{[laiymani]@lifc.univ-fcomte.fr}}}
  
  
  \maketitle
  
  \begin{abstract} 
-  
+  This paper presents the design and evaluation of the
+  gridification of a radiotherapy dose computation application. Due to
+  the inherent characteristics of the application and its execution,
+  we chose the architectural context of global (or volunteer) computing.
+  For this, we used the XtremWeb-CH environment. Experiments were
+  conducted on a real global computing testbed and show good speed-ups
+  and a very acceptable platform overhead, making XtremWeb-CH a good candidate
+for deploying parallel applications over a global computing environment. 
  \end{abstract}
  
+
  %-------------INTRODUCTION--------------------
  \section{Introduction}
  
@@ -74,22 +87,23 @@ the domain of radiotherapy dose computation the problem is
  crucial. The main goal of external beam radiotherapy is the treatment
  of tumors while minimizing exposure to healthy tissue. Dosimetric
  planning has to be carried out in order to optimize the dose
-distribution within the patient is necessary. Thus, to determine the
-most accurate dose distribution during treatment planning, a
-compromise must be found between the precision and the speed of
-calculation. Current techniques, using analytic methods, models and
-databases, are rapid but lack precision. Enhanced precision can be
-achieved by using calculation codes based, for example, on Monte Carlo
-methods. In [] the authors proposed a novel approach using neural
-networks. This approach is based on the collaboration of computation
-codes and multi-layer neural networks used as universal
-approximators. It provides a fast and accurate evaluation of radiation
-doses in any given environment for given irradiation parameters. As
-the learning step is often very time consuming, in \cite{bcvsv08:ip}
-the authors proposed a parallel algorithm that enable to decompose the
-learning domain into subdomains. The decomposition has the advantage
-to significantly reduce the complexity of the target functions to
-approximate.
+distribution within the patient. Thus, to determine the most accurate
+dose distribution during treatment planning, a compromise must be
+found between the precision and the speed of calculation. Current
+techniques, using analytic methods, models and databases, are rapid
+but lack precision. Enhanced precision can be achieved by using
+calculation codes based, for example, on Monte Carlo methods. The main
+drawback of these methods is their computation time, which can
+quickly become very large. In [] the authors proposed a novel approach, called
+Neurad, using neural networks. This approach is based on the
+collaboration of computation codes and multi-layer neural networks
+used as universal approximators. It provides a fast and accurate
+evaluation of radiation doses in any given environment for given
+irradiation parameters. As the learning step is often very time
+consuming, in \cite{bcvsv08:ip} the authors proposed a parallel
+algorithm that decomposes the learning domain into
+subdomains. The decomposition has the advantage of significantly
+reducing the complexity of the target functions to approximate.
  
  Now, as there exist several classes of distributed/parallel
  architectures (supercomputers, clusters, global computing...)  we have
@@ -117,8 +131,8 @@ XtremWeb-CH is very acceptable.
  
  The paper is organized as follows. In Section 2 we present the Neurad
  application and particularly its most time-consuming part, i.e. the
-learning step. Section 3 details the XtremWeb-CH environment while in
-Section 4 we expose the gridification of the Neurad
+learning step. Section 3 details the XtremWeb-CH environment and
+Section 4 describes the gridification of the Neurad
  application. Experimental results are presented in Section 5 and we
  conclude in Section 6 with some remarks and perspectives.
  
@@ -153,22 +167,24 @@ process and the dose deposit evaluation. The first step, the data
  production, is outside the {\it{Neurad}} project. There are many
  ways to obtain data about radiotherapy treatments, such as
  measurement or simulation. The only essential criterion is that the
-result must be obtain in a homogeneous environment. We have chosen to
-use only a Monte Carlo simulation because this kind of tool is the
-reference in the radiotherapy domains. The advantages to use data
-obtained with a Monte Carlo simulator are the following: accuracy,
-profusion, quantified error and regularity of measure points. But,
-there exist also some disagreements and the most important is the
-statistical noise, forcing a data post treatment. Figure~\ref{f_tray}
-presents the general behavior of a dose deposit in water.
+result must be obtained in a homogeneous environment. 
  
+% We have chosen to
+% use only a Monte Carlo simulation because this kind of tool is the
+% reference in the radiotherapy domains. The advantages to use data
+% obtained with a Monte Carlo simulator are the following: accuracy,
+% profusion, quantified error and regularity of measure points. But,
+% there exist also some disagreements and the most important is the
+% statistical noise, forcing a data post treatment. Figure~\ref{f_tray}
+% presents the general behavior of a dose deposit in water.
  
-\begin{figure}[http]
-  \centering
-  \includegraphics[width=0.7\columnwidth]{figures/testC.pdf}
-  \caption{Dose deposit by a photon beam  of 24 mm of width in water (normalized value).}
-  \label{f_tray}
-\end{figure}
+
+% \begin{figure}[http]
+%   \centering
+%   \includegraphics[width=0.7\columnwidth]{figures/testC.pdf}
+%   \caption{Dose deposit by a photon beam  of 24 mm of width in water (normalized value).}
+%   \label{f_tray}
+% \end{figure}
  
  The second stage of the {\it{Neurad}} project is the learning step
  and this is the most time consuming step. This step is off-line but it
@@ -211,17 +227,188 @@ differences observed at the borders are no longer relevant.
  Nonetheless, in order to preserve the performance of the parallel
  algorithm, it is important to carefully set the overlapping ratio
  $\alpha$. It must be large enough to avoid border errors, and
-as small as possible to limit the size increase of the data subsets.
+as small as possible to limit the size increase of the data subsets
+(What about our tests?).
  
  
  
  \section{The XtremWeb-CH environment}
  \input{xwch.tex}
  
+\section{The Neurad gridification}
+
+\label{sec:neurad_gridif}
+
+
+As previously described, the Neurad application can be divided into
+three steps.  The goal of the first step is to decompose the data
+representing the dose distribution on an area. This area contains
+various parameters, like the nature of the medium and its
+density. This part is outside the scope of this paper.
+%Multiple ``views'' can be
+%superposed in order to obtain a more accurate learning. 
+
+The second step of the application, and the most time consuming, is
+the learning itself. This is the step that has been parallelized,
+using the XWCH environment. As explained in Section 2, the
+parallelization relies on a partitioning of the global
+dataset. Following this partitioning, all learning tasks execute in
+parallel and independently, each on its own local part of the data, with no
+communication, following the fork-join model. Clearly, this
+computation fits well with the model of the chosen middleware.
+
+The execution scheme is then the following (see Figure
+\ref{fig:neurad_grid}):
+\begin{enumerate}
+\item We first send the learning application and its data to the
+  middleware (more precisely, to the warehouses (DW)) and create the
+  computation module.
+\item When a worker (W) is ready to compute, it requests a task
+  from the coordinator (Coord.).
+\item The coordinator assigns the worker a task. The worker then retrieves the
+application and its assigned data and can start the computation. 
+\item At the end of the learning process, the worker sends the result to a warehouse.
+\end{enumerate}
+
+The last step of the application is to retrieve these results (some
+weighted neural networks) and exploit them through a dose distribution
+process.
+
+
+\begin{figure}[ht]
+  \centering
+  \includegraphics[width=8cm]{figures/neurad_gridif}
+  \caption{The proposed Neurad gridification}
+  \label{fig:neurad_grid}
+\end{figure}
+
  \section{Experimental results}
-\section{Conclusion and future works}
+\label{sec:neurad_xp}
+
+The aim of this section is to describe and analyse the experimental
+results we have obtained with the parallel Neurad version previously
+described. Our goal was to run this application with real input
+data and on a real global computing testbed.
+
+\subsubsection{Experimental conditions}
+\label{sec:neurad_cond}
+
+The size of the input data is about 2.4Gb. To prevent data
+noise from appearing and disturbing the learning process, these data can be
+divided into at most 25 parts. This generates input data parts of
+about 15Mb (in a compressed format). The output data, which are
+retrieved after the process, are about 30Kb for each
+part. Unfortunately, the data decomposition limitation does not allow
+us to use more than 25 computers (XWCH workers). Nevertheless, we used two
+distinct deployments of XWCH:
+\begin{enumerate} 
+
+\item In the first one, called ``distributed XWCH'' in the following,
+  the XWCH coordinator and the warehouses were situated in Geneva,
+  Switzerland while the workers were running in the same local cluster
+  in Belfort, France.
+
+\item The second deployment, called ``local XWCH'', is a local
+  deployment where the coordinator, warehouses and workers were all in
+  the same local cluster.  
+
+\end{enumerate}
+For both deployments, these machines were used during the day by
+students of the Computer Science Department of the IUT of Belfort.
+
+In order to evaluate the overhead induced by the use of the platform,
+we have furthermore compared the execution of the Neurad application
+with and without the XWCH platform. In the latter case, the
+testbed consists only of workers deployed with their respective data
+by means of shell scripts. No specific middleware was used and the
+workers were in the same local cluster.
+
+Finally, five computation precisions were used: $1e^{-1}$, $0.75e^{-1}$,
+$0.50e^{-1}$, $0.25e^{-1}$ and $1e^{-2}$.
+
+
+\subsubsection{Results}
+\label{sec:neurad_result}
+
+Table \ref{tab:neurad_res} presents the execution times of the Neurad
+application on 25 machines with XWCH (local and distributed
+deployment) and without XWCH. These results measure
+the same steps for both kinds of execution, i.e. sending the local data and the
+executable, the learning process, and retrieving the results. The
+results represent the average time of $x$ executions.
+
+
+\begin{table}[h!]
+  \centering
+  \begin{tabular}{|c|c|c|c|c|}
+    \hline
+    Precision & 1 machine & Without XWCH & With distributed XWCH & With local XWCH\\
+    \hline
+     $1e^{-1}$ & 5190 & 558 & 759 & 629\\
+    $0.75e^{-1}$ & 6307 & 792 & 1298 & 801 \\
+    $0.50e^{-1}$ & 7487 & 792 & 1010 & 844 \\
+    $0.25e^{-1}$ & 7787 & 791 & 1000 & 852\\
+    $1e^{-2}$ & 11030 & 1035 & 1447 & 1108 \\
+    \hline
+  \end{tabular}
+\caption{Execution time in seconds of the Neurad application, with and without using the XWCH platform}
+  \label{tab:neurad_res}
+\end{table}
+
+%\begin{table}[ht]
+%  \centering
+%  \begin{tabular}[h]{|c|c|c|}
+%    \hline
+%    Precision & Without XWCH & With XWCH \\
+%    \hline
+%    $1e^{-1}$ & $558$s & $759$s\\
+%    \hline
+%  \end{tabular}
+%  \caption{Execution time in seconds of Neurad application, with and without using XtremWeb-CH platform}
+%  \label{tab:neurad_res}
+%\end{table}
+
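+As a reading aid for Table~\ref{tab:neurad_res}, and assuming the usual
+definitions (the symbols $S$, $O$, $T_1$, $T_{\mathrm{ref}}$ and
+$T_{\mathrm{xwch}}$ are introduced here for illustration only), the
+speed-up and the relative overhead of a deployment can be written as
+\[
+  S = \frac{T_1}{T_{\mathrm{xwch}}}, \qquad
+  O = \frac{T_{\mathrm{xwch}} - T_{\mathrm{ref}}}{T_{\mathrm{ref}}},
+\]
+where $T_1$ is the time on one machine, $T_{\mathrm{ref}}$ the time on
+25 machines without XWCH and $T_{\mathrm{xwch}}$ the time on 25 machines
+with XWCH. For the $1e^{-2}$ row and the local deployment, for instance,
+this gives
+\[
+  S = \frac{11030}{1108} \approx 9.95, \qquad
+  O = \frac{1108 - 1035}{1035} \approx 7.1\%.
+\]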
+
+As we can see, in the case of a local deployment the overhead induced
+by the use of the XWCH platform is about $7\%$. This is clearly a low
+overhead. For the distributed deployment, the overhead is about
+$34\%$. Given the benefits of the platform, this is a very
+acceptable overhead, which can be explained by the following points.
+
+First, we point out that the execution conditions are not strictly
+identical with and without XWCH. In the latter case,
+though the same steps were performed, all transfers take place inside a
+local cluster with a high bandwidth and a low latency, whereas when
+using XWCH, all transfers (between data warehouses, workers,
+and the coordinator) used a wide area network with a lower
+bandwidth.  In addition, in executions without XWCH, all the machines
+started the computation immediately, whereas when using the XWCH
+platform, a latency is introduced by the fact that a computation
+starts on a machine only when that machine requests a task.
+
+This underlines that, unsurprisingly, deploying a local
+coordinator and one or more warehouses near a cluster of workers can
+enhance computation and platform performance. 
+
  
+\section{Conclusion and future work}
  
+In this paper, we have presented a gridification of a real medical
+application, the Neurad application. This radiotherapy application
+tries to optimize the irradiated dose distribution within a
+patient. Based on a multi-layer neural network, this application
+has a very time-consuming step, i.e. the learning step. Due to the
+computing characteristics of this step, we chose to parallelize it
+using the XtremWeb-CH global computing environment. The
+experimental results show good speed-ups and underline that the overheads
+induced by XWCH are very acceptable, making it a good candidate
+for deploying parallel applications over a global computing environment.
+
+Our future work includes testing the application on a
+larger-scale testbed. This implies choosing an input data set
+allowing a finer decomposition. Unfortunately, this choice of input
+data is not trivial and relies on a large number of parameters
+(ask Marc for details here).
  
  \bibliographystyle{plain}
  \bibliography{biblio}