4 %\documentclass[a4paper,11pt]{article}
6 \usepackage[T1]{fontenc}
7 \usepackage[utf8]{inputenc}
8 \usepackage{graphicx,subfigure,graphics}
10 %\usepackage[usenames]{color}
11 %\usepackage{latexsym,stmaryrd}
12 %\usepackage{amsfonts,amssymb}
13 \usepackage{verbatim,theorem,moreverb}
14 %\usepackage{float,floatflt}
15 \usepackage{boxedminipage}
21 \usepackage{algorithm}
22 \usepackage{algorithmic}
23 %\usepackage{floatfig}
28 \def\sfixme#1{\fbox{\textbf{FIXME: }#1}}
30 \newcommand{\fixme}[1]{%
32 \begin{boxedminipage}{.8\linewidth}
37 \newcommand{\FIXME}[1]{\marginpar[\null\hspace{2cm} FIXME]{FIXME} \fixme{#1}}
39 %\psfigurepath{.:fig:IMAGES}
40 \graphicspath{{.}{fig/}{IMAGES/}}
46 \title{Gridification of a Radiotherapy Dose Computation Application with the XtremWeb-CH Environment}
49 \author{Nabil Abdennhader\inst{1} \and Mohamed Ben Belgacem\inst{1} \and Raphaël Couturier\inst{2} \and
50 David Laiymani\inst{2} \and Sébastien Miquée\inst{2} \and Marko Niinimaki\inst{1} \and Marc Sauget\inst{3}}
53 University of Applied Sciences Western Switzerland, hepia Geneva,
55 \email{nabil.abdennadher@hesge.ch,mohamed.benbelgacem@unige.ch,markopekka.niinimaeki@hesge.ch}
57 Laboratoire d'Informatique de l'universit\'{e}
58 de Franche-Comt\'{e} \\
59 IUT Belfort-Montbéliard, Rue Engel Gros, 90016 Belfort - France \\
60 \email{\{raphael.couturier,david.laiymani,sebastien.miquee\}@univ-fcomte.fr}
62 FEMTO-ST, ENISYS/IRMA, F-25210 Montb\'{e}liard , FRANCE\\
63 \email{marc.sauget@univ-fcomte.fr}
70 This paper presents the design and the evaluation of the
71 gridification of a radiotherapy dose computation application. Due to
72 the inherent characteristics of the application and its execution,
73 we choose the architectural context of global (or volunteer)
74 computing. For this, we used the XtremWeb-CH
75 environment. Experiments were conducted on a real global computing
76 testbed and show good speed-ups and very acceptable platform
77 overhead letting XtremWeb-CH be a good candidate for deploying
78 parallel applications over a global computing environment.
82 %-------------INTRODUCTION--------------------
83 \section{Introduction}
85 The use of distributed architectures for solving large scientific
86 problems seems to become mandatory in a lot of cases. For example, in
87 the domain of radiotherapy dose computation the problem is
88 crucial. The main goal of external beam radiotherapy is the treatment
89 of tumors while minimizing exposure to healthy tissue. Dosimetric
90 planning has to be carried out in order to optimize the dose
91 distribution within the patient. Thus, to determine the most accurate
92 dose distribution during treatment planning, a compromise must be
93 found between the precision and the speed of calculation. Current
94 techniques, using analytic methods, models and databases, are rapid
95 but lack precision. Enhanced precision can be achieved by using
96 calculation codes based, for example, on Monte Carlo methods. The main
97 drawback of these methods is their computation times which can be
98 rapidly huge. In \cite{NIMB2008} the authors proposed a novel approach, called
99 Neurad, using neural networks. This approach is based on the
100 collaboration of computation codes and multi-layer neural networks
101 used as universal approximators. It provides a fast and accurate
102 evaluation of radiation doses in any given environment for given
103 irradiation parameters. As the learning step is often very time
104 consuming, in \cite{AES2009} the authors proposed a parallel
105 algorithm that enables to decompose the learning domain into
106 subdomains. The decomposition has the advantage to significantly
107 reduce the complexity of the target functions to approximate.
109 Now, as there exist several classes of distributed/parallel
110 architectures (supercomputers, clusters, global computing...) we have
111 to choose the best suited one for the parallel Neurad application.
112 The Global or Volunteer Computing model seems to be an interesting
113 approach. Here, the computing power is obtained by aggregating unused
114 (or volunteer) public resources connected to the Internet. For our
115 case, we can imagine for example, that a part of the architecture will
116 be composed of some of the different computers of the hospital. This
117 approach presents the advantage to be clearly cheaper than a more
118 dedicated approach like the use of supercomputers or clusters.
120 The aim of this paper is to propose and evaluate a gridification of
121 the Neurad application (more precisely, of the most time consuming
122 part, the learning step) using a Global Computing approach. For this,
123 we focus on the XtremWeb-CH environment\cite{}. We choose this environment
124 because it tackles the centralized aspect of other global computing
125 environments such as XtremWeb\cite{} or Seti\cite{}. It tends to a
126 peer-to-peer approach by distributing some components of the
127 architecture. For instance, the computing nodes are allowed to
128 directly communicate. Experiments were conducted on a real Global
129 Computing testbed. The results are very encouraging. They exhibit an
130 interesting speed-up and show that the overhead induced by the use of
131 XtremWeb-CH is very acceptable.
133 The paper is organized as follows. In Section 2 we present the Neurad
134 application and particularly its most time consuming part, i.e. the
135 learning step. Section 3 details the XtremWeb-CH environment and
136 Section 4 exposes the gridification of the Neurad
137 application. Experimental results are presented in Section 5 and we
138 end in Section 6 by some concluding remarks and perspectives.
140 \section{The Neurad application}
144 \includegraphics[width=0.7\columnwidth]{figures/neurad.pdf}
145 \caption{The Neurad project}
149 The \emph{Neurad}~\cite{Neurad} project presented in this paper takes place in a
150 multi-disciplinary project, involving medical physicists and computer scientists
151 whose goal is to enhance the treatment planning of cancerous tumors by external
152 radiotherapy. In our previous works~\cite{RADIO09,ICANN10,NIMB2008}, we have
153 proposed an original approach to solve scientific problems whose accurate
154 modeling and/or analytical description are difficult. That method is based on
155 the collaboration of computational codes and neural networks used as universal
156 interpolator. Thanks to that method, the \emph{Neurad} software provides a fast
157 and accurate evaluation of radiation doses in any given environment (possibly
158 inhomogeneous) for given irradiation parameters. We have shown in a previous
159 work (\cite{AES2009}) the interest to use a distributed algorithm for the neural
160 network learning. We use a classical RPROP~\footnote{Resilient backpropagation}
161 algorithm with a HPU~\footnote{High order processing units} topology to do the
162 training of our neural network.
164 Figure~\ref{f_neurad} presents the {\it{Neurad}} scheme. Three parts are clearly
165 independent: the initial data production, the learning process and the dose
166 deposit evaluation. The first step, the data production, is outside of the
167 {\it{Neurad}} project. They are many solutions to obtain data about the
168 radiotherapy treatments like the measure or the simulation. The only essential
169 criterion is that the result must be obtained in an homogeneous environment.
172 % use only a Monte Carlo simulation because this kind of tool is the
173 % reference in the radiotherapy domains. The advantages to use data
174 % obtained with a Monte Carlo simulator are the following: accuracy,
175 % profusion, quantified error and regularity of measure points. But,
176 % there exist also some disagreements and the most important is the
177 % statistical noise, forcing a data post treatment. Figure~\ref{f_tray}
178 % presents the general behavior of a dose deposit in water.
181 % \begin{figure}[http]
183 % \includegraphics[width=0.7\columnwidth]{figures/testC.pdf}
184 % \caption{Dose deposit by a photon beam of 24 mm of width in water (normalized value).}
188 The secondary stage of the {\it{Neurad}} project is the learning step and this
189 is the most time consuming step. This step is performed off-line but it is
190 important to reduce the time used for the learning process to keep a workable
191 tool. Indeed, if the learning time is too huge (for the moment, this time could
192 reach one week for a limited domain), this process should not be launched at any
193 time, but only when a major modification occurs in the environment, like a
194 change of context for instance. However, it is interesting to update the
195 knowledge of the neural network, by using the learning process, when the domain
196 evolves (evolution in material used for the prosthesis or evolution on the beam
197 (size, shape or energy)). The learning time is related to the volume of data who
198 could be very important in a real medical context. A work has been done to
199 reduce this learning time with the parallelization of the learning process by
200 using a partitioning method of the global dataset. The goal of this method is to
201 train many neural networks on sub-domains of the global dataset. After this
202 training, the use of these neural networks all together allows to obtain a
203 response for the global domain of study.
208 \includegraphics[width=0.5\columnwidth]{figures/overlap.pdf}
209 \caption{Overlapping for a sub-network in a two-dimensional domain with ratio
214 % j'ai relu mais pas vu le probleme
216 However, performing the learning on sub-domains constituting a partition of the
217 initial domain is not satisfying according to the quality of the results. This
218 comes from the fact that the accuracy of the approximation performed by a neural
219 network is not constant over the learned domain. Thus, it is necessary to use an
220 overlapping of the sub-domains. The overall principle is depicted in
221 Figure~\ref{fig:overlap}. In this way, each sub-network has an exploitation
222 domain smaller than its training domain and the differences observed at the
223 borders are no longer relevant. Nonetheless, in order to preserve the
224 performance of the parallel algorithm, it is important to carefully set the
225 overlapping ratio $\alpha$. It must be large enough to avoid the border's
226 errors, and as small as possible to limit the size increase of the data
227 subsets~\cite{AES2009}.
229 %(Qu'en est-il pour nos test ?).
230 % Ce paramètre a deja été etudié dans un précédent papier, il a donc choisi d'être fixe
234 \section{The XtremWeb-CH environment}
237 \section{The Neurad gridification}
239 \label{sec:neurad_gridif}
242 As previously exposed, the Neurad application can be divided into
243 three steps. The goal of the first step is to decompose the data
244 representing the dose distribution on an area. This area contains
245 various parameters, like the nature of the medium and its
246 density. This part is out of the scope of this paper.
247 %Multiple ``views'' can be
248 %superposed in order to obtain a more accurate learning.
250 The second step of the application, and the most time consuming, is
251 the learning itself. This is the one which has been parallelized,
252 using the XWCH environment. As exposed in the section 2, the
253 parallelization relies on a partitionning of the global
254 dataset. Following this partitionning all learning tasks are executed
255 in parallel independently with their own local data part, with no
256 communication, following the fork/join model. Clearly, this
257 computation fits well with the model of the chosen middleware.
259 The execution scheme is then the following (see Figure
260 \ref{fig:neurad_grid}):
262 \item We first send the learning application and its data to the
263 middleware (more precisely on warehouses (DW)) and create the
265 \item When a worker (W) is ready to compute, it requests a task to
266 execute to the coordinator (Coord.);
267 \item The coordinator assigns the worker a task. This last one retrieves the
268 application and its assigned data and so can start the computation.
269 \item At the end of the learning process, the worker sends the result to a warehouse.
272 The last step of the application is to retrieve these results (some
273 weighted neural networks) and exploit them through a dose distribution
274 process. This latter step is out of the scope of this paper.
279 \includegraphics[width=8cm]{figures/neurad_gridif}
280 \caption{The proposed Neurad gridification}
281 \label{fig:neurad_grid}
284 \section{Experimental results}
285 \label{sec:neurad_xp}
287 The aim of this section is to describe and analyze the experimental
288 results we have obtained with the parallel Neurad version previously
289 described. Our goal was to carry out this application with real input
290 data and on a real global computing testbed.
292 \subsubsection{Experimental conditions}
293 \label{sec:neurad_cond}
295 The size of the input data is about 2.4Gb. In order to avoid that data
296 noise appears and disturbs the learning process, these data can be
297 divided into, at most, 25 parts. This generates input data parts of
298 about 15Mb (in a compressed format). The output data, which are
299 retrieved after the process, are about 30Kb for each
300 part. Unfortunately, the data decomposition limitation does not allow
301 us to use more than 25 computers (XWCH workers). Nevertheless, we used two
302 distinct deployments of XWCH:
305 \item In the first one, called ``distributed XWCH'' in the following,
306 the XWCH coordinator and the warehouses were located in Geneva,
307 Switzerland while the workers were running in the same local cluster
310 \item The second deployment, called ``local XWCH'' is a local
311 deployment where both coordinator, warehouses and workers were in
312 the same local cluster.
315 For both deployments, during the day these machines were used by
316 students of the Computer Science Department of the IUT of Belfort.
318 In order to evaluate the overhead induced by the use of the platform
319 we have furthermore compared the execution of the Neurad application
320 with and without the XWCH platform. For the latter case, we mean that the
321 testbed consists only in workers deployed with their respective data
322 by the use of shell scripts. No specific middleware was used and the
323 workers were in the same local cluster.
325 Finally, five computation precisions were used: $1e^{-1}$, $0.75e^{-1}$,
326 $0.50e^{-1}$, $0.25e^{-1}$, and $1e^{-2}$.
329 \subsubsection{Results}
330 \label{sec:neurad_result}
332 Table \ref{tab:neurad_res} presents the execution times of the Neurad
333 application on 25 machines with XWCH (local and distributed
334 deployment) and without XWCH. These results correspond to the measures
335 of the same steps for both kinds of execution, i.e. sending of local
336 data and the executable, the learning process, and retrieving the
337 results. Results represent the average time of $?? x ??$ executions.
341 \renewcommand{\arraystretch}{1.7}
343 \begin{tabular}[h!]{|c|c|c|c|c|}
345 ~Precision~ & ~1 machine~ & ~Without XWCH~ & ~With XWCH~ & ~With
348 $1e^{-1}$ & 5190 & 558 & 759 & 629\\
349 $0.75e^{-1}$ & 6307 & 792 & 1298 & 801 \\
350 $0.50e^{-1}$ & 7487 & 792 & 1010 & 844 \\
351 $0.25e^{-1}$ & 7787 & 791 & 1000 & 852\\
352 $1e^{-2}$ & 11030 & 1035 & 1447 & 1108 \\
356 \caption{Execution time in seconds of the Neurad application, with and without using the XWCH platform}
357 \label{tab:neurad_res}
362 % \begin{tabular}[h]{|c|c|c|}
364 % Precision & Without XWCH & With XWCH \\
366 % $1e^{-1}$ & $558$s & $759$s\\
369 % \caption{Execution time in seconds of Neurad application, with and without using XtremWeb-CH platform}
370 % \label{tab:neurad_res}
374 As we can see, in the case of a local deployment the overhead induced
375 by the use of the XWCH platform is about $7\%$. It is clearly a low
376 overhead. Now, for the distributed deployment, the overhead is about
377 $34\%$. Regarding to the benefits of the platform, it is a very
378 acceptable overhead which can be explained by the following points.
380 First, we point out that the conditions of executions are not really
381 identical between with and without XWCH contexts. For this last one,
382 though the same steps were done, all transfer processes are inside a
383 local cluster with a high bandwidth and a low latency. Whereas when
384 using XWCH, all transfer processes (between datawarehouses, workers,
385 and the coordinator) used a wide network area with a smaller
386 bandwidth. In addition, in executions without XWCH, all the machines
387 started immediately the computation, whereas when using the XWCH
388 platform, a latency is introduced by the fact that a computation
389 starts on a machine, only when this one requests a task.
391 This underlines that, unsurprisingly, deploying a local
392 coordinator and one or more warehouses near a cluster of workers can
393 enhance computations and platform performances.
396 \section{Conclusion and future works}
398 In this paper, we have presented a gridification of a real medical
399 application, the Neurad application. This radiotherapy application
400 tries to optimize the irradiated dose distribution within a
401 patient. Based on a multi-layer neural network, this application
402 presents a very time consuming step, i.e. the learning step. Due to the
403 computing characteristics of this step, we choose to parallelize it
404 using the XtremWeb-CH global computing environment. Obtained
405 experimental results show good speed-ups and underline that overheads
406 induced by XWCH are very acceptable, letting it be a good candidate
407 for deploying parallel applications over a global computing environment.
409 Our future works include the testing of the application on a more
410 large scale testbed. This implies, the choice of a data input set
411 allowing a finer decomposition. Unfortunately, this choice of input
412 data is not trivial and relies on a large number of parameters
414 %(demander ici des précisions à Marc).
415 % Si tu veux parler de l'ensembles des paramètres que l'on peut utiliser pour caractériser les conditions d'irradiations
417 % - caracteristiques du faisceaux d'irradiation (beam size (de quelques mm à plus de 40 cm), energy, SSD (source surface distance),
418 % - caractéritiques de la matière : density
422 \bibliographystyle{plain}
423 \bibliography{biblio}