From: Sébastien Miquée Date: Thu, 16 Dec 2010 15:53:58 +0000 (+0100) Subject: English correction and rephrasing. X-Git-Url: http://info.iut-bm.univ-fcomte.fr/pub/gitweb/gpc2011.git/commitdiff_plain/a1415d6476d377efc33ae06dac9c949ee467e0ee English correction and rephrasing. --- diff --git a/gpc2011.tex b/gpc2011.tex index 027142a..15928ff 100644 --- a/gpc2011.tex +++ b/gpc2011.tex @@ -68,113 +68,126 @@ %-------------INTRODUCTION-------------------- \section{Introduction} -The use of distributed architectures for solving large scientific problems seems -to become mandatory in a lot of cases. For example, in the domain of -radiotherapy dose computation the problem is crucial. The main goal of external -beam radiotherapy is the treatment of tumours while minimizing exposure to -healthy tissue. Dosimetric planning has to be carried out in order to optimize -the dose distribution within the patient is necessary. Thus, for determining the -most accurate dose distribution during treatment planning, a compromise must be -found between the precision and the speed of calculation. Current techniques, -using analytic methods, models and databases, are rapid but lack -precision. Enhanced precision can be achieved by using calculation codes based, -for example, on Monte Carlo methods. In [] the authors proposed a novel approach -based on the use of neural networks. The approach is based on the collaboration -of computation codes and multi-layer neural networks used as universal -approximators. It provides a fast and accurate evaluation of radiation doses in -any given environment for given irradiation parameters. As the learning step is -often very time consumming, in \cite{bcvsv08:ip} the authors proposed a parallel -algorithm that enable to decompose the learning domain into subdomains. The -decomposition has the advantage to significantly reduce the complexity of the -target functions to approximate. - -Now, as there exist several classes of distributed/parallel architectures -(supercomputers, clusters, global computing...) we have to choose the best -suited one for the parallel Neurad application. The Global or Volunteer -computing model seems to be an interesting approach. Here, the computing power -is obtained by agregating unused (or volunteer) public resources connected to -the Internet. For our case, we can imagine for example, that a part of the -architecture will be composed of some of the different computers of the -hospital. This approach present the advantage to be clearly cheaper than a more -dedicated approach like the use of supercomputer or clusters. - -The aim of this paper is to propose and evaluate a gridification of the Neurad -application (more precisely, of the most time consuming part, the learning step) -using a Global computing approach. For this, we focus on the XtremWeb-CH -environnement []. We choose this environnent because it tackles the centralized -aspect of other global computing environments such as XTremWeb [] or Seti []. It -tends to a peer-to-peer approach by distributing some components of the -architecture. For instance, the computing nodes are allowed to directly -communicate. Experimentations were conducted on a real Global Computing -testbed. The results are very encouraging. They exhibit an interesting speed-up -and show that the overhead induced by the use of XTremWeb-CH is very acceptable. - -The paper is organized as follows. In section 2 we present the Neurad -application and particularly it most time consuming part i.e. the learning -step. 
Section 3 details the XtremWeb-CH environnement while in section 4 we
-expose the gridification of the Neurad application. Experimental results are
-presented in section 5 and we end in section 6 by some concluding remarks and
-perspectives.
+The use of distributed architectures for solving large scientific
+problems seems to have become mandatory in many cases. For example, in
+the domain of radiotherapy dose computation the problem is
+crucial. The main goal of external beam radiotherapy is the treatment
+of tumors while minimizing exposure to healthy tissue. Dosimetric
+planning has to be carried out in order to optimize the dose
+distribution within the patient. Thus, to determine the most accurate
+dose distribution during treatment planning, a compromise must be
+found between the precision and the speed of calculation. Current
+techniques, using analytic methods, models and databases, are rapid
+but lack precision. Enhanced precision can be achieved by using
+calculation codes based, for example, on Monte Carlo methods. In []
+the authors proposed a novel approach using neural networks. This
+approach is based on the collaboration of computation codes and
+multi-layer neural networks used as universal approximators. It
+provides a fast and accurate evaluation of radiation doses in any
+given environment for given irradiation parameters. As the learning
+step is often very time consuming, in \cite{bcvsv08:ip} the authors
+proposed a parallel algorithm that enables the decomposition of the
+learning domain into subdomains. This decomposition has the advantage
+of significantly reducing the complexity of the target functions to
+approximate.
+
+Now, as there exist several classes of distributed/parallel
+architectures (supercomputers, clusters, global computing, etc.), we
+have to choose the one best suited to the parallel Neurad application.
+The Global or Volunteer Computing model seems to be an interesting
+approach. Here, the computing power is obtained by aggregating unused
+(or volunteer) public resources connected to the Internet. In our
+case, we can imagine, for example, that a part of the architecture
+will be composed of some of the different computers of the hospital.
+This approach presents the advantage of being clearly cheaper than
+more dedicated approaches such as the use of supercomputers or
+clusters.
+
+The aim of this paper is to propose and evaluate a gridification of
+the Neurad application (more precisely, of its most time consuming
+part, the learning step) using a Global Computing approach. For this,
+we focus on the XtremWeb-CH environment []. We chose this environment
+because it tackles the centralized aspect of other global computing
+environments such as XtremWeb [] or SETI []. It tends towards a
+peer-to-peer approach by distributing some components of the
+architecture. For instance, the computing nodes are allowed to
+communicate directly. Experiments were conducted on a real Global
+Computing testbed. The results are very encouraging. They exhibit an
+interesting speed-up and show that the overhead induced by the use of
+XtremWeb-CH is very acceptable.
+
+The paper is organized as follows. In Section 2 we present the Neurad
+application and particularly its most time consuming part, i.e. the
+learning step. Section 3 details the XtremWeb-CH environment, while in
+Section 4 we present the gridification of the Neurad
+application. Experimental results are presented in Section 5 and we
+end in Section 6 with some concluding remarks and perspectives.
 \section{The Neurad application}
 \begin{figure}[http]
   \centering
   \includegraphics[width=0.7\columnwidth]{figures/neurad.pdf}
-  \caption{The Neurad projects}
+  \caption{The Neurad project}
   \label{f_neurad}
 \end{figure}

-The \emph{Neurad}~\cite{Neurad} project presented in this paper takes place in a
-multi-disciplinary project , involving medical physicists and computer
-scientists whose goal is to enhance the treatment planning of cancerous tumors
-by external radiotherapy. In our previous
-works~\cite{RADIO09,ICANN10,NIMB2008}, we have proposed an original approach to
-solve scientific problems whose accurate modeling and/or analytical description
-are difficult. That method is based on the collaboration of computational codes
-and neural networks used as universal interpolator. Thanks to that method, the
-\emph{Neurad} software provides a fast and accurate evaluation of radiation
-doses in any given environment (possibly inhomogeneous) for given irradiation
-parameters. We have shown in a previous work (\cite{AES2009}) the interest to
-use a distributed algorithm for the neural network learning. We use a classical
-RPROP algorithm with a HPU topology to do the training of our neural network.
-
-The Figure~\ref{f_neurad} presents the {\it{Neurad}} scheme. Three parts are
-clearly independant : the initial data production, the learning process and the
-dose deposit evaluation. The first step, the data production, is outside the
-{\it{Neurad}} project. They are many solutions to obtains data about the
-radiotherapy treatments like the measure or the simulation. The only essential
-criterion is that the result must be obtain in a homogeneous environment. We
-have chosen to use only a Monte Carlo simulation because this tools are the
-references in the radiotherapy domains. The advantages to use data obtain with a
-Monte Carlo simulator are the following : accuracy, profusing, quantify error
-and regularity of measure point. But, they are too disagreement and the most
-important is the statistical noise forcing a data post treatment. The
-Figure~\ref{f_tray} present the general behavior of a dose deposit in water.
+The \emph{Neurad}~\cite{Neurad} project presented in this paper is
+part of a multi-disciplinary project involving medical physicists and
+computer scientists, whose goal is to enhance the treatment planning
+of cancerous tumors by external radiotherapy. In our previous
+works~\cite{RADIO09,ICANN10,NIMB2008}, we have proposed an original
+approach to solve scientific problems whose accurate modeling and/or
+analytical description are difficult. This method is based on the
+collaboration of computational codes and neural networks used as
+universal interpolators. Thanks to this method, the \emph{Neurad}
+software provides a fast and accurate evaluation of radiation doses in
+any given environment (possibly inhomogeneous) for given irradiation
+parameters. We have shown in a previous work~\cite{AES2009} the
+interest of using a distributed algorithm for the neural network
+learning. We use a classical RPROP algorithm with an HPU topology to
+train our neural network.
+
+Figure~\ref{f_neurad} presents the {\it{Neurad}} scheme. Three parts
+are clearly independent: the initial data production, the learning
+process and the dose deposit evaluation. The first step, the data
+production, is outside the {\it{Neurad}} project. There are many
+solutions to obtain data about radiotherapy treatments, such as
+measurements or simulations.
+The only essential criterion is that the result must be obtained in a
+homogeneous environment. We have chosen to use only Monte Carlo
+simulations because this kind of tool is the reference in the
+radiotherapy domain. The advantages of using data obtained with a
+Monte Carlo simulator are the following: accuracy, profusion,
+quantified error and regularity of the measurement points. However,
+there are also some drawbacks, the most important being the
+statistical noise, which requires a post-treatment of the data.
+Figure~\ref{f_tray} presents the general behavior of a dose deposit in
+water.

 \begin{figure}[http]
   \centering
   \includegraphics[width=0.7\columnwidth]{figures/testC.pdf}
-  \caption{Dose deposit by a photon beam of 24 mm of width in water (Normalized value). }
+  \caption{Dose deposit by a 24~mm wide photon beam in water (normalized value).}
   \label{f_tray}
 \end{figure}

-The secondary stage of the {\it{Neurad}} project is about the learning step and
-it is the most time consuming step. This step is off-line but is it important to
-reduce the time used for the learning process to keep a workable tools. Indeed,
-if the learning time is too important (for the moment, this time could reach one
-week for a limited works domain), the use of this process could be be limited
-only at a major modification of the use context. However, it is interesting to
-do an update to the learning process when the bound of the learning domain
-evolves (evolution in material used for the prosthesis or evolution on the beam
-(size, shape or energy)). The learning time is linked with the volume of data
-who could be very important in real medical context. We have work to reduce
-this learning time with a parallel method of the learning process using a
-partitioning method of the global dataset. The goal of this method is to train
-many neural networks on sub-domain of the global dataset. After this training,
-the use of this neural networks together allows to obtain a response for the
-global domain of study.
+The second stage of the {\it{Neurad}} project is the learning step,
+which is the most time consuming one. This step is off-line, but it is
+important to reduce the time used for the learning process in order to
+keep a workable tool. Indeed, if the learning time is too long
+(currently it can reach one week for a limited domain), this process
+cannot be launched at any time, but only when a major modification
+occurs in the environment, such as a change of context. However, it is
+interesting to update the knowledge of the neural network, by using
+the learning process, when the domain evolves (evolution of the
+materials used for the prosthesis, or of the beam size, shape or
+energy). The learning time is related to the volume of data, which can
+be very large in a real medical context. Work has been done to reduce
+this learning time through the parallelization of the learning
+process, using a partitioning of the global dataset. The goal of this
+method is to train many neural networks on sub-domains of the global
+dataset. After this training, using these neural networks together
+makes it possible to obtain a response for the global domain of study.

 \begin{figure}[h]
@@ -186,25 +199,25 @@ global domain of study.
 \end{figure}

-However, performing the learnings on sub-domains constituting a partition of the
-initial domain is not satisfying according to the quality of the results.
-This comes from the fact that the accuracy of the approximation performed by a neural
-network is not constant over the learned domain. Thus, it is necessary to use
-an overlapping of the sub-domains. The overall principle is depicted in
-Figure~\ref{fig:overlap}. In this way, each sub-network has an exploitation
-domain smaller than its training domain and the differences observed at the
-borders are no longer relevant. Nonetheless, in order to preserve the
-performances of the parallel algorithm, it is important to carefully set the
-overlapping ratio $\alpha$. It must be large enough to avoid the border's
-errors, and as small as possible to limit the size increase of the data subsets.
-
-
+However, performing the learning on sub-domains that constitute a
+partition of the initial domain is not satisfactory with respect to
+the quality of the results. This comes from the fact that the accuracy
+of the approximation performed by a neural network is not constant
+over the learned domain. Thus, it is necessary to use an overlapping
+of the sub-domains. The overall principle is depicted in
+Figure~\ref{fig:overlap}. In this way, each sub-network has an
+exploitation domain smaller than its training domain and the
+differences observed at the borders are no longer relevant.
+Nonetheless, in order to preserve the performance of the parallel
+algorithm, it is important to carefully set the overlapping ratio
+$\alpha$. It must be large enough to avoid border errors, and as small
+as possible to limit the size increase of the data subsets.

 \section{The XtremWeb-CH environment}
-\section{The XtremWeb-CH environment (\textit{XWCH})}
 \input{xwch.tex}
+
 \section{Experimental results}

 \section{Conclusion and future works}

diff --git a/xwch.tex b/xwch.tex
index d38b719..ea36ed9 100644
--- a/xwch.tex
+++ b/xwch.tex
@@ -1,53 +1,72 @@
-%-----------------------------------------------------------------------------------------------------------------------------
+%------------------------------------
 % The XtremWeb-CH environment
-%-----------------------------------------------------------------------------------------------------------------------------
-XtremWeb-CH (XWCH) is a volunteer computing inspired, large-scale computing platform for distributed applications. It consist of three
-components: one coordinator, a set of workers and at least one warehouse. Client programs utilise these components.
+%------------------------------------
+XtremWeb-CH (XWCH) is a volunteer computing inspired, large-scale
+computing platform for distributed applications. It consists of three
+components: one coordinator, a set of workers and at least one
+warehouse. Client programs use these components.

-The coordinator is the main component of the XWCH platform. It controls user access and schedules jobs to workers. It provides a web
-interface for managing jobs and users, and a set of web services. These are user service and worker/warehouse services implemented using
-WSDL \cite{WebServ2002}.
+The coordinator is the main component of the XWCH platform. It
+controls user access and schedules jobs to workers. It provides a web
+interface for managing jobs and users, and a set of web
+services. These are the user service and the worker/warehouse
+services, implemented using WSDL \cite{WebServ2002}.
-A worker is a Java daemon that runs on the user machine. Assumed to be volatile, the workers reports periodically
-themselves to the coordinator, accept jobs, retrieve input, compute jobs, and store the results of the computation on warehouses. If the
-coordinator does not receive a signal from a worker, it will simply remove it from the scheduling list, and if a job had been assigned to that
-worker, it will be re-assigned to another one. A schema of the architecture is shown in Figure 4.
+A worker is a Java daemon that runs on the user machine. Assumed to be
+volatile, the workers periodically report themselves to the
+coordinator, accept jobs, retrieve input, compute jobs, and store the
+results of the computation on warehouses. If the coordinator does not
+receive a signal from a worker, it will simply remove it from the
+scheduling list, and if a job had been assigned to that worker, it
+will be re-assigned to another one. A schema of the architecture is
+shown in Figure 4.

-\begin{figure}[hb]
+\begin{figure}[htp]
 \begin{centering}
 \includegraphics [scale=0.2]{figures/xwcharchitecture.pdf}
 \caption{The XWCH Architecture}
- % \label{Figure 4: The XWCH Architecture}
 \end{centering}
 \end{figure}

-A warehouse is a file server that acts as a data storage system for workers and client programs.
-Workers may not necessarily be able to communicate directly with each other, due to firewalls and NAT subnetworks.
-For these reasons, warehouses are used as intermediaries to exchange, store and retrieve data.
+A warehouse is a file server that acts as a data storage system for
+workers and client programs. Workers may not necessarily be able to
+communicate directly with each other, due to firewalls and NAT
+sub-networks. For these reasons, warehouses are used as intermediaries
+to exchange, store and retrieve data.

-Job submission is done by a client program which is written using a flexible API, available for Java and C/C++ programs. The client program
-runs on a “client node” and calls the user services to submit jobs (Figure 1, (1)). The main flexibility provided by the use of this
-architecture is to control and generate dynamically jobs especially when their number can not be known in advance. Communications between
-the coordinator and the workers are always initiated by the workers following a pull model (Figure 1, (2)):
+Job submission is done by a client program written using a flexible
+API, available for Java and C/C++ programs. The client program runs on
+a “client node” and calls the user services to submit jobs (Figure 1,
+(1)). The main flexibility provided by this architecture is the
+ability to dynamically control and generate jobs, especially when
+their number cannot be known in advance. Communications between the
+coordinator and the workers are always initiated by the workers,
+following a pull model (Figure 1, (2)):

 \begin{itemize}
- \item Workers receive jobs (Figure 1, (3)) only if they send a “work request” signal.
- \item When a worker finishes its job, it stores its output file on warehouse and sends a “work result” signal to the coordinator.
- \item During its execution, a worker (respectively warehouse) periodically sends “work alive” to the worker service (respectively warehouse service)
-to report itself to the coordinator.
+\item Workers receive jobs (Figure 1, (3)) only if they send a “work
+  request” signal;
+\item When a worker finishes its job, it stores its output file on a
+  warehouse and sends a “work result” signal to the coordinator;
+\item During its execution, a worker (respectively warehouse)
+  periodically sends a “work alive” signal to the worker service
+  (respectively warehouse service) to report itself to the
+  coordinator.
 \end{itemize}

-As a whole, XWCH is easy to install, maintain ans use. Its components are programmed mainly using Java, and their process memory sizes in a typical 32-bit Linux computer are shown below.
+
+As a whole, XWCH is easy to install, maintain and use. Its components
+are programmed mainly using Java, and their process memory sizes in a
+typical 32-bit GNU/Linux computer are:

 \begin{itemize}
- \item Coordinator 190 MB including the Glassfish Java container
- \item Worker 40 MB
- \item Warehouse 80 MB
+ \item Coordinator: 190 MB, including the Glassfish Java container;
+ \item Worker: 40 MB;
+ \item Warehouse: 80 MB.
 \end{itemize}

-Experiments, presented in \cite{ccgridpaper}, shows that the performance of XWCH is comparable with Condor \cite{Condor1988}, another
-non-intrusive computing system that has similar functionality but is somewhat more difficult to install .
-
-The main characteristics of the new version of XWCH, compared to its previous versions, are: dynamic job generation, flexible data
-sharing (data replication) and persistent jobs. These features are presented in \cite{VEZGrid} and will not be detailed in this paper.
-
-
-
+Experiments presented in \cite{ccgridpaper} show that the performance
+of XWCH is comparable to that of Condor \cite{Condor1988}, another
+non-intrusive computing system that has similar functionality but is
+somewhat more difficult to install.
+
+The main characteristics of the new version of XWCH, compared to
+previous ones, are: dynamic job generation, flexible data sharing
+(data replication) and persistent jobs. These features are presented
+in \cite{VEZGrid} and will not be detailed in this paper.
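
To make the overlapping decomposition and its gridification more concrete, the
following minimal Java sketch partitions a one-dimensional learning domain into
sub-domains whose training intervals are enlarged by the overlapping ratio
alpha on each side, as described above. All class and method names are
illustrative assumptions, and the commented submitJob() call is only a
placeholder: it does not correspond to the actual XWCH client API.

    import java.util.ArrayList;
    import java.util.List;

    /** Illustrative sketch of the overlapping sub-domain decomposition. */
    public class OverlapPartitionSketch {

        /** One sub-domain: the exploitation interval [exploitLo, exploitHi] is
         *  enlarged by a margin of alpha * width on each side (clipped to the
         *  global domain) to obtain the training interval. */
        static final class SubDomain {
            final double exploitLo, exploitHi; // interval used after training
            final double trainLo, trainHi;     // interval used for the learning step

            SubDomain(double lo, double hi, double alpha,
                      double globalLo, double globalHi) {
                exploitLo = lo;
                exploitHi = hi;
                double margin = alpha * (hi - lo);
                trainLo = Math.max(globalLo, lo - margin);
                trainHi = Math.min(globalHi, hi + margin);
            }
        }

        /** Splits the global 1-D domain [lo, hi] into p sub-domains whose
         *  training intervals overlap according to the ratio alpha. */
        static List<SubDomain> partition(double lo, double hi, int p, double alpha) {
            List<SubDomain> parts = new ArrayList<>();
            double width = (hi - lo) / p;
            for (int i = 0; i < p; i++) {
                parts.add(new SubDomain(lo + i * width, lo + (i + 1) * width,
                                        alpha, lo, hi));
            }
            return parts;
        }

        public static void main(String[] args) {
            // Example: 4 sub-networks with a 10% overlap on each border.
            for (SubDomain d : partition(0.0, 1.0, 4, 0.10)) {
                System.out.printf("train [%.3f, %.3f] -> exploit [%.3f, %.3f]%n",
                                  d.trainLo, d.trainHi, d.exploitLo, d.exploitHi);
                // In the gridified version, one learning job per sub-domain
                // would be submitted to XWCH here, e.g. (hypothetical call):
                // client.submitJob("neurad-learning", d.trainLo, d.trainHi);
            }
        }
    }

Each sub-network is trained on its enlarged interval but exploited only on the
smaller inner one, so the border errors discussed above fall outside the
exploitation domain.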