+\label{sec:neurad_xp}
+
+The aim of this section is to describe and analyze the experimental
+results obtained with the parallel Neurad version described above.
+Our goal was to run this application with real input data on a real
+global computing testbed.
+
+\subsubsection{Experimental conditions}
+\label{sec:neurad_cond}
+
+The size of the input data is about 2.4~GB. To prevent data noise
+from appearing and disturbing the learning process, these data can be
+divided into at most 25 parts. This yields input data parts of about
+15~MB each in compressed format (roughly $2.4\,\mathrm{GB}/25 \approx
+96$~MB of raw data per part). The output data retrieved after the
+process amount to about 30~KB per part. Unfortunately, this
+decomposition limit does not allow us to use more than 25 computers
+(XWCH workers). Nevertheless, we used two distinct deployments of
+XWCH:
+\begin{enumerate}
+
+\item In the first one, called ``distributed XWCH'' in the following,
+  the XWCH coordinator and the warehouses were located in Geneva,
+  Switzerland, while the workers were running in the same local
+  cluster in Belfort, France.
+
+\item The second deployment, called ``local XWCH'', is a local
+  deployment where the coordinator, the warehouses, and the workers
+  were all in the same local cluster.
+
+\end{enumerate}
+For both deployments, the machines were used during the day by
+students of the Computer Science Department of the IUT of Belfort.
+
+In order to evaluate the overhead induced by the use of the platform,
+we furthermore compared the execution of the Neurad application with
+and without the XWCH platform. In the latter case, the testbed
+consists only of workers deployed, together with their respective
+data, by means of shell scripts; no specific middleware was used, and
+the workers were in the same local cluster.
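+
+As an illustration only, the following Python sketch shows what such a
+middleware-free, script-based deployment could look like. The host
+names, file names, and the \texttt{neurad} executable name are
+hypothetical assumptions, not the actual scripts used in the
+experiments.
+
+\begin{verbatim}
+#!/usr/bin/env python3
+"""Hypothetical sketch of the script-based baseline: send each of the
+25 data parts to a worker, run the learning step, fetch the result.
+Host names, paths and the executable name are assumptions."""
+import subprocess
+
+# Hypothetical worker nodes of the local cluster, one per data part.
+WORKERS = [f"node{i:02d}" for i in range(1, 26)]
+
+def deploy_and_run(worker: str, part: int) -> None:
+    archive = f"part{part:02d}.tar.gz"  # ~15 MB compressed input part
+    # 1. Send the compressed data part and the executable.
+    subprocess.run(["scp", archive, "neurad", f"{worker}:/tmp/"],
+                   check=True)
+    # 2. Unpack and run the learning process on the worker.
+    subprocess.run(["ssh", worker,
+                    f"cd /tmp && tar xzf {archive} && ./neurad {part}"],
+                   check=True)
+    # 3. Retrieve the ~30 KB output produced for this part.
+    subprocess.run(["scp", f"{worker}:/tmp/result{part:02d}.dat", "."],
+                   check=True)
+
+if __name__ == "__main__":
+    # One data part per worker, 25 in total.
+    for i, worker in enumerate(WORKERS, start=1):
+        deploy_and_run(worker, i)
+\end{verbatim}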
+
+Finally, five computation precisions were used: $1\times10^{-1}$,
+$0.75\times10^{-1}$, $0.50\times10^{-1}$, $0.25\times10^{-1}$, and
+$1\times10^{-2}$.
+
+
+\subsubsection{Results}
+\label{sec:neurad_result}
+
+Table~\ref{tab:neurad_res} presents the execution times of the Neurad
+application on 25 machines with XWCH (local and distributed
+deployments) and without XWCH. These times cover the same steps in
+all cases, i.e. sending the local data and the executable, running
+the learning process, and retrieving the results. Each result is the
+average over $?? \times ??$ executions.
+
+
+\begin{table}[h!]
+ \renewcommand{\arraystretch}{1.7}
+ \centering
+  \begin{tabular}[h!]{|c|c|c|c|c|}
+    \hline
+    ~Precision~ & ~1 machine~ & ~Without XWCH~ & ~With distributed
+    XWCH~ & ~With local XWCH~ \\
+    \hline
+    $1\times10^{-1}$    & 5190  & 558  & 759  & 629  \\
+    $0.75\times10^{-1}$ & 6307  & 792  & 1298 & 801  \\
+    $0.50\times10^{-1}$ & 7487  & 792  & 1010 & 844  \\
+    $0.25\times10^{-1}$ & 7787  & 791  & 1000 & 852  \\
+    $1\times10^{-2}$    & 11030 & 1035 & 1447 & 1108 \\
+    \hline
+  \end{tabular}
+ \vspace{0.3cm}
+\caption{Execution time in seconds of the Neurad application, with and without using the XWCH platform}
+ \label{tab:neurad_res}
+\end{table}
+
+
+
+As can be seen, in the case of a local deployment the overhead
+induced by the use of the XWCH platform is about $7\%$, which is
+clearly low. For the distributed deployment, the overhead is about
+$34\%$. Considering the benefits the platform brings, this is a very
+acceptable overhead, which can be explained by the following points.
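+
+As an illustration of how such figures are obtained, taking the
+$1\times10^{-2}$ row of Table~\ref{tab:neurad_res}, the relative
+overhead of each deployment with respect to the execution without
+XWCH is
+\[
+\frac{1108 - 1035}{1035} \approx 7.1\%
+\qquad\mbox{and}\qquad
+\frac{1447 - 1035}{1035} \approx 39.8\%,
+\]
+with the per-row values varying from one precision to another.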
+
+First, we point out that the execution conditions are not truly
+identical in the contexts with and without XWCH. In the latter case,
+although the same steps were performed, all transfers take place
+inside a local cluster with high bandwidth and low latency, whereas
+with XWCH all transfers (between the warehouses, the workers, and the
+coordinator) cross a wide area network with lower bandwidth. In
+addition, in the executions without XWCH all the machines start
+computing immediately, whereas with the XWCH platform a delay is
+introduced by the fact that a computation starts on a machine only
+when that machine requests a task.
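+
+A coarse way to read these measurements, given here as an informal
+decomposition rather than a fitted model, is
+\[
+T_{\mathrm{XWCH}} \approx T_{\mathrm{compute}} + T_{\mathrm{transfer}}
++ T_{\mathrm{wait}},
+\]
+where $T_{\mathrm{transfer}}$ covers the exchanges between the
+warehouses, the workers, and the coordinator, and $T_{\mathrm{wait}}$
+is the delay before a machine requests its task; only the last two
+terms grow in the distributed deployment.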
+
+This underlines that, unsurprisingly, deploying a local coordinator
+and one or more warehouses close to a cluster of workers can enhance
+both the computations and the performance of the platform.