+As detailed in the reference article (available at
+http://hal.inria.fr/hal-01415484), you may want to adapt your code
+to improve the simulation performance. But these tricks may seriously
+hinder the result quality (or even prevent the application from
+running) if used wrongly. We assume that if you want to simulate an
+HPC application, you know what you are doing. Don't prove us wrong!
+
+@subsection SMPI_adapting_size Reducing your memory footprint
+
+If you get short on memory (the whole application is executed on a
+single node when simulated), you should have a look at the
+SMPI_SHARED_MALLOC and SMPI_SHARED_FREE macros. They allow you to
+share memory areas between processes: a call to SMPI_SHARED_MALLOC on
+a given source line returns the exact same memory area to every
+process. So if you allocate a 2MB block and have 16 processes, these
+macros reduce your memory consumption from 2MB*16 to 2MB: a single
+block is shared by all processes.
+
+If your program can cope with a block containing garbage values
+because all processes write and read to the same place without any
+kind of coordination, then these macros can dramatically shrink your
+memory consumption. For example, this is very beneficial to a matrix
+multiplication code, as all blocks are stored in the same area. Of
+course, the resulting computations are useless, but you can still
+study the application behavior this way.
+
+Naturally, this won't work if your code is data-dependent. For
+example, a Jacobi iterative computation depends on the values computed
+in previous iterations to detect convergence, so turning these values
+into garbage by sharing the same memory area between processes does
+not seem very wise. You cannot use the SMPI_SHARED_MALLOC macro in
+this case, sorry.
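+
+As an illustrative, untested sketch, a plain malloc/free pair could be
+replaced as follows (the include path and exact macro signatures should
+be double-checked against your SimGrid version):
+
+@verbatim
+#include <smpi/smpi.h> /* hypothetical include path, check your install */
+#include <mpi.h>
+
+int main(int argc, char *argv[])
+{
+  MPI_Init(&argc, &argv);
+
+  /* Every process gets a pointer to the SAME block, so the simulator
+   * allocates it once instead of once per process. The numerical
+   * content is shared garbage; only the timing behavior is studied. */
+  double *block = SMPI_SHARED_MALLOC(2 * 1024 * 1024);
+
+  /* ... computations using block ... */
+
+  SMPI_SHARED_FREE(block);
+  MPI_Finalize();
+  return 0;
+}
+@endverbatim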
+
+This feature is demoed by the example file
+<tt>examples/smpi/NAS/dt.c</tt>
+
+@subsection SMPI_adapting_speed Toward faster simulations
+
+If your application is too slow, try using SMPI_SAMPLE_LOCAL,
+SMPI_SAMPLE_GLOBAL and friends to indicate which computation loops can
+be sampled. Some of the loop iterations will be executed to measure
+their duration, and this duration will be used for the subsequent
+iterations. These samples are taken per process with
+SMPI_SAMPLE_LOCAL, and shared between all processes with
+SMPI_SAMPLE_GLOBAL. Of course, none of this will work if the
+execution times of your loop iterations are not stable.
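+
+As a hypothetical sketch, a compute loop could be annotated like this
+(the macro arguments and the kernel name are illustrative; check the
+exact syntax against your SimGrid version):
+
+@verbatim
+for (int iter = 0; iter < max_iters; iter++) {
+  /* Measure a few iterations, then reuse the measured duration for
+   * the remaining ones instead of executing them. */
+  SMPI_SAMPLE_LOCAL(10, 0.01) {
+    expensive_kernel(data);
+  }
+}
+@endverbatim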
+
+This feature is demoed by the example file
+<tt>examples/smpi/NAS/ep.c</tt>
+
+@section SMPI_accuracy Ensuring accurate simulations
+
+Out of the box, SimGrid may give you fairly accurate results, but
+there are plenty of factors that could go wrong and make your results
+inaccurate or even plainly wrong. Actually, you can only get accurate
+results out of a carefully built model, including both the system
+hardware and your application. Such models are hard to port and reuse
+in other settings, because elements that are not relevant to one
+application (say, the latency of point-to-point communications,
+collective operation implementation details or CPU-network
+interaction) may be critical to another application. The dream of the
+perfect model, encompassing every aspect, is only a chimera, as the
+only perfect model of the reality is the reality itself. If you go
+for simulation, then you have to ignore some irrelevant aspects of
+the reality, but which aspects are irrelevant is actually
+application-dependent...
+
+The only way to assess whether your settings provide accurate results
+is to double-check these results. If possible, you should first run
+the same experiment in simulation and in real life, gathering as much
+information as you can. Try to understand the discrepancies in the
+results that you observe between both settings (visualization can be
+precious for that). Then, try to modify your model (of the platform,
+of the collective operations) to reduce the most prominent differences.
+
+If the discrepancies come from the computing time, try adapting the \c
+smpi/host-speed parameter: reduce it if your simulation runs faster
+than the real execution. If the error comes from the communication,
+then you need to fiddle with your platform file.
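+
+For instance, the parameter can be overridden on the command line (the
+value is illustrative and must be calibrated against a real run; check
+the exact option syntax for your SimGrid version):
+
+@verbatim
+# Override the per-host computation speed used for benchmarked code:
+smpirun --cfg=smpi/host-speed:100000000 -np 16 \
+        -platform platform.xml -hostfile hosts.txt ./my_app
+@endverbatim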
+
+Be inventive in your modeling. Don't be afraid if the names given by
+SimGrid do not match the real names: we got very good results by
+modeling multicore/GPU machines with a set of separate hosts
+interconnected with very fast networks (but don't trust your model
+just because it has the right names in the right places either).
+
+Finally, you may want to check [this
+article](https://hal.inria.fr/hal-00907887) on the classical pitfalls
+in modeling distributed systems.
+
+@section SMPI_troubleshooting Troubleshooting with SMPI
+
+@subsection SMPI_trouble_configure_refuses_smpicc ./configure refuses to use smpicc
+
+If your <tt>./configure</tt> reports that the compiler is not
+functional or that you are cross-compiling, try to define the
+<tt>SMPI_PRETEND_CC</tt> environment variable before running the
+configuration.
+
+@verbatim
+SMPI_PRETEND_CC=1 ./configure # here come the configure parameters
+make
+@endverbatim
+
+Indeed, the programs compiled with <tt>smpicc</tt> cannot be executed
+without <tt>smpirun</tt> (they are shared libraries, and they do weird
+things on startup), while configure wants to test them directly.
+With <tt>SMPI_PRETEND_CC</tt> smpicc does not compile as shared,
+and the SMPI initialization stops and returns 0 before doing anything
+that would fail without <tt>smpirun</tt>.
+
+\warning
+
+ Make sure that SMPI_PRETEND_CC is only set when calling ./configure,
+ not during the actual execution, or any program compiled with smpicc
+ will stop before starting.
+
+@subsection SMPI_trouble_configure_dont_find_smpicc ./configure does not pick smpicc as a compiler
+
+In addition to the previous answers, some projects also need to be
+explicitly told which compiler to use, as follows:
+
+@verbatim
+SMPI_PRETEND_CC=1 ./configure CC=smpicc # here come the other configure parameters
+make
+@endverbatim
+
+Maybe your configure is using another variable, such as <tt>cc</tt> or
+similar. Just check the logs.
+
+@subsection SMPI_trouble_useconds_t error: unknown type name 'useconds_t'
+
+Try to add <tt>-D_GNU_SOURCE</tt> to your compilation line to get rid
+of that error.
+
+The reason is that SMPI provides its own version of <tt>usleep(3)</tt>
+to override it and to block in the simulation world, not in the real
+one. It needs the <tt>useconds_t</tt> type for that, which is declared
+only if you declare <tt>_GNU_SOURCE</tt> before including
+<tt>unistd.h</tt>. If your project includes that header file before
+SMPI, then you need to ensure that you pass the right configuration
+defines as advised above.
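+
+For instance (illustrative invocations; adapt them to your own build
+system):
+
+@verbatim
+smpicc -D_GNU_SOURCE -c my_code.c -o my_code.o
+# or, for configure-based projects:
+SMPI_PRETEND_CC=1 ./configure CC=smpicc CFLAGS="-D_GNU_SOURCE"
+@endverbatim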