SMPI: redesign the end of actors/ranks' lifetime
author Martin Quinson <martin.quinson@ens-rennes.fr>
Thu, 1 Aug 2019 07:34:53 +0000 (09:34 +0200)
committer Martin Quinson <martin.quinson@ens-rennes.fr>
Thu, 1 Aug 2019 07:54:44 +0000 (09:54 +0200)
commit 8fe7143ac15490fc64aaf5f88c08bcf489a1e9f1
tree a4e8b202863d27afd8fe738b5690f1c079dc8ad2
parent 13c82410baba7be2780087e4a9d96b008393d4d0

The problem is that we don't use enough refcounting in SMPI, so we
should not let any rank finish before the others, because it may
still be involved in a communication or something.

Previously, there was a barrier at the end of the user code, so that
every rank finished at exactly the same time.

Now, the MPI instance keeps a reference to every actor it contains,
and each actor terminates without delay once its code returns.
Terminating actors unregister from their MPI instance, but they remain
referenced until the last actor has unregistered. Once the MPI
instance is empty, it drops its references to all the actors,
allowing the refcounting to collect them.
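
To illustrate the scheme described above, here is a minimal sketch. It
is not the code of this commit: Actor, Instance, register_actor(),
unregister_actor() and held_references() are hypothetical names, and
std::shared_ptr is only a stand-in for whatever refcounting the real
actors use.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an SMPI rank; not the real SimGrid actor class.
struct Actor {
  explicit Actor(int rank) : rank(rank) {}
  int rank;
};

// Hypothetical stand-in for an MPI instance holding a reference to every rank.
class Instance {
  std::vector<std::shared_ptr<Actor>> actors_; // one strong reference per rank
  std::size_t registered_ = 0;                 // ranks that have not finished yet

public:
  void register_actor(std::shared_ptr<Actor> actor)
  {
    actors_.push_back(std::move(actor));
    ++registered_;
  }

  // Called by a rank when its user code returns. The rank no longer counts
  // as registered, but the instance keeps its reference until all are done.
  void unregister_actor()
  {
    assert(registered_ > 0);
    if (--registered_ == 0)
      actors_.clear(); // last rank out: drop every reference at once
  }

  std::size_t held_references() const { return actors_.size(); }
};

// Returns true if rank 0 stays alive after it terminates and is only
// collected once the last rank has unregistered.
bool rank_survives_until_last_unregister()
{
  Instance instance;
  std::weak_ptr<Actor> rank0; // observes rank 0 without keeping it alive
  {
    auto a0 = std::make_shared<Actor>(0);
    auto a1 = std::make_shared<Actor>(1);
    rank0   = a0;
    instance.register_actor(a0);
    instance.register_actor(a1);
  } // local references are gone; only the instance holds the actors now

  instance.unregister_actor(); // rank 0 finishes first
  bool alive_after_first = not rank0.expired();
  instance.unregister_actor(); // rank 1 finishes last
  bool collected_after_last = rank0.expired();
  return alive_after_first && collected_after_last;
}
```

In this sketch, a rank that finishes early can still be reached by the
others, since the instance only lets go of the actors when the last
one has unregistered.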

This commit changes the ending time of many ranks in many examples, as
expected. The ranks now terminate as soon as they are done; they no
longer wait for the others.

It introduces a segfault in ampi that I do not understand yet. It
seems that a container is used after being collected in this example,
but I have not found the cause so far.
12 files changed:
examples/smpi/ampi_test/ampi_test.tesh
examples/smpi/replay/replay-override-replayer.tesh
examples/smpi/replay/replay.tesh
examples/smpi/replay_multiple_manual_deploy/replay_multiple_manual_coll1.tesh
examples/smpi/replay_multiple_manual_deploy/replay_multiple_manual_coll2_st_sr_noise.tesh
examples/smpi/smpi_msg_masterslave/msg_smpi.tesh
examples/smpi/trace/trace.tesh
src/smpi/include/private.hpp
src/smpi/include/smpi_actor.hpp
src/smpi/internals/smpi_actor.cpp
src/smpi/internals/smpi_deployment.cpp
teshsuite/smpi/fort_args/fort_args.f90