Public GIT Repository
simgrid.git
Martin Quinson [Tue, 18 Sep 2018 10:58:27 +0000 (12:58 +0200)]
tuto smpi: finish (for now) the platform section; draft the install section

Martin Quinson [Tue, 18 Sep 2018 07:50:32 +0000 (09:50 +0200)]
cosmetics

Martin Quinson [Tue, 18 Sep 2018 07:41:57 +0000 (09:41 +0200)]
cosmetics in graphical representations of cluster descriptions

Martin Quinson [Tue, 18 Sep 2018 06:39:59 +0000 (08:39 +0200)]
hopefully fix the NS3 test

Martin Quinson [Tue, 18 Sep 2018 05:39:18 +0000 (07:39 +0200)]
fix make distcheck, as usual :(

Martin Quinson [Tue, 18 Sep 2018 00:03:38 +0000 (02:03 +0200)]
ignore a directory generated by sphinx

Martin Quinson [Tue, 18 Sep 2018 00:02:56 +0000 (02:02 +0200)]
docs: sphinx 1.8.0 was released, so use it

Martin Quinson [Mon, 17 Sep 2018 23:58:32 +0000 (01:58 +0200)]
Merge branch 'master' of github.com:simgrid/simgrid

Martin Quinson [Mon, 17 Sep 2018 22:47:16 +0000 (00:47 +0200)]
Rename cluster.xml to cluster_backbone.xml

also, fix the make dist and some cosmetics.

Martin Quinson [Mon, 17 Sep 2018 22:30:18 +0000 (00:30 +0200)]
cleanups in the cluster platform files

Martin Quinson [Mon, 17 Sep 2018 22:16:46 +0000 (00:16 +0200)]
docs: prefer svg to png, and inclusion to copy/paste

Martin Quinson [Mon, 17 Sep 2018 21:46:10 +0000 (23:46 +0200)]
cosmetics on the graphical TOC

Martin Quinson [Mon, 17 Sep 2018 07:54:46 +0000 (09:54 +0200)]
Merge branch 'master' of scm.gforge.inria.fr:/gitroot/simgrid/simgrid

Martin Quinson [Fri, 14 Sep 2018 20:54:04 +0000 (22:54 +0200)]
fix sectioning and one typo

Arnaud Legrand [Fri, 14 Sep 2018 09:17:09 +0000 (11:17 +0200)]
Graphical representation of example platforms

Martin Quinson [Thu, 13 Sep 2018 22:31:12 +0000 (00:31 +0200)]
Merge pull request #292 from kovin/master

Cover with a test Mailbox::ready() method introduced in commit 1ed0e64dc40

Martin Quinson [Thu, 13 Sep 2018 20:00:17 +0000 (22:00 +0200)]
Merge branch 'master' into master

Martin Quinson [Tue, 11 Sep 2018 23:53:17 +0000 (01:53 +0200)]
SMPI tuto: Start stealing content from SMPI courseware

Martin Quinson [Tue, 11 Sep 2018 23:17:36 +0000 (01:17 +0200)]
tuto smpi: add a picture explaining how it works

Martin Quinson [Tue, 11 Sep 2018 23:16:33 +0000 (01:16 +0200)]
allow to have hidden/shown code blocks in the doc

Arnaud Giersch [Tue, 11 Sep 2018 20:35:20 +0000 (22:35 +0200)]
Add an assert/fixme around Actor::set_auto_restart.

Arnaud Giersch [Tue, 11 Sep 2018 20:27:20 +0000 (22:27 +0200)]
Use a std::vector for actors_at_boot_.

Several actors may use the same name (e.g. app-masterworker-multicore).
Also fixes a memory leak.
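
Why a sequence suits this better than a name-keyed container can be shown with a small sketch. It assumes, purely for illustration, that the earlier container was keyed by actor name (this log entry does not say what it was). `BootActor` and the helper functions are hypothetical names, not SimGrid code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an actor registered to be started at boot.
struct BootActor {
  std::string name;
  std::string host;
};

// Name-keyed storage (assumed for illustration): a second actor with the
// same name overwrites the first one, so only one of them survives.
std::size_t stored_when_keyed_by_name(const std::vector<BootActor>& actors) {
  std::map<std::string, BootActor> by_name;
  for (const BootActor& a : actors)
    by_name[a.name] = a;
  return by_name.size();
}

// Sequence storage: every registration is kept, duplicates included.
std::size_t stored_in_vector(const std::vector<BootActor>& actors) {
  std::vector<BootActor> at_boot;
  for (const BootActor& a : actors)
    at_boot.push_back(a);
  return at_boot.size();
}

// Self-check: two of the three actors share the name "worker".
void check_duplicates_survive() {
  const std::vector<BootActor> actors = {
      {"worker", "host1"}, {"worker", "host2"}, {"master", "host1"}};
  assert(stored_when_keyed_by_name(actors) == 2);
  assert(stored_in_vector(actors) == 3);
}
```

If the overwritten entries were owned through raw pointers, silently replacing them would also leak them, which may be the leak mentioned above; that link is a guess.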

Martin Quinson [Tue, 11 Sep 2018 16:37:58 +0000 (18:37 +0200)]
start the SMPI tuto

Arnaud Giersch [Fri, 31 Aug 2018 11:19:34 +0000 (13:19 +0200)]
Typo.

Martin Quinson [Mon, 10 Sep 2018 21:30:38 +0000 (23:30 +0200)]
tuto: don't speak of s4u processes (but actors)

Martin Quinson [Mon, 10 Sep 2018 21:17:30 +0000 (23:17 +0200)]
docs: simplify and document that file

Martin Quinson [Mon, 10 Sep 2018 21:01:01 +0000 (23:01 +0200)]
killing trailing whitespaces on png files is not clever

Martin Quinson [Mon, 10 Sep 2018 20:33:39 +0000 (22:33 +0200)]
DTD: remove the last occurrence of <gpu>

Martin Quinson [Mon, 10 Sep 2018 20:30:49 +0000 (22:30 +0200)]
tesh: informative message for another error condition

Martin Quinson [Mon, 10 Sep 2018 19:58:04 +0000 (21:58 +0200)]
Fix the DTD to not allow to mix internal node content with leaf content in a given zone

Fix https://github.com/simgrid/simgrid/issues/296

Martin Quinson [Mon, 10 Sep 2018 14:19:17 +0000 (16:19 +0200)]
fix the SMPI tests that mandate smpi/wtime == 0

Martin Quinson [Mon, 10 Sep 2018 13:03:49 +0000 (15:03 +0200)]
align doc and code on a more sensible value

Martin Quinson [Mon, 10 Sep 2018 12:42:52 +0000 (14:42 +0200)]
Merge branch 'master' of framagit.org:simgrid/simgrid

Martin Quinson [Mon, 10 Sep 2018 12:39:55 +0000 (14:39 +0200)]
Merge branch 'master' of scm.gforge.inria.fr:/gitroot/simgrid/simgrid

Martin Quinson [Mon, 10 Sep 2018 12:35:57 +0000 (14:35 +0200)]
Improve option smpi/wtime

- Set default value to 1ms instead of 0. This default setting may
  lead to slower simulations, but it works in more situations.
- Also apply this delay in gettimeofday() and clock_gettime()
- Improve the documentation.

Augustin Degomme [Mon, 10 Sep 2018 11:39:29 +0000 (13:39 +0200)]
Allow insertion of time inside gettimeofday and clock_gettime
Done with --cfg=smpi/wtime, which was previously only for MPI_Wtime.
This should avoid some infinite loops. Keep 0 as default for now.
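
The kind of infinite loop this guards against can be sketched as follows. The sketch assumes that simulated time does not advance on its own while the application only polls the clock; `SimClock` and `polls_until` are hypothetical illustrations, not SMPI code, and `wtime_delay` stands in for the value given to smpi/wtime.

```cpp
#include <cassert>

// Hypothetical simulated clock: time only moves when the clock is queried,
// by the configured per-query delay (standing in for smpi/wtime).
struct SimClock {
  double now = 0.0;    // current simulated time, in seconds
  double wtime_delay;  // seconds injected by each time query
  explicit SimClock(double delay) : wtime_delay(delay) {}
  double query() {
    now += wtime_delay;
    return now;
  }
};

// Busy-wait until the deadline is reached, as an application polling
// gettimeofday() might do. Gives up after max_polls queries.
// Returns the number of polls used, or -1 if the deadline was never reached.
long polls_until(SimClock& clock, double deadline, long max_polls) {
  assert(max_polls > 0);
  for (long i = 1; i <= max_polls; ++i)
    if (clock.query() >= deadline)
      return i;
  return -1;
}
```

With a delay of 0 the loop never sees time move and would spin forever without the `max_polls` guard; with a delay of 1ms, waiting for one simulated second takes about a thousand polls.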

Martin Quinson [Mon, 10 Sep 2018 11:02:22 +0000 (13:02 +0200)]
move smpi_mpi_wtime near to the other time-related functions

Martin Quinson [Thu, 6 Sep 2018 19:39:26 +0000 (21:39 +0200)]
don't use send/receive on mailboxes, but put/get

FREDERIC SUTER [Wed, 5 Sep 2018 10:56:09 +0000 (12:56 +0200)]
Update app_s4u.rst

FREDERIC SUTER [Wed, 5 Sep 2018 10:17:03 +0000 (12:17 +0200)]
Update application.rst

Martin Quinson [Mon, 3 Sep 2018 19:41:38 +0000 (21:41 +0200)]
try to fix windows builds

ContextJava uses ContextThread as a superclass now, but they are not
in the same lib, so ContextThread must be exported as public.

FREDERIC SUTER [Mon, 3 Sep 2018 12:37:59 +0000 (14:37 +0200)]
Update intro_yours.rst

Augustin Degomme [Wed, 29 Aug 2018 12:31:17 +0000 (14:31 +0200)]
Multiply memset size by size of element in umpire.

FREDERIC SUTER [Mon, 3 Sep 2018 12:00:55 +0000 (14:00 +0200)]
Update intro_install.rst

FREDERIC SUTER [Mon, 3 Sep 2018 11:17:08 +0000 (13:17 +0200)]
Update intro_concepts.rst

Martin Quinson [Mon, 3 Sep 2018 07:34:38 +0000 (09:34 +0200)]
fix make distcheck

Martin Quinson [Mon, 3 Sep 2018 07:20:56 +0000 (09:20 +0200)]
Somehow fix the killing of actors in Java

Things are somehow fixed, as all tests seem to pass, but the situation
is still very messy after this commit. Contents:

- Reimplement ContextJava as subclass of ContextThread to reduce duplication.
- Don't send the StopRequest exception on host failure if we are in
  Java because *some* of the actors don't catch it well, resulting in
  simulation failure.
- Forcefully kill the process ("exit(0)" in C) after MSG_run() because
  dead actors are sometimes not completely killed, preventing the
  simulation from ending.

See the comment in ActorImpl for a better understanding of this mess
and how to fix it in the future.

Martin Quinson [Sun, 2 Sep 2018 19:35:09 +0000 (21:35 +0200)]
cosmetics while debugging backtraces

Martin Quinson [Sun, 2 Sep 2018 00:17:06 +0000 (02:17 +0200)]
java: obey our coding standard

Martin Quinson [Sun, 2 Sep 2018 00:09:27 +0000 (02:09 +0200)]
don't catch an exception that is never thrown

xbt_os_thread_create() asserts that it succeeds, it does not throw
anything. So put the documentation in the doc instead of displaying it
when that non-existent exception is received.

Martin Quinson [Sun, 2 Sep 2018 00:02:21 +0000 (02:02 +0200)]
java: cosmetics

Martin Quinson [Sat, 1 Sep 2018 23:11:54 +0000 (01:11 +0200)]
that was converted to sphinx

Martin Quinson [Sat, 1 Sep 2018 20:56:32 +0000 (22:56 +0200)]
Remove the deprecated 'state' attribute from the doc

This fixes https://github.com/simgrid/simgrid/issues/295

Martin Quinson [Sat, 1 Sep 2018 20:53:51 +0000 (22:53 +0200)]
docs: write the overall section of 'Applications'

Martin Quinson [Fri, 31 Aug 2018 15:58:58 +0000 (17:58 +0200)]
sphinx: one warning less

Martin Quinson [Thu, 30 Aug 2018 09:37:40 +0000 (11:37 +0200)]
Bummer. Really fix out of tree builds (I hope)

Martin Quinson [Thu, 30 Aug 2018 07:38:36 +0000 (09:38 +0200)]
fix out of tree builds

Martin Quinson [Wed, 29 Aug 2018 21:11:37 +0000 (23:11 +0200)]
fix maestro-set

Martin Quinson [Wed, 29 Aug 2018 20:50:07 +0000 (22:50 +0200)]
disable the platform-failure tests for now, sorry

I fail to debug such complex tests, I need smaller ones such as the
activity-lifecycle that I'm currently growing.

But broken tests in the git prevent everybody from working, including
me. I broke msg-maestro-set-thread at some point and did not even
notice :(

Sorry for breaking the failure platform tests in the first place.

Martin Quinson [Wed, 29 Aug 2018 20:31:09 +0000 (22:31 +0200)]
kill a superseded sub-test, and fix another one

Processes on a failing host are killed right away, so they cannot report
that the host failed as expected.

This whole test should be converted to activity-lifecycle.

Martin Quinson [Wed, 29 Aug 2018 20:13:26 +0000 (22:13 +0200)]
fix make dist

Martin Quinson [Wed, 29 Aug 2018 20:11:31 +0000 (22:11 +0200)]
this test is superseded by activity-lifecycle

Martin Quinson [Wed, 29 Aug 2018 20:04:11 +0000 (22:04 +0200)]
simplify the actor finalization a tiny bit by using a callback

This is part of the removal of all trace-related pimpl all over the
code of MSG (my goal is to kill MSG_process_cleanup_from_SIMIX()
altogether).

Note that I changed from Container::by_name() to
Container::by_name_or_null. It seems that not all actors have a
container by their name, not sure why.

Martin Quinson [Wed, 29 Aug 2018 19:24:26 +0000 (21:24 +0200)]
Convert all xbt_ex(network_error) throwing locations

Martin Quinson [Wed, 29 Aug 2018 19:19:40 +0000 (21:19 +0200)]
typo

Martin Quinson [Wed, 29 Aug 2018 13:19:17 +0000 (15:19 +0200)]
sonar

Martin Quinson [Wed, 29 Aug 2018 13:18:47 +0000 (15:18 +0200)]
woops

Martin Quinson [Wed, 29 Aug 2018 12:17:35 +0000 (14:17 +0200)]
fix 32b builds

Martin Quinson [Wed, 29 Aug 2018 11:14:01 +0000 (13:14 +0200)]
please sonar on rethrow

Martin Quinson [Wed, 29 Aug 2018 09:35:10 +0000 (11:35 +0200)]
Display a msg when contexts are killed by uncaught exceptions

and when I want to really kill an actor (e.g. when its host is turned
off), I launch an uncatchable kernel::Context::StopRequest instead of
a catchable simgrid::HostFailureException (which will be used in case
of remote exec and similar)

Maybe there should be a config flag to decide if we want to kill the
simulation when an actor fails. The current setting forces the user to
add try/catch (simgrid::Exception) around their main functions. That's
not a bad thing either, not sure.
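
The difference between the two exception kinds can be sketched in plain C++. The sketch assumes that "uncatchable" means the stop request is not derived from std::exception, so that ordinary user handlers do not intercept it; this log entry does not confirm that detail. All types and functions below are hypothetical stand-ins, not the SimGrid classes.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Stand-in for a catchable failure such as simgrid::HostFailureException.
struct HostFailure : std::runtime_error {
  explicit HostFailure(const std::string& msg) : std::runtime_error(msg) {}
};

// Stand-in for the kernel stop request. Assumption: it is deliberately
// outside the std::exception hierarchy, so user handlers let it through.
struct StopRequest {};

enum class Outcome { finished, stopped_by_kernel, killed_by_uncaught };

// What a context wrapper might do around an actor's main function.
Outcome run_actor(const std::function<void()>& body) {
  assert(body != nullptr);
  try {
    body();
    return Outcome::finished;
  } catch (const StopRequest&) {
    return Outcome::stopped_by_kernel;   // the kernel really wanted it dead
  } catch (const std::exception&) {
    return Outcome::killed_by_uncaught;  // where a message would be displayed
  }
}

// User code that guards its main function with a try/catch.
void careful_actor(bool kernel_kill) {
  try {
    if (kernel_kill)
      throw StopRequest{};
    throw HostFailure("remote host is off");
  } catch (const std::exception&) {
    // Recovers from HostFailure; never sees StopRequest.
  }
}
```

An actor that wraps its body in a handler survives the catchable failure and finishes normally, while the stop request passes through the same handler and reaches the wrapper.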

Martin Quinson [Wed, 29 Aug 2018 00:10:12 +0000 (02:10 +0200)]
Let's exhaustively test the activity lifecycle

This test is not complete yet. It aims at being as exhaustive and
paranoid as possible, just like cloud-sharing even if I didn't find a
good DSL to specify the tests this time.

Martin Quinson [Wed, 29 Aug 2018 00:08:18 +0000 (02:08 +0200)]
improve debug messages and error reporting

Martin Quinson [Tue, 28 Aug 2018 23:59:17 +0000 (01:59 +0200)]
Properly kill the context on HostFailureException

Before, simix was kinda thinking that the actor was dead, but the
context was still running, leading to a Holy Big Mess!

Augustin Degomme [Tue, 28 Aug 2018 15:46:08 +0000 (17:46 +0200)]
update doc

Augustin Degomme [Tue, 28 Aug 2018 15:39:33 +0000 (17:39 +0200)]
Switch to ompi for umpire tests.
MPICH changes brought SMP-aware algorithms, which MC does not really like.
I guess init_smp is the culprit here, as it uses various collectives badly.

Augustin Degomme [Tue, 28 Aug 2018 15:37:50 +0000 (17:37 +0200)]
update ompi selector as well with "recent" version

Augustin Degomme [Tue, 28 Aug 2018 14:32:41 +0000 (16:32 +0200)]
Requalify automatic tesh, as another algorithm is used in init_smp now.

Augustin Degomme [Tue, 28 Aug 2018 14:29:22 +0000 (16:29 +0200)]
update doc with new algo

Augustin Degomme [Tue, 28 Aug 2018 14:23:24 +0000 (16:23 +0200)]
Upgrade MPICH collective selector to 3.3.
Add SMP variants of some algorithms, and protect against side effects.

Martin Quinson [Sun, 26 Aug 2018 23:49:51 +0000 (01:49 +0200)]
circleci: do not optimise builds, you're supposed to be as fast as hell

Augustin Degomme [Tue, 28 Aug 2018 08:19:07 +0000 (10:19 +0200)]
Fix https://github.com/simgrid/simgrid/issues/294

Martin Quinson [Sun, 26 Aug 2018 23:42:53 +0000 (01:42 +0200)]
Not sure of why it helps now

Martin Quinson [Sun, 26 Aug 2018 23:42:32 +0000 (01:42 +0200)]
fix travis builds

Martin Quinson [Sun, 26 Aug 2018 22:37:52 +0000 (00:37 +0200)]
strengthen this test

Martin Quinson [Sun, 26 Aug 2018 20:50:16 +0000 (22:50 +0200)]
MSG_process_sleep should intercept HostFailureException and report it accordingly

Don't ask me how it could have worked before, but there was a
C++ try/catch in teshsuite/msg/host_on_off_processes. In MSG code!!

Martin Quinson [Sun, 26 Aug 2018 20:14:26 +0000 (22:14 +0200)]
When the host dies, the actor needs an exception even if it's not blocked on an activity

Martin Quinson [Sun, 26 Aug 2018 11:30:15 +0000 (13:30 +0200)]
some compilers cannot see that this value is initialized in all cases

Martin Quinson [Sun, 26 Aug 2018 09:49:40 +0000 (11:49 +0200)]
Do not throw exception in maestro when host->is_off + sleep()

Also, use ActorImpl::throw_exception() instead of messing with its
wannabe private exception_ field.

Martin Quinson [Sun, 26 Aug 2018 00:56:22 +0000 (02:56 +0200)]
convert some xbt_ex(tracing_error) into xbt_assert

Martin Quinson [Sun, 26 Aug 2018 00:09:00 +0000 (02:09 +0200)]
implement THROW_IMPOSSIBLE and THROW_UNIMPLEMENTED with std::runtime_error directly

Martin Quinson [Sat, 25 Aug 2018 23:31:24 +0000 (01:31 +0200)]
convert a xbt_ex(arg_error) into a std::invalid_argument

Martin Quinson [Sat, 25 Aug 2018 23:14:32 +0000 (01:14 +0200)]
deprecate SIMIX_process_throw for ActorImpl::throw_exception

Martin Quinson [Sat, 25 Aug 2018 21:38:15 +0000 (23:38 +0200)]
convert xbt_ex(timeout_error) catching locations to TimeoutError

Martin Quinson [Sat, 25 Aug 2018 22:51:00 +0000 (00:51 +0200)]
Do not swallow exceptions I don't know

Martin Quinson [Sat, 25 Aug 2018 22:36:16 +0000 (00:36 +0200)]
Do not convert TimeoutError to xbt_ex(timeout) in case there was a wait_any

If there is an issue while dealing with a test_any or a wait_any, the
caller must be told which activity failed. I'm not sure of how to
cleanly do so. For now, we use exception.value to store the rank of
that activity in the container.

To modify the exception, C++ leaves us no way but to rethrow it and
recatch it, change its value field, and re-store it in the
issuer->exception.  But then, the exception takes the type it was
caught as. Wicked! Vicious! It means that since we were catching (xbt_ex&
e), we actually converted the simgrid::TimeoutException into a xbt_ex.

And this conversion was done in any case, even if the value was set
only if the simcall was actually a wait_any or test_any...

With this commit, we catch, extend and rethrow any TimeoutException,
and if it's not such an xbt_ex, we do the same for a xbt_ex.

A proper version could involve a WaitAnyException (with failing_rank
and cause fields), or maybe the TimeoutException could contain a
pointer to the timed-out activity so that the caller can find its rank
by itself.
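
The type conversion described above can be reproduced in standard C++. This is a sketch under assumptions: `BaseError` stands in for xbt_ex and `TimeoutError` for the timeout exception, and it supposes the exception was re-stored with std::make_exception_ptr on the caught reference, which copies the object using the static type of that reference. None of these names are the actual SimGrid code.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins: BaseError for xbt_ex, TimeoutError for the
// timeout exception that derives from it.
struct BaseError : std::runtime_error {
  explicit BaseError(const std::string& msg) : std::runtime_error(msg) {}
  int value = -1;  // rank of the failing activity in the container
};
struct TimeoutError : BaseError {
  explicit TimeoutError(const std::string& msg) : BaseError(msg) {}
};

// Re-store through the caught reference: make_exception_ptr(e) copies a
// BaseError, because that is the static type of e. The TimeoutError is lost.
std::exception_ptr annotate_sliced(std::exception_ptr stored, int rank) {
  assert(stored != nullptr);
  try {
    std::rethrow_exception(stored);
  } catch (BaseError& e) {
    e.value = rank;
    return std::make_exception_ptr(e);
  }
}

// Re-store the in-flight exception itself: the dynamic type is preserved.
std::exception_ptr annotate_preserving(std::exception_ptr stored, int rank) {
  assert(stored != nullptr);
  try {
    std::rethrow_exception(stored);
  } catch (BaseError& e) {
    e.value = rank;
    return std::current_exception();
  }
}

// Helpers to inspect what a stored exception really is.
bool is_timeout(std::exception_ptr p) {
  try {
    std::rethrow_exception(p);
  } catch (const TimeoutError&) {
    return true;
  } catch (...) {
    return false;
  }
}
int rank_of(std::exception_ptr p) {
  try {
    std::rethrow_exception(p);
  } catch (const BaseError& e) {
    return e.value;
  } catch (...) {
    return -1;
  }
}
```

Catching the most derived type first, as this commit does, is another way to keep the type; capturing the in-flight exception with std::current_exception() avoids the copy entirely.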

Martin Quinson [Sat, 25 Aug 2018 20:02:07 +0000 (22:02 +0200)]
convert all xbt_ex(timeout_error) throwing locations to simgrid::TimeoutError

Martin Quinson [Sat, 25 Aug 2018 19:51:19 +0000 (21:51 +0200)]
convert all xbt_ex(host_error) catching locations to simgrid::HostFailureException

Martin Quinson [Sat, 25 Aug 2018 19:35:15 +0000 (21:35 +0200)]
Replace all xbt_ex(host_error) throwing locations with HostFailureException

Martin Quinson [Sat, 25 Aug 2018 13:57:02 +0000 (15:57 +0200)]
convert some catch locations to simgrid::HostFailureException

Martin Quinson [Sat, 25 Aug 2018 13:32:58 +0000 (15:32 +0200)]
allow to pass a std::string as message to Exceptions