Martin Quinson [Mon, 22 Dec 2014 22:03:23 +0000 (23:03 +0100)]
Ensure that MSG_host_self() works from maestro context
Gabriel Corona [Fri, 19 Dec 2014 10:56:52 +0000 (11:56 +0100)]
[mc] Add some FIXMEs for cross-process support
Gabriel Corona [Fri, 19 Dec 2014 08:51:16 +0000 (09:51 +0100)]
[mc] Cross-process support for MC_ignore
Gabriel Corona [Thu, 18 Dec 2014 15:03:03 +0000 (16:03 +0100)]
[mc] Implements privatization support for MC_process_read
This is currently needed for cross-process MC in order to read the
heap state.
Gabriel Corona [Tue, 16 Dec 2014 12:06:37 +0000 (13:06 +0100)]
[mc] Abstract the process and snapshot types with an address_space superclass
Add an `address_space`, superclass of `process` and `snapshot`.
In order to do this, the contracts of MC_process_read and
MC_snapshot_read have been made uniform:
* the order of arguments has been harmonized;
* a new flag MC_ADDRESS_SPACE_READ_FLAGS_LAZY is used to avoid a copy
when the data is already in the current memory;
* MC_NO_PROCESS_INDEX has been renamed into MC_PROCESS_INDEX_MISSING;
* MC_ANY_PROCESS_INDEX has been renamed into MC_PROCESS_INDEX_ANY;
* MC_PROCESS_INDEX_DISABLED is used to access the raw address space
(without privatisation support);
* `const void*` is used instead of `void*` when possible.
Some cleanup is still to be done:
* remove special NULL handling;
* add support for SMPI privatization in the process object.
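The lazy read flag described above can be pictured with the minimal C sketch below. Every `example_*` name is a hypothetical stand-in, not SimGrid's actual declaration, and only an address space backed by the current memory is shown. With the lazy flag the caller gets a pointer to the existing data; without it the data is copied into the caller's buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the flags described in the commit message. */
typedef enum {
  EXAMPLE_READ_FLAGS_NONE = 0,
  EXAMPLE_READ_FLAGS_LAZY = 1
} example_read_flags;

/* "Superclass": each kind of address space supplies its own read function. */
typedef struct example_address_space example_address_space;
struct example_address_space {
  const void* (*read)(const example_address_space* self,
                      example_read_flags flags,
                      void* local, const void* remote, size_t len);
};

/* "Subclass" for an address space that is the current memory. */
const void* example_local_read(const example_address_space* self,
                               example_read_flags flags,
                               void* local, const void* remote, size_t len)
{
  (void) self;
  if (flags & EXAMPLE_READ_FLAGS_LAZY)
    return remote;              /* data is already readable: no copy */
  assert(local != NULL);
  memcpy(local, remote, len);   /* eager read: copy into the buffer */
  return local;
}

const example_address_space example_current_memory = { example_local_read };

/* Generic entry point: dispatches to the concrete implementation and
 * returns the address where the requested bytes can be read. */
const void* example_address_space_read(const example_address_space* as,
                                       example_read_flags flags,
                                       void* local, const void* remote,
                                       size_t len)
{
  return as->read(as, flags, local, remote, len);
}
```

A process or snapshot implementation would plug a different `read` function into the same structure, so callers only ever see the generic entry point.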
Gabriel Corona [Tue, 16 Dec 2014 11:05:14 +0000 (12:05 +0100)]
[mc] More comments for mc_dwarf_execute_expression()
Give some basic explanation about the DWARF operations.
Gabriel Corona [Tue, 16 Dec 2014 10:34:38 +0000 (11:34 +0100)]
[mc] Add more information about mc_dwarf_register_to_libunwind()
Augustin Degomme [Mon, 15 Dec 2014 21:00:27 +0000 (22:00 +0100)]
avoid word being recognized as special by doxygen
Augustin Degomme [Mon, 15 Dec 2014 20:52:40 +0000 (21:52 +0100)]
typos
Augustin Degomme [Mon, 15 Dec 2014 20:48:56 +0000 (21:48 +0100)]
update doc about privatization
This was not updated yet. The option was documented in options.doc, so this may be a duplication of information
Augustin Degomme [Fri, 12 Dec 2014 16:59:19 +0000 (17:59 +0100)]
fix tesh
Augustin Degomme [Fri, 12 Dec 2014 16:57:10 +0000 (17:57 +0100)]
requalify teshes for vm migration
Augustin Degomme [Fri, 12 Dec 2014 16:29:08 +0000 (17:29 +0100)]
update changelog
Augustin Degomme [Fri, 12 Dec 2014 16:27:42 +0000 (17:27 +0100)]
doc update for new appenders
Augustin Degomme [Fri, 12 Dec 2014 15:46:01 +0000 (16:46 +0100)]
Patch by F.Chaix : add two "new" log appender methods : split and roll
split will create new files when a specified size is reached
roll will overwrite the file when this size is reached
example syntax is : --log=root.appender:splitfile:10000:myfilename_%.txt
The % is a wildcard that will be replaced by the number of the file. If no % is present, the number is appended at the end
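The wildcard rule can be sketched in C as follows. `example_split_filename` is a hypothetical helper written for illustration, not the code from the patch, and it makes no assumption about which number the first file gets.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the name of file number `index` from
 * `pattern`. The first '%' is replaced by the number; if there is no
 * '%', the number is appended at the end.
 * Returns 0 on success, -1 if `out` is too small. */
int example_split_filename(char* out, size_t out_size,
                           const char* pattern, int index)
{
  assert(out != NULL && pattern != NULL);
  const char* wildcard = strchr(pattern, '%');
  int written;
  if (wildcard != NULL) {
    int prefix_len = (int) (wildcard - pattern);
    written = snprintf(out, out_size, "%.*s%d%s",
                       prefix_len, pattern, index, wildcard + 1);
  } else {
    written = snprintf(out, out_size, "%s%d", pattern, index);
  }
  return (written >= 0 && (size_t) written < out_size) ? 0 : -1;
}
```

With the pattern from the example syntax, file number 3 would be named `myfilename_3.txt`.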
Gabriel Corona [Thu, 11 Dec 2014 12:14:32 +0000 (13:14 +0100)]
[mc] Support for reading heap state from another process
Martin Quinson [Fri, 12 Dec 2014 12:57:29 +0000 (13:57 +0100)]
more documentation about the simcall mechanism
Gabriel Corona [Thu, 11 Dec 2014 13:51:39 +0000 (14:51 +0100)]
[mc] Fix error handling in MC_process{read,write}
Gabriel Corona [Tue, 9 Dec 2014 14:29:16 +0000 (15:29 +0100)]
[mc] Access memory from another process
The goal is to be able to move MC into a separate process which should
be more robust and easier to develop:
* avoid using two heaps (which is cumbersome);
* avoid weird interactions between the MC and the application;
* use optimisation for the whole MC process;
* avoid the stack-cleaner for the whole MC process.
The functions MC_process_read and MC_process_write are defined to
abstract memory access:
* when the target process is the current process, they call
`memcpy`;
* otherwise they call `read` or `write` on `/proc/$pid/mem` (on newer
kernels, `process_vm_readv` and `process_vm_writev` could be used).
A lot of bits are missing such as:
* access to `std_heap` is currently not process-aware (the current
process is used);
* access to SIMIX layer from MC;
* communication/synchronisation between the processes;
* …
Limitations:
* for the per-page/chunked snapshot, the current implementation uses
an extra copy (and one syscall per page); we can do better than
this.
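The read side of the memory access abstraction described above might look like the following sketch. `example_process_read` is a hypothetical name and this is not the actual MC_process_read. It assumes a Linux `/proc` filesystem and permission to read the target process memory, and it uses `pread` so that no separate seek is needed.

```c
#define _XOPEN_SOURCE 700 /* for pread() under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy `len` bytes found at address `remote` in
 * process `pid` into `local`. Returns 0 on success, -1 on error. */
int example_process_read(pid_t pid, void* local, const void* remote,
                         size_t len)
{
  assert(local != NULL);
  if (pid == getpid()) {
    /* Target is the current process: a plain copy is enough. */
    memcpy(local, remote, len);
    return 0;
  }

  /* Otherwise go through /proc/$pid/mem. */
  char path[64];
  snprintf(path, sizeof path, "/proc/%ld/mem", (long) pid);
  int fd = open(path, O_RDONLY);
  if (fd < 0)
    return -1;

  char* out = local;
  off_t offset = (off_t) (uintptr_t) remote;
  while (len > 0) {
    ssize_t n = pread(fd, out, len, offset);
    if (n < 0 && errno == EINTR)
      continue;                 /* interrupted: retry */
    if (n <= 0) {               /* error or unexpected end of file */
      close(fd);
      return -1;
    }
    out += n;
    offset += n;
    len -= (size_t) n;
  }
  close(fd);
  return 0;
}
```

Opening the file on every call keeps the sketch short; a real implementation would more likely keep the descriptor open for the lifetime of the target process.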
Augustin Degomme [Wed, 10 Dec 2014 13:25:01 +0000 (14:25 +0100)]
(try to) avoid looping forever, temporarily.
This value was 10 before, and was set to 3 recently, but weirdly this caused vm migration simulations to loop forever.
It works with an infinite timeout, but the message is lost with some values (1, 2, 3, 6 and 9 were tested and failed), while other values up to at least 13 succeed (after a few hundred timeouts, so this seems to be working).
So if an MSG expert sees this and can find the problem here ...
The test should still fail with this patch, but only because the teshes were not requalified yet.
Augustin Degomme [Wed, 10 Dec 2014 09:30:42 +0000 (10:30 +0100)]
Fix windows build (this is now used from the java library)
Marion Guthmuller [Wed, 10 Dec 2014 12:39:16 +0000 (13:39 +0100)]
fix dot output with file descriptor checkpoint/restore
Augustin Degomme [Tue, 9 Dec 2014 15:37:00 +0000 (16:37 +0100)]
Put a 300 seconds timeout on each test with ctest
This should really be enough (the default was 1500...)
Augustin Degomme [Tue, 9 Dec 2014 09:46:18 +0000 (10:46 +0100)]
Fix build
Some vm tests are still broken and some are looping, they need to be fixed
Gabriel Corona [Tue, 9 Dec 2014 12:15:38 +0000 (13:15 +0100)]
[mc] Enable the custom mm malloc only in MC
We can do better in the future: we can avoid using the main mm malloc
in many cases even for MC.
Gabriel Corona [Tue, 9 Dec 2014 11:17:18 +0000 (12:17 +0100)]
[mm] Allow to disable the mm based `malloc` at runtime
The goal is to enable the mm based `malloc` only when needed and fall
back to the (more efficient) builtin/next implementation when it is
not needed:
* run introspection-less jobs without it;
* when the MC and the application are in different processes, the
MC will be able to run with the standard `malloc` and the
application will use mm.
As malloc is needed very early in the application initialisation, an
environment variable is used to change the behaviour.
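A rough sketch of such an early, environment-driven switch is given below. The variable name `EXAMPLE_USE_MMALLOC`, the convention that only the value "0" disables mm, and all `example_*` identifiers are assumptions made for illustration; the commit message does not name the real variable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable name: the real one is not given in the log. */
#define EXAMPLE_MM_ENV "EXAMPLE_USE_MMALLOC"

typedef enum { EXAMPLE_ALLOC_MM, EXAMPLE_ALLOC_SYSTEM } example_alloc_mode;

/* Interpret the variable's value. In this sketch mm stays enabled
 * unless the variable is explicitly set to "0". */
example_alloc_mode example_parse_alloc_mode(const char* value)
{
  if (value != NULL && strcmp(value, "0") == 0)
    return EXAMPLE_ALLOC_SYSTEM;
  return EXAMPLE_ALLOC_MM;
}

/* Decide once, on first use. Reading the environment needs no
 * configuration parsing, so the decision is available very early. */
example_alloc_mode example_alloc_mode_get(void)
{
  static int decided = 0;
  static example_alloc_mode mode;
  if (!decided) {
    mode = example_parse_alloc_mode(getenv(EXAMPLE_MM_ENV));
    decided = 1;
  }
  assert(mode == EXAMPLE_ALLOC_MM || mode == EXAMPLE_ALLOC_SYSTEM);
  return mode;
}
```

A `malloc` wrapper would then call `example_alloc_mode_get()` and forward either to the mm implementation or to the next `malloc` in the link order.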
Gabriel Corona [Tue, 9 Dec 2014 10:21:49 +0000 (11:21 +0100)]
[mc] Remove redundant typedefs
Clang complains about typedef redefinitions in pre-C11 mode.
Gabriel Corona [Tue, 9 Dec 2014 08:16:22 +0000 (09:16 +0100)]
[mc] Optimise most of XBT
degomme [Mon, 8 Dec 2014 22:37:07 +0000 (23:37 +0100)]
bashism -- (fix https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=772329 )
Adrien Lebre [Mon, 8 Dec 2014 17:14:47 +0000 (18:14 +0100)]
Merge branch 'master' of git+ssh://scm.gforge.inria.fr//gitroot/simgrid/simgrid
Gabriel Corona [Mon, 8 Dec 2014 13:48:47 +0000 (14:48 +0100)]
[mc] Optimise all the MC compilation units
Gabriel Corona [Fri, 5 Dec 2014 15:05:50 +0000 (16:05 +0100)]
[mc] Multiple .so support for region snapshots
The region snapshotting logic can handle a variable number of .so
files:
* add more information to the snapshot regions:
  * the type (heap, library/executable);
  * the corresponding library/executable;
  * the type of storage (dense/flat, chunked/sparse or privatised)
    and the type-specific variables are defined in an enum
    (variant/tagged enum).
* SMPI privatisation snapshot regions are stored as children of a
  parent snapshot region
Limitations:
* we might want to use a more modular/extensible OO approach for the
snapshot region storage type instead of the variant-based approach;
* SMPI can currently only handle privatisation for the local
variables of the executable, so only this is supported in the MC as
well; otherwise the MC is ready to support the SMPI privatisation
of libraries.
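The variant-based region layout described in this commit can be approximated by the sketch below. The field and type names are hypothetical, and a C union is assumed for the type-specific part. It also shows privatised regions holding child regions.

```c
#include <assert.h>
#include <stddef.h>

/* All names are illustrative, not the ones used in the MC code. */
typedef enum {
  EXAMPLE_REGION_HEAP,
  EXAMPLE_REGION_DATA        /* library or executable */
} example_region_type;

typedef enum {
  EXAMPLE_STORAGE_FLAT,      /* dense copy */
  EXAMPLE_STORAGE_CHUNKED,   /* sparse, per-page */
  EXAMPLE_STORAGE_PRIVATIZED /* one child region per SMPI process */
} example_storage_type;

typedef struct example_region example_region;
struct example_region {
  example_region_type region_type;
  example_storage_type storage_type; /* tag selecting the union member */
  const char* object_name;           /* library/executable, NULL for heap */
  void* start_addr;
  size_t size;
  union {
    struct { void* data; } flat;
    struct { size_t* page_numbers; size_t page_count; } chunked;
    struct { example_region** regions; size_t count; } privatized;
  } storage;
};

/* Count the regions that actually hold data (flat or chunked),
 * descending into the children of privatised regions. */
size_t example_region_leaf_count(const example_region* region)
{
  assert(region != NULL);
  if (region->storage_type != EXAMPLE_STORAGE_PRIVATIZED)
    return 1;
  size_t total = 0;
  for (size_t i = 0; i < region->storage.privatized.count; i++)
    total += example_region_leaf_count(region->storage.privatized.regions[i]);
  return total;
}
```

Every consumer has to switch on `storage_type`, which is the cost of the variant approach that the first limitation alludes to.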
Christian Heinrich [Mon, 8 Dec 2014 12:38:34 +0000 (13:38 +0100)]
Fixed return values for several MPI_WIN functions
(statements had to be wrapped in else-clauses)
Adrien Lebre [Mon, 8 Dec 2014 09:20:23 +0000 (10:20 +0100)]
Merge branch 'master' of git+ssh://scm.gforge.inria.fr//gitroot/simgrid/simgrid
Adrien Lebre [Mon, 8 Dec 2014 09:20:12 +0000 (10:20 +0100)]
Few temporary debug messages - adrien
Augustin Degomme [Sat, 6 Dec 2014 15:05:38 +0000 (16:05 +0100)]
Add a mutex to lock access to the SMPI mailboxes when a message is posted
Should avoid race condition, while keeping isolation from SIMIX
Augustin Degomme [Sat, 6 Dec 2014 02:00:14 +0000 (03:00 +0100)]
Revert "Avoid using simcalls here, as by descheduling the process, we could misplace some messages in mailboxes, and end up deadlocking."
This reverts commit
63ba498484d4fcfc706be1e514d7fedbc6c9f4be.
Gabriel Corona [Fri, 5 Dec 2014 14:50:56 +0000 (15:50 +0100)]
[mc] Multiple .so support in MC_ignore_local_variable()
Gabriel Corona [Thu, 4 Dec 2014 15:03:20 +0000 (16:03 +0100)]
[mc] Basic support for libraries other than libsimgrid.so
Augustin Degomme [Fri, 5 Dec 2014 13:59:20 +0000 (14:59 +0100)]
activate mpich3 tests for post/wait/start/complete RMA calls
Augustin Degomme [Fri, 5 Dec 2014 13:51:37 +0000 (14:51 +0100)]
Add MPI_Win_post, MPI_Win_start, MPI_Win_complete, and MPI_Win_wait support.
This is the second (out of 3) of the classic MPI RMA synchronization methods.
This version is naive and may not be what real MPI libraries do, as the standard lets the implementer choose the behavior of these calls.
Gabriel Corona [Thu, 4 Dec 2014 10:02:16 +0000 (11:02 +0100)]
[mc] Move process info in a new s_mc_process_t structure
This is the beginning of the refactoring in order to support MC-ing a
remote process.
Gabriel Corona [Thu, 4 Dec 2014 10:41:02 +0000 (11:41 +0100)]
[mc] Fix distcheck
Gabriel Corona [Thu, 4 Dec 2014 10:07:40 +0000 (11:07 +0100)]
[mc] Don't include libunwind.h in non MC builds
Gabriel Corona [Thu, 4 Dec 2014 09:45:49 +0000 (10:45 +0100)]
Merge branch 'master'
Gabriel Corona [Thu, 4 Dec 2014 09:43:01 +0000 (10:43 +0100)]
[mc] Don't use unprototyped functions
Gabriel Corona [Tue, 2 Dec 2014 13:02:26 +0000 (14:02 +0100)]
[mc] Remove useless header #includes
Augustin Degomme [Tue, 2 Dec 2014 18:17:49 +0000 (19:17 +0100)]
forgot to add this include
Augustin Degomme [Tue, 2 Dec 2014 17:20:13 +0000 (18:20 +0100)]
let's try to please windows
Augustin Degomme [Tue, 2 Dec 2014 14:50:12 +0000 (15:50 +0100)]
Avoid using simcalls here, as by descheduling the process, we could misplace some messages in mailboxes, and end up deadlocking.
Calling SIMIX functions directly is not ideal, but it may fix a nasty heisenbug
Augustin Degomme [Tue, 2 Dec 2014 14:44:45 +0000 (15:44 +0100)]
typos-=2
Augustin Degomme [Tue, 2 Dec 2014 14:44:22 +0000 (15:44 +0100)]
avoid problem when freeing pointer with lb!=0
Augustin Degomme [Mon, 1 Dec 2014 13:53:19 +0000 (14:53 +0100)]
do the same thing as before with IB model parameters
Gabriel Corona [Tue, 2 Dec 2014 09:41:38 +0000 (10:41 +0100)]
[mc] Modularise header files for MC
This is a preparation step for the upcoming refactoring of the MC
code in order to MC an external process.
Gabriel Corona [Tue, 2 Dec 2014 09:00:28 +0000 (10:00 +0100)]
[mc] Define a type for MC object information flags
Gabriel Corona [Mon, 1 Dec 2014 14:31:29 +0000 (15:31 +0100)]
[mc] Remove MC_ignore_global_variable() calls
- compared_pointer does not exist;
- smpi_current_rank does not exist;
- maestro_stack_start and maestro_stack_end do not need to be ignored.
Gabriel Corona [Mon, 1 Dec 2014 12:47:04 +0000 (13:47 +0100)]
[mc] Enable MC specific behaviour in replay mode
Gabriel Corona [Mon, 1 Dec 2014 13:01:43 +0000 (14:01 +0100)]
Revert "[mc] Enable MC specific behaviour in replay mode"
This reverts commit
33eca433c4f055cdfcc55e46d125f8708e1848c7.
Build is broken.
Gabriel Corona [Mon, 1 Dec 2014 12:47:04 +0000 (13:47 +0100)]
[mc] Enable MC specific behaviour in replay mode
Gabriel Corona [Mon, 1 Dec 2014 12:17:59 +0000 (13:17 +0100)]
[mc] Remove useless condition check
Gabriel Corona [Mon, 1 Dec 2014 11:33:16 +0000 (12:33 +0100)]
[mc] Only enable the umpire test for MC builds
Gabriel Corona [Thu, 27 Nov 2014 10:25:22 +0000 (11:25 +0100)]
Use pthread mutex instead of semaphore in mm
Gabriel Corona [Mon, 1 Dec 2014 11:16:45 +0000 (12:16 +0100)]
Fix dist
Gabriel Corona [Fri, 28 Nov 2014 13:38:31 +0000 (14:38 +0100)]
s/formated/formatted/
Gabriel Corona [Thu, 30 Oct 2014 13:39:17 +0000 (14:39 +0100)]
[mc] Initial support MC record/replay
The idea is to record an execution path in MC mode in order to be able
to replay it outside of the MC (even with a non-MC build). Some very
basic (and unobtrusive) MC code is compiled even when MC is disabled.
Adrien Lebre [Mon, 1 Dec 2014 09:55:12 +0000 (10:55 +0100)]
Merge branch 'master' of git+ssh://scm.gforge.inria.fr//gitroot/simgrid/simgrid
Martin Quinson [Sat, 29 Nov 2014 13:29:31 +0000 (14:29 +0100)]
and now, fix the java teshsuite, re-sorry
I shouldn't try to hack on simgrid at week-ends :-(
Martin Quinson [Sat, 29 Nov 2014 13:03:13 +0000 (14:03 +0100)]
fix the build of java bundles, sorry
Martin Quinson [Sat, 29 Nov 2014 12:31:19 +0000 (13:31 +0100)]
Dont produce that pdf output that we dont use
Martin Quinson [Sat, 29 Nov 2014 10:56:42 +0000 (11:56 +0100)]
reindent and improve displayed message
Martin Quinson [Sat, 29 Nov 2014 10:53:58 +0000 (11:53 +0100)]
put together the java-based tests
Adrien Lebre [Sat, 29 Nov 2014 08:59:40 +0000 (09:59 +0100)]
Merge branch 'master' of git+ssh://scm.gforge.inria.fr//gitroot/simgrid/simgrid
Augustin Degomme [Fri, 28 Nov 2014 17:04:29 +0000 (18:04 +0100)]
move smpi bandwidth and latency factors out of the ifdef HAVE_SMPI. SMPI and IB network models can be used without using SMPI
Adrien Lebre [Fri, 28 Nov 2014 16:29:16 +0000 (17:29 +0100)]
merge msg_vm.c - adrien (please note that there is one line (destruction of the tx_process) where I don't remember whether it is useful or not... this can lead to a crash unfortunately, I will fix it later)
Augustin Degomme [Fri, 28 Nov 2014 09:29:01 +0000 (10:29 +0100)]
Remove warnings in vm
Augustin Degomme [Fri, 28 Nov 2014 09:20:33 +0000 (10:20 +0100)]
Fix dist
Augustin Degomme [Tue, 25 Nov 2014 12:16:19 +0000 (13:16 +0100)]
remove potential bug / clang warning
size_t being unsigned, the comparison < 0 was never true
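A small C illustration of why such a check can never trigger: `size_t` is an unsigned type, so a negative value stored in it wraps around to a large positive one. The function names are hypothetical and unrelated to the patched code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Compile-time evidence that size_t is unsigned: -1 converts to a
 * positive value. */
static_assert((size_t) -1 > 0, "size_t is unsigned");

/* Storing a negative code in a size_t loses the sign, so a later
 * `value < 0` test on the result is always false. */
size_t example_as_size_t(long code)
{
  return (size_t) code;
}

/* Keeping the code in a signed type makes the test meaningful. */
bool example_is_error(long code)
{
  return code < 0;
}
```

Here `example_as_size_t(-1)` yields `SIZE_MAX`, which is why compilers such as clang flag the comparison as always false.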
Takahiro Hirofuchi [Thu, 27 Nov 2014 11:24:07 +0000 (20:24 +0900)]
support timeout of migration
Fixme: The default timeout value is hard-coded. Modify it and compile
the code if necessary.
Takahiro Hirofuchi [Thu, 27 Nov 2014 11:14:17 +0000 (20:14 +0900)]
fix indent in migration code
Takahiro Hirofuchi [Thu, 27 Nov 2014 11:10:55 +0000 (20:10 +0900)]
remove unnecessary comment out
Takahiro Hirofuchi [Thu, 27 Nov 2014 10:27:48 +0000 (19:27 +0900)]
remove the unnecessary vm object in migration
Takahiro Hirofuchi [Thu, 27 Nov 2014 08:55:02 +0000 (17:55 +0900)]
remove unused code in migration
Takahiro Hirofuchi [Thu, 27 Nov 2014 06:49:41 +0000 (15:49 +0900)]
remove trailing space in the migration code
Takahiro Hirofuchi [Thu, 27 Nov 2014 06:28:34 +0000 (15:28 +0900)]
remove unused migration code for CPU overheads
This commit should not affect anything.
Gabriel Corona [Mon, 24 Nov 2014 15:03:10 +0000 (16:03 +0100)]
[mc] Test if the stack-cleaner has any effect
In order to test this:
* we compile the same test program with and without the stack cleaner
(`-fstack-cleaner`, `-fno-stack-cleaner`);
* in this program, we move random bytes in the stack;
* we expect the stack-cleaner to zero them out.
This test is only used if the configured compiler is detected to
support the `-fstack-cleaner` CLI option (i.e. it is the stack-cleaner
compiler wrapper).
Gabriel Corona [Mon, 24 Nov 2014 14:33:47 +0000 (15:33 +0100)]
[mc] Disable/enable the stack-cleaner from a CLI argument (-f[no-]stack-cleaner)
Gabriel Corona [Mon, 24 Nov 2014 12:19:14 +0000 (13:19 +0100)]
[mc] Fix the stack cleaner
The condition was broken and the %rsp limit was too high.
Gabriel Corona [Mon, 24 Nov 2014 09:19:56 +0000 (10:19 +0100)]
[mc] Fix umpire tests
Gabriel Corona [Fri, 21 Nov 2014 07:56:14 +0000 (08:56 +0100)]
Merge branch 'xp'
Gabriel Corona [Fri, 21 Nov 2014 07:55:38 +0000 (08:55 +0100)]
Revert "Temporarily disable an option"
Back to normal.
gabriel corona [Thu, 20 Nov 2014 15:06:21 +0000 (16:06 +0100)]
Temporarily disable an option
The option somehow changes the results in the MC from previous
experiments. It is disabled temporarily in this commit in order to be
able to reproduce those results with the new commits.
Gabriel Corona [Fri, 7 Nov 2014 15:17:18 +0000 (16:17 +0100)]
Infrastructure for statically defined tracepoints
3 modes are supported on compilation:
* normal (no SDT);
* SDT (systemtap statically defined tracepoint);
* UST (lttng userspace static tracepoint, compatible with systemtap
if LTTNG_UST_HAVE_SDT_INTEGRATION).
Gabriel Corona [Tue, 18 Nov 2014 09:46:19 +0000 (10:46 +0100)]
[mc] Remove reference to DW_TAG_mutable_type:
It was in a DWARFv3 draft but was removed from the final version: it
was removed from libdw, which breaks compilation.
Martin Quinson [Tue, 18 Nov 2014 08:44:34 +0000 (09:44 +0100)]
typo -= 2
degomme [Tue, 18 Nov 2014 07:00:13 +0000 (08:00 +0100)]
protect these calls to smpi_datatype_size as they are not always relevant
degomme [Mon, 17 Nov 2014 22:42:46 +0000 (23:42 +0100)]
Fix problem with unknown datatypes in replay/tracing.
When datatype was unknown to replay, it was replayed as MPI_BYTE.
This modification adds a parameter to encode_datatype, to tell tracing that the datatype size has to be taken into account in the count parameter.
As a result, a message of count*datatype_size is replayed as a message of (count*datatype_size)*sizeof(MPI_BYTE), which is the same.
This is not a perfect or elegant solution, but :
- it works.
- it handles manually created datatypes
- it doesn't break previously generated replay files
- it avoids testing each time 50 different datatypes (see encode_datatype function)
- the new parameter avoids doing strcmp with "-1" each time, so performance should not be too bad
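The size arithmetic behind this workaround, as a tiny C sketch. `example_replay_count_as_bytes` is a hypothetical helper, not part of the tracing code, and it relies on MPI_BYTE being one byte wide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of MPI_BYTE elements to replay for a
 * message of `count` elements of an unknown datatype whose size is
 * `datatype_size` bytes. */
size_t example_replay_count_as_bytes(size_t count, size_t datatype_size)
{
  /* Guard against overflow of the multiplication. */
  assert(datatype_size == 0 || count <= (size_t) -1 / datatype_size);
  return count * datatype_size;
}
```

For instance, 10 elements of a 24-byte user-defined datatype are replayed as 240 MPI_BYTE elements, so the simulated message keeps the same size.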
degomme [Mon, 17 Nov 2014 22:24:32 +0000 (23:24 +0100)]
There should not be msg datatypes here
Gabriel Corona [Mon, 17 Nov 2014 14:34:34 +0000 (15:34 +0100)]
[mm] Disable HAVE_GNU_LD code in order to get rid of the junkarea
The HAVE_GNU_LD mode of mmalloc delegates in some cases to the standard
malloc()/free(), which are resolved with dlsym(). This causes some
bootstrap problems which are only resolved with the junkarea, and the
junkarea regularly breaks when dependencies are added because it
then becomes too small.
By disabling the HAVE_GNU_LD path, we get rid of the junkarea hack.
Martin Quinson [Thu, 13 Nov 2014 21:01:35 +0000 (22:01 +0100)]
clean after augustin, as usual
Martin Quinson [Thu, 13 Nov 2014 20:59:09 +0000 (21:59 +0100)]
Remove the unmodified NAS examples as they are really useless nowadays
I'm still unsure of what to do with the modified ones. I vote for
removing them if we have enough examples already.