Gabriel Corona [Fri, 6 Feb 2015 12:55:18 +0000 (13:55 +0100)]
[mc] Flag global variables in mc_ignore as belonging to the MCer
Gabriel Corona [Fri, 6 Feb 2015 11:58:46 +0000 (12:58 +0100)]
[mc] Communicate MC_remove_ignore_heap to the remote model-checker
Gabriel Corona [Fri, 6 Feb 2015 11:08:12 +0000 (12:08 +0100)]
[mc] Do not call malloc_no_memset in mc_snapshot
The main code should not call mmalloc/mmfree/mmrealloc directly
because this fails in client/server mode: the server does not have a
separate heap and will choke on this.
Replace mmalloc_no_memset with malloc_no_memset which does the right
thing:
* either call mmalloc_no_memset;
* or call malloc.
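The dispatch described above can be sketched as follows. This is only an illustration, not the actual SimGrid code: `sketch_use_custom_heap` and `sketch_custom_heap_alloc` are hypothetical stand-ins for the real heap-selection state and for `mmalloc_no_memset`, whose signature is not given in this message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag: true when the current process owns a custom mm heap. */
bool sketch_use_custom_heap = false;

/* Hypothetical stand-in for mmalloc_no_memset(). It delegates to malloc()
 * here only to keep the sketch self-contained. */
void *sketch_custom_heap_alloc(size_t size)
{
  return malloc(size);
}

/* Allocate without zeroing the memory, using whichever allocator is valid
 * in the current process. */
void *sketch_malloc_no_memset(size_t size)
{
  if (sketch_use_custom_heap)
    return sketch_custom_heap_alloc(size);
  return malloc(size);
}
```

Callers then only ever use the wrapper, so the same code works in a process with or without a separate heap.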
Gabriel Corona [Fri, 6 Feb 2015 10:29:56 +0000 (11:29 +0100)]
[mc] Implement remote support for MC_ignore
Gabriel Corona [Fri, 6 Feb 2015 09:52:48 +0000 (10:52 +0100)]
[mc] Move mc_model_checker in its own .c file
Gabriel Corona [Thu, 5 Feb 2015 14:03:51 +0000 (15:03 +0100)]
[mc] Communication of heap_area_to_ignore to the remote MCer
Gabriel Corona [Thu, 5 Feb 2015 14:03:36 +0000 (15:03 +0100)]
[mc] Define one struct per MC message type
Gabriel Corona [Tue, 3 Feb 2015 14:44:28 +0000 (15:44 +0100)]
[mc] Move MC_init_pid outside of mc_server
Gabriel Corona [Tue, 3 Feb 2015 14:36:02 +0000 (15:36 +0100)]
[mc] Remove some functions in mc_server
Gabriel Corona [Tue, 3 Feb 2015 10:26:44 +0000 (11:26 +0100)]
[mc] Basic infrastructure for a real model-checker process
The model checker process communicates with the model-checked
application using a socket (and wait). Currently it waits for the MCed
process initialisation and fetches its system state, DWARF information,
etc. but does not do anything else.
The previous (standalone) mode is currently used by default. The new
behaviour is triggered with the SIMGRID_MC_MODE=server environment
variable. The idea is to keep the standalone version at least as
long as the new version is not stable/working.
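A minimal sketch of how such a mode switch could be read from the environment. Only the variable name SIMGRID_MC_MODE and the value "server" come from the message above; the type and function names are hypothetical, and treating every other value as standalone is an assumption.

```c
#include <stdlib.h>
#include <string.h>

typedef enum {
  SKETCH_MC_MODE_STANDALONE, /* previous behaviour, the default */
  SKETCH_MC_MODE_SERVER      /* separate model-checker process */
} sketch_mc_mode;

/* Map a raw SIMGRID_MC_MODE value to a mode. Assumption: anything other
 * than "server" (including an unset variable) keeps the standalone mode. */
sketch_mc_mode sketch_parse_mc_mode(const char *value)
{
  if (value != NULL && strcmp(value, "server") == 0)
    return SKETCH_MC_MODE_SERVER;
  return SKETCH_MC_MODE_STANDALONE;
}

/* Read the mode from the environment of the current process. */
sketch_mc_mode sketch_mc_mode_from_env(void)
{
  return sketch_parse_mc_mode(getenv("SIMGRID_MC_MODE"));
}
```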
Gabriel Corona [Mon, 2 Feb 2015 13:37:28 +0000 (14:37 +0100)]
Enable C++11
Gabriel Corona [Fri, 30 Jan 2015 12:51:07 +0000 (13:51 +0100)]
[mc] Remove useless code in ~DWARF test
Gabriel Corona [Mon, 19 Jan 2015 14:54:13 +0000 (15:54 +0100)]
[mc] Remote unwinding support
The contexts are still read directly from the current process memory
however.
Gabriel Corona [Fri, 23 Jan 2015 09:11:15 +0000 (10:11 +0100)]
[mc] Make a copy of the libunwind context when snapshotting the stacks
The libunwind cursors used in `mc_snapshot_stack_t` were referencing
the real/live libunwind contexts: this is wrong because those contexts
change with the simulated application. Instead, we need to take a copy
of the context.
Gabriel Corona [Fri, 16 Jan 2015 13:10:13 +0000 (14:10 +0100)]
[mc] Add some comments
Gabriel Corona [Fri, 9 Jan 2015 15:04:25 +0000 (16:04 +0100)]
[mc] Create a separate simgrid-mc program
We create a separate program for the model-checker. The goal is that
this program will:
- prepare the environment for the child/main process (environment
variables, maybe LD_PRELOAD a library, pass file descriptors);
- hold all the model-checker state;
- communicate with the child process;
- handle some part of the snapshotting/restoration logic;
- handle the state comparison logic.
Currently it only enables the custom heap in the child process.
Gabriel Corona [Fri, 9 Jan 2015 10:18:26 +0000 (11:18 +0100)]
[mc] Remove remaining bits on hardcoded object list
Gabriel Corona [Fri, 19 Dec 2014 10:56:52 +0000 (11:56 +0100)]
[mc] Add some FIXMEs for cross-process support
Gabriel Corona [Fri, 19 Dec 2014 08:51:16 +0000 (09:51 +0100)]
[mc] Cross-process support for MC_ignore
Gabriel Corona [Thu, 18 Dec 2014 15:03:03 +0000 (16:03 +0100)]
[mc] Implement privatization support for MC_process_read
This is currently needed for cross-process MC in order to read the
heap state.
Gabriel Corona [Tue, 16 Dec 2014 12:06:37 +0000 (13:06 +0100)]
[mc] Abstract the process and snapshot types with an address_space superclass
Add an `address_space` superclass of `process` and `snapshot`.
In order to do this, the contracts of MC_process_read and
MC_snapshot_read have been made uniform:
* the order of arguments has been harmonized;
* a new flag MC_ADDRESS_SPACE_READ_FLAGS_LAZY is used to avoid copy
when the data is in the current memory;
* MC_NO_PROCESS_INDEX has been renamed into MC_PROCESS_INDEX_MISSING;
* MC_ANY_PROCESS_INDEX has been renamed into MC_PROCESS_INDEX_ANY;
* MC_PROCESS_INDEX_DISABLED is used to access the raw address space
(without privatisation support);
* `const void*` is used instead of `void*` when possible.
Some cleanup is still to be done:
* remove special NULL handling;
* add support for SMPI privatization in the process object.
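The superclass idea and the lazy-read flag can be illustrated with a small C sketch. All names prefixed with `sketch_` are hypothetical and the argument order is an assumption; only the concept of a LAZY flag that avoids a copy when the data is already in the current memory comes from the message above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flags, mirroring the idea of MC_ADDRESS_SPACE_READ_FLAGS_LAZY. */
enum { SKETCH_READ_FLAGS_NONE = 0, SKETCH_READ_FLAGS_LAZY = 1 };

typedef struct sketch_address_space sketch_address_space;

/* "Superclass": anything memory can be read from (a process, a snapshot). */
struct sketch_address_space {
  /* Returns a pointer to the requested bytes: `buffer` after a copy, or
   * the original address when a lazy read needs no copy. */
  const void *(*read)(const sketch_address_space *self, void *buffer,
                      const void *addr, size_t size, int flags);
};

/* "Subclass": the current process. The base member comes first. */
typedef struct {
  sketch_address_space base;
} sketch_local_process;

const void *sketch_local_read(const sketch_address_space *self, void *buffer,
                              const void *addr, size_t size, int flags)
{
  (void) self;
  if (flags & SKETCH_READ_FLAGS_LAZY)
    return addr; /* data already lives in our own memory: no copy */
  memcpy(buffer, addr, size);
  return buffer;
}

/* Uniform entry point: callers do not care which subclass they hold. */
const void *sketch_address_space_read(const sketch_address_space *as,
                                      void *buffer, const void *addr,
                                      size_t size, int flags)
{
  return as->read(as, buffer, addr, size, flags);
}
```

A snapshot type would provide its own `read` callback behind the same entry point.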
Gabriel Corona [Tue, 16 Dec 2014 11:05:14 +0000 (12:05 +0100)]
[mc] More comments for mc_dwarf_execute_expression()
Give some basic explanation about the DWARF operations.
Gabriel Corona [Tue, 16 Dec 2014 10:34:38 +0000 (11:34 +0100)]
[mc] Add more information about mc_dwarf_register_to_libunwind()
Gabriel Corona [Thu, 11 Dec 2014 12:14:32 +0000 (13:14 +0100)]
[mc] Support for reading heap state from another process
Gabriel Corona [Thu, 11 Dec 2014 13:51:39 +0000 (14:51 +0100)]
[mc] Fix error handling in MC_process_{read,write}
Gabriel Corona [Tue, 9 Dec 2014 14:29:16 +0000 (15:29 +0100)]
[mc] Access memory from another process
The goal is to be able to move the MC into a separate process, which
should be more robust and easier to develop:
* avoid using two heaps (which is cumbersome);
* avoid weird interactions between the MC and the application;
* use optimisation for the whole MC process;
* avoid the stack-cleaner for the whole MC process.
The functions MC_process_read and MC_process_write are defined to
abstract memory access:
* when the target process is the current process, they call
`memcpy`;
* otherwise they call `read` or `write` on `/proc/$pid/mem` (on newer
kernels, `process_vm_readv` and `process_vm_writev` could be used).
A lot of bits are missing such as:
* access to `std_heap` is currently not process-aware (the current
process is used);
* access to SIMIX layer from MC;
* communication/synchronisation between the processes;
* …
Limitations:
* for the per-page/chunked snapshot the current implementation uses
an extra copy (and one syscall per page); we can do better than
this.
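The read path described in this message can be sketched as below. This is an assumption-laden illustration, not the real MC_process_read: the function name, argument order and return convention are hypothetical, error handling is minimal (no retry on EINTR), and reading another process through `/proc/$pid/mem` additionally requires suitable ptrace permissions on Linux.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy `size` bytes found at `remote_addr` in process `pid` into `local`.
 * Returns 0 on success and -1 on failure. */
int sketch_process_read(pid_t pid, void *local, const void *remote_addr,
                        size_t size)
{
  /* Same process: a plain memcpy is enough. */
  if (pid == getpid()) {
    memcpy(local, remote_addr, size);
    return 0;
  }

  /* Other process: seek to the address in /proc/$pid/mem and read. */
  char path[64];
  snprintf(path, sizeof path, "/proc/%ld/mem", (long) pid);
  int fd = open(path, O_RDONLY);
  if (fd < 0)
    return -1;
  if (lseek(fd, (off_t) (uintptr_t) remote_addr, SEEK_SET) == (off_t) -1) {
    close(fd);
    return -1;
  }

  char *out = local;
  size_t done = 0;
  while (done < size) { /* read() may return fewer bytes than requested */
    ssize_t n = read(fd, out + done, size - done);
    if (n <= 0) {
      close(fd);
      return -1;
    }
    done += (size_t) n;
  }
  close(fd);
  return 0;
}
```

Opening the file on every call is wasteful. A longer-lived implementation would probably keep the descriptor open per target process.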
Gabriel Corona [Tue, 9 Dec 2014 12:15:38 +0000 (13:15 +0100)]
[mc] Enable the custom mm malloc only in MC
We can do better in the future: we can avoid using the main mm malloc
in many cases even for MC.
Gabriel Corona [Tue, 9 Dec 2014 11:17:18 +0000 (12:17 +0100)]
[mm] Allow to disable the mm based `malloc` at runtime
The goal is to enable the mm based `malloc` only when needed and fall
back to the (more efficient) builtin/next implementation when it is
not needed:
* run introspection-less jobs without it;
* when the MC and the application are in different processes, the
MC will be able to run with the standard `malloc` and the
application will use mm.
As malloc is needed very early in the application initialisation, an
environment variable is used to change the behaviour.
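A rough sketch of such a runtime switch, assuming the decision is taken once on first use and cached. The variable name `SKETCH_MM_MALLOC` and the accepted value "1" are hypothetical, since the message does not name the real variable; `getenv` is used because it can be called very early in process start-up.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable name, chosen for this sketch only. */
#define SKETCH_MM_ENV "SKETCH_MM_MALLOC"

/* -1: not decided yet, 0: standard malloc, 1: mm-based malloc. */
int sketch_mm_state = -1;

/* Interpret the raw environment value (assumption: only "1" enables mm). */
int sketch_mm_wanted(const char *value)
{
  return value != NULL && strcmp(value, "1") == 0;
}

/* Decide on first use, then reuse the cached decision so that the
 * allocator never changes in the middle of a run. */
int sketch_use_mm(void)
{
  if (sketch_mm_state < 0)
    sketch_mm_state = sketch_mm_wanted(getenv(SKETCH_MM_ENV));
  return sketch_mm_state;
}
```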
Gabriel Corona [Tue, 9 Dec 2014 08:16:22 +0000 (09:16 +0100)]
[mc] Optimise most of XBT
Gabriel Corona [Mon, 8 Dec 2014 13:48:47 +0000 (14:48 +0100)]
[mc] Optimise all the MC compilation units
Gabriel Corona [Fri, 5 Dec 2014 15:05:50 +0000 (16:05 +0100)]
[mc] Multiple .so support for region snapshots
The region snapshotting logic can handle a variable number of .so
files:
* add more information to the snapshot regions:
  * the type (heap, library/executable);
  * the corresponding library/executable;
  * the type of storage (dense/flat, chunked/sparse or privatised);
    the type-specific variables are defined in an enum
    (variant/tagged enum);
* SMPI privatisation snapshot regions are stored as children of a
parent snapshot region.
Limitations:
* we might want to use a more modular/extensible OO approach for the
snapshot region storage type instead of the variant-based approach;
* SMPI can currently only handle privatisation for the local
variables of the executable, so the MC only supports this case as
well; otherwise the MC is ready to support the SMPI privatisation
of libraries.
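One common way to express such a variant in C is a tag plus a union, with privatised regions holding their children. The sketch below is only an illustration: every type and field name is hypothetical and the real snapshot region layout is not shown in this message.

```c
#include <stddef.h>

typedef enum { SKETCH_REGION_HEAP, SKETCH_REGION_DATA } sketch_region_type;

typedef enum {
  SKETCH_STORAGE_FLAT,      /* dense copy of the region */
  SKETCH_STORAGE_CHUNKED,   /* sparse, per-page storage */
  SKETCH_STORAGE_PRIVATIZED /* one child region per SMPI process */
} sketch_storage_type;

typedef struct sketch_region sketch_region;

struct sketch_region {
  sketch_region_type region_type;
  sketch_storage_type storage_type; /* tag: selects the valid union member */
  const char *object_name;          /* library/executable, NULL for the heap */
  union {
    struct { const void *data; } flat;
    struct { const size_t *page_numbers; size_t page_count; } chunked;
    struct { sketch_region **children; size_t count; } privatized;
  } storage;
};

/* Count the regions that actually hold data, walking into the children
 * of privatised regions. */
size_t sketch_region_leaf_count(const sketch_region *region)
{
  if (region->storage_type != SKETCH_STORAGE_PRIVATIZED)
    return 1;
  size_t total = 0;
  for (size_t i = 0; i < region->storage.privatized.count; i++)
    total += sketch_region_leaf_count(region->storage.privatized.children[i]);
  return total;
}
```

Every consumer has to switch on the tag, which is the cost of the variant-based approach compared to a more modular design.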
Gabriel Corona [Fri, 5 Dec 2014 14:50:56 +0000 (15:50 +0100)]
[mc] Multiple .so support in MC_ignore_local_variable()
Gabriel Corona [Thu, 4 Dec 2014 15:03:20 +0000 (16:03 +0100)]
[mc] Basic support for libraries other than libsimgrid.so
Gabriel Corona [Thu, 4 Dec 2014 10:02:16 +0000 (11:02 +0100)]
[mc] Move process info in a new s_mc_process_t structure
This is the beginning of the refactoring in order to support MC-ing a
remote process.
Gabriel Corona [Thu, 4 Dec 2014 10:41:02 +0000 (11:41 +0100)]
[mc] Fix distcheck
Gabriel Corona [Thu, 4 Dec 2014 10:07:40 +0000 (11:07 +0100)]
[mc] Don't include libunwind.h in non MC builds
Gabriel Corona [Thu, 4 Dec 2014 09:45:49 +0000 (10:45 +0100)]
Merge branch 'master'
Gabriel Corona [Thu, 4 Dec 2014 09:43:01 +0000 (10:43 +0100)]
[mc] Don't use unprototyped functions
Gabriel Corona [Tue, 2 Dec 2014 13:02:26 +0000 (14:02 +0100)]
[mc] Remove useless header #includes
Augustin Degomme [Tue, 2 Dec 2014 18:17:49 +0000 (19:17 +0100)]
forgot to add this include
Augustin Degomme [Tue, 2 Dec 2014 17:20:13 +0000 (18:20 +0100)]
let's try to please windows
Augustin Degomme [Tue, 2 Dec 2014 14:50:12 +0000 (15:50 +0100)]
Avoid using simcalls here, as by descheduling the process, we could misplace some messages in mailboxes, and end up deadlocking.
Calling SIMIX functions directly is not really the best, but it may fix a bad heisenbug
Augustin Degomme [Tue, 2 Dec 2014 14:44:45 +0000 (15:44 +0100)]
typos-=2
Augustin Degomme [Tue, 2 Dec 2014 14:44:22 +0000 (15:44 +0100)]
avoid problem when freeing pointer with lb!=0
Augustin Degomme [Mon, 1 Dec 2014 13:53:19 +0000 (14:53 +0100)]
do the same thing as before with IB model parameters
Gabriel Corona [Tue, 2 Dec 2014 09:41:38 +0000 (10:41 +0100)]
[mc] Modularise header files for MC
This is a preparation step for the upcoming refactorisation of the MC
code in order to MC an external process.
Gabriel Corona [Tue, 2 Dec 2014 09:00:28 +0000 (10:00 +0100)]
[mc] Define a type for MC object information flags
Gabriel Corona [Mon, 1 Dec 2014 14:31:29 +0000 (15:31 +0100)]
[mc] Remove MC_ignore_global_variable() calls
- compared_pointer does not exist;
- smpi_current_rank does not exist;
- maestro_stack_start and maestro_stack_end do not need to be ignored.
Gabriel Corona [Mon, 1 Dec 2014 12:47:04 +0000 (13:47 +0100)]
[mc] Enable MC specific behaviour in replay mode
Gabriel Corona [Mon, 1 Dec 2014 13:01:43 +0000 (14:01 +0100)]
Revert "[mc] Enable MC specific behaviour in replay mode"
This reverts commit
33eca433c4f055cdfcc55e46d125f8708e1848c7.
Build is broken.
Gabriel Corona [Mon, 1 Dec 2014 12:47:04 +0000 (13:47 +0100)]
[mc] Enable MC specific behaviour in replay mode
Gabriel Corona [Mon, 1 Dec 2014 12:17:59 +0000 (13:17 +0100)]
[mc] Remove useless condition check
Gabriel Corona [Mon, 1 Dec 2014 11:33:16 +0000 (12:33 +0100)]
[mc] Only enable the umpire test for MC builds
Gabriel Corona [Thu, 27 Nov 2014 10:25:22 +0000 (11:25 +0100)]
Use pthread mutex instead of semaphore in mm
Gabriel Corona [Mon, 1 Dec 2014 11:16:45 +0000 (12:16 +0100)]
Fix dist
Gabriel Corona [Fri, 28 Nov 2014 13:38:31 +0000 (14:38 +0100)]
s/formated/formatted/
Gabriel Corona [Thu, 30 Oct 2014 13:39:17 +0000 (14:39 +0100)]
[mc] Initial support for MC record/replay
The idea is to record an execution path in MC mode in order to be able
to replay it outside of the MC (even with a non-MC build). Some very
basic (and unobtrusive) MC code is compiled even when MC is disabled.
Martin Quinson [Sat, 29 Nov 2014 13:29:31 +0000 (14:29 +0100)]
and now, fix the java teshsuite, re-sorry
I shouldn't try to hack on simgrid at week-ends :-(
Martin Quinson [Sat, 29 Nov 2014 13:03:13 +0000 (14:03 +0100)]
fix the build of java bundles, sorry
Martin Quinson [Sat, 29 Nov 2014 12:31:19 +0000 (13:31 +0100)]
Don't produce that pdf output that we don't use
Martin Quinson [Sat, 29 Nov 2014 10:56:42 +0000 (11:56 +0100)]
reindent and improve displayed message
Martin Quinson [Sat, 29 Nov 2014 10:53:58 +0000 (11:53 +0100)]
put together the java-based tests
Augustin Degomme [Fri, 28 Nov 2014 17:04:29 +0000 (18:04 +0100)]
move smpi bandwidth and latency factors out of the ifdef HAVE_SMPI. SMPI and IB network models can be used without using SMPI
Augustin Degomme [Fri, 28 Nov 2014 09:29:01 +0000 (10:29 +0100)]
Remove warnings in vm
Augustin Degomme [Fri, 28 Nov 2014 09:20:33 +0000 (10:20 +0100)]
Fix dist
Augustin Degomme [Tue, 25 Nov 2014 12:16:19 +0000 (13:16 +0100)]
remove potential bug / clang warning
size_t being unsigned, the comparison < 0 was never true
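The underlying pitfall can be shown with a generic sketch. The function names are hypothetical and unrelated to the code touched by this commit. Subtracting two `size_t` values wraps around instead of going negative, so the check has to compare the operands.

```c
#include <stddef.h>
#include <stdint.h>

/* have - need wraps around to a huge value when need > have, so testing
 * the result against 0 with `<` can never detect the shortfall. */
size_t sketch_wrapped_difference(size_t have, size_t need)
{
  return have - need;
}

/* Correct check: compare the operands before subtracting. */
int sketch_is_short(size_t have, size_t need)
{
  return need > have;
}
```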
Takahiro Hirofuchi [Thu, 27 Nov 2014 11:24:07 +0000 (20:24 +0900)]
support timeout of migration
Fixme: The default timeout value is hard-coded. Modify it and compile
the code if necessary.
Takahiro Hirofuchi [Thu, 27 Nov 2014 11:14:17 +0000 (20:14 +0900)]
fix indent in migration code
Takahiro Hirofuchi [Thu, 27 Nov 2014 11:10:55 +0000 (20:10 +0900)]
remove unnecessary commented-out code
Takahiro Hirofuchi [Thu, 27 Nov 2014 10:27:48 +0000 (19:27 +0900)]
remove the unnecessary vm object in migration
Takahiro Hirofuchi [Thu, 27 Nov 2014 08:55:02 +0000 (17:55 +0900)]
remove unused code in migration
Takahiro Hirofuchi [Thu, 27 Nov 2014 06:49:41 +0000 (15:49 +0900)]
remove trailing space in the migration code
Takahiro Hirofuchi [Thu, 27 Nov 2014 06:28:34 +0000 (15:28 +0900)]
remove unused migration code for CPU overheads
This commit should not affect anything.
Gabriel Corona [Mon, 24 Nov 2014 15:03:10 +0000 (16:03 +0100)]
[mc] Test if the stack-cleaner has any effect
In order to test this:
* we compile the same test program with and without the stack cleaner
(`-fstack-cleaner`, `-fno-stack-cleaner`);
* in this program, we move random bytes in the stack;
* we expect the stack-cleaner to zero them out.
This test is only used if the configured compiler is detected to
support the `-fstack-cleaner` CLI option (i.e. it is the stack-cleaner
compiler wrapper).
Gabriel Corona [Mon, 24 Nov 2014 14:33:47 +0000 (15:33 +0100)]
[mc] Disable/enable the stack-cleaner from a CLI argument (-f[no-]stack-cleaner)
Gabriel Corona [Mon, 24 Nov 2014 12:19:14 +0000 (13:19 +0100)]
[mc] Fix the stack cleaner
The condition was broken and the %rsp limit was too high.
Gabriel Corona [Mon, 24 Nov 2014 09:19:56 +0000 (10:19 +0100)]
[mc] Fix umpire tests
Gabriel Corona [Fri, 21 Nov 2014 07:56:14 +0000 (08:56 +0100)]
Merge branch 'xp'
Gabriel Corona [Fri, 21 Nov 2014 07:55:38 +0000 (08:55 +0100)]
Revert "Temporarily disable an option"
Back to normal.
gabriel corona [Thu, 20 Nov 2014 15:06:21 +0000 (16:06 +0100)]
Temporarily disable an option
The option somehow changes the results in the MC from previous
experiments. It is disabled temporarily in this commit in order to be
able to reproduce those results with the new commits.
Gabriel Corona [Fri, 7 Nov 2014 15:17:18 +0000 (16:17 +0100)]
Infrastructure for statically defined tracepoints
3 modes are supported on compilation:
* normal (no SDT);
* SDT (systemtap statically defined tracepoint);
* UST (lttng userspace static tracepoint, compatible with systemtap
if LTTNG_UST_HAVE_SDT_INTEGRATION).
Gabriel Corona [Tue, 18 Nov 2014 09:46:19 +0000 (10:46 +0100)]
[mc] Remove reference to DW_TAG_mutable_type
It was in a DWARFv3 draft but was removed from the final version: it
was removed from libdw, which breaks compilation.
Martin Quinson [Tue, 18 Nov 2014 08:44:34 +0000 (09:44 +0100)]
typo -= 2
degomme [Tue, 18 Nov 2014 07:00:13 +0000 (08:00 +0100)]
protect these calls to smpi_datatype_size as they are not always relevant
degomme [Mon, 17 Nov 2014 22:42:46 +0000 (23:42 +0100)]
Fix problem with unknown datatypes in replay/tracing.
When datatype was unknown to replay, it was replayed as MPI_BYTE.
This modification adds a parameter to encode_datatype, to tell tracing that the datatype size has to be taken into account in the count parameter.
As a result, a message of count*datatype_size is replayed as a message of (count*datatype_size)*sizeof(MPI_BYTE), which is the same size.
This is not a perfect or elegant solution, but:
- it works.
- it handles manually created datatypes
- it doesn't break previously generated replay files
- it avoids testing each time 50 different datatypes (see encode_datatype function)
- the new parameter avoids doing strcmp with "-1" each time, so performance should not be too bad
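The size arithmetic described above can be sketched as follows. This is not the real encode_datatype: the struct, the function name and the `known` parameter are hypothetical and only illustrate how folding the datatype size into the count keeps the replayed byte volume unchanged.

```c
#include <stddef.h>

/* Hypothetical trace record: the datatype name and the element count. */
typedef struct {
  const char *datatype;
  size_t count;
} sketch_traced_msg;

/* When the datatype cannot be named in the trace, fall back to MPI_BYTE
 * and fold the datatype size into the count. */
sketch_traced_msg sketch_encode(const char *datatype, int known,
                                size_t count, size_t datatype_size)
{
  sketch_traced_msg msg;
  if (known) {
    msg.datatype = datatype;
    msg.count = count;
  } else {
    msg.datatype = "MPI_BYTE";
    msg.count = count * datatype_size; /* one byte per replayed element */
  }
  return msg;
}
```

For example, 10 elements of an unknown 12-byte datatype are traced as 120 elements of MPI_BYTE.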
degomme [Mon, 17 Nov 2014 22:24:32 +0000 (23:24 +0100)]
There should not be msg datatypes here
Gabriel Corona [Mon, 17 Nov 2014 14:34:34 +0000 (15:34 +0100)]
[mm] Disable HAVE_GNU_LD code in order to get rid of the junkarea
The HAVE_GNU_LD mode of mmalloc delegates in some cases to standard
malloc()/free() which are resolved with dlsym(). This causes some
bootstrap problems which are only resolved with the junkarea: the
junkarea is regularly broken when adding dependencies because the
junkarea is then too small.
By disabling the HAVE_GNU_LD path, we get rid of the junkarea hack.
Martin Quinson [Thu, 13 Nov 2014 21:01:35 +0000 (22:01 +0100)]
clean after augustin, as usual
Martin Quinson [Thu, 13 Nov 2014 20:59:09 +0000 (21:59 +0100)]
Remove the unmodified NAS examples as they are really useless nowadays
I'm still unsure of what to do with the modified ones. I vote for
removing them if we have enough examples already.
Gabriel Corona [Thu, 13 Nov 2014 15:09:57 +0000 (16:09 +0100)]
Remove warning about uninitialized variable
Gabriel Corona [Thu, 13 Nov 2014 12:10:59 +0000 (13:10 +0100)]
Don't use xbt_os_time() when not needed
Gabriel Corona [Thu, 13 Nov 2014 11:03:11 +0000 (12:03 +0100)]
Fix small leak in NetworkIBModel::NetworkIBModel()
Gabriel Corona [Thu, 13 Nov 2014 09:47:20 +0000 (10:47 +0100)]
[mc] Disable timer in MC
Timers break state comparison.
Gabriel Corona [Fri, 7 Nov 2014 13:29:06 +0000 (14:29 +0100)]
[mc] Fix distcheck
Gabriel Corona [Fri, 7 Nov 2014 12:46:51 +0000 (13:46 +0100)]
[mc] Add useless parens to remove WTF warning-which-is-an-error
Augustin Degomme [Fri, 7 Nov 2014 10:46:13 +0000 (11:46 +0100)]
Simplify use of dict for smpi attr handling.
Switch from char keys to direct int keys, as in MPI, because we can, actually (with dict_ext functions)
Augustin Degomme [Fri, 7 Nov 2014 09:11:51 +0000 (10:11 +0100)]
replay_multiple should really work with out of build tests, now
Gabriel Corona [Fri, 7 Nov 2014 10:30:50 +0000 (11:30 +0100)]
[mc] Don't fork another process in the hot spot, MC_get_current()
Augustin Degomme [Thu, 6 Nov 2014 16:24:50 +0000 (17:24 +0100)]
use the manually privatized version of this algorithm only when needed.
Augustin Degomme [Thu, 6 Nov 2014 15:08:28 +0000 (16:08 +0100)]
add mpi_info_* support to fortran, and activate relevant tests